All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 00/14] linux-user: brk fixes
@ 2023-08-03  1:52 Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 01/14] linux-user: Unset MAP_FIXED_NOREPLACE for host Richard Henderson
                   ` (14 more replies)
  0 siblings, 15 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

Builds on Helge's v6, incorporating my feedback plus
some other minor cleanup.


r~


Akihiko Odaki (6):
  linux-user: Unset MAP_FIXED_NOREPLACE for host
  linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
  linux-user: Do not call get_errno() in do_brk()
  linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
  linux-user: Do nothing if too small brk is specified
  linux-user: Do not align brk with host page size

Helge Deller (1):
  linux-user: Adjust initial brk when interpreter is close to executable

Richard Henderson (7):
  linux-user: Remove last_brk
  bsd-user: Remove last_brk
  linux-user: Adjust task_unmapped_base for reserved_va
  linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
  linux-user: Add ELF_ET_DYN_BASE
  linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
  linux-user: Properly set image_info.brk in flatload

 bsd-user/qemu.h                      |  1 -
 linux-user/aarch64/target_mman.h     | 13 ++++
 linux-user/alpha/target_mman.h       | 11 ++++
 linux-user/arm/target_mman.h         | 11 ++++
 linux-user/cris/target_mman.h        | 12 ++++
 linux-user/hexagon/target_mman.h     | 13 ++++
 linux-user/hppa/target_mman.h        |  6 ++
 linux-user/i386/target_mman.h        | 16 +++++
 linux-user/loongarch64/target_mman.h | 11 ++++
 linux-user/m68k/target_mman.h        |  5 ++
 linux-user/microblaze/target_mman.h  | 11 ++++
 linux-user/mips/target_mman.h        | 10 +++
 linux-user/nios2/target_mman.h       | 10 +++
 linux-user/openrisc/target_mman.h    | 10 +++
 linux-user/ppc/target_mman.h         | 20 ++++++
 linux-user/qemu.h                    |  2 -
 linux-user/riscv/target_mman.h       | 10 +++
 linux-user/s390x/target_mman.h       | 20 ++++++
 linux-user/sh4/target_mman.h         |  7 +++
 linux-user/sparc/target_mman.h       | 25 ++++++++
 linux-user/user-mmap.h               |  6 +-
 linux-user/x86_64/target_mman.h      | 15 +++++
 linux-user/xtensa/target_mman.h      | 10 +++
 bsd-user/mmap.c                      |  2 -
 linux-user/elfload.c                 | 94 ++++++++++++++++------------
 linux-user/flatload.c                |  2 +-
 linux-user/main.c                    | 43 ++++++++++++-
 linux-user/mmap.c                    | 68 ++++++++++++--------
 linux-user/syscall.c                 | 69 +++++---------------
 29 files changed, 401 insertions(+), 132 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v7 01/14] linux-user: Unset MAP_FIXED_NOREPLACE for host
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 02/14] linux-user: Fix MAP_FIXED_NOREPLACE on old kernels Richard Henderson
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

Passing MAP_FIXED_NOREPLACE to host will fail for reserved_va because
the address space is reserved with mmap.  Replace it with MAP_FIXED
in that case.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-2-akihiko.odaki@daynix.com>
[rth: Expand inline commentary.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/mmap.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a5dfb56545..a11c630a7b 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -603,11 +603,26 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
             goto fail;
         }
 
-        /* Validate that the chosen range is empty. */
-        if ((flags & MAP_FIXED_NOREPLACE)
-            && !page_check_range_empty(start, last)) {
-            errno = EEXIST;
-            goto fail;
+        if (flags & MAP_FIXED_NOREPLACE) {
+            /* Validate that the chosen range is empty. */
+            if (!page_check_range_empty(start, last)) {
+                errno = EEXIST;
+                goto fail;
+            }
+
+            /*
+             * With reserved_va, the entire address space is mmaped in the
+             * host to ensure it isn't accidentally used for something else.
+             * We have just checked that the guest address is not mapped
+             * within the guest, but need to replace the host reservation.
+             *
+             * Without reserved_va, despite the guest address check above,
+             * keep MAP_FIXED_NOREPLACE so that the guest does not overwrite
+             * any host address mappings.
+             */
+            if (reserved_va) {
+                flags = (flags & ~MAP_FIXED_NOREPLACE) | MAP_FIXED;
+            }
         }
 
         /*
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 02/14] linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 01/14] linux-user: Unset MAP_FIXED_NOREPLACE for host Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 03/14] linux-user: Do not call get_errno() in do_brk() Richard Henderson
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

The man page states:
> Note that older kernels which do not recognize the MAP_FIXED_NOREPLACE
> flag will typically (upon detecting a collision with a preexisting
> mapping) fall back to a “non-MAP_FIXED” type of behavior: they will
> return an address that is different from the requested address.
> Therefore, backward-compatible software should check the returned
> address against the requested address.
https://man7.org/linux/man-pages/man2/mmap.2.html

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-3-akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/mmap.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a11c630a7b..90b3ef2140 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -263,7 +263,11 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
         void *p = mmap(host_start, qemu_host_page_size,
                        target_to_host_prot(prot),
                        flags | MAP_ANONYMOUS, -1, 0);
-        if (p == MAP_FAILED) {
+        if (p != host_start) {
+            if (p != MAP_FAILED) {
+                munmap(p, qemu_host_page_size);
+                errno = EEXIST;
+            }
             return false;
         }
         prot_old = prot;
@@ -687,17 +691,25 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
 
         /* map the middle (easier) */
         if (real_start < real_last) {
-            void *p;
+            void *p, *want_p;
             off_t offset1;
+            size_t len1;
 
             if (flags & MAP_ANONYMOUS) {
                 offset1 = 0;
             } else {
                 offset1 = offset + real_start - start;
             }
-            p = mmap(g2h_untagged(real_start), real_last - real_start + 1,
-                     target_to_host_prot(target_prot), flags, fd, offset1);
-            if (p == MAP_FAILED) {
+            len1 = real_last - real_start + 1;
+            want_p = g2h_untagged(real_start);
+
+            p = mmap(want_p, len1, target_to_host_prot(target_prot),
+                     flags, fd, offset1);
+            if (p != want_p) {
+                if (p != MAP_FAILED) {
+                    munmap(p, len1);
+                    errno = EEXIST;
+                }
                 goto fail;
             }
             passthrough_start = real_start;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 03/14] linux-user: Do not call get_errno() in do_brk()
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 01/14] linux-user: Unset MAP_FIXED_NOREPLACE for host Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 02/14] linux-user: Fix MAP_FIXED_NOREPLACE on old kernels Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 04/14] linux-user: Use MAP_FIXED_NOREPLACE for do_brk() Richard Henderson
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

Later the returned value is compared with -1, and negated errno is not
expected.

Fixes: 00faf08c95 ("linux-user: Don't use MAP_FIXED in do_brk()")
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-4-akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..b9d2ec02f9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -862,9 +862,9 @@ abi_long do_brk(abi_ulong brk_val)
      */
     if (new_host_brk_page > brk_page) {
         new_alloc_size = new_host_brk_page - brk_page;
-        mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
-                                        PROT_READ|PROT_WRITE,
-                                        MAP_ANON|MAP_PRIVATE, 0, 0));
+        mapped_addr = target_mmap(brk_page, new_alloc_size,
+                                  PROT_READ|PROT_WRITE,
+                                  MAP_ANON|MAP_PRIVATE, 0, 0);
     } else {
         new_alloc_size = 0;
         mapped_addr = brk_page;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 04/14] linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (2 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 03/14] linux-user: Do not call get_errno() in do_brk() Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 05/14] linux-user: Do nothing if too small brk is specified Richard Henderson
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

MAP_FIXED_NOREPLACE can ensure the mapped address is fixed without
concerning that the new mapping overwrites something else.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-5-akihiko.odaki@daynix.com>
[rth: Pass -1 as fd for MAP_ANON]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/syscall.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b9d2ec02f9..f64024273f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -854,17 +854,12 @@ abi_long do_brk(abi_ulong brk_val)
         return target_brk;
     }
 
-    /* We need to allocate more memory after the brk... Note that
-     * we don't use MAP_FIXED because that will map over the top of
-     * any existing mapping (like the one with the host libc or qemu
-     * itself); instead we treat "mapped but at wrong address" as
-     * a failure and unmap again.
-     */
     if (new_host_brk_page > brk_page) {
         new_alloc_size = new_host_brk_page - brk_page;
         mapped_addr = target_mmap(brk_page, new_alloc_size,
-                                  PROT_READ|PROT_WRITE,
-                                  MAP_ANON|MAP_PRIVATE, 0, 0);
+                                  PROT_READ | PROT_WRITE,
+                                  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
+                                  -1, 0);
     } else {
         new_alloc_size = 0;
         mapped_addr = brk_page;
@@ -883,12 +878,6 @@ abi_long do_brk(abi_ulong brk_val)
         target_brk = brk_val;
         brk_page = new_host_brk_page;
         return target_brk;
-    } else if (mapped_addr != -1) {
-        /* Mapped but at wrong address, meaning there wasn't actually
-         * enough space for this brk.
-         */
-        target_munmap(mapped_addr, new_alloc_size);
-        mapped_addr = -1;
     }
 
 #if defined(TARGET_ALPHA)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 05/14] linux-user: Do nothing if too small brk is specified
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (3 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 04/14] linux-user: Use MAP_FIXED_NOREPLACE for do_brk() Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 06/14] linux-user: Do not align brk with host page size Richard Henderson
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

Linux 6.4.7 does nothing when a value smaller than the initial brk is
specified.

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-6-akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f64024273f..e1436a3962 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -820,14 +820,14 @@ abi_long do_brk(abi_ulong brk_val)
 
     /* brk pointers are always untagged */
 
-    /* return old brk value if brk_val unchanged or zero */
-    if (!brk_val || brk_val == target_brk) {
+    /* return old brk value if brk_val unchanged */
+    if (brk_val == target_brk) {
         return target_brk;
     }
 
     /* do not allow to shrink below initial brk value */
     if (brk_val < initial_target_brk) {
-        brk_val = initial_target_brk;
+        return target_brk;
     }
 
     new_brk = TARGET_PAGE_ALIGN(brk_val);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 06/14] linux-user: Do not align brk with host page size
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (4 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 05/14] linux-user: Do nothing if too small brk is specified Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 07/14] linux-user: Remove last_brk Richard Henderson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Akihiko Odaki <akihiko.odaki@daynix.com>

do_brk() minimizes calls into target_mmap() by aligning the address
with host page size, which is potentially larger than the target page
size. However, the current implementation of this optimization has two
bugs:

- The start of brk is rounded up with the host page size while brk
  advertises an address aligned with the target page size as the
  beginning of brk. This makes the beginning of brk unmapped.
- Content clearing after mapping is flawed. The size to clear is
  specified as HOST_PAGE_ALIGN(brk_page) - brk_page, but brk_page is
  aligned with the host page size so it is always zero.

This optimization actually has no practical benefit. It makes difference
when brk() is called multiple times with values in a range of the host
page size. However, sophisticated memory allocators try to avoid to
make such frequent brk() calls. For example, glibc 2.37 calls brk() to
shrink the heap only when there is a room more than 128 KiB. It is
rare to have a page size larger than 128 KiB if it happens.

Let's remove the optimization to fix the bugs and make the code simpler.

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1616
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-7-akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c |  4 ++--
 linux-user/syscall.c | 54 ++++++++++----------------------------------
 2 files changed, 14 insertions(+), 44 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..2aee2298ec 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3678,8 +3678,8 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
      * to mmap pages in this space.
      */
     if (info->reserve_brk) {
-        abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
-        abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
+        abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
+        abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk + info->reserve_brk);
         target_munmap(start_brk, end_brk - start_brk);
     }
 
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e1436a3962..7c2c2f6e2f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -802,81 +802,51 @@ static inline int host_to_target_sock_type(int host_type)
 }
 
 static abi_ulong target_brk, initial_target_brk;
-static abi_ulong brk_page;
 
 void target_set_brk(abi_ulong new_brk)
 {
     target_brk = TARGET_PAGE_ALIGN(new_brk);
     initial_target_brk = target_brk;
-    brk_page = HOST_PAGE_ALIGN(target_brk);
 }
 
 /* do_brk() must return target values and target errnos. */
 abi_long do_brk(abi_ulong brk_val)
 {
     abi_long mapped_addr;
-    abi_ulong new_alloc_size;
-    abi_ulong new_brk, new_host_brk_page;
+    abi_ulong new_brk;
+    abi_ulong old_brk;
 
     /* brk pointers are always untagged */
 
-    /* return old brk value if brk_val unchanged */
-    if (brk_val == target_brk) {
-        return target_brk;
-    }
-
     /* do not allow to shrink below initial brk value */
     if (brk_val < initial_target_brk) {
         return target_brk;
     }
 
     new_brk = TARGET_PAGE_ALIGN(brk_val);
-    new_host_brk_page = HOST_PAGE_ALIGN(brk_val);
+    old_brk = TARGET_PAGE_ALIGN(target_brk);
 
-    /* brk_val and old target_brk might be on the same page */
-    if (new_brk == TARGET_PAGE_ALIGN(target_brk)) {
-        /* empty remaining bytes in (possibly larger) host page */
-        memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
+    /* new and old target_brk might be on the same page */
+    if (new_brk == old_brk) {
         target_brk = brk_val;
         return target_brk;
     }
 
     /* Release heap if necesary */
-    if (new_brk < target_brk) {
-        /* empty remaining bytes in (possibly larger) host page */
-        memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
-
-        /* free unused host pages and set new brk_page */
-        target_munmap(new_host_brk_page, brk_page - new_host_brk_page);
-        brk_page = new_host_brk_page;
+    if (new_brk < old_brk) {
+        target_munmap(new_brk, old_brk - new_brk);
 
         target_brk = brk_val;
         return target_brk;
     }
 
-    if (new_host_brk_page > brk_page) {
-        new_alloc_size = new_host_brk_page - brk_page;
-        mapped_addr = target_mmap(brk_page, new_alloc_size,
-                                  PROT_READ | PROT_WRITE,
-                                  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
-                                  -1, 0);
-    } else {
-        new_alloc_size = 0;
-        mapped_addr = brk_page;
-    }
-
-    if (mapped_addr == brk_page) {
-        /* Heap contents are initialized to zero, as for anonymous
-         * mapped pages.  Technically the new pages are already
-         * initialized to zero since they *are* anonymous mapped
-         * pages, however we have to take care with the contents that
-         * come from the remaining part of the previous page: it may
-         * contains garbage data due to a previous heap usage (grown
-         * then shrunken).  */
-        memset(g2h_untagged(brk_page), 0, HOST_PAGE_ALIGN(brk_page) - brk_page);
+    mapped_addr = target_mmap(old_brk, new_brk - old_brk,
+                              PROT_READ | PROT_WRITE,
+                              MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
+                              -1, 0);
 
+    if (mapped_addr == old_brk) {
         target_brk = brk_val;
-        brk_page = new_host_brk_page;
         return target_brk;
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 07/14] linux-user: Remove last_brk
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (5 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 06/14] linux-user: Do not align brk with host page size Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 08/14] bsd-user: " Richard Henderson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

This variable is unused.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/user-mmap.h | 1 -
 linux-user/mmap.c      | 2 --
 2 files changed, 3 deletions(-)

diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index 3fc986f92f..7265c2c116 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -26,7 +26,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
                        abi_ulong new_size, unsigned long flags,
                        abi_ulong new_addr);
 abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice);
-extern unsigned long last_brk;
 extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong, abi_ulong, abi_ulong);
 void mmap_fork_start(void);
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 90b3ef2140..eb04fab8ab 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -314,8 +314,6 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
 #endif
 abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
-unsigned long last_brk;
-
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated
  * a chunk of guest address space.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 08/14] bsd-user: Remove last_brk
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (6 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 07/14] linux-user: Remove last_brk Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 09/14] linux-user: Adjust task_unmapped_base for reserved_va Richard Henderson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

This variable is unused.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 bsd-user/qemu.h | 1 -
 bsd-user/mmap.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index edf9602f9b..8f2d6a3c78 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -232,7 +232,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
                        abi_ulong new_size, unsigned long flags,
                        abi_ulong new_addr);
 int target_msync(abi_ulong start, abi_ulong len, int flags);
-extern unsigned long last_brk;
 extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size);
 void TSA_NO_TSA mmap_fork_start(void);
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index b62a69bd07..8e148a2ea3 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -214,8 +214,6 @@ static int mmap_frag(abi_ulong real_start,
 #endif
 abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
-unsigned long last_brk;
-
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated a chunk of guest
  * address space.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 09/14] linux-user: Adjust task_unmapped_base for reserved_va
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (7 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 08/14] bsd-user: " Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 10/14] linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h Richard Henderson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

Ensure that the chosen values for mmap_next_start and
task_unmapped_base are within the guest address space.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/user-mmap.h | 18 +++++++++++++++++-
 linux-user/main.c      | 26 ++++++++++++++++++++++++++
 linux-user/mmap.c      | 18 +++---------------
 3 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index 7265c2c116..fd456e024e 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -18,6 +18,23 @@
 #ifndef LINUX_USER_USER_MMAP_H
 #define LINUX_USER_USER_MMAP_H
 
+#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
+#ifdef TARGET_AARCH64
+# define TASK_UNMAPPED_BASE  0x5500000000
+#else
+# define TASK_UNMAPPED_BASE  (1ul << 38)
+#endif
+#else
+#ifdef TARGET_HPPA
+# define TASK_UNMAPPED_BASE  0xfa000000
+#else
+# define TASK_UNMAPPED_BASE  0x40000000
+#endif
+#endif
+
+extern abi_ulong task_unmapped_base;
+extern abi_ulong mmap_next_start;
+
 int target_mprotect(abi_ulong start, abi_ulong len, int prot);
 abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
                      int flags, int fd, off_t offset);
@@ -26,7 +43,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
                        abi_ulong new_size, unsigned long flags,
                        abi_ulong new_addr);
 abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice);
-extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong, abi_ulong, abi_ulong);
 void mmap_fork_start(void);
 void mmap_fork_end(int child);
diff --git a/linux-user/main.c b/linux-user/main.c
index dba67ffa36..c207b783d5 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -821,6 +821,32 @@ int main(int argc, char **argv, char **envp)
         reserved_va = max_reserved_va;
     }
 
+    /*
+     * Promote X and Y to a common type and compare.
+     * ??? Perhaps better to locally disable -Werror=type-limits.
+     */
+#define LESS(X, Y) ((1 ? X : Y) < (1 ? Y : X))
+
+    /*
+     * Select an initial value for task_unmapped_base that is in range.
+     */
+    if (reserved_va) {
+        if (LESS(TASK_UNMAPPED_BASE, reserved_va)) {
+            task_unmapped_base = TASK_UNMAPPED_BASE;
+        } else {
+            /* The most common default formula is TASK_SIZE / 3. */
+            task_unmapped_base = TARGET_PAGE_ALIGN(reserved_va / 3);
+        }
+    } else if (LESS(TASK_UNMAPPED_BASE, UINTPTR_MAX)) {
+        task_unmapped_base = TASK_UNMAPPED_BASE;
+    } else {
+        /* 32-bit host: pick something medium size. */
+        task_unmapped_base = 0x10000000;
+    }
+    mmap_next_start = task_unmapped_base;
+
+#undef LESS
+
     {
         Error *err = NULL;
         if (seed_optarg != NULL) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index eb04fab8ab..84436d45c8 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -299,20 +299,8 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
     return true;
 }
 
-#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
-#ifdef TARGET_AARCH64
-# define TASK_UNMAPPED_BASE  0x5500000000
-#else
-# define TASK_UNMAPPED_BASE  (1ul << 38)
-#endif
-#else
-#ifdef TARGET_HPPA
-# define TASK_UNMAPPED_BASE  0xfa000000
-#else
-# define TASK_UNMAPPED_BASE  0x40000000
-#endif
-#endif
-abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
+abi_ulong task_unmapped_base;
+abi_ulong mmap_next_start;
 
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated
@@ -391,7 +379,7 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size, abi_ulong align)
 
             if ((addr & (align - 1)) == 0) {
                 /* Success.  */
-                if (start == mmap_next_start && addr >= TASK_UNMAPPED_BASE) {
+                if (start == mmap_next_start && addr >= task_unmapped_base) {
                     mmap_next_start = addr + size;
                 }
                 return addr;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 10/14] linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (8 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 09/14] linux-user: Adjust task_unmapped_base for reserved_va Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:52 ` [PATCH v7 11/14] linux-user: Add ELF_ET_DYN_BASE Richard Henderson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

Provide default values that are as close as possible to the
values used by the guest's kernel.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/aarch64/target_mman.h     | 10 ++++++++++
 linux-user/alpha/target_mman.h       |  8 ++++++++
 linux-user/arm/target_mman.h         |  8 ++++++++
 linux-user/cris/target_mman.h        |  9 +++++++++
 linux-user/hexagon/target_mman.h     | 10 ++++++++++
 linux-user/hppa/target_mman.h        |  3 +++
 linux-user/i386/target_mman.h        | 13 +++++++++++++
 linux-user/loongarch64/target_mman.h |  8 ++++++++
 linux-user/m68k/target_mman.h        |  3 +++
 linux-user/microblaze/target_mman.h  |  8 ++++++++
 linux-user/mips/target_mman.h        |  7 +++++++
 linux-user/nios2/target_mman.h       |  7 +++++++
 linux-user/openrisc/target_mman.h    |  7 +++++++
 linux-user/ppc/target_mman.h         | 13 +++++++++++++
 linux-user/riscv/target_mman.h       |  7 +++++++
 linux-user/s390x/target_mman.h       | 10 ++++++++++
 linux-user/sh4/target_mman.h         |  4 ++++
 linux-user/sparc/target_mman.h       | 14 ++++++++++++++
 linux-user/user-mmap.h               | 14 --------------
 linux-user/x86_64/target_mman.h      | 12 ++++++++++++
 linux-user/xtensa/target_mman.h      |  6 ++++++
 21 files changed, 167 insertions(+), 14 deletions(-)

diff --git a/linux-user/aarch64/target_mman.h b/linux-user/aarch64/target_mman.h
index f721295fe1..4d3eecfb26 100644
--- a/linux-user/aarch64/target_mman.h
+++ b/linux-user/aarch64/target_mman.h
@@ -4,6 +4,16 @@
 #define TARGET_PROT_BTI         0x10
 #define TARGET_PROT_MTE         0x20
 
+/*
+ * arch/arm64/include/asm/processor.h:
+ *
+ * TASK_UNMAPPED_BASE     DEFAULT_MAP_WINDOW / 4
+ * DEFAULT_MAP_WINDOW     DEFAULT_MAP_WINDOW_64
+ * DEFAULT_MAP_WINDOW_64  UL(1) << VA_BITS_MIN
+ * VA_BITS_MIN            48 (unless explicitly configured smaller)
+ */
+#define TASK_UNMAPPED_BASE      (1ull << (48 - 2))
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/alpha/target_mman.h b/linux-user/alpha/target_mman.h
index 6bb03e7336..c90b493711 100644
--- a/linux-user/alpha/target_mman.h
+++ b/linux-user/alpha/target_mman.h
@@ -20,6 +20,14 @@
 #define TARGET_MS_SYNC 2
 #define TARGET_MS_INVALIDATE 4
 
+/*
+ * arch/alpha/include/asm/processor.h:
+ *
+ * TASK_UNMAPPED_BASE           TASK_SIZE / 2
+ * TASK_SIZE                    0x40000000000UL
+ */
+#define TASK_UNMAPPED_BASE      0x20000000000ull
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/arm/target_mman.h b/linux-user/arm/target_mman.h
index e7ba6070fe..76275b2c7e 100644
--- a/linux-user/arm/target_mman.h
+++ b/linux-user/arm/target_mman.h
@@ -1 +1,9 @@
+/*
+ * arch/arm/include/asm/memory.h
+ * TASK_UNMAPPED_BASE        ALIGN(TASK_SIZE / 3, SZ_16M)
+ * TASK_SIZE                 CONFIG_PAGE_OFFSET
+ * CONFIG_PAGE_OFFSET        0xC0000000 (default in Kconfig)
+ */
+#define TASK_UNMAPPED_BASE   0x40000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/cris/target_mman.h b/linux-user/cris/target_mman.h
index e7ba6070fe..9df7b1eda5 100644
--- a/linux-user/cris/target_mman.h
+++ b/linux-user/cris/target_mman.h
@@ -1 +1,10 @@
+/*
+ * arch/cris/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE      (PAGE_ALIGN(TASK_SIZE / 3))
+ *
+ * arch/cris/include/arch-v32/arch/processor.h
+ * TASK_SIZE               0xb0000000
+ */
+#define TASK_UNMAPPED_BASE TARGET_PAGE_ALIGN(0xb0000000 / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hexagon/target_mman.h b/linux-user/hexagon/target_mman.h
index e7ba6070fe..c5ae336e07 100644
--- a/linux-user/hexagon/target_mman.h
+++ b/linux-user/hexagon/target_mman.h
@@ -1 +1,11 @@
+/*
+ * arch/hexgon/include/asm/processor.h
+ * TASK_UNMAPPED_BASE        PAGE_ALIGN(TASK_SIZE / 3)
+ *
+ * arch/hexagon/include/asm/mem-layout.h
+ * TASK_SIZE                 PAGE_OFFSET
+ * PAGE_OFFSET               0xc0000000
+ */
+#define TASK_UNMAPPED_BASE   0x40000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hppa/target_mman.h b/linux-user/hppa/target_mman.h
index 97f87d042a..6459e7dbdd 100644
--- a/linux-user/hppa/target_mman.h
+++ b/linux-user/hppa/target_mman.h
@@ -24,6 +24,9 @@
 #define TARGET_MS_ASYNC 2
 #define TARGET_MS_INVALIDATE 4
 
+/* arch/parisc/include/asm/processor.h: DEFAULT_MAP_BASE32 */
+#define TASK_UNMAPPED_BASE      0x40000000
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/i386/target_mman.h b/linux-user/i386/target_mman.h
index e7ba6070fe..cc3382007f 100644
--- a/linux-user/i386/target_mman.h
+++ b/linux-user/i386/target_mman.h
@@ -1 +1,14 @@
+/*
+ * arch/x86/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         __TASK_UNMAPPED_BASE(TASK_SIZE_LOW)
+ * __TASK_UNMAPPED_BASE(S)    PAGE_ALIGN(S / 3)
+ *
+ * arch/x86/include/asm/page_32_types.h:
+ * TASK_SIZE_LOW              TASK_SIZE
+ * TASK_SIZE                  __PAGE_OFFSET
+ * __PAGE_OFFSET              CONFIG_PAGE_OFFSET
+ * CONFIG_PAGE_OFFSET         0xc0000000 (default in Kconfig)
+ */
+#define TASK_UNMAPPED_BASE    0x40000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/loongarch64/target_mman.h b/linux-user/loongarch64/target_mman.h
index e7ba6070fe..d70e44d44c 100644
--- a/linux-user/loongarch64/target_mman.h
+++ b/linux-user/loongarch64/target_mman.h
@@ -1 +1,9 @@
+/*
+ * arch/loongarch/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         PAGE_ALIGN(TASK_SIZE / 3)
+ * TASK_SIZE64                0x1UL << (... ? VA_BITS : ...)
+ */
+#define TASK_UNMAPPED_BASE \
+    TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/m68k/target_mman.h b/linux-user/m68k/target_mman.h
index e7ba6070fe..d3eceb663b 100644
--- a/linux-user/m68k/target_mman.h
+++ b/linux-user/m68k/target_mman.h
@@ -1 +1,4 @@
+/* arch/m68k/include/asm/processor.h */
+#define TASK_UNMAPPED_BASE      0xC0000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/microblaze/target_mman.h b/linux-user/microblaze/target_mman.h
index e7ba6070fe..ffee869db4 100644
--- a/linux-user/microblaze/target_mman.h
+++ b/linux-user/microblaze/target_mman.h
@@ -1 +1,9 @@
+/*
+ * arch/microblaze/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE           (TASK_SIZE / 8 * 3)
+ * TASK_SIZE                    CONFIG_KERNEL_START
+ * CONFIG_KERNEL_START          0xc0000000 (default in Kconfig)
+ */
+#define TASK_UNMAPPED_BASE      0x48000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/mips/target_mman.h b/linux-user/mips/target_mman.h
index e97694aa4e..fe1eec2d0b 100644
--- a/linux-user/mips/target_mman.h
+++ b/linux-user/mips/target_mman.h
@@ -14,6 +14,13 @@
 #define TARGET_MAP_STACK                0x40000
 #define TARGET_MAP_HUGETLB              0x80000
 
+/*
+ * arch/mips/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         PAGE_ALIGN(TASK_SIZE / 3)
+ */
+#define TASK_UNMAPPED_BASE \
+    TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/nios2/target_mman.h b/linux-user/nios2/target_mman.h
index e7ba6070fe..ce18f4f871 100644
--- a/linux-user/nios2/target_mman.h
+++ b/linux-user/nios2/target_mman.h
@@ -1 +1,8 @@
+/*
+ * arch/nios2/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         PAGE_ALIGN(TASK_SIZE / 3)
+ * TASK_SIZE                  0x7FFF0000UL
+ */
+#define TASK_UNMAPPED_BASE    TARGET_PAGE_ALIGN(0x7FFF0000 / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/openrisc/target_mman.h b/linux-user/openrisc/target_mman.h
index e7ba6070fe..f1aaad809d 100644
--- a/linux-user/openrisc/target_mman.h
+++ b/linux-user/openrisc/target_mman.h
@@ -1 +1,8 @@
+/*
+ * arch/openrisc/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE      (TASK_SIZE / 8 * 3)
+ * TASK_SIZE               (0x80000000UL)
+ */
+#define TASK_UNMAPPED_BASE      0x30000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/ppc/target_mman.h b/linux-user/ppc/target_mman.h
index 67cc218f2e..04f99c6077 100644
--- a/linux-user/ppc/target_mman.h
+++ b/linux-user/ppc/target_mman.h
@@ -4,6 +4,19 @@
 #define TARGET_MAP_NORESERVE            0x40
 #define TARGET_MAP_LOCKED               0x80
 
+/*
+ * arch/powerpc/include/asm/task_size_64.h
+ * TASK_UNMAPPED_BASE_USER32    (PAGE_ALIGN(TASK_SIZE_USER32 / 4))
+ * TASK_UNMAPPED_BASE_USER64    (PAGE_ALIGN(DEFAULT_MAP_WINDOW_USER64 / 4))
+ * TASK_SIZE_USER32             (0x0000000100000000UL - (1 * PAGE_SIZE))
+ * DEFAULT_MAP_WINDOW_USER64    TASK_SIZE_64TB (with 4k pages)
+ */
+#ifdef TARGET_PPC64
+#define TASK_UNMAPPED_BASE      0x0000100000000000ull
+#else
+#define TASK_UNMAPPED_BASE      0x40000000
+#endif
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/riscv/target_mman.h b/linux-user/riscv/target_mman.h
index e7ba6070fe..0f06dadbd4 100644
--- a/linux-user/riscv/target_mman.h
+++ b/linux-user/riscv/target_mman.h
@@ -1 +1,8 @@
+/*
+ * arch/loongarch/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         PAGE_ALIGN(TASK_SIZE / 3)
+ */
+#define TASK_UNMAPPED_BASE \
+    TARGET_PAGE_ALIGN((1ull << (TARGET_VIRT_ADDR_SPACE_BITS - 1)) / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/s390x/target_mman.h b/linux-user/s390x/target_mman.h
index e7ba6070fe..40d149b329 100644
--- a/linux-user/s390x/target_mman.h
+++ b/linux-user/s390x/target_mman.h
@@ -1 +1,11 @@
+/*
+ * arch/s390/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE           (... : (_REGION2_SIZE >> 1))
+ *
+ * arch/s390/include/asm/pgtable.h:
+ * _REGION2_SIZE                (1UL << _REGION2_SHIFT)
+ * _REGION2_SHIFT               42
+ */
+#define TASK_UNMAPPED_BASE      (1ull << 41)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/sh4/target_mman.h b/linux-user/sh4/target_mman.h
index e7ba6070fe..bbbc223398 100644
--- a/linux-user/sh4/target_mman.h
+++ b/linux-user/sh4/target_mman.h
@@ -1 +1,5 @@
+/* arch/sh/include/asm/processor_32.h */
+#define TASK_UNMAPPED_BASE \
+    TARGET_PAGE_ALIGN((1u << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/sparc/target_mman.h b/linux-user/sparc/target_mman.h
index 9bad99c852..692ebf9dd7 100644
--- a/linux-user/sparc/target_mman.h
+++ b/linux-user/sparc/target_mman.h
@@ -5,6 +5,20 @@
 #define TARGET_MAP_LOCKED              0x100
 #define TARGET_MAP_GROWSDOWN           0x0200
 
+/*
+ * arch/sparc/include/asm/page_64.h:
+ * TASK_UNMAPPED_BASE      (test_thread_flag(TIF_32BIT) ? \
+ *                          _AC(0x0000000070000000,UL) : \
+ *                          VA_EXCLUDE_END)
+ * But VA_EXCLUDE_END is > 0xffff800000000000UL which doesn't work
+ * in userland emulation.
+ */
+#ifdef TARGET_ABI32
+#define TASK_UNMAPPED_BASE      0x70000000
+#else
+#define TASK_UNMAPPED_BASE      (1ull << (TARGET_VIRT_ADDR_SPACE_BITS - 2))
+#endif
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index fd456e024e..bae49059e0 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -18,20 +18,6 @@
 #ifndef LINUX_USER_USER_MMAP_H
 #define LINUX_USER_USER_MMAP_H
 
-#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
-#ifdef TARGET_AARCH64
-# define TASK_UNMAPPED_BASE  0x5500000000
-#else
-# define TASK_UNMAPPED_BASE  (1ul << 38)
-#endif
-#else
-#ifdef TARGET_HPPA
-# define TASK_UNMAPPED_BASE  0xfa000000
-#else
-# define TASK_UNMAPPED_BASE  0x40000000
-#endif
-#endif
-
 extern abi_ulong task_unmapped_base;
 extern abi_ulong mmap_next_start;
 
diff --git a/linux-user/x86_64/target_mman.h b/linux-user/x86_64/target_mman.h
index e7ba6070fe..f9ff652b37 100644
--- a/linux-user/x86_64/target_mman.h
+++ b/linux-user/x86_64/target_mman.h
@@ -1 +1,13 @@
+/*
+ * arch/x86/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE         __TASK_UNMAPPED_BASE(TASK_SIZE_LOW)
+ * __TASK_UNMAPPED_BASE(S)    PAGE_ALIGN(S / 3)
+ *
+ * arch/x86/include/asm/page_64_types.h:
+ * TASK_SIZE_LOW              DEFAULT_MAP_WINDOW
+ * DEFAULT_MAP_WINDOW         ((1UL << 47) - PAGE_SIZE)
+ */
+#define TASK_UNMAPPED_BASE \
+    TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/xtensa/target_mman.h b/linux-user/xtensa/target_mman.h
index 3933771b5b..c4f671adb7 100644
--- a/linux-user/xtensa/target_mman.h
+++ b/linux-user/xtensa/target_mman.h
@@ -14,6 +14,12 @@
 #define TARGET_MAP_STACK                0x40000
 #define TARGET_MAP_HUGETLB              0x80000
 
+/*
+ * arch/xtensa/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE           (TASK_SIZE / 2)
+ */
+#define TASK_UNMAPPED_BASE      (1u << (TARGET_VIRT_ADDR_SPACE_BITS - 1))
+
 #include "../generic/target_mman.h"
 
 #endif
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 11/14] linux-user: Add ELF_ET_DYN_BASE
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (9 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 10/14] linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h Richard Henderson
@ 2023-08-03  1:52 ` Richard Henderson
  2023-08-03  1:53 ` [PATCH v7 12/14] linux-user: Use elf_et_dyn_base for ET_DYN with interpreter Richard Henderson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

Copy each guest kernel's default value, then bound it
against reserved_va or the host address space.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/aarch64/target_mman.h     |  3 +++
 linux-user/alpha/target_mman.h       |  3 +++
 linux-user/arm/target_mman.h         |  3 +++
 linux-user/cris/target_mman.h        |  3 +++
 linux-user/hexagon/target_mman.h     |  3 +++
 linux-user/hppa/target_mman.h        |  3 +++
 linux-user/i386/target_mman.h        |  3 +++
 linux-user/loongarch64/target_mman.h |  3 +++
 linux-user/m68k/target_mman.h        |  2 ++
 linux-user/microblaze/target_mman.h  |  3 +++
 linux-user/mips/target_mman.h        |  3 +++
 linux-user/nios2/target_mman.h       |  3 +++
 linux-user/openrisc/target_mman.h    |  3 +++
 linux-user/ppc/target_mman.h         |  7 +++++++
 linux-user/riscv/target_mman.h       |  3 +++
 linux-user/s390x/target_mman.h       | 10 ++++++++++
 linux-user/sh4/target_mman.h         |  3 +++
 linux-user/sparc/target_mman.h       | 11 +++++++++++
 linux-user/user-mmap.h               |  1 +
 linux-user/x86_64/target_mman.h      |  3 +++
 linux-user/xtensa/target_mman.h      |  4 ++++
 linux-user/main.c                    | 15 +++++++++++++++
 linux-user/mmap.c                    |  1 +
 23 files changed, 96 insertions(+)

diff --git a/linux-user/aarch64/target_mman.h b/linux-user/aarch64/target_mman.h
index 4d3eecfb26..69ec5d5739 100644
--- a/linux-user/aarch64/target_mman.h
+++ b/linux-user/aarch64/target_mman.h
@@ -14,6 +14,9 @@
  */
 #define TASK_UNMAPPED_BASE      (1ull << (48 - 2))
 
+/* arch/arm64/include/asm/elf.h */
+#define ELF_ET_DYN_BASE         TARGET_PAGE_ALIGN((1ull << 48) / 3 * 2)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/alpha/target_mman.h b/linux-user/alpha/target_mman.h
index c90b493711..8edfe2b88c 100644
--- a/linux-user/alpha/target_mman.h
+++ b/linux-user/alpha/target_mman.h
@@ -28,6 +28,9 @@
  */
 #define TASK_UNMAPPED_BASE      0x20000000000ull
 
+/* arch/alpha/include/asm/elf.h */
+#define ELF_ET_DYN_BASE         (TASK_UNMAPPED_BASE + 0x1000000)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/arm/target_mman.h b/linux-user/arm/target_mman.h
index 76275b2c7e..51005da869 100644
--- a/linux-user/arm/target_mman.h
+++ b/linux-user/arm/target_mman.h
@@ -6,4 +6,7 @@
  */
 #define TASK_UNMAPPED_BASE   0x40000000
 
+/* arch/arm/include/asm/elf.h */
+#define ELF_ET_DYN_BASE      0x00400000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/cris/target_mman.h b/linux-user/cris/target_mman.h
index 9df7b1eda5..9ace8ac292 100644
--- a/linux-user/cris/target_mman.h
+++ b/linux-user/cris/target_mman.h
@@ -7,4 +7,7 @@
  */
 #define TASK_UNMAPPED_BASE TARGET_PAGE_ALIGN(0xb0000000 / 3)
 
+/* arch/cris/include/uapi/asm/elf.h */
+#define ELF_ET_DYN_BASE    (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hexagon/target_mman.h b/linux-user/hexagon/target_mman.h
index c5ae336e07..e6b5e2ca36 100644
--- a/linux-user/hexagon/target_mman.h
+++ b/linux-user/hexagon/target_mman.h
@@ -8,4 +8,7 @@
  */
 #define TASK_UNMAPPED_BASE   0x40000000
 
+/* arch/hexagon/include/asm/elf.h */
+#define ELF_ET_DYN_BASE      0x08000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hppa/target_mman.h b/linux-user/hppa/target_mman.h
index 6459e7dbdd..ccda46e842 100644
--- a/linux-user/hppa/target_mman.h
+++ b/linux-user/hppa/target_mman.h
@@ -27,6 +27,9 @@
 /* arch/parisc/include/asm/processor.h: DEFAULT_MAP_BASE32 */
 #define TASK_UNMAPPED_BASE      0x40000000
 
+/* arch/parisc/include/asm/elf.h */
+#define ELF_ET_DYN_BASE         (TASK_UNMAPPED_BASE + 0x01000000)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/i386/target_mman.h b/linux-user/i386/target_mman.h
index cc3382007f..e3b8e1eaa6 100644
--- a/linux-user/i386/target_mman.h
+++ b/linux-user/i386/target_mman.h
@@ -11,4 +11,7 @@
  */
 #define TASK_UNMAPPED_BASE    0x40000000
 
+/* arch/x86/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       0x00400000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/loongarch64/target_mman.h b/linux-user/loongarch64/target_mman.h
index d70e44d44c..8c2a3d5596 100644
--- a/linux-user/loongarch64/target_mman.h
+++ b/linux-user/loongarch64/target_mman.h
@@ -6,4 +6,7 @@
 #define TASK_UNMAPPED_BASE \
     TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
 
+/* arch/loongarch/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/m68k/target_mman.h b/linux-user/m68k/target_mman.h
index d3eceb663b..20cfe750c5 100644
--- a/linux-user/m68k/target_mman.h
+++ b/linux-user/m68k/target_mman.h
@@ -1,4 +1,6 @@
 /* arch/m68k/include/asm/processor.h */
 #define TASK_UNMAPPED_BASE      0xC0000000
+/* arch/m68k/include/asm/elf.h */
+#define ELF_ET_DYN_BASE         0xD0000000
 
 #include "../generic/target_mman.h"
diff --git a/linux-user/microblaze/target_mman.h b/linux-user/microblaze/target_mman.h
index ffee869db4..6b3dd54f89 100644
--- a/linux-user/microblaze/target_mman.h
+++ b/linux-user/microblaze/target_mman.h
@@ -6,4 +6,7 @@
  */
 #define TASK_UNMAPPED_BASE      0x48000000
 
+/* arch/microblaze/include/uapi/asm/elf.h */
+#define ELF_ET_DYN_BASE         0x08000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/mips/target_mman.h b/linux-user/mips/target_mman.h
index fe1eec2d0b..b84fe1e8a8 100644
--- a/linux-user/mips/target_mman.h
+++ b/linux-user/mips/target_mman.h
@@ -21,6 +21,9 @@
 #define TASK_UNMAPPED_BASE \
     TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
 
+/* arch/mips/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/nios2/target_mman.h b/linux-user/nios2/target_mman.h
index ce18f4f871..ab16ad4f03 100644
--- a/linux-user/nios2/target_mman.h
+++ b/linux-user/nios2/target_mman.h
@@ -5,4 +5,7 @@
  */
 #define TASK_UNMAPPED_BASE    TARGET_PAGE_ALIGN(0x7FFF0000 / 3)
 
+/* arch/nios2/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       0xD0000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/openrisc/target_mman.h b/linux-user/openrisc/target_mman.h
index f1aaad809d..243c1d5f26 100644
--- a/linux-user/openrisc/target_mman.h
+++ b/linux-user/openrisc/target_mman.h
@@ -5,4 +5,7 @@
  */
 #define TASK_UNMAPPED_BASE      0x30000000
 
+/* arch/openrisc/include/asm/elf.h */
+#define ELF_ET_DYN_BASE         0x08000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/ppc/target_mman.h b/linux-user/ppc/target_mman.h
index 04f99c6077..646d1ccae7 100644
--- a/linux-user/ppc/target_mman.h
+++ b/linux-user/ppc/target_mman.h
@@ -17,6 +17,13 @@
 #define TASK_UNMAPPED_BASE      0x40000000
 #endif
 
+/* arch/powerpc/include/asm/elf.h */
+#ifdef TARGET_PPC64
+#define ELF_ET_DYN_BASE         0x100000000ull
+#else
+#define ELF_ET_DYN_BASE         0x000400000
+#endif
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/riscv/target_mman.h b/linux-user/riscv/target_mman.h
index 0f06dadbd4..3049bcc67d 100644
--- a/linux-user/riscv/target_mman.h
+++ b/linux-user/riscv/target_mman.h
@@ -5,4 +5,7 @@
 #define TASK_UNMAPPED_BASE \
     TARGET_PAGE_ALIGN((1ull << (TARGET_VIRT_ADDR_SPACE_BITS - 1)) / 3)
 
+/* arch/riscv/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/s390x/target_mman.h b/linux-user/s390x/target_mman.h
index 40d149b329..c82435e381 100644
--- a/linux-user/s390x/target_mman.h
+++ b/linux-user/s390x/target_mman.h
@@ -8,4 +8,14 @@
  */
 #define TASK_UNMAPPED_BASE      (1ull << 41)
 
+/*
+ * arch/s390/include/asm/elf.h:
+ * ELF_ET_DYN_BASE              (STACK_TOP / 3 * 2) & ~((1UL << 32) - 1)
+ *
+ * arch/s390/include/asm/processor.h:
+ * STACK_TOP                    VDSO_LIMIT - VDSO_SIZE - PAGE_SIZE
+ * VDSO_LIMIT                   _REGION2_SIZE
+ */
+#define ELF_ET_DYN_BASE         (((1ull << 42) / 3 * 2) & ~0xffffffffull)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/sh4/target_mman.h b/linux-user/sh4/target_mman.h
index bbbc223398..dd9016081e 100644
--- a/linux-user/sh4/target_mman.h
+++ b/linux-user/sh4/target_mman.h
@@ -2,4 +2,7 @@
 #define TASK_UNMAPPED_BASE \
     TARGET_PAGE_ALIGN((1u << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
 
+/* arch/sh/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/sparc/target_mman.h b/linux-user/sparc/target_mman.h
index 692ebf9dd7..696ca73fe4 100644
--- a/linux-user/sparc/target_mman.h
+++ b/linux-user/sparc/target_mman.h
@@ -19,6 +19,17 @@
 #define TASK_UNMAPPED_BASE      (1ull << (TARGET_VIRT_ADDR_SPACE_BITS - 2))
 #endif
 
+/*
+ * arch/sparc/include/asm/elf_64.h
+ * Except that COMPAT_ELF_ET_DYN_BASE exactly matches TASK_UNMAPPED_BASE,
+ * so move it up a bit.
+ */
+#ifdef TARGET_ABI32
+#define ELF_ET_DYN_BASE         0x78000000
+#else
+#define ELF_ET_DYN_BASE         0x0000010000000000ull
+#endif
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index bae49059e0..5dd48a458d 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -20,6 +20,7 @@
 
 extern abi_ulong task_unmapped_base;
 extern abi_ulong mmap_next_start;
+extern abi_ulong elf_et_dyn_base;
 
 int target_mprotect(abi_ulong start, abi_ulong len, int prot);
 abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
diff --git a/linux-user/x86_64/target_mman.h b/linux-user/x86_64/target_mman.h
index f9ff652b37..48fbf20b42 100644
--- a/linux-user/x86_64/target_mman.h
+++ b/linux-user/x86_64/target_mman.h
@@ -10,4 +10,7 @@
 #define TASK_UNMAPPED_BASE \
     TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
 
+/* arch/x86/include/asm/elf.h */
+#define ELF_ET_DYN_BASE       (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/xtensa/target_mman.h b/linux-user/xtensa/target_mman.h
index c4f671adb7..8fa6337a97 100644
--- a/linux-user/xtensa/target_mman.h
+++ b/linux-user/xtensa/target_mman.h
@@ -20,6 +20,10 @@
  */
 #define TASK_UNMAPPED_BASE      (1u << (TARGET_VIRT_ADDR_SPACE_BITS - 1))
 
+/* arch/xtensa/include/asm/elf.h */
+#define ELF_ET_DYN_BASE \
+    TARGET_PAGE_ALIGN((1u << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/main.c b/linux-user/main.c
index c207b783d5..2b6f3f22c7 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -845,6 +845,21 @@ int main(int argc, char **argv, char **envp)
     }
     mmap_next_start = task_unmapped_base;
 
+    /* Similarly for elf_et_dyn_base. */
+    if (reserved_va) {
+        if (LESS(ELF_ET_DYN_BASE, reserved_va)) {
+            elf_et_dyn_base = ELF_ET_DYN_BASE;
+        } else {
+            /* The most common default formula is TASK_SIZE / 3 * 2. */
+            elf_et_dyn_base = TARGET_PAGE_ALIGN(reserved_va / 3) * 2;
+        }
+    } else if (LESS(ELF_ET_DYN_BASE, UINTPTR_MAX)) {
+        elf_et_dyn_base = ELF_ET_DYN_BASE;
+    } else {
+        /* 32-bit host: pick something medium size. */
+        elf_et_dyn_base = 0x18000000;
+    }
+
 #undef LESS
 
     {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 84436d45c8..949c4090f3 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -301,6 +301,7 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
 
 abi_ulong task_unmapped_base;
 abi_ulong mmap_next_start;
+abi_ulong elf_et_dyn_base;
 
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 12/14] linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (10 preceding siblings ...)
  2023-08-03  1:52 ` [PATCH v7 11/14] linux-user: Add ELF_ET_DYN_BASE Richard Henderson
@ 2023-08-03  1:53 ` Richard Henderson
  2023-08-03  1:53 ` [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable Richard Henderson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

Follow the lead of the linux kernel in fs/binfmt_elf.c,
in which an ET_DYN executable which uses an interpreter
(usually a PIE executable) is loaded away from where the
interpreter itself will be loaded.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 43 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 2aee2298ec..fef9a0bc8f 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3021,7 +3021,7 @@ static void load_elf_image(const char *image_name, int image_fd,
     struct elfhdr *ehdr = (struct elfhdr *)bprm_buf;
     struct elf_phdr *phdr;
     abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
-    int i, retval, prot_exec;
+    int i, retval, prot_exec, load_map_flags;
     Error *err = NULL;
 
     /* First of all, some simple consistency checks */
@@ -3106,6 +3106,19 @@ static void load_elf_image(const char *image_name, int image_fd,
         }
     }
 
+    load_addr = loaddr;
+
+    /*
+     * For ET_EXEC, load_addr is required.  Use MAP_FIXED_NOREPLACE instead
+     * of MAP_FIXED on the off-chance that the guest address overlaps a
+     * host address.  There should be no other conflict this early in the
+     * loading process.
+     */
+    if (ehdr->e_type == ET_EXEC) {
+        load_map_flags = MAP_FIXED_NOREPLACE;
+    } else {
+        load_map_flags = 0;
+    }
     if (pinterp_name != NULL) {
         /*
          * This is the main executable.
@@ -3135,11 +3148,34 @@ static void load_elf_image(const char *image_name, int image_fd,
              */
             probe_guest_base(image_name, loaddr, hiaddr);
         } else {
+            abi_ulong align;
+
             /*
              * The binary is dynamic, but we still need to
              * select guest_base.  In this case we pass a size.
              */
             probe_guest_base(image_name, 0, hiaddr - loaddr);
+
+            /*
+             * Avoid collision with the loader by providing a different
+             * default load address.
+             */
+            load_addr = loaddr + elf_et_dyn_base;
+
+            /*
+             * TODO: Better support for mmap alignment is desirable.
+             * Without reserved_va we would prefer any host conflict be
+             * resolved by choosing a different address, therefore we
+             * don't want to use MAP_FIXED.  But without that we cannot
+             * cannot guarantee alignment, only suggest it.
+             */
+            align = pow2ceil(info->alignment);
+            if (align) {
+                load_addr &= -align;
+            }
+            if (reserved_va) {
+                load_map_flags = MAP_FIXED_NOREPLACE;
+            }
         }
     }
 
@@ -3157,10 +3193,9 @@ static void load_elf_image(const char *image_name, int image_fd,
      * In both cases, we will overwrite pages in this range with mappings
      * from the executable.
      */
-    load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
+    load_addr = target_mmap(load_addr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
                             MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
-                            (ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
-                            -1, 0);
+                            load_map_flags, -1, 0);
     if (load_addr == -1) {
         goto exit_mmap;
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (11 preceding siblings ...)
  2023-08-03  1:53 ` [PATCH v7 12/14] linux-user: Use elf_et_dyn_base for ET_DYN with interpreter Richard Henderson
@ 2023-08-03  1:53 ` Richard Henderson
  2023-08-03 13:00   ` Helge Deller
  2023-08-03  1:53 ` [PATCH v7 14/14] linux-user: Properly set image_info.brk in flatload Richard Henderson
  2023-08-03 13:11 ` [PATCH v7 00/14] linux-user: brk fixes Joel Stanley
  14 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

From: Helge Deller <deller@gmx.de>

While we attempt to load a ET_DYN executable far away from
TASK_UNMAPPED_BASE, we are not completely in control of the
address space layout.  If the interpreter lands close to
the executable, leaving insufficient heap space, move brk.

Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Re-order after ELF_ET_DYN_BASE patch so that we do not
 "temporarily break" tsan, and also to minimize the changes required.
 Remove image_info.reserve_brk as unused.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/qemu.h    |  1 -
 linux-user/elfload.c | 51 +++++++++++++-------------------------------
 2 files changed, 15 insertions(+), 37 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 802794db63..4b0c9da0dc 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -31,7 +31,6 @@ struct image_info {
         abi_ulong       end_data;
         abi_ulong       start_brk;
         abi_ulong       brk;
-        abi_ulong       reserve_brk;
         abi_ulong       start_mmap;
         abi_ulong       start_stack;
         abi_ulong       stack_limit;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index fef9a0bc8f..bf747a15b5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3120,27 +3120,6 @@ static void load_elf_image(const char *image_name, int image_fd,
         load_map_flags = 0;
     }
     if (pinterp_name != NULL) {
-        /*
-         * This is the main executable.
-         *
-         * Reserve extra space for brk.
-         * We hold on to this space while placing the interpreter
-         * and the stack, lest they be placed immediately after
-         * the data segment and block allocation from the brk.
-         *
-         * 16MB is chosen as "large enough" without being so large as
-         * to allow the result to not fit with a 32-bit guest on a
-         * 32-bit host. However some 64 bit guests (e.g. s390x)
-         * attempt to place their heap further ahead and currently
-         * nothing stops them smashing into QEMUs address space.
-         */
-#if TARGET_LONG_BITS == 64
-        info->reserve_brk = 32 * MiB;
-#else
-        info->reserve_brk = 16 * MiB;
-#endif
-        hiaddr += info->reserve_brk;
-
         if (ehdr->e_type == ET_EXEC) {
             /*
              * Make sure that the low address does not conflict with
@@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
     info->end_code = 0;
     info->start_data = -1;
     info->end_data = 0;
-    info->brk = 0;
+    /* Usual start for brk is after all sections of the main executable. */
+    info->brk = TARGET_PAGE_ALIGN(hiaddr);
     info->elf_flags = ehdr->e_flags;
 
     prot_exec = PROT_EXEC;
@@ -3323,9 +3303,6 @@ static void load_elf_image(const char *image_name, int image_fd,
                     info->end_data = vaddr_ef;
                 }
             }
-            if (vaddr_em > info->brk) {
-                info->brk = vaddr_em;
-            }
 #ifdef TARGET_MIPS
         } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
             Mips_elf_abiflags_v0 abiflags;
@@ -3654,6 +3631,19 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
 
+        /*
+         * While unusual because of ELF_ET_DYN_BASE, if we are unlucky
+         * with the mappings the interpreter can be loaded above but
+         * near the main executable, which can leave very little room
+         * for the heap.
+         * If the current brk has less than 16MB, use the end of the
+         * interpreter.
+         */
+        if (interp_info.brk > info->brk &&
+            interp_info.load_bias - info->brk < 16 * MiB)  {
+            info->brk = interp_info.brk;
+        }
+
         /* If the program interpreter is one of these two, then assume
            an iBCS2 image.  Otherwise assume a native linux image.  */
 
@@ -3707,17 +3697,6 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
     bprm->core_dump = &elf_core_dump;
 #endif
 
-    /*
-     * If we reserved extra space for brk, release it now.
-     * The implementation of do_brk in syscalls.c expects to be able
-     * to mmap pages in this space.
-     */
-    if (info->reserve_brk) {
-        abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
-        abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk + info->reserve_brk);
-        target_munmap(start_brk, end_brk - start_brk);
-    }
-
     return 0;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v7 14/14] linux-user: Properly set image_info.brk in flatload
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (12 preceding siblings ...)
  2023-08-03  1:53 ` [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable Richard Henderson
@ 2023-08-03  1:53 ` Richard Henderson
  2023-08-03 13:11 ` [PATCH v7 00/14] linux-user: brk fixes Joel Stanley
  14 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2023-08-03  1:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller, laurent, akihiko.odaki, joel

The heap starts at "brk" not "start_brk".  With this fixed,
image_info.start_brk is unused and may be removed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/qemu.h     | 1 -
 linux-user/flatload.c | 2 +-
 linux-user/main.c     | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 4b0c9da0dc..4f8b55e2fb 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -29,7 +29,6 @@ struct image_info {
         abi_ulong       end_code;
         abi_ulong       start_data;
         abi_ulong       end_data;
-        abi_ulong       start_brk;
         abi_ulong       brk;
         abi_ulong       start_mmap;
         abi_ulong       start_stack;
diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index 5efec2630e..8f5e9f489b 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -811,7 +811,7 @@ int load_flt_binary(struct linux_binprm *bprm, struct image_info *info)
     info->end_code = libinfo[0].start_code + libinfo[0].text_len;
     info->start_data = libinfo[0].start_data;
     info->end_data = libinfo[0].end_data;
-    info->start_brk = libinfo[0].start_brk;
+    info->brk = libinfo[0].start_brk;
     info->start_stack = sp;
     info->stack_limit = libinfo[0].start_brk;
     info->entry = start_addr;
diff --git a/linux-user/main.c b/linux-user/main.c
index 2b6f3f22c7..c393a2ceb6 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -961,8 +961,6 @@ int main(int argc, char **argv, char **envp)
             fprintf(f, "page layout changed following binary load\n");
             page_dump(f);
 
-            fprintf(f, "start_brk   0x" TARGET_ABI_FMT_lx "\n",
-                    info->start_brk);
             fprintf(f, "end_code    0x" TARGET_ABI_FMT_lx "\n",
                     info->end_code);
             fprintf(f, "start_code  0x" TARGET_ABI_FMT_lx "\n",
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable
  2023-08-03  1:53 ` [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable Richard Henderson
@ 2023-08-03 13:00   ` Helge Deller
  0 siblings, 0 replies; 24+ messages in thread
From: Helge Deller @ 2023-08-03 13:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, akihiko.odaki, joel

Hi Richard,

Thanks for putting this all together!
I'll test asap.

I haven't checked yet, but Akihiko did send a revised v2 patch
series, while my v6 series included his older v1 patches.
We should consider his latest series...

One other thing below....

On 8/3/23 03:53, Richard Henderson wrote:
> From: Helge Deller <deller@gmx.de>
>
> While we attempt to load a ET_DYN executable far away from
> TASK_UNMAPPED_BASE, we are not completely in control of the
> address space layout.  If the interpreter lands close to
> the executable, leaving insufficient heap space, move brk.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
> [rth: Re-order after ELF_ET_DYN_BASE patch so that we do not
>   "temporarily break" tsan, and also to minimize the changes required.
>   Remove image_info.reserve_brk as unused.]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   linux-user/qemu.h    |  1 -
>   linux-user/elfload.c | 51 +++++++++++++-------------------------------
>   2 files changed, 15 insertions(+), 37 deletions(-)
>...
> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>       info->end_code = 0;
>       info->start_data = -1;
>       info->end_data = 0;
> -    info->brk = 0;
> +    /* Usual start for brk is after all sections of the main executable. */
> +    info->brk = TARGET_PAGE_ALIGN(hiaddr);

This is from my original patch, and is probably wrong.
I think this needs to be:
     info->brk = HOST_PAGE_ALIGN(hiaddr);

The brk page needs to be aligned to the host page size variable (which
is always >= target page size).
The page will be mapped +rw (on host), so may need the distance to code/shared
libs below it, and for that distance target-alignment may not be sufficient.

I think this fixes the problem which joel faced with armel static binary
on ppc64le.

Helge


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
                   ` (13 preceding siblings ...)
  2023-08-03  1:53 ` [PATCH v7 14/14] linux-user: Properly set image_info.brk in flatload Richard Henderson
@ 2023-08-03 13:11 ` Joel Stanley
  2023-08-03 13:55   ` Helge Deller
  14 siblings, 1 reply; 24+ messages in thread
From: Joel Stanley @ 2023-08-03 13:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, deller, laurent, akihiko.odaki

Hi Richard,

On Thu, 3 Aug 2023 at 01:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Builds on Helge's v6, incorporating my feedback plus
> some other minor cleanup.

This succeeds for the armhf static binary on ppc64le host that was
previously segfaulting.

However, the arm static binary on ppc64le host now segfaults:

$ gdb -q -ex r --args ./build/qemu-arm -d guest_errors,page,strace ~/hello
Reading symbols from ./build/qemu-arm...
Starting program: /scratch/joel/qemu/build/qemu-arm -d
guest_errors,page,strace /home/joel/hello
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff762ece0 (LWP 143553)]
host mmap_min_addr=0x10000
pgb_find_hole: base @ 140420000 for 4294967296 bytes
pgb_static: base @ 140420000 for 4294967295 bytes
pgb_reserved_va: base @ 0x140420000 for 4294967296 bytes
Locating guest address space @ 0x140420000
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 ---
00090000-0009b000 0000b000 ---
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-0009b000 0000b000 ---
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40810000 00810000 rw-
ffff0000-00000000 00010000 r-x
page layout changed following mmap
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40010000 00010000 ---
40010000-40811000 00801000 rw-
ffff0000-00000000 00010000 r-x
guest_base  0x140420000
page layout changed following binary load
start    end      size     prot
00010000-00090000 00080000 r-x
00090000-000a0000 00010000 rw-
40000000-40010000 00010000 ---
40010000-40810000 00800000 rw-
40810000-40811000 00001000 r-x
ffff0000-00000000 00010000 r-x
end_code    0x00084f7c
start_code  0x00010000
start_data  0x00095098
end_data    0x00098394
start_stack 0x4080f410
brk         0x0009b000
entry       0x00010418
argv_start  0x4080f414
env_start   0x4080f41c
auxv_start  0x4080f4a0
143551 brk(NULL) = 0x0009b000
143551 brk(0x0009b8fc) = 0x0009b000

Thread 1 "qemu-arm" received signal SIGSEGV, Segmentation fault.
0x00007fffeed9bb74 in code_gen_buffer ()
(gdb) bt
#0  0x00007fffeed9bb74 in code_gen_buffer ()
#1  0x0000000100169fdc in cpu_tb_exec (cpu=cpu@entry=0x1003d4a90,
    itb=itb@entry=0x7fffeed9ba60 <code_gen_buffer+47512>,
tb_exit=tb_exit@entry=0x7fffffffe51c)
    at ../accel/tcg/cpu-exec.c:457
#2  0x000000010016a704 in cpu_loop_exec_tb (tb_exit=0x7fffffffe51c,
last_tb=<synthetic pointer>,
    pc=<optimised out>, tb=0x7fffeed9ba60 <code_gen_buffer+47512>,
cpu=<optimised out>)
    at ../accel/tcg/cpu-exec.c:919
#3  cpu_exec_loop (cpu=cpu@entry=0x1003d4a90, sc=<optimised out>) at
../accel/tcg/cpu-exec.c:1040
#4  0x000000010016abac in cpu_exec_setjmp (cpu=cpu@entry=0x1003d4a90,
sc=<optimised out>)
    at ../accel/tcg/cpu-exec.c:1057
#5  0x000000010016b270 in cpu_exec (cpu=0x1003d4a90) at
../accel/tcg/cpu-exec.c:1083
#6  0x000000010004d7b0 in cpu_loop (env=0x1003d4fa0) at
../linux-user/arm/cpu_loop.c:328
#7  0x0000000100047548 in main (argc=<optimised out>,
argv=0x7ffffffff188, envp=<optimised out>)
    at ../linux-user/main.c:1012
(gdb)


>
>
> r~
>
>
> Akihiko Odaki (6):
>   linux-user: Unset MAP_FIXED_NOREPLACE for host
>   linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
>   linux-user: Do not call get_errno() in do_brk()
>   linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
>   linux-user: Do nothing if too small brk is specified
>   linux-user: Do not align brk with host page size
>
> Helge Deller (1):
>   linux-user: Adjust initial brk when interpreter is close to executable
>
> Richard Henderson (7):
>   linux-user: Remove last_brk
>   bsd-user: Remove last_brk
>   linux-user: Adjust task_unmapped_base for reserved_va
>   linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
>   linux-user: Add ELF_ET_DYN_BASE
>   linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
>   linux-user: Properly set image_info.brk in flatload
>
>  bsd-user/qemu.h                      |  1 -
>  linux-user/aarch64/target_mman.h     | 13 ++++
>  linux-user/alpha/target_mman.h       | 11 ++++
>  linux-user/arm/target_mman.h         | 11 ++++
>  linux-user/cris/target_mman.h        | 12 ++++
>  linux-user/hexagon/target_mman.h     | 13 ++++
>  linux-user/hppa/target_mman.h        |  6 ++
>  linux-user/i386/target_mman.h        | 16 +++++
>  linux-user/loongarch64/target_mman.h | 11 ++++
>  linux-user/m68k/target_mman.h        |  5 ++
>  linux-user/microblaze/target_mman.h  | 11 ++++
>  linux-user/mips/target_mman.h        | 10 +++
>  linux-user/nios2/target_mman.h       | 10 +++
>  linux-user/openrisc/target_mman.h    | 10 +++
>  linux-user/ppc/target_mman.h         | 20 ++++++
>  linux-user/qemu.h                    |  2 -
>  linux-user/riscv/target_mman.h       | 10 +++
>  linux-user/s390x/target_mman.h       | 20 ++++++
>  linux-user/sh4/target_mman.h         |  7 +++
>  linux-user/sparc/target_mman.h       | 25 ++++++++
>  linux-user/user-mmap.h               |  6 +-
>  linux-user/x86_64/target_mman.h      | 15 +++++
>  linux-user/xtensa/target_mman.h      | 10 +++
>  bsd-user/mmap.c                      |  2 -
>  linux-user/elfload.c                 | 94 ++++++++++++++++------------
>  linux-user/flatload.c                |  2 +-
>  linux-user/main.c                    | 43 ++++++++++++-
>  linux-user/mmap.c                    | 68 ++++++++++++--------
>  linux-user/syscall.c                 | 69 +++++---------------
>  29 files changed, 401 insertions(+), 132 deletions(-)
>
> --
> 2.34.1
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 13:11 ` [PATCH v7 00/14] linux-user: brk fixes Joel Stanley
@ 2023-08-03 13:55   ` Helge Deller
  2023-08-03 14:17     ` Joel Stanley
  0 siblings, 1 reply; 24+ messages in thread
From: Helge Deller @ 2023-08-03 13:55 UTC (permalink / raw)
  To: Joel Stanley, Richard Henderson; +Cc: qemu-devel, laurent, akihiko.odaki

On 8/3/23 15:11, Joel Stanley wrote:
> Hi Richard,
>
> On Thu, 3 Aug 2023 at 01:53, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Builds on Helge's v6, incorporating my feedback plus
>> some other minor cleanup.
>
> This succeeds for the armhf static binary on ppc64le host that was
> previously segfaulting.
>
> However, the arm static binary on ppc64le host now segfaults:
>
> $ gdb -q -ex r --args ./build/qemu-arm -d guest_errors,page,strace ~/hello
> Reading symbols from ./build/qemu-arm...
> Starting program: /scratch/joel/qemu/build/qemu-arm -d
> guest_errors,page,strace /home/joel/hello
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
> [New Thread 0x7ffff762ece0 (LWP 143553)]
> host mmap_min_addr=0x10000
> pgb_find_hole: base @ 140420000 for 4294967296 bytes
> pgb_static: base @ 140420000 for 4294967295 bytes
> pgb_reserved_va: base @ 0x140420000 for 4294967296 bytes
> Locating guest address space @ 0x140420000
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 ---
> 00090000-0009b000 0000b000 ---
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-0009b000 0000b000 ---
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40810000 00810000 rw-
> ffff0000-00000000 00010000 r-x
> page layout changed following mmap
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40010000 00010000 ---
> 40010000-40811000 00801000 rw-
> ffff0000-00000000 00010000 r-x
> guest_base  0x140420000
> page layout changed following binary load
> start    end      size     prot
> 00010000-00090000 00080000 r-x
> 00090000-000a0000 00010000 rw-
> 40000000-40010000 00010000 ---
> 40010000-40810000 00800000 rw-
> 40810000-40811000 00001000 r-x
> ffff0000-00000000 00010000 r-x
> end_code    0x00084f7c
> start_code  0x00010000
> start_data  0x00095098
> end_data    0x00098394
> start_stack 0x4080f410
> brk         0x0009b000
> entry       0x00010418
> argv_start  0x4080f414
> env_start   0x4080f41c
> auxv_start  0x4080f4a0
> 143551 brk(NULL) = 0x0009b000
> 143551 brk(0x0009b8fc) = 0x0009b000

I think the problem is the brk with 9b000 here.
It's not 64k aligned (=pages size of your ppc64le).

Please try with this patch on top of Richard's series:

> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>       info->end_code = 0;
>       info->start_data = -1;
>       info->end_data = 0;
> -    info->brk = .....
change that to become:
     info->brk = HOST_PAGE_ALIGN(hiaddr);

Helge


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 13:55   ` Helge Deller
@ 2023-08-03 14:17     ` Joel Stanley
  2023-08-03 15:01       ` Helge Deller
  0 siblings, 1 reply; 24+ messages in thread
From: Joel Stanley @ 2023-08-03 14:17 UTC (permalink / raw)
  To: Helge Deller; +Cc: Richard Henderson, qemu-devel, laurent, akihiko.odaki

On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
> > 143551 brk(NULL) = 0x0009b000
> > 143551 brk(0x0009b8fc) = 0x0009b000
>
> I think the problem is the brk with 9b000 here.
> It's not 64k aligned (=pages size of your ppc64le).
>
> Please try with this patch on top of Richard's series:
>
> > @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
> >       info->end_code = 0;
> >       info->start_data = -1;
> >       info->end_data = 0;
> > -    info->brk = .....
> change that to become:
>      info->brk = HOST_PAGE_ALIGN(hiaddr);

That stopped the crashing, and the binaries seem to run fine. I tested
on two hosts: ppc64le (64K) and arm64 (16K).

Cheers,

Joel


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 14:17     ` Joel Stanley
@ 2023-08-03 15:01       ` Helge Deller
  2023-08-03 15:11         ` Richard Henderson
  2023-08-03 15:20         ` Richard Henderson
  0 siblings, 2 replies; 24+ messages in thread
From: Helge Deller @ 2023-08-03 15:01 UTC (permalink / raw)
  To: Joel Stanley, Laurent Vivier, Richard Henderson, qemu-devel
  Cc: Helge Deller, akihiko.odaki

* Joel Stanley <joel@jms.id.au>:
> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
> > > 143551 brk(NULL) = 0x0009b000
> > > 143551 brk(0x0009b8fc) = 0x0009b000
> >
> > I think the problem is the brk with 9b000 here.
> > It's not 64k aligned (=pages size of your ppc64le).
> >
> > Please try with this patch on top of Richard's series:
> >
> > > @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
> > >       info->end_code = 0;
> > >       info->start_data = -1;
> > >       info->end_data = 0;
> > > -    info->brk = .....
> > change that to become:
> >      info->brk = HOST_PAGE_ALIGN(hiaddr);
>
> That stopped the crashing, and the binaries seem to run fine. I tested
> on two hosts: ppc64le (64K) and arm64 (16K).

Great!

That made re-read Akihiko's patch:
----
Author: Akihiko Odaki <akihiko.odaki@daynix.com>
    linux-user: Do not align brk with host page size

    do_brk() minimizes calls into target_mmap() by aligning the address
    with host page size, which is potentially larger than the target page
    size. However, the current implementation of this optimization has two
    bugs:

    - The start of brk is rounded up with the host page size while brk
      advertises an address aligned with the target page size as the
      beginning of brk. This makes the beginning of brk unmapped.
----
this patch has wrong assumptions.

The start of brk always needs to be host page aligned.
It's not an optimization, but a requirement, since brk needs to be
located on a host-aligned page which may get different permissions
than the page before it (where code from the binary may be located).

I wonder if we need that patch at all.


Joel, could you give the patch below on top of git head (no other
patches applied) a spin?
(I just tested it here locally on a full range of linux-user chroots)

I think this is ALL what's needed for git head to fix the static binary
issues, has a nice preparation for Richard's ELF_ET_DYN_BASE patches.

If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
series.

Helge



diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..88d9e4056e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3021,8 +3021,10 @@ static void load_elf_image(const char *image_name, int image_fd,
     struct elfhdr *ehdr = (struct elfhdr *)bprm_buf;
     struct elf_phdr *phdr;
     abi_ulong load_addr, load_bias, loaddr, hiaddr, error;
+    unsigned long load_offset = 0;
     int i, retval, prot_exec;
     Error *err = NULL;
+    bool is_main_executable;

     /* First of all, some simple consistency checks */
     if (!elf_check_ident(ehdr)) {
@@ -3106,28 +3108,8 @@ static void load_elf_image(const char *image_name, int image_fd,
         }
     }

-    if (pinterp_name != NULL) {
-        /*
-         * This is the main executable.
-         *
-         * Reserve extra space for brk.
-         * We hold on to this space while placing the interpreter
-         * and the stack, lest they be placed immediately after
-         * the data segment and block allocation from the brk.
-         *
-         * 16MB is chosen as "large enough" without being so large as
-         * to allow the result to not fit with a 32-bit guest on a
-         * 32-bit host. However some 64 bit guests (e.g. s390x)
-         * attempt to place their heap further ahead and currently
-         * nothing stops them smashing into QEMUs address space.
-         */
-#if TARGET_LONG_BITS == 64
-        info->reserve_brk = 32 * MiB;
-#else
-        info->reserve_brk = 16 * MiB;
-#endif
-        hiaddr += info->reserve_brk;
-
+    is_main_executable = (pinterp_name != NULL);
+    if (is_main_executable) {
         if (ehdr->e_type == ET_EXEC) {
             /*
              * Make sure that the low address does not conflict with
@@ -3136,10 +3118,11 @@ static void load_elf_image(const char *image_name, int image_fd,
             probe_guest_base(image_name, loaddr, hiaddr);
         } else {
             /*
-             * The binary is dynamic, but we still need to
+             * The binary is dynamic (pie-executabe), but we still need to
              * select guest_base.  In this case we pass a size.
              */
             probe_guest_base(image_name, 0, hiaddr - loaddr);
+            load_offset = 0 /* TODO: should be ELF_ET_DYN_BASE */;
         }
     }

@@ -3157,9 +3140,9 @@ static void load_elf_image(const char *image_name, int image_fd,
      * In both cases, we will overwrite pages in this range with mappings
      * from the executable.
      */
-    load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
+    load_addr = target_mmap(loaddr + load_offset, (size_t)hiaddr - loaddr + 1, PROT_NONE,
                             MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
-                            (ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
+                            (is_main_executable ? MAP_FIXED : 0),
                             -1, 0);
     if (load_addr == -1) {
         goto exit_mmap;
@@ -3194,7 +3177,8 @@ static void load_elf_image(const char *image_name, int image_fd,
     info->end_code = 0;
     info->start_data = -1;
     info->end_data = 0;
-    info->brk = 0;
+    /* possible start for brk is behind all sections of this ELF file. */
+    info->brk = HOST_PAGE_ALIGN(load_offset + hiaddr);
     info->elf_flags = ehdr->e_flags;

     prot_exec = PROT_EXEC;
@@ -3288,9 +3272,6 @@ static void load_elf_image(const char *image_name, int image_fd,
                     info->end_data = vaddr_ef;
                 }
             }
-            if (vaddr_em > info->brk) {
-                info->brk = vaddr_em;
-            }
 #ifdef TARGET_MIPS
         } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
             Mips_elf_abiflags_v0 abiflags;
@@ -3618,6 +3599,15 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+        /*
+	 * Use brk address of interpreter if it was loaded above the
+	 * executable and leaves less than 16 MB for heap.
+	 * This happens e.g. with static binaries on armhf.
+         */
+        if (interp_info.brk > info->brk &&
+            interp_info.load_bias - info->brk < 16 * MiB)  {
+            info->brk = interp_info.brk;
+        }

         /* If the program interpreter is one of these two, then assume
            an iBCS2 image.  Otherwise assume a native linux image.  */
@@ -3672,17 +3662,6 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
     bprm->core_dump = &elf_core_dump;
 #endif

-    /*
-     * If we reserved extra space for brk, release it now.
-     * The implementation of do_brk in syscalls.c expects to be able
-     * to mmap pages in this space.
-     */
-    if (info->reserve_brk) {
-        abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
-        abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
-        target_munmap(start_brk, end_brk - start_brk);
-    }
-
     return 0;
 }




^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 15:01       ` Helge Deller
@ 2023-08-03 15:11         ` Richard Henderson
  2023-08-03 16:09           ` Helge Deller
  2023-08-03 15:20         ` Richard Henderson
  1 sibling, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2023-08-03 15:11 UTC (permalink / raw)
  To: Helge Deller, Joel Stanley, Laurent Vivier, qemu-devel; +Cc: akihiko.odaki

On 8/3/23 08:01, Helge Deller wrote:
> * Joel Stanley <joel@jms.id.au>:
>> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
>>>> 143551 brk(NULL) = 0x0009b000
>>>> 143551 brk(0x0009b8fc) = 0x0009b000
>>>
>>> I think the problem is the brk with 9b000 here.
>>> It's not 64k aligned (=pages size of your ppc64le).
>>>
>>> Please try with this patch on top of Richard's series:
>>>
>>>> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>>>>        info->end_code = 0;
>>>>        info->start_data = -1;
>>>>        info->end_data = 0;
>>>> -    info->brk = .....
>>> change that to become:
>>>       info->brk = HOST_PAGE_ALIGN(hiaddr);
>>
>> That stopped the crashing, and the binaries seem to run fine. I tested
>> on two hosts: ppc64le (64K) and arm64 (16K).
> 
> Great!
> 
> That made re-read Akihiko's patch:
> ----
> Author: Akihiko Odaki <akihiko.odaki@daynix.com>
>      linux-user: Do not align brk with host page size
> 
>      do_brk() minimizes calls into target_mmap() by aligning the address
>      with host page size, which is potentially larger than the target page
>      size. However, the current implementation of this optimization has two
>      bugs:
> 
>      - The start of brk is rounded up with the host page size while brk
>        advertises an address aligned with the target page size as the
>        beginning of brk. This makes the beginning of brk unmapped.
> ----
> this patch has wrong assumptions.
> 
> The start of brk always needs to be host page aligned.


There is a bunch of code in target_mmap that attempts to manage adjacent guest pages that 
fall into the same host page.  Akihiko's patch assumes that code actually works.  Which I 
think is entirely reasonable.

You can't move brk up like this either (without other adjustments to the binary mapping), 
since that will leave a hole in the guest address space, which can get filled with 
something else later, which will definitely cause problems.


r~


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 15:01       ` Helge Deller
  2023-08-03 15:11         ` Richard Henderson
@ 2023-08-03 15:20         ` Richard Henderson
  2023-08-03 16:10           ` Helge Deller
  1 sibling, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2023-08-03 15:20 UTC (permalink / raw)
  To: Helge Deller, Joel Stanley, Laurent Vivier, qemu-devel; +Cc: akihiko.odaki

On 8/3/23 08:01, Helge Deller wrote:
> If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
> series.

The patch you gave below has no overlap with 1,2,4,5 at all.


r~


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 15:11         ` Richard Henderson
@ 2023-08-03 16:09           ` Helge Deller
  0 siblings, 0 replies; 24+ messages in thread
From: Helge Deller @ 2023-08-03 16:09 UTC (permalink / raw)
  To: Richard Henderson, Joel Stanley, Laurent Vivier, qemu-devel; +Cc: akihiko.odaki

On 8/3/23 17:11, Richard Henderson wrote:
> On 8/3/23 08:01, Helge Deller wrote:
>> * Joel Stanley <joel@jms.id.au>:
>>> On Thu, 3 Aug 2023 at 13:55, Helge Deller <deller@gmx.de> wrote:
>>>>> 143551 brk(NULL) = 0x0009b000
>>>>> 143551 brk(0x0009b8fc) = 0x0009b000
>>>>
>>>> I think the problem is the brk with 9b000 here.
>>>> It's not 64k aligned (=pages size of your ppc64le).
>>>>
>>>> Please try with this patch on top of Richard's series:
>>>>
>>>>> @@ -3229,7 +3208,8 @@ static void load_elf_image(const char *image_name, int image_fd,
>>>>>        info->end_code = 0;
>>>>>        info->start_data = -1;
>>>>>        info->end_data = 0;
>>>>> -    info->brk = .....
>>>> change that to become:
>>>>       info->brk = HOST_PAGE_ALIGN(hiaddr);
>>>
>>> That stopped the crashing, and the binaries seem to run fine. I tested
>>> on two hosts: ppc64le (64K) and arm64 (16K).
>>
>> Great!
>>
>> That made re-read Akihiko's patch:
>> ----
>> Author: Akihiko Odaki <akihiko.odaki@daynix.com>
>>      linux-user: Do not align brk with host page size
>>
>>      do_brk() minimizes calls into target_mmap() by aligning the address
>>      with host page size, which is potentially larger than the target page
>>      size. However, the current implementation of this optimization has two
>>      bugs:
>>
>>      - The start of brk is rounded up with the host page size while brk
>>        advertises an address aligned with the target page size as the
>>        beginning of brk. This makes the beginning of brk unmapped.
>> ----
>> this patch has wrong assumptions.
>>
>> The start of brk always needs to be host page aligned.
>
>
> There is a bunch of code in target_mmap that attempts to manage
> adjacent guest pages that fall into the same host page.  Akihiko's
> patch assumes that code actually works.  Which I think is entirely
> reasonable.

Ok.

> You can't move brk up like this either (without other adjustments to
> the binary mapping), since that will leave a hole in the guest
> address space, which can get filled with something else later, which
> will definitely cause problems.

Ah, true. I have to admit that I didn't thought of that yet.
What is a possible solution to increase brk then (if you agree
that it should be increased here if necessary) ?
Call target_mmap() on the area from current brk to new brk?

Helge


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v7 00/14] linux-user: brk fixes
  2023-08-03 15:20         ` Richard Henderson
@ 2023-08-03 16:10           ` Helge Deller
  0 siblings, 0 replies; 24+ messages in thread
From: Helge Deller @ 2023-08-03 16:10 UTC (permalink / raw)
  To: Richard Henderson, Joel Stanley, Laurent Vivier, qemu-devel; +Cc: akihiko.odaki

On 8/3/23 17:20, Richard Henderson wrote:
> On 8/3/23 08:01, Helge Deller wrote:
>> If it does, it replaces patches 1,2 & 4-6 from Richard's v7 patch
>> series.
>
> The patch you gave below has no overlap with 1,2,4,5 at all.

Yes, ignore this.... Your patch series is fine as-is....
(the brk()-host-page-alignment issue is the only one left I think)

Helge



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2023-08-03 16:10 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-03  1:52 [PATCH v7 00/14] linux-user: brk fixes Richard Henderson
2023-08-03  1:52 ` [PATCH v7 01/14] linux-user: Unset MAP_FIXED_NOREPLACE for host Richard Henderson
2023-08-03  1:52 ` [PATCH v7 02/14] linux-user: Fix MAP_FIXED_NOREPLACE on old kernels Richard Henderson
2023-08-03  1:52 ` [PATCH v7 03/14] linux-user: Do not call get_errno() in do_brk() Richard Henderson
2023-08-03  1:52 ` [PATCH v7 04/14] linux-user: Use MAP_FIXED_NOREPLACE for do_brk() Richard Henderson
2023-08-03  1:52 ` [PATCH v7 05/14] linux-user: Do nothing if too small brk is specified Richard Henderson
2023-08-03  1:52 ` [PATCH v7 06/14] linux-user: Do not align brk with host page size Richard Henderson
2023-08-03  1:52 ` [PATCH v7 07/14] linux-user: Remove last_brk Richard Henderson
2023-08-03  1:52 ` [PATCH v7 08/14] bsd-user: " Richard Henderson
2023-08-03  1:52 ` [PATCH v7 09/14] linux-user: Adjust task_unmapped_base for reserved_va Richard Henderson
2023-08-03  1:52 ` [PATCH v7 10/14] linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h Richard Henderson
2023-08-03  1:52 ` [PATCH v7 11/14] linux-user: Add ELF_ET_DYN_BASE Richard Henderson
2023-08-03  1:53 ` [PATCH v7 12/14] linux-user: Use elf_et_dyn_base for ET_DYN with interpreter Richard Henderson
2023-08-03  1:53 ` [PATCH v7 13/14] linux-user: Adjust initial brk when interpreter is close to executable Richard Henderson
2023-08-03 13:00   ` Helge Deller
2023-08-03  1:53 ` [PATCH v7 14/14] linux-user: Properly set image_info.brk in flatload Richard Henderson
2023-08-03 13:11 ` [PATCH v7 00/14] linux-user: brk fixes Joel Stanley
2023-08-03 13:55   ` Helge Deller
2023-08-03 14:17     ` Joel Stanley
2023-08-03 15:01       ` Helge Deller
2023-08-03 15:11         ` Richard Henderson
2023-08-03 16:09           ` Helge Deller
2023-08-03 15:20         ` Richard Henderson
2023-08-03 16:10           ` Helge Deller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.