* [PATCH] linux-user: ppoll: eliminate large alloca
@ 2022-12-16 19:22 Michael Tokarev
  2022-12-16 20:44 ` Richard Henderson
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Tokarev @ 2022-12-16 19:22 UTC (permalink / raw)
  To: qemu-devel, Laurent Vivier; +Cc: Michael Tokarev

do_ppoll() in linux-user/syscall.c uses alloca() to
allocate an array of struct pollfd on the stack.
The only upper bound on the number of entries is that
the whole array must fit in INT_MAX bytes, which is
far too much for a stack allocation.

Use heap allocation when a large number of entries
is requested (more than 128, an arbitrary threshold),
and keep using alloca() for smaller arrays so that
small requests avoid a heap allocation.
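
The hybrid strategy described above can be sketched outside QEMU as
follows. This is only an illustration under stated assumptions, not the
patch itself: count_valid_fds() and SMALL_NFDS are hypothetical names,
plain malloc()/free() stand in for g_try_new()/g_free(), and the loop
body is a placeholder for the real byte-swapping copy.

```c
#include <alloca.h>
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stdlib.h>

/* Assumption: the same arbitrary cutoff the patch uses. */
#define SMALL_NFDS 128

/*
 * Hypothetical stand-in for do_ppoll(): copy nfds descriptors into a
 * scratch pollfd array and return how many are non-negative.
 * Returns -1 if the request is too large or memory is exhausted.
 */
static int count_valid_fds(const int *fds, unsigned int nfds)
{
    struct pollfd *pfd;
    struct pollfd *heap_pfd = NULL;   /* stays NULL on the stack path */
    unsigned int i;
    int valid = 0;

    if (nfds == 0) {
        return 0;
    }
    assert(fds != NULL);
    if (nfds > INT_MAX / sizeof(struct pollfd)) {
        return -1;                    /* same style of bound as the patch */
    }

    if (nfds <= SMALL_NFDS) {
        /* Small request: bounded stack allocation. */
        pfd = alloca(sizeof(struct pollfd) * nfds);
    } else {
        /* Large request: heap allocation with a failure check. */
        heap_pfd = malloc(sizeof(struct pollfd) * nfds);
        if (!heap_pfd) {
            return -1;
        }
        pfd = heap_pfd;
    }

    for (i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
        if (pfd[i].fd >= 0) {
            valid++;
        }
    }

    free(heap_pfd);                   /* free(NULL) is a no-op */
    return valid;
}
```

Keeping a separate heap_pfd pointer means the cleanup code never has to
remember which path was taken: it is NULL whenever alloca() was used.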

This, together with the previous patch for getgroups(),
eliminates all large on-stack allocations from
linux-user/syscall.c. What is left are small ones.

While at it, also fix a missing unlock_user() in two
places, and consolidate the target_to_host_timespec*()
and host_to_target_timespec*() calls into single
time64 ? *_timespec64() : *_timespec() expressions.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
 linux-user/syscall.c | 50 ++++++++++++++++++++++----------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 24b25759be..b45690b10a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1558,14 +1558,12 @@ static abi_long do_pselect6(abi_long arg1, abi_long arg2, abi_long arg3,
 static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
                          abi_long arg4, abi_long arg5, bool ppoll, bool time64)
 {
-    struct target_pollfd *target_pfd;
+    struct target_pollfd *target_pfd = NULL;
     unsigned int nfds = arg2;
-    struct pollfd *pfd;
+    struct pollfd *pfd = NULL, *heap_pfd = NULL;
     unsigned int i;
     abi_long ret;
 
-    pfd = NULL;
-    target_pfd = NULL;
     if (nfds) {
         if (nfds > (INT_MAX / sizeof(struct target_pollfd))) {
             return -TARGET_EINVAL;
@@ -1576,7 +1574,16 @@ static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
             return -TARGET_EFAULT;
         }
 
-        pfd = alloca(sizeof(struct pollfd) * nfds);
+        if (nfds <= 128) {
+            pfd = alloca(sizeof(struct pollfd) * nfds);
+        } else {
+            heap_pfd = g_try_new(struct pollfd, nfds);
+            if (!heap_pfd) {
+                ret = -TARGET_ENOMEM;
+                goto out;
+            }
+            pfd = heap_pfd;
+        }
         for (i = 0; i < nfds; i++) {
             pfd[i].fd = tswap32(target_pfd[i].fd);
             pfd[i].events = tswap16(target_pfd[i].events);
@@ -1587,16 +1594,11 @@ static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
         sigset_t *set = NULL;
 
         if (arg3) {
-            if (time64) {
-                if (target_to_host_timespec64(timeout_ts, arg3)) {
-                    unlock_user(target_pfd, arg1, 0);
-                    return -TARGET_EFAULT;
-                }
-            } else {
-                if (target_to_host_timespec(timeout_ts, arg3)) {
-                    unlock_user(target_pfd, arg1, 0);
-                    return -TARGET_EFAULT;
-                }
+            if (time64
+                ? target_to_host_timespec64(timeout_ts, arg3)
+                : target_to_host_timespec(timeout_ts, arg3)) {
+                ret = -TARGET_EFAULT;
+                goto out;
             }
         } else {
             timeout_ts = NULL;
@@ -1605,8 +1607,7 @@ static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
         if (arg4) {
             ret = process_sigsuspend_mask(&set, arg4, arg5);
             if (ret != 0) {
-                unlock_user(target_pfd, arg1, 0);
-                return ret;
+                goto out;
             }
         }
 
@@ -1617,14 +1618,11 @@ static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
             finish_sigsuspend_mask(ret);
         }
         if (!is_error(ret) && arg3) {
-            if (time64) {
-                if (host_to_target_timespec64(arg3, timeout_ts)) {
-                    return -TARGET_EFAULT;
-                }
-            } else {
-                if (host_to_target_timespec(arg3, timeout_ts)) {
-                    return -TARGET_EFAULT;
-                }
+            if (time64
+                ? host_to_target_timespec64(arg3, timeout_ts)
+                : host_to_target_timespec(arg3, timeout_ts)) {
+                ret = -TARGET_EFAULT;
+                goto out;
             }
         }
     } else {
@@ -1647,6 +1645,8 @@ static abi_long do_ppoll(abi_long arg1, abi_long arg2, abi_long arg3,
             target_pfd[i].revents = tswap16(pfd[i].revents);
         }
     }
+out:
+    g_free(heap_pfd);
     unlock_user(target_pfd, arg1, sizeof(struct target_pollfd) * nfds);
     return ret;
 }
-- 
2.30.2




* Re: [PATCH] linux-user: ppoll: eliminate large alloca
  2022-12-16 19:22 [PATCH] linux-user: ppoll: eliminate large alloca Michael Tokarev
@ 2022-12-16 20:44 ` Richard Henderson
  2023-04-09 11:31   ` Michael Tokarev
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Henderson @ 2022-12-16 20:44 UTC (permalink / raw)
  To: Michael Tokarev, qemu-devel, Laurent Vivier

On 12/16/22 11:22, Michael Tokarev wrote:
> do_ppoll() in linux-user/syscall.c uses alloca() to
> allocate an array of struct pollfd on the stack.
> The only upper boundary for number of entries for this
> array is so that whole thing fits in INT_MAX. But this
> is definitely too much for a stack allocation.
> 
> Use heap allocation when large number of entries
> is requested (currently 128, arbitrary), and continue
> to use alloca() for smaller allocations, to optimize
> small operations for small sizes.

I think it would be cleaner to always use heap allocation, and use g_autofree for the pointer.
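
A minimal sketch of the always-heap variant suggested here, for
comparison. To stay self-contained it does not use GLib: the AUTOFREE
macro and free_ptr() helper are hypothetical stand-ins built on the
GCC/Clang cleanup attribute, which is the mechanism g_autofree relies
on, and calloc() stands in for g_try_new(). The function name and loop
body are likewise illustrative only.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical stand-in for g_autofree: free the pointer stored in
 * the variable when that variable goes out of scope. */
static void free_ptr(void *p)
{
    free(*(void **)p);
}
#define AUTOFREE __attribute__((cleanup(free_ptr)))

/*
 * Hypothetical stand-in for do_ppoll(): always allocates the scratch
 * array on the heap. Returns the number of non-negative descriptors,
 * or -1 if the allocation fails.
 */
static int count_valid_fds_heap(const int *fds, unsigned int nfds)
{
    AUTOFREE struct pollfd *pfd = NULL;
    unsigned int i;
    int valid = 0;

    if (nfds == 0) {
        return 0;
    }
    assert(fds != NULL);

    /* calloc() checks the nfds * size multiplication for overflow. */
    pfd = calloc(nfds, sizeof(*pfd));
    if (!pfd) {
        return -1;
    }

    for (i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        if (pfd[i].fd >= 0) {
            valid++;
        }
    }

    return valid;   /* pfd is released automatically on every return */
}
```

With a single allocation path there is no threshold to choose and no
second pointer to track, at the cost of one heap allocation per call.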


r~



* Re: [PATCH] linux-user: ppoll: eliminate large alloca
  2022-12-16 20:44 ` Richard Henderson
@ 2023-04-09 11:31   ` Michael Tokarev
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Tokarev @ 2023-04-09 11:31 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, Laurent Vivier

(Replying to an old(ish) email... original thread:
https://patchwork.ozlabs.org/project/qemu-devel/patch/20221216192220.2881898-1-mjt@msgid.tls.msk.ru/ )

16.12.2022 23:44, Richard Henderson wrote:
> On 12/16/22 11:22, Michael Tokarev wrote:
>> do_ppoll() in linux-user/syscall.c uses alloca() to
>> allocate an array of struct pollfd on the stack.
>> The only upper boundary for number of entries for this
>> array is so that whole thing fits in INT_MAX. But this
>> is definitely too much for a stack allocation.
>>
>> Use heap allocation when large number of entries
>> is requested (currently 128, arbitrary), and continue
>> to use alloca() for smaller allocations, to optimize
>> small operations for small sizes.
> 
> I think it would be cleaner to always use heap allocation, and use g_autofree for the pointer.

Yes, it is cleaner to always use the same type of allocation.
Is it really unnecessary to try to avoid heap allocations
for small things? It does not cost much, but might speed
some things up. I don't know how much it saves, though. Maybe
it belongs in the "premature optimization" category :)

Speaking of g_autofree, we already have to call unlock_user()
anyway (which we forgot to do), so it makes no difference
whether the pointer is marked g_autofree or freed explicitly.

Thanks,

/mjt

