qemu-devel.nongnu.org archive mirror
* [PULL 0/8] tcg + linux-user patch queue
@ 2024-01-21  0:20 Richard Henderson
  2024-01-21  0:20 ` [PULL 1/8] tcg: Remove unreachable code Richard Henderson
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel

The following changes since commit 3f2a357b95845ea0bf7463eff6661e43b97d1afc:

  Merge tag 'hw-cpus-20240119' of https://github.com/philmd/qemu into staging (2024-01-19 11:39:38 +0000)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20240121

for you to fetch changes up to 1d5e32e3198d2d8fd2342c8f7f8e0875aeff49c5:

  linux-user/elfload: check PR_GET_DUMPABLE before creating coredump (2024-01-21 10:25:07 +1100)

----------------------------------------------------------------
tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
tcg: Clean up error paths in alloc_code_gen_buffer_splitwx_memfd
linux-user/riscv: Adjust vdso signal frame cfa offsets
linux-user: Fixed cpu restore with pc 0 on SIGBUS

----------------------------------------------------------------
Richard Henderson (3):
      tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
      tests/tcg/s390x: Import linux tools/testing/crypto/chacha20-s390
      linux-user/riscv: Adjust vdso signal frame cfa offsets

Robbin Ehn (1):
      linux-user: Fixed cpu restore with pc 0 on SIGBUS

Samuel Tardieu (2):
      tcg: Remove unreachable code
      tcg: Make the cleanup-on-error path unique

Thomas Weißschuh (2):
      linux-user/elfload: test return value of getrlimit
      linux-user/elfload: check PR_GET_DUMPABLE before creating coredump

 linux-user/elfload.c            |  10 +-
 linux-user/signal.c             |   5 +-
 tcg/region.c                    |  10 +-
 tests/tcg/s390x/chacha.c        | 341 +++++++++++++++
 tcg/s390x/tcg-target.c.inc      |   6 +-
 linux-user/riscv/vdso-32.so     | Bin 2900 -> 2980 bytes
 linux-user/riscv/vdso-64.so     | Bin 3856 -> 3944 bytes
 linux-user/riscv/vdso.S         |   8 +-
 tests/tcg/s390x/Makefile.target |   4 +
 tests/tcg/s390x/chacha-vx.S     | 914 ++++++++++++++++++++++++++++++++++++++++
 10 files changed, 1281 insertions(+), 17 deletions(-)
 create mode 100644 tests/tcg/s390x/chacha.c
 create mode 100644 tests/tcg/s390x/chacha-vx.S



* [PULL 1/8] tcg: Remove unreachable code
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 2/8] tcg: Make the cleanup-on-error path unique Richard Henderson
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel; +Cc: Samuel Tardieu, Peter Maydell

From: Samuel Tardieu <sam@rfc1149.net>

The `fail_rx`/`fail` block is only entered while `buf_rx` is equal to
its initial value `MAP_FAILED`. The `munmap(buf_rx, size);` was never
executed.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2030
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20231219182212.455952-2-sam@rfc1149.net>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
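
For readers following the control flow, here is a minimal model of why the removed `munmap(buf_rx, size)` could never run. It is only a sketch: `fake_map`, `alloc_split` and `MODEL_MAP_FAILED` are hypothetical stand-ins, not QEMU or libc APIs, and the real function maps a memfd rather than static buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MAP_FAILED. */
#define MODEL_MAP_FAILED ((void *)-1)

static char backing_rw[16];
static char backing_rx[16];

/* Hypothetical stand-in for mmap(): succeeds only when 'ok' is true. */
static void *fake_map(bool ok, void *addr)
{
    return ok ? addr : MODEL_MAP_FAILED;
}

/* Models the shape of the error handling; returns 0 on success, -1 on error. */
int alloc_split(bool rw_ok, bool rx_ok)
{
    void *buf_rw = NULL;
    void *buf_rx = MODEL_MAP_FAILED;

    buf_rw = rw_ok ? (void *)backing_rw : NULL;
    if (buf_rw == NULL) {
        goto fail;              /* buf_rx still holds its initial value */
    }

    buf_rx = fake_map(rx_ok, backing_rx);
    if (buf_rx == MODEL_MAP_FAILED) {
        goto fail;              /* the mapping failed, nothing to unmap */
    }
    return 0;

fail:
    /* Every path to this label leaves buf_rx equal to MODEL_MAP_FAILED. */
    assert(buf_rx == MODEL_MAP_FAILED);
    if (buf_rw) {
        /* the real code calls munmap(buf_rw, size) here */
    }
    return -1;
}
```

Calling `alloc_split(false, true)` or `alloc_split(true, false)` reaches the `fail` label without tripping the assertion, which is the invariant the commit message relies on.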

diff --git a/tcg/region.c b/tcg/region.c
index 86692455c0..467e51cf6f 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -597,9 +597,7 @@ static int alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
  fail_rx:
     error_setg_errno(errp, errno, "failed to map shared memory for execute");
  fail:
-    if (buf_rx != MAP_FAILED) {
-        munmap(buf_rx, size);
-    }
+    /* buf_rx is always equal to MAP_FAILED here and does not require cleanup */
     if (buf_rw) {
         munmap(buf_rw, size);
     }
-- 
2.34.1




* [PULL 2/8] tcg: Make the cleanup-on-error path unique
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
  2024-01-21  0:20 ` [PULL 1/8] tcg: Remove unreachable code Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 3/8] linux-user: Fixed cpu restore with pc 0 on SIGBUS Richard Henderson
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel; +Cc: Samuel Tardieu, Peter Maydell

From: Samuel Tardieu <sam@rfc1149.net>

By calling `error_setg_errno()` before jumping to the cleanup-on-error
path at the `fail` label, the cleanup path is clearer.

Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20231219182212.455952-3-sam@rfc1149.net>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
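
The idea of reporting the error at the failure site and then jumping to one shared label can be sketched as follows. This is an illustration under assumptions, not QEMU code: `set_error`, `last_error` and `setup` are hypothetical names, with `last_error` standing in for the `Error **errp` mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error slot standing in for QEMU's Error **errp. */
static const char *last_error;

static void set_error(const char *msg)
{
    last_error = msg;
}

/* Returns 0 on success, -1 on failure; each failure site reports for itself. */
int setup(bool step1_ok, bool step2_ok)
{
    bool resource_held = false;

    last_error = NULL;
    if (!step1_ok) {
        set_error("step 1 failed");
        goto fail;
    }
    resource_held = true;

    if (!step2_ok) {
        set_error("step 2 failed");   /* report here, then share the cleanup */
        goto fail;
    }
    return 0;

fail:
    /* Single cleanup path: no error label falls through into another. */
    if (resource_held) {
        resource_held = false;        /* release whatever step 1 acquired */
    }
    return -1;
}
```

With only one label, a reader checking the cleanup code does not need to work out which label each `goto` targets or which labels fall through.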

diff --git a/tcg/region.c b/tcg/region.c
index 467e51cf6f..478ec051c4 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -584,7 +584,9 @@ static int alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 
     buf_rx = mmap(NULL, size, host_prot_read_exec(), MAP_SHARED, fd, 0);
     if (buf_rx == MAP_FAILED) {
-        goto fail_rx;
+        error_setg_errno(errp, errno,
+                         "failed to map shared memory for execute");
+        goto fail;
     }
 
     close(fd);
@@ -594,8 +596,6 @@ static int alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 
     return PROT_READ | PROT_WRITE;
 
- fail_rx:
-    error_setg_errno(errp, errno, "failed to map shared memory for execute");
  fail:
     /* buf_rx is always equal to MAP_FAILED here and does not require cleanup */
     if (buf_rw) {
-- 
2.34.1




* [PULL 3/8] linux-user: Fixed cpu restore with pc 0 on SIGBUS
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
  2024-01-21  0:20 ` [PULL 1/8] tcg: Remove unreachable code Richard Henderson
  2024-01-21  0:20 ` [PULL 2/8] tcg: Make the cleanup-on-error path unique Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 4/8] tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns Richard Henderson
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel; +Cc: Robbin Ehn, Palmer Dabbelt

From: Robbin Ehn <rehn@rivosinc.com>

Commit f4e1168198 (linux-user: Split out host_sig{segv,bus}_handler)
introduced a bug: when returning from host_sigbus_handler, the PC is
never set. Thus cpu_loop_exit_restore is called with a zero PC and
we immediately get a SIGSEGV.

Signed-off-by: Robbin Ehn <rehn@rivosinc.com>
Fixes: f4e1168198 ("linux-user: Split out host_sig{segv,bus}_handler")
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Message-Id: <33f27425878fb529b9e39ef22c303f6e0d90525f.camel@rivosinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/signal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
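
A small model of the bug may help: once the SIGBUS handling moved into a helper, the caller's `pc` stayed at zero unless the helper handed the value back. The names below (`model_ctx`, `bus_handler_void`, `bus_handler_ret`, `dispatch`) are hypothetical and only mimic the shape of the code; the real handler also deals with signal masks and guest addresses.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the host signal context. */
typedef struct {
    uintptr_t host_pc;
} model_ctx;

/* Shape before the fix: the helper computes pc, the caller never sees it. */
static void bus_handler_void(const model_ctx *uc)
{
    uintptr_t pc = uc->host_pc;
    (void)pc;
}

/* Shape after the fix: the helper hands pc back to the caller. */
static uintptr_t bus_handler_ret(const model_ctx *uc)
{
    return uc->host_pc;
}

/* Returns the pc the dispatcher would later use to restore CPU state. */
uintptr_t dispatch(int sig, const model_ctx *uc, bool fixed)
{
    uintptr_t pc = 0;

    switch (sig) {
    case SIGBUS:
        if (fixed) {
            pc = bus_handler_ret(uc);
        } else {
            bus_handler_void(uc);   /* pc stays 0 */
        }
        break;
    default:
        break;                      /* other signals are not modelled */
    }
    return pc;
}
```

In this model, `dispatch(SIGBUS, &uc, false)` yields 0 while `dispatch(SIGBUS, &uc, true)` yields the host PC, mirroring the one-line change at the call site.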

diff --git a/linux-user/signal.c b/linux-user/signal.c
index b35d1e512f..c9527adfa3 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -925,7 +925,7 @@ static void host_sigsegv_handler(CPUState *cpu, siginfo_t *info,
     cpu_loop_exit_sigsegv(cpu, guest_addr, access_type, maperr, pc);
 }
 
-static void host_sigbus_handler(CPUState *cpu, siginfo_t *info,
+static uintptr_t host_sigbus_handler(CPUState *cpu, siginfo_t *info,
                                 host_sigcontext *uc)
 {
     uintptr_t pc = host_signal_pc(uc);
@@ -947,6 +947,7 @@ static void host_sigbus_handler(CPUState *cpu, siginfo_t *info,
         sigprocmask(SIG_SETMASK, host_signal_mask(uc), NULL);
         cpu_loop_exit_sigbus(cpu, guest_addr, access_type, pc);
     }
+    return pc;
 }
 
 static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
@@ -974,7 +975,7 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
             host_sigsegv_handler(cpu, info, uc);
             return;
         case SIGBUS:
-            host_sigbus_handler(cpu, info, uc);
+            pc = host_sigbus_handler(cpu, info, uc);
             sync_sig = true;
             break;
         case SIGILL:
-- 
2.34.1




* [PULL 4/8] tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (2 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 3/8] linux-user: Fixed cpu restore with pc 0 on SIGBUS Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 5/8] tests/tcg/s390x: Import linux tools/testing/crypto/chacha20-s390 Richard Henderson
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel; +Cc: qemu-stable, Michael Tokarev, Thomas Huth

While the format names the second vector register 'v3',
it is still in the second position (bits 12-15) and
the argument to RXB must match.

Example error:
 -   e7 00 00 10 2a 33       verllf  %v16,%v0,16
 +   e7 00 00 10 2c 33       verllf  %v16,%v16,16

Cc: qemu-stable@nongnu.org
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 22cb37b4172 ("tcg/s390x: Implement vector shift operations")
Fixes: 79cada8693d ("tcg/s390x: Implement tcg_out_dup*_vec")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2054
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20240117213646.159697-2-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/s390x/tcg-target.c.inc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
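
To make the example error above concrete, the sketch below encodes `verllf %v16,%v16,16` byte by byte, once with the RXB bit for the second register field in the right slot and once in the old, wrong slot. It is an assumption-laden model rather than the TCG backend code: `rxb` and `encode_vrsa` are hypothetical helpers that take architectural register numbers 0..31, whereas the real `RXB()` works on `TCGReg` values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * RXB holds the fifth (high) bit of up to four vector register fields,
 * ordered by field position in the instruction: first field -> 0x8,
 * second -> 0x4, third -> 0x2, fourth -> 0x1.
 */
static unsigned rxb(unsigned f1, unsigned f2, unsigned f3, unsigned f4)
{
    return (((f1 >> 4) & 1) << 3) | (((f2 >> 4) & 1) << 2) |
           (((f3 >> 4) & 1) << 1) | ((f4 >> 4) & 1);
}

/*
 * Hypothetical encoder for a VRS-a style instruction such as VERLL
 * (opcode 0xe733). With fixed == false it reproduces the misplaced bit.
 */
void encode_vrsa(uint8_t out[6], unsigned v1, unsigned v3,
                 unsigned b2, unsigned d2, unsigned m4, bool fixed)
{
    unsigned bits = fixed ? rxb(v1, v3, 0, 0) : rxb(v1, 0, v3, 0);

    out[0] = 0xe7;
    out[1] = (uint8_t)(((v1 & 0xf) << 4) | (v3 & 0xf));
    out[2] = (uint8_t)(((b2 & 0xf) << 4) | ((d2 >> 8) & 0xf));
    out[3] = (uint8_t)(d2 & 0xff);
    out[4] = (uint8_t)(((m4 & 0xf) << 4) | bits);
    out[5] = 0x33;
}
```

With `v1 = 16`, `v3 = 16`, `b2 = 0`, `d2 = 16` and `m4 = 2`, the fifth byte comes out as `0x2c` when the bit is placed correctly and `0x2a` otherwise, matching the two byte sequences quoted in the commit message.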

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index fbee43d3b0..7f6b84aa2c 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -683,7 +683,7 @@ static void tcg_out_insn_VRIc(TCGContext *s, S390Opcode op,
     tcg_debug_assert(is_vector_reg(v3));
     tcg_out16(s, (op & 0xff00) | ((v1 & 0xf) << 4) | (v3 & 0xf));
     tcg_out16(s, i2);
-    tcg_out16(s, (op & 0x00ff) | RXB(v1, 0, v3, 0) | (m4 << 12));
+    tcg_out16(s, (op & 0x00ff) | RXB(v1, v3, 0, 0) | (m4 << 12));
 }
 
 static void tcg_out_insn_VRRa(TCGContext *s, S390Opcode op,
@@ -738,7 +738,7 @@ static void tcg_out_insn_VRSa(TCGContext *s, S390Opcode op, TCGReg v1,
     tcg_debug_assert(is_vector_reg(v3));
     tcg_out16(s, (op & 0xff00) | ((v1 & 0xf) << 4) | (v3 & 0xf));
     tcg_out16(s, b2 << 12 | d2);
-    tcg_out16(s, (op & 0x00ff) | RXB(v1, 0, v3, 0) | (m4 << 12));
+    tcg_out16(s, (op & 0x00ff) | RXB(v1, v3, 0, 0) | (m4 << 12));
 }
 
 static void tcg_out_insn_VRSb(TCGContext *s, S390Opcode op, TCGReg v1,
@@ -762,7 +762,7 @@ static void tcg_out_insn_VRSc(TCGContext *s, S390Opcode op, TCGReg r1,
     tcg_debug_assert(is_vector_reg(v3));
     tcg_out16(s, (op & 0xff00) | (r1 << 4) | (v3 & 0xf));
     tcg_out16(s, b2 << 12 | d2);
-    tcg_out16(s, (op & 0x00ff) | RXB(0, 0, v3, 0) | (m4 << 12));
+    tcg_out16(s, (op & 0x00ff) | RXB(0, v3, 0, 0) | (m4 << 12));
 }
 
 static void tcg_out_insn_VRX(TCGContext *s, S390Opcode op, TCGReg v1,
-- 
2.34.1




* [PULL 5/8] tests/tcg/s390x: Import linux tools/testing/crypto/chacha20-s390
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (3 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 4/8] tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 6/8] linux-user/riscv: Adjust vdso signal frame cfa offsets Richard Henderson
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC
  To: qemu-devel; +Cc: Michael Tokarev, Thomas Huth

Modify and simplify the driver, as we're really only interested
in correctness of translation of chacha-vx.S.

Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240117213646.159697-3-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tests/tcg/s390x/chacha.c        | 341 ++++++++++++
 tests/tcg/s390x/Makefile.target |   4 +
 tests/tcg/s390x/chacha-vx.S     | 914 ++++++++++++++++++++++++++++++++
 3 files changed, 1259 insertions(+)
 create mode 100644 tests/tcg/s390x/chacha.c
 create mode 100644 tests/tcg/s390x/chacha-vx.S
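
The scalar reference in the driver unrolls the ChaCha quarter round over columns and diagonals, and the vector code performs the same add/xor/rotate steps with `vaf`, `vx` and `verllf`. As a reading aid, here is a standalone quarter round; `rotl32` and `quarter_round` are hypothetical names, not part of the patch, and the check values used in the note below are the quarter-round test vector from RFC 7539, section 2.1.1.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << (n & 31)) | (v >> (-n & 31));
}

/* One ChaCha quarter round applied in place to the words a, b, c, d. */
void quarter_round(uint32_t x[4])
{
    uint32_t a = x[0], b = x[1], c = x[2], d = x[3];

    a += b; d = rotl32(d ^ a, 16);
    c += d; b = rotl32(b ^ c, 12);
    a += b; d = rotl32(d ^ a, 8);
    c += d; b = rotl32(b ^ c, 7);

    x[0] = a; x[1] = b; x[2] = c; x[3] = d;
}
```

Feeding in `{0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567}` gives `{0xea2a92f4, 0xcb1cf8ce, 0x4581472e, 0x5881c4bb}`. The rotation counts 16, 12, 8 and 7 are the same immediates that appear on the `verllf` instructions in chacha-vx.S.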

diff --git a/tests/tcg/s390x/chacha.c b/tests/tcg/s390x/chacha.c
new file mode 100644
index 0000000000..ca9e4c1959
--- /dev/null
+++ b/tests/tcg/s390x/chacha.c
@@ -0,0 +1,341 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Derived from linux kernel sources:
+ *   ./include/crypto/chacha.h
+ *   ./crypto/chacha_generic.c
+ *   ./arch/s390/crypto/chacha-glue.c
+ *   ./tools/testing/crypto/chacha20-s390/test-cipher.c
+ *   ./tools/testing/crypto/chacha20-s390/run-tests.sh
+ */
+
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <string.h>
+#include <inttypes.h>
+#include <sys/random.h>
+
+typedef uint8_t u8;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+static unsigned data_size;
+static bool debug;
+
+#define CHACHA_IV_SIZE          16
+#define CHACHA_KEY_SIZE         32
+#define CHACHA_BLOCK_SIZE       64
+#define CHACHAPOLY_IV_SIZE      12
+#define CHACHA_STATE_WORDS      (CHACHA_BLOCK_SIZE / sizeof(u32))
+
+static u32 rol32(u32 val, u32 sh)
+{
+    return (val << (sh & 31)) | (val >> (-sh & 31));
+}
+
+static u32 get_unaligned_le32(const void *ptr)
+{
+    u32 val;
+    memcpy(&val, ptr, 4);
+    return __builtin_bswap32(val);
+}
+
+static void put_unaligned_le32(u32 val, void *ptr)
+{
+    val = __builtin_bswap32(val);
+    memcpy(ptr, &val, 4);
+}
+
+static void chacha_permute(u32 *x, int nrounds)
+{
+    for (int i = 0; i < nrounds; i += 2) {
+        x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],  16);
+        x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],  16);
+        x[2]  += x[6];    x[14] = rol32(x[14] ^ x[2],  16);
+        x[3]  += x[7];    x[15] = rol32(x[15] ^ x[3],  16);
+
+        x[8]  += x[12];   x[4]  = rol32(x[4]  ^ x[8],  12);
+        x[9]  += x[13];   x[5]  = rol32(x[5]  ^ x[9],  12);
+        x[10] += x[14];   x[6]  = rol32(x[6]  ^ x[10], 12);
+        x[11] += x[15];   x[7]  = rol32(x[7]  ^ x[11], 12);
+
+        x[0]  += x[4];    x[12] = rol32(x[12] ^ x[0],   8);
+        x[1]  += x[5];    x[13] = rol32(x[13] ^ x[1],   8);
+        x[2]  += x[6];    x[14] = rol32(x[14] ^ x[2],   8);
+        x[3]  += x[7];    x[15] = rol32(x[15] ^ x[3],   8);
+
+        x[8]  += x[12];   x[4]  = rol32(x[4]  ^ x[8],   7);
+        x[9]  += x[13];   x[5]  = rol32(x[5]  ^ x[9],   7);
+        x[10] += x[14];   x[6]  = rol32(x[6]  ^ x[10],  7);
+        x[11] += x[15];   x[7]  = rol32(x[7]  ^ x[11],  7);
+
+        x[0]  += x[5];    x[15] = rol32(x[15] ^ x[0],  16);
+        x[1]  += x[6];    x[12] = rol32(x[12] ^ x[1],  16);
+        x[2]  += x[7];    x[13] = rol32(x[13] ^ x[2],  16);
+        x[3]  += x[4];    x[14] = rol32(x[14] ^ x[3],  16);
+
+        x[10] += x[15];   x[5]  = rol32(x[5]  ^ x[10], 12);
+        x[11] += x[12];   x[6]  = rol32(x[6]  ^ x[11], 12);
+        x[8]  += x[13];   x[7]  = rol32(x[7]  ^ x[8],  12);
+        x[9]  += x[14];   x[4]  = rol32(x[4]  ^ x[9],  12);
+
+        x[0]  += x[5];    x[15] = rol32(x[15] ^ x[0],   8);
+        x[1]  += x[6];    x[12] = rol32(x[12] ^ x[1],   8);
+        x[2]  += x[7];    x[13] = rol32(x[13] ^ x[2],   8);
+        x[3]  += x[4];    x[14] = rol32(x[14] ^ x[3],   8);
+
+        x[10] += x[15];   x[5]  = rol32(x[5]  ^ x[10],  7);
+        x[11] += x[12];   x[6]  = rol32(x[6]  ^ x[11],  7);
+        x[8]  += x[13];   x[7]  = rol32(x[7]  ^ x[8],   7);
+        x[9]  += x[14];   x[4]  = rol32(x[4]  ^ x[9],   7);
+    }
+}
+
+static void chacha_block_generic(u32 *state, u8 *stream, int nrounds)
+{
+    u32 x[16];
+
+    memcpy(x, state, 64);
+    chacha_permute(x, nrounds);
+
+    for (int i = 0; i < 16; i++) {
+        put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);
+    }
+    state[12]++;
+}
+
+static void crypto_xor_cpy(u8 *dst, const u8 *src1,
+                           const u8 *src2, unsigned len)
+{
+    while (len--) {
+        *dst++ = *src1++ ^ *src2++;
+    }
+}
+
+static void chacha_crypt_generic(u32 *state, u8 *dst, const u8 *src,
+                                 unsigned int bytes, int nrounds)
+{
+    u8 stream[CHACHA_BLOCK_SIZE];
+
+    while (bytes >= CHACHA_BLOCK_SIZE) {
+        chacha_block_generic(state, stream, nrounds);
+        crypto_xor_cpy(dst, src, stream, CHACHA_BLOCK_SIZE);
+        bytes -= CHACHA_BLOCK_SIZE;
+        dst += CHACHA_BLOCK_SIZE;
+        src += CHACHA_BLOCK_SIZE;
+    }
+    if (bytes) {
+        chacha_block_generic(state, stream, nrounds);
+        crypto_xor_cpy(dst, src, stream, bytes);
+    }
+}
+
+enum chacha_constants { /* expand 32-byte k */
+    CHACHA_CONSTANT_EXPA = 0x61707865U,
+    CHACHA_CONSTANT_ND_3 = 0x3320646eU,
+    CHACHA_CONSTANT_2_BY = 0x79622d32U,
+    CHACHA_CONSTANT_TE_K = 0x6b206574U
+};
+
+static void chacha_init_generic(u32 *state, const u32 *key, const u8 *iv)
+{
+    state[0]  = CHACHA_CONSTANT_EXPA;
+    state[1]  = CHACHA_CONSTANT_ND_3;
+    state[2]  = CHACHA_CONSTANT_2_BY;
+    state[3]  = CHACHA_CONSTANT_TE_K;
+    state[4]  = key[0];
+    state[5]  = key[1];
+    state[6]  = key[2];
+    state[7]  = key[3];
+    state[8]  = key[4];
+    state[9]  = key[5];
+    state[10] = key[6];
+    state[11] = key[7];
+    state[12] = get_unaligned_le32(iv +  0);
+    state[13] = get_unaligned_le32(iv +  4);
+    state[14] = get_unaligned_le32(iv +  8);
+    state[15] = get_unaligned_le32(iv + 12);
+}
+
+void chacha20_vx(u8 *out, const u8 *inp, size_t len, const u32 *key,
+                 const u32 *counter);
+
+static void chacha20_crypt_s390(u32 *state, u8 *dst, const u8 *src,
+                                unsigned int nbytes, const u32 *key,
+                                u32 *counter)
+{
+    chacha20_vx(dst, src, nbytes, key, counter);
+    *counter += (nbytes + CHACHA_BLOCK_SIZE - 1) / CHACHA_BLOCK_SIZE;
+}
+
+static void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src,
+                              unsigned int bytes, int nrounds)
+{
+    /*
+     * s390 chacha20 implementation has 20 rounds hard-coded,
+     * it cannot handle a block of data or less, but otherwise
+     * it can handle data of arbitrary size
+     */
+    if (bytes <= CHACHA_BLOCK_SIZE || nrounds != 20) {
+        chacha_crypt_generic(state, dst, src, bytes, nrounds);
+    } else {
+        chacha20_crypt_s390(state, dst, src, bytes, &state[4], &state[12]);
+    }
+}
+
+static void print_hex_dump(const char *prefix_str, const void *buf, int len)
+{
+    for (int i = 0; i < len; i += 16) {
+        printf("%s%.8x: ", prefix_str, i);
+        for (int j = 0; j < 16; ++j) {
+            printf("%02x%c", *(u8 *)(buf + i + j), j == 15 ? '\n' : ' ');
+        }
+    }
+}
+
+/* Perform cipher operations with the chacha lib */
+static int test_lib_chacha(u8 *revert, u8 *cipher, u8 *plain, bool generic)
+{
+    u32 chacha_state[CHACHA_STATE_WORDS];
+    u8 iv[16], key[32];
+
+    memset(key, 'X', sizeof(key));
+    memset(iv, 'I', sizeof(iv));
+
+    if (debug) {
+        print_hex_dump("key: ", key, 32);
+        print_hex_dump("iv:  ", iv, 16);
+    }
+
+    /* Encrypt */
+    chacha_init_generic(chacha_state, (u32 *)key, iv);
+
+    if (generic) {
+        chacha_crypt_generic(chacha_state, cipher, plain, data_size, 20);
+    } else {
+        chacha_crypt_arch(chacha_state, cipher, plain, data_size, 20);
+    }
+
+    if (debug) {
+        print_hex_dump("encr:", cipher,
+                       (data_size > 64 ? 64 : data_size));
+    }
+
+    /* Decrypt */
+    chacha_init_generic(chacha_state, (u32 *)key, iv);
+
+    if (generic) {
+        chacha_crypt_generic(chacha_state, revert, cipher, data_size, 20);
+    } else {
+        chacha_crypt_arch(chacha_state, revert, cipher, data_size, 20);
+    }
+
+    if (debug) {
+        print_hex_dump("decr:", revert,
+                       (data_size > 64 ? 64 : data_size));
+    }
+    return 0;
+}
+
+static int chacha_s390_test_init(void)
+{
+    u8 *plain = NULL, *revert = NULL;
+    u8 *cipher_generic = NULL, *cipher_s390 = NULL;
+    int ret = -1;
+
+    printf("s390 ChaCha20 test module: size=%u debug=%d\n",
+           data_size, debug);
+
+    /* Allocate and fill buffers */
+    plain = malloc(data_size);
+    if (!plain) {
+        printf("could not allocate plain buffer\n");
+        ret = -2;
+        goto out;
+    }
+
+    memset(plain, 'a', data_size);
+    for (unsigned i = 0, n = data_size > 256 ? 256 : data_size; i < n; ) {
+        ssize_t t = getrandom(plain + i, n - i, 0);
+        if (t < 0) {
+            break;
+        }
+        i += t;
+    }
+
+    cipher_generic = calloc(1, data_size);
+    if (!cipher_generic) {
+        printf("could not allocate cipher_generic buffer\n");
+        ret = -2;
+        goto out;
+    }
+
+    cipher_s390 = calloc(1, data_size);
+    if (!cipher_s390) {
+        printf("could not allocate cipher_s390 buffer\n");
+        ret = -2;
+        goto out;
+    }
+
+    revert = calloc(1, data_size);
+    if (!revert) {
+        printf("could not allocate revert buffer\n");
+        ret = -2;
+        goto out;
+    }
+
+    if (debug) {
+        print_hex_dump("src: ", plain,
+                       (data_size > 64 ? 64 : data_size));
+    }
+
+    /* Use chacha20 lib */
+    test_lib_chacha(revert, cipher_generic, plain, true);
+    if (memcmp(plain, revert, data_size)) {
+        printf("generic en/decryption check FAILED\n");
+        ret = -2;
+        goto out;
+    }
+    printf("generic en/decryption check OK\n");
+
+    test_lib_chacha(revert, cipher_s390, plain, false);
+    if (memcmp(plain, revert, data_size)) {
+        printf("lib en/decryption check FAILED\n");
+        ret = -2;
+        goto out;
+    }
+    printf("lib en/decryption check OK\n");
+
+    if (memcmp(cipher_generic, cipher_s390, data_size)) {
+        printf("lib vs generic check FAILED\n");
+        ret = -2;
+        goto out;
+    }
+    printf("lib vs generic check OK\n");
+
+    printf("--- chacha20 s390 test end ---\n");
+
+out:
+    free(plain);
+    free(cipher_generic);
+    free(cipher_s390);
+    free(revert);
+    return ret;
+}
+
+int main(int ac, char **av)
+{
+    static const unsigned sizes[] = {
+        63, 64, 65, 127, 128, 129, 511, 512, 513, 4096, 65611,
+        /* too slow for tcg: 6291456, 62914560 */
+    };
+
+    debug = ac >= 2;
+    for (int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); ++i) {
+        data_size = sizes[i];
+        if (chacha_s390_test_init() != -1) {
+            return 1;
+        }
+    }
+    return 0;
+}
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 30994dcf9c..b9dc12dc8a 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -66,9 +66,13 @@ Z13_TESTS+=vcksm
 Z13_TESTS+=vstl
 Z13_TESTS+=vrep
 Z13_TESTS+=precise-smc-user
+Z13_TESTS+=chacha
 $(Z13_TESTS): CFLAGS+=-march=z13 -O2
 TESTS+=$(Z13_TESTS)
 
+chacha: chacha.c chacha-vx.S
+	$(CC) $(LDFLAGS) $(CFLAGS) $(EXTRA_CFLAGS) $^ -o $@
+
 ifneq ($(CROSS_CC_HAS_Z14),)
 Z14_TESTS=vfminmax
 vfminmax: LDFLAGS+=-lm
diff --git a/tests/tcg/s390x/chacha-vx.S b/tests/tcg/s390x/chacha-vx.S
new file mode 100644
index 0000000000..dcb55b4324
--- /dev/null
+++ b/tests/tcg/s390x/chacha-vx.S
@@ -0,0 +1,914 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Original implementation written by Andy Polyakov, @dot-asm.
+ * This is an adaptation of the original code for kernel use.
+ *
+ * Copyright (C) 2006-2019 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
+ *
+ * For qemu testing, drop <asm/vx-insn-asm.h> and assume assembler support.
+ */
+
+#define SP	%r15
+#define FRAME	(16 * 8 + 4 * 8)
+
+	.data
+	.balign	32
+
+sigma:
+	.long	0x61707865,0x3320646e,0x79622d32,0x6b206574	# endian-neutral
+	.long	1,0,0,0
+	.long	2,0,0,0
+	.long	3,0,0,0
+	.long	0x03020100,0x07060504,0x0b0a0908,0x0f0e0d0c	# byte swap
+
+	.long	0,1,2,3
+	.long	0x61707865,0x61707865,0x61707865,0x61707865	# smashed sigma
+	.long	0x3320646e,0x3320646e,0x3320646e,0x3320646e
+	.long	0x79622d32,0x79622d32,0x79622d32,0x79622d32
+	.long	0x6b206574,0x6b206574,0x6b206574,0x6b206574
+
+	.type	sigma, @object
+	.size	sigma, . - sigma
+
+	.previous
+
+	.text
+
+#############################################################################
+# void chacha20_vx_4x(u8 *out, const u8 *inp, size_t len,
+#		      const u32 *key, const u32 *counter)
+
+#define	OUT		%r2
+#define	INP		%r3
+#define	LEN		%r4
+#define	KEY		%r5
+#define	COUNTER		%r6
+
+#define BEPERM		%v31
+#define CTR		%v26
+
+#define K0		%v16
+#define K1		%v17
+#define K2		%v18
+#define K3		%v19
+
+#define XA0		%v0
+#define XA1		%v1
+#define XA2		%v2
+#define XA3		%v3
+
+#define XB0		%v4
+#define XB1		%v5
+#define XB2		%v6
+#define XB3		%v7
+
+#define XC0		%v8
+#define XC1		%v9
+#define XC2		%v10
+#define XC3		%v11
+
+#define XD0		%v12
+#define XD1		%v13
+#define XD2		%v14
+#define XD3		%v15
+
+#define XT0		%v27
+#define XT1		%v28
+#define XT2		%v29
+#define XT3		%v30
+
+	.balign	32
+chacha20_vx_4x:
+	stmg	%r6,%r7,6*8(SP)
+
+	larl	%r7,sigma
+	lhi	%r0,10
+	lhi	%r1,0
+
+	vl	K0,0(%r7)		# load sigma
+	vl	K1,0(KEY)		# load key
+	vl	K2,16(KEY)
+	vl	K3,0(COUNTER)		# load counter
+
+	vl	BEPERM,0x40(%r7)
+	vl	CTR,0x50(%r7)
+
+	vlm	XA0,XA3,0x60(%r7),4	# load [smashed] sigma
+
+	vrepf	XB0,K1,0		# smash the key
+	vrepf	XB1,K1,1
+	vrepf	XB2,K1,2
+	vrepf	XB3,K1,3
+
+	vrepf	XD0,K3,0
+	vrepf	XD1,K3,1
+	vrepf	XD2,K3,2
+	vrepf	XD3,K3,3
+	vaf	XD0,XD0,CTR
+
+	vrepf	XC0,K2,0
+	vrepf	XC1,K2,1
+	vrepf	XC2,K2,2
+	vrepf	XC3,K2,3
+
+.Loop_4x:
+	vaf	XA0,XA0,XB0
+	vx	XD0,XD0,XA0
+	verllf	XD0,XD0,16
+
+	vaf	XA1,XA1,XB1
+	vx	XD1,XD1,XA1
+	verllf	XD1,XD1,16
+
+	vaf	XA2,XA2,XB2
+	vx	XD2,XD2,XA2
+	verllf	XD2,XD2,16
+
+	vaf	XA3,XA3,XB3
+	vx	XD3,XD3,XA3
+	verllf	XD3,XD3,16
+
+	vaf	XC0,XC0,XD0
+	vx	XB0,XB0,XC0
+	verllf	XB0,XB0,12
+
+	vaf	XC1,XC1,XD1
+	vx	XB1,XB1,XC1
+	verllf	XB1,XB1,12
+
+	vaf	XC2,XC2,XD2
+	vx	XB2,XB2,XC2
+	verllf	XB2,XB2,12
+
+	vaf	XC3,XC3,XD3
+	vx	XB3,XB3,XC3
+	verllf	XB3,XB3,12
+
+	vaf	XA0,XA0,XB0
+	vx	XD0,XD0,XA0
+	verllf	XD0,XD0,8
+
+	vaf	XA1,XA1,XB1
+	vx	XD1,XD1,XA1
+	verllf	XD1,XD1,8
+
+	vaf	XA2,XA2,XB2
+	vx	XD2,XD2,XA2
+	verllf	XD2,XD2,8
+
+	vaf	XA3,XA3,XB3
+	vx	XD3,XD3,XA3
+	verllf	XD3,XD3,8
+
+	vaf	XC0,XC0,XD0
+	vx	XB0,XB0,XC0
+	verllf	XB0,XB0,7
+
+	vaf	XC1,XC1,XD1
+	vx	XB1,XB1,XC1
+	verllf	XB1,XB1,7
+
+	vaf	XC2,XC2,XD2
+	vx	XB2,XB2,XC2
+	verllf	XB2,XB2,7
+
+	vaf	XC3,XC3,XD3
+	vx	XB3,XB3,XC3
+	verllf	XB3,XB3,7
+
+	vaf	XA0,XA0,XB1
+	vx	XD3,XD3,XA0
+	verllf	XD3,XD3,16
+
+	vaf	XA1,XA1,XB2
+	vx	XD0,XD0,XA1
+	verllf	XD0,XD0,16
+
+	vaf	XA2,XA2,XB3
+	vx	XD1,XD1,XA2
+	verllf	XD1,XD1,16
+
+	vaf	XA3,XA3,XB0
+	vx	XD2,XD2,XA3
+	verllf	XD2,XD2,16
+
+	vaf	XC2,XC2,XD3
+	vx	XB1,XB1,XC2
+	verllf	XB1,XB1,12
+
+	vaf	XC3,XC3,XD0
+	vx	XB2,XB2,XC3
+	verllf	XB2,XB2,12
+
+	vaf	XC0,XC0,XD1
+	vx	XB3,XB3,XC0
+	verllf	XB3,XB3,12
+
+	vaf	XC1,XC1,XD2
+	vx	XB0,XB0,XC1
+	verllf	XB0,XB0,12
+
+	vaf	XA0,XA0,XB1
+	vx	XD3,XD3,XA0
+	verllf	XD3,XD3,8
+
+	vaf	XA1,XA1,XB2
+	vx	XD0,XD0,XA1
+	verllf	XD0,XD0,8
+
+	vaf	XA2,XA2,XB3
+	vx	XD1,XD1,XA2
+	verllf	XD1,XD1,8
+
+	vaf	XA3,XA3,XB0
+	vx	XD2,XD2,XA3
+	verllf	XD2,XD2,8
+
+	vaf	XC2,XC2,XD3
+	vx	XB1,XB1,XC2
+	verllf	XB1,XB1,7
+
+	vaf	XC3,XC3,XD0
+	vx	XB2,XB2,XC3
+	verllf	XB2,XB2,7
+
+	vaf	XC0,XC0,XD1
+	vx	XB3,XB3,XC0
+	verllf	XB3,XB3,7
+
+	vaf	XC1,XC1,XD2
+	vx	XB0,XB0,XC1
+	verllf	XB0,XB0,7
+	brct	%r0,.Loop_4x
+
+	vaf	XD0,XD0,CTR
+
+	vmrhf	XT0,XA0,XA1		# transpose data
+	vmrhf	XT1,XA2,XA3
+	vmrlf	XT2,XA0,XA1
+	vmrlf	XT3,XA2,XA3
+	vpdi	XA0,XT0,XT1,0b0000
+	vpdi	XA1,XT0,XT1,0b0101
+	vpdi	XA2,XT2,XT3,0b0000
+	vpdi	XA3,XT2,XT3,0b0101
+
+	vmrhf	XT0,XB0,XB1
+	vmrhf	XT1,XB2,XB3
+	vmrlf	XT2,XB0,XB1
+	vmrlf	XT3,XB2,XB3
+	vpdi	XB0,XT0,XT1,0b0000
+	vpdi	XB1,XT0,XT1,0b0101
+	vpdi	XB2,XT2,XT3,0b0000
+	vpdi	XB3,XT2,XT3,0b0101
+
+	vmrhf	XT0,XC0,XC1
+	vmrhf	XT1,XC2,XC3
+	vmrlf	XT2,XC0,XC1
+	vmrlf	XT3,XC2,XC3
+	vpdi	XC0,XT0,XT1,0b0000
+	vpdi	XC1,XT0,XT1,0b0101
+	vpdi	XC2,XT2,XT3,0b0000
+	vpdi	XC3,XT2,XT3,0b0101
+
+	vmrhf	XT0,XD0,XD1
+	vmrhf	XT1,XD2,XD3
+	vmrlf	XT2,XD0,XD1
+	vmrlf	XT3,XD2,XD3
+	vpdi	XD0,XT0,XT1,0b0000
+	vpdi	XD1,XT0,XT1,0b0101
+	vpdi	XD2,XT2,XT3,0b0000
+	vpdi	XD3,XT2,XT3,0b0101
+
+	vaf	XA0,XA0,K0
+	vaf	XB0,XB0,K1
+	vaf	XC0,XC0,K2
+	vaf	XD0,XD0,K3
+
+	vperm	XA0,XA0,XA0,BEPERM
+	vperm	XB0,XB0,XB0,BEPERM
+	vperm	XC0,XC0,XC0,BEPERM
+	vperm	XD0,XD0,XD0,BEPERM
+
+	vlm	XT0,XT3,0(INP),0
+
+	vx	XT0,XT0,XA0
+	vx	XT1,XT1,XB0
+	vx	XT2,XT2,XC0
+	vx	XT3,XT3,XD0
+
+	vstm	XT0,XT3,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+
+	vaf	XA0,XA1,K0
+	vaf	XB0,XB1,K1
+	vaf	XC0,XC1,K2
+	vaf	XD0,XD1,K3
+
+	vperm	XA0,XA0,XA0,BEPERM
+	vperm	XB0,XB0,XB0,BEPERM
+	vperm	XC0,XC0,XC0,BEPERM
+	vperm	XD0,XD0,XD0,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_4x
+
+	vlm	XT0,XT3,0(INP),0
+
+	vx	XT0,XT0,XA0
+	vx	XT1,XT1,XB0
+	vx	XT2,XT2,XC0
+	vx	XT3,XT3,XD0
+
+	vstm	XT0,XT3,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_4x
+
+	vaf	XA0,XA2,K0
+	vaf	XB0,XB2,K1
+	vaf	XC0,XC2,K2
+	vaf	XD0,XD2,K3
+
+	vperm	XA0,XA0,XA0,BEPERM
+	vperm	XB0,XB0,XB0,BEPERM
+	vperm	XC0,XC0,XC0,BEPERM
+	vperm	XD0,XD0,XD0,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_4x
+
+	vlm	XT0,XT3,0(INP),0
+
+	vx	XT0,XT0,XA0
+	vx	XT1,XT1,XB0
+	vx	XT2,XT2,XC0
+	vx	XT3,XT3,XD0
+
+	vstm	XT0,XT3,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_4x
+
+	vaf	XA0,XA3,K0
+	vaf	XB0,XB3,K1
+	vaf	XC0,XC3,K2
+	vaf	XD0,XD3,K3
+
+	vperm	XA0,XA0,XA0,BEPERM
+	vperm	XB0,XB0,XB0,BEPERM
+	vperm	XC0,XC0,XC0,BEPERM
+	vperm	XD0,XD0,XD0,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_4x
+
+	vlm	XT0,XT3,0(INP),0
+
+	vx	XT0,XT0,XA0
+	vx	XT1,XT1,XB0
+	vx	XT2,XT2,XC0
+	vx	XT3,XT3,XD0
+
+	vstm	XT0,XT3,0(OUT),0
+
+.Ldone_4x:
+	lmg	%r6,%r7,6*8(SP)
+	br	%r14
+
+.Ltail_4x:
+	vlr	XT0,XC0
+	vlr	XT1,XD0
+
+	vst	XA0,8*8+0x00(SP)
+	vst	XB0,8*8+0x10(SP)
+	vst	XT0,8*8+0x20(SP)
+	vst	XT1,8*8+0x30(SP)
+
+	lghi	%r1,0
+
+.Loop_tail_4x:
+	llgc	%r5,0(%r1,INP)
+	llgc	%r6,8*8(%r1,SP)
+	xr	%r6,%r5
+	stc	%r6,0(%r1,OUT)
+	la	%r1,1(%r1)
+	brct	LEN,.Loop_tail_4x
+
+	lmg	%r6,%r7,6*8(SP)
+	br	%r14
+
+	.type	chacha20_vx_4x, @function
+	.size	chacha20_vx_4x, . - chacha20_vx_4x
+
+#undef	OUT
+#undef	INP
+#undef	LEN
+#undef	KEY
+#undef	COUNTER
+
+#undef BEPERM
+
+#undef K0
+#undef K1
+#undef K2
+#undef K3
+
+
+#############################################################################
+# void chacha20_vx(u8 *out, const u8 *inp, size_t len,
+#		   const u32 *key, const u32 *counter)
+
+#define	OUT		%r2
+#define	INP		%r3
+#define	LEN		%r4
+#define	KEY		%r5
+#define	COUNTER		%r6
+
+#define BEPERM		%v31
+
+#define K0		%v27
+#define K1		%v24
+#define K2		%v25
+#define K3		%v26
+
+#define A0		%v0
+#define B0		%v1
+#define C0		%v2
+#define D0		%v3
+
+#define A1		%v4
+#define B1		%v5
+#define C1		%v6
+#define D1		%v7
+
+#define A2		%v8
+#define B2		%v9
+#define C2		%v10
+#define D2		%v11
+
+#define A3		%v12
+#define B3		%v13
+#define C3		%v14
+#define D3		%v15
+
+#define A4		%v16
+#define B4		%v17
+#define C4		%v18
+#define D4		%v19
+
+#define A5		%v20
+#define B5		%v21
+#define C5		%v22
+#define D5		%v23
+
+#define T0		%v27
+#define T1		%v28
+#define T2		%v29
+#define T3		%v30
+
+	.balign	32
+chacha20_vx:
+	clgfi	LEN,256
+	jle	chacha20_vx_4x
+	stmg	%r6,%r7,6*8(SP)
+
+	lghi	%r1,-FRAME
+	lgr	%r0,SP
+	la	SP,0(%r1,SP)
+	stg	%r0,0(SP)		# back-chain
+
+	larl	%r7,sigma
+	lhi	%r0,10
+
+	vlm	K1,K2,0(KEY),0		# load key
+	vl	K3,0(COUNTER)		# load counter
+
+	vlm	K0,BEPERM,0(%r7),4	# load sigma, increments, ...
+
+.Loop_outer_vx:
+	vlr	A0,K0
+	vlr	B0,K1
+	vlr	A1,K0
+	vlr	B1,K1
+	vlr	A2,K0
+	vlr	B2,K1
+	vlr	A3,K0
+	vlr	B3,K1
+	vlr	A4,K0
+	vlr	B4,K1
+	vlr	A5,K0
+	vlr	B5,K1
+
+	vlr	D0,K3
+	vaf	D1,K3,T1		# K[3]+1
+	vaf	D2,K3,T2		# K[3]+2
+	vaf	D3,K3,T3		# K[3]+3
+	vaf	D4,D2,T2		# K[3]+4
+	vaf	D5,D2,T3		# K[3]+5
+
+	vlr	C0,K2
+	vlr	C1,K2
+	vlr	C2,K2
+	vlr	C3,K2
+	vlr	C4,K2
+	vlr	C5,K2
+
+	vlr	T1,D1
+	vlr	T2,D2
+	vlr	T3,D3
+
+.Loop_vx:
+	vaf	A0,A0,B0
+	vaf	A1,A1,B1
+	vaf	A2,A2,B2
+	vaf	A3,A3,B3
+	vaf	A4,A4,B4
+	vaf	A5,A5,B5
+	vx	D0,D0,A0
+	vx	D1,D1,A1
+	vx	D2,D2,A2
+	vx	D3,D3,A3
+	vx	D4,D4,A4
+	vx	D5,D5,A5
+	verllf	D0,D0,16
+	verllf	D1,D1,16
+	verllf	D2,D2,16
+	verllf	D3,D3,16
+	verllf	D4,D4,16
+	verllf	D5,D5,16
+
+	vaf	C0,C0,D0
+	vaf	C1,C1,D1
+	vaf	C2,C2,D2
+	vaf	C3,C3,D3
+	vaf	C4,C4,D4
+	vaf	C5,C5,D5
+	vx	B0,B0,C0
+	vx	B1,B1,C1
+	vx	B2,B2,C2
+	vx	B3,B3,C3
+	vx	B4,B4,C4
+	vx	B5,B5,C5
+	verllf	B0,B0,12
+	verllf	B1,B1,12
+	verllf	B2,B2,12
+	verllf	B3,B3,12
+	verllf	B4,B4,12
+	verllf	B5,B5,12
+
+	vaf	A0,A0,B0
+	vaf	A1,A1,B1
+	vaf	A2,A2,B2
+	vaf	A3,A3,B3
+	vaf	A4,A4,B4
+	vaf	A5,A5,B5
+	vx	D0,D0,A0
+	vx	D1,D1,A1
+	vx	D2,D2,A2
+	vx	D3,D3,A3
+	vx	D4,D4,A4
+	vx	D5,D5,A5
+	verllf	D0,D0,8
+	verllf	D1,D1,8
+	verllf	D2,D2,8
+	verllf	D3,D3,8
+	verllf	D4,D4,8
+	verllf	D5,D5,8
+
+	vaf	C0,C0,D0
+	vaf	C1,C1,D1
+	vaf	C2,C2,D2
+	vaf	C3,C3,D3
+	vaf	C4,C4,D4
+	vaf	C5,C5,D5
+	vx	B0,B0,C0
+	vx	B1,B1,C1
+	vx	B2,B2,C2
+	vx	B3,B3,C3
+	vx	B4,B4,C4
+	vx	B5,B5,C5
+	verllf	B0,B0,7
+	verllf	B1,B1,7
+	verllf	B2,B2,7
+	verllf	B3,B3,7
+	verllf	B4,B4,7
+	verllf	B5,B5,7
+
+	vsldb	C0,C0,C0,8
+	vsldb	C1,C1,C1,8
+	vsldb	C2,C2,C2,8
+	vsldb	C3,C3,C3,8
+	vsldb	C4,C4,C4,8
+	vsldb	C5,C5,C5,8
+	vsldb	B0,B0,B0,4
+	vsldb	B1,B1,B1,4
+	vsldb	B2,B2,B2,4
+	vsldb	B3,B3,B3,4
+	vsldb	B4,B4,B4,4
+	vsldb	B5,B5,B5,4
+	vsldb	D0,D0,D0,12
+	vsldb	D1,D1,D1,12
+	vsldb	D2,D2,D2,12
+	vsldb	D3,D3,D3,12
+	vsldb	D4,D4,D4,12
+	vsldb	D5,D5,D5,12
+
+	vaf	A0,A0,B0
+	vaf	A1,A1,B1
+	vaf	A2,A2,B2
+	vaf	A3,A3,B3
+	vaf	A4,A4,B4
+	vaf	A5,A5,B5
+	vx	D0,D0,A0
+	vx	D1,D1,A1
+	vx	D2,D2,A2
+	vx	D3,D3,A3
+	vx	D4,D4,A4
+	vx	D5,D5,A5
+	verllf	D0,D0,16
+	verllf	D1,D1,16
+	verllf	D2,D2,16
+	verllf	D3,D3,16
+	verllf	D4,D4,16
+	verllf	D5,D5,16
+
+	vaf	C0,C0,D0
+	vaf	C1,C1,D1
+	vaf	C2,C2,D2
+	vaf	C3,C3,D3
+	vaf	C4,C4,D4
+	vaf	C5,C5,D5
+	vx	B0,B0,C0
+	vx	B1,B1,C1
+	vx	B2,B2,C2
+	vx	B3,B3,C3
+	vx	B4,B4,C4
+	vx	B5,B5,C5
+	verllf	B0,B0,12
+	verllf	B1,B1,12
+	verllf	B2,B2,12
+	verllf	B3,B3,12
+	verllf	B4,B4,12
+	verllf	B5,B5,12
+
+	vaf	A0,A0,B0
+	vaf	A1,A1,B1
+	vaf	A2,A2,B2
+	vaf	A3,A3,B3
+	vaf	A4,A4,B4
+	vaf	A5,A5,B5
+	vx	D0,D0,A0
+	vx	D1,D1,A1
+	vx	D2,D2,A2
+	vx	D3,D3,A3
+	vx	D4,D4,A4
+	vx	D5,D5,A5
+	verllf	D0,D0,8
+	verllf	D1,D1,8
+	verllf	D2,D2,8
+	verllf	D3,D3,8
+	verllf	D4,D4,8
+	verllf	D5,D5,8
+
+	vaf	C0,C0,D0
+	vaf	C1,C1,D1
+	vaf	C2,C2,D2
+	vaf	C3,C3,D3
+	vaf	C4,C4,D4
+	vaf	C5,C5,D5
+	vx	B0,B0,C0
+	vx	B1,B1,C1
+	vx	B2,B2,C2
+	vx	B3,B3,C3
+	vx	B4,B4,C4
+	vx	B5,B5,C5
+	verllf	B0,B0,7
+	verllf	B1,B1,7
+	verllf	B2,B2,7
+	verllf	B3,B3,7
+	verllf	B4,B4,7
+	verllf	B5,B5,7
+
+	vsldb	C0,C0,C0,8
+	vsldb	C1,C1,C1,8
+	vsldb	C2,C2,C2,8
+	vsldb	C3,C3,C3,8
+	vsldb	C4,C4,C4,8
+	vsldb	C5,C5,C5,8
+	vsldb	B0,B0,B0,12
+	vsldb	B1,B1,B1,12
+	vsldb	B2,B2,B2,12
+	vsldb	B3,B3,B3,12
+	vsldb	B4,B4,B4,12
+	vsldb	B5,B5,B5,12
+	vsldb	D0,D0,D0,4
+	vsldb	D1,D1,D1,4
+	vsldb	D2,D2,D2,4
+	vsldb	D3,D3,D3,4
+	vsldb	D4,D4,D4,4
+	vsldb	D5,D5,D5,4
+	brct	%r0,.Loop_vx
+
+	vaf	A0,A0,K0
+	vaf	B0,B0,K1
+	vaf	C0,C0,K2
+	vaf	D0,D0,K3
+	vaf	A1,A1,K0
+	vaf	D1,D1,T1		# +K[3]+1
+
+	vperm	A0,A0,A0,BEPERM
+	vperm	B0,B0,B0,BEPERM
+	vperm	C0,C0,C0,BEPERM
+	vperm	D0,D0,D0,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vaf	D2,D2,T2		# +K[3]+2
+	vaf	D3,D3,T3		# +K[3]+3
+	vlm	T0,T3,0(INP),0
+
+	vx	A0,A0,T0
+	vx	B0,B0,T1
+	vx	C0,C0,T2
+	vx	D0,D0,T3
+
+	vlm	K0,T3,0(%r7),4		# re-load sigma and increments
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_vx
+
+	vaf	B1,B1,K1
+	vaf	C1,C1,K2
+
+	vperm	A0,A1,A1,BEPERM
+	vperm	B0,B1,B1,BEPERM
+	vperm	C0,C1,C1,BEPERM
+	vperm	D0,D1,D1,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vlm	A1,D1,0(INP),0
+
+	vx	A0,A0,A1
+	vx	B0,B0,B1
+	vx	C0,C0,C1
+	vx	D0,D0,D1
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_vx
+
+	vaf	A2,A2,K0
+	vaf	B2,B2,K1
+	vaf	C2,C2,K2
+
+	vperm	A0,A2,A2,BEPERM
+	vperm	B0,B2,B2,BEPERM
+	vperm	C0,C2,C2,BEPERM
+	vperm	D0,D2,D2,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vlm	A1,D1,0(INP),0
+
+	vx	A0,A0,A1
+	vx	B0,B0,B1
+	vx	C0,C0,C1
+	vx	D0,D0,D1
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_vx
+
+	vaf	A3,A3,K0
+	vaf	B3,B3,K1
+	vaf	C3,C3,K2
+	vaf	D2,K3,T3		# K[3]+3
+
+	vperm	A0,A3,A3,BEPERM
+	vperm	B0,B3,B3,BEPERM
+	vperm	C0,C3,C3,BEPERM
+	vperm	D0,D3,D3,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vaf	D3,D2,T1		# K[3]+4
+	VLM	A1,D1,0(INP),0
+
+	vx	A0,A0,A1
+	vx	B0,B0,B1
+	vx	C0,C0,C1
+	vx	D0,D0,D1
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_vx
+
+	vaf	A4,A4,K0
+	vaf	B4,B4,K1
+	vaf	C4,C4,K2
+	vaf	D4,D4,D3		# +K[3]+4
+	vaf	D3,D3,T1		# K[3]+5
+	vaf	K3,D2,T3		# K[3]+=6
+
+	vperm	A0,A4,A4,BEPERM
+	vperm	B0,B4,B4,BEPERM
+	vperm	C0,C4,C4,BEPERM
+	vperm	D0,D4,D4,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vlm	A1,D1,0(INP),0
+
+	vx	A0,A0,A1
+	vx	B0,B0,B1
+	vx	C0,C0,C1
+	vx	D0,D0,D1
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	aghi	LEN,-0x40
+	je	.Ldone_vx
+
+	vaf	A5,A5,K0
+	vaf	B5,B5,K1
+	vaf	C5,C5,K2
+	vaf	D5,D5,D3		# +K[3]+5
+
+	vperm	A0,A5,A5,BEPERM
+	vperm	B0,B5,B5,BEPERM
+	vperm	C0,C5,C5,BEPERM
+	vperm	D0,D5,D5,BEPERM
+
+	clgfi	LEN,0x40
+	jl	.Ltail_vx
+
+	vlm	A1,D1,0(INP),0
+
+	vx	A0,A0,A1
+	vx	B0,B0,B1
+	vx	C0,C0,C1
+	vx	D0,D0,D1
+
+	vstm	A0,D0,0(OUT),0
+
+	la	INP,0x40(INP)
+	la	OUT,0x40(OUT)
+	lhi	%r0,10
+	aghi	LEN,-0x40
+	jne	.Loop_outer_vx
+
+.Ldone_vx:
+	lmg	%r6,%r7,FRAME+6*8(SP)
+	la	SP,FRAME(SP)
+	br	%r14
+
+.Ltail_vx:
+	vstm	A0,D0,8*8(SP),3
+	lghi	%r1,0
+
+.Loop_tail_vx:
+	llgc	%r5,0(%r1,INP)
+	llgc	%r6,8*8(%r1,SP)
+	xr	%r6,%r5
+	stc	%r6,0(%r1,OUT)
+	la	%r1,1(%r1)
+	brct	LEN,.Loop_tail_vx
+
+	lmg	%r6,%r7,FRAME+6*8(SP)
+	la	SP,FRAME(SP)
+	br	%r14
+
+	.type	chacha20_vx, @function
+	.size	chacha20_vx, . - chacha20_vx
+	.globl	chacha20_vx
+
+.previous
+.section .note.GNU-stack,"",%progbits
-- 
2.34.1




* [PULL 6/8] linux-user/riscv: Adjust vdso signal frame cfa offsets
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (4 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 5/8] tests/tcg/s390x: Import linux tools/testing/crypto/chacha20-s390 Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 7/8] linux-user/elfload: test return value of getrlimit Richard Henderson
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vineet Gupta

A typo in sizeof_reg put the registers at the wrong offset.

Simplify the expressions to use positive addresses from the
start of uc_mcontext instead of negative addresses from the
end of uc_mcontext.
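
As a rough illustration of the typo: __riscv_xlen is the register width in
bits, so dividing by 8 gives the size in bytes, while dividing by 4 gives
16 bytes for a 64-bit register. The sketch below uses hypothetical helper
names that are not part of vdso.S, and it assumes the general registers
occupy consecutive slots starting at offset 0 (B_GR) of uc_mcontext.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not taken from vdso.S. */

/* Size in bytes of one general register for a given XLEN in bits. */
static size_t sizeof_reg_bytes(unsigned xlen_bits)
{
    return xlen_bits / 8;   /* 32 -> 4 bytes, 64 -> 8 bytes */
}

/*
 * Offset of general register slot n from the start of uc_mcontext,
 * assuming B_GR == 0 and contiguous register slots.
 */
static size_t gr_offset(unsigned xlen_bits, unsigned n)
{
    return (size_t)n * sizeof_reg_bytes(xlen_bits);
}
```

With the old divisor of 4, every slot after the first would be described at
twice its real offset.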

Reported-by: Vineet Gupta <vineetg@rivosinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/riscv/vdso-32.so | Bin 2900 -> 2980 bytes
 linux-user/riscv/vdso-64.so | Bin 3856 -> 3944 bytes
 linux-user/riscv/vdso.S     |   8 ++++----
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/riscv/vdso-32.so b/linux-user/riscv/vdso-32.so
index 1ad1e5cbbbb8b1fe36b0fe4bcb6c06fab8219ecd..c2ce2a4757900a16b891bb98f7a027ac30c47a5f 100755
GIT binary patch
delta 643
zcmYjPOH5Ni6ur|gwzc+9XhF0TMNp`GR6Zh~mKGzXLQ}<nX)IXKDriMfutkSuO^l&=
z8%#8QLeLmPh&6@~q8pbkjmE?!?quJE3vZFdn`GwPbMD;S+&S~jvFMz4i@%wQ7Lm>j
z)#}rFqIBkAs)(L%{#f{&3ii(}e_yh8Pgey7+LRKpj+~BYkcbM&LN$yL<+mtroa4HJ
zZMB#&#N4vYgN#$Ed?)jGwn>u^j$uL6%5;@6#JIs26v~>`CEXn6`%uX09<>tLIBejZ
zFY)AcUK{^`w8`*U60=@WX3@OR=)D9Xp?Lu9eduPPPr;Cc@g53huwxpgyD;B@WePiQ
z!+Hz5CSkh?Bv5z*_UkB`K=C*n*Cd=*Q4&Y#73{u@vN7x##oicPmry<e_b@6hVqXON
zFQ9S|2hOAF9IC@O7{Z}G)C5s`8lF=)d=hm5)OVx7i^fk9N7f~du1OqQm1z1X(fmQ8
z<-J7fip23{iMDqVC*Df5zmc$aEJ<`ON_4%FJY|0RKiMa`*FxkJpP`=5Nj~S5mxeB>
ze%vHqR9p0zd0b`2Q|4;3R+vXr`7g}X=20v*GSM6w@2yKv<qSmwLw&w8y?%;@!u|f9
zz(BYsnvEJuwVHOJBuy@TzIRS}W~$!$N`A#>wky9(Ht|*2WbKiW=;xN^G26tL(qVS~
E1D4W)y8r+H

delta 565
zcmZ1?eno780^^#Aisp<K6C<@*Em#>CEGBkVi#XS5FJ{d(y6ALK`0fjZ#F<&4uOCj9
zVB90Pf`x%0f`NfSh=GSe3rO=!e#$6sXaLm50TmSm(hNX850vi%q*ajQ?SOoV&8AHG
zjA8<eK<j|q3?Ll<q^AJsKMa$1G0W9o0CGY0Fa+=^r0_8?2mqM|Kn&8N0}=;<DG-W*
zxhg1_fw9JcQJRNo11qaTN@G$}LUUY8OlwqIM0;3ANM}%2K(}9yPp?;>N59(ymx)f3
z946aMv6*T$&0@OQ43n8gvkYeI&C!{wHBV!{+5(q_PELy)7TYbcS!%V+V!7E0la)rR
z3|8x{(OIjtPGh~=29=FUn-n(7ZIRh3wM}BX*bb4MLc0WZ^X=i;%e9YVKO5Mwb2h(Z
zS<c9~V)8^beIR)dNIFdZ$Yx)U92QkTA0UTlACNDB6n1lfeB`j*0Oa!^$sYmok;9rH
zIXR%X(kC%nFP}lBf+5}|($CS?)0rVYAjsd@)g?5@HGcDYc1dQ&H<K4~2~WPm#ls4+
se#PXsT<YM+cb+_vTX?b#w*X_!WLs`^RvUH(hMQoq8lafy<X&!Z0L@r})Bpeg

diff --git a/linux-user/riscv/vdso-64.so b/linux-user/riscv/vdso-64.so
index 83992bebe6d0182f24edfffc531015fd2f4e1cfb..ae49f5b043b5941b9d304a056c2b50c185f413b0 100755
GIT binary patch
delta 646
zcmZWmUr1A79KGMY?$6b`wWF>%t*xcjTq|2?Y1y{IoJiPMVG&Z%B2pwIB5I9(EFvNz
z?)pL^MhuHse{pSyL_`lkFFpi~An2jTdJL*3ANrQJmv!LB`8el0AAG-~d%k&ezwB?w
z2F6$=o7t~c%+g;}vY+2KKi``ijg36bKRDF->-VRR@qD&8Un;XH#&qseT&%$Rm2UT<
z5wR{OC8yv<Nt1Fsqqy8}YuvEmmP<wD5_3*7Ng5DRT5w!&LGU==ZEB~WIB7zu!XYLo
zrbB2|uHxr1NlGelKw{}AYt!U8*%Xz)E<;X3NullnOy^-f2g_Mlr=gxf{S+D|p`C{9
z6znIl;shMWfum>~N7E5BABJ-bD-Xhz0JX&79!2Xuw2ffZFjf!2GYIb<tck++hOqWE
zq5T!1V}Z~)Pgqwbtba-H&k?#_5CYE#8=eq?1;WNhgiQ|#-Lr(vc|zzuVM~s%^&X+;
zE}?gZu<Z_E`)xv>PUycy@b0)t*m;96aGkL08X<g@5V=C)*VSz-@lJE8^u?@+F0z>>
zzGUeUrG%_8_LX0>9{X?YDUqp`qVpeCm%D6~@^8O6!(HS))fFn#xbF-%SEEbG<V5e#
zU|*s%p~mXuVa5Zphkvj+RNc6#^Y69@tMD#+L~gP0Nqd+r@W=LunynXa3e51L=KTW|
Cbc<&I

delta 523
zcmaDMH$iTK2Ga$uiCSTd6%!l%IV;#0z@TE{&Eq0-y=7zemwaL`Tf>?2`}67j2EFQ>
zo3}9LFfy7<R%8}u^qFkQtS?yr)xrVM$sh`(g&24k940q1i%X`U$=X3>BQ|eju4WV!
zVPs&i0SYMq=>Q<@1Ed)yJF+U*UjQ<AfEWY>fLI5JPe5rP#V`Xz00DDVP%r~yjRT`J
z57PlQR)-eG#s<cwn&yg@lGcK@oc4^4l+J{%nC^(4kluhkpMH-CE)yLl*-W;WVlvfW
zn$C2M87easX35N!m?JV*U>?tWjs-3YofbJbEw)=?v(##t#d5P1CM%6r8LZY@qqA0P
zoyK~#4JsRzHYseD+aj}7YMaD%u^l2ig?0(-=G()wmunx#evpSj4qh<1k!|y419kyW
za5-|gGn!0n6rMbX!-TP6@<on##+1p1oZ^xu(C~qIwGXHh6nYVp6M?emVY>k;>oa*G
zP}TuWFGF&2M1Z-gk@Mz*obD`4Zx|+X@#%5GYydif&z`YjawDHTFrX$M;<IB+n9Rs;
L&uBT>kzXAEcnEna

diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
index a86d8fc488..c37275233a 100644
--- a/linux-user/riscv/vdso.S
+++ b/linux-user/riscv/vdso.S
@@ -101,12 +101,12 @@ endf __vdso_flush_icache
 	.cfi_startproc simple
 	.cfi_signal_frame
 
-#define sizeof_reg	(__riscv_xlen / 4)
+#define sizeof_reg	(__riscv_xlen / 8)
 #define sizeof_freg	8
-#define B_GR	(offsetof_uc_mcontext - sizeof_rt_sigframe)
-#define B_FR	(offsetof_uc_mcontext - sizeof_rt_sigframe + offsetof_freg0)
+#define B_GR	0
+#define B_FR	offsetof_freg0
 
-	.cfi_def_cfa	2, sizeof_rt_sigframe
+	.cfi_def_cfa	2, offsetof_uc_mcontext
 
 	/* Return address */
 	.cfi_return_column 64
-- 
2.34.1




* [PULL 7/8] linux-user/elfload: test return value of getrlimit
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (5 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 6/8] linux-user/riscv: Adjust vdso signal frame cfa offsets Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  0:20 ` [PULL 8/8] linux-user/elfload: check PR_GET_DUMPABLE before creating coredump Richard Henderson
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Thomas Weißschuh

From: Thomas Weißschuh <thomas@t-8ch.de>

Should getrlimit() fail, the value of dumpsize.rlim_cur may not be
initialized. Avoid reading garbage data by checking the return value of
getrlimit().
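
The pattern can be sketched in isolation as follows. This is a minimal
example with a hypothetical helper name, not the code from elfload.c; it
only shows that rlim_cur is read after getrlimit() has reported success.

```c
#include <stdbool.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns true only when getrlimit() succeeded
 * and the soft core-size limit is zero.
 */
static bool core_dumps_disabled_by_rlimit(void)
{
    struct rlimit dumpsize;

    if (getrlimit(RLIMIT_CORE, &dumpsize) != 0) {
        /* On failure, dumpsize may be uninitialized; do not read it. */
        return false;
    }
    return dumpsize.rlim_cur == 0;
}
```

As in the patch, a failing getrlimit() is not treated as "dumps disabled";
the caller simply continues.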

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240120-qemu-user-dumpable-v3-1-6aa410c933f1@t-8ch.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index cf9e74468b..c596871938 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -4667,9 +4667,9 @@ static int elf_core_dump(int signr, const CPUArchState *env)
     init_note_info(&info);
 
     errno = 0;
-    getrlimit(RLIMIT_CORE, &dumpsize);
-    if (dumpsize.rlim_cur == 0)
+    if (getrlimit(RLIMIT_CORE, &dumpsize) == 0 && dumpsize.rlim_cur == 0) {
         return 0;
+    }
 
     corefile = core_dump_filename(ts);
 
-- 
2.34.1




* [PULL 8/8] linux-user/elfload: check PR_GET_DUMPABLE before creating coredump
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (6 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 7/8] linux-user/elfload: test return value of getrlimit Richard Henderson
@ 2024-01-21  0:20 ` Richard Henderson
  2024-01-21  7:33 ` [PULL 0/8] tcg + linux-user patch queue Michael Tokarev
  2024-01-22 15:26 ` Peter Maydell
  9 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-21  0:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Thomas Weißschuh

From: Thomas Weißschuh <thomas@t-8ch.de>

A process can opt out of coredump creation by calling
prctl(PR_SET_DUMPABLE, 0).
linux-user passes this call from the guest through to the host
operating system.
From there the flag can be read back with prctl(PR_GET_DUMPABLE), so that
qemu-user itself does not create a coredump if the guest opted out.
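
The round trip through the kernel can be sketched on a Linux host as below.
The helper names are hypothetical and are not from elfload.c; the sketch
only assumes the documented Linux behaviour that PR_GET_DUMPABLE returns 0
once the flag has been cleared with PR_SET_DUMPABLE.

```c
#include <stdbool.h>
#include <sys/prctl.h>

/* Hypothetical helper: what a guest does to opt out of coredumps. */
static int opt_out_of_coredumps(void)
{
    /* Returns 0 on success, -1 on error. */
    return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/* Hypothetical helper: true if the dumpable flag is currently cleared. */
static bool process_opted_out_of_coredumps(void)
{
    /* PR_GET_DUMPABLE returns the current flag value; 0 means "not dumpable". */
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 0;
}
```

Because the flag lives in the host kernel, no extra bookkeeping is needed
in the emulator: the value the guest set is the value read back later.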

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240120-qemu-user-dumpable-v3-2-6aa410c933f1@t-8ch.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index c596871938..daf7ef8435 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2,6 +2,7 @@
 #include "qemu/osdep.h"
 #include <sys/param.h>
 
+#include <sys/prctl.h>
 #include <sys/resource.h>
 #include <sys/shm.h>
 
@@ -4667,6 +4668,11 @@ static int elf_core_dump(int signr, const CPUArchState *env)
     init_note_info(&info);
 
     errno = 0;
+
+    if (prctl(PR_GET_DUMPABLE) == 0) {
+        return 0;
+    }
+
     if (getrlimit(RLIMIT_CORE, &dumpsize) == 0 && dumpsize.rlim_cur == 0) {
         return 0;
     }
-- 
2.34.1




* Re: [PULL 0/8] tcg + linux-user patch queue
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (7 preceding siblings ...)
  2024-01-21  0:20 ` [PULL 8/8] linux-user/elfload: check PR_GET_DUMPABLE before creating coredump Richard Henderson
@ 2024-01-21  7:33 ` Michael Tokarev
  2024-01-22 15:26 ` Peter Maydell
  9 siblings, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2024-01-21  7:33 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

21.01.2024 03:20, Richard Henderson:

> tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
> tcg: Clean up error paths in alloc_code_gen_buffer_splitwx_memfd
> linux-user/riscv: Adjust vdso signal frame cfa offsets
> linux-user: Fixed cpu restore with pc 0 on SIGBUS

It looks like the last two should go to stable-8.2 too
(besides the s390 fix which is already marked for-stable).
Please let me know if I'm wrong.

Thanks,

/mjt



* Re: [PULL 0/8] tcg + linux-user patch queue
  2024-01-21  0:20 [PULL 0/8] tcg + linux-user patch queue Richard Henderson
                   ` (8 preceding siblings ...)
  2024-01-21  7:33 ` [PULL 0/8] tcg + linux-user patch queue Michael Tokarev
@ 2024-01-22 15:26 ` Peter Maydell
  2024-01-22 22:02   ` Richard Henderson
  9 siblings, 1 reply; 12+ messages in thread
From: Peter Maydell @ 2024-01-22 15:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Sun, 21 Jan 2024 at 00:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 3f2a357b95845ea0bf7463eff6661e43b97d1afc:
>
>   Merge tag 'hw-cpus-20240119' of https://github.com/philmd/qemu into staging (2024-01-19 11:39:38 +0000)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20240121
>
> for you to fetch changes up to 1d5e32e3198d2d8fd2342c8f7f8e0875aeff49c5:
>
>   linux-user/elfload: check PR_GET_DUMPABLE before creating coredump (2024-01-21 10:25:07 +1100)
>
> ----------------------------------------------------------------
> tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
> tcg: Clean up error paths in alloc_code_gen_buffer_splitwx_memfd
> linux-user/riscv: Adjust vdso signal frame cfa offsets
> linux-user: Fixed cpu restore with pc 0 on SIGBUS
>
> ----------------------------------------------------------------

The new chacha test seems to consistently segfault on aarch64 host:

https://gitlab.com/qemu-project/qemu/-/jobs/5979230595
https://gitlab.com/qemu-project/qemu/-/jobs/5978381815
https://gitlab.com/qemu-project/qemu/-/jobs/5982075408

thanks
-- PMM



* Re: [PULL 0/8] tcg + linux-user patch queue
  2024-01-22 15:26 ` Peter Maydell
@ 2024-01-22 22:02   ` Richard Henderson
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2024-01-22 22:02 UTC (permalink / raw)
  To: Peter Maydell, Alex Bennée; +Cc: qemu-devel

On 1/23/24 01:26, Peter Maydell wrote:
> On Sun, 21 Jan 2024 at 00:22, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The following changes since commit 3f2a357b95845ea0bf7463eff6661e43b97d1afc:
>>
>>    Merge tag 'hw-cpus-20240119' of https://github.com/philmd/qemu into staging (2024-01-19 11:39:38 +0000)
>>
>> are available in the Git repository at:
>>
>>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20240121
>>
>> for you to fetch changes up to 1d5e32e3198d2d8fd2342c8f7f8e0875aeff49c5:
>>
>>    linux-user/elfload: check PR_GET_DUMPABLE before creating coredump (2024-01-21 10:25:07 +1100)
>>
>> ----------------------------------------------------------------
>> tcg/s390x: Fix encoding of VRIc, VRSa, VRSc insns
>> tcg: Clean up error paths in alloc_code_gen_buffer_splitwx_memfd
>> linux-user/riscv: Adjust vdso signal frame cfa offsets
>> linux-user: Fixed cpu restore with pc 0 on SIGBUS
>>
>> ----------------------------------------------------------------
> 
> The new chacha test seems to consistently segfault on aarch64 host:
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/5979230595
> https://gitlab.com/qemu-project/qemu/-/jobs/5978381815
> https://gitlab.com/qemu-project/qemu/-/jobs/5982075408

Oh dear.  It seems to be a problem with the aarch64 cross-compiler for s390x.
If I use a binary created on an s390x or x86_64 host, it works.

Unless someone has a better idea, I'll drop the testcase for now.


r~



