From: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
To: "Wu, Fei" <fei2.wu@intel.com>,
Richard Henderson <richard.henderson@linaro.org>,
qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, palmer@dabbelt.com
Subject: Re: [PATCH v6 00/25] target/riscv: MSTATUS_SUM + cleanups
Date: Tue, 4 Apr 2023 15:11:05 +0800
Message-ID: <a2669beb-2bb5-32cd-c31b-2f5aaeee42c7@linux.alibaba.com>
In-Reply-To: <66a60213-0783-c929-5bbc-e012de2a4183@intel.com>

On 2023/4/4 14:42, Wu, Fei wrote:
> On 3/25/2023 6:54 PM, Richard Henderson wrote:
>> This builds on Fei and Zhiwei's SUM and TB_FLAGS changes.
>>
>> * Reclaim 5 TB_FLAGS bits, since we nearly ran out.
>>
>> * Using cpu_mmu_index(env, true) is insufficient to implement
>> HLVX properly. While that chooses the correct mmu_idx, it
>> does not perform the read with execute permission.
>> I add a new tcg interface to perform a read-for-execute with
>> an arbitrary mmu_idx. This is still not 100% compliant, but
>> it's closer.
>>
>> * Handle mstatus.MPV in cpu_mmu_index.
>> * Use vsstatus.SUM when required for MMUIdx_S_SUM.
>> * Cleanups for get_physical_address.
>>
>> While this passes check-avocado, I'm sure that's insufficient.
>> Please have a close look.
>>
> I ran stress-ng to get a feel for the performance gain, although
> stress-ng is not designed to be a performance workload. By the way, I had
> to revert commit 0ee342256af9, which is unrelated to this series;
> otherwise qemu exited during the test.
>
> ./stress-ng --timeout 5 --metrics-brief --class os --sequential 1
>
> Here are the results. In general, most of the tests benefit from this
> series, but please note that the results are not necessarily consistent
> across multiple runs, and some regressions are not real, but I haven't
> checked them one by one.
>
> stressor        master(60ca584b)    master + this     speedup
>                 bogo ops/s          bogo ops/s
>                 (usr+sys time)      (usr+sys time)
> sigsuspend 19430.09 1492746.34 76.8265
> utime 8779.64 271023.89 30.8696
> chmod 1728.26 27050.50 15.6519
> vdso 23527136.74 246955742.76 10.4966
> signal 584521.13 5470775.44 9.35941
> sigtrap 822935.76 7190973.63 8.7382
> signest 802706.93 6969509.05 8.68251
> sockpair 501188.08 4242275.08 8.46444
> msg 1627863.38 13557215.89 8.32823
> sigpending 551174.68 4575836.91 8.30197
> locka 1447750.95 11727762.91 8.10068
> lockofd 1460020.77 11562178.66 7.91919
> sigsegv 718492.57 5673228.57 7.89602
> getrandom 129004.90 1006544.31 7.80237
> sigq 892062.12 6828556.43 7.6548
> chdir 13.39 100.66 7.51755
> timerfd 2074142.37 15395307.29 7.42249
> mq 916620.00 6208148.59 6.77287
> mutex 1124306.59 7285459.79 6.47996
> urandom 104868.58 678510.46 6.4701
> pipe 2243935.71 14391093.39 6.41333
> loadavg 463874.30 2936816.17 6.33106
> fifo 423415.43 2632734.32 6.21785
> vm 16726.91 99928.62 5.97412
> handle 199246.08 1131172.45 5.67726
> fstat 2383.12 13479.35 5.65618
> sigrt 405007.13 2143758.11 5.29314
> access 8449.17 44145.10 5.22479
> sigfd 1506073.95 7408089.06 4.91881
> sysinfo 11711.47 54868.08 4.68499
> sigio 1672452.59 7564833.33 4.5232
> rlimit 26771.83 119476.12 4.46276
> xattr 772.25 3412.81 4.41931
> udp 595733.08 2495239.72 4.18852
> sockfd 260825.22 1061910.05 4.07135
> get 13169.56 52788.06 4.00834
> getdent 141465.81 564471.43 3.99016
> rename 61771.74 246277.28 3.98689
> chown 54946.74 212353.58 3.86472
> dev 3555.80 13596.14 3.82365
> mincore 6617.92 25215.66 3.81021
> file-ioctl 105919.35 398122.29 3.75873
> link 15.45 56.02 3.62589
> splice 239841.25 865390.06 3.60818
> io-uring 45798.90 157006.17 3.42816
> filename 7795.98 26238.75 3.36568
> sock 1746.96 5850.73 3.34909
> vm-splice 953550.50 3188724.62 3.34405
> schedpolicy 231915.33 773655.76 3.33594
> clock 21878.02 72400.21 3.30927
> fcntl 76122.11 245817.92 3.22926
> dentry 79533.95 247610.80 3.11327
> fpunch 11895.30 36608.97 3.0776
> revio 866066.56 2596187.53 2.99768
> null 2351038.37 6984334.92 2.97074
> mknod 71145.05 203284.26 2.85732
> symlink 12.40 35.41 2.85565
> fiemap 45437.02 128983.69 2.83874
> sleep 100093.89 282540.81 2.82276
> dir 99154.72 272727.21 2.75052
> timer 126419.44 344857.10 2.72788
> set 70640.29 192423.96 2.724
> udp-flood 662581.75 1782759.62 2.69063
> ioprio 7030.55 18807.67 2.67513
> epoll 147525.39 387861.58 2.62912
> vm-rw 1437.12 3774.13 2.62618
> kill 234075.90 613281.66 2.62001
> hdd 99017.45 257841.08 2.604
> rtc 57639.55 149363.61 2.59134
> dirmany 127249.90 323667.85 2.54356
> sem-sysv 1096787.78 2753588.88 2.51059
> close 194579.21 482854.54 2.48153
> dnotify 15125.16 37097.94 2.45273
> dccp 7554.97 18429.65 2.43941
> lease 285588.09 692990.31 2.42654
> eventfd 282256.72 681576.60 2.41474
> sockdiag 14803911.93 34934756.45 2.35983
> memfd 3632.45 8513.45 2.34372
> tee 124239.86 290298.68 2.3366
> alarm 78757.48 181210.40 2.30087
> poll 128638.34 292293.31 2.27221
> open 189323.41 418865.86 2.21244
> sigpipe 222534.69 486854.87 2.18777
> pty 18.95 39.13 2.06491
> futex 1333749.78 2742935.07 2.05656
> lockf 648732.25 1321326.88 2.03678
> kcmp 734152.03 1452613.12 1.97863
> procfs 7378.58 14503.74 1.96565
> sockmany 94910.81 180132.46 1.89791
> dirdeep 10330.82 19390.08 1.87692
> touch 97843.94 182585.97 1.86609
> chattr 2952.98 5426.15 1.83752
> mmaphuge 430.84 738.17 1.71333
> sem 649644.88 1107290.70 1.70446
> ptrace 1010862.41 1677555.44 1.65953
> vfork 244944.97 403514.39 1.64737
> nanosleep 23147.04 38097.83 1.64591
> mprotect 1068863.24 1729245.09 1.61784
> pipeherd 720787.08 1157261.92 1.60555
> pthread 48395.68 76169.49 1.57389
> enosys 8271.11 12705.37 1.53611
> sockabuse 2825.44 4251.52 1.50473
> af-alg 620270.87 916118.93 1.47697
> fork 10583.97 15363.15 1.45155
> copy-file 6675.07 9389.54 1.40666
> resched 1730236.55 2421449.49 1.39949
> msync 93196.18 122263.64 1.3119
> vforkmany 239372.56 304313.41 1.2713
> vm-segv 11918.23 14981.24 1.257
> readahead 261489.55 321372.13 1.22901
> sendfile 147043.77 174971.03 1.18992
> dynlib 8526.99 10078.23 1.18192
> fault 86430.63 100320.47 1.16071
> dup 9829.71 11264.11 1.14592
> full 473749.38 541801.33 1.14365
> mmapaddr 315772.34 351766.42 1.11399
> spawn 3937.57 4384.92 1.11361
> io 371206.67 409205.80 1.10237
> munmap 64162.14 70473.66 1.09837
> exit-group 5990.95 6522.70 1.08876
> pidfd 37614.16 40687.85 1.08172
> flock 14069057.61 15117799.43 1.07454
> wait 106334.40 113658.40 1.06888
> mmapfork 1.81 1.93 1.0663
> daemon 1161091.36 1234795.43 1.06348
> bigheap 185514.46 195279.13 1.05264
> mmapfixed 319.65 333.70 1.04395
> brk 1410050.59 1456025.25 1.0326
> sigabrt 12129.51 12520.45 1.03223
> sysfs 806.78 831.54 1.03069
> dev-shm 40.30 41.37 1.02655
> bad-altstack 7310.53 7493.23 1.02499
> shm 823.73 842.50 1.02279
> shm-sysv 1132.54 1151.86 1.01706
> mmapmany 400323.77 406078.50 1.01438
> session 12096.44 12228.64 1.01093
> madvise 116.81 117.96 1.00985
> clone 28152.35 28414.20 1.0093
> msyncmany 2220.25 2238.88 1.00839
> pageswap 205651.13 207367.84 1.00835
> unshare 637.92 642.98 1.00793
> remap 373.18 375.69 1.00673
> personality 1698012.68 1706642.92 1.00508
> reboot 117234.02 117421.91 1.0016
> itimer 24962.64 24971.19 1.00034
> sync-file 0.00 0.00 1
> sigfpe 0.00 0.00 1
> seek 0.00 0.00 1
> inode-flags 0.00 0.00 1
> env 0.00 0.00 1
> prctl 11805.81 11772.73 0.997198
> malloc 991487.43 987061.41 0.995536
> mmap 14.48 14.39 0.993785
> zombie 33753.24 33539.75 0.993675
> rmap 625.84 620.94 0.992171
> tlb-shootdown 358.25 355.33 0.991849
> switch 1251701.93 1240818.57 0.991305
> zero 127112.38 125254.50 0.985384
> resources 685.62 674.89 0.98435
> yield 4184626.17 4117860.34 0.984045
> mlock 494527.50 485733.90 0.982218
> fallocate 32711.39 32067.69 0.980322
> sigchld 46289.82 44914.65 0.970292
> inotify 3013.11 2879.87 0.95578
> opcode 11315.78 10538.58 0.931317
> nice 154327.30 136797.63 0.886412
> mremap 225.29 198.82 0.882507
> exec 4118.89 3282.85 0.797023
> vm-addr 214.25 166.69 0.778016
> landlock 950.00 722.74 0.760779

Thanks for testing. Have you analyzed the cases with worse performance?
This series is meant to be an optimization, so the regressions deserve a
closer look.
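
One way to shortlist the cases worth analyzing is to recompute the speedup
column (patched bogo ops/s divided by baseline bogo ops/s) and keep only the
rows that fall below a noise threshold. The following is a minimal sketch,
not something from the series: the helper names and the 0.95 threshold are
assumptions chosen for illustration, and since the numbers vary between runs
the threshold would need tuning against repeated measurements.

```python
from typing import NamedTuple


class Row(NamedTuple):
    stressor: str
    base: float      # bogo ops/s on master
    patched: float   # bogo ops/s on master + this series


def parse_rows(text):
    """Parse 'stressor base patched [speedup]' lines, skipping header lines."""
    rows = []
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        if len(fields) < 3:
            continue
        try:
            base, patched = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # header or other non-numeric line
        rows.append(Row(fields[0], base, patched))
    return rows


def speedup(row):
    """Ratio patched/base; rows reporting 0.00 for both are treated as 1."""
    if row.base == 0.0:
        return 1.0 if row.patched == 0.0 else float("inf")
    return row.patched / row.base


def regressions(rows, threshold=0.95):
    """Return (stressor, speedup) pairs below the threshold, worst first.

    The 0.95 default is an arbitrary illustrative cut-off, not a measured
    noise floor.
    """
    slow = [(r.stressor, speedup(r)) for r in rows if speedup(r) < threshold]
    return sorted(slow, key=lambda item: item[1])


# A few rows copied from the table above.
SAMPLE = """\
> sigsuspend 19430.09 1492746.34 76.8265
> env 0.00 0.00 1
> prctl 11805.81 11772.73 0.997198
> exec 4118.89 3282.85 0.797023
> landlock 950.00 722.74 0.760779
"""

for name, ratio in regressions(parse_rows(SAMPLE)):
    print(f"{name}: {ratio:.4f}")
```

With the sample rows, only landlock and exec fall below the threshold;
prctl stays within the assumed noise band. Rerunning the flagged stressors
several times on both builds would show whether they are real regressions.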

Thanks,
Zhiwei

> Thanks,
> Fei.
>> r~
>>
>>
>> Fei Wu (2):
>> target/riscv: Separate priv from mmu_idx
>> target/riscv: Reduce overhead of MSTATUS_SUM change
>>
>> LIU Zhiwei (4):
>> target/riscv: Extract virt enabled state from tb flags
>> target/riscv: Add a general status enum for extensions
>> target/riscv: Encode the FS and VS on a normal way for tb flags
>> target/riscv: Add a tb flags field for vstart
>>
>> Richard Henderson (19):
>> target/riscv: Remove mstatus_hs_{fs,vs} from tb_flags
>> accel/tcg: Add cpu_ld*_code_mmu
>> target/riscv: Use cpu_ld*_code_mmu for HLVX
>> target/riscv: Handle HLV, HSV via helpers
>> target/riscv: Rename MMU_HYP_ACCESS_BIT to MMU_2STAGE_BIT
>> target/riscv: Introduce mmuidx_sum
>> target/riscv: Introduce mmuidx_priv
>> target/riscv: Introduce mmuidx_2stage
>> target/riscv: Move hstatus.spvp check to check_access_hlsv
>> target/riscv: Set MMU_2STAGE_BIT in riscv_cpu_mmu_index
>> target/riscv: Check SUM in the correct register
>> target/riscv: Hoist second stage mode change to callers
>> target/riscv: Hoist pbmte and hade out of the level loop
>> target/riscv: Move leaf pte processing out of level loop
>> target/riscv: Suppress pte update with is_debug
>> target/riscv: Don't modify SUM with is_debug
>> target/riscv: Merge checks for reserved pte flags
>> target/riscv: Reorg access check in get_physical_address
>> target/riscv: Reorg sum check in get_physical_address
>>
>> include/exec/cpu_ldst.h | 9 +
>> target/riscv/cpu.h | 47 ++-
>> target/riscv/cpu_bits.h | 12 +-
>> target/riscv/helper.h | 12 +-
>> target/riscv/internals.h | 35 ++
>> accel/tcg/cputlb.c | 48 +++
>> accel/tcg/user-exec.c | 58 +++
>> target/riscv/cpu.c | 2 +-
>> target/riscv/cpu_helper.c | 393 +++++++++---------
>> target/riscv/csr.c | 21 +-
>> target/riscv/op_helper.c | 113 ++++-
>> target/riscv/translate.c | 72 ++--
>> .../riscv/insn_trans/trans_privileged.c.inc | 2 +-
>> target/riscv/insn_trans/trans_rvf.c.inc | 2 +-
>> target/riscv/insn_trans/trans_rvh.c.inc | 135 +++---
>> target/riscv/insn_trans/trans_rvv.c.inc | 22 +-
>> target/riscv/insn_trans/trans_xthead.c.inc | 7 +-
>> 17 files changed, 595 insertions(+), 395 deletions(-)
>>