qemu-devel.nongnu.org archive mirror
From: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
To: "Wu, Fei" <fei2.wu@intel.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, palmer@dabbelt.com
Subject: Re: [PATCH v6 00/25] target/riscv: MSTATUS_SUM + cleanups
Date: Tue, 4 Apr 2023 15:11:05 +0800
Message-ID: <a2669beb-2bb5-32cd-c31b-2f5aaeee42c7@linux.alibaba.com>
In-Reply-To: <66a60213-0783-c929-5bbc-e012de2a4183@intel.com>


On 2023/4/4 14:42, Wu, Fei wrote:
> On 3/25/2023 6:54 PM, Richard Henderson wrote:
>> This builds on Fei and Zhiwei's SUM and TB_FLAGS changes.
>>
>>    * Reclaim 5 TB_FLAGS bits, since we nearly ran out.
>>
>>    * Using cpu_mmu_index(env, true) is insufficient to implement
>>      HLVX properly.  While that chooses the correct mmu_idx, it
>>      does not perform the read with execute permission.
>>      I add a new tcg interface to perform a read-for-execute with
>>      an arbitrary mmu_idx.  This is still not 100% compliant, but
>>      it's closer.
>>
>>    * Handle mstatus.MPV in cpu_mmu_index.
>>    * Use vsstatus.SUM when required for MMUIdx_S_SUM.
>>    * Cleanups for get_physical_address.
>>
>> While this passes check-avocado, I'm sure that's insufficient.
>> Please have a close look.
>>
> I ran stress-ng to get a feel for the performance gain, although
> stress-ng is not designed to be a performance workload. By the way, I had
> to revert commit 0ee342256af9, which is unrelated to this series;
> otherwise qemu exited during the test.
>      ./stress-ng --timeout 5 --metrics-brief --class os --sequential 1
>
> Here are the results. In general, most of the tests benefit from this
> series, but please note that not all of the results are necessarily
> consistent across multiple runs, and some of the regressions are not
> real; I haven't checked them one by one.
>
>               master(60ca584b)   master + this      speedup
>
> stressor           bogo ops/s      bogo ops/s
>                 (usr+sys time)  (usr+sys time)
> sigsuspend            19430.09      1492746.34     76.8265
> utime                  8779.64       271023.89     30.8696
> chmod                  1728.26        27050.50     15.6519
> vdso               23527136.74    246955742.76     10.4966
> signal               584521.13      5470775.44     9.35941
> sigtrap              822935.76      7190973.63     8.7382
> signest              802706.93      6969509.05     8.68251
> sockpair             501188.08      4242275.08     8.46444
> msg                 1627863.38     13557215.89     8.32823
> sigpending           551174.68      4575836.91     8.30197
> locka               1447750.95     11727762.91     8.10068
> lockofd             1460020.77     11562178.66     7.91919
> sigsegv              718492.57      5673228.57     7.89602
> getrandom            129004.90      1006544.31     7.80237
> sigq                 892062.12      6828556.43     7.6548
> chdir                    13.39          100.66     7.51755
> timerfd             2074142.37     15395307.29     7.42249
> mq                   916620.00      6208148.59     6.77287
> mutex               1124306.59      7285459.79     6.47996
> urandom              104868.58       678510.46     6.4701
> pipe                2243935.71     14391093.39     6.41333
> loadavg              463874.30      2936816.17     6.33106
> fifo                 423415.43      2632734.32     6.21785
> vm                    16726.91        99928.62     5.97412
> handle               199246.08      1131172.45     5.67726
> fstat                  2383.12        13479.35     5.65618
> sigrt                405007.13      2143758.11     5.29314
> access                 8449.17        44145.10     5.22479
> sigfd               1506073.95      7408089.06     4.91881
> sysinfo               11711.47        54868.08     4.68499
> sigio               1672452.59      7564833.33     4.5232
> rlimit                26771.83       119476.12     4.46276
> xattr                   772.25         3412.81     4.41931
> udp                  595733.08      2495239.72     4.18852
> sockfd               260825.22      1061910.05     4.07135
> get                   13169.56        52788.06     4.00834
> getdent              141465.81       564471.43     3.99016
> rename                61771.74       246277.28     3.98689
> chown                 54946.74       212353.58     3.86472
> dev                    3555.80        13596.14     3.82365
> mincore                6617.92        25215.66     3.81021
> file-ioctl           105919.35       398122.29     3.75873
> link                     15.45           56.02     3.62589
> splice               239841.25       865390.06     3.60818
> io-uring              45798.90       157006.17     3.42816
> filename               7795.98        26238.75     3.36568
> sock                   1746.96         5850.73     3.34909
> vm-splice            953550.50      3188724.62     3.34405
> schedpolicy          231915.33       773655.76     3.33594
> clock                 21878.02        72400.21     3.30927
> fcntl                 76122.11       245817.92     3.22926
> dentry                79533.95       247610.80     3.11327
> fpunch                11895.30        36608.97     3.0776
> revio                866066.56      2596187.53     2.99768
> null                2351038.37      6984334.92     2.97074
> mknod                 71145.05       203284.26     2.85732
> symlink                  12.40           35.41     2.85565
> fiemap                45437.02       128983.69     2.83874
> sleep                100093.89       282540.81     2.82276
> dir                   99154.72       272727.21     2.75052
> timer                126419.44       344857.10     2.72788
> set                   70640.29       192423.96     2.724
> udp-flood            662581.75      1782759.62     2.69063
> ioprio                 7030.55        18807.67     2.67513
> epoll                147525.39       387861.58     2.62912
> vm-rw                  1437.12         3774.13     2.62618
> kill                 234075.90       613281.66     2.62001
> hdd                   99017.45       257841.08     2.604
> rtc                   57639.55       149363.61     2.59134
> dirmany              127249.90       323667.85     2.54356
> sem-sysv            1096787.78      2753588.88     2.51059
> close                194579.21       482854.54     2.48153
> dnotify               15125.16        37097.94     2.45273
> dccp                   7554.97        18429.65     2.43941
> lease                285588.09       692990.31     2.42654
> eventfd              282256.72       681576.60     2.41474
> sockdiag           14803911.93     34934756.45     2.35983
> memfd                  3632.45         8513.45     2.34372
> tee                  124239.86       290298.68     2.3366
> alarm                 78757.48       181210.40     2.30087
> poll                 128638.34       292293.31     2.27221
> open                 189323.41       418865.86     2.21244
> sigpipe              222534.69       486854.87     2.18777
> pty                      18.95           39.13     2.06491
> futex               1333749.78      2742935.07     2.05656
> lockf                648732.25      1321326.88     2.03678
> kcmp                 734152.03      1452613.12     1.97863
> procfs                 7378.58        14503.74     1.96565
> sockmany              94910.81       180132.46     1.89791
> dirdeep               10330.82        19390.08     1.87692
> touch                 97843.94       182585.97     1.86609
> chattr                 2952.98         5426.15     1.83752
> mmaphuge                430.84          738.17     1.71333
> sem                  649644.88      1107290.70     1.70446
> ptrace              1010862.41      1677555.44     1.65953
> vfork                244944.97       403514.39     1.64737
> nanosleep             23147.04        38097.83     1.64591
> mprotect            1068863.24      1729245.09     1.61784
> pipeherd             720787.08      1157261.92     1.60555
> pthread               48395.68        76169.49     1.57389
> enosys                 8271.11        12705.37     1.53611
> sockabuse              2825.44         4251.52     1.50473
> af-alg               620270.87       916118.93     1.47697
> fork                  10583.97        15363.15     1.45155
> copy-file              6675.07         9389.54     1.40666
> resched             1730236.55      2421449.49     1.39949
> msync                 93196.18       122263.64     1.3119
> vforkmany            239372.56       304313.41     1.2713
> vm-segv               11918.23        14981.24     1.257
> readahead            261489.55       321372.13     1.22901
> sendfile             147043.77       174971.03     1.18992
> dynlib                 8526.99        10078.23     1.18192
> fault                 86430.63       100320.47     1.16071
> dup                    9829.71        11264.11     1.14592
> full                 473749.38       541801.33     1.14365
> mmapaddr             315772.34       351766.42     1.11399
> spawn                  3937.57         4384.92     1.11361
> io                   371206.67       409205.80     1.10237
> munmap                64162.14        70473.66     1.09837
> exit-group             5990.95         6522.70     1.08876
> pidfd                 37614.16        40687.85     1.08172
> flock              14069057.61     15117799.43     1.07454
> wait                 106334.40       113658.40     1.06888
> mmapfork                  1.81            1.93     1.0663
> daemon              1161091.36      1234795.43     1.06348
> bigheap              185514.46       195279.13     1.05264
> mmapfixed               319.65          333.70     1.04395
> brk                 1410050.59      1456025.25     1.0326
> sigabrt               12129.51        12520.45     1.03223
> sysfs                   806.78          831.54     1.03069
> dev-shm                  40.30           41.37     1.02655
> bad-altstack           7310.53         7493.23     1.02499
> shm                     823.73          842.50     1.02279
> shm-sysv               1132.54         1151.86     1.01706
> mmapmany             400323.77       406078.50     1.01438
> session               12096.44        12228.64     1.01093
> madvise                 116.81          117.96     1.00985
> clone                 28152.35        28414.20     1.0093
> msyncmany              2220.25         2238.88     1.00839
> pageswap             205651.13       207367.84     1.00835
> unshare                 637.92          642.98     1.00793
> remap                   373.18          375.69     1.00673
> personality         1698012.68      1706642.92     1.00508
> reboot               117234.02       117421.91     1.0016
> itimer                24962.64        24971.19     1.00034
> sync-file                 0.00            0.00     1
> sigfpe                    0.00            0.00     1
> seek                      0.00            0.00     1
> inode-flags               0.00            0.00     1
> env                       0.00            0.00     1
> prctl                 11805.81        11772.73     0.997198
> malloc               991487.43       987061.41     0.995536
> mmap                     14.48           14.39     0.993785
> zombie                33753.24        33539.75     0.993675
> rmap                    625.84          620.94     0.992171
> tlb-shootdown           358.25          355.33     0.991849
> switch              1251701.93      1240818.57     0.991305
> zero                 127112.38       125254.50     0.985384
> resources               685.62          674.89     0.98435
> yield               4184626.17      4117860.34     0.984045
> mlock                494527.50       485733.90     0.982218
> fallocate             32711.39        32067.69     0.980322
> sigchld               46289.82        44914.65     0.970292
> inotify                3013.11         2879.87     0.95578
> opcode                11315.78        10538.58     0.931317
> nice                 154327.30       136797.63     0.886412
> mremap                  225.29          198.82     0.882507
> exec                   4118.89         3282.85     0.797023
> vm-addr                 214.25          166.69     0.778016
> landlock                950.00          722.74     0.760779
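
The speedup column above is simply the second bogo ops/s figure divided by the first. As a rough sketch of how the regressing rows could be shortlisted for a closer look, the snippet below recomputes that ratio for a few rows copied from the table. The 0.95 cut-off and the helper names are assumptions made for illustration, not anything used in this thread, and since single runs are noisy the output is only a starting point.

```python
# Sample rows copied from the table above:
# (stressor, bogo ops/s on master, bogo ops/s on master + this series).
ROWS = [
    ("sigsuspend", 19430.09, 1492746.34),
    ("itimer", 24962.64, 24971.19),
    ("sync-file", 0.00, 0.00),
    ("exec", 4118.89, 3282.85),
    ("vm-addr", 214.25, 166.69),
    ("landlock", 950.00, 722.74),
]

# Hypothetical cut-off: ratios above it are treated as run-to-run noise.
REGRESSION_THRESHOLD = 0.95


def speedup(before, after):
    """Return after / before, guarding against a zero baseline."""
    if before == 0.0:
        # The table reports 1 when both runs did no work at all.
        return 1.0 if after == 0.0 else float("inf")
    return after / before


def regressions(rows, threshold=REGRESSION_THRESHOLD):
    """Return (stressor, speedup) pairs below the threshold, worst first."""
    ratios = [(name, speedup(before, after)) for name, before, after in rows]
    flagged = [(name, ratio) for name, ratio in ratios if ratio < threshold]
    return sorted(flagged, key=lambda item: item[1])


if __name__ == "__main__":
    for name, ratio in regressions(ROWS):
        print(f"{name:12s} {ratio:.6f}")
```

Repeating the stress-ng run several times and feeding the averaged figures into the same helper would help separate real regressions from noise.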

Thanks for testing. Have you analyzed the cases with worse performance?
After all, we are doing an optimization.

Thanks,
Zhiwei

> Thanks,
> Fei.
>> r~
>>
>>
>> Fei Wu (2):
>>    target/riscv: Separate priv from mmu_idx
>>    target/riscv: Reduce overhead of MSTATUS_SUM change
>>
>> LIU Zhiwei (4):
>>    target/riscv: Extract virt enabled state from tb flags
>>    target/riscv: Add a general status enum for extensions
>>    target/riscv: Encode the FS and VS on a normal way for tb flags
>>    target/riscv: Add a tb flags field for vstart
>>
>> Richard Henderson (19):
>>    target/riscv: Remove mstatus_hs_{fs,vs} from tb_flags
>>    accel/tcg: Add cpu_ld*_code_mmu
>>    target/riscv: Use cpu_ld*_code_mmu for HLVX
>>    target/riscv: Handle HLV, HSV via helpers
>>    target/riscv: Rename MMU_HYP_ACCESS_BIT to MMU_2STAGE_BIT
>>    target/riscv: Introduce mmuidx_sum
>>    target/riscv: Introduce mmuidx_priv
>>    target/riscv: Introduce mmuidx_2stage
>>    target/riscv: Move hstatus.spvp check to check_access_hlsv
>>    target/riscv: Set MMU_2STAGE_BIT in riscv_cpu_mmu_index
>>    target/riscv: Check SUM in the correct register
>>    target/riscv: Hoist second stage mode change to callers
>>    target/riscv: Hoist pbmte and hade out of the level loop
>>    target/riscv: Move leaf pte processing out of level loop
>>    target/riscv: Suppress pte update with is_debug
>>    target/riscv: Don't modify SUM with is_debug
>>    target/riscv: Merge checks for reserved pte flags
>>    target/riscv: Reorg access check in get_physical_address
>>    target/riscv: Reorg sum check in get_physical_address
>>
>>   include/exec/cpu_ldst.h                       |   9 +
>>   target/riscv/cpu.h                            |  47 ++-
>>   target/riscv/cpu_bits.h                       |  12 +-
>>   target/riscv/helper.h                         |  12 +-
>>   target/riscv/internals.h                      |  35 ++
>>   accel/tcg/cputlb.c                            |  48 +++
>>   accel/tcg/user-exec.c                         |  58 +++
>>   target/riscv/cpu.c                            |   2 +-
>>   target/riscv/cpu_helper.c                     | 393 +++++++++---------
>>   target/riscv/csr.c                            |  21 +-
>>   target/riscv/op_helper.c                      | 113 ++++-
>>   target/riscv/translate.c                      |  72 ++--
>>   .../riscv/insn_trans/trans_privileged.c.inc   |   2 +-
>>   target/riscv/insn_trans/trans_rvf.c.inc       |   2 +-
>>   target/riscv/insn_trans/trans_rvh.c.inc       | 135 +++---
>>   target/riscv/insn_trans/trans_rvv.c.inc       |  22 +-
>>   target/riscv/insn_trans/trans_xthead.c.inc    |   7 +-
>>   17 files changed, 595 insertions(+), 395 deletions(-)
>>


