From: Akira Tsukamoto <akira.tsukamoto@gmail.com>
To: Paul Walmsley <paul.walmsley@sifive.com>,
    Palmer Dabbelt <palmer@dabbelt.com>,
    Albert Ou <aou@eecs.berkeley.edu>,
    Akira Tsukamoto <akira.tsukamoto@gmail.com>,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/1] riscv: improving uaccess with logs from network bench
Date: Wed, 23 Jun 2021 21:37:27 +0900
Message-ID: <3e1dbea4-3b0f-de32-5447-2e23c6d4652a@gmail.com>

This patch optimizes copy_to_user and copy_from_user.

I rewrote the functions, heavily influenced by Garry's memcpy
function [1]. Unlike other memcpy functions, they must be written in
assembler so that page faults can be handled manually inside the
function.

This patch dramatically reduces CPU usage in kernel space, especially
for applications that make system calls with large buffers, such as
network applications. The main reason is that every unaligned memory
access raises an exception and switches between S-mode and M-mode,
causing a large overhead.

The motivation for the patch was to improve network performance on the
BeagleV beta board. Observing with perf, memcpy and __asm_copy_to_user
showed heavy CPU usage, and the network speed was limited to around
680 Mbps on a 1 Gbps LAN.

Matteo is creating patches to improve memcpy in C, and this patch is
meant to be used with them.

Typical network applications optimize by using system calls with a
large buffer, such as send()/recv() and sendto()/recvfrom().

The benchmark results below are with only the copy_user patch applied.
memcpy is without Matteo's patches, but both functions are listed
since they are the two largest sources of overhead.

All results are from the same base kernel, same rootfs and same
BeagleV beta board.

The iperf3 results show a speedup on UDP with the copy_user patch
alone.

--- UDP send ---
306 Mbits/sec      362 Mbits/sec
305 Mbits/sec      362 Mbits/sec

--- UDP recv ---
772 Mbits/sec      787 Mbits/sec
773 Mbits/sec      784 Mbits/sec

Comparison by "perf top -Ue task-clock" while running iperf3.

--- TCP recv ---
 * Before
  40.40%  [kernel]  [k] memcpy
  33.09%  [kernel]  [k] __asm_copy_to_user
 * With patch
  50.35%  [kernel]  [k] memcpy
  13.76%  [kernel]  [k] __asm_copy_to_user

--- TCP send ---
 * Before
  19.96%  [kernel]  [k] memcpy
   9.84%  [kernel]  [k] __asm_copy_to_user
 * With patch
  14.27%  [kernel]  [k] memcpy
   7.37%  [kernel]  [k] __asm_copy_to_user

--- UDP recv ---
 * Before
  44.45%  [kernel]  [k] memcpy
  31.04%  [kernel]  [k] __asm_copy_to_user
 * With patch
  55.62%  [kernel]  [k] memcpy
  11.22%  [kernel]  [k] __asm_copy_to_user

--- UDP send ---
 * Before
  25.18%  [kernel]  [k] memcpy
  22.50%  [kernel]  [k] __asm_copy_to_user
 * With patch
  28.90%  [kernel]  [k] memcpy
   9.49%  [kernel]  [k] __asm_copy_to_user

---
v2 -> v3:
- Merged all patches

v1 -> v2:
- Added shift copy
- Separated patches for readability of changes in assembler
- Using perf results

[1] https://lkml.org/lkml/2021/2/16/778

Akira Tsukamoto (1):
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and
    pipeline stall

 arch/riscv/lib/uaccess.S | 181 +++++++++++++++++++++++++++++++--------
 1 file changed, 146 insertions(+), 35 deletions(-)

-- 
2.17.1
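
The "shift copy" named in the v1 -> v2 changelog is not spelled out in
this cover letter. As a rough illustration only, the general idea of
avoiding unaligned accesses can be sketched in C: load only aligned
words and rebuild each output word from two neighbouring input words
with shifts. The function name shift_copy_words, its parameters and the
64-bit little-endian layout are assumptions made for this sketch; it is
not the code in arch/riscv/lib/uaccess.S, which is written in assembler
so that page faults can be handled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of a "shift copy" (not the patch's code).
 *
 * The source data starts `off` bytes (0..7) into src_words[0], so it
 * is unaligned relative to 64-bit words. Instead of issuing unaligned
 * loads, every load below is an aligned 64-bit load, and each output
 * word is assembled from two adjacent input words.
 *
 * Assumes a little-endian machine. When off != 0 the function reads
 * src_words[0] .. src_words[n_words], i.e. one word more than it
 * writes.
 */
static void shift_copy_words(uint64_t *dst, const uint64_t *src_words,
                             unsigned off, size_t n_words)
{
    if (off == 0) {
        /* Already aligned: plain word copy, no shifting needed. */
        for (size_t i = 0; i < n_words; i++)
            dst[i] = src_words[i];
        return;
    }

    unsigned rshift = off * 8;      /* drop the bytes before the start  */
    unsigned lshift = 64 - rshift;  /* make room for the following word */
    uint64_t prev = src_words[0];

    for (size_t i = 0; i < n_words; i++) {
        uint64_t next = src_words[i + 1];

        /* Upper part of prev supplies the low bytes, lower part of
         * next supplies the high bytes of the output word. */
        dst[i] = (prev >> rshift) | (next << lshift);
        prev = next;
    }
}
```

Each input word is loaded once and reused for two output words, so the
loop performs one aligned load and one aligned store per word copied.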