From: Chen Guokai <chenguokai17@mails.ucas.ac.cn>
To: paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, rostedt@goodmis.org, mingo@redhat.com,
	sfr@canb.auug.org.au
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	liaochang1@huawei.com,
	Chen Guokai <chenguokai17@mails.ucas.ac.cn>
Subject: [PATCH v6 00/13] Add OPTPROBES feature on RISCV
Date: Fri, 27 Jan 2023 21:05:28 +0800
Message-ID: <20230127130541.1250865-1-chenguokai17@mails.ucas.ac.cn>

Add jump optimization support for RISC-V.

Replace the ebreak instruction used by normal kprobes with an AUIPC/JALR
instruction pair, with the aim of reducing the probe-hit overhead.

All known optprobe-capable RISC architectures use a single jump or
branch instruction, while this series does not. RISC-V has a quite
limited jump range for its branch and jump instructions (+/-4KB for
branches, +/-1MB for JAL, i.e. a 2MB window), which prevents the
optimization from supporting probes that are spread all over the kernel.
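
The ranges above follow from the immediate widths in the base ISA: a
conditional branch encodes a 13-bit signed byte offset and JAL a 21-bit
one, both with an implicit zero low bit. A minimal sketch of that
arithmetic (the helper names are made up for illustration and are not
part of the series):

```python
from typing import Tuple


def pc_relative_reach(offset_bits: int) -> Tuple[int, int]:
    """Return the (min, max) byte offset of a signed, 2-byte-aligned field."""
    lowest = -(1 << (offset_bits - 1))
    highest = (1 << (offset_bits - 1)) - 2  # bit 0 of the offset is always zero
    return lowest, highest


# B-type branches: 12 encoded bits plus the implicit zero bit.
BRANCH_REACH = pc_relative_reach(13)
# J-type JAL: 20 encoded bits plus the implicit zero bit.
JAL_REACH = pc_relative_reach(21)


def reachable(src: int, dst: int, reach: Tuple[int, int]) -> bool:
    """True if a jump at address src can reach dst with the given encoding."""
    lowest, highest = reach
    return lowest <= dst - src <= highest
```

A detour buffer placed more than 1MB away from the probed address is
therefore out of reach of a single JAL, which is the case the
AUIPC/JALR pair is meant to cover.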

An AUIPC/JALR instruction pair is used instead, with a much wider jump
range (4GB, i.e. roughly +/-2GB around the PC): AUIPC loads the upper
20 bits of the offset into a free register and JALR adds the lower 12
bits, together forming a 32-bit offset. Note that returning from the
probe handler requires another free register. As kprobes can appear
almost anywhere inside the kernel, the free registers should be found
generically, without depending on the calling convention or any other
rules.
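
In the base ISA, AUIPC carries a 20-bit upper immediate and JALR a
12-bit immediate that is sign-extended, so the upper part has to be
rounded to compensate when bit 11 of the offset is set. A minimal
sketch of the split, assuming a plain signed byte offset from the AUIPC
to the target (the function name is hypothetical and this is not the
code from the series):

```python
from typing import Tuple


def split_auipc_jalr_offset(offset: int) -> Tuple[int, int]:
    """Split a PC-relative byte offset into (hi20, lo12) immediates.

    hi20 is the signed AUIPC immediate, lo12 the signed JALR immediate,
    chosen so that (hi20 << 12) + lo12 == offset.
    """
    # Adding 0x800 rounds to the nearest 4KB page, which cancels the
    # sign extension JALR applies to its 12-bit immediate.
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("target is outside the AUIPC/JALR range")
    return hi20, lo12
```

For example, an offset of 0x12345FFF is split into hi20 = 0x12346 and
lo12 = -1 rather than 0x12345 and 0xFFF.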

The algorithm for finding the free register is inspired by register
renaming in modern processors. From the perspective of register
renaming, a register can be treated as two different registers if two
neighboring instructions both write to it and nothing reads it in
between. Extending this fact, a register is considered free if there is
no read before its next write in the execution flow. We are free to
change its value without interfering with normal execution.
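
The "no read before the next write" rule can be sketched as follows.
This is only an illustration under strong assumptions: it looks at a
straight-line window of already-decoded instructions and ignores
branches, calls and exceptions, which a real implementation has to
handle. The Insn type and the function name are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Insn(NamedTuple):
    rd: Optional[int]      # destination register number, None if none
    rs: Tuple[int, ...]    # source register numbers


def find_free_registers(window: Iterable[Insn],
                        candidates: Iterable[int] = range(1, 32)) -> Set[int]:
    """Return registers whose first access in the window is a write."""
    undecided = set(candidates)   # registers not accessed so far
    free: Set[int] = set()
    for insn in window:
        # Sources are read before the destination is written, so a
        # register that is both read and written here is live, not free.
        undecided -= set(insn.rs)
        if insn.rd is not None and insn.rd in undecided:
            free.add(insn.rd)
            undecided.discard(insn.rd)
    # Registers never touched in the window stay undecided and are
    # conservatively treated as not free.
    return free
```

With the window `add x10, x11, x12; addi x13, x10, 0; addi x11, x14, 0`
the sketch reports x10 and x13 as free at the start of the window: both
are overwritten before anything reads their old value, while x11 is read
by the first instruction and so is live.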

Static analysis shows that 51% of the instructions in the kernel
(default config) can be replaced, i.e. one free register can be found
at both the start and the end of the replaced instruction pair, and the
replaced instructions can be executed directly. We also ran an
efficiency test on gem5 RISC-V, which shows a more than 5x speedup over
the breakpoint-based implementation.

Contribution:
Chen Guokai invented the algorithm for searching for free registers,
evaluated the optimization ratio, and implemented the basic
functionality for RVI kernel binaries.
Liao Chang added support for hybrid RVI and RVC kernel binaries, fixed
some bugs seen with different kernel configurations, and refactored the
entire feature into individual patches.

v6:
1. Correct grammar and spelling errors in commit messages and comments.
2. Add instruction boundary check for RVI/RVC hybrid kernel.
3. Use addi/c.addi instead of 'nop/c.nop' in the detour assembly
   template.
4. Fix the instruction simulation of JALR.
5. Mark some symbols used in the kprobe and uprobe handler paths as
   NOKPROBE.
6. Add one selftest testcase that covers more complex opcode patterns in
   the instruction decoding and free register searching code.
7. Run all tests in tools/testing/selftests/ftrace on RISCV64 QEMU
   platform, no regression.
8. Run with the CONFIG_KPROBES_SANITY_TEST module on RISCV64 QEMU
   platform, no regression.

v5:
1. Correct known nits.
2. Enable the usage of unused caller-saved registers.
3. Append an efficiency test result on gem5.

v4:
Correct the order of the Signed-off-by and Co-developed-by tags.

v3:
1. Support hybrid RVI and RVC kernel binaries.
2. Refactor the entire feature into individual patches.

v2:
1. Adjust comments.
2. Remove improper copyright.
3. Clean up formatting that does not follow common practice.
4. Extract common definition of instruction decoder.
5. Fix race issue on SMP platforms.

v1:
Chen Guokai contributed the basic functionality code.

Chen Guokai (1):
  riscv/kprobe: Search free registers from unused caller-saved ones

Liao Chang (12):
  riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES
  riscv/kprobe: Allocate detour buffer from module region
  riscv/kprobe: Add skeleton for preparing optimized kprobe
  riscv/kprobe: Add common RVI and RVC instruction decoder code
  riscv/kprobe: Introduce free register(s) searching algorithm
  riscv/kprobe: Add code to check if kprobe can be optimized
  riscv/kprobe: Prepare detour buffer for optimized kprobe
  riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe
  riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel
  riscv/kprobe: Fix instruction simulation of JALR
  riscv/kprobe: Move exception related symbols to .kprobe_blacklist
  selftest/kprobes: Add testcase for kprobe SYM[+offs]

 arch/riscv/Kconfig                            |   1 +
 arch/riscv/include/asm/asm.h                  |  10 +
 arch/riscv/include/asm/bug.h                  |   5 +-
 arch/riscv/include/asm/kprobes.h              |  49 ++
 arch/riscv/include/asm/patch.h                |   1 +
 arch/riscv/kernel/entry.S                     |  12 +
 arch/riscv/kernel/mcount.S                    |   1 +
 arch/riscv/kernel/patch.c                     |  23 +-
 arch/riscv/kernel/probes/Makefile             |   1 +
 arch/riscv/kernel/probes/decode-insn.h        | 177 +++++
 arch/riscv/kernel/probes/kprobes.c            |  48 +-
 arch/riscv/kernel/probes/opt.c                | 684 ++++++++++++++++++
 arch/riscv/kernel/probes/opt_trampoline.S     | 137 ++++
 arch/riscv/kernel/probes/simulate-insn.c      |   6 +-
 arch/riscv/kernel/probes/simulate-insn.h      |  42 ++
 .../ftrace/test.d/kprobe/kprobe_sym_offs.tc   |  49 ++
 16 files changed, 1235 insertions(+), 11 deletions(-)
 create mode 100644 arch/riscv/kernel/probes/opt.c
 create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_sym_offs.tc

-- 
2.34.1


Thread overview: 54+ messages
2023-01-27 13:05 Chen Guokai [this message]
2023-01-27 13:05 ` [PATCH v6 01/13] riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES Chen Guokai
2023-01-27 13:05 ` [PATCH v6 02/13] riscv/kprobe: Allocate detour buffer from module region Chen Guokai
2023-01-27 13:05 ` [PATCH v6 03/13] riscv/kprobe: Add skeleton for preparing optimized kprobe Chen Guokai
2023-01-27 13:05 ` [PATCH v6 04/13] riscv/kprobe: Add common RVI and RVC instruction decoder code Chen Guokai
2023-02-01 13:29   ` Björn Töpel
2023-02-02 10:16   ` Conor Dooley
2023-01-27 13:05 ` [PATCH v6 05/13] riscv/kprobe: Introduce free register(s) searching algorithm Chen Guokai
2023-01-27 13:05 ` [PATCH v6 06/13] riscv/kprobe: Add code to check if kprobe can be optimized Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 07/13] riscv/kprobe: Prepare detour buffer for optimized kprobe Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 08/13] riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe Chen Guokai
2023-02-01 13:31   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 09/13] riscv/kprobe: Search free registers from unused caller-saved ones Chen Guokai
2023-02-01 13:31   ` Björn Töpel
2023-02-02  9:08   ` Conor Dooley
2023-01-27 13:05 ` [PATCH v6 10/13] riscv/kprobe: Add instruction boundary check for RVI/RVC hybrid kernel Chen Guokai
2023-01-27 13:05 ` [PATCH v6 11/13] riscv/kprobe: Fix instruction simulation of JALR Chen Guokai
2023-01-31 12:51   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 12/13] riscv/kprobe: Move exception related symbols to .kprobe_blacklist Chen Guokai
2023-02-01 13:30   ` Björn Töpel
2023-01-27 13:05 ` [PATCH v6 13/13] selftest/kprobes: Add testcase for kprobe SYM[+offs] Chen Guokai
2023-01-30 12:31 ` [PATCH v6 00/13] Add OPTPROBES feature on RISCV Björn Töpel
2023-01-30 14:38   ` Xim
2023-04-26 18:01     ` Palmer Dabbelt
2023-02-01 13:29 ` Björn Töpel
