llvm.lists.linux.dev archive mirror
* [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
@ 2022-12-19  5:54 Dan Li
  2022-12-19  5:54 ` [RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Dan Li @ 2022-12-19  5:54 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

This patch series adds support for control flow integrity (CFI)
protection of the Linux kernel [1], similar to -fsanitize=kcfi in
Clang 16.0 [2,3].

I hope this feature will also support user-mode CFI in the future
(at least for developers who can recompile the runtime), so I use
-fsanitize=cfi as the compilation option here.

Any suggestions are welcome :).

Thanks, Dan.

[1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolvanen@google.com/
[2] https://clang.llvm.org/docs/ControlFlowIntegrity.html
[3] https://reviews.llvm.org/D119296

Dan Li (3):
  [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to
    64 bits to support more features
  [PR102768] Support CFI: Add new pass for Control Flow Integrity
  [PR102768] aarch64: Add support for Control Flow Integrity

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

---
 gcc/Makefile.in                               |   1 +
 gcc/asan.h                                    |   4 +-
 gcc/c-family/c-attribs.cc                     |  10 +-
 gcc/c-family/c-common.h                       |   2 +-
 gcc/c/c-parser.cc                             |   4 +-
 gcc/cgraphunit.cc                             |  34 +++
 gcc/common.opt                                |   4 +-
 gcc/config/aarch64/aarch64.cc                 | 106 ++++++++
 gcc/cp/typeck.cc                              |   2 +-
 gcc/doc/invoke.texi                           |  35 +++
 gcc/doc/passes.texi                           |  10 +
 gcc/doc/tm.texi                               |  27 +++
 gcc/doc/tm.texi.in                            |   8 +
 gcc/dwarf2asm.cc                              |   2 +-
 gcc/flag-types.h                              |  67 ++---
 gcc/opt-suggestions.cc                        |   2 +-
 gcc/opts.cc                                   |  26 +-
 gcc/opts.h                                    |   8 +-
 gcc/output.h                                  |   3 +
 gcc/passes.def                                |   1 +
 gcc/target.def                                |  39 +++
 .../aarch64/control_flow_integrity_1.c        |  14 ++
 .../aarch64/control_flow_integrity_2.c        |  25 ++
 .../aarch64/control_flow_integrity_3.c        |  23 ++
 gcc/toplev.cc                                 |   4 +
 gcc/tree-cfg.cc                               |   2 +-
 gcc/tree-cfi.cc                               | 229 ++++++++++++++++++
 gcc/tree-pass.h                               |   1 +
 gcc/tree.cc                                   | 144 +++++++++++
 gcc/tree.h                                    |   1 +
 gcc/varasm.cc                                 |  29 +++
 31 files changed, 803 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c
 create mode 100644 gcc/tree-cfi.cc

-- 
2.17.1



* [RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
@ 2022-12-19  5:54 ` Dan Li
  2022-12-19  5:54 ` [RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity Dan Li
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2022-12-19  5:54 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

The 32-bit sanitize_code can no longer accommodate new options,
so extend it to 64 bits.
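
To illustrate why the wider type matters, here is a minimal standalone
sketch (not part of the patch). The DEMO_* names and helper functions are
hypothetical; only bit positions 31 and 32 and the ~0U / ~0ULL constants
mirror this series. It assumes unsigned int is 32 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for two sanitizer bits: bit 31 is the last one
   that fits in 32 bits, bit 32 is the first one that does not.  */
#define DEMO_FLAG_BIT31 (1ULL << 31)
#define DEMO_FLAG_BIT32 (1ULL << 32)

/* Old-style test: the mask is narrowed to unsigned int first
   (assumed to be 32 bits wide), so bit 32 is silently dropped.  */
static bool
demo_flag_set_narrow (uint64_t mask, uint64_t flag)
{
  unsigned int narrow = (unsigned int) mask;
  return (narrow & flag) != 0;
}

/* New-style test: the mask keeps all 64 bits.  */
static bool
demo_flag_set_wide (uint64_t mask, uint64_t flag)
{
  return (mask & flag) != 0;
}

/* The "all" sentinel has to be widened as well: ~0U converts to
   0x00000000ffffffff when it is compared with a 64-bit value.  */
static bool
demo_is_all_sentinel (uint64_t flag)
{
  return flag == ~0ULL;
}
```

With a narrowed mask, a flag at bit 32 can never be observed as set,
and a 64-bit flag field compared against ~0U would no longer match the
"all" entry. This is presumably why the series also rewrites those
comparisons.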

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	PR c/102768
	* asan.h (sanitize_flags_p): Promote to uint64_t.
	* common.opt: Likewise.
	* dwarf2asm.cc (dw2_output_indirect_constant_1): Likewise.
	* flag-types.h (enum sanitize_code): Likewise.
	* opt-suggestions.cc (option_proposer::build_option_suggestions):
	Likewise.
	* opts.cc (find_sanitizer_argument): Likewise.
	(report_conflicting_sanitizer_options): Likewise.
	(get_closest_sanitizer_option): Likewise.
	(parse_sanitizer_options): Likewise.
	(parse_no_sanitize_attribute): Likewise.
	* opts.h (parse_sanitizer_options): Likewise.
	(parse_no_sanitize_attribute): Likewise.
	* tree-cfg.cc (print_no_sanitize_attr_value): Likewise.

gcc/c-family/ChangeLog:

	* c-attribs.cc (add_no_sanitize_value): Likewise.
	(handle_no_sanitize_attribute): Likewise.
	* c-common.h (add_no_sanitize_value): Likewise.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_declaration_or_fndef): Likewise.

gcc/cp/ChangeLog:

	* typeck.cc (get_member_function_from_ptrfunc): Likewise.
---
 gcc/asan.h                |  4 +--
 gcc/c-family/c-attribs.cc | 10 +++---
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-parser.cc         |  4 +--
 gcc/common.opt            |  4 +--
 gcc/cp/typeck.cc          |  2 +-
 gcc/dwarf2asm.cc          |  2 +-
 gcc/flag-types.h          | 65 ++++++++++++++++++++-------------------
 gcc/opt-suggestions.cc    |  2 +-
 gcc/opts.cc               | 22 ++++++-------
 gcc/opts.h                |  8 ++---
 gcc/tree-cfg.cc           |  2 +-
 12 files changed, 64 insertions(+), 63 deletions(-)

diff --git a/gcc/asan.h b/gcc/asan.h
index d4ea49cb240..5b98172549b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -233,9 +233,9 @@ asan_protect_stack_decl (tree decl)
    remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES.  */
 
 static inline bool
-sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl)
+sanitize_flags_p (uint64_t flag, const_tree fn = current_function_decl)
 {
-  unsigned int result_flags = flag_sanitize & flag;
+  uint64_t result_flags = flag_sanitize & flag;
   if (result_flags == 0)
     return false;
 
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 111a33f405a..a73e2364525 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -1118,23 +1118,23 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 /* Add FLAGS for a function NODE to no_sanitize_flags in DECL_ATTRIBUTES.  */
 
 void
-add_no_sanitize_value (tree node, unsigned int flags)
+add_no_sanitize_value (tree node, uint64_t flags)
 {
   tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node));
   if (attr)
     {
-      unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr));
+      uint64_t old_value = tree_to_uhwi (TREE_VALUE (attr));
       flags |= old_value;
 
       if (flags == old_value)
 	return;
 
-      TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags);
+      TREE_VALUE (attr) = build_int_cst (long_long_unsigned_type_node, flags);
     }
   else
     DECL_ATTRIBUTES (node)
       = tree_cons (get_identifier ("no_sanitize"),
-		   build_int_cst (unsigned_type_node, flags),
+		   build_int_cst (long_long_unsigned_type_node, flags),
 		   DECL_ATTRIBUTES (node));
 }
 
@@ -1145,7 +1145,7 @@ static tree
 handle_no_sanitize_attribute (tree *node, tree name, tree args, int,
 			      bool *no_add_attrs)
 {
-  unsigned int flags = 0;
+  uint64_t flags = 0;
   *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
     {
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 52a85bfb783..eb91b9703db 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1500,7 +1500,7 @@ extern enum flt_eval_method
 excess_precision_mode_join (enum flt_eval_method, enum flt_eval_method);
 
 extern int c_flt_eval_method (bool ts18661_p);
-extern void add_no_sanitize_value (tree node, unsigned int flags);
+extern void add_no_sanitize_value (tree node, uint64_t flags);
 
 extern void maybe_add_include_fixit (rich_location *, const char *, bool);
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f679d53706a..9d55ea55fa6 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -2217,7 +2217,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 		  start_init (NULL_TREE, asm_name, global_bindings_p (), &richloc);
 		  /* A parameter is initialized, which is invalid.  Don't
 		     attempt to instrument the initializer.  */
-		  int flag_sanitize_save = flag_sanitize;
+		  uint64_t flag_sanitize_save = flag_sanitize;
 		  if (nested && !empty_ok)
 		    flag_sanitize = 0;
 		  init = c_parser_expr_no_commas (parser, NULL);
@@ -2275,7 +2275,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 		  start_init (d, asm_name, global_bindings_p (), &richloc);
 		  /* A parameter is initialized, which is invalid.  Don't
 		     attempt to instrument the initializer.  */
-		  int flag_sanitize_save = flag_sanitize;
+		  uint64_t flag_sanitize_save = flag_sanitize;
 		  if (TREE_CODE (d) == PARM_DECL)
 		    flag_sanitize = 0;
 		  init = c_parser_initializer (parser);
diff --git a/gcc/common.opt b/gcc/common.opt
index 8a0dafc522d..9613c2f8ba0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -217,11 +217,11 @@ bool flag_opts_finished
 
 ; What the sanitizer should instrument
 Variable
-unsigned int flag_sanitize
+uint64_t flag_sanitize
 
 ; What sanitizers should recover from errors
 Variable
-unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
+uint64_t flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
 ; Flag whether a prefix has been added to dump_base_name
 Variable
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index ceb80d9744f..0afaf58d87d 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -4023,7 +4023,7 @@ get_member_function_from_ptrfunc (tree *instance_ptrptr, tree function,
       idx = build1 (NOP_EXPR, vtable_index_type, e3);
       switch (TARGET_PTRMEMFUNC_VBIT_LOCATION)
 	{
-	  int flag_sanitize_save;
+	  uint64_t flag_sanitize_save;
 	case ptrmemfunc_vbit_in_pfn:
 	  e1 = cp_build_binary_op (input_location,
 				   BIT_AND_EXPR, idx, integer_one_node,
diff --git a/gcc/dwarf2asm.cc b/gcc/dwarf2asm.cc
index 274f574f25e..b54d1935d57 100644
--- a/gcc/dwarf2asm.cc
+++ b/gcc/dwarf2asm.cc
@@ -1026,7 +1026,7 @@ dw2_output_indirect_constant_1 (const char *sym, tree id)
   sym_ref = gen_rtx_SYMBOL_REF (Pmode, sym);
   /* Disable ASan for decl because redzones cause ABI breakage between GCC and
      libstdc++ for `.LDFCM*' variables.  See PR 78651 for details.  */
-  unsigned int save_flag_sanitize = flag_sanitize;
+  uint64_t save_flag_sanitize = flag_sanitize;
   flag_sanitize &= ~(SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS
 		     | SANITIZE_KERNEL_ADDRESS);
   /* And also temporarily disable -fsection-anchors.  These indirect constants
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 2c8498169e0..0aa51e282fb 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -287,42 +287,43 @@ enum auto_init_type {
 /* Different instrumentation modes.  */
 enum sanitize_code {
   /* AddressSanitizer.  */
-  SANITIZE_ADDRESS = 1UL << 0,
-  SANITIZE_USER_ADDRESS = 1UL << 1,
-  SANITIZE_KERNEL_ADDRESS = 1UL << 2,
+  SANITIZE_ADDRESS = 1ULL << 0,
+  SANITIZE_USER_ADDRESS = 1ULL << 1,
+  SANITIZE_KERNEL_ADDRESS = 1ULL << 2,
   /* ThreadSanitizer.  */
-  SANITIZE_THREAD = 1UL << 3,
+  SANITIZE_THREAD = 1ULL << 3,
   /* LeakSanitizer.  */
-  SANITIZE_LEAK = 1UL << 4,
+  SANITIZE_LEAK = 1ULL << 4,
   /* UndefinedBehaviorSanitizer.  */
-  SANITIZE_SHIFT_BASE = 1UL << 5,
-  SANITIZE_SHIFT_EXPONENT = 1UL << 6,
-  SANITIZE_DIVIDE = 1UL << 7,
-  SANITIZE_UNREACHABLE = 1UL << 8,
-  SANITIZE_VLA = 1UL << 9,
-  SANITIZE_NULL = 1UL << 10,
-  SANITIZE_RETURN = 1UL << 11,
-  SANITIZE_SI_OVERFLOW = 1UL << 12,
-  SANITIZE_BOOL = 1UL << 13,
-  SANITIZE_ENUM = 1UL << 14,
-  SANITIZE_FLOAT_DIVIDE = 1UL << 15,
-  SANITIZE_FLOAT_CAST = 1UL << 16,
-  SANITIZE_BOUNDS = 1UL << 17,
-  SANITIZE_ALIGNMENT = 1UL << 18,
-  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 19,
-  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 20,
-  SANITIZE_OBJECT_SIZE = 1UL << 21,
-  SANITIZE_VPTR = 1UL << 22,
-  SANITIZE_BOUNDS_STRICT = 1UL << 23,
-  SANITIZE_POINTER_OVERFLOW = 1UL << 24,
-  SANITIZE_BUILTIN = 1UL << 25,
-  SANITIZE_POINTER_COMPARE = 1UL << 26,
-  SANITIZE_POINTER_SUBTRACT = 1UL << 27,
-  SANITIZE_HWADDRESS = 1UL << 28,
-  SANITIZE_USER_HWADDRESS = 1UL << 29,
-  SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
+  SANITIZE_SHIFT_BASE = 1ULL << 5,
+  SANITIZE_SHIFT_EXPONENT = 1ULL << 6,
+  SANITIZE_DIVIDE = 1ULL << 7,
+  SANITIZE_UNREACHABLE = 1ULL << 8,
+  SANITIZE_VLA = 1ULL << 9,
+  SANITIZE_NULL = 1ULL << 10,
+  SANITIZE_RETURN = 1ULL << 11,
+  SANITIZE_SI_OVERFLOW = 1ULL << 12,
+  SANITIZE_BOOL = 1ULL << 13,
+  SANITIZE_ENUM = 1ULL << 14,
+  SANITIZE_FLOAT_DIVIDE = 1ULL << 15,
+  SANITIZE_FLOAT_CAST = 1ULL << 16,
+  SANITIZE_BOUNDS = 1ULL << 17,
+  SANITIZE_ALIGNMENT = 1ULL << 18,
+  SANITIZE_NONNULL_ATTRIBUTE = 1ULL << 19,
+  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1ULL << 20,
+  SANITIZE_OBJECT_SIZE = 1ULL << 21,
+  SANITIZE_VPTR = 1ULL << 22,
+  SANITIZE_BOUNDS_STRICT = 1ULL << 23,
+  SANITIZE_POINTER_OVERFLOW = 1ULL << 24,
+  SANITIZE_BUILTIN = 1ULL << 25,
+  SANITIZE_POINTER_COMPARE = 1ULL << 26,
+  SANITIZE_POINTER_SUBTRACT = 1ULL << 27,
+  SANITIZE_HWADDRESS = 1ULL << 28,
+  SANITIZE_USER_HWADDRESS = 1ULL << 29,
+  SANITIZE_KERNEL_HWADDRESS = 1ULL << 30,
   /* Shadow Call Stack.  */
-  SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
+  SANITIZE_SHADOW_CALL_STACK = 1ULL << 31,
+  SANITIZE_MAX = 1ULL << 63,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
 		       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
diff --git a/gcc/opt-suggestions.cc b/gcc/opt-suggestions.cc
index 33f298560a1..c667e23e66f 100644
--- a/gcc/opt-suggestions.cc
+++ b/gcc/opt-suggestions.cc
@@ -173,7 +173,7 @@ option_proposer::build_option_suggestions (const char *prefix)
 		/* -fsanitize=all is not valid, only -fno-sanitize=all.
 		   So don't register the positive misspelling candidates
 		   for it.  */
-		if (sanitizer_opts[j].flag == ~0U && i == OPT_fsanitize_)
+		if (sanitizer_opts[j].flag == ~0ULL && i == OPT_fsanitize_)
 		  {
 		    optb = *option;
 		    optb.opt_text = opt_text = "-fno-sanitize=";
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 3a89da2dd03..11c5d70458f 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -966,7 +966,7 @@ vec<const char *> help_option_arguments;
 /* Return the string name describing a sanitizer argument which has been
    provided on the command line and has set this particular flag.  */
 const char *
-find_sanitizer_argument (struct gcc_options *opts, unsigned int flags)
+find_sanitizer_argument (struct gcc_options *opts, uint64_t flags)
 {
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     {
@@ -1000,10 +1000,10 @@ find_sanitizer_argument (struct gcc_options *opts, unsigned int flags)
    set these flags.  */
 static void
 report_conflicting_sanitizer_options (struct gcc_options *opts, location_t loc,
-				      unsigned int left, unsigned int right)
+				      uint64_t left, uint64_t right)
 {
-  unsigned int left_seen = (opts->x_flag_sanitize & left);
-  unsigned int right_seen = (opts->x_flag_sanitize & right);
+  uint64_t left_seen = (opts->x_flag_sanitize & left);
+  uint64_t right_seen = (opts->x_flag_sanitize & right);
   if (left_seen && right_seen)
     {
       const char* left_arg = find_sanitizer_argument (opts, left_seen);
@@ -2059,7 +2059,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true),
   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true),
   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false),
-  SANITIZER_OPT (all, ~0U, true),
+  SANITIZER_OPT (all, ~0ULL, true),
 #undef SANITIZER_OPT
   { NULL, 0U, 0UL, false }
 };
@@ -2128,7 +2128,7 @@ get_closest_sanitizer_option (const string_fragment &arg,
     {
       /* -fsanitize=all is not valid, so don't offer it.  */
       if (code == OPT_fsanitize_
-	  && opts[i].flag == ~0U
+	  && opts[i].flag == ~0ULL
 	  && value)
 	continue;
 
@@ -2148,9 +2148,9 @@ get_closest_sanitizer_option (const string_fragment &arg,
    adjust previous FLAGS and return new ones.  If COMPLAIN is false,
    don't issue diagnostics.  */
 
-unsigned int
+uint64_t
 parse_sanitizer_options (const char *p, location_t loc, int scode,
-			 unsigned int flags, int value, bool complain)
+			 uint64_t flags, int value, bool complain)
 {
   enum opt_code code = (enum opt_code) scode;
 
@@ -2176,7 +2176,7 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 	    && memcmp (p, sanitizer_opts[i].name, len) == 0)
 	  {
 	    /* Handle both -fsanitize and -fno-sanitize cases.  */
-	    if (value && sanitizer_opts[i].flag == ~0U)
+	    if (value && sanitizer_opts[i].flag == ~0ULL)
 	      {
 		if (code == OPT_fsanitize_)
 		  {
@@ -2241,10 +2241,10 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 /* Parse string values of no_sanitize attribute passed in VALUE.
    Values are separated with comma.  */
 
-unsigned int
+uint64_t
 parse_no_sanitize_attribute (char *value)
 {
-  unsigned int flags = 0;
+  uint64_t flags = 0;
   unsigned int i;
   char *q = strtok (value, ",");
 
diff --git a/gcc/opts.h b/gcc/opts.h
index a43ce66cffe..17a02cc7c14 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -425,10 +425,10 @@ extern void control_warning_option (unsigned int opt_index, int kind,
 extern char *write_langs (unsigned int mask);
 extern void print_ignored_options (void);
 extern void handle_common_deferred_options (void);
-unsigned int parse_sanitizer_options (const char *, location_t, int,
-				      unsigned int, int, bool);
+uint64_t parse_sanitizer_options (const char *, location_t, int,
+				      uint64_t, int, bool);
 
-unsigned int parse_no_sanitize_attribute (char *value);
+uint64_t parse_no_sanitize_attribute (char *value);
 extern bool common_handle_option (struct gcc_options *opts,
 				  struct gcc_options *opts_set,
 				  const struct cl_decoded_option *decoded,
@@ -470,7 +470,7 @@ extern bool opt_enum_arg_to_value (size_t opt_index, const char *arg,
 extern const struct sanitizer_opts_s
 {
   const char *const name;
-  unsigned int flag;
+  uint64_t flag;
   size_t len;
   bool can_recover;
 } sanitizer_opts[];
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index e321d929fd0..8cc31db9bea 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -8018,7 +8018,7 @@ dump_default_def (FILE *file, tree def, int spc, dump_flags_t flags)
 static void
 print_no_sanitize_attr_value (FILE *file, tree value)
 {
-  unsigned int flags = tree_to_uhwi (value);
+  uint64_t flags = tree_to_uhwi (value);
   bool first = true;
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     {
-- 
2.17.1



* [RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
  2022-12-19  5:54 ` [RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
@ 2022-12-19  5:54 ` Dan Li
  2022-12-19  5:54 ` [RFC/RFT 3/3] [PR102768] aarch64: Add support " Dan Li
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2022-12-19  5:54 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

The CFI sanitizer, enabled with -fsanitize=cfi, implements a
forward-edge control flow integrity scheme for indirect calls,
roughly similar to -fsanitize=kcfi [1] in LLVM.

At compile time, it places a uniform type identifier before the
first instruction of each function and inserts check code before
each indirect call in a function with protection enabled.

At runtime, the check code runs before each indirect call and
does the following:
1. Dynamically obtain the typeid stored before the callee function.
2. Compare it to the expected typeid of the current call site (caller).
3. If the two match, proceed with the indirect call; if not,
call the user-defined callback function cfi_check_failed.
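
The three steps above can be modelled in portable C roughly as follows.
This is only a sketch of the logic, not the code the pass emits: all
demo_* names are hypothetical, the typeid is kept in a struct field
rather than in memory in front of the function, and skipping the call on
a mismatch is an assumption (the description only says the callback is
called).

```c
#include <stdint.h>

typedef int (*demo_binop_fn) (int, int);

/* Model of a function together with the typeid stored in front of it.
   In the real scheme the typeid lives in memory before the function's
   first instruction; a struct is used here to stay portable.  */
struct demo_callee
{
  uint32_t type_id;
  demo_binop_fn fn;
};

static int demo_failure_count;

/* Stand-in for the user-defined callback cfi_check_failed.  */
static void
demo_cfi_check_failed (void)
{
  demo_failure_count++;
}

static int
demo_add (int a, int b)
{
  return a + b;
}

/* Steps 1-3: fetch the callee typeid, compare, then call or report.
   Assumption: on mismatch the call is skipped and 0 is returned.  */
static int
demo_checked_call (const struct demo_callee *callee, uint32_t expected_id,
                   int a, int b)
{
  uint32_t actual_id = callee->type_id;   /* Step 1.  */
  if (actual_id != expected_id)           /* Step 2.  */
    {
      demo_cfi_check_failed ();           /* Step 3, mismatch.  */
      return 0;
    }
  return callee->fn (a, b);               /* Step 3, match.  */
}
```

A call such as demo_checked_call (&callee, expected, 2, 3) reaches
demo_add only when callee.type_id equals expected; otherwise only the
failure counter changes.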

A typeid (type identifier) is a 32-bit constant on all platforms,
whose value depends on the function's prototype, and is invariant
across compilation units. However, different platforms may ignore
some of the bits to avoid conflicts with instructions.
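
As a rough illustration of "a 32-bit value derived from the prototype,
with some bits ignored per platform", the sketch below hashes a textual
prototype with 32-bit FNV-1a and clears a caller-supplied set of bits.
Both choices are assumptions made for the example: the patch computes
its hash from GCC's tree type representation (unified_type_hash in
tree.cc), not from a string, and the ignored_bits mask and demo_* names
are hypothetical.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a prototype string.  This is a swapped-in hash
   used for illustration only, not the hash used by the patch.  */
static uint32_t
demo_hash_prototype (const char *proto)
{
  uint32_t hash = 2166136261u;          /* FNV offset basis.  */
  for (; *proto != '\0'; proto++)
    {
      hash ^= (unsigned char) *proto;
      hash *= 16777619u;                /* FNV prime.  */
    }
  return hash;
}

/* Clear the bits that a (hypothetical) platform chooses to ignore.  */
static uint32_t
demo_typeid (const char *proto, uint32_t ignored_bits)
{
  return demo_hash_prototype (proto) & ~ignored_bits;
}
```

Because the result depends only on the prototype text and the mask, two
compilation units that see the same prototype get the same value, which
is the invariance property described above.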

If a program contains indirect calls to assembly functions, they
must be manually annotated with the expected type identifiers to
prevent errors. To make this easier, gcc generates a weak SHN_ABS
__cfi_typeid_<function> symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one translation unit linked into the program
takes the function address.
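
For reference, the directives emitted for such a symbol can be
reproduced with a small standalone helper. The format string follows
the fprintf calls in output_decl_cfi_typeid_symbol in this patch; the
demo_format_typeid_symbol name and its buffer interface are
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the two directives for one function into BUF and return the
   value of snprintf.  The layout mirrors the fprintf calls in this
   patch; the function name and buffer interface are hypothetical.  */
static int
demo_format_typeid_symbol (char *buf, size_t size, const char *name,
                           unsigned int hash)
{
  return snprintf (buf, size,
                   ".weak __cfi_typeid_%s\n"
                   ".set __cfi_typeid_%s, %#010x\n",
                   name, name, hash);
}
```

For name "A" and hash 0xADA, the buffer holds a ".weak __cfi_typeid_A"
line followed by ".set __cfi_typeid_A, 0x00000ada". The %#010x
conversion pads to ten characters including the 0x prefix and prints
lowercase hex digits.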

Note that the location where the typeid is inserted (its offset
from the function header) may differ between platforms, as in [1];
this patch implements only the platform-independent part.

[1]: https://reviews.llvm.org/D119296

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	PR c/102768
	* Makefile.in: Add tree-cfi.o.
	* cgraphunit.cc (output_decl_cfi_typeid_symbol): Output the
	CFI typeid corresponding to each external declaration when necessary.
	(output_decl_cfi_typeid_symbols): Likewise.
	* doc/passes.texi: Document it.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in: New hooks.
	* flag-types.h (enum sanitize_code):
	Add SANITIZE_CONTROL_FLOW_INTEGRITY.
	* opts.cc (parse_sanitizer_options): Add cfi and exclude
	SANITIZE_CONTROL_FLOW_INTEGRITY.
	* output.h (default_output_func_cfi_typeid): Declare.
	(default_calc_func_cfi_typeid): Declare.
	(default_gimple_get_func_cfi_typeid): Declare.
	* passes.def: Add pass_cfi.
	* target.def: Add new hooks.
	* toplev.cc (process_options): Add CFI compile option check.
	* tree-pass.h (make_pass_cfi): Declare.
	* tree.cc (tree_node_sizes[): Add the unified tree type hash
	calculation functions.
	(append_unified_type_hash): Likewise.
	(initialize_unified_tree_type_hash_table): Likewise.
	(append_unified_type_name_hash): Likewise.
	(append_unified_type_precision_hash): Likewise.
	(append_unified_function_ret_and_args_hash): Likewise.
	(unified_type_hash): Likewise.
	(init_ttree): Likewise.
	* tree.h (unified_type_hash): Declare.
	* varasm.cc (assemble_start_function): Output the CFI typeid
	of each function.
	(default_output_func_cfi_typeid): New.
	(default_gimple_get_func_cfi_typeid): New.
	(default_calc_func_cfi_typeid): New.
	* tree-cfi.cc: New file.
---
 gcc/Makefile.in     |   1 +
 gcc/cgraphunit.cc   |  34 +++++++
 gcc/doc/passes.texi |  10 ++
 gcc/doc/tm.texi     |  27 ++++++
 gcc/doc/tm.texi.in  |   8 ++
 gcc/flag-types.h    |   2 +
 gcc/opts.cc         |   4 +-
 gcc/output.h        |   3 +
 gcc/passes.def      |   1 +
 gcc/target.def      |  39 ++++++++
 gcc/toplev.cc       |   4 +
 gcc/tree-cfi.cc     | 229 ++++++++++++++++++++++++++++++++++++++++++++
 gcc/tree-pass.h     |   1 +
 gcc/tree.cc         | 144 ++++++++++++++++++++++++++++
 gcc/tree.h          |   1 +
 gcc/varasm.cc       |  29 ++++++
 16 files changed, 536 insertions(+), 1 deletion(-)
 create mode 100644 gcc/tree-cfi.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 31ff95500c9..0d23bad6b63 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1610,6 +1610,7 @@ OBJS = \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
+	tree-cfi.o \
 	tree-chrec.o \
 	tree-complex.o \
 	tree-data-ref.o \
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 76d541755b8..fb4999559ae 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -2222,6 +2222,37 @@ ipa_passes (void)
   bitmap_obstack_release (NULL);
 }
 
+/* Output a weak symbol value of a decl's typeid (hash) to the
+   assembly file, like:
+	.weak __cfi_typeid_A
+	.set __cfi_typeid_A, 0x00000ada
+   typeid is platform-dependent, because the bits in typeid that conflict
+   with the instruction set of the current platform need to be ignored.  */
+
+static void
+output_decl_cfi_typeid_symbol (FILE *stream, tree fndecl)
+{
+  unsigned int hash = targetm.calc_func_cfi_typeid (TREE_TYPE (fndecl));
+  const char *name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
+
+  fprintf (stream, ".weak __cfi_typeid_%s\n", name);
+  fprintf (stream, ".set __cfi_typeid_%s, %#010x\n", name, hash);
+}
+
+/* Calculate and output the symbols corresponding to the typeid of all
+   external declarations whose address is taken within the current
+   compilation unit.  If such a function is defined in assembly code,
+   its typeid can be obtained according to this symbol.  */
+
+static void
+output_decl_cfi_typeid_symbols (void)
+{
+  struct cgraph_node *node;
+
+  FOR_EACH_FUNCTION (node)
+    if (!node->definition && node->address_taken)
+      output_decl_cfi_typeid_symbol (asm_out_file, node->decl);
+}
 
 /* Weakrefs may be associated to external decls and thus not output
    at expansion time.  Emit all necessary aliases.  */
@@ -2339,6 +2370,9 @@ symbol_table::compile (void)
       }
 #endif
 
+  if (flag_sanitize & SANITIZE_CONTROL_FLOW_INTEGRITY)
+    output_decl_cfi_typeid_symbols ();
+
   state = EXPANSION;
 
   /* Output first asm statements and anything ordered. The process
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index 1e821d4e513..7d36f196c5b 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -650,6 +650,16 @@ divisions to multiplications by the reciprocal.  The pass is located
 in @file{tree-ssa-math-opts.cc} and is described by
 @code{pass_cse_reciprocal}.
 
+@item Control Flow Integrity
+
+This pass enables support for the Control Flow Integrity sanitizer.
+The CFI sanitizer, enabled with @option{-fsanitize=cfi}, implements
+a forward-edge control flow integrity scheme for indirect calls.
+It attaches a uniform type identifier to each function that is
+invariant across compilation units and inserts checking code
+before indirect calls.  The pass is located in @file{tree-cfi.cc}
+and is described by @code{pass_cfi}.
+
 @item Full redundancy elimination
 
 This is a simpler form of PRE that only eliminates redundancies that
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c5006afc00d..1b603da309e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1003,6 +1003,21 @@ Return a value, with the same meaning as the C99 macro
 @code{FLT_EVAL_METHOD} that describes which excess precision should be
 applied.
 
+@deftypefn {Target Hook} tree TARGET_GIMPLE_GET_FUNC_CFI_TYPEID (gimple_seq *@var{stmts}, location_t @var{loc}, tree @var{fptr})
+This target hook is used to generate gimple instructions to get
+the typeid in front of the function pointed to by fptr.
+For different platforms, the location of typeid may be different,
+so a platform-dependent function is required.
+@end deftypefn
+
+@deftypefn {Target Hook} {unsigned int} TARGET_CALC_FUNC_CFI_TYPEID (const_tree @var{fntype})
+This target hook is used to calculate a platform-dependent typeid
+of a function.
+Although the length of typeid is always 4 bytes on all platforms, different
+platforms may ignore some bits to avoid encoding conflicts with its
+instruction set, so a platform-dependent function is required.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return})
 Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or
 function return values.  The target hook should return the new mode
@@ -8721,6 +8736,13 @@ global; that is, available for reference from other files.
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID (FILE *@var{stream}, tree @var{decl})
+This target hook is used to output a function's typeid before
+its assembly code.
+For different platforms, the output format of typeid may be different,
+so a platform-dependent function is required.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_ASSEMBLE_UNDEFINED_DECL (FILE *@var{stream}, const char *@var{name}, const_tree @var{decl})
 This target hook is a function to output to the stdio stream
 @var{stream} some commands that will declare the name associated with
@@ -12608,3 +12630,8 @@ type.
 This value is true if the target platform supports
 @option{-fsanitize=shadow-call-stack}.  The default value is false.
 @end deftypevr
+
+@deftypevr {Target Hook} bool TARGET_HAVE_CFI
+This value is true if the target platform supports
+@option{-fsanitize=cfi}.  The default value is false.
+@end deftypevr
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f869ddd5e5b..7b7f2fa0be5 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -933,6 +933,10 @@ Return a value, with the same meaning as the C99 macro
 @code{FLT_EVAL_METHOD} that describes which excess precision should be
 applied.
 
+@hook TARGET_GIMPLE_GET_FUNC_CFI_TYPEID
+
+@hook TARGET_CALC_FUNC_CFI_TYPEID
+
 @hook TARGET_PROMOTE_FUNCTION_MODE
 
 @defmac PARM_BOUNDARY
@@ -5568,6 +5572,8 @@ You may wish to use @code{ASM_OUTPUT_SIZE_DIRECTIVE} and/or
 
 @hook TARGET_ASM_GLOBALIZE_DECL_NAME
 
+@hook TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID
+
 @hook TARGET_ASM_ASSEMBLE_UNDEFINED_DECL
 
 @defmac ASM_WEAKEN_LABEL (@var{stream}, @var{name})
@@ -8183,3 +8189,5 @@ maintainer is familiar with.
 @hook TARGET_GCOV_TYPE_SIZE
 
 @hook TARGET_HAVE_SHADOW_CALL_STACK
+
+@hook TARGET_HAVE_CFI
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 0aa51e282fb..453549a46f3 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -323,6 +323,8 @@ enum sanitize_code {
   SANITIZE_KERNEL_HWADDRESS = 1ULL << 30,
   /* Shadow Call Stack.  */
   SANITIZE_SHADOW_CALL_STACK = 1ULL << 31,
+  /* Control Flow Integrity.  */
+  SANITIZE_CONTROL_FLOW_INTEGRITY = 1ULL << 32,
   SANITIZE_MAX = 1ULL << 63,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 11c5d70458f..9abe21e5029 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2059,6 +2059,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true),
   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true),
   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false),
+  SANITIZER_OPT (cfi, SANITIZE_CONTROL_FLOW_INTEGRITY, false),
   SANITIZER_OPT (all, ~0ULL, true),
 #undef SANITIZER_OPT
   { NULL, 0U, 0UL, false }
@@ -2186,7 +2187,8 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 		else
 		  flags |= ~(SANITIZE_THREAD | SANITIZE_LEAK
 			     | SANITIZE_UNREACHABLE | SANITIZE_RETURN
-			     | SANITIZE_SHADOW_CALL_STACK);
+			     | SANITIZE_SHADOW_CALL_STACK
+			     | SANITIZE_CONTROL_FLOW_INTEGRITY);
 	      }
 	    else if (value)
 	      {
diff --git a/gcc/output.h b/gcc/output.h
index 6dea630913a..5d17198eb2c 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -606,6 +606,9 @@ extern bool default_binds_local_p_2 (const_tree);
 extern bool default_binds_local_p_3 (const_tree, bool, bool, bool, bool);
 extern void default_globalize_label (FILE *, const char *);
 extern void default_globalize_decl_name (FILE *, tree);
+extern void default_output_func_cfi_typeid (FILE *, tree);
+extern unsigned int default_calc_func_cfi_typeid (const_tree);
+extern tree default_gimple_get_func_cfi_typeid (gimple_seq *, location_t, tree);
 extern void default_emit_unwind_label (FILE *, tree, int, int);
 extern void default_emit_except_table_label (FILE *);
 extern void default_generate_internal_label (char *, const char *,
diff --git a/gcc/passes.def b/gcc/passes.def
index 375d3d62d51..d8b7fd8d6e7 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -191,6 +191,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_omp_device_lower);
   NEXT_PASS (pass_omp_target_link);
   NEXT_PASS (pass_adjust_alignment);
+  NEXT_PASS (pass_cfi);
   NEXT_PASS (pass_all_optimizations);
   PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)
       NEXT_PASS (pass_remove_cgraph_callee_edges);
diff --git a/gcc/target.def b/gcc/target.def
index d85adf36a39..858df4b89a6 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -136,6 +136,16 @@ global; that is, available for reference from other files.\n\
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.",
  void, (FILE *stream, tree decl), default_globalize_decl_name)
 
+/* Output the uniform type identifier in front of a function
+   when cfi is enabled.  */
+DEFHOOK
+(output_func_cfi_typeid,
+ "This target hook is used to output a function's typeid before\n\
+its assembly code.\n\
+For different platforms, the output format of typeid may be different,\n\
+so a platform-dependent function is required.",
+ void, (FILE *stream, tree decl), default_output_func_cfi_typeid)
+
 /* Output code that will declare an external variable.  */
 DEFHOOK
 (assemble_undefined_decl,
@@ -4522,6 +4532,27 @@ by a subtarget.",
  unsigned HOST_WIDE_INT, (void),
  NULL)
 
+/* Generate gimple instructions to get the typeid in front of the
+   function pointed to by fptr.  */
+DEFHOOK
+(gimple_get_func_cfi_typeid,
+ "This target hook is used to generate gimple instructions to get\n\
+the typeid in front of the function pointed to by fptr.\n\
+For different platforms, the location of typeid may be different,\n\
+so a platform-dependent function is required.",
+ tree, (gimple_seq *stmts, location_t loc, tree fptr),
+ default_gimple_get_func_cfi_typeid)
+
+/* Calculate the typeid of a function's type.  */
+DEFHOOK
+(calc_func_cfi_typeid,
+ "This target hook is used to calculate a platform-dependent typeid\n\
+of a function.\n\
+Although the length of typeid is always 4 bytes on all platforms, different\n\
+platforms may ignore some bits to avoid encoding conflicts with their\n\
+instruction sets, so a platform-dependent function is required.",
+ unsigned int, (const_tree fntype), default_calc_func_cfi_typeid)
+
 /* Functions relating to calls - argument passing, returns, etc.  */
 /* Members of struct call have no special macro prefix.  */
 HOOK_VECTOR (TARGET_CALLS, calls)
@@ -7111,6 +7142,14 @@ DEFHOOKPOD
 @option{-fsanitize=shadow-call-stack}.  The default value is false.",
  bool, false)
 
+/* This value represents whether the control flow integrity is implemented
+   on the target platform.  */
+DEFHOOKPOD
+(have_cfi,
+ "This value is true if the target platform supports\n\
+@option{-fsanitize=cfi}.  The default value is false.",
+ bool, false)
+
 /* Close the 'struct gcc_target' definition.  */
 HOOK_VECTOR_END (C90_EMPTY_HACK)
 
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index 055e0642f77..c4c47d03f73 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -1665,6 +1665,10 @@ process_options (bool no_backend)
 		  "requires %<-fno-exceptions%>");
     }
 
+  if (flag_sanitize & SANITIZE_CONTROL_FLOW_INTEGRITY)
+    if (!targetm.have_cfi)
+      sorry ("%<-fsanitize=cfi%> not supported in current platform");
+
   HOST_WIDE_INT patch_area_size, patch_area_start;
   parse_and_check_patch_area (flag_patchable_function_entry, false,
 			      &patch_area_size, &patch_area_start);
diff --git a/gcc/tree-cfi.cc b/gcc/tree-cfi.cc
new file mode 100644
index 00000000000..c852a961ccf
--- /dev/null
+++ b/gcc/tree-cfi.cc
@@ -0,0 +1,229 @@
+/* The pass of Control Flow Integrity.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "gimple-pretty-print.h"
+#include "gimple-iterator.h"
+#include "cfgloop.h"
+#include "cfghooks.h"
+#include "attribs.h"
+#include "asan.h"
+#include "diagnostic-core.h"
+#include "print-tree.h"
+
+/* When typeid matching fails, the inserted check calls the cfi_check_failed
+   function to report the failure.  This function must be defined by the user.
+   The prototype of this function is:
+     void cfi_check_failed (unsigned int caller_hash,
+			    unsigned int callee_hash,
+			    void *callee_addr);  */
+#define CFI_CALLBACK_FUNC_NAME "cfi_check_failed"
+
+static tree cfi_callback;
+
+static tree
+build_cfi_callback_decl (void)
+{
+  tree ftype = build_function_type_list (void_type_node, integer_type_node,
+					 integer_type_node, ptr_type_node,
+					 NULL_TREE);
+  tree decl = build_fn_decl (CFI_CALLBACK_FUNC_NAME, ftype);
+
+  return decl;
+}
+
+/* Returns a tree node representing the typeid calculated from fntype.  */
+static tree
+gen_func_type_hash_tree (gimple *stmt, tree fntype)
+{
+  unsigned int hash = targetm.calc_func_cfi_typeid (fntype);
+
+  return build_int_cst_type (integer_type_node, hash);
+}
+
+static void
+insert_gcall_cfg_check (gimple_stmt_iterator *gsi)
+{
+  gimple *call_stmt, *new_stmt;
+  gimple_seq stmts = NULL;
+  location_t loc;
+  tree fptr, fntype, callee_hash, caller_expected_hash;
+  basic_block cond_bb, true_bb, false_bb;
+  edge e;
+
+  call_stmt = gsi_stmt (*gsi);
+  loc = gimple_location (call_stmt);
+
+  fptr = gimple_call_fn (call_stmt);
+  fntype = TREE_TYPE (TREE_TYPE (fptr));
+
+  gcc_assert (TREE_CODE (fptr) == SSA_NAME);
+  gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE);
+
+  /* Get the caller's typeid tree node.  */
+  caller_expected_hash = gen_func_type_hash_tree (call_stmt, fntype);
+
+  /* Get the tree node representing the callee's typeid.  */
+  callee_hash = targetm.gimple_get_func_cfi_typeid (&stmts, loc, fptr);
+  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+
+  /* Insert insns to check whether the typeid matches,
+     and jump to the callback function if it fails.  */
+  new_stmt = gimple_build_cond (NE_EXPR, callee_hash,
+				caller_expected_hash, NULL_TREE, NULL_TREE);
+  gimple_set_location (new_stmt, loc);
+
+  gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT);
+  cond_bb = gimple_bb (gsi_stmt (*gsi));
+
+  e = split_block (cond_bb, gsi_stmt (*gsi));
+  e->flags = EDGE_FALSE_VALUE;
+
+  false_bb = e->dest;
+
+  true_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, true_bb, EDGE_TRUE_VALUE | EDGE_PRESERVE);
+  make_single_succ_edge (true_bb, false_bb, EDGE_FALLTHRU);
+
+  set_immediate_dominator (CDI_DOMINATORS, true_bb, cond_bb);
+  set_immediate_dominator (CDI_DOMINATORS, false_bb, cond_bb);
+  add_bb_to_loop (true_bb, cond_bb->loop_father);
+
+  /* Call cfi_callback when they mismatch.  */
+  *gsi = gsi_start_bb (true_bb);
+  new_stmt = gimple_build_call (cfi_callback, 3,
+				caller_expected_hash, callee_hash, fptr);
+  gimple_set_location (new_stmt, loc);
+  gsi_insert_after (gsi, new_stmt, GSI_CONTINUE_LINKING);
+
+  *gsi = gsi_start_bb (false_bb);
+}
+
+namespace {
+
+const pass_data pass_data_cfi =
+{
+  GIMPLE_PASS, /* type.  */
+  "cfi", /* name.  */
+  OPTGROUP_NONE, /* optinfo_flags.  */
+  TV_NONE, /* tv_id.  */
+  (PROP_cfg | PROP_ssa), /* properties_required.  */
+  0, /* properties_provided.  */
+  0, /* properties_destroyed.  */
+  0, /* todo_flags_start.  */
+  0, /* todo_flags_finish.  */
+};
+
+class pass_cfi : public gimple_opt_pass
+{
+public:
+  pass_cfi (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_cfi, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *);
+  virtual unsigned int execute (function *);
+
+}; // class pass_cfi
+
+bool
+pass_cfi::gate (function *)
+{
+  /* Do not insert cfi checks for functions that disable cfi.  */
+  if (!sanitize_flags_p (SANITIZE_CONTROL_FLOW_INTEGRITY,
+			 current_function_decl))
+    return 0;
+
+  if (!cfi_callback)
+    cfi_callback = build_cfi_callback_decl ();
+
+  return 1;
+}
+
+unsigned int
+pass_cfi::execute (function *fun)
+{
+  tree fptr;
+  gimple *stmt;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  int todo = 0;
+
+  loop_optimizer_init (LOOPS_NORMAL);
+  gcc_assert (current_loops);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	stmt = gsi_stmt (gsi);
+
+	if (!is_gimple_call (stmt)
+	    || gimple_call_internal_p (as_a <gcall*> (stmt)))
+	  continue;
+
+	fptr = gimple_call_fn (stmt);
+
+	switch (TREE_CODE (fptr))
+	  {
+	  case ADDR_EXPR: /* Ignore non-indirect calls.  */
+	  case INTEGER_CST:
+	    continue;
+
+	  case SSA_NAME:
+	    break;
+
+	  default:
+	    gcc_unreachable ();
+	  }
+
+	gcc_assert (TREE_CODE (TREE_TYPE (fptr)) == POINTER_TYPE);
+
+	insert_gcall_cfg_check (&gsi);
+
+	todo = TODO_remove_unused_locals | TODO_update_ssa
+	       | TODO_cleanup_cfg | TODO_rebuild_cgraph_edges;
+
+	/* Re-acquire the bb where the gcall instruction is located.  */
+	bb = gsi_bb (gsi);
+      }
+
+  free_dominance_info (CDI_DOMINATORS);
+  free_dominance_info (CDI_POST_DOMINATORS);
+  loop_optimizer_finalize ();
+
+  return todo;
+}
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_cfi (gcc::context *ctxt)
+{
+  return new pass_cfi (ctxt);
+}
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 606d1d60b85..ed10a941740 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -412,6 +412,7 @@ extern gimple_opt_pass *make_pass_early_thread_jumps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_split_crit_edges (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_laddress (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_pre (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_cfi (gcc::context *ctxt);
 extern unsigned int tail_merge_optimize (bool);
 extern gimple_opt_pass *make_pass_profile (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_strip_predict_hints (gcc::context *ctxt);
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 4cf3785270b..a493812ebcc 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -137,6 +137,8 @@ static uint64_t tree_code_counts[MAX_TREE_CODES];
 uint64_t tree_node_counts[(int) all_kinds];
 uint64_t tree_node_sizes[(int) all_kinds];
 
+static unsigned int unified_tree_type_hash_table[MAX_TREE_CODES];
+
 /* Keep in sync with tree.h:enum tree_node_kind.  */
 static const char * const tree_node_kind_names[] = {
   "decls",
@@ -252,6 +254,8 @@ static void print_type_hash_statistics (void);
 static void print_debug_expr_statistics (void);
 static void print_value_expr_statistics (void);
 
+static void append_unified_type_hash (const_tree type, inchash::hash &hstate);
+
 tree global_trees[TI_MAX];
 tree integer_types[itk_none];
 
@@ -694,6 +698,143 @@ initialize_tree_contains_struct (void)
   gcc_assert (tree_contains_struct[NAMELIST_DECL][TS_DECL_COMMON]);
 }
 
+static void
+initialize_unified_tree_type_hash_table (void)
+{
+  unified_tree_type_hash_table[OFFSET_TYPE] = 10;
+  unified_tree_type_hash_table[ENUMERAL_TYPE] = 20;
+  unified_tree_type_hash_table[BOOLEAN_TYPE] = 30;
+  unified_tree_type_hash_table[INTEGER_TYPE] = 40;
+  unified_tree_type_hash_table[REAL_TYPE] = 50;
+  unified_tree_type_hash_table[POINTER_TYPE] = 60;
+  unified_tree_type_hash_table[REFERENCE_TYPE] = 70;
+  unified_tree_type_hash_table[NULLPTR_TYPE] = 80;
+  unified_tree_type_hash_table[FIXED_POINT_TYPE] = 90;
+  unified_tree_type_hash_table[COMPLEX_TYPE] = 100;
+  unified_tree_type_hash_table[VECTOR_TYPE] = 110;
+  unified_tree_type_hash_table[ARRAY_TYPE] = 120;
+  unified_tree_type_hash_table[RECORD_TYPE] = 130;
+  unified_tree_type_hash_table[UNION_TYPE] = 140;
+  unified_tree_type_hash_table[QUAL_UNION_TYPE] = 150;
+  unified_tree_type_hash_table[VOID_TYPE] = 160;
+  unified_tree_type_hash_table[FUNCTION_TYPE] = 170;
+  unified_tree_type_hash_table[METHOD_TYPE] = 180;
+  unified_tree_type_hash_table[LANG_TYPE] = 190;
+  unified_tree_type_hash_table[OPAQUE_TYPE] = 200;
+}
+
+static void
+append_unified_type_name_hash (const_tree type, inchash::hash &hstate)
+{
+  tree n = TYPE_NAME (TYPE_MAIN_VARIANT (type));
+
+  if (!n)
+    return;
+
+  if (TREE_CODE (n) != IDENTIFIER_NODE)
+    n = DECL_NAME (n);
+
+  hstate.add ((const void *) IDENTIFIER_POINTER (n), IDENTIFIER_LENGTH (n));
+}
+
+static void
+append_unified_type_precision_hash (const_tree type, inchash::hash &hstate)
+{
+  unsigned HOST_WIDE_INT size = TYPE_PRECISION (type);
+
+  hstate.add_hwi (size);
+}
+
+/* Add the return and all parameter types of the function
+   to the hash calculation.  */
+
+static void
+append_unified_function_ret_and_args_hash (const_tree fntype,
+					   inchash::hash &hstate)
+{
+  const_tree arg_type;
+  function_args_iterator args_iter;
+
+  append_unified_type_hash (TREE_TYPE (fntype), hstate);
+
+  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
+    {
+      if (TYPE_READONLY (arg_type) || TYPE_VOLATILE (arg_type))
+	{
+	  int quals = TYPE_QUALS (arg_type)
+		      & ~TYPE_QUAL_CONST & ~TYPE_QUAL_VOLATILE;
+
+	  arg_type = build_qualified_type (CONST_CAST_TREE (arg_type), quals);
+	}
+      append_unified_type_hash (arg_type, hstate);
+    }
+}
+
+static void
+append_unified_type_hash (const_tree type, inchash::hash &hstate)
+{
+  enum tree_code type_code = TREE_CODE (type);
+  unsigned int u_hash = unified_tree_type_hash_table[type_code];
+
+  /* Make sure all type nodes have a unique initial hash.  */
+  if (!u_hash)
+    gcc_unreachable ();
+
+  hstate.add_int (u_hash);
+
+  /* Extra information about the type involved in the hash calculation.  */
+  switch (type_code)
+    {
+    case VOID_TYPE:
+    case BOOLEAN_TYPE:
+      break;
+
+    case INTEGER_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      append_unified_type_precision_hash (type, hstate);
+      break;
+
+    case ENUMERAL_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      append_unified_type_precision_hash (type, hstate);
+      break;
+
+    case REAL_TYPE:
+      append_unified_type_precision_hash (TYPE_MAIN_VARIANT (type), hstate);
+      break;
+
+    case POINTER_TYPE:
+    case REFERENCE_TYPE:
+    case ARRAY_TYPE:
+      append_unified_type_hash (TREE_TYPE (type), hstate);
+      break;
+
+    case UNION_TYPE:
+    case RECORD_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      break;
+
+    case FUNCTION_TYPE:
+      append_unified_function_ret_and_args_hash (type, hstate);
+      break;
+
+    default:
+      break;
+    }
+}
+
+/* Calculate a hash of the type node that is invariant across
+   compilation units.  */
+
+hashval_t
+unified_type_hash (const_tree type)
+{
+  inchash::hash hstate;
+
+  append_unified_type_hash (type, hstate);
+
+  return hstate.end ();
+}
 
 /* Init tree.cc.  */
 
@@ -723,6 +864,9 @@ init_ttree (void)
 
   /* Initialize the tree_contains_struct array.  */
   initialize_tree_contains_struct ();
+
+  initialize_unified_tree_type_hash_table ();
+
   lang_hooks.init_ts ();
 }
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 8844471e9a5..dd8e0bfba7b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4813,6 +4813,7 @@ extern tree build_variant_type_copy (tree CXX_MEM_STAT_INFO);
 
 extern hashval_t type_hash_canon_hash (tree);
 extern tree type_hash_canon (unsigned int, tree);
+extern hashval_t unified_type_hash (const_tree);
 
 extern tree convert (tree, tree);
 extern tree size_in_bytes_loc (location_t, const_tree);
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 021e912a37c..c832075e1f3 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1956,6 +1956,11 @@ assemble_start_function (tree decl, const char *fnname)
   if (!DECL_IGNORED_P (decl))
     (*debug_hooks->begin_function) (decl);
 
+  /* Regardless of whether the function can be called indirectly,
+     a typeid is always required before the function.  */
+  if (flag_sanitize & SANITIZE_CONTROL_FLOW_INTEGRITY)
+    targetm.asm_out.output_func_cfi_typeid (asm_out_file, decl);
+
   /* Make function name accessible from other files, if appropriate.  */
 
   if (TREE_PUBLIC (decl))
@@ -7674,6 +7679,30 @@ default_globalize_decl_name (FILE * stream, tree decl)
   targetm.asm_out.globalize_label (stream, name);
 }
 
+/* Default function to output the function's cfi typeid.  */
+void
+default_output_func_cfi_typeid (FILE * stream ATTRIBUTE_UNUSED,
+				tree decl ATTRIBUTE_UNUSED)
+{
+}
+
+/* Default function to generate gimple instructions to get the
+   typeid in front of the function pointed to by fptr.  */
+tree
+default_gimple_get_func_cfi_typeid (gimple_seq *stmts ATTRIBUTE_UNUSED,
+				    location_t loc ATTRIBUTE_UNUSED,
+				    tree fptr ATTRIBUTE_UNUSED)
+{
+  return NULL_TREE;
+}
+
+/* Default function to calculate the typeid of a function type.  */
+unsigned int
+default_calc_func_cfi_typeid (const_tree fntype ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
 /* Default function to output a label for unwind information.  The
    default is to do nothing.  A target that needs nonlocal labels for
    unwind information must provide its own function to do this.  */
-- 
2.17.1



* [RFC/RFT 3/3] [PR102768] aarch64: Add support for Control Flow Integrity
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
  2022-12-19  5:54 ` [RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
  2022-12-19  5:54 ` [RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity Dan Li
@ 2022-12-19  5:54 ` Dan Li
  2023-02-09  1:48 ` [RFC/RFT 0/3] Add compiler " Hongtao Liu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2022-12-19  5:54 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

On AArch64, the typeid can be inserted directly in front of the
function (at offset -4 from its entry point).

For all functions that will not be called indirectly, the reserved
RESERVED_CFI_TYPEID (0x0) is inserted in front of them as the typeid.
Otherwise, an attacker could use the instructions or data that precede
such a function as a typeid to bypass CFI.
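
As a rough user-space illustration of the layout described above (this
is not the code the pass emits), a byte buffer can stand in for the text
section. The helper names put_typeid, read_typeid and cfi_check_ok are
made up for this sketch; only the -4 offset and the reserved value 0x0
come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RESERVED_CFI_TYPEID 0x0U	/* Value taken from the patch.  */

/* Hypothetical helper: store ID in the 4 bytes in front of the entry
   point located at offset ENTRY in IMAGE.  */
static void
put_typeid (uint8_t *image, size_t entry, uint32_t id)
{
  assert (entry >= 4);
  memcpy (image + entry - 4, &id, sizeof id);
}

/* Hypothetical helper: load the 32-bit typeid found at ENTRY - 4,
   mirroring the "__cfi_tmp = *(fptr - 4)" load.  */
static uint32_t
read_typeid (const uint8_t *image, size_t entry)
{
  uint32_t id;

  assert (entry >= 4);
  memcpy (&id, image + entry - 4, sizeof id);
  return id;
}

/* Hypothetical helper: the comparison made before an indirect call.
   Returns nonzero when the call would be allowed.  */
static int
cfi_check_ok (const uint8_t *image, size_t entry, uint32_t expected)
{
  return read_typeid (image, entry) == expected;
}
```

Since a caller's expected typeid is never 0x0 (the calculation remaps
that value), a function tagged with RESERVED_CFI_TYPEID fails this
comparison for every caller.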

All typeids are masked with AARCH64_UNALLOCATED_INSN_MASK so that they
do not collide with the AArch64 instruction set.
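
The masking step can be sketched in plain C as follows. The three
constants are the ones defined in this patch; typeid_from_hash is a
hypothetical stand-in for aarch64_calc_func_cfi_typeid, and raw_hash
plays the role of the value returned by unified_type_hash.

```c
#include <assert.h>
#include <stdint.h>

#define RESERVED_CFI_TYPEID 0x0U
#define DEFAULT_CFI_TYPEID 0x00000ADAU
#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU

/* Hypothetical stand-in for aarch64_calc_func_cfi_typeid.  */
static uint32_t
typeid_from_hash (uint32_t raw_hash)
{
  /* The mask clears bits 27 and 28 of the hash.  */
  uint32_t id = raw_hash & AARCH64_UNALLOCATED_INSN_MASK;

  /* 0x0 is reserved for functions that cannot be called indirectly,
     so remap it to the default typeid.  */
  if (id == RESERVED_CFI_TYPEID)
    id = DEFAULT_CFI_TYPEID;

  assert ((id & ~AARCH64_UNALLOCATED_INSN_MASK) == 0);
  return id;
}
```

For example, a raw hash of 0x12345678 becomes 0x02345678, and a raw
hash of 0x18000000 masks to zero and is therefore replaced by
0x00000ADA.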

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	PR c/102768
	* config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition.
	(DEFAULT_CFI_TYPEID): Likewise.
	(AARCH64_UNALLOCATED_INSN_MASK): Likewise.
	(aarch64_gimple_get_func_cfi_typeid): Platform-dependent
	CFI function.
	(aarch64_calc_func_cfi_typeid): Likewise.
	(cgraph_indirectly_callable): Determine whether a function may
	be called indirectly.
	(aarch64_output_func_cfi_typeid): Platform-dependent CFI function.
	(TARGET_HAVE_CFI): New hook.
	(TARGET_CALC_FUNC_CFI_TYPEID): Likewise.
	(TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID): Likewise.
	(TARGET_GIMPLE_GET_FUNC_CFI_TYPEID): Likewise.
	* doc/invoke.texi: Document -fsanitize=cfi.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/control_flow_integrity_1.c: New test.
	* gcc.target/aarch64/control_flow_integrity_2.c: New test.
	* gcc.target/aarch64/control_flow_integrity_3.c: New test.
---
 gcc/config/aarch64/aarch64.cc                 | 106 ++++++++++++++++++
 gcc/doc/invoke.texi                           |  35 ++++++
 .../aarch64/control_flow_integrity_1.c        |  14 +++
 .../aarch64/control_flow_integrity_2.c        |  25 +++++
 .../aarch64/control_flow_integrity_3.c        |  23 ++++
 5 files changed, 203 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5c9e7791a12..2796df0cdf3 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -81,6 +81,7 @@
 #include "rtlanal.h"
 #include "tree-dfa.h"
 #include "asan.h"
+#include "ssa.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -5450,6 +5451,99 @@ aarch64_output_sve_addvl_addpl (rtx offset)
   return buffer;
 }
 
+/* Reserved for all functions that cannot be called indirectly.  */
+#define RESERVED_CFI_TYPEID 0x0U
+
+/* If the typeid of a function that can be called indirectly is equal to
+   RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID.  */
+#define DEFAULT_CFI_TYPEID 0x00000ADAU
+
+/* Mask of reserved and unallocated instructions in AArch64 platform.  */
+#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU
+
+/* Generate gimple insns to return the callee's typeid to a tmp var,
+   for aarch64, like:
+	__cfi_tmp = *(fptr - 4);  */
+
+static tree
+aarch64_gimple_get_func_cfi_typeid (gimple_seq *stmts,
+				    location_t loc, tree fptr)
+{
+  gimple *stmt;
+  tree result, rhs;
+
+  result = create_tmp_var (integer_type_node, "__cfi_tmp");
+  result = make_ssa_name (result, NULL);
+
+  rhs = build_pointer_type (integer_type_node);
+  rhs = build_int_cst_type (rhs, -4);
+  rhs = build2 (MEM_REF, integer_type_node, fptr, rhs);
+
+  stmt = gimple_build_assign (result, rhs);
+  gimple_set_location (stmt, loc);
+
+  SSA_NAME_DEF_STMT (result) = stmt;
+
+  gimple_seq_add_stmt (stmts, stmt);
+
+  return result;
+}
+
+static unsigned int
+aarch64_calc_func_cfi_typeid (const_tree fntype)
+{
+  unsigned int hash;
+
+  /* The value of typeid has a probability of being the same as the encoding
+     of an instruction.  If the attacker can find the same encoding as the
+     typeid in the assembly code, then he has found a usable jump location.
+     So here, a platform-related mask is used when generating a typeid to
+     avoid such conflicts as much as possible.  */
+  hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK;
+
+  /* RESERVED_CFI_TYPEID is reserved for functions that cannot
+     be called indirectly.  */
+  if (hash == RESERVED_CFI_TYPEID)
+    hash = DEFAULT_CFI_TYPEID;
+
+  return hash;
+}
+
+static bool
+cgraph_indirectly_callable (struct cgraph_node *node,
+			    void *data ATTRIBUTE_UNUSED)
+{
+  if (node->externally_visible || node->address_taken)
+    return true;
+
+  return false;
+}
+
+static void
+aarch64_output_func_cfi_typeid (FILE * stream, tree decl)
+{
+  struct cgraph_node *node;
+  unsigned int cur_func_typeid;
+
+  node = cgraph_node::get (decl);
+
+  if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable,
+					       NULL, true))
+    /* CFI's typeid check always considers that there is a typeid before the
+       target function, so it is also necessary to output typeid for functions
+       that cannot be called indirectly to prevent attackers from bypassing
+       CFI by using instructions/data before those functions.
+       The typeid inserted before such a function is RESERVED_CFI_TYPEID,
+       and the calculation of the typeid must ensure that this value is always
+       reserved.  */
+    cur_func_typeid = RESERVED_CFI_TYPEID;
+  else
+    cur_func_typeid = aarch64_calc_func_cfi_typeid (TREE_TYPE (decl));
+
+  fprintf (stream, "__cfi_%s:\n", get_name (decl));
+  fprintf (stream, "\t.4byte %#010x\n", cur_func_typeid);
+}
+
 /* Return true if X is a valid immediate for an SVE vector INC or DEC
    instruction.  If it is, store the number of elements in each vector
    quadword in *NELTS_PER_VQ_OUT (if nonnull) and store the multiplication
@@ -27823,6 +27917,18 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_HAVE_SHADOW_CALL_STACK
 #define TARGET_HAVE_SHADOW_CALL_STACK true
 
+#undef TARGET_HAVE_CFI
+#define TARGET_HAVE_CFI true
+
+#undef TARGET_CALC_FUNC_CFI_TYPEID
+#define TARGET_CALC_FUNC_CFI_TYPEID aarch64_calc_func_cfi_typeid
+
+#undef TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID
+#define TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID aarch64_output_func_cfi_typeid
+
+#undef TARGET_GIMPLE_GET_FUNC_CFI_TYPEID
+#define TARGET_GIMPLE_GET_FUNC_CFI_TYPEID aarch64_gimple_get_func_cfi_typeid
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff6c338bedb..302ae6fe370 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15736,6 +15736,41 @@ to turn off exceptions.
 See @uref{https://clang.llvm.org/docs/ShadowCallStack.html} for more
 details.
 
+@item -fsanitize=cfi
+@opindex fsanitize=cfi
+The CFI sanitizer, enabled with @option{-fsanitize=cfi}, implements a
+forward-edge control flow integrity scheme for indirect calls.  It
+attaches a type identifier (@code{typeid}) for each function and injects
+verification code before indirect calls.
+
+A @code{typeid} is a 32-bit constant.  Its value is derived mainly from the
+return type and all parameter types of the function, and is invariant
+across compilations.  Since the value of @code{typeid} may conflict with
+the instruction set encoding of the target platform, some bits may be
+ignored, depending on the platform.
+
+At compile time, the compiler inserts checking code before every indirect
+call.  At run time, before an indirect call is made, this code checks that
+the @code{typeid} in front of the callee matches the @code{typeid}
+expected by the caller.  If they do not match, the @code{cfi_check_failed}
+function is called.  Users who enable CFI must provide this function
+themselves.
+
+If a program contains indirect calls to assembly functions, they must be
+manually annotated with the expected type identifiers to prevent errors.
+To make this easier, CFI generates a weak SHN_ABS
+@code{__cfi_typeid_<function>} symbol for each address-taken function
+declaration, which can be used to annotate functions in assembly as long
+as at least one C translation unit linked into the program takes the
+function address.
+
+Currently this feature is supported only on AArch64 and is aimed mainly at
+the Linux kernel.  Users who want to use this feature in user space
+need to provide their own support for the runtime.
+
+See @uref{https://clang.llvm.org/docs/ControlFlowIntegrity.html} for
+more details.
+
 @item -fsanitize=thread
 @opindex fsanitize=thread
 Enable ThreadSanitizer, a fast data race detector.
diff --git a/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
new file mode 100644
index 00000000000..0e53e294a96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
@@ -0,0 +1,14 @@
+/* Verify:
+     * typeid is output for an external declaration if and only if
+       its address is taken within the current compilation unit.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=cfi" } */
+
+extern int func1(void);
+extern int func2(void);
+
+int (*p)(void) = func1;
+
+/* { dg-final { scan-assembler-times {.weak __cfi_typeid_func} 1 } } */
+/* { dg-final { scan-assembler-times {.set __cfi_typeid_func} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
new file mode 100644
index 00000000000..36396a904f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
@@ -0,0 +1,25 @@
+/* Verify:
+     * When CFI is enabled, the default typeid inserted before
+       functions that cannot be called indirectly is 0.
+     * The default typeid inserted before the function that can
+       be called indirectly is not 0.
+     * A __cfi_A symbol is always inserted before function A.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=cfi" } */
+
+static int func1(void)
+{
+  return 0;
+}
+
+static int func2(void)
+{
+  return 0;
+}
+
+int (*p)(void) = func1;
+
+/* { dg-final { scan-assembler-times {.4byte} 2 } } */
+/* { dg-final { scan-assembler-times {.4byte 0000000000} 1 } } */
+/* { dg-final { scan-assembler-times {__cfi_func} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c
new file mode 100644
index 00000000000..ad8880d526c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c
@@ -0,0 +1,23 @@
+/* Verify:
+     * A cfi check is always inserted before an indirect function call,
+       and the cfi_check_failed function is called if the check fails.
+     * For functions with cfi disabled, no checks are inserted.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=cfi" } */
+
+#define __no_cfi __attribute__((no_sanitize("cfi")))
+
+int (*p)(void);
+
+void __no_cfi func1(void)
+{
+  p();
+}
+
+void func2(void)
+{
+  p();
+}
+
+/* { dg-final { scan-assembler-times {bl\tcfi_check_failed} 1 } } */
-- 
2.17.1


* Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
                   ` (2 preceding siblings ...)
  2022-12-19  5:54 ` [RFC/RFT 3/3] [PR102768] aarch64: Add support " Dan Li
@ 2023-02-09  1:48 ` Hongtao Liu
  2023-02-10 16:18   ` Dan Li
  2023-02-09  5:32 ` Peter Collingbourne
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
  5 siblings, 1 reply; 16+ messages in thread
From: Hongtao Liu @ 2023-02-09  1:48 UTC (permalink / raw)
  To: Dan Li
  Cc: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On Mon, Dec 19, 2022 at 3:59 PM Dan Li via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
>
> I hope that this feature will also support user-mode CFI in the
> future (at least for developers who can recompile the runtime),
> so I use -fsanitize=cfi as a compilation option here.
>
> Any suggestion please let me know :).
Do you have this series as a branch somewhere that we could also try for x86?

>
> Thanks, Dan.
>
> [1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolvanen@google.com/
> [2] https://clang.llvm.org/docs/ControlFlowIntegrity.html
> [3] https://reviews.llvm.org/D119296
>
> Dan Li (3):
>   [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to
>     64 bits to support more features
>   [PR102768] Support CFI: Add new pass for Control Flow Integrity
>   [PR102768] aarch64: Add support for Control Flow Integrity
>
> Signed-off-by: Dan Li <ashimida.1990@gmail.com>
>
> ---
>  gcc/Makefile.in                               |   1 +
>  gcc/asan.h                                    |   4 +-
>  gcc/c-family/c-attribs.cc                     |  10 +-
>  gcc/c-family/c-common.h                       |   2 +-
>  gcc/c/c-parser.cc                             |   4 +-
>  gcc/cgraphunit.cc                             |  34 +++
>  gcc/common.opt                                |   4 +-
>  gcc/config/aarch64/aarch64.cc                 | 106 ++++++++
>  gcc/cp/typeck.cc                              |   2 +-
>  gcc/doc/invoke.texi                           |  35 +++
>  gcc/doc/passes.texi                           |  10 +
>  gcc/doc/tm.texi                               |  27 +++
>  gcc/doc/tm.texi.in                            |   8 +
>  gcc/dwarf2asm.cc                              |   2 +-
>  gcc/flag-types.h                              |  67 ++---
>  gcc/opt-suggestions.cc                        |   2 +-
>  gcc/opts.cc                                   |  26 +-
>  gcc/opts.h                                    |   8 +-
>  gcc/output.h                                  |   3 +
>  gcc/passes.def                                |   1 +
>  gcc/target.def                                |  39 +++
>  .../aarch64/control_flow_integrity_1.c        |  14 ++
>  .../aarch64/control_flow_integrity_2.c        |  25 ++
>  .../aarch64/control_flow_integrity_3.c        |  23 ++
>  gcc/toplev.cc                                 |   4 +
>  gcc/tree-cfg.cc                               |   2 +-
>  gcc/tree-cfi.cc                               | 229 ++++++++++++++++++
>  gcc/tree-pass.h                               |   1 +
>  gcc/tree.cc                                   | 144 +++++++++++
>  gcc/tree.h                                    |   1 +
>  gcc/varasm.cc                                 |  29 +++
>  31 files changed, 803 insertions(+), 64 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c
>  create mode 100644 gcc/tree-cfi.cc
>
> --
> 2.17.1
>


--
BR,
Hongtao

* Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
                   ` (3 preceding siblings ...)
  2023-02-09  1:48 ` [RFC/RFT 0/3] Add compiler " Hongtao Liu
@ 2023-02-09  5:32 ` Peter Collingbourne
  2023-02-10 16:20   ` Dan Li
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
  5 siblings, 1 reply; 16+ messages in thread
From: Peter Collingbourne @ 2023-02-09  5:32 UTC (permalink / raw)
  To: Dan Li
  Cc: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On Sun, Dec 18, 2022 at 10:06 PM Dan Li <ashimida.1990@gmail.com> wrote:
>
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
>
> I hope that this feature will also support user-mode CFI in the
> future (at least for developers who can recompile the runtime),
> so I use -fsanitize=cfi as a compilation option here.

Please don't. The various CFI-related build flags are confusing enough
without also having this inconsistency between Clang and GCC.

Peter

* Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
  2023-02-09  1:48 ` [RFC/RFT 0/3] Add compiler " Hongtao Liu
@ 2023-02-10 16:18   ` Dan Li
  2023-02-13  1:39     ` Hongtao Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Dan Li @ 2023-02-10 16:18 UTC (permalink / raw)
  To: Hongtao Liu
  Cc: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On 02/09, Hongtao Liu wrote:
> On Mon, Dec 19, 2022 at 3:59 PM Dan Li via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > This series of patches is mainly used to support the control flow
> > integrity protection of the linux kernel [1], which is similar to
> > -fsanitize=kcfi in clang 16.0 [2,3].
> >
> > I hope that this feature will also support user-mode CFI in the
> > future (at least for developers who can recompile the runtime),
> > so I use -fsanitize=cfi as a compilation option here.
> >
> > Any suggestion please let me know :).
> Do you have this series as a branch somewhere that we could also try for x86?

Hi Hongtao,

I haven't tried this feature on the x86 platform yet. If possible, I will try
it in the next version.

Thanks,
Dan.

> --
> BR,
> Hongtao

* Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
  2023-02-09  5:32 ` Peter Collingbourne
@ 2023-02-10 16:20   ` Dan Li
  0 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-02-10 16:20 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On 02/08, Peter Collingbourne wrote:
> On Sun, Dec 18, 2022 at 10:06 PM Dan Li <ashimida.1990@gmail.com> wrote:
> >
> > This series of patches is mainly used to support the control flow
> > integrity protection of the linux kernel [1], which is similar to
> > -fsanitize=kcfi in clang 16.0 [2,3].
> >
> > I hope that this feature will also support user-mode CFI in the
> > future (at least for developers who can recompile the runtime),
> > so I use -fsanitize=cfi as a compilation option here.
> 
> Please don't. The various CFI-related build flags are confusing enough
> without also having this inconsistency between Clang and GCC.

Hi Peter,

Got it. As discussed before [1], in the next version I will use the same
compile option as Clang.

[1]. https://patchwork.kernel.org/project/linux-arm-kernel/patch/20221219061758.23321-1-ashimida.1990@gmail.com/

Thanks,
Dan.

> 
> Peter

* Re: [RFC/RFT 0/3] Add compiler support for Control Flow Integrity
  2023-02-10 16:18   ` Dan Li
@ 2023-02-13  1:39     ` Hongtao Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Hongtao Liu @ 2023-02-13  1:39 UTC (permalink / raw)
  To: Dan Li
  Cc: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On Sat, Feb 11, 2023 at 12:18 AM Dan Li <ashimida.1990@gmail.com> wrote:
>
> On 02/09, Hongtao Liu wrote:
> > On Mon, Dec 19, 2022 at 3:59 PM Dan Li via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > This series of patches is mainly used to support the control flow
> > > integrity protection of the linux kernel [1], which is similar to
> > > -fsanitize=kcfi in clang 16.0 [2,3].
> > >
> > > I hope that this feature will also support user-mode CFI in the
> > > future (at least for developers who can recompile the runtime),
> > > so I use -fsanitize=cfi as a compilation option here.
> > >
> > > Any suggestion please let me know :).
> > Do you have this series as a branch somewhere that we could also try for x86?
>
> Hi Hongtao,
>
> I haven't tried this feature on the x86 platform yet. If possible, I will try
> it in the next version.
Thanks.
>
> Thanks,
> Dan.
>
> > --
> > BR,
> > Hongtao



--
BR,
Hongtao

* [RFC/RFT,V2 0/3] Add compiler support for Kernel Control Flow Integrity
  2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
                   ` (4 preceding siblings ...)
  2023-02-09  5:32 ` Peter Collingbourne
@ 2023-03-25  8:11 ` Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
                     ` (4 more replies)
  5 siblings, 5 replies; 16+ messages in thread
From: Dan Li @ 2023-03-25  8:11 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

This series of patches is mainly used to support the control flow
integrity protection of the linux kernel [1], which is similar to
-fsanitize=kcfi in clang 16.0 [2,3].

Any suggestions are welcome :).

Thanks, Dan.

[1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolvanen@google.com/
[2] https://clang.llvm.org/docs/ControlFlowIntegrity.html
[3] https://reviews.llvm.org/D119296

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

---
Dan Li (3):
  [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to
    64 bits to support more features
  [PR102768] Support CFI: Add basic support for Kernel Control Flow
    Integrity
  [PR102768] aarch64: Add support for Kernel Control Flow Integrity

 gcc/asan.h                    |   4 +-
 gcc/c-family/c-attribs.cc     |  10 +-
 gcc/c-family/c-common.h       |   2 +-
 gcc/c/c-parser.cc             |   4 +-
 gcc/cfgexpand.cc              |  26 ++++++
 gcc/cgraphunit.cc             |  34 +++++++
 gcc/combine.cc                |   1 +
 gcc/common.opt                |   4 +-
 gcc/config/aarch64/aarch64.cc | 166 ++++++++++++++++++++++++++++++++++
 gcc/cp/typeck.cc              |   2 +-
 gcc/doc/invoke.texi           |  36 ++++++++
 gcc/doc/tm.texi               |  27 ++++++
 gcc/doc/tm.texi.in            |   8 ++
 gcc/dwarf2asm.cc              |   2 +-
 gcc/emit-rtl.cc               |   1 +
 gcc/emit-rtl.h                |   4 +
 gcc/final.cc                  |  24 ++++-
 gcc/flag-types.h              |  67 +++++++-------
 gcc/gimple.cc                 |  11 +++
 gcc/gimple.h                  |   5 +-
 gcc/opt-suggestions.cc        |   2 +-
 gcc/opts.cc                   |  26 +++---
 gcc/opts.h                    |   8 +-
 gcc/output.h                  |   3 +
 gcc/reg-notes.def             |   1 +
 gcc/target.def                |  38 ++++++++
 gcc/toplev.cc                 |   4 +
 gcc/tree-cfg.cc               |   2 +-
 gcc/tree.cc                   | 144 +++++++++++++++++++++++++++++
 gcc/tree.h                    |   1 +
 gcc/varasm.cc                 |  26 ++++++
 31 files changed, 627 insertions(+), 66 deletions(-)

-- 
2.17.1


* [RFC/RFT,V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
@ 2023-03-25  8:11   ` Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 2/3] [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity Dan Li
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-03-25  8:11 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

The 32-bit sanitize_code can no longer accommodate new options,
so extend it to 64 bits.
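
To illustrate why the wider type is needed, here is a minimal sketch in
plain C. It is not part of the patch: sanitize_bit and fits_in_32_bits
are hypothetical helper names used only for this example. Bit 31 is the
last flag a 32-bit mask can represent, so any new sanitizer flag placed
at bit 32 or above would be lost in an unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the mask for sanitizer bit BIT,
   mirroring the "1ULL << n" style of enum sanitize_code.  */
static uint64_t
sanitize_bit (unsigned bit)
{
  assert (bit < 64);
  return UINT64_C (1) << bit;
}

/* Return nonzero if MASK survives a round trip through a 32-bit
   variable, i.e. if it could still be stored in the old type.  */
static int
fits_in_32_bits (uint64_t mask)
{
  uint32_t narrowed = (uint32_t) mask;
  return (uint64_t) narrowed == mask;
}
```

With these helpers, fits_in_32_bits (sanitize_bit (31)) is true, while
the same check for bit 32 is false because the narrowed value is 0.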

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	PR c/102768
	* asan.h (sanitize_flags_p): Promote to uint64_t.
	* common.opt: Likewise.
	* dwarf2asm.cc (dw2_output_indirect_constant_1): Likewise.
	* flag-types.h (enum sanitize_code): Likewise.
	* opt-suggestions.cc (option_proposer::build_option_suggestions):
	Likewise.
	* opts.cc (find_sanitizer_argument): Likewise.
	(report_conflicting_sanitizer_options): Likewise.
	(get_closest_sanitizer_option): Likewise.
	(parse_sanitizer_options): Likewise.
	(parse_no_sanitize_attribute): Likewise.
	* opts.h (parse_sanitizer_options): Likewise.
	(parse_no_sanitize_attribute): Likewise.
	* tree-cfg.cc (print_no_sanitize_attr_value): Likewise.

gcc/c-family/ChangeLog:

	* c-attribs.cc (add_no_sanitize_value): Likewise.
	(handle_no_sanitize_attribute): Likewise.
	* c-common.h (add_no_sanitize_value): Likewise.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_declaration_or_fndef): Likewise.

gcc/cp/ChangeLog:

	* typeck.cc (get_member_function_from_ptrfunc): Likewise.
---
 gcc/asan.h                |  4 +--
 gcc/c-family/c-attribs.cc | 10 +++---
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-parser.cc         |  4 +--
 gcc/common.opt            |  4 +--
 gcc/cp/typeck.cc          |  2 +-
 gcc/dwarf2asm.cc          |  2 +-
 gcc/flag-types.h          | 65 ++++++++++++++++++++-------------------
 gcc/opt-suggestions.cc    |  2 +-
 gcc/opts.cc               | 22 ++++++-------
 gcc/opts.h                |  8 ++---
 gcc/tree-cfg.cc           |  2 +-
 12 files changed, 64 insertions(+), 63 deletions(-)

diff --git a/gcc/asan.h b/gcc/asan.h
index d4ea49cb240..5b98172549b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -233,9 +233,9 @@ asan_protect_stack_decl (tree decl)
    remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES.  */
 
 static inline bool
-sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl)
+sanitize_flags_p (uint64_t flag, const_tree fn = current_function_decl)
 {
-  unsigned int result_flags = flag_sanitize & flag;
+  uint64_t result_flags = flag_sanitize & flag;
   if (result_flags == 0)
     return false;
 
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 111a33f405a..a73e2364525 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -1118,23 +1118,23 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 /* Add FLAGS for a function NODE to no_sanitize_flags in DECL_ATTRIBUTES.  */
 
 void
-add_no_sanitize_value (tree node, unsigned int flags)
+add_no_sanitize_value (tree node, uint64_t flags)
 {
   tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node));
   if (attr)
     {
-      unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr));
+      uint64_t old_value = tree_to_uhwi (TREE_VALUE (attr));
       flags |= old_value;
 
       if (flags == old_value)
 	return;
 
-      TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags);
+      TREE_VALUE (attr) = build_int_cst (long_long_unsigned_type_node, flags);
     }
   else
     DECL_ATTRIBUTES (node)
       = tree_cons (get_identifier ("no_sanitize"),
-		   build_int_cst (unsigned_type_node, flags),
+		   build_int_cst (long_long_unsigned_type_node, flags),
 		   DECL_ATTRIBUTES (node));
 }
 
@@ -1145,7 +1145,7 @@ static tree
 handle_no_sanitize_attribute (tree *node, tree name, tree args, int,
 			      bool *no_add_attrs)
 {
-  unsigned int flags = 0;
+  uint64_t flags = 0;
   *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
     {
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 52a85bfb783..eb91b9703db 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1500,7 +1500,7 @@ extern enum flt_eval_method
 excess_precision_mode_join (enum flt_eval_method, enum flt_eval_method);
 
 extern int c_flt_eval_method (bool ts18661_p);
-extern void add_no_sanitize_value (tree node, unsigned int flags);
+extern void add_no_sanitize_value (tree node, uint64_t flags);
 
 extern void maybe_add_include_fixit (rich_location *, const char *, bool);
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f679d53706a..9d55ea55fa6 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -2217,7 +2217,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 		  start_init (NULL_TREE, asm_name, global_bindings_p (), &richloc);
 		  /* A parameter is initialized, which is invalid.  Don't
 		     attempt to instrument the initializer.  */
-		  int flag_sanitize_save = flag_sanitize;
+		  uint64_t flag_sanitize_save = flag_sanitize;
 		  if (nested && !empty_ok)
 		    flag_sanitize = 0;
 		  init = c_parser_expr_no_commas (parser, NULL);
@@ -2275,7 +2275,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 		  start_init (d, asm_name, global_bindings_p (), &richloc);
 		  /* A parameter is initialized, which is invalid.  Don't
 		     attempt to instrument the initializer.  */
-		  int flag_sanitize_save = flag_sanitize;
+		  uint64_t flag_sanitize_save = flag_sanitize;
 		  if (TREE_CODE (d) == PARM_DECL)
 		    flag_sanitize = 0;
 		  init = c_parser_initializer (parser);
diff --git a/gcc/common.opt b/gcc/common.opt
index 8a0dafc522d..9613c2f8ba0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -217,11 +217,11 @@ bool flag_opts_finished
 
 ; What the sanitizer should instrument
 Variable
-unsigned int flag_sanitize
+uint64_t flag_sanitize
 
 ; What sanitizers should recover from errors
 Variable
-unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
+uint64_t flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
 ; Flag whether a prefix has been added to dump_base_name
 Variable
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index ceb80d9744f..0afaf58d87d 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -4023,7 +4023,7 @@ get_member_function_from_ptrfunc (tree *instance_ptrptr, tree function,
       idx = build1 (NOP_EXPR, vtable_index_type, e3);
       switch (TARGET_PTRMEMFUNC_VBIT_LOCATION)
 	{
-	  int flag_sanitize_save;
+	  uint64_t flag_sanitize_save;
 	case ptrmemfunc_vbit_in_pfn:
 	  e1 = cp_build_binary_op (input_location,
 				   BIT_AND_EXPR, idx, integer_one_node,
diff --git a/gcc/dwarf2asm.cc b/gcc/dwarf2asm.cc
index 274f574f25e..b54d1935d57 100644
--- a/gcc/dwarf2asm.cc
+++ b/gcc/dwarf2asm.cc
@@ -1026,7 +1026,7 @@ dw2_output_indirect_constant_1 (const char *sym, tree id)
   sym_ref = gen_rtx_SYMBOL_REF (Pmode, sym);
   /* Disable ASan for decl because redzones cause ABI breakage between GCC and
      libstdc++ for `.LDFCM*' variables.  See PR 78651 for details.  */
-  unsigned int save_flag_sanitize = flag_sanitize;
+  uint64_t save_flag_sanitize = flag_sanitize;
   flag_sanitize &= ~(SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS
 		     | SANITIZE_KERNEL_ADDRESS);
   /* And also temporarily disable -fsection-anchors.  These indirect constants
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 2c8498169e0..0aa51e282fb 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -287,42 +287,43 @@ enum auto_init_type {
 /* Different instrumentation modes.  */
 enum sanitize_code {
   /* AddressSanitizer.  */
-  SANITIZE_ADDRESS = 1UL << 0,
-  SANITIZE_USER_ADDRESS = 1UL << 1,
-  SANITIZE_KERNEL_ADDRESS = 1UL << 2,
+  SANITIZE_ADDRESS = 1ULL << 0,
+  SANITIZE_USER_ADDRESS = 1ULL << 1,
+  SANITIZE_KERNEL_ADDRESS = 1ULL << 2,
   /* ThreadSanitizer.  */
-  SANITIZE_THREAD = 1UL << 3,
+  SANITIZE_THREAD = 1ULL << 3,
   /* LeakSanitizer.  */
-  SANITIZE_LEAK = 1UL << 4,
+  SANITIZE_LEAK = 1ULL << 4,
   /* UndefinedBehaviorSanitizer.  */
-  SANITIZE_SHIFT_BASE = 1UL << 5,
-  SANITIZE_SHIFT_EXPONENT = 1UL << 6,
-  SANITIZE_DIVIDE = 1UL << 7,
-  SANITIZE_UNREACHABLE = 1UL << 8,
-  SANITIZE_VLA = 1UL << 9,
-  SANITIZE_NULL = 1UL << 10,
-  SANITIZE_RETURN = 1UL << 11,
-  SANITIZE_SI_OVERFLOW = 1UL << 12,
-  SANITIZE_BOOL = 1UL << 13,
-  SANITIZE_ENUM = 1UL << 14,
-  SANITIZE_FLOAT_DIVIDE = 1UL << 15,
-  SANITIZE_FLOAT_CAST = 1UL << 16,
-  SANITIZE_BOUNDS = 1UL << 17,
-  SANITIZE_ALIGNMENT = 1UL << 18,
-  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 19,
-  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 20,
-  SANITIZE_OBJECT_SIZE = 1UL << 21,
-  SANITIZE_VPTR = 1UL << 22,
-  SANITIZE_BOUNDS_STRICT = 1UL << 23,
-  SANITIZE_POINTER_OVERFLOW = 1UL << 24,
-  SANITIZE_BUILTIN = 1UL << 25,
-  SANITIZE_POINTER_COMPARE = 1UL << 26,
-  SANITIZE_POINTER_SUBTRACT = 1UL << 27,
-  SANITIZE_HWADDRESS = 1UL << 28,
-  SANITIZE_USER_HWADDRESS = 1UL << 29,
-  SANITIZE_KERNEL_HWADDRESS = 1UL << 30,
+  SANITIZE_SHIFT_BASE = 1ULL << 5,
+  SANITIZE_SHIFT_EXPONENT = 1ULL << 6,
+  SANITIZE_DIVIDE = 1ULL << 7,
+  SANITIZE_UNREACHABLE = 1ULL << 8,
+  SANITIZE_VLA = 1ULL << 9,
+  SANITIZE_NULL = 1ULL << 10,
+  SANITIZE_RETURN = 1ULL << 11,
+  SANITIZE_SI_OVERFLOW = 1ULL << 12,
+  SANITIZE_BOOL = 1ULL << 13,
+  SANITIZE_ENUM = 1ULL << 14,
+  SANITIZE_FLOAT_DIVIDE = 1ULL << 15,
+  SANITIZE_FLOAT_CAST = 1ULL << 16,
+  SANITIZE_BOUNDS = 1ULL << 17,
+  SANITIZE_ALIGNMENT = 1ULL << 18,
+  SANITIZE_NONNULL_ATTRIBUTE = 1ULL << 19,
+  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1ULL << 20,
+  SANITIZE_OBJECT_SIZE = 1ULL << 21,
+  SANITIZE_VPTR = 1ULL << 22,
+  SANITIZE_BOUNDS_STRICT = 1ULL << 23,
+  SANITIZE_POINTER_OVERFLOW = 1ULL << 24,
+  SANITIZE_BUILTIN = 1ULL << 25,
+  SANITIZE_POINTER_COMPARE = 1ULL << 26,
+  SANITIZE_POINTER_SUBTRACT = 1ULL << 27,
+  SANITIZE_HWADDRESS = 1ULL << 28,
+  SANITIZE_USER_HWADDRESS = 1ULL << 29,
+  SANITIZE_KERNEL_HWADDRESS = 1ULL << 30,
   /* Shadow Call Stack.  */
-  SANITIZE_SHADOW_CALL_STACK = 1UL << 31,
+  SANITIZE_SHADOW_CALL_STACK = 1ULL << 31,
+  SANITIZE_MAX = 1ULL << 63,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
 		       | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
diff --git a/gcc/opt-suggestions.cc b/gcc/opt-suggestions.cc
index 33f298560a1..c667e23e66f 100644
--- a/gcc/opt-suggestions.cc
+++ b/gcc/opt-suggestions.cc
@@ -173,7 +173,7 @@ option_proposer::build_option_suggestions (const char *prefix)
 		/* -fsanitize=all is not valid, only -fno-sanitize=all.
 		   So don't register the positive misspelling candidates
 		   for it.  */
-		if (sanitizer_opts[j].flag == ~0U && i == OPT_fsanitize_)
+		if (sanitizer_opts[j].flag == ~0ULL && i == OPT_fsanitize_)
 		  {
 		    optb = *option;
 		    optb.opt_text = opt_text = "-fno-sanitize=";
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 3a89da2dd03..11c5d70458f 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -966,7 +966,7 @@ vec<const char *> help_option_arguments;
 /* Return the string name describing a sanitizer argument which has been
    provided on the command line and has set this particular flag.  */
 const char *
-find_sanitizer_argument (struct gcc_options *opts, unsigned int flags)
+find_sanitizer_argument (struct gcc_options *opts, uint64_t flags)
 {
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     {
@@ -1000,10 +1000,10 @@ find_sanitizer_argument (struct gcc_options *opts, unsigned int flags)
    set these flags.  */
 static void
 report_conflicting_sanitizer_options (struct gcc_options *opts, location_t loc,
-				      unsigned int left, unsigned int right)
+				      uint64_t left, uint64_t right)
 {
-  unsigned int left_seen = (opts->x_flag_sanitize & left);
-  unsigned int right_seen = (opts->x_flag_sanitize & right);
+  uint64_t left_seen = (opts->x_flag_sanitize & left);
+  uint64_t right_seen = (opts->x_flag_sanitize & right);
   if (left_seen && right_seen)
     {
       const char* left_arg = find_sanitizer_argument (opts, left_seen);
@@ -2059,7 +2059,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true),
   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true),
   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false),
-  SANITIZER_OPT (all, ~0U, true),
+  SANITIZER_OPT (all, ~0ULL, true),
 #undef SANITIZER_OPT
   { NULL, 0U, 0UL, false }
 };
@@ -2128,7 +2128,7 @@ get_closest_sanitizer_option (const string_fragment &arg,
     {
       /* -fsanitize=all is not valid, so don't offer it.  */
       if (code == OPT_fsanitize_
-	  && opts[i].flag == ~0U
+	  && opts[i].flag == ~0ULL
 	  && value)
 	continue;
 
@@ -2148,9 +2148,9 @@ get_closest_sanitizer_option (const string_fragment &arg,
    adjust previous FLAGS and return new ones.  If COMPLAIN is false,
    don't issue diagnostics.  */
 
-unsigned int
+uint64_t
 parse_sanitizer_options (const char *p, location_t loc, int scode,
-			 unsigned int flags, int value, bool complain)
+			 uint64_t flags, int value, bool complain)
 {
   enum opt_code code = (enum opt_code) scode;
 
@@ -2176,7 +2176,7 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 	    && memcmp (p, sanitizer_opts[i].name, len) == 0)
 	  {
 	    /* Handle both -fsanitize and -fno-sanitize cases.  */
-	    if (value && sanitizer_opts[i].flag == ~0U)
+	    if (value && sanitizer_opts[i].flag == ~0ULL)
 	      {
 		if (code == OPT_fsanitize_)
 		  {
@@ -2241,10 +2241,10 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 /* Parse string values of no_sanitize attribute passed in VALUE.
    Values are separated with comma.  */
 
-unsigned int
+uint64_t
 parse_no_sanitize_attribute (char *value)
 {
-  unsigned int flags = 0;
+  uint64_t flags = 0;
   unsigned int i;
   char *q = strtok (value, ",");
 
diff --git a/gcc/opts.h b/gcc/opts.h
index a43ce66cffe..17a02cc7c14 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -425,10 +425,10 @@ extern void control_warning_option (unsigned int opt_index, int kind,
 extern char *write_langs (unsigned int mask);
 extern void print_ignored_options (void);
 extern void handle_common_deferred_options (void);
-unsigned int parse_sanitizer_options (const char *, location_t, int,
-				      unsigned int, int, bool);
+uint64_t parse_sanitizer_options (const char *, location_t, int,
+				  uint64_t, int, bool);
 
-unsigned int parse_no_sanitize_attribute (char *value);
+uint64_t parse_no_sanitize_attribute (char *value);
 extern bool common_handle_option (struct gcc_options *opts,
 				  struct gcc_options *opts_set,
 				  const struct cl_decoded_option *decoded,
@@ -470,7 +470,7 @@ extern bool opt_enum_arg_to_value (size_t opt_index, const char *arg,
 extern const struct sanitizer_opts_s
 {
   const char *const name;
-  unsigned int flag;
+  uint64_t flag;
   size_t len;
   bool can_recover;
 } sanitizer_opts[];
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index e321d929fd0..8cc31db9bea 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -8018,7 +8018,7 @@ dump_default_def (FILE *file, tree def, int spc, dump_flags_t flags)
 static void
 print_no_sanitize_attr_value (FILE *file, tree value)
 {
-  unsigned int flags = tree_to_uhwi (value);
+  uint64_t flags = tree_to_uhwi (value);
   bool first = true;
   for (int i = 0; sanitizer_opts[i].name != NULL; ++i)
     {
-- 
2.17.1


* [RFC/RFT,V2 2/3] [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
@ 2023-03-25  8:11   ` Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 3/3] [PR102768] aarch64: Add " Dan Li
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-03-25  8:11 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

The KCFI sanitizer, enabled with -fsanitize=kcfi, implements a
forward-edge control flow integrity scheme for indirect calls, similar
to -fsanitize=kcfi [1] in LLVM.

At compile time, it appends a uniform type identifier before the
first instruction of each function and inserts check code before
each indirect call in a function with protection enabled.

At runtime, the check code runs before each indirect call and:
1. Dynamically obtains the typeid stored before the callee function.
2. Compares it to the expected typeid of the current call site (caller).
3. If the two match, continues with the indirect call. If not, raises
an exception, which the user (usually low-level code such as the OS
kernel) must handle.
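
The three steps above can be modelled in portable C. This is only an
illustrative sketch, not the code GCC emits: real KCFI reads the typeid
from memory just before the callee's entry point, whereas the
hypothetical annotated_fn struct below stores it next to a function
pointer, and the hypothetical kcfi_checked_call() returns an error code
where generated code would raise an exception.

```c
#include <stdint.h>

/* Hypothetical model: a function pointer paired with the typeid that
   would sit in front of the function's first instruction.  */
struct annotated_fn
{
  uint32_t type_id;
  int (*fn) (int);
};

#define KCFI_OK 0
#define KCFI_VIOLATION (-1)

/* Model of the check inserted before an indirect call.  */
int
kcfi_checked_call (const struct annotated_fn *callee, uint32_t expected,
                   int arg, int *result)
{
  /* Step 1: obtain the typeid stored with the callee.  */
  uint32_t actual = callee->type_id;

  /* Step 2: compare it to the typeid expected by the call site.  */
  if (actual != expected)
    return KCFI_VIOLATION;      /* Step 3, mismatch: do not call.  */

  /* Step 3, match: perform the indirect call.  */
  *result = callee->fn (arg);
  return KCFI_OK;
}

/* Example callee used to exercise the model.  */
int
add_one (int x)
{
  return x + 1;
}
```

Calling kcfi_checked_call() with a matching expected value performs
the call; any other value leaves *result untouched and reports a
violation.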

A typeid (type identifier) is a 32-bit constant on all platforms,
whose value depends on the function's prototype, and is invariant
across compilation units. However, different platforms may ignore
some of the bits to avoid conflicts with instructions.
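
To illustrate why such an identifier is the same in every compilation
unit, here is a minimal sketch that derives a 32-bit value purely from
a textual prototype. It is an assumption-laden stand-in: the patch
hashes GCC's type tree with inchash, while this sketch swaps in 32-bit
FNV-1a over a string, and prototype_typeid() is a hypothetical name.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string.  Used here only as a
   stand-in for the type hash computed by the patch.  */
uint32_t
prototype_typeid (const char *prototype)
{
  uint32_t hash = 2166136261u;  /* FNV offset basis.  */

  for (const unsigned char *p = (const unsigned char *) prototype; *p; p++)
    {
      hash ^= *p;               /* Mix in the next byte.  */
      hash *= 16777619u;        /* Multiply by the FNV prime.  */
    }
  return hash;
}
```

Because the result depends only on the prototype text, two translation
units that see the same prototype compute the same value without
sharing any state.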

If a program contains indirect calls to assembly functions, they
must be manually annotated with the expected type identifiers to
prevent errors. To make this easier, gcc generates a weak SHN_ABS
__kcfi_typeid_<function> symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one translation unit linked into the program
takes the function address. Note that the location of the typeid
(its offset from the function header) may differ between platforms,
as in [1]; this patch implements only the platform-independent part.
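
As a rough sketch of what the weak symbol looks like in the assembly
output, the helper below formats the two directives into a buffer. The
format strings are taken from output_decl_kcfi_typeid_symbol() in this
patch; the helper name format_kcfi_typeid_symbol() and the use of
snprintf() instead of writing to asm_out_file are assumptions made for
illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the .weak/.set pair for function NAME with TYPE_ID into BUF.
   Returns the value of snprintf, i.e. the length that was needed.  */
int
format_kcfi_typeid_symbol (char *buf, size_t size, const char *name,
                           uint32_t type_id)
{
  return snprintf (buf, size,
                   ".weak __kcfi_typeid_%s\n"
                   ".set __kcfi_typeid_%s, %#010x\n",
                   name, name, (unsigned int) type_id);
}
```

An assembly implementation of the function could then reference
__kcfi_typeid_<function> in a data directive placed before its entry
label; the exact directive and offset are platform-specific and are not
defined by this patch.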

[1]: https://reviews.llvm.org/D119296

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	PR c/102768
	* cfgexpand.cc (expand_call_stmt): Add CFI_TYPEID reg note
	for call insn.
	(pass_expand::execute): Check whether KCFI is enabled for the
	current function according to the attribute.
	* cgraphunit.cc (output_decl_kcfi_typeid_symbol): Output the
	CFI typeid corresponding to each external declaration when necessary.
	(output_decl_kcfi_typeid_symbols): Likewise.
	* combine.cc (distribute_notes): Add REG_CALL_CFI_TYPEID.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in: New hooks.
	* emit-rtl.cc (try_split): Add REG_CALL_CFI_TYPEID.
	* emit-rtl.h (struct rtl_data): Add is_kcfi_enabled.
	* final.cc (final_scan_insn_1): Output kcfi check for indirect calls.
	* flag-types.h (enum sanitize_code):
	Add SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY.
	* gimple.cc (gimple_call_set_cfi_typeid): New.
	(gimple_build_call_1): Calculate cfi_typeid when gcall is created.
	* gimple.h (struct GTY): Add new member cfi_typeid for gcall.
	* opts.cc (parse_sanitizer_options): Add kcfi and exclude
	SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY.
	* output.h (default_output_func_kcfi_typeid): Declare.
	(default_output_icall_kcfi_check): Declare.
	(default_calc_func_cfi_typeid): Declare.
	* reg-notes.def (REG_NOTE): Add new REG_NOTE CALL_CFI_TYPEID.
	* target.def: Add new hooks.
	* toplev.cc (process_options): Add CFI compile option check.
	* tree.cc (tree_node_sizes[): Add the unified tree type hash
	calculation functions.
	(append_unified_type_hash): Likewise.
	(initialize_unified_tree_type_hash_table): Likewise.
	(append_unified_type_name_hash): Likewise.
	(append_unified_type_precision_hash): Likewise.
	(append_unified_function_ret_and_args_hash): Likewise.
	(unified_type_hash): Likewise.
	(init_ttree): Likewise.
	* tree.h (unified_type_hash): Declare.
	* varasm.cc (assemble_start_function): Output the CFI typeid
	of each function.
	(default_output_func_kcfi_typeid): New.
	(default_output_icall_kcfi_check): New.
	(default_calc_func_cfi_typeid): New.
---
 gcc/cfgexpand.cc   |  26 ++++++++
 gcc/cgraphunit.cc  |  34 +++++++++++
 gcc/combine.cc     |   1 +
 gcc/doc/tm.texi    |  27 +++++++++
 gcc/doc/tm.texi.in |   8 +++
 gcc/emit-rtl.cc    |   1 +
 gcc/emit-rtl.h     |   4 ++
 gcc/final.cc       |  24 +++++++-
 gcc/flag-types.h   |   2 +
 gcc/gimple.cc      |  11 ++++
 gcc/gimple.h       |   5 +-
 gcc/opts.cc        |   4 +-
 gcc/output.h       |   3 +
 gcc/reg-notes.def  |   1 +
 gcc/target.def     |  38 ++++++++++++
 gcc/toplev.cc      |   4 ++
 gcc/tree.cc        | 144 +++++++++++++++++++++++++++++++++++++++++++++
 gcc/tree.h         |   1 +
 gcc/varasm.cc      |  26 ++++++++
 19 files changed, 361 insertions(+), 3 deletions(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d3cc77d2ca9..69c5fa30c7e 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -2845,6 +2845,18 @@ expand_call_stmt (gcall *stmt)
 	add_reg_note (last, REG_CALL_NOCF_CHECK, const0_rtx);
     }
 
+  if (flag_sanitize & SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY)
+    {
+      rtx_insn *last = get_last_insn ();
+      rtx datum = gen_rtx_CONST_INT (SImode, stmt->cfi_typeid);
+      while (!CALL_P (last)
+	     && last != before_call)
+	last = PREV_INSN (last);
+
+      if (last != before_call)
+	add_reg_note (last, REG_CALL_CFI_TYPEID, datum);
+    }
+
   mark_transaction_restart_calls (stmt);
 }
 
@@ -6923,10 +6935,16 @@ pass_expand::execute (function *fun)
   if (crtl->tail_call_emit)
     fixup_tail_calls ();
 
+  crtl->is_kcfi_enabled
+    = sanitize_flags_p (SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY,
+			current_function_decl);
+
   HOST_WIDE_INT patch_area_size, patch_area_entry;
   parse_and_check_patch_area (flag_patchable_function_entry, false,
 			      &patch_area_size, &patch_area_entry);
 
+  HOST_WIDE_INT patch_area_entry_org = patch_area_entry;
+
   tree patchable_function_entry_attr
     = lookup_attribute ("patchable_function_entry",
 			DECL_ATTRIBUTES (cfun->decl));
@@ -6954,6 +6972,14 @@ pass_expand::execute (function *fun)
       patch_area_entry = 0;
     }
 
+  if (crtl->is_kcfi_enabled
+      && (patch_area_entry_org != patch_area_entry))
+    {
+      error_at (DECL_SOURCE_LOCATION (current_function_decl),
+		"%<-fsanitize=kcfi%> conflicts with attribute "
+		"%<patchable_function_entry%>");
+    }
+
   crtl->patch_area_size = patch_area_size;
   crtl->patch_area_entry = patch_area_entry;
 
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 76d541755b8..23275b5ed36 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -2222,6 +2222,37 @@ ipa_passes (void)
   bitmap_obstack_release (NULL);
 }
 
+/* Output a weak symbol value of a decl's typeid (hash) to the
+   assembly file, like:
+	.weak __kcfi_typeid_A
+	.set __kcfi_typeid_A, 0x00000ADA
+   The typeid is platform-dependent, because the bits in the typeid that
+   conflict with the instruction set of the current platform are ignored.  */
+
+static void
+output_decl_kcfi_typeid_symbol (FILE *stream, tree fndecl)
+{
+  unsigned int hash = targetm.calc_func_cfi_typeid (TREE_TYPE (fndecl));
+  const char *name = IDENTIFIER_POINTER (DECL_NAME (fndecl));
+
+  fprintf (stream, ".weak __kcfi_typeid_%s\n", name);
+  fprintf (stream, ".set __kcfi_typeid_%s, %#010x\n", name, hash);
+}
+
+/* Calculate and output the symbols corresponding to the typeid of all
+   external declarations whose address is taken within the current
+   compilation unit.  If such a function is defined in assembly code,
+   its typeid can be obtained according to this symbol.  */
+
+static void
+output_decl_kcfi_typeid_symbols (void)
+{
+  struct cgraph_node *node;
+
+  FOR_EACH_FUNCTION (node)
+    if (!node->definition && node->address_taken)
+      output_decl_kcfi_typeid_symbol (asm_out_file, node->decl);
+}
 
 /* Weakrefs may be associated to external decls and thus not output
    at expansion time.  Emit all necessary aliases.  */
@@ -2339,6 +2370,9 @@ symbol_table::compile (void)
       }
 #endif
 
+  if (flag_sanitize & SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY)
+    output_decl_kcfi_typeid_symbols ();
+
   state = EXPANSION;
 
   /* Output first asm statements and anything ordered. The process
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 9a34ef847aa..ddba4b2ed7d 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -14273,6 +14273,7 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
 	case REG_SETJMP:
 	case REG_TM:
 	case REG_CALL_DECL:
+	case REG_CALL_CFI_TYPEID:
 	case REG_UNTYPED_CALL:
 	case REG_CALL_NOCF_CHECK:
 	  /* These notes must remain with the call.  It should not be
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c5006afc00d..d7e406f3386 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1003,6 +1003,14 @@ Return a value, with the same meaning as the C99 macro
 @code{FLT_EVAL_METHOD} that describes which excess precision should be
 applied.
 
+@deftypefn {Target Hook} {unsigned int} TARGET_CALC_FUNC_CFI_TYPEID (const_tree @var{fntype})
+This target hook is used to calculate a platform-dependent typeid
+of a function.
+Although the length of typeid is always 4 bytes on all platforms, different
+platforms may ignore some bits to avoid encoding conflicts with its
+instruction set, so a platform-dependent function is required.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return})
 Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or
 function return values.  The target hook should return the new mode
@@ -8721,6 +8729,20 @@ global; that is, available for reference from other files.
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID (FILE *@var{stream}, tree @var{decl})
+This target hook is used to output a function's typeid before
+its assembly code.
+For different platforms, the output format of typeid may be different,
+so a platform-dependent function is required.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK (rtx @var{reg}, unsigned int @var{value})
+This target hook is used to output the assembly codes to check the
+callee's typeid before an indirect call.
+For different platforms, the location of typeid may be different,
+so a platform-dependent function is required.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_ASSEMBLE_UNDEFINED_DECL (FILE *@var{stream}, const char *@var{name}, const_tree @var{decl})
 This target hook is a function to output to the stdio stream
 @var{stream} some commands that will declare the name associated with
@@ -12608,3 +12630,8 @@ type.
 This value is true if the target platform supports
 @option{-fsanitize=shadow-call-stack}.  The default value is false.
 @end deftypevr
+
+@deftypevr {Target Hook} bool TARGET_HAVE_KCFI
+This value is true if the target platform supports
+@option{-fsanitize=kcfi}.  The default value is false.
+@end deftypevr
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f869ddd5e5b..424d6ecd435 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -933,6 +933,8 @@ Return a value, with the same meaning as the C99 macro
 @code{FLT_EVAL_METHOD} that describes which excess precision should be
 applied.
 
+@hook TARGET_CALC_FUNC_CFI_TYPEID
+
 @hook TARGET_PROMOTE_FUNCTION_MODE
 
 @defmac PARM_BOUNDARY
@@ -5568,6 +5570,10 @@ You may wish to use @code{ASM_OUTPUT_SIZE_DIRECTIVE} and/or
 
 @hook TARGET_ASM_GLOBALIZE_DECL_NAME
 
+@hook TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID
+
+@hook TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK
+
 @hook TARGET_ASM_ASSEMBLE_UNDEFINED_DECL
 
 @defmac ASM_WEAKEN_LABEL (@var{stream}, @var{name})
@@ -8183,3 +8189,5 @@ maintainer is familiar with.
 @hook TARGET_GCOV_TYPE_SIZE
 
 @hook TARGET_HAVE_SHADOW_CALL_STACK
+
+@hook TARGET_HAVE_KCFI
diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 1e02ae254d0..5673bd93995 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -3924,6 +3924,7 @@ try_split (rtx pat, rtx_insn *trial, int last)
 	  fixup_args_size_notes (NULL, insn_last, get_args_size (note));
 	  break;
 
+	case REG_CALL_CFI_TYPEID:
 	case REG_CALL_DECL:
 	case REG_UNTYPED_CALL:
 	  gcc_assert (call_insn != NULL_RTX);
diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h
index 7a58fedb97a..83bf22a9e53 100644
--- a/gcc/emit-rtl.h
+++ b/gcc/emit-rtl.h
@@ -307,6 +307,10 @@ struct GTY(()) rtl_data {
      pass.  */
   bool bb_reorder_complete;
 
+  /* True if we should add kcfi indirect call check for the current
+     function.  */
+  bool is_kcfi_enabled;
+
   /* Like regs_ever_live, but 1 if a reg is set or clobbered from an
      asm.  Unlike regs_ever_live, elements of this array corresponding
      to eliminable regs (like the frame pointer) are set if an asm
diff --git a/gcc/final.cc b/gcc/final.cc
index a9868861bd2..ef9565516cb 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2823,6 +2823,29 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
 
 	current_output_insn = debug_insn = insn;
 
+	rtx_call_insn *call_insn = dyn_cast <rtx_call_insn *> (insn);
+
+	/* Do not insert kcfi checks for functions that disable kcfi.  */
+	if ((call_insn != NULL) && crtl->is_kcfi_enabled)
+	  {
+	    rtx x = call_from_call_insn (call_insn);
+	    x = XEXP (x, 0);
+	    if (x && MEM_P (x))
+	      {
+		x = XEXP (x, 0);
+		if (GET_CODE (x) == REG)
+		  {
+		    rtx note = find_reg_note (insn, REG_CALL_CFI_TYPEID,
+					      NULL_RTX);
+		    gcc_assert (note);
+
+		    unsigned value = (unsigned) XWINT (XEXP (note, 0), 0);
+
+		    targetm.asm_out.output_icall_kcfi_check (x, value);
+		  }
+	      }
+	  }
+
 	/* Find the proper template for this insn.  */
 	templ = get_insn_template (insn_code_number, insn);
 
@@ -2875,7 +2898,6 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
 	    && targetm.asm_out.unwind_emit)
 	  targetm.asm_out.unwind_emit (asm_out_file, insn);
 
-	rtx_call_insn *call_insn = dyn_cast <rtx_call_insn *> (insn);
 	if (call_insn != NULL)
 	  {
 	    rtx x = call_from_call_insn (call_insn);
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 0aa51e282fb..2c34f04b509 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -323,6 +323,8 @@ enum sanitize_code {
   SANITIZE_KERNEL_HWADDRESS = 1ULL << 30,
   /* Shadow Call Stack.  */
   SANITIZE_SHADOW_CALL_STACK = 1ULL << 31,
+  /* Control Flow Integrity for linux kernel.  */
+  SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY = 1ULL << 32,
   SANITIZE_MAX = 1ULL << 63,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
diff --git a/gcc/gimple.cc b/gcc/gimple.cc
index 9e62da4265b..6d0f840aad0 100644
--- a/gcc/gimple.cc
+++ b/gcc/gimple.cc
@@ -222,9 +222,16 @@ gimple_call_reset_alias_info (gcall *s)
    components of a GIMPLE_CALL statement to function FN with NARGS
    arguments.  */
 
+void
+gimple_call_set_cfi_typeid (gcall *call_stmt, unsigned int cfi_typeid)
+{
+  call_stmt->cfi_typeid = cfi_typeid;
+}
+
 static inline gcall *
 gimple_build_call_1 (tree fn, unsigned nargs)
 {
+  unsigned int cfi_typeid;
   gcall *s
     = as_a <gcall *> (gimple_build_with_ops (GIMPLE_CALL, ERROR_MARK,
 					     nargs + 3));
@@ -233,6 +240,10 @@ gimple_build_call_1 (tree fn, unsigned nargs)
   gimple_set_op (s, 1, fn);
   gimple_call_set_fntype (s, TREE_TYPE (TREE_TYPE (fn)));
   gimple_call_reset_alias_info (s);
+
+  cfi_typeid = targetm.calc_func_cfi_typeid (TREE_TYPE (TREE_TYPE (fn)));
+  gimple_call_set_cfi_typeid (s, cfi_typeid);
+
   return s;
 }
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 77a5a07e9b5..e3ce0fa2e44 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -362,12 +362,15 @@ struct GTY((tag("GSS_CALL")))
   struct pt_solution call_clobbered;
 
   /* [ WORD 14 ]  */
+  unsigned int cfi_typeid;
+
+  /* [ WORD 15 ]  */
   union GTY ((desc ("%1.subcode & GF_CALL_INTERNAL"))) {
     tree GTY ((tag ("0"))) fntype;
     enum internal_fn GTY ((tag ("GF_CALL_INTERNAL"))) internal_fn;
   } u;
 
-  /* [ WORD 15 ]
+  /* [ WORD 16 ]
      Operand vector.  NOTE!  This must always be the last field
      of this structure.  In particular, this means that this
      structure cannot be embedded inside another one.  */
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 11c5d70458f..71bfd786312 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2059,6 +2059,7 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true),
   SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true),
   SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false),
+  SANITIZER_OPT (kcfi, SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY, false),
   SANITIZER_OPT (all, ~0ULL, true),
 #undef SANITIZER_OPT
   { NULL, 0U, 0UL, false }
@@ -2186,7 +2187,8 @@ parse_sanitizer_options (const char *p, location_t loc, int scode,
 		else
 		  flags |= ~(SANITIZE_THREAD | SANITIZE_LEAK
 			     | SANITIZE_UNREACHABLE | SANITIZE_RETURN
-			     | SANITIZE_SHADOW_CALL_STACK);
+			     | SANITIZE_SHADOW_CALL_STACK
+			     | SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY);
 	      }
 	    else if (value)
 	      {
diff --git a/gcc/output.h b/gcc/output.h
index 6dea630913a..3fab6bb707f 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -606,6 +606,9 @@ extern bool default_binds_local_p_2 (const_tree);
 extern bool default_binds_local_p_3 (const_tree, bool, bool, bool, bool);
 extern void default_globalize_label (FILE *, const char *);
 extern void default_globalize_decl_name (FILE *, tree);
+extern void default_output_func_kcfi_typeid (FILE *, tree);
+extern void default_output_icall_kcfi_check (rtx reg, unsigned int value);
+extern unsigned int default_calc_func_cfi_typeid (const_tree);
 extern void default_emit_unwind_label (FILE *, tree, int, int);
 extern void default_emit_except_table_label (FILE *);
 extern void default_generate_internal_label (char *, const char *,
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index 704bc75b0e7..0fddd523c23 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -247,3 +247,4 @@ REG_NOTE (CALL_NOCF_CHECK)
 
 /* The values passed to callee, for debuginfo purposes.  */
 REG_NOTE (CALL_ARG_LOCATION)
+REG_NOTE (CALL_CFI_TYPEID)
diff --git a/gcc/target.def b/gcc/target.def
index d85adf36a39..ca11f572370 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -136,6 +136,26 @@ global; that is, available for reference from other files.\n\
 The default implementation uses the TARGET_ASM_GLOBALIZE_LABEL target hook.",
  void, (FILE *stream, tree decl), default_globalize_decl_name)
 
+/* Output the uniform type identifier in front of a function
+   when kcfi is enabled.  */
+DEFHOOK
+(output_func_kcfi_typeid,
+ "This target hook is used to output a function's typeid before\n\
+its assembly code.\n\
+For different platforms, the output format of typeid may be different,\n\
+so a platform-dependent function is required.",
+ void, (FILE *stream, tree decl), default_output_func_kcfi_typeid)
+
+/* Output the assembly codes to check an indirect call's cfi typeid
+   when kcfi is enabled.  */
+DEFHOOK
+(output_icall_kcfi_check,
+ "This target hook is used to output the assembly codes to check the\n\
+callee's typeid before an indirect call.\n\
+For different platforms, the location of typeid may be different,\n\
+so a platform-dependent function is required.",
+ void, (rtx reg, unsigned int value), default_output_icall_kcfi_check)
+
 /* Output code that will declare an external variable.  */
 DEFHOOK
 (assemble_undefined_decl,
@@ -4522,6 +4542,16 @@ by a subtarget.",
  unsigned HOST_WIDE_INT, (void),
  NULL)
 
+/* Calculate the typeid of a function's type.  */
+DEFHOOK
+(calc_func_cfi_typeid,
+ "This target hook is used to calculate a platform-dependent typeid\n\
+of a function.\n\
+Although the length of typeid is always 4 bytes on all platforms, different\n\
+platforms may ignore some bits to avoid encoding conflicts with its\n\
+instruction set, so a platform-dependent function is required.",
+ unsigned int, (const_tree fntype), default_calc_func_cfi_typeid)
+
 /* Functions relating to calls - argument passing, returns, etc.  */
 /* Members of struct call have no special macro prefix.  */
 HOOK_VECTOR (TARGET_CALLS, calls)
@@ -7111,6 +7141,14 @@ DEFHOOKPOD
 @option{-fsanitize=shadow-call-stack}.  The default value is false.",
  bool, false)
 
+/* This value represents whether the control flow integrity is implemented
+   on the target platform.  */
+DEFHOOKPOD
+(have_kcfi,
+ "This value is true if the target platform supports\n\
+@option{-fsanitize=kcfi}.  The default value is false.",
+ bool, false)
+
 /* Close the 'struct gcc_target' definition.  */
 HOOK_VECTOR_END (C90_EMPTY_HACK)
 
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index 055e0642f77..f7feb40e785 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -1665,6 +1665,10 @@ process_options (bool no_backend)
 		  "requires %<-fno-exceptions%>");
     }
 
+  if (flag_sanitize & SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY)
+    if (!targetm.have_kcfi)
+      sorry ("%<-fsanitize=kcfi%> not supported in current platform");
+
   HOST_WIDE_INT patch_area_size, patch_area_start;
   parse_and_check_patch_area (flag_patchable_function_entry, false,
 			      &patch_area_size, &patch_area_start);
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 4cf3785270b..a493812ebcc 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -137,6 +137,8 @@ static uint64_t tree_code_counts[MAX_TREE_CODES];
 uint64_t tree_node_counts[(int) all_kinds];
 uint64_t tree_node_sizes[(int) all_kinds];
 
+static unsigned int unified_tree_type_hash_table[MAX_TREE_CODES];
+
 /* Keep in sync with tree.h:enum tree_node_kind.  */
 static const char * const tree_node_kind_names[] = {
   "decls",
@@ -252,6 +254,8 @@ static void print_type_hash_statistics (void);
 static void print_debug_expr_statistics (void);
 static void print_value_expr_statistics (void);
 
+static void append_unified_type_hash (const_tree type, inchash::hash &hstate);
+
 tree global_trees[TI_MAX];
 tree integer_types[itk_none];
 
@@ -694,6 +698,143 @@ initialize_tree_contains_struct (void)
   gcc_assert (tree_contains_struct[NAMELIST_DECL][TS_DECL_COMMON]);
 }
 
+static void
+initialize_unified_tree_type_hash_table (void)
+{
+  unified_tree_type_hash_table[OFFSET_TYPE] = 10;
+  unified_tree_type_hash_table[ENUMERAL_TYPE] = 20;
+  unified_tree_type_hash_table[BOOLEAN_TYPE] = 30;
+  unified_tree_type_hash_table[INTEGER_TYPE] = 40;
+  unified_tree_type_hash_table[REAL_TYPE] = 50;
+  unified_tree_type_hash_table[POINTER_TYPE] = 60;
+  unified_tree_type_hash_table[REFERENCE_TYPE] = 70;
+  unified_tree_type_hash_table[NULLPTR_TYPE] = 80;
+  unified_tree_type_hash_table[FIXED_POINT_TYPE] = 90;
+  unified_tree_type_hash_table[COMPLEX_TYPE] = 100;
+  unified_tree_type_hash_table[VECTOR_TYPE] = 110;
+  unified_tree_type_hash_table[ARRAY_TYPE] = 120;
+  unified_tree_type_hash_table[RECORD_TYPE] = 130;
+  unified_tree_type_hash_table[UNION_TYPE] = 140;
+  unified_tree_type_hash_table[QUAL_UNION_TYPE] = 150;
+  unified_tree_type_hash_table[VOID_TYPE] = 160;
+  unified_tree_type_hash_table[FUNCTION_TYPE] = 170;
+  unified_tree_type_hash_table[METHOD_TYPE] = 180;
+  unified_tree_type_hash_table[LANG_TYPE] = 190;
+  unified_tree_type_hash_table[OPAQUE_TYPE] = 200;
+}
+
+static void
+append_unified_type_name_hash (const_tree type, inchash::hash &hstate)
+{
+  tree n = TYPE_NAME (TYPE_MAIN_VARIANT (type));
+
+  if (!n)
+    return;
+
+  if (TREE_CODE (n) != IDENTIFIER_NODE)
+    n = DECL_NAME (n);
+
+  hstate.add ((const void *) IDENTIFIER_POINTER (n), IDENTIFIER_LENGTH (n));
+}
+
+static void
+append_unified_type_precision_hash (const_tree type, inchash::hash &hstate)
+{
+  unsigned HOST_WIDE_INT size = TYPE_PRECISION (type);
+
+  hstate.add_hwi (size);
+}
+
+/* Add the return and all parameter types of the function
+   to the hash calculation.  */
+
+static void
+append_unified_function_ret_and_args_hash (const_tree fntype,
+					   inchash::hash &hstate)
+{
+  const_tree arg_type;
+  function_args_iterator args_iter;
+
+  append_unified_type_hash (TREE_TYPE (fntype), hstate);
+
+  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
+    {
+      if (TYPE_READONLY (arg_type) || TYPE_VOLATILE (arg_type))
+	{
+	  int quals = TYPE_QUALS (arg_type)
+		      & ~TYPE_QUAL_CONST & ~TYPE_QUAL_VOLATILE;
+
+	  arg_type = build_qualified_type (CONST_CAST_TREE (arg_type), quals);
+	}
+      append_unified_type_hash (arg_type, hstate);
+    }
+}
+
+static void
+append_unified_type_hash (const_tree type, inchash::hash &hstate)
+{
+  enum tree_code type_code = TREE_CODE (type);
+  unsigned int u_hash = unified_tree_type_hash_table[type_code];
+
+  /* Make sure all type nodes have a unique initial hash.  */
+  if (!u_hash)
+    gcc_unreachable ();
+
+  hstate.add_int (u_hash);
+
+  /* Extra information about the type involved in the hash calculation.  */
+  switch (type_code)
+    {
+    case VOID_TYPE:
+    case BOOLEAN_TYPE:
+      break;
+
+    case INTEGER_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      append_unified_type_precision_hash (type, hstate);
+      break;
+
+    case ENUMERAL_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      append_unified_type_precision_hash (type, hstate);
+      break;
+
+    case REAL_TYPE:
+      append_unified_type_precision_hash (TYPE_MAIN_VARIANT (type), hstate);
+      break;
+
+    case POINTER_TYPE:
+    case REFERENCE_TYPE:
+    case ARRAY_TYPE:
+      append_unified_type_hash (TREE_TYPE (type), hstate);
+      break;
+
+    case UNION_TYPE:
+    case RECORD_TYPE:
+      append_unified_type_name_hash (type, hstate);
+      break;
+
+    case FUNCTION_TYPE:
+      append_unified_function_ret_and_args_hash (type, hstate);
+      break;
+
+    default:
+      break;
+    }
+}
+
+/* Calculate a hash of the type node that is invariant across
+   compilation units.  */
+
+hashval_t
+unified_type_hash (const_tree type)
+{
+  inchash::hash hstate;
+
+  append_unified_type_hash (type, hstate);
+
+  return hstate.end ();
+}
 
 /* Init tree.cc.  */
 
@@ -723,6 +864,9 @@ init_ttree (void)
 
   /* Initialize the tree_contains_struct array.  */
   initialize_tree_contains_struct ();
+
+  initialize_unified_tree_type_hash_table ();
+
   lang_hooks.init_ts ();
 }
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 8844471e9a5..dd8e0bfba7b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4813,6 +4813,7 @@ extern tree build_variant_type_copy (tree CXX_MEM_STAT_INFO);
 
 extern hashval_t type_hash_canon_hash (tree);
 extern tree type_hash_canon (unsigned int, tree);
+extern hashval_t unified_type_hash (const_tree);
 
 extern tree convert (tree, tree);
 extern tree size_in_bytes_loc (location_t, const_tree);
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 021e912a37c..5ae68a77142 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1956,6 +1956,11 @@ assemble_start_function (tree decl, const char *fnname)
   if (!DECL_IGNORED_P (decl))
     (*debug_hooks->begin_function) (decl);
 
+  /* Regardless of whether the function can be called indirectly,
+     a typeid is always required before the function.  */
+  if (flag_sanitize & SANITIZE_KERNEL_CONTROL_FLOW_INTEGRITY)
+    targetm.asm_out.output_func_kcfi_typeid (asm_out_file, decl);
+
   /* Make function name accessible from other files, if appropriate.  */
 
   if (TREE_PUBLIC (decl))
@@ -7674,6 +7679,27 @@ default_globalize_decl_name (FILE * stream, tree decl)
   targetm.asm_out.globalize_label (stream, name);
 }
 
+/* Default function to output the function's kcfi typeid.  */
+void
+default_output_func_kcfi_typeid (FILE * stream ATTRIBUTE_UNUSED,
+				 tree decl ATTRIBUTE_UNUSED)
+{
+}
+
+/* Default function to output the kcfi check for an indirect call.  */
+void
+default_output_icall_kcfi_check (rtx reg ATTRIBUTE_UNUSED,
+				 unsigned int value ATTRIBUTE_UNUSED)
+{
+}
+
+/* Default function to calculate the typeid of a function type.  */
+unsigned int
+default_calc_func_cfi_typeid (const_tree fntype ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
 /* Default function to output a label for unwind information.  The
    default is to do nothing.  A target that needs nonlocal labels for
    unwind information must provide its own function to do this.  */
-- 
2.17.1



* [RFC/RFT,V2 3/3] [PR102768] aarch64: Add support for Kernel Control Flow Integrity
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
  2023-03-25  8:11   ` [RFC/RFT,V2 2/3] [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity Dan Li
@ 2023-03-25  8:11   ` Dan Li
  2023-06-21 21:54   ` [RFC/RFT,V2 0/3] Add compiler " Kees Cook
  2023-07-19  8:41   ` Dan Li
  4 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-03-25  8:11 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

On the AArch64 platform, the typeid can be inserted directly in front
of the function header (at offset patch_area_entry + 4). This assumes
that patch_area_entry is the same for all functions.

For all functions that will not be called indirectly, insert the
reserved RESERVED_CFI_TYPEID (0x0) as typeid in front of them.
Otherwise, an attacker could use the instruction/data before such a
function as a typeid to bypass CFI.
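
The selection rule can be sketched as follows. This is a simplified
model under stated assumptions: the patch walks the cgraph node plus
its thunks and aliases via call_for_symbol_thunks_and_aliases(), while
the hypothetical typeid_to_emit() below looks at two flags of a single
function only.

```c
#include <stdbool.h>
#include <stdint.h>

#define RESERVED_CFI_TYPEID 0x0U

/* Choose the typeid to place in front of a function.  A function is
   treated as indirectly callable if it is externally visible or its
   address is taken; every other function gets the reserved value.  */
uint32_t
typeid_to_emit (bool externally_visible, bool address_taken,
                uint32_t computed_typeid)
{
  if (externally_visible || address_taken)
    return computed_typeid;

  return RESERVED_CFI_TYPEID;
}
```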

All typeids ignore some bits (& AARCH64_UNALLOCATED_INSN_MASK) to
avoid conflicts with the AArch64 instruction set (see AAPCS64 for
details).
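
A minimal sketch of the masking step, using the constants defined in
this patch. The function name finalize_typeid() is hypothetical; in
the patch this logic lives in aarch64_calc_func_cfi_typeid() and
operates on the result of unified_type_hash().

```c
#include <stdint.h>

#define RESERVED_CFI_TYPEID 0x0U
#define DEFAULT_CFI_TYPEID 0x00000ADAU
#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU

/* Turn a raw 32-bit type hash into an AArch64 typeid.  */
uint32_t
finalize_typeid (uint32_t raw_hash)
{
  /* The mask clears bits 27 and 28 of the hash.  */
  uint32_t id = raw_hash & AARCH64_UNALLOCATED_INSN_MASK;

  /* The reserved value marks functions that cannot be called
     indirectly, so a real typeid must never be equal to it.  */
  if (id == RESERVED_CFI_TYPEID)
    id = DEFAULT_CFI_TYPEID;

  return id;
}
```

For example, a raw hash of 0x18000000 has only bits 27 and 28 set, so
masking yields 0x0 and the function returns DEFAULT_CFI_TYPEID.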

Signed-off-by: Dan Li <ashimida.1990@gmail.com>

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition.
	(DEFAULT_CFI_TYPEID): Likewise.
	(AARCH64_UNALLOCATED_INSN_MASK): Likewise.
	(aarch64_calc_func_cfi_typeid): Platform-dependent CFI function.
	(cgraph_indirectly_callable): Determine whether a function may
	be called indirectly.
	(aarch64_output_func_kcfi_typeid): Platform-dependent CFI function.
	(aarch64_output_icall_kcfi_check): Likewise.
	(TARGET_HAVE_KCFI): New hook.
	(TARGET_CALC_FUNC_CFI_TYPEID): Likewise.
	(TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID): Likewise.
	(TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK): Likewise.
	* doc/invoke.texi: Document -fsanitize=kcfi.
---
 gcc/config/aarch64/aarch64.cc | 166 ++++++++++++++++++++++++++++++++++
 gcc/doc/invoke.texi           |  36 ++++++++
 2 files changed, 202 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5c9e7791a12..5b55541d437 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -5450,6 +5450,160 @@ aarch64_output_sve_addvl_addpl (rtx offset)
   return buffer;
 }
 
+/* Reserved for all functions that cannot be called indirectly.  */
+#define RESERVED_CFI_TYPEID 0x0U
+
+/* If the typeid of a function that can be called indirectly is equal to
+   RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID.  */
+#define DEFAULT_CFI_TYPEID 0x00000ADAU
+
+/* Mask of reserved and unallocated instructions in AArch64 platform.  */
+#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFFFFFU
+
+static unsigned int
+aarch64_calc_func_cfi_typeid (const_tree fntype)
+{
+  unsigned int hash;
+
+  /* A typeid value may happen to equal the encoding of an instruction.
+     If an attacker can find a word in the code that matches a typeid,
+     the address right after it becomes a usable jump target.  So a
+     platform-specific mask is applied when generating a typeid to
+     avoid such collisions as much as possible.  */
+  hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK;
+
+  /* RESERVED_CFI_TYPEID is reserved for functions that cannot
+     be called indirectly.  */
+  if (hash == RESERVED_CFI_TYPEID)
+    hash = DEFAULT_CFI_TYPEID;
+
+  return hash;
+}
+
+static bool
+cgraph_indirectly_callable (struct cgraph_node *node,
+			    void *data ATTRIBUTE_UNUSED)
+{
+  if (node->externally_visible || node->address_taken)
+    return true;
+
+  return false;
+}
+
+static void
+aarch64_output_func_kcfi_typeid (FILE * stream, tree decl)
+{
+  struct cgraph_node *node;
+  unsigned int cur_func_typeid;
+
+  node = cgraph_node::get (decl);
+
+  if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable,
+						 NULL, true))
+    /* CFI's typeid check always considers that there is a typeid before the
+       target function, so it is also necessary to output typeid for functions
+       that cannot be called indirectly to prevent attackers from bypassing
+       CFI by using instructions/data before those functions.
+       The typeid inserted before such a function is RESERVED_CFI_TYPEID,
+       and the calculation of the typeid must ensure that this value is always
+       reserved.  */
+    cur_func_typeid = RESERVED_CFI_TYPEID;
+  else
+    cur_func_typeid = aarch64_calc_func_cfi_typeid (TREE_TYPE (decl));
+
+  fprintf (stream, "__kcfi_%s:\n", get_name (decl));
+  fprintf (stream, "\t.4byte %#010x\n", cur_func_typeid);
+}
+
+/* This function outputs assembly instructions that check the cfi typeid
+   before an indirect call (blr Xn).  They may clobber x16, x17 and x9
+   (according to the AAPCS64 specification, these registers do not need
+   to be preserved across a function call).
+   The assembly code output by this function is as follows:
+	ldur    w16, [x1, #-4]
+	movk    w17, #13570
+	movk    w17, #17309, lsl #16
+	cmp     w16, w17
+	b.eq	.Lkcfi8
+	brk     #0x8221
+.Lkcfi8:
+	blr     x1
+ */
+
+static void
+aarch64_output_icall_kcfi_check (rtx reg, unsigned int value)
+{
+  unsigned int addr_reg, scratch_reg1, scratch_reg2;
+  unsigned int esr, addr_index, type_index;
+  char label_buf[256];
+  const char *label_ptr;
+  unsigned HOST_WIDE_INT patch_area_entry = crtl->patch_area_entry;
+  rtx_code_label * tmp_label = gen_label_rtx ();
+
+  gcc_assert (GET_CODE (reg) == REG);
+
+  addr_reg = REGNO (reg);
+
+  /* The typeid read from the front of the callee is saved in the
+     register specified by scratch_reg1, the default is R16_REGNUM.  */
+  scratch_reg1 = R16_REGNUM;
+
+  /* The expected typeid of the caller is saved in the register
+     specified by scratch_reg2, which defaults to R17_REGNUM.  */
+  scratch_reg2 = R17_REGNUM;
+
+  gcc_assert (GP_REGNUM_P (addr_reg));
+
+  /* If one of the scratch registers is used for the call target,
+     we can clobber another caller-saved temporary register instead
+     (in this case, R9_REGNUM) as the check is immediately followed
+     by the call instruction.  */
+  if (addr_reg == R16_REGNUM)
+    {
+      scratch_reg1 = R9_REGNUM;
+    }
+  else if (addr_reg == R17_REGNUM)
+    {
+      scratch_reg2 = R9_REGNUM;
+    }
+
+  gcc_assert ((scratch_reg1 != addr_reg) && (scratch_reg2 != addr_reg));
+
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, "Lkcfi",
+			       CODE_LABEL_NUMBER (tmp_label));
+  label_ptr = targetm.strip_name_encoding (label_buf);
+
+  /* The offset of callee's typeid needs to be adjusted according to
+     patch_area_entry.  This assumes that patch_area_entry is the
+     same for all functions.  */
+  fprintf (asm_out_file, "\tldur\tw%d, [x%d, #-%ld]\n",
+	   scratch_reg1, addr_reg, (long) (patch_area_entry * 4 + 4));
+
+  fprintf (asm_out_file, "\tmovk\tw%d, #%d\n", scratch_reg2, value & 0xFFFF);
+
+  fprintf (asm_out_file, "\tmovk\tw%d, #%d, lsl #16\n",
+	   scratch_reg2, (value >> 16) & 0xFFFF);
+
+  fprintf (asm_out_file, "\tcmp\tw%d, w%d\n", scratch_reg1, scratch_reg2);
+
+  fprintf (asm_out_file, "\tb.eq\t%s\n", label_ptr);
+
+  /* The base ESR for brk is 0x8000 and the register information is
+     encoded in bits 0-9 as follows:
+     - 0-4: n, where the register Xn contains the callee address
+     - 5-9: m, where the register Wm contains the expected typeid
+     where n and m are in [0, 30].
+  */
+  addr_index = addr_reg - R0_REGNUM;
+  type_index = scratch_reg2 - R0_REGNUM;
+  esr = 0x8000 | ((type_index & 31) << 5) | (addr_index & 31);
+  fprintf (asm_out_file, "\tbrk\t#0x%x\n", esr);
+
+  fprintf (asm_out_file, "%s:\n", label_ptr);
+
+  return;
+}
+
 /* Return true if X is a valid immediate for an SVE vector INC or DEC
    instruction.  If it is, store the number of elements in each vector
    quadword in *NELTS_PER_VQ_OUT (if nonnull) and store the multiplication
@@ -27823,6 +27977,18 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_HAVE_SHADOW_CALL_STACK
 #define TARGET_HAVE_SHADOW_CALL_STACK true
 
+#undef TARGET_HAVE_KCFI
+#define TARGET_HAVE_KCFI true
+
+#undef TARGET_CALC_FUNC_CFI_TYPEID
+#define TARGET_CALC_FUNC_CFI_TYPEID aarch64_calc_func_cfi_typeid
+
+#undef TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID
+#define TARGET_ASM_OUTPUT_FUNC_KCFI_TYPEID aarch64_output_func_kcfi_typeid
+
+#undef TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK
+#define TARGET_ASM_OUTPUT_ICALL_KCFI_CHECK aarch64_output_icall_kcfi_check
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff6c338bedb..1b2ba7a0f29 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15736,6 +15736,42 @@ to turn off exceptions.
 See @uref{https://clang.llvm.org/docs/ShadowCallStack.html} for more
 details.
 
+@item -fsanitize=kcfi
+@opindex fsanitize=kcfi
+The KCFI sanitizer, enabled with @option{-fsanitize=kcfi}, implements a
+forward-edge control flow integrity scheme for indirect calls.  It
+attaches a type identifier (@code{typeid}) to each function and injects
+verification code before indirect calls.
+
+A @code{typeid} is a 32-bit constant derived mainly from the return type
+and all parameter types of the function, so it is the same in every
+compilation.  Since the value of a @code{typeid} may collide with an
+instruction encoding on the target platform, some bits may be masked
+off, and which bits depends on the platform.
+
+At compile time, the compiler inserts checking code before every indirect
+call.  At run time, before the call is made, that code checks that the
+@code{typeid} stored before the callee matches the @code{typeid}
+expected by the caller.  If they do not match, a trapping instruction
+is executed, such as @code{brk} on aarch64.  This mechanism is mainly
+designed for low-level code, such as operating systems, and the
+system needs to handle the resulting exceptions itself.
+
+If a program contains indirect calls to assembly functions, they must be
+manually annotated with the expected type identifiers to prevent errors.
+To make this easier, CFI generates a weak SHN_ABS
+@code{__kcfi_typeid_<function>} symbol for each address-taken function
+declaration, which can be used to annotate functions in assembly as long
+as at least one C translation unit linked into the program takes the
+function address.
+
+Currently this feature only supports the aarch64 platform, mainly for
+the Linux kernel.  Users who want to use this feature on other systems
+need to provide their own support for the exception handling.
+
+See @uref{https://clang.llvm.org/docs/ControlFlowIntegrity.html} for
+more details.
+
 @item -fsanitize=thread
 @opindex fsanitize=thread
 Enable ThreadSanitizer, a fast data race detector.
-- 
2.17.1
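
As a companion to the brk immediate layout described in the patch
comment (base 0x8000, bits 0-4 for the register holding the callee
address, bits 5-9 for the register holding the expected typeid), here
is a small sketch of the encoding together with the decoding a trap
handler would need. All function names are hypothetical and exist only
for illustration; only the bit layout is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Base immediate for the KCFI brk, as stated in the patch comment.  */
#define SKETCH_KCFI_BRK_BASE 0x8000U

/* Encode: bits 0-4 hold n (Xn is the callee address), bits 5-9 hold m
   (Wm is the expected typeid).  Both indices must be in [0, 30].  */
static inline uint32_t
sketch_kcfi_brk_imm (unsigned addr_index, unsigned type_index)
{
  assert (addr_index <= 30 && type_index <= 30);
  return SKETCH_KCFI_BRK_BASE | ((type_index & 31) << 5) | (addr_index & 31);
}

/* Decode the index of the register that held the callee address.  */
static inline unsigned
sketch_kcfi_brk_addr_reg (uint32_t imm)
{
  return imm & 31;
}

/* Decode the index of the register that held the expected typeid.  */
static inline unsigned
sketch_kcfi_brk_type_reg (uint32_t imm)
{
  return (imm >> 5) & 31;
}
```

With the callee in x1 and the expected typeid in w17 this yields 0x8221,
which matches the sample sequence in the patch comment. When the callee
address is in x17, the patch moves the expected typeid to w9, and the
immediate becomes 0x8131.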


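The typeid lookup that the emitted ldur performs can also be modelled in
plain C. This is only a simulation under stated assumptions: the
"image" is an ordinary array of 32-bit words rather than real code, and
sketch_typeid_byte_offset and sketch_kcfi_check are hypothetical names.
The offset formula (patch_area_entry * 4 + 4) is the one used in the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance from the function entry back to its typeid word:
   one 4-byte word per patchable-area entry, plus the typeid itself.  */
static inline size_t
sketch_typeid_byte_offset (unsigned patch_area_entry)
{
  return (size_t) patch_area_entry * 4 + 4;
}

/* Return 1 if the word stored in front of ENTRY equals EXPECTED,
   0 otherwise.  ENTRY must point into an array that really contains
   the typeid word and the patch area before it.  */
static inline int
sketch_kcfi_check (const uint32_t *entry, unsigned patch_area_entry,
		   uint32_t expected)
{
  assert (entry != NULL);
  const uint32_t *slot = entry - sketch_typeid_byte_offset (patch_area_entry) / 4;
  return *slot == expected;
}
```

For example, with two patchable NOPs before the entry, the typeid sits
12 bytes (three words) before the first instruction of the function, so
a check against an image laid out as { typeid, nop, nop, entry } has to
start from the fourth word.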
^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC/RFT,V2 0/3] Add compiler support for Kernel Control Flow Integrity
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
                     ` (2 preceding siblings ...)
  2023-03-25  8:11   ` [RFC/RFT,V2 3/3] [PR102768] aarch64: Add " Dan Li
@ 2023-06-21 21:54   ` Kees Cook
  2023-07-19  8:20     ` Dan Li
  2023-07-19  8:41   ` Dan Li
  4 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2023-06-21 21:54 UTC (permalink / raw)
  To: Dan Li
  Cc: Qing Zhao, gcc-patches, Richard Sandiford, Masahiro Yamada,
	Michal Marek, Nick Desaulniers, Catalin Marinas, Will Deacon,
	Sami Tolvanen, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

On Sat, Mar 25, 2023 at 01:11:14AM -0700, Dan Li wrote:
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
> 
> Any suggestion please let me know :).

Hi Dan,

It's been a couple months, and I didn't see any other feedback on this
proposal. I was curious what the status of this work is. Are you able to
attend GNU Cauldron[1] this year? I'd love to see this get some traction
in GCC.

Thanks!

-Kees

[1] https://gcc.gnu.org/wiki/cauldron2023

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/RFT,V2 0/3] Add compiler support for Kernel Control Flow Integrity
  2023-06-21 21:54   ` [RFC/RFT,V2 0/3] Add compiler " Kees Cook
@ 2023-07-19  8:20     ` Dan Li
  0 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-07-19  8:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Qing Zhao, gcc-patches, Richard Sandiford, Masahiro Yamada,
	Michal Marek, Nick Desaulniers, Catalin Marinas, Will Deacon,
	Sami Tolvanen, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du, linux-kbuild,
	linux-kernel, linux-arm-kernel, llvm, linux-hardening

Hi Kees,

Sincerely sorry, I just saw this email.
Embarrassingly, due to another job change, my plan was postponed again :(.

I may not be able to attend this year's GCC meeting. Is there any other
way to let this get some traction in GCC? I really hope someone can help
with this topic.

BTW, I'm still looking at this and plan to finish it by the end of this
year, but it's taking too long and there's a lot of uncertainty, so
please just consider this only as a backup option.

Thanks,
Dan.

On Thu, 22 Jun 2023 at 05:54, Kees Cook <keescook@chromium.org> wrote:
>
> On Sat, Mar 25, 2023 at 01:11:14AM -0700, Dan Li wrote:
> > This series of patches is mainly used to support the control flow
> > integrity protection of the linux kernel [1], which is similar to
> > -fsanitize=kcfi in clang 16.0 [2,3].
> >
> > Any suggestion please let me know :).
>
> Hi Dan,
>
> It's been a couple months, and I didn't see any other feedback on this
> proposal. I was curious what the status of this work is. Are you able to
> attend GNU Cauldron[1] this year? I'd love to see this get some traction
> in GCC.
>
> Thanks!
>
> -Kees
>
> [1] https://gcc.gnu.org/wiki/cauldron2023
>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/RFT,V2 0/3] Add compiler support for Kernel Control Flow Integrity
  2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
                     ` (3 preceding siblings ...)
  2023-06-21 21:54   ` [RFC/RFT,V2 0/3] Add compiler " Kees Cook
@ 2023-07-19  8:41   ` Dan Li
  4 siblings, 0 replies; 16+ messages in thread
From: Dan Li @ 2023-07-19  8:41 UTC (permalink / raw)
  To: gcc-patches, Richard Sandiford, Masahiro Yamada, Michal Marek,
	Nick Desaulniers, Catalin Marinas, Will Deacon, Sami Tolvanen,
	Kees Cook, Nathan Chancellor, Tom Rix, Peter Zijlstra,
	Paul E. McKenney, Mark Rutland, Josh Poimboeuf,
	Frederic Weisbecker, Eric W. Biederman, Dan Li, Marco Elver,
	Christophe Leroy, Song Liu, Andrew Morton, Uros Bizjak,
	Kumar Kartikeya Dwivedi, Juergen Gross, Luis Chamberlain,
	Borislav Petkov, Masami Hiramatsu, Dmitry Torokhov, Aaron Tomlin,
	Kalesh Singh, Yuntao Wang, Changbin Du
  Cc: linux-kbuild, linux-kernel, linux-arm-kernel, llvm, linux-hardening

Hi All,

Embarrassingly, due to personal reasons, I may not be able to complete
the series of patches on the forward side of GCC CFI for the time being.

Please forgive me for not realizing that I should have sent this help
email a long time ago :(

This topic has been delayed for a long time, and I would be very grateful
if someone can help complete this series of patches.

BTW, please let me know if there are more groups I can cc for help.

Thanks!
Dan.

On Sat, 25 Mar 2023 at 16:11, Dan Li <ashimida.1990@gmail.com> wrote:
>
> This series of patches is mainly used to support the control flow
> integrity protection of the linux kernel [1], which is similar to
> -fsanitize=kcfi in clang 16.0 [2,3].
>
> Any suggestion please let me know :).
>
> Thanks, Dan.
>
> [1] https://lore.kernel.org/all/20220908215504.3686827-1-samitolvanen@google.com/
> [2] https://clang.llvm.org/docs/ControlFlowIntegrity.html
> [3] https://reviews.llvm.org/D119296
>
> Signed-off-by: Dan Li <ashimida.1990@gmail.com>
>
> ---
> Dan Li (3):
>   [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to
>     64 bits to support more features
>   [PR102768] Support CFI: Add basic support for Kernel Control Flow
>     Integrity
>   [PR102768] aarch64: Add support for Kernel Control Flow Integrity
>
>  gcc/asan.h                    |   4 +-
>  gcc/c-family/c-attribs.cc     |  10 +-
>  gcc/c-family/c-common.h       |   2 +-
>  gcc/c/c-parser.cc             |   4 +-
>  gcc/cfgexpand.cc              |  26 ++++++
>  gcc/cgraphunit.cc             |  34 +++++++
>  gcc/combine.cc                |   1 +
>  gcc/common.opt                |   4 +-
>  gcc/config/aarch64/aarch64.cc | 166 ++++++++++++++++++++++++++++++++++
>  gcc/cp/typeck.cc              |   2 +-
>  gcc/doc/invoke.texi           |  36 ++++++++
>  gcc/doc/tm.texi               |  27 ++++++
>  gcc/doc/tm.texi.in            |   8 ++
>  gcc/dwarf2asm.cc              |   2 +-
>  gcc/emit-rtl.cc               |   1 +
>  gcc/emit-rtl.h                |   4 +
>  gcc/final.cc                  |  24 ++++-
>  gcc/flag-types.h              |  67 +++++++-------
>  gcc/gimple.cc                 |  11 +++
>  gcc/gimple.h                  |   5 +-
>  gcc/opt-suggestions.cc        |   2 +-
>  gcc/opts.cc                   |  26 +++---
>  gcc/opts.h                    |   8 +-
>  gcc/output.h                  |   3 +
>  gcc/reg-notes.def             |   1 +
>  gcc/target.def                |  38 ++++++++
>  gcc/toplev.cc                 |   4 +
>  gcc/tree-cfg.cc               |   2 +-
>  gcc/tree.cc                   | 144 +++++++++++++++++++++++++++++
>  gcc/tree.h                    |   1 +
>  gcc/varasm.cc                 |  26 ++++++
>  31 files changed, 627 insertions(+), 66 deletions(-)
>
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-07-19  8:41 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-19  5:54 [RFC/RFT 0/3] Add compiler support for Control Flow Integrity Dan Li
2022-12-19  5:54 ` [RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
2022-12-19  5:54 ` [RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity Dan Li
2022-12-19  5:54 ` [RFC/RFT 3/3] [PR102768] aarch64: Add support " Dan Li
2023-02-09  1:48 ` [RFC/RFT 0/3] Add compiler " Hongtao Liu
2023-02-10 16:18   ` Dan Li
2023-02-13  1:39     ` Hongtao Liu
2023-02-09  5:32 ` Peter Collingbourne
2023-02-10 16:20   ` Dan Li
2023-03-25  8:11 ` [RFC/RFT,V2 0/3] Add compiler support for Kernel " Dan Li
2023-03-25  8:11   ` [RFC/RFT,V2 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features Dan Li
2023-03-25  8:11   ` [RFC/RFT,V2 2/3] [PR102768] Support CFI: Add basic support for Kernel Control Flow Integrity Dan Li
2023-03-25  8:11   ` [RFC/RFT,V2 3/3] [PR102768] aarch64: Add " Dan Li
2023-06-21 21:54   ` [RFC/RFT,V2 0/3] Add compiler " Kees Cook
2023-07-19  8:20     ` Dan Li
2023-07-19  8:41   ` Dan Li
