All of lore.kernel.org
 help / color / mirror / Atom feed
* Follow up from the btf_type_tag discussion in the BPF office hours
@ 2022-12-15 18:43 Jose E. Marchesi
  2022-12-15 22:14 ` Jose E. Marchesi
  2022-12-17  1:38 ` Yonghong Song
  0 siblings, 2 replies; 5+ messages in thread
From: Jose E. Marchesi @ 2022-12-15 18:43 UTC (permalink / raw)
  To: bpf
  Cc: david.faust, elena.zannoni, David Malcolm, Nick Desaulniers,
	Julia Lawall


Of the two problems discussed:

1. DW_TAG_LLVM_annotation not being able to denote annotations to
   non-pointed based types.  clang currently ignores these instances.

   We discussed two possible options to deal with this:
   1.1 To continue ignoring these cases in the front-end, keep the dwarf
       expressiveness limitation, and document it.
   1.2 To change DW_TAG_LLVM_annotation so it behaves like a qualifier
       DIE (like const, volatile, etc.) so it can apply to any type.

2. The ordering problem: sparse annotations order differently than
   GNU/C99 compiler attributes.  Therefore translating 1-to-1 from
   sparse annotations to compiler attributes results in attributes with
   different syntax than normal compiler attributes.

   This was accepted in clang.
   But found resistance in GCC when we sent the first patch series.

   During the meeting we went thru several possible ways of dealing with
   this problem, but we didn't reach any conclusion on what to do, since
   the time ran out.

We agreed to continue the discussion at the BPF office hours next 5
January 2023.

In the meanwhile, below in this email is a slightly updated version of
the material used to go thru the topics during the discussion.  If there
is any mistake or if you see that our understanding of the
problem/situation is not correct, please point it out.  If you want to
add more information, please do so by replying to this thread.

Finally, it was agreed that we (GCC BPF hackers) would send Yonghong our
github accounts so he can subscribe us to notifications in the llvm
phabricator, so we can be aware of potentially ABI/breaking changes at
the time they are discussed, and not afterwards scanning bpf@vger.  I
alredy sent him the information.

Thank you for your time today.  It is appreciated.

----

* Background
** Attribute/annotation ordering in declarations

   - By "ordering" in this context we understand the correspondence of
     annotations (like compiler attributes) with language entities.

     Given:

     : char * __attribute ((btf_type_tag("foo"))) * buf;

     What is the language entity (type) to which the attribute
     applies?  Is it char?  Is it char*?  Is it char**?

   - This is decided by compilers at parse-time and reflected in the
     structure of the AST.

   - Therefore this is a source-level concept, and it is not amenable
     by generating debug info with a different ordering.

** Compilers implement different orderings for different kinds of attributes

   - The sparse compiler implements a particular ordering for sparse
     annotations, which look like:

     : __attribute__ ((address_space(user)))

     The ordering is not documented, but can be determined by running
     sparse and/or by looking at the sparse parser source code.

     [The fact sparse annotations look like compiler GNU-like
      attributes must not mislead us: they are not the same and they
      don't order the same way.]

   - The GCC and clang compilers implement another particular ordering
     for GNU-like compiler attributes, which have the form:

     : __attribute__ ((noinline))

     The ordering is specified in the "Attribute Syntax" chapter in
     the GCC manual.

   - The GCC and clang compilers implement another particular ordering
     for C2x compiler attributes, which have the form:

     : [[noinline]]

     The ordering is specified in the C2x specification.

   All three entities above, sparse annotations, GNU-like compiler
   attributes and C2x compiler attributes use different orderings,
   i.e. the correspondence of the annotation/attribute with the
   language entity it annotates (type, declaration, ...) varies.

* The problem
** Re-use of sparse annotations as compiler attributes

  - At some point we got the request from the BPF community to support
    a couple of new compiler attributes in GCC:

    : btf_type_tag ("foobar")
    : btf_decl_tag ("foobar")

    These attributes translate into annotations in both BTF and DWARF
    debugging information.

    clang/llvm had already been modified to support these attributes,
    the new BTF kind, and a new DW_TAG_LLVM_annotation DWARF DIE type.

  - David Faust wrote an implementation, and used as a criteria the
    several examples he saw in the kernel sources and what clang
    generates.  He added the same support in BTF and a compatible
    DW_TAG_GNU_annotation for DWARF.

    He obviously implemented the new compiler attributes as what they
    are: GNU-like compiler attributes, and saw that they were being
    ordered differently to what the kernel sources would expect.

  - Then we investigated and learned that the intention at the BPF
    side was to re-use the already existing address_space sparse
    annotations in the kernel sources as btf_*_tag compiler
    attributes.

  - At present, it looks like support for several other sparse-like
    attributes have been added to clang (which ones?) and as a result
    now clang effectively has GNU-like compiler attributes that order
    differently than all other attributes.

  - At LPC 2022 we met David Malcolm, who has been working in
    implementing sparse annotations and warnings natively in GCC.  He
    found similar problems with the ordering of the attributes.

** Scope of the problem

   - This happens when sparse annotations are re-used as compiler
     attributes.

   - sparse annotations are NOT compiler attributes => there is NO bug
     in sparse.

   - These annotations are used very widely in the Linux kernel sources.

   - Compilers involved: sparse, GCC and clang

   - sparse annotations can appear in both BPF C programs and kernel
     source files.

** Ordering discrepancies between sparse annotations and compiler attributes

   Using these definitions:

   : #ifdef __CHECKER__
   : #define __tag1 __attribute__((noderef, address_space(__tag1)))
   : #define __tag2 __attribute__((noderef, address_space(__tag2)))
   : #else
   : #define __tag1 __attribute__((btf_type_tag("tag1")))
   : #define __tag2 __attribute__((btf_type_tag("tag2")))
   : #endif

   __CHECKER__ is defined by sparse, so the former definitions are
   used * while compiling the examples with sparse.

   The latter definitions with btf_type_tag are used when compiling
   with GCC or clang.

   Compilers compared:

   - sparse

     + Ordering information is obtained via sparse debug printed as
       part of a warning.

   - GCC

     + Slightly outdated 'master' branch, with the patches to support
       btf_type_tag and btf_decl_tag applied:
       https://gcc.gnu.org/git/gcc.git:refs/users/dfaust/heads/btf-type-tag-new-rebase

     + Ordering information is obtained via the generated BTF and
       DWARF.  Note that we have verified that the BTF and DWARF
       results are accurately representative of the GCC internal
       compiler structures which produce it, and that therefore they
       reflect how the compiler has parsed the declarations.

   - clang

     + clang version 16.0.0. main branch of December 14 2022.
     + Ordering information is obtained via the generated BTF and
       DWARF.

    Probably the simplest case involving pointers, shows how both
    compilers are mangling the generated BTF in order to reflect the
    sparse expectations (presumably the same hack is in pahole when
    it does the DWARF -> BTF translation):

    : char __tag1 * buf;

    - sparse
      + __tag1 applies to type char.
      : got char [noderef] __tag1 *[addressable] [toplevel] buf
    - clang
      + According to DWARF __tag1 applies to type char
      + According to BTF  __tag1 applies to type char
      : DWARF
      : 0x00000023:   DW_TAG_variable
      :                 DW_AT_name	("buf")
      :                 DW_AT_type	(0x0000002e "char *")
      :
      : 0x0000002e:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000037 "char")
      :
      : 0x00000033:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] TYPE_TAG 'tag1' type_id=3
      : [2] PTR '(anon)' type_id=1
      : [3] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
      : [4] VAR 'buf' type_id=2, linkage=global
      :
      : buf -> ptr -> 'tag1' -> char
    - GCC
      + According to DWARF __tag1 applies to type char
      + According to BTF  __tag1 applies to type char
      : DWARF
      : 0x0000002e:   DW_TAG_variable
      :                 DW_AT_name	("buf")
      :                 DW_AT_type	(0x00000044 "char *")
      :
      : 0x00000044:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000057 "char")
      :
      : 0x0000004d:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      :
      : BTF
      : [1] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
      : [2] PTR '(anon)' type_id=3
      : [3] TYPE_TAG 'tag1' type_id=1
      : [4] VAR 'buf' type_id=2, linkage=global
      :
      : buf -> ptr -> 'tag1' -> char

    The next example is from from bpf-next/arch/x86/kernel/kgdb.c:184
    and shows cases where both the DWARF and the BTF are different:

    : struct perf_event * __percpu *pev;

    - sparse
      + __percpu applies to struct perf_event*
      : got struct perf_event *[noderef] __percpu *[addressable] [toplevel] pev
    - clang
      + According to DWARF __percpu applies to struct perf_event*
      + According to BTF __percpu applies to struct perf_event*
      : DWARF
      : 0x00000023:   DW_TAG_variable
      :                 DW_AT_name	("pev")
      :                 DW_AT_type	(0x0000002e "perf_event **")
      :
      : 0x0000002e:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000037 "perf_event *")
      :
      : 0x00000033:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("percpu")
      :
      : 0x00000037:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x0000003c "perf_event")
      : BTF
      : [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [4] PTR '(anon)' type_id=5
      : [5] STRUCT 'perf_event' size=4 vlen=1
      : 'a' type_id=2 bits_offset=0
      : [8] TYPE_TAG 'percpu' type_id=4
      : [9] PTR '(anon)' type_id=8
      : [13] VAR 'pev' type_id=9, linkage=global
      :
      : pev -> ptr -> 'percpu' -> ptr -> struct perf_event
    - GCC
      + According to DWARF __percpu applies to struct perf_event
      + According to BTF __percpu applies to struct perf_event
      : DWARF
      : 0x00000023:   DW_TAG_variable
      :                 DW_AT_name	("pev")
      :                 DW_AT_type	(0x00000039 "perf_event **")
      :
      : 0x00000039:   DW_TAG_pointer_type (perf_event **)
      :                 DW_AT_type	(0x0000003f "perf_event *")
      :
      : 0x0000003f:   DW_TAG_pointer_type (perf_event *)
      :                 DW_AT_type	(0x0000001e "perf_event")
      :
      : 0x00000045:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("percpu")
      : BTF
      : [1] STRUCT 'perf_event' size=4 vlen=1
      : 'a' type_id=2 bits_offset=0
      : [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [3] PTR '(anon)' type_id=4
      : [4] TYPE_TAG 'percpu' type_id=1
      : [5] PTR '(anon)' type_id=3
      : [14] VAR 'pev' type_id=5, linkage=global
      :
      : pev -> ptr -> ptr -> 'percpu' -> perf_event

    The following case shows two different attributes applied to
    pointers-to-pointers:

    : int __tag1 * __tag2 * g;

    - sparse
      + __tag1 applies to int, __tag2 applies to int*
      : got int [noderef] __tag1 *[noderef] __tag2 *[addressable] [toplevel] g
    - clang
      + According to DWARF __tag1 applies to int, __tag2 applies to int*
      + According to BTF __tag1 applies to int, __tag2 applies to int*
      : DWARF
      : 0x00000023:   DW_TAG_variable
      :                 DW_AT_name	("g")
      :                 DW_AT_type	(0x0000002e "int **")
      :
      : 0x0000002e:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000037 "int *")
      :
      : 0x00000033:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag2")
      :
      : 0x00000037:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000040 "int")
      :
      : 0x0000003c:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] TYPE_TAG 'tag2' type_id=4
      : [2] PTR '(anon)' type_id=1
      : [3] TYPE_TAG 'tag1' type_id=5
      : [4] PTR '(anon)' type_id=3
      : [5] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [6] VAR 'g' type_id=2, linkage=global
      :
      : 'g' -> ptr -> 'tag2' -> ptr -> 'tag1' -> int
    - GCC
      + According to DWARF __tag1 applies to int*, __tag2 applies to int
      + According to BTF __tag1 applies to int*, __tag2 applies to int
      : DWARF
      : 0x0000002e:   DW_TAG_variable
      :                 DW_AT_name	("g")
      :                 DW_AT_type	(0x00000042 "int **")
      :
      : 0x00000042:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000055 "int *")
      :
      : 0x0000004b:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      :
      : 0x00000055:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000068 "int")
      :
      : 0x0000005e:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag2")
      : BTF
      : [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [2] PTR '(anon)' type_id=3
      : [3] TYPE_TAG 'tag2' type_id=1
      : [4] PTR '(anon)' type_id=5
      : [5] TYPE_TAG 'tag1' type_id=2
      : [6] VAR 'g' type_id=4, linkage=global
      :
      : 'g' -> ptr -> 'tag1' -> ptr -> 'tag2' -> int

    What follows is a typical kernel example.  We get "right" results
    in GCC just because both tags are the same attributes, but
    effectively we hit the same problem than in the previous example:

    : int __tag1 * __tag1 * val;

    - sparse
      + __tag1 applies to both int and int*
      : got int [noderef] __tag1 *[noderef] __tag1 *[addressable] [toplevel] val
    - clang
      + According to DWARF __tag1 applies to both int and int*
      + According to BTF __tag1 applies to both int and int*
      : DWARF
      : 0x0000001e:   DW_TAG_variable
      :                 DW_AT_name	("val")
      :                 DW_AT_type	(0x00000029 "int **")
      :
      : 0x00000029:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000032 "int *")
      :
      : 0x0000002e:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      :
      : 0x00000032:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x0000003b "int")
      :
      : 0x00000037:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] TYPE_TAG 'tag1' type_id=4
      : [2] PTR '(anon)' type_id=1
      : [3] TYPE_TAG 'tag1' type_id=5
      : [4] PTR '(anon)' type_id=3
      : [5] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [6] VAR 'val' type_id=2, linkage=global
      :
      : 'val' -> ptr -> 'tag1' -> ptr -> 'tag1' -> int
    - GCC
      + According to DWARF __tag1 applies to both int and int*
      + According to BTF __tag1 applies to both int and int*
      : DWARF
      : 0x0000001e:   DW_TAG_variable
      :                 DW_AT_name	("val")
      :                 DW_AT_type	(0x00000034 "int **")
      :
      : 0x00000034:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000047 "int *")
      :
      : 0x0000003d:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      :
      : 0x00000047:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x0000005a "int")
      :
      : 0x00000050:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [2] PTR '(anon)' type_id=3
      : [3] TYPE_TAG 'tag1' type_id=1
      : [4] PTR '(anon)' type_id=5
      : [5] TYPE_TAG 'tag1' type_id=2
      : [6] VAR 'val' type_id=4, linkage=global
      :
      : 'val' -> ptr -> 'tag1' -> ptr -> 'tag1' -> int

    This last example is another variation of pointer-to-pointer with
    different tags and different positions:

    : int * __tag1 * __tag2 h;

    - sparse
      +  __tag1 applies to int*, __tag2 applies to int**
      : got int *[noderef] __tag1 *[addressable] [noderef] [toplevel] __tag2 h
    - clang
      + According to DWARF __tag1 applies to int*, no __tag2 (??).
      + According to BTF  __tag1 applies to int*, no __tag2 (??).
      : DWARF
      : 0x00000023:   DW_TAG_variable
      :                 DW_AT_name	("h")
      :                 DW_AT_type	(0x0000002e "int **")
      :
      : 0x0000002e:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000037 "int *")
      :
      : 0x00000033:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] TYPE_TAG 'tag1' type_id=3
      : [2] PTR '(anon)' type_id=1
      : [3] PTR '(anon)' type_id=4
      : [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [5] VAR 'h' type_id=2, linkage=global
      :
      : 'h' -> ptr -> 'tag1' -> ptr -> int
    - GCC
      + According to DWARF __tag1 applies to int*, __tag2 applies to int**
      + According to BTF __tag1 applies to int, __tag2 applies to int*
      : DWARF
      : 0x0000002e:   DW_TAG_variable
      :                 DW_AT_name	("h")
      :                 DW_AT_type	(0x00000042 "int **")
      :
      : 0x00000042:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000055 "int *")
      :
      : 0x0000004b:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag2")
      :
      : 0x00000055:   DW_TAG_pointer_type
      :                 DW_AT_type	(0x00000068 "int")
      :
      : 0x0000005e:     DW_TAG_LLVM_annotation
      :                 DW_AT_name	("btf_type_tag")
      :                 DW_AT_const_value	("tag1")
      : BTF
      : [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      : [2] PTR '(anon)' type_id=3
      : [3] TYPE_TAG 'tag1' type_id=1
      : [4] PTR '(anon)' type_id=5
      : [5] TYPE_TAG 'tag2' type_id=2
      : [6] VAR 'h' type_id=4, linkage=global
      :
      : 'h' -> ptr -> 'tag2' -> ptr -> 'tag1' -> int

* Solutions
** Reorder sparse annotations in kernel source

   [Julia Jawall kindly adapted coccinelle so it can handle this and
    the following possible solution involving modifying kernel
    sources.]

   This would basically involve writing and applying a coccinelle
   script to reorder something like this:

   : int __tag1 * __tag2 * g;

   into this

   : int __tag2 * __tag1 * g;

   where

   : #define __tag1 __attribute__((address_space(__tag1)))

   This totally breaks sparse so it is hardly a realistic solution.

** Add compiler attributes corresponding to existing sparse annotations to the kernel source

   This would involve writing and applying a coccinelle script to add
   compiler attributes corresponding to the existing sparse
   annotations.

   So from this:

   : int __tag1 * __tag2 * g;

   We would go to:

   : int __tag1 __comp_tag2 * __tag2 __comp_tag1 * g;

   where

   : #if __CHECKER__
   : #  define __tag1 __attribute__((address_space(__tag1)))
   : #else
   : #  define __tag1
   : #endif
   : #define __comp_tag1 __attribute__((btf_type_tag("tag1")))

   This would not break sparse, but would lead to a lot of (non
   trivial) redundancy in the kernel sources: both sparse annotations
   and compiler attributes will denote the same thing.

** Adding native support for sparse annotations to GCC and clang

   This would make both GCC and clang to handle sparse annotations
   using the sparse ordering.  This is what David Malcolm has been
   working on.

   There is some previous work From Tom Tromey to support the
   following sparse annotations: address_space, noderef and force.
   See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59850 (But AFAIK
   Tom's patch also falls in the mistake of ordering the attributes
   using GNU-like attribute ordering, not the sparse ordering?)

   For example, in this C declaration:

   : int * __attribute__((address_space(__tag1))) * p;

   The compilers would recognize that as a sparse annotation (and not
   a GNU-like compiler attribute) and parse it in a way it applies to
   int* instead of int**.

   At least two problems with this approach:

   1. This could be confusing for users, and difficult to get applied
      upstream, because sparse annotations look like GNU-like compiler
      attributes, despite not being the same thing (ordering
      differently.)

   2. sparse lacks support for btf_type_tag and btf_decl_tag, and
      these compiler attributes are more general than the existing
      sparse annotations.

** Adding GNU-like compiler attributes to GCC and clang with sparse ordering

   This is what clang does right now (or tries to do).

   Solves problem 2. above, but not 1.

   Maybe problem 1. could be alleviated by oddly-ordering compiler
   attributes to have a distinguished name, such as using a 'sparse'
   prefix:

   : __attribute__((sparse_btf_type_tag ("tag1")))

** Other solutions?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Follow up from the btf_type_tag discussion in the BPF office hours
  2022-12-15 18:43 Follow up from the btf_type_tag discussion in the BPF office hours Jose E. Marchesi
@ 2022-12-15 22:14 ` Jose E. Marchesi
  2022-12-17  1:38 ` Yonghong Song
  1 sibling, 0 replies; 5+ messages in thread
From: Jose E. Marchesi @ 2022-12-15 22:14 UTC (permalink / raw)
  To: bpf
  Cc: david.faust, elena.zannoni, David Malcolm, Nick Desaulniers,
	Julia Lawall


> Of the two problems discussed:
>
> 1. DW_TAG_LLVM_annotation not being able to denote annotations to
>    non-pointed based types.  clang currently ignores these instances.
>
>    We discussed two possible options to deal with this:
>    1.1 To continue ignoring these cases in the front-end, keep the dwarf
>        expressiveness limitation, and document it.
>    1.2 To change DW_TAG_LLVM_annotation so it behaves like a qualifier
>        DIE (like const, volatile, etc.) so it can apply to any type.

Note that the non-pointed based types don't have to be basic types.  The
limitation also impacts non-basic types that are not pointed to,
including pointer types themselves.

Therefore:

>     : int * __tag1 * __tag2 h;
>
>     - sparse
>       +  __tag1 applies to int*, __tag2 applies to int**
>       : got int *[noderef] __tag1 *[addressable] [noderef] [toplevel] __tag2 h
>     - clang
>       + According to DWARF __tag1 applies to int*, no __tag2 (??).
>       + According to BTF  __tag1 applies to int*, no __tag2 (??).
>       : DWARF
>       : 0x00000023:   DW_TAG_variable
>       :                 DW_AT_name	("h")
>       :                 DW_AT_type	(0x0000002e "int **")
>       :
>       : 0x0000002e:   DW_TAG_pointer_type
>       :                 DW_AT_type	(0x00000037 "int *")
>       :
>       : 0x00000033:     DW_TAG_LLVM_annotation
>       :                 DW_AT_name	("btf_type_tag")
>       :                 DW_AT_const_value	("tag1")
>       : BTF
>       : [1] TYPE_TAG 'tag1' type_id=3
>       : [2] PTR '(anon)' type_id=1
>       : [3] PTR '(anon)' type_id=4
>       : [4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>       : [5] VAR 'h' type_id=2, linkage=global
>       :
>       : 'h' -> ptr -> 'tag1' -> ptr -> int

In the example above, `tag2' doesn't appear in neither DWARF nor BTF
because the type int** isn't pointed itself, and as Yonghong mentioned
in the call, the implementation of btf_type_tag in clang ignores these
cases.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Follow up from the btf_type_tag discussion in the BPF office hours
  2022-12-15 18:43 Follow up from the btf_type_tag discussion in the BPF office hours Jose E. Marchesi
  2022-12-15 22:14 ` Jose E. Marchesi
@ 2022-12-17  1:38 ` Yonghong Song
  2022-12-19 17:27   ` Jose E. Marchesi
  1 sibling, 1 reply; 5+ messages in thread
From: Yonghong Song @ 2022-12-17  1:38 UTC (permalink / raw)
  To: Jose E. Marchesi, bpf
  Cc: david.faust, elena.zannoni, David Malcolm, Nick Desaulniers,
	Julia Lawall



On 12/15/22 10:43 AM, Jose E. Marchesi wrote:
> 
> Of the two problems discussed:
> 
> 1. DW_TAG_LLVM_annotation not being able to denote annotations to
>     non-pointed based types.  clang currently ignores these instances.
> 
>     We discussed two possible options to deal with this:
>     1.1 To continue ignoring these cases in the front-end, keep the dwarf
>         expressiveness limitation, and document it.
>     1.2 To change DW_TAG_LLVM_annotation so it behaves like a qualifier
>         DIE (like const, volatile, etc.) so it can apply to any type.

Thanks for the detailed update. Yes, we do want to __tag behaving like
a qualifier.

Today clang only support 'base_type <type_tag> *' style of code.
But we are open to support non-pointer style of tagging like
'base_type <type_tag> global_var'. Because of this, the following
dwarf output should be adopted:
    C: int __tag1 * __tag2 * p;
    dwarf: ptr -> __tag2 --> ptr -> __tag1 -> int
or
    C: int __tag1 g;
    dwarf: var_g -> __tag1 --> int

The above format *might* require particular dwarf tools to add support
for __tag attribute. But I think it is a good thing in the long run
esp. if we might add support to non-pointer types. In current
implementation, dwarf tools can simply ignore the children of ptr
which they may already do it.

> 
> 2. The ordering problem: sparse annotations order differently than
>     GNU/C99 compiler attributes.  Therefore translating 1-to-1 from
>     sparse annotations to compiler attributes results in attributes with
>     different syntax than normal compiler attributes.
> 
>     This was accepted in clang.
>     But found resistance in GCC when we sent the first patch series.
> 
>     During the meeting we went thru several possible ways of dealing with
>     this problem, but we didn't reach any conclusion on what to do, since
>     the time ran out.
> 
> We agreed to continue the discussion at the BPF office hours next 5
> January 2023.
> 
> In the meanwhile, below in this email is a slightly updated version of
> the material used to go thru the topics during the discussion.  If there
> is any mistake or if you see that our understanding of the
> problem/situation is not correct, please point it out.  If you want to
> add more information, please do so by replying to this thread.
> 
> Finally, it was agreed that we (GCC BPF hackers) would send Yonghong our
> github accounts so he can subscribe us to notifications in the llvm
> phabricator, so we can be aware of potentially ABI/breaking changes at
> the time they are discussed, and not afterwards scanning bpf@vger.  I
> alredy sent him the information.
> 
> Thank you for your time today.  It is appreciated.
> 
[...]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Follow up from the btf_type_tag discussion in the BPF office hours
  2022-12-17  1:38 ` Yonghong Song
@ 2022-12-19 17:27   ` Jose E. Marchesi
  2022-12-28  4:49     ` Yonghong Song
  0 siblings, 1 reply; 5+ messages in thread
From: Jose E. Marchesi @ 2022-12-19 17:27 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, david.faust, elena.zannoni, David Malcolm, Nick Desaulniers,
	Julia Lawall


Hi Yonghong.

> On 12/15/22 10:43 AM, Jose E. Marchesi wrote:
>> Of the two problems discussed:
>> 1. DW_TAG_LLVM_annotation not being able to denote annotations to
>>     non-pointed based types.  clang currently ignores these instances.
>>     We discussed two possible options to deal with this:
>>     1.1 To continue ignoring these cases in the front-end, keep the dwarf
>>         expressiveness limitation, and document it.
>>     1.2 To change DW_TAG_LLVM_annotation so it behaves like a qualifier
>>         DIE (like const, volatile, etc.) so it can apply to any type.
>
> Thanks for the detailed update. Yes, we do want to __tag behaving like
> a qualifier.
>
> Today clang only support 'base_type <type_tag> *' style of code.
> But we are open to support non-pointer style of tagging like
> 'base_type <type_tag> global_var'. Because of this, the following
> dwarf output should be adopted:
>    C: int __tag1 * __tag2 * p;
>    dwarf: ptr -> __tag2 --> ptr -> __tag1 -> int
> or
>    C: int __tag1 g;
>    dwarf: var_g -> __tag1 --> int
>
> The above format *might* require particular dwarf tools to add support
> for __tag attribute. But I think it is a good thing in the long run
> esp. if we might add support to non-pointer types. In current
> implementation, dwarf tools can simply ignore the children of ptr
> which they may already do it.

I wonder, since these annotations are atomic, is there a reason for not
using an attribute instead of a DIE tag?  Something like DW_AT_annotation.

The attribute could then be used by any DIE (declaration, type, ...) and
existing DWARF consumers that don't support the new attribute would
happily just ignore it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Follow up from the btf_type_tag discussion in the BPF office hours
  2022-12-19 17:27   ` Jose E. Marchesi
@ 2022-12-28  4:49     ` Yonghong Song
  0 siblings, 0 replies; 5+ messages in thread
From: Yonghong Song @ 2022-12-28  4:49 UTC (permalink / raw)
  To: Jose E. Marchesi
  Cc: bpf, david.faust, elena.zannoni, David Malcolm, Nick Desaulniers,
	Julia Lawall



On 12/19/22 9:27 AM, Jose E. Marchesi wrote:
> 
> Hi Yonghong.
> 
>> On 12/15/22 10:43 AM, Jose E. Marchesi wrote:
>>> Of the two problems discussed:
>>> 1. DW_TAG_LLVM_annotation not being able to denote annotations to
>>>      non-pointed based types.  clang currently ignores these instances.
>>>      We discussed two possible options to deal with this:
>>>      1.1 To continue ignoring these cases in the front-end, keep the dwarf
>>>          expressiveness limitation, and document it.
>>>      1.2 To change DW_TAG_LLVM_annotation so it behaves like a qualifier
>>>          DIE (like const, volatile, etc.) so it can apply to any type.
>>
>> Thanks for the detailed update. Yes, we do want to __tag behaving like
>> a qualifier.
>>
>> Today clang only support 'base_type <type_tag> *' style of code.
>> But we are open to support non-pointer style of tagging like
>> 'base_type <type_tag> global_var'. Because of this, the following
>> dwarf output should be adopted:
>>     C: int __tag1 * __tag2 * p;
>>     dwarf: ptr -> __tag2 --> ptr -> __tag1 -> int
>> or
>>     C: int __tag1 g;
>>     dwarf: var_g -> __tag1 --> int
>>
>> The above format *might* require particular dwarf tools to add support
>> for __tag attribute. But I think it is a good thing in the long run
>> esp. if we might add support to non-pointer types. In current
>> implementation, dwarf tools can simply ignore the children of ptr
>> which they may already do it.
> 
> I wonder, since these annotations are atomic, is there a reason for not
> using an attribute instead of a DIE tag?  Something like DW_AT_annotation.

Yes, we can. My suggestion is to facilitate gcc implementation. 
Currently clang uses an attribute instead of a DIE tag. I am totally 
fine if gcc uses the same dwarf representation mechanism as clang.

> 
> The attribute could then be used by any DIE (declaration, type, ...) and
> existing DWARF consumers that don't support the new attribute would
> happily just ignore it.

clang already use attributes to represent btf_type_tag and btf_decl_tag.
One of early considerations to use attribute in clang indeed is to avoid 
existing tool changes as much as possible.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-12-28  4:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-15 18:43 Follow up from the btf_type_tag discussion in the BPF office hours Jose E. Marchesi
2022-12-15 22:14 ` Jose E. Marchesi
2022-12-17  1:38 ` Yonghong Song
2022-12-19 17:27   ` Jose E. Marchesi
2022-12-28  4:49     ` Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.