linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Martin Sebor <msebor@gmail.com>
Cc: David Malcolm <dmalcolm@redhat.com>,
	gcc-patches@gcc.gnu.org, linux-toolchains@vger.kernel.org
Subject: Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries
Date: Mon, 6 Dec 2021 13:40:25 -0600	[thread overview]
Message-ID: <20211206194025.GQ614@gate.crashing.org> (raw)
In-Reply-To: <c5961336-3d27-20a4-9094-e96eb6262cec@gmail.com>

On Mon, Dec 06, 2021 at 11:12:00AM -0700, Martin Sebor wrote:
> On 11/13/21 1:37 PM, David Malcolm via Gcc-patches wrote:
> >Approach 1: Custom Address Spaces
> >=================================
> >
> >GCC's C frontend supports target-specific address spaces; see:
> >   https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html
> >Quoting the N1275 draft of ISO/IEC DTR 18037:
> >   "Address space names are ordinary identifiers, sharing the same name
> >   space as variables and typedef names.  Any such names follow the same
> >   rules for scope as other ordinary identifiers (such as typedef names).
> >   An implementation may provide an implementation-defined set of
> >   intrinsic address spaces that are, in effect, predefined at the start
> >   of every translation unit.  The names of intrinsic address spaces must
> >   be reserved identifiers (beginning with an underscore and an uppercase
> >   letter or with two underscores).  An implementation may also
> >   optionally support a means for new address space names to be defined
> >   within a translation unit."
> >
> >Patch 1a in the following patch kit for GCC implements such a means to
> >define new address spaces names in a translation unit, via a pragma:
> >   #prgama GCC custom_address_space(NAME_OF_ADDRESS_SPACE)
> >
> >For example, the Linux kernel could perhaps write:
> >
> >   #define __kernel
> >   #pragma GCC custom_address_space(__user)
> >   #pragma GCC custom_address_space(__iomem)
> >   #pragma GCC custom_address_space(__percpu)
> >   #pragma GCC custom_address_space(__rcu)
> >
> >and thus the C frontend can complain about code that mismatches __user
> >and kernel pointers, e.g.:
> >
> >custom-address-space-1.c: In function ‘test_argpass_to_p’:
> >custom-address-space-1.c:29:14: error: passing argument 1 of 
> >‘accepts_p’
> >from pointer to non-enclosed address space
> >    29 |   accepts_p (p_user);
> >       |              ^~~~~~
> >custom-address-space-1.c:21:24: note: expected ‘void *’ but argument is
> >of type ‘__user void *’
> >    21 | extern void accepts_p (void *);
> >       |                        ^~~~~~
> >custom-address-space-1.c: In function ‘test_cast_k_to_u’:
> >custom-address-space-1.c:135:12: warning: cast to ‘__user’ address 
> >space
> >pointer from disjoint generic address space pointer
> >   135 |   p_user = (void __user *)p_kernel;
> >       |            ^
> 
> This seems like an excellent use of named address spaces :)

It has some big problems though.

Named address spaces are completely target-specific.  Defining them with
a pragma like this does not allow you to set the pointer mode or
anything related to a custom LEGITIMATE_ADDRESS_P.  It does not allow
you to sayy zero pointers are invalid in some address spaces and not in
others.  You cannot provide any of the DWARF address space stuff this
way.  But most importantly, there are only four bits for the address
space field internally, and they are used by however a backend wants to
use them.

None of this cannot be solved, but all of it will have to be solved.

IMO it will be best to not mix this with address spaces in the user
interface (it is of course fine to *implement* it like that, or with
big overlap at least).

> >The patch doesn't yet maintain a good distinction between implicit
> >target-specific address spaces and user-defined address spaces,

And that will have to be fixed in the user code syntax at least.

> >has at
> >least one known major bug, and has only been lightly tested.  I can
> >fix these issues, but was hoping for feedback that this approach is the
> >right direction from both the GCC and Linux development communities.

Allowing the user to define new address spaces does not jibe well with
how targets do (validly!) use them.

> >Approach 2: An "untrusted" attribute
> >====================================
> >
> >Alternatively, patch 1b in the kit implements:
> >
> >   __attribute__((untrusted))
> >
> >which can be applied to types as a qualifier (similarly to const,
> >volatile, etc) to mark a trust boundary, hence the kernel could have:
> >
> >   #define __user __attribute__((untrusted))
> >
> >where my patched GCC treats
> >   T *
> >vs
> >   T __attribute__((untrusted)) *
> >as being different types and thus the C frontend can complain (even without
> >-fanalyzer) about e.g.:
> >
> >extern void accepts_p(void *);
> >
> >void test_argpass_to_p(void __user *p_user)
> >{
> >   accepts_p(p_user);
> >}
> >
> >untrusted-pointer-1.c: In function ‘test_argpass_to_p’:
> >untrusted-pointer-1.c:22:13: error: passing argument 1 of ‘accepts_p’
> >from pointer with different trust level
> >    22 |   accepts_p(p_user);
> >       |              ^~~~~~
> >untrusted-pointer-1.c:14:23: note: expected ‘void *’ but argument is of
> >type ‘__attribute__((untrusted)) void *’
> >    14 | extern void accepts_p(void *);
> >       |                        ^~~~~~
> >
> >So you'd get enforcement of __user vs non-__user pointers as part of
> >GCC's regular type-checking.  (You need an explicit cast to convert
> >between the untrusted vs trusted types).
> 
> As with the named address space idea, this approach also looks
> reasonable to me.  If you anticipate using the attribute only
> in the analyzer I would suggest to consider introducing it in
> the analyzer's namespace (e.g., analyzer::untrusted, or even
> gnu::analyzer::untrusted).

I don't see any fundamental problems with this approach.  It also is
very much in line with how Perl handles this (and some copycat languages
do as well), the "tainted" flag on data.

> >This approach is much less expressive that the custom addres space
> >approach; it would only cover the trust boundary aspect; it wouldn't
> >cover any differences between generic pointers and __user, vs __iomem,
> >__percpu, and __rcu which I admit I only dimly understand.

Yes, it does not have any of the big problems that come with those
address spaces either!  :-)

> >Other attributes
> >================
> >
> >Patch 2 in the kit adds:
> >   __attribute__((returns_zero_on_success))
> >and
> >   __attribute__((returns_nonzero_on_success))
> >as hints to the analyzer that it's worth bifurcating the analysis of
> >such functions (to explore failure vs success, and thus to better
> >explore error-handling paths).  It's also a hint to the human reader of
> >the source code.
> 
> I thing being able to express something along these lines would
> be useful even outside the analyzer, both for warnings and, when
> done right, perhaps also for optimization.  So I'm in favor of
> something like this.  I'll just reiterate here the comment on
> this attribute I sent you privately some time ago.

What is "success" though?  You probably want it so some checker can make
sure you do handle failure some way, but how do you see what is handling
failure and what is handling the successful case?


Segher

  reply	other threads:[~2021-12-06 19:43 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-13 20:37 [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries David Malcolm
2021-11-13 20:37 ` [PATCH 1a/6] RFC: Implement "#pragma GCC custom_address_space" David Malcolm
2021-11-13 20:37 ` [PATCH 1b/6] Add __attribute__((untrusted)) David Malcolm
2021-12-09 22:54   ` Martin Sebor
2022-01-06 15:10     ` David Malcolm
2022-01-06 18:59       ` Martin Sebor
2021-11-13 20:37 ` [PATCH 2/6] Add returns_zero_on_success/failure attributes David Malcolm
2021-11-15  7:03   ` Prathamesh Kulkarni
2021-11-15 14:45     ` Peter Zijlstra
2021-11-15 22:30       ` David Malcolm
2021-11-15 22:12     ` David Malcolm
2021-11-17  9:23       ` Prathamesh Kulkarni
2021-11-17 22:43         ` Joseph Myers
2021-11-18 20:08           ` Segher Boessenkool
2021-11-18 23:45             ` David Malcolm
2021-11-19 21:52               ` Segher Boessenkool
2021-11-18 23:34           ` David Malcolm
2021-12-06 18:34             ` Martin Sebor
2021-11-18 23:15         ` David Malcolm
2021-11-13 20:37 ` [PATCH 4a/6] analyzer: implement region::untrusted_p in terms of custom address spaces David Malcolm
2021-11-13 20:37 ` [PATCH 4b/6] analyzer: implement region::untrusted_p in terms of __attribute__((untrusted)) David Malcolm
2021-11-13 20:37 ` [PATCH 5/6] analyzer: use region::untrusted_p in taint detection David Malcolm
2021-11-13 20:37 ` [PATCH 6/6] Add __attribute__ ((tainted)) David Malcolm
2022-01-06 14:08   ` PING (C/C++): " David Malcolm
2022-01-10 21:36     ` PING^2 " David Malcolm
2022-01-12  4:36       ` Jason Merrill
2022-01-12 15:33         ` David Malcolm
2022-01-13 19:08           ` Jason Merrill
2022-01-14  1:25             ` [committed] Add __attribute__ ((tainted_args)) David Malcolm
2021-11-13 23:20 ` [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries Peter Zijlstra
2021-11-14  2:54   ` David Malcolm
2021-11-14 13:54 ` Miguel Ojeda
2021-12-06 18:12 ` Martin Sebor
2021-12-06 19:40   ` Segher Boessenkool [this message]
2021-12-09  0:06     ` David Malcolm
2021-12-09  0:41       ` Segher Boessenkool
2021-12-09 16:42     ` Martin Sebor
2021-12-09 23:40       ` Segher Boessenkool
2021-12-08 23:11   ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211206194025.GQ614@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).