linux-toolchains.vger.kernel.org archive mirror
From: Segher Boessenkool <segher@kernel.crashing.org>
To: David Malcolm <dmalcolm@redhat.com>
Cc: Martin Sebor <msebor@gmail.com>,
	gcc-patches@gcc.gnu.org, linux-toolchains@vger.kernel.org
Subject: Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries
Date: Wed, 8 Dec 2021 18:41:11 -0600
Message-ID: <20211209004111.GT614@gate.crashing.org>
In-Reply-To: <94ff6309ba7449f93450cf0adace53a1c1aa7480.camel@redhat.com>

Hi!

On Wed, Dec 08, 2021 at 07:06:30PM -0500, David Malcolm wrote:
> On Mon, 2021-12-06 at 13:40 -0600, Segher Boessenkool wrote:
> > Named address spaces are completely target-specific.  Defining them with
> > a pragma like this does not allow you to set the pointer mode or
> > anything related to a custom LEGITIMATE_ADDRESS_P.
> 
> My thinking was that each custom address space is based on an existing
> address space, but is disjoint from it, where "based on" means "what it
> looks like in terms of RTL generation" (clearly I'm handwaving here).  
> 
> In patch 1a, the custom address spaces are all based on the generic
> address space (but disjoint from it); syntax could be added to base
> them on one of the target-specific address spaces.
> 
> >   It does not allow
> > you to say zero pointers are invalid in some address spaces and not in
> > others.
> 
> Syntax could be added for this, I suppose.
> 
> > You cannot provide any of the DWARF address space stuff this
> > way.
> 
> True.  I confess that I haven't thought about the debugging experience,
> and I'd need to think what would happen at the DWARF level.
> 
> > But most importantly, there are only four bits for the address
> > space field internally, and they are used by however a backend wants to
> > use them.
> 
> One of the ideas of patch 1a is to divide up this 4-bit space between
> the target-specific and the custom address spaces.  The backend code
> would need to be tweaked to decode the 4-bit value to get at the
> underlying target-specific address space value.  This is done by the
> function ensure_builtin_addr_space in patch 1a, though I've likely
> missed some places.
> 
> IIRC, the target that's currently using the most address spaces is avr,
> which I believe has 8 target-specific address spaces, in addition to
> the generic one, i.e. 9 builtin address spaces, which would leave room
> for up to 7 user-defined address spaces.
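
[Editorial sketch of the arithmetic in the quoted paragraph. The names below
are hypothetical and this is not GCC's actual encoding or the code from patch
1a; it only assumes a 4-bit field whose low IDs belong to the target and whose
remaining IDs are handed to user-defined address spaces, each based on the
generic address space.]

```c
/* Hypothetical partition of a 4-bit address-space field.
   None of these names exist in GCC; they are for illustration only.  */
enum {
  ADDR_SPACE_BITS = 4,
  NUM_IDS         = 1 << ADDR_SPACE_BITS,  /* 16 distinct values */
  NUM_BUILTIN     = 9,   /* figure quoted for avr: 8 target-specific + generic */
  NUM_CUSTOM      = NUM_IDS - NUM_BUILTIN  /* IDs left for user-defined spaces */
};

/* Nonzero if ID falls in the range reserved for user-defined spaces.  */
int is_custom(unsigned id)
{
  return id >= NUM_BUILTIN && id < NUM_IDS;
}

/* Decode an ID to the builtin address space it behaves like for code
   generation.  In this sketch every custom space is based on the generic
   address space (ID 0); builtin IDs map to themselves.  */
unsigned underlying_builtin(unsigned id)
{
  return is_custom(id) ? 0u : id;
}
```

[With these assumptions NUM_CUSTOM is 16 - 9. The split only works if every
backend agrees to keep its own IDs below NUM_BUILTIN.]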

Except that a backend is free to use this bitfield any way it pleases.

All of the above says that what you want is something completely
orthogonal to and separate from named address spaces.  Very similar in
some ways, sure, but keeping it apart will work much better and be much
less pain.

> The Linux kernel's sparse
> annotations currently effectively introduce 4 custom address spaces,
> __user, __iomem, __percpu, and __rcu (assuming that __kernel is the
> generic address space), so it's something of a tight squeeze, but it
> does fit.  This doesn't account for out-of-tree backends, of course.

Or any future backends.
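
[Editorial sketch, for readers unfamiliar with the annotations named in the
quoted text. It is modelled loosely on how such annotations are typically
defined for a checker and compiled away otherwise. The demo_ names and the
numeric IDs are assumptions chosen to avoid reserved identifiers; they are not
the kernel's definitions.]

```c
#include <string.h>

/* Under a checker that defines __CHECKER__, the annotation places the
   pointee in a distinct, non-dereferenceable address space.  Under an
   ordinary compiler it expands to nothing, so no checking happens.  */
#ifdef __CHECKER__
# define demo_user   __attribute__((noderef, address_space(1)))
# define demo_iomem  __attribute__((noderef, address_space(2)))
# define demo_force  __attribute__((force))
#else
# define demo_user
# define demo_iomem
# define demo_force
#endif

/* Stand-in for a copy_from_user-style helper: returns the number of
   bytes that could NOT be copied, so 0 means everything was copied.  */
unsigned long demo_copy_from_user(void *dst, const void demo_user *src,
                                  unsigned long n)
{
  /* The cast marks the one place where crossing the boundary is intended.  */
  memcpy(dst, (const void demo_force *)src, n);
  return 0;
}
```

[The point of the sketch is that the annotations are invisible to the compiler
proper, which is why the thread discusses teaching GCC an equivalent.]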

> > IMO it will be best to not mix this with address spaces in the user
> > interface (it is of course fine to *implement* it like that, or with
> > big overlap at least).
> 
> I was thinking the other way around, in that it should look like
> address spaces in terms of the user's source code, but has some
> implementation differences.

That does not solve any of the problems I brought up though.  That was
just a list of all the basic features from address spaces btw, from
gccint.

> > Allowing the user to define new address spaces does not jibe well with
> > how targets do (validly!) use them.
> 
> I think from a user's perspective it's a nice approach - my feeling is
> that it makes certain things easier for the user, whilst complicating
> things from a backend implementation perspective.
> 
> Plus you've raised various technical issues which I'd have to resolve
> if we went in this direction.

It is fine to have a (very) similar concept for the user, but it does
not work well at all to equate this to the existing concept of named
address spaces.

> Indeed - I think this "untrusted attribute" approach is much simpler
> implementation-wise than the "custom address space" approach, which is
> also in its favor.
> 
> I'm wondering if anyone from the kernel development community has
> strong opinions here, since the custom address space approach is
> potentially much more expressive.

Anything more expressive than what you have thought through the
consequences of is not a feature but a danger.  Anything that does
not fit in with the rest structurally now never will.

> Otherwise I think we're both preferring the "untrusted attribute"
> approach (patch 1b).

That attribute does not interfere with anything else afaics, so that is
much safer.

> > > I think being able to express something along these lines would
> > > be useful even outside the analyzer, both for warnings and, when
> > > done right, perhaps also for optimization.  So I'm in favor of
> > > something like this.  I'll just reiterate here the comment on
> > > this attribute I sent you privately some time ago.
> > 
> > What is "success" though?  You probably want it so some checker can make
> > sure you do handle failure some way, but how do you see what is handling
> > failure and what is handling the successful case?
> 
> "success" and "failure" in this case are purely in terms of how we
> label events for the user in the analyzer, such as in event (3) in the
> following:
> 
> infoleak-antipatterns-1.c: In function ‘infoleak_stack_unchecked_err’:
> infoleak-antipatterns-1.c:118:10: warning: potential exposure of
> sensitive information by copying uninitialized data from stack across
> trust boundary [CWE-200] [-Wanalyzer-exposure-through-uninit-copy]
>   118 |   err |= copy_to_user (dst, &st, sizeof(st));
>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   ‘infoleak_stack_unchecked_err’: events 1-4
>     |
>     |  110 |   struct infoleak_buf st;
>     |      |                       ^~
>     |      |                       |
>     |      |                       (1) source region created on stack here
>     |      |                       (2) capacity: 256 bytes
>     |......
>     |  117 |   int err = copy_from_user (&st, src, sizeof(st));
>     |      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |             |
>     |      |             (3) when ‘copy_from_user’ fails, returning non-zero
>     |  118 |   err |= copy_to_user (dst, &st, sizeof(st));
>     |      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |          |
>     |      |          (4) uninitialized data copied from stack here
>     |
> 
> i.e. it allows the analyzer to provide a hint to the reader of the
> analyzer output.  The attribute is also a hint to the human reader of
> the source code.

But how do you tell the analyser what is success and what is failure?
Do you always count non-zero return values as failure, like here?  There
are other conventions (negative means error, zero means error, etc.)
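
[Editorial sketch of the three conventions just listed. The enum and function
names are hypothetical and correspond to nothing in GCC; the sketch only shows
that the same return value is classified differently depending on which
convention a function follows, so a checker needs to be told which one
applies.]

```c
#include <stdbool.h>

/* Hypothetical tags for the conventions mentioned above.  */
enum ret_convention {
  RET_NONZERO_IS_FAILURE,   /* as in the copy_from_user example */
  RET_NEGATIVE_IS_FAILURE,  /* negative means error */
  RET_ZERO_IS_FAILURE       /* zero means error */
};

/* Classify a return value under the given convention.  */
bool call_failed(enum ret_convention conv, long ret)
{
  switch (conv) {
  case RET_NONZERO_IS_FAILURE:
    return ret != 0;
  case RET_NEGATIVE_IS_FAILURE:
    return ret < 0;
  case RET_ZERO_IS_FAILURE:
    return ret == 0;
  }
  return false;  /* not reached for valid tags */
}
```

[For example, a return value of 5 is a failure under the first convention and
a success under the other two.]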


Segher
