[Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel

* [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
@ 2016-07-19 15:32 Eric W. Biederman
  2016-07-19 17:31 ` Mark Brown
                   ` (5 more replies)
  0 siblings, 6 replies; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-19 15:32 UTC (permalink / raw)
  To: ksummit-discuss

Historically the types in C came about because the machines
fundamentally supported different data types either with different
sizes or different characteristics (e.g. u8, u16, float, double).
These data types and the C type system were built so programmers
could tell the machine what they needed it to do.

There is another genesis of types that started with the simply typed
lambda calculus that is about eliminating common errors and otherwise
helping a programmer get their code right.

In the years since C was invented there has been a lot of activity and a
little bit of progress in this area.  Would people be receptive to
improvements in this area?

I would like to talk to folks and gauge what it would take to make
improvements in this area acceptable, practical, and useful.

Would a gcc plugin that checks the most interesting things that sparse
checks on every build be interesting? (endianness of integer types for example)
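
For readers unfamiliar with the sparse check in question, here is a
minimal sketch of the pattern, not kernel code.  It assumes sparse's
"bitwise" and "force" attributes, which are only active when sparse
defines __CHECKER__.  The names SKETCH_BITWISE, SKETCH_FORCE, be32_t,
cpu_to_be32_sketch() and be32_to_cpu_sketch() are hypothetical
stand-ins for the kernel's __bitwise, __force, __be32, cpu_to_be32()
and be32_to_cpu().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Only sparse understands these attributes; a normal compiler sees nothing. */
#ifdef __CHECKER__
#define SKETCH_BITWISE __attribute__((bitwise))
#define SKETCH_FORCE   __attribute__((force))
#else
#define SKETCH_BITWISE
#define SKETCH_FORCE
#endif

/* A 32-bit value stored in big-endian byte order (hypothetical name). */
typedef uint32_t SKETCH_BITWISE be32_t;

/* Convert a CPU-order value to big-endian storage, whatever the host is. */
static be32_t cpu_to_be32_sketch(uint32_t v)
{
	const uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8),  (uint8_t)v
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (SKETCH_FORCE be32_t)raw;   /* the one sanctioned cast */
}

/* Convert big-endian storage back to a CPU-order value. */
static uint32_t be32_to_cpu_sketch(be32_t v)
{
	uint32_t raw = (SKETCH_FORCE uint32_t)v;
	uint8_t b[4];

	memcpy(b, &raw, sizeof(raw));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Under an ordinary compiler the attributes vanish and be32_t is just a
uint32_t.  Under sparse, mixing a be32_t with a plain integer without
going through one of the helpers should be reported as a type mismatch,
which is the class of error the question above is about.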

Would a type system for pointers derived from separation logic that
has the concept that a piece of data is owned by a piece of running
code rather than another piece of data be interesting?

  * This cleanly allows for doubly linked lists.

  * This is useful to ensure that data is either put in another data
    structure where it is remembered or it is freed.

  * This is useful to ensure reference counts are not leaked.

  * This is useful to ensure that every lock is paired with an unlock.
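
The lock/unlock pairing in the last bullet can already be approximated
with sparse's context tracking.  The following is a minimal sketch under
stated assumptions: it uses sparse's "context" attribute and __context__
statement, which exist only under __CHECKER__, and a toy single-threaded
lock.  All SKETCH_* macros, struct toy_lock and counter_add() are
hypothetical names invented for illustration; the kernel spells the
annotations __acquires, __releases, __acquire and __release.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef __CHECKER__
#define SKETCH_ACQUIRES(x) __attribute__((context(x, 0, 1)))
#define SKETCH_RELEASES(x) __attribute__((context(x, 1, 0)))
#define SKETCH_ACQUIRE(x)  __context__(x, 1)
#define SKETCH_RELEASE(x)  __context__(x, -1)
#else
#define SKETCH_ACQUIRES(x)
#define SKETCH_RELEASES(x)
#define SKETCH_ACQUIRE(x)  (void)0
#define SKETCH_RELEASE(x)  (void)0
#endif

/* Toy "lock": it only records whether it is held.  Not thread safe. */
struct toy_lock {
	bool held;
};

/* Declarations carry the annotations: entered unlocked, left locked, etc. */
static void toy_lock_acquire(struct toy_lock *l) SKETCH_ACQUIRES(l);
static void toy_lock_release(struct toy_lock *l) SKETCH_RELEASES(l);

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
	SKETCH_ACQUIRE(l);
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
	SKETCH_RELEASE(l);
}

/* Every path out of this function releases the lock it took. */
static int counter_add(struct toy_lock *l, int *counter, int delta)
{
	toy_lock_acquire(l);
	if (delta < 0) {
		/* Dropping this release is the bug sparse is meant to flag. */
		toy_lock_release(l);
		return -1;
	}
	*counter += delta;
	toy_lock_release(l);
	return 0;
}
```

This only counts acquire/release events per function; it does not track
who owns the data the lock protects, which is the part the proposal
above would add.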

My personal filter for things like this is types that can be checked
in time proportional to the amount of code to be built so that it is
roughly the same situation we are in now.

Given its heritage and its history, the type system in C has serious
limitations when it comes to catching programmer mistakes, and I don't
know if they are correctable: silent truncation of unsigned types into
smaller unsigned types, casts, etc.  Would people be willing to consider
a simple, link-compatible alternative to C for some of the code in the
kernel that had the same low-level control of the machine but had a type
system that made catching mistakes easier?
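
To make the truncation problem concrete, here is a small sketch in
standard C.  The function names narrow_silently() and narrow_checked()
are hypothetical; the second one shows the run-time check a programmer
has to write by hand today, which a stricter type system could instead
require at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * C accepts this without a cast, and without a diagnostic at default
 * warning levels: the upper 16 bits are simply discarded.
 */
static uint16_t narrow_silently(uint32_t v)
{
	return v;
}

/*
 * The hand-written alternative: refuse values that do not fit and
 * report failure to the caller instead of losing bits.
 */
static bool narrow_checked(uint32_t v, uint16_t *out)
{
	if (v > UINT16_MAX)
		return false;
	*out = (uint16_t)v;
	return true;
}
```

For example, narrow_silently(0x12345) yields 0x2345, while
narrow_checked(0x12345, &out) returns false and leaves out untouched.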

Deploying solutions like this will take a fair bit of grunt work, and
time similar or worse than the big kernel lock removal.  Given how
widely Linux is used and how annoying some of these bugs can be I think
it is worthwhile to dig in and see what kind of improvements can be
made.

I would really like to get a feel from kernel maintainers and
developers for whether this is something that is interesting, and what
kind of constraints they think something like this would need to meet
to be usable for the kernel.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
@ 2016-07-19 17:31 ` Mark Brown
  2016-07-19 18:52   ` Jiri Kosina
  2016-07-19 21:08 ` [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel James Bottomley
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 82+ messages in thread
From: Mark Brown @ 2016-07-19 17:31 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:

> Would a gcc plugin that checks the most interesting things that sparse
> checks on every build be interesting? (endianness of integer types for example)

There's a push from certain quarters to move away from GCC to LLVM.  Not
that these things are unachievable in LLVM but it's a thing.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 17:31 ` Mark Brown
@ 2016-07-19 18:52   ` Jiri Kosina
  2016-07-19 20:39     ` Eric W. Biederman
  2016-07-20 15:53     ` Mark Brown
  0 siblings, 2 replies; 82+ messages in thread
From: Jiri Kosina @ 2016-07-19 18:52 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Tue, 19 Jul 2016, Mark Brown wrote:

> There's a push from certain quarters to move away from GCC to LLVM.  

This might actually be an interesting topic per se.

LLVM definitely has quite some nice features, but their attitude towards 
bugs which are rather severe for kernel programming should be taken as a 
warning at least. Look at the "pushf/popf being generated around 
local_irq_save()" trainwreck as an example.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 18:52   ` Jiri Kosina
@ 2016-07-19 20:39     ` Eric W. Biederman
  2016-07-20 15:53     ` Mark Brown
  1 sibling, 0 replies; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-19 20:39 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

Jiri Kosina <jikos@kernel.org> writes:

> On Tue, 19 Jul 2016, Mark Brown wrote:
>
>> There's a push from certain quarters to move away from GCC to LLVM.  
>
> This might actually be an interesting topic per se.
>
> LLVM definitely has quite some nice features, but their attitude towards 
> bugs which are rather severe for kernel programming should be taken as a 
> warning at least. Look at the "pushf/popf being generated around 
> local_irq_save()" trainwreck as an example.

We have the precedent of sparse that shows how we can have extended
types with the existing compiler seeing the ordinary types.  I can't
imagine doing anything different if we continue to use C code.  Anything
else would just be silly.

I can see value in a kernel that multiple compilers can build.  But so
far a switch to LLVM looks scary.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
  2016-07-19 17:31 ` Mark Brown
@ 2016-07-19 21:08 ` James Bottomley
  2016-07-20  0:08   ` Eric W. Biederman
  2016-07-19 21:26 ` Josh Triplett
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 82+ messages in thread
From: James Bottomley @ 2016-07-19 21:08 UTC (permalink / raw)
  To: Eric W. Biederman, ksummit-discuss

On Tue, 2016-07-19 at 10:32 -0500, Eric W. Biederman wrote:
> Historically the types in C came about because the machines
> fundamentally supported different data types either with different
> sizes or different characteristics (e.g. u8, u16, float, double).
> These data types and the C type system were built so programmers
> could tell the machine what they needed it to do.
> 
> There is another genesis of types that started with the simply typed
> lambda calculus that is about eliminating common errors and otherwise
> helping a programmer get their code right.
> 
> In the years since C was invented there has been a lot of activity
> and a
> little bit of progress in this area.  Would people be receptive to
> improvements in this area?
> 
> I would like to talk to folks and gauge what it would take to make
> improvements in this area acceptable, practical, and useful.
> 
> Would a gcc plugin that checks the most interesting things that 
> sparse checks on every build be interesting? (endianness of integer 
> types for example)

How would this be different from simply automatically running sparse in
the kernel build if the binary is present (effectively making make C=1
the default)?

> Would a type system for pointers derived from separation logic that
> has the concept that a piece of data is owned by a piece of running
> code rather than another piece of data be interesting?

By this you mean a thread of execution that should be expected to free
the data pointed to when it finishes?  Sort of like a self garbage
collecting reference?

>   * This cleanly allows for doubly linked lists.
>   
>   * This is useful to ensure that data is either put in another data
>     structure where it is remembered or it is freed.
> 
>   * This is useful to ensure reference counts are not leaked.
> 
>   * This is useful to ensure that every lock is paired with an
>     unlock.
> 
> My personal filter for things like this is types that can be checked
> in time proportional to the amount of code to be built so that it is
> roughly the same situation we are in now.
> 
> 
> Given its heritage and its history the type system in C has serious
> limitations that I don't know if they are correctable, when it comes
> to catching programmer mistakes: silent truncation of unsigned types
> into smaller unsigned types, casts, etc.  Would people be willing to
> consider a simple, link compatible alternative to C for some of the 
> code in the kernel that had the same low level control of the machine 
> but had a type system that made catching mistakes easier?
> 
> 
> Deploying solutions like this will take a fair bit of grunt work, and
> time similar or worse than the big kernel lock removal.  Given how
> widely Linux is used and how annoying some of these bugs can be I 
> think it is worthwhile to dig in and see what kind of improvements 
> can be made.
> 
> I would really like to get a feel among kernel maintainers and
> developers if this is something that is interesting, and what kind of
> constraints they think something like this would need to be usable 
> for the kernel?

I've got to say that rewriting in some as yet undefined language simply
to get better typing and conversions looks very daunting.  Before even
considering this, can you answer why an extension to sparse, which
would mostly flag the problems if we use the correct annotations,
wouldn't work just as well?  We'd still have to add the additional
annotations (and someone would have to update sparse) but it would
still be a lot easier than rewriting and giving the kernel a build
dependency on whatever the rewrite is done in.

James

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
  2016-07-19 17:31 ` Mark Brown
  2016-07-19 21:08 ` [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel James Bottomley
@ 2016-07-19 21:26 ` Josh Triplett
  2016-07-20  2:36   ` Eric W. Biederman
  2016-07-30 18:03   ` Eric W. Biederman
  2016-07-21 15:05 ` David Howells
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 82+ messages in thread
From: Josh Triplett @ 2016-07-19 21:26 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> Would a gcc plugin that checks the most interesting things that sparse
> checks on every build be interesting? (endianness of integer types for example)

I'd like to see those checks more widely available, ideally not just as
plugins.  Some exploration of that occurred upstream:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59852 (bitwise/endian types)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59856 (contexts/locking)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59851 (nocast: no implicit conversions)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59850 (address spaces)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59855 (designated_init; done)

I'd love to see someone pick those up and get them into upstream GCC.

> Would a type system for pointers derived from separation logic that
> has the concept that a piece of data is owned by a piece of running
> code rather than another piece of data be interesting?

Interesting, yes, but trying to track "ownership" gets complicated
*fast* to handle real-world cases.  Rust went through quite a lot of
work, and multiple iterations, to get to the system it has now.  I don't
think you'd be able to handle many of the cases in the kernel without
about that much complexity.

> I would really like to get a feel among kernel maintainers and
> developers if this is something that is interesting, and what kind of
> constraints they think something like this would need to be usable for
> the kernel?

I think the biggest constraint is that new tools get very slow adoption,
and it's incredibly difficult to introduce a new *mandatory* tool or
compiler version (with the exception of tools that ship with the
kernel).  And optional ones have a tendency to break due to patches from
people not running them.  Apart from that: false positive rate.

Ideally, build something you can opt into using, such that if you
explicitly use it, the false positive rate should be *zero* by design.

- Josh Triplett

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 21:08 ` [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel James Bottomley
@ 2016-07-20  0:08   ` Eric W. Biederman
  2016-07-20  7:32     ` Julia Lawall
  2016-07-20 12:11     ` Jan Kara
  0 siblings, 2 replies; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-20  0:08 UTC (permalink / raw)
  To: James Bottomley; +Cc: ksummit-discuss

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> On Tue, 2016-07-19 at 10:32 -0500, Eric W. Biederman wrote:
>> Historically the types in C came about because the machines
>> fundamentally supported different data types either with different
>> sizes or different characteristics (e.g. u8, u16, float, double).
>> These data types and the C type system were built so programmers
>> could tell the machine what they needed it to do.
>> 
>> There is another genesis of types that started with the simply typed
>> lambda calculus that is about eliminating common errors and otherwise
>> helping a programmer get their code right.
>> 
>> In the years since C was invented there has been a lot of activity
>> and a
>> little bit of progress in this area.  Would people be receptive to
>> improvements in this area?
>> 
>> I would like to talk to folks and gauge what it would take to make
>> improvements in this area acceptable, practical, and useful.
>> 
>> Would a gcc plugin that checks the most interesting things that 
>> sparse checks on every build be interesting? (endianness of integer 
>> types for example)
>
> How would this be different from simply automatically running sparse in
> the kernel build if the binary is present (effectively making make C=1
> the default)?

Nothing.  I am just honestly looking at ways that we can get things to
always or almost always run.  Sparse isn't getting run regularly now, so
I suspect that would not be as good a solution.

>> Would a type system for pointers derived from separation logic that
>> has the concept that a piece of data is owned by a piece of running
>> code rather than another piece of data be interesting?
>
> By this you mean a thread of execution that should be expected to free
> the data pointed to when it finishes?  Sort of like a self garbage
> collecting reference?

Not really.  The big idea is to make the key concepts of how a
programmer reasons about their data structures expressible in the type
system, instead of making programmers go through convoluted logic to
express things that are trivial to implement in a type system.

Another way to talk about it would be complete alias analysis at type
checking time.

Which means that the type system then knows essentially everything the
programmer knows about aliases and object lifetimes and if you don't
pass something on or free it, the type checker then knows you made
a mistake.

>>   * This cleanly allows for doubly linked lists.
>>   
>>   * This is useful to ensure that data is either put in another data
>>     structure where it is remembered or it is freed.
>> 
>>   * This is useful to ensure reference counts are not leaked.
>> 
>>   * This is useful to ensure that every lock is paired with an
>>     unlock.
>> 
>> My personal filter for things like this is types that can be checked
>> in time proportional to the amount of code to be built so that it is
>> roughly the same situation we are in now.
>> 
>> 
>> Given its heritage and its history the type system in C has serious
>> limitations that I don't know if they are correctable, when it comes
>> to catching programmer mistakes: silent truncation of unsigned types
>> into smaller unsigned types, casts, etc.  Would people be willing to
>> consider a simple, link compatible alternative to C for some of the 
>> code in the kernel that had the same low level control of the machine 
>> but had a type system that made catching mistakes easier?
>> 
>> 
>> Deploying solutions like this will take a fair bit of grunt work, and
>> time similar or worse than the big kernel lock removal.  Given how
>> widely Linux is used and how annoying some of these bugs can be I 
>> think it is worthwhile to dig in and see what kind of improvements 
>> can be made.
>> 
>> I would really like to get a feel among kernel maintainers and
>> developers if this is something that is interesting, and what kind of
>> constraints they think something like this would need to be usable 
>> for the kernel?
>
> I've got to say that rewriting in some as yet undefined language simply
> to get better typing and conversions looks very daunting.  Before even
> considering this, can you answer why an extension to sparse, which
> would mostly flag the problems if we use the correct annotations,
> wouldn't work just as well?  We'd still have to add the additional
> annotations (and someone would have to update sparse) but it would
> still be a lot easier than rewriting and giving the kernel a build
> dependency on whatever the rewrite is done in.

That is a very good question.  The short answer is that I am not yet
convinced I can figure out how to retrofit the type checks I want onto
C's type system.  I have implemented the type checks I am looking at in
a fresh language I am toying with, where I can throw out inconvenient,
unnecessary corner cases.  After I finished boiling the ocean, the basic
types were similar enough to C's type system that I think I can make
the connection, but I haven't tried yet.

Assuming things can be sorted out so they coexist nicely with C's
types, there is no advantage to a new language.  If it turns out that to
make the code not a pain to write we need better syntax or some basic
type inference, there is a large advantage in converting code.

The first place I can think of where I might get hung up is in the area
of discriminated unions.  I am pretty certain that some cases of type
safety will require them and C doesn't have anything like that today.  A
practical example would be a type system that ensures you call
IS_ERR(ptr) before you use a pointer as a pointer.
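
As an illustration of what C can and cannot express here, the following
is a minimal sketch of a hand-rolled discriminated union.  The names
struct ptr_or_err, RES_PTR, RES_ERR, res_ptr() and lookup() are
hypothetical and do not correspond to any kernel API.  Note the
limitation: the tag is only checked at run time by an assert(), whereas
the type system described above would reject an unchecked use at
compile time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Either a valid pointer or a negative errno value, never both. */
struct ptr_or_err {
	enum { RES_PTR, RES_ERR } tag;
	union {
		void *ptr;
		int err;
	} u;
};

/* The only sanctioned way to get at the pointer; traps on misuse. */
static void *res_ptr(const struct ptr_or_err *r)
{
	assert(r->tag == RES_PTR);
	return r->u.ptr;
}

/* Return a pointer to table[idx], or -ENOENT if idx is out of range. */
static struct ptr_or_err lookup(int *table, size_t n, size_t idx)
{
	struct ptr_or_err r;

	if (idx >= n) {
		r.tag = RES_ERR;
		r.u.err = -ENOENT;
	} else {
		r.tag = RES_PTR;
		r.u.ptr = &table[idx];
	}
	return r;
}
```

Nothing stops a caller from reading r.u.ptr directly without looking at
the tag, which is exactly the gap a checked discriminated union would
close.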

Ordinary C unions and type casts that serve the same purpose would at the
end of the day all have to be reabstracted, because otherwise memory
safety could not be analyzed.

For some of that reabstraction I will need to introduce type variables,
which are a completely foreign concept to C.

On the flip side, since the Linux kernel and low-level programming where
precise control of the machine happens is my target audience, it is
definitely worth looking at what can be done in a context like sparse.
That is the fastest way to connect the tools with the real-world
problems.  If it can't be done, or if it can't be made nice, simple,
and easy to use, then there will be good arguments for why we need a new
syntax.

Mostly I asked the question because the type system would be so much
easier in a green field setting, and I wanted to be lazy.  So hearing
the common sense request ("Please fix C.") is helpful to get that lazy
part of myself in gear.

The end game for me is a type system that doesn't permit memory errors,
reference counting errors, locking errors, or deadlocks, while at the
same time allowing pretty much the code and data structures that we use
in the kernel today.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 21:26 ` Josh Triplett
@ 2016-07-20  2:36   ` Eric W. Biederman
  2016-07-30 18:03   ` Eric W. Biederman
  1 sibling, 0 replies; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-20  2:36 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

Josh Triplett <josh@joshtriplett.org> writes:

> On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> Would a gcc plugin that checks the most interesting things that sparse
>> checks on every build be interesting? (endianness of integer types for example)
>
> I'd like to see those checks more widely available, ideally not just as
> plugins.  Some exploration of that occurred upstream:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59852 (bitwise/endian types)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59856 (contexts/locking)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59851 (nocast: no implicit conversions)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59850 (address spaces)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59855 (designated_init; done)
>
> I'd love to see someone pick those up and get them into upstream GCC.

Interesting.

Whatever I do, I intend to start on the hard problem of types that know
where all of your aliases are, because I have something that works and is
very interesting there.  So I won't volunteer for those, but that does
look like a reasonable part of the discussion.

>> Would a type system for pointers derived from separation logic that
>> has the concept that a piece of data is owned by a piece of running
>> code rather than another piece of data be interesting?
>
> Interesting, yes, but trying to track "ownership" gets complicated
> *fast* to handle real-world cases.  Rust went through quite a lot of
> work, and multiple iterations, to get to the system it has now.  I don't
> think you'd be able to handle many of the cases in the kernel without
> about that much complexity.

So "ownership" may be the wrong word.  See my reply to James Bottomley.
Rust never advanced past what are effectively smart pointers, and smart
pointers are the wrong concept for tracking aliases.

To get past that, it takes turning all of your trained expectations of
what a type system is tracking inside out (or at least it did for me),
but when you are done there is something that is much more expressive
and simpler than what Rust implemented.

Our fundamental complexity with SMP synchronization may add the
complexity back, but it is undoubtedly possible to do better than Rust.
The way our kernel is built all of the core kernel would need to be in a
Rust unsafe block.  So I would not even consider doing what Rust did.

>> I would really like to get a feel among kernel maintainers and
>> developers if this is something that is interesting, and what kind of
>> constraints they think something like this would need to be usable for
>> the kernel?
>
> I think the biggest constraint is that new tools get very slow adoption,
> and it's incredibly difficult to introduce a new *mandatory* tool or
> compiler version (with the exception of tools that ship with the
> kernel).  And optional ones have a tendency to break due to patches from
> people not running them.  Apart from that: false positive rate.
>
> Ideally, build something you can opt into using, such that if you
> explicitly use it, the false positive rate should be *zero* by design.

With types you do get a *zero* false positive rate by design.
Although sometimes you get a lot of const-correctness kinds of
conversion requirements that are a royal PITA.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-20  0:08   ` Eric W. Biederman
@ 2016-07-20  7:32     ` Julia Lawall
  2016-07-20 12:11     ` Jan Kara
  1 sibling, 0 replies; 82+ messages in thread
From: Julia Lawall @ 2016-07-20  7:32 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: James Bottomley, ksummit-discuss



On Tue, 19 Jul 2016, Eric W. Biederman wrote:

> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
>
> > On Tue, 2016-07-19 at 10:32 -0500, Eric W. Biederman wrote:
> >> Historically the types in C came about because the machines
> >> fundamentally supported different data types either with different
> >> sizes or different characteristics (e.g. u8, u16, float, double).
> >> These data types and the C type system were built so programmers
> >> could tell the machine what they needed it to do.
> >>
> >> There is another genesis of types that started with the simply typed
> >> lambda calculus that is about eliminating common errors and otherwise
> >> helping a programmer get their code right.
> >>
> >> In the years since C was invented there has been a lot of activity
> >> and a
> >> little bit of progress in this area.  Would people be receptive to
> >> improvements in this area?
> >>
> >> I would like to talk to folks and gauge what it would take to make
> >> improvements in this area acceptable, practical, and useful.
> >>
> >> Would a gcc plugin that checks the most interesting things that
> >> sparse checks on every build be interesting? (endianness of integer
> >> types for example)
> >
> > How would this be different from simply automatically running sparse in
> > the kernel build if the binary is present (effectively making make C=1
> > the default)?
>
> Nothing.  I am just honestly looking at ways that we can get things to
> always or almost always run.  Sparse isn't getting run regularly now, so
> I suspect that would not be as good a solution.

A problem with putting time consuming tools into the normal build process
is that they work on all the code you compile, which may be a large
superset of the code that is actually changed.

julia


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-20  0:08   ` Eric W. Biederman
  2016-07-20  7:32     ` Julia Lawall
@ 2016-07-20 12:11     ` Jan Kara
  2016-07-28  3:33       ` Steven Rostedt
  1 sibling, 1 reply; 82+ messages in thread
From: Jan Kara @ 2016-07-20 12:11 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: James Bottomley, ksummit-discuss

On Tue 19-07-16 19:08:12, Eric W. Biederman wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> > On Tue, 2016-07-19 at 10:32 -0500, Eric W. Biederman wrote:
> >> Historically the types in C came about because the machines
> >> fundamentally supported different data types either with different
> >> sizes or different characteristics (i.e. u8, u16, float, double).
> >> These data types and the C type system were built so programmers
> >> could tell the machine what they needed it to do.
> >> 
> >> There is another genesis of types that started with the simply typed
> >> lambda calculus that is about eliminating common errors and otherwise
> >> helping a programmer get their code right.
> >> 
> >> In the years since C was invented there has been a lot of activity
> >> and a
> >> little bit of progress in this area.  Would people be receptive to
> >> improvements in this area?
> >> 
> >> I would like to talk to folks and gauge what it would take to make
> >> improvements in this area acceptable, practical, and useful.
> >> 
> >> Would a gcc plugin that checks the most interesting things that 
> >> sparse checks on every build be interesting? (endianness of integer 
> >> types for example)
> >
> > How would this be different from simply automatically running sparse in
> > the kernel build if the binary is present (effectively making make C=1
> > the default)?
> 
> Nothing.  I am just honestly looking at ways that we can get things to
> always or almost always run.  Sparse isn't getting run regularly now so
> I suspect that would not be as good a solution.

Isn't sparse run by 0-day testing? I thought it is...


								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 18:52   ` Jiri Kosina
  2016-07-19 20:39     ` Eric W. Biederman
@ 2016-07-20 15:53     ` Mark Brown
  2016-07-20 17:04       ` [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM Jiri Kosina
  1 sibling, 1 reply; 82+ messages in thread
From: Mark Brown @ 2016-07-20 15:53 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 08:52:33PM +0200, Jiri Kosina wrote:
> On Tue, 19 Jul 2016, Mark Brown wrote:

> > There's a push from certain quarters to move away from GCC to LLVM.  

> This might actually be an interesting topic per se.

Yes, indeed.

> LLVM definitely has quite some nice features, but their attitude towards 
> bugs which are rather severe for kernel programming should be taken as a 
> warning at least. Look at the "pushf/popf being generated around 
> local_irq_save()" trainwreck as an example.

My understanding (which is a bit second hand here) is that LLVM upstream
really values direct engagement at time of development much more than
anything else - AFAICT they're more focused on driving things forward,
partly due to expecting other people to do the main release management.
To that end if people really want to run LLVM built kernels in
production they probably need to either use a downstream that ensures
that things are working well for kernel builds or work directly on
testing development versions of the compilers to catch issues there.

* [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-20 15:53     ` Mark Brown
@ 2016-07-20 17:04       ` Jiri Kosina
  2016-07-20 18:35         ` Alexey Dobriyan
  2016-07-21  9:54         ` David Woodhouse
  0 siblings, 2 replies; 82+ messages in thread
From: Jiri Kosina @ 2016-07-20 17:04 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Wed, 20 Jul 2016, Mark Brown wrote:

> > > There's a push from certain quarters to move away from GCC to LLVM.  
> 
> > This might actually be an interesting topic per se.
> 
> Yes, indeed.

Let's make this a real proposal then ... (subject changed). I am again a 
bit unsure about the core / tech division here.

People who should be invited: proponents of the push from the certain 
quarters mentioned by Mark above, and ideally some LLVM folks as well.

I've never actually used LLVM to compile the kernel (which makes me a rather
poor contributor should any such discussion happen), but I've been on the
"receiving side", debugging a crash that turned out to be LLVM messing with
the IF flag in a way that interferes with local_irq_save(). I also suffered
the follow-up frustration of finding out that this had been reported to the
LLVM folks ages ago and they hadn't bothered to fix it (it's now at least
worked around, in a very sub-optimal way (lahf/sahf)).

Thanks,

-- 
Jiri Kosina
SUSE Labs

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-20 17:04       ` [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM Jiri Kosina
@ 2016-07-20 18:35         ` Alexey Dobriyan
  2016-07-20 18:52           ` Mark Brown
  2016-07-21  9:54         ` David Woodhouse
  1 sibling, 1 reply; 82+ messages in thread
From: Alexey Dobriyan @ 2016-07-20 18:35 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Wed, Jul 20, 2016 at 07:04:59PM +0200, Jiri Kosina wrote:
> On Wed, 20 Jul 2016, Mark Brown wrote:
> 
> > > > There's a push from certain quarters to move away from GCC to LLVM.  
> > 
> > > This might actually be an interesting topic per se.
> > 
> > Yes, indeed.
> 
> Let's make this a real proposal then ... (subject changed). I am again a 
> bit unsure about the core / tech division here.
> 
> People who should be invited: proponents of the push from the certain 
> quarters mentioned by Mark above, and ideally some LLVM folks as well.

This is a bit premature until the Linux Clang patchset is empty.
As an example, alternatives do not work and IIRC VLAs won't work ever:
https://llvm.org/bugs/show_bug.cgi?id=24487

Of course, compiling with Clang should be encouraged. At worst, it is
yet another static checker.

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-20 18:35         ` Alexey Dobriyan
@ 2016-07-20 18:52           ` Mark Brown
  0 siblings, 0 replies; 82+ messages in thread
From: Mark Brown @ 2016-07-20 18:52 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: ksummit-discuss

On Wed, Jul 20, 2016 at 09:35:30PM +0300, Alexey Dobriyan wrote:
> On Wed, Jul 20, 2016 at 07:04:59PM +0200, Jiri Kosina wrote:

> > People who should be invited: proponents of the push from the certain 
> > quarters mentioned by Mark above, and ideally some LLVM folks as well.

> This is a bit premature until the Linux Clang patchset is empty.
> As an example, alternatives do not work and IIRC VLAs won't work ever:
> https://llvm.org/bugs/show_bug.cgi?id=24487

The main interest I'm aware of is from people working on ARM (and it
looks like there's some MIPS as well from the LLVMLinux git) so x86
specific issues might not be such a big deal.  There's definitely some
upstreaming still to do but the delta is getting smaller.

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-20 17:04       ` [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM Jiri Kosina
  2016-07-20 18:35         ` Alexey Dobriyan
@ 2016-07-21  9:54         ` David Woodhouse
  2016-07-21 13:41           ` Shuah Khan
  2016-07-21 18:38           ` Jiri Kosina
  1 sibling, 2 replies; 82+ messages in thread
From: David Woodhouse @ 2016-07-21  9:54 UTC (permalink / raw)
  To: Jiri Kosina, Mark Brown; +Cc: ksummit-discuss

On Wed, 2016-07-20 at 19:04 +0200, Jiri Kosina wrote:
> On Wed, 20 Jul 2016, Mark Brown wrote:
> 
> > > > There's a push from certain quarters to move away from GCC to LLVM.  
> > 
> > > This might actually be an interesting topic per se.
> > 
> > Yes, indeed.
> 
> Let's make this a real proposal then ... (subject changed). I am again a 
> bit unsure about the core / tech division here.
> 
> People who should be invited: proponents of the push from the certain 
> quarters mentioned by Mark above, and ideally some LLVM folks as well.
> 
> I've never actually used LLVM to compile the kernel (which makes me a rather
> poor contributor should any such discussion happen), but I've been on the
> "receiving side", debugging a crash that turned out to be LLVM messing with
> the IF flag in a way that interferes with local_irq_save(). I also suffered
> the follow-up frustration of finding out that this had been reported to the
> LLVM folks ages ago and they hadn't bothered to fix it (it's now at least
> worked around, in a very sub-optimal way (lahf/sahf)).

I got involved in building the kernel with LLVM a little while ago,
after accidentally implementing .code16 support in LLVM — for other
reasons, but it allowed the arch/x86/boot/ bits to be built with LLVM.

Apart from resolutely not wanting to implement variable length arrays
on the stack, the LLVM folks actually seem quite keen to make things
work. I'm interested in the problem you report above.. and note the
absence of a bug number. Can you provide it?

You're right that it does take a while to get some things fixed, but
people *are* doing a fairly good job of identifying them, filing bugs,
and implementing workarounds until the bugs can be fixed.

Building with LLVM has also helped to find some real kernel bugs. I'd
be keen to get this working more widely.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21  9:54         ` David Woodhouse
@ 2016-07-21 13:41           ` Shuah Khan
  2016-07-21 14:02             ` David Woodhouse
  2016-07-21 18:38           ` Jiri Kosina
  1 sibling, 1 reply; 82+ messages in thread
From: Shuah Khan @ 2016-07-21 13:41 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 3:54 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Wed, 2016-07-20 at 19:04 +0200, Jiri Kosina wrote:
>> On Wed, 20 Jul 2016, Mark Brown wrote:
>>
>> > > > There's a push from certain quarters to move away from GCC to LLVM.
>> >
>> > > This might actually be an interesting topic per se.
>> >
>> > Yes, indeed.
>>
>> Let's make this a real proposal then ... (subject changed). I am again a
>> bit unsure about the core / tech division here.
>>
>> People who should be invited: proponents of the push from the certain
>> quarters mentioned by Mark above, and ideally some LLVM folks as well.
>>
>> I've never actually used LLVM to compile the kernel (which makes me a rather
>> poor contributor should any such discussion happen), but I've been on the
>> "receiving side", debugging a crash that turned out to be LLVM messing with
>> the IF flag in a way that interferes with local_irq_save(). I also suffered
>> the follow-up frustration of finding out that this had been reported to the
>> LLVM folks ages ago and they hadn't bothered to fix it (it's now at least
>> worked around, in a very sub-optimal way (lahf/sahf)).
>
> I got involved in building the kernel with LLVM a little while ago,
> after accidentally implementing .code16 support in LLVM — for other
> reasons, but it allowed the arch/x86/boot/ bits to be built with LLVM.
>
> Apart from resolutely not wanting to implement variable length arrays
> on the stack, the LLVM folks actually seem quite keen to make things
> work. I'm interested in the problem you report above.. and note the
> absence of a bug number. Can you provide it?
>
> You're right that it does take a while to get some things fixed, but
> people *are* doing a fairly good job of identifying them, filing bugs,
> and implementing workarounds until the bugs can be fixed.
>
> Building with LLVM has also helped to find some real kernel bugs. I'd
> be keen to get this working more widely.
>

Would you be willing to share your experiences and the nature of the bugs
you were able to find using LLVM? Maybe that could be folded into this
discussion as real-life experience.

thanks,
-- Shuah

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21 13:41           ` Shuah Khan
@ 2016-07-21 14:02             ` David Woodhouse
  2016-07-21 16:21               ` Mark Brown
  0 siblings, 1 reply; 82+ messages in thread
From: David Woodhouse @ 2016-07-21 14:02 UTC (permalink / raw)
  To: Shuah Khan, Behan Webster; +Cc: ksummit-discuss

On Thu, 2016-07-21 at 07:41 -0600, Shuah Khan wrote:
> On Thu, Jul 21, 2016 at 3:54 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> > On Wed, 2016-07-20 at 19:04 +0200, Jiri Kosina wrote:
> > > On Wed, 20 Jul 2016, Mark Brown wrote:
> > > 
> > > > > > There's a push from certain quarters to move away from GCC to LLVM.
> > > > 
> > > > > This might actually be an interesting topic per se.
> > > > 
> > > > Yes, indeed.
> > > 
> > > Let's make this a real proposal then ... (subject changed). I am again a
> > > bit unsure about the core / tech division here.
> > > 
> > > People who should be invited: proponents of the push from the certain
> > > quarters mentioned by Mark above, and ideally some LLVM folks as well.
> > > 
> > > I've never actually used LLVM to compile the kernel (which makes me a rather
> > > poor contributor should any such discussion happen), but I've been on the
> > > "receiving side", debugging a crash that turned out to be LLVM messing with
> > > the IF flag in a way that interferes with local_irq_save(). I also suffered
> > > the follow-up frustration of finding out that this had been reported to the
> > > LLVM folks ages ago and they hadn't bothered to fix it (it's now at least
> > > worked around, in a very sub-optimal way (lahf/sahf)).
> > 
> > I got involved in building the kernel with LLVM a little while ago,
> > after accidentally implementing .code16 support in LLVM — for other
> > reasons, but it allowed the arch/x86/boot/ bits to be built with LLVM.
> > 
> > Apart from resolutely not wanting to implement variable length arrays
> > on the stack, the LLVM folks actually seem quite keen to make things
> > work. I'm interested in the problem you report above.. and note the
> > absence of a bug number. Can you provide it?
> > 
> > You're right that it does take a while to get some things fixed, but
> > people *are* doing a fairly good job of identifying them, filing bugs,
> > and implementing workarounds until the bugs can be fixed.
> > 
> > Building with LLVM has also helped to find some real kernel bugs. I'd
> > be keen to get this working more widely.
> > 
> 
> Would you be willing to share your experiences and the nature of the bugs
> you were able to find using LLVM? Maybe that could be folded into this
> discussion as real-life experience.

http://llvm.linuxfoundation.org/index.php/Bugs

Probably horrifically out of date. Behan is the best person to ask for
a current status, I suspect...

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
                   ` (2 preceding siblings ...)
  2016-07-19 21:26 ` Josh Triplett
@ 2016-07-21 15:05 ` David Howells
  2016-07-21 23:33   ` Dmitry Torokhov
                     ` (5 more replies)
  2016-07-22 11:19 ` David Howells
  2016-08-12  4:42 ` Michael S. Tsirkin
  5 siblings, 6 replies; 82+ messages in thread
From: David Howells @ 2016-07-21 15:05 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

I know it's not precisely what you're asking about, but there are a number of
types I would like to see:

 (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
     long when you want to deploy a flags field with which you're going to use
     test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
     because you can't use bits 32-63 as they might not exist.

     Some arches, x86_64 and ppc64 for example, can do 32-bit atomic ops, so
     we could make the field smaller in some cases.

     We have a *lot* of flags fields in the kernel, so I wonder if we could
     actually save any space.

     I seem to remember that the argument is (or was) that the type must be
     the natural word size of the machine, but how true is that in actuality?

 (2) Differentiate non-BH spinlocks and BH spinlocks by type.

     It seems like you can't mix BH and non-BH ops on a spinlock without
     lockdep barking.  If that's the case, let's make this a compile-time
     check.

 (3) Let's use bool a lot more for boolean values as the compiler might be
     able to make better choices with it.
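
A rough userspace sketch of point (2), assuming the idea is simply to give BH and non-BH locks distinct wrapper types. All names here (raw_lock_t, spinlock_plain_t, spinlock_bh_t and the four helpers) are hypothetical and the "lock" only records state for demonstration; it is not a real spinlock. Because the two wrappers are different struct types, passing a spinlock_bh_t * to plain_lock() is an incompatible-pointer-type violation that the compiler diagnoses (and rejects with -Werror=incompatible-pointer-types) instead of leaving it to lockdep at run time.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the real lock; it only records state for the demo. */
typedef struct {
    bool locked;
    bool bh_disabled;
} raw_lock_t;

/* Hypothetical wrapper types: same layout, distinct to the compiler. */
typedef struct { raw_lock_t raw; } spinlock_plain_t;
typedef struct { raw_lock_t raw; } spinlock_bh_t;

/* Non-BH operations accept only the plain type. */
static inline void plain_lock(spinlock_plain_t *l)   { l->raw.locked = true; }
static inline void plain_unlock(spinlock_plain_t *l) { l->raw.locked = false; }

/* BH operations accept only the BH type. */
static inline void bh_lock(spinlock_bh_t *l)
{
    l->raw.bh_disabled = true;   /* models local_bh_disable() */
    l->raw.locked = true;
}
static inline void bh_unlock(spinlock_bh_t *l)
{
    l->raw.locked = false;
    l->raw.bh_disabled = false;  /* models local_bh_enable() */
}
```
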


And whilst we're at it, a function that I'd like to see:

 (1) on_list() in addition to list_empty() (and similar for other list types).

     I know this would be kind of redundant as it would be implemented exactly
     the same as list_empty() - but semantically you're asking a different
     question.
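
A minimal userspace sketch of the on_list() idea, under the assumption that it should return true when an entry is currently linked, i.e. the same self-pointer test as list_empty() but read from the entry's point of view (and negated). The list helpers are simplified copies in the style of the kernel's list.h; on_list() itself is hypothetical. It gives a meaningful answer only if entries are initialised with INIT_LIST_HEAD() and removed with list_del_init().

```c
#include <assert.h>
#include <stdbool.h>

struct list_head {
    struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* True if the list headed by h has no entries. */
static inline bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Insert entry right after head. */
static inline void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and leave it pointing at itself again. */
static inline void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    INIT_LIST_HEAD(entry);
}

/* Hypothetical: same check as list_empty(), but asking "is this entry
 * linked into some list?" rather than "is this list empty?". */
static inline bool on_list(const struct list_head *entry)
{
    return !list_empty(entry);
}
```
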

David

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21 14:02             ` David Woodhouse
@ 2016-07-21 16:21               ` Mark Brown
  2016-07-23  3:28                 ` Behan Webster
  0 siblings, 1 reply; 82+ messages in thread
From: Mark Brown @ 2016-07-21 16:21 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Behan Webster, ksummit-discuss, sumit.semwal

On Thu, Jul 21, 2016 at 03:02:12PM +0100, David Woodhouse wrote:
> On Thu, 2016-07-21 at 07:41 -0600, Shuah Khan wrote:

> > Would you be willing to share your experiences and the nature of the bugs
> > you were able to find using LLVM? Maybe that could be folded into this
> > discussion as real-life experience.

> http://llvm.linuxfoundation.org/index.php/Bugs

> Probably horrifically out of date. Behan is the best person to ask for
> a current status, I suspect...

CCing in Sumit who I believe has also been looking at some issues
building kernels with LLVM.

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21  9:54         ` David Woodhouse
  2016-07-21 13:41           ` Shuah Khan
@ 2016-07-21 18:38           ` Jiri Kosina
  2016-07-21 20:47             ` Paul Turner
  2016-07-26 11:22             ` David Woodhouse
  1 sibling, 2 replies; 82+ messages in thread
From: Jiri Kosina @ 2016-07-21 18:38 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, 21 Jul 2016, David Woodhouse wrote:

> Apart from resolutely not wanting to implement variable length arrays on 
> the stack, the LLVM folks actually seem quite keen to make things work. 
> I'm interested in the problem you report above.. and note the absence of 
> a bug number. Can you provide it?

I am currently on vacation and on a super-lousy internet connection, so
looking through my archives is a bit complicated ... I *think* it started 
in "[PATCH] usbhid: Fix lockdep unannotated irqs-off warning" thread on 
lkml.

In case you're not able to find it from there, I'll do my homework 
mid-next week when I am back properly online.

Thanks,

-- 
Jiri Kosina
SUSE Labs

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21 18:38           ` Jiri Kosina
@ 2016-07-21 20:47             ` Paul Turner
  2016-07-26 11:22             ` David Woodhouse
  1 sibling, 0 replies; 82+ messages in thread
From: Paul Turner @ 2016-07-21 20:47 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

We're considering trying to move our builds to Clang.  It's definitely
something we'd be interested in talking about, and depending on how the
initial attempts go we may have some anecdotal/early data.

On Thu, Jul 21, 2016 at 11:38 AM, Jiri Kosina <jikos@kernel.org> wrote:

> On Thu, 21 Jul 2016, David Woodhouse wrote:
>
> > Apart from resolutely not wanting to implement variable length arrays on
> > the stack, the LLVM folks actually seem quite keen to make things work.
> > I'm interested in the problem you report above.. and note the absence of
> > a bug number. Can you provide it?
>
> I am currently on vacation and on a super-lousy internet connection, so
> looking through my archives is a bit complicated ... I *think* it started
> in "[PATCH] usbhid: Fix lockdep unannotated irqs-off warning" thread on
> lkml.
>
> In case you're not able to find it from there, I'll do my homework
> mid-next week when I am back properly online.
>
> Thanks,
>
> --
> Jiri Kosina
> SUSE Labs
>

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
@ 2016-07-21 23:33   ` Dmitry Torokhov
  2016-07-22  6:00   ` Hannes Reinecke
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 82+ messages in thread
From: Dmitry Torokhov @ 2016-07-21 23:33 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 04:05:25PM +0100, David Howells wrote:
> I know it's not precisely what you're asking about, but there are a number of
> types I would like to see:
> 
>  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
>      long when you want to deploy a flags field with which you're going to use
>      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
>      because you can't use bits 32-63 as they might not exist.

What is wrong with using DECLARE_BITMAP()? It will allocate exactly as
many unsigned longs as needed (so for a bitmap of 33 to 64 bits it will be
1 on 64-bit and 2 on 32-bit arches).
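
The arithmetic behind that can be sketched in userspace as follows. BITS_TO_WORDS() is a hypothetical helper taking the word size as a parameter so both the 32-bit and 64-bit cases can be computed on one machine; the DECLARE_BITMAP() below is a simplified stand-in with the same shape as the kernel macro, not a copy of it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of word_bits-wide words needed to hold nbits flags (rounds up). */
#define BITS_TO_WORDS(nbits, word_bits) \
    (((nbits) + (word_bits) - 1) / (word_bits))

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Simplified stand-in: an array of unsigned long big enough for `bits`. */
#define DECLARE_BITMAP(name, bits) \
    unsigned long name[BITS_TO_WORDS(bits, BITS_PER_LONG)]
```

For 40 flags this gives one word at 64 bits per word and two at 32. For 8 flags it still gives one whole unsigned long, which is the overhead the original proposal is about.
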

Thanks.

-- 
Dmitry

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
  2016-07-21 23:33   ` Dmitry Torokhov
@ 2016-07-22  6:00   ` Hannes Reinecke
  2016-07-22  6:14     ` Julia Lawall
  2016-07-22  7:03   ` David Howells
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 82+ messages in thread
From: Hannes Reinecke @ 2016-07-22  6:00 UTC (permalink / raw)
  To: ksummit-discuss

On 07/21/2016 05:05 PM, David Howells wrote:
> I know it's not precisely what you're asking about, but there are a number of
> types I would like to see:
> 
>  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
>      long when you want to deploy a flags field with which you're going to use
>      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
>      because you can't use bits 32-63 as they might not exist.
> 
>      Some arches, x86_64 and ppc64 for example, can do 32-bit atomic ops, so
>      we could make the field smaller in some cases.
> 
>      We have a *lot* of flags fields in the kernel, so I wonder if we could
>      actually save any space.
> 
>      I seem to remember that the argument is (or was) that the type must be
>      the natural word size of the machine, but how true is that in actuality?
> 
>  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
> 
>      It seems like you can't mix BH and non-BH ops on a spinlock without
>      lockdep barking.  If that's the case, let's make this a compile-time
>      check.
> 
>  (3) Let's use bool a lot more for boolean values as the compiler might be
>      able to make better choices with it.
> 
> 
> And whilst we're at it, a function that I'd like to see:
> 
>  (1) on_list() in addition to list_empty() (and similar for other list types).
> 
>      I know this would be kind of redundant as it would be implemented exactly
>      the same as list_empty() - but semantically you're asking a different
>      question.
> 
Actually that's one thing on my long-term to-do list: stricter range
checking for kernel functions.
enums are an obvious way to address that, but there is a broad range of
functions which return an errno value without ever exhausting the entire
errno range.
Which makes it really hard for the calling function to figure out which
errno values are legit and which can be considered an error.

If we had a programmatic way of addressing that we could
a) validate the usage of those functions
and
b) easily do error injection with systemtap as we would know which
values the systemtap function should return
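
One possible shape for such a programmatic description, as a userspace sketch only: each function gets a table of the errno values it may return, which a checker or an error-injection tool could consult. struct errno_set, errno_set_contains(), foo_open_errors and foo_open_ret_valid() are all hypothetical names, and foo_open() is an imaginary function; nothing like this exists in the kernel as far as this thread states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor listing every errno a function may return. */
struct errno_set {
    const int *codes;
    size_t count;
};

static bool errno_set_contains(const struct errno_set *set, int err)
{
    for (size_t i = 0; i < set->count; i++)
        if (set->codes[i] == err)
            return true;
    return false;
}

/* Declared error set for an imaginary foo_open(). */
static const int foo_open_codes[] = { -ENOENT, -EACCES, -ENOMEM };
static const struct errno_set foo_open_errors = {
    .codes = foo_open_codes,
    .count = sizeof(foo_open_codes) / sizeof(foo_open_codes[0]),
};

/* Validation: 0 is success, anything else must be in the declared set. */
static bool foo_open_ret_valid(int ret)
{
    return ret == 0 || errno_set_contains(&foo_open_errors, ret);
}
```

An injection tool would pick values from foo_open_errors.codes, and a validator would flag any return value for which foo_open_ret_valid() is false.
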

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  6:00   ` Hannes Reinecke
@ 2016-07-22  6:14     ` Julia Lawall
  2016-07-22 13:57       ` Hannes Reinecke
  2016-08-04  7:15       ` NeilBrown
  0 siblings, 2 replies; 82+ messages in thread
From: Julia Lawall @ 2016-07-22  6:14 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Fri, 22 Jul 2016, Hannes Reinecke wrote:

> On 07/21/2016 05:05 PM, David Howells wrote:
> > I know it's not precisely what you're asking about, but there are a number of
> > types I would like to see:
> >
> >  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
> >      long when you want to deploy a flags field with which you're going to use
> >      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
> >      because you can't use bits 32-63 as they might not exist.
> >
> >      Some arches, x86_64 and ppc64 for example, can do 32-bit atomic ops, so
> >      we could make the field smaller in some cases.
> >
> >      We have a *lot* of flags fields in the kernel, so I wonder if we could
> >      actually save any space.
> >
> >      I seem to remember that the argument is (or was) that the type must be
> >      the natural word size of the machine, but how true is that in actuality?
> >
> >  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
> >
> >      It seems like you can't mix BH and non-BH ops on a spinlock without
> >      lockdep barking.  If that's the case, let's make this a compile-time
> >      check.
> >
> >  (3) Let's use bool a lot more for boolean values as the compiler might be
> >      able to make better choices with it.
> >
> >
> > And whilst we're at it, a function that I'd like to see:
> >
> >  (1) on_list() in addition to list_empty() (and similar for other list types).
> >
> >      I know this would be kind of redundant as it would be implemented exactly
> >      the same as list_empty() - but semantically you're asking a different
> >      question.
>
> Actually that's one thing on my long-term to-do list: stricter range
> checking for kernel functions.
> enums are an obvious way to address that,

In C, enums are ints.  Is there a GCC option that checks them?  For
example, the following program compiles fine with -Wall:

enum one {ONE=1, TWO=2, THREE=3};
enum two {ONEX=7, TWOX=8, THREEX=9};

int f (int x) {
  enum one o = ONE;
  enum two t = THREEX;
  if (x) o = t; else t = o;
  return 0;
}
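
For comparison, a common workaround is to wrap each enum in a single-member struct, since C does refuse assignment between different struct types. The sketch below reuses the constants from the example above; the wrapper types one_t and two_t and the helpers make_one() and make_two() are hypothetical names. It is only a partial fix: enumeration constants are still plain ints, so make_one(THREEX) would be accepted.

```c
#include <assert.h>

enum one_raw { ONE = 1, TWO = 2, THREE = 3 };
enum two_raw { ONEX = 7, TWOX = 8, THREEX = 9 };

/* Distinct struct types: assigning one to the other is a hard error. */
typedef struct { enum one_raw v; } one_t;
typedef struct { enum two_raw v; } two_t;

static inline one_t make_one(enum one_raw v) { one_t o = { v }; return o; }
static inline two_t make_two(enum two_raw v) { two_t t = { v }; return t; }

static int f(int x)
{
    one_t o = make_one(ONE);
    two_t t = make_two(THREEX);

    (void)x;
    /* "o = t;" or "t = o;" here would fail to compile:
     * the two struct types are incompatible. */
    return o.v + t.v;
}
```
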

> but there is a broad range of
> functions which return an errno value without ever exhausting the entire
> errno range.

I guess that almost all functions return only a few possible error codes?
But only a few call sites check the values either.

In terms of collecting this information, one would quickly run into
function pointers, but that would probably not be an insurmountable
obstacle.

julia

> Which makes it really hard for the calling function to figure out which
> errno values are legit and which can be considered an error.
>
> If we had a programmatic way of addressing that we could
> a) validate the usage of those functions
> and
> b) easily do error injection with systemtap as we would know which
> values the systemtap function should return
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke		               zSeries & Storage
> hare@suse.com			               +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
  2016-07-21 23:33   ` Dmitry Torokhov
  2016-07-22  6:00   ` Hannes Reinecke
@ 2016-07-22  7:03   ` David Howells
  2016-07-22 10:10     ` Alexey Dobriyan
                       ` (2 more replies)
  2016-07-28  3:40   ` Steven Rostedt
                     ` (2 subsequent siblings)
  5 siblings, 3 replies; 82+ messages in thread
From: David Howells @ 2016-07-22  7:03 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: ksummit-discuss

Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

> >  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
> >      long when you want to deploy a flags field with which you're going to use
> >      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
> >      because you can't use bits 32-63 as they might not exist.
> 
> What is wrong with using DECLARE_BITMAP()? It will allocate exactly as
> many unsigned longs

You missed my point.  *unsigned long* is the issue.  The majority of the time
that wastes 32 bits on a 64-bit machine - especially when we don't need that
many flags.  On some 64-bit arches we could use unsigned int instead.
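
To make the size argument concrete, here is a minimal sketch.  The two
structs and the helper are hypothetical and not taken from the kernel; they
only compare a flags word declared as unsigned long with one declared as
unsigned int.

```c
#include <stddef.h>

/* Hypothetical object: a few flags stored next to a 32-bit counter. */
struct obj_long {
	unsigned long flags;	/* the type test_bit() and co. want today */
	unsigned int  count;
};

struct obj_int {
	unsigned int flags;	/* room for up to 32 flags */
	unsigned int count;
};

/* Bytes saved per object by using the narrower flags word. */
size_t flags_saving(void)
{
	return sizeof(struct obj_long) - sizeof(struct obj_int);
}
```

On an LP64 target such as x86_64 the saving is typically 8 bytes per object
(4 for the flags word plus 4 of padding); on a 32-bit target it is zero.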

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  7:03   ` David Howells
@ 2016-07-22 10:10     ` Alexey Dobriyan
  2016-07-22 10:13     ` David Howells
  2016-07-22 18:19     ` Dmitry Torokhov
  2 siblings, 0 replies; 82+ messages in thread
From: Alexey Dobriyan @ 2016-07-22 10:10 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 10:03 AM, David Howells <dhowells@redhat.com> wrote:
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
>> >  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
>> >      long when you want to deploy a flags field with which you're going to use
>> >      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
>> >      because you can't use bits 32-63 as they might not exist.
>>
>> What is wrong with using DECLARE_BITMAP()? It will allocate exactly as
>> many unsigned longs
>
> You missed my point.  *unsigned long* is the issue.  The majority of the time
> that wastes 32 bits on a 64-bit machine - especially when we don't need that
> many flags.  On some 64-bit arches we could use unsigned int instead.

Indeed, but please call them flags_t and flags64_t.
test_bit et al. can dispatch with __builtin_choose_expr or even _Generic.
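
A rough sketch of the _Generic variant, as user-space C11.  The names
flags_test_bit, test_bit32 and test_bit64 are made up for illustration, the
typedefs are assumed to map to fixed-width integers, and the helpers are
plain (non-atomic) reads:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t flags_t;	/* hypothetical 32-bit flags word */
typedef uint64_t flags64_t;	/* hypothetical 64-bit flags word */

bool test_bit32(unsigned int nr, const flags_t *addr)
{
	return (*addr >> (nr & 31)) & 1;
}

bool test_bit64(unsigned int nr, const flags64_t *addr)
{
	return (*addr >> (nr & 63)) & 1;
}

/* Select the helper from the pointer type at compile time. */
#define flags_test_bit(nr, addr) _Generic((addr),		\
	flags_t *:         test_bit32,				\
	const flags_t *:   test_bit32,				\
	flags64_t *:       test_bit64,				\
	const flags64_t *: test_bit64)((nr), (addr))
```

Passing a pointer of any other type fails to compile, which is the point of
having distinct types in the first place.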

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  7:03   ` David Howells
  2016-07-22 10:10     ` Alexey Dobriyan
@ 2016-07-22 10:13     ` David Howells
  2016-07-22 10:22       ` Alexey Dobriyan
  2016-07-22 18:19     ` Dmitry Torokhov
  2 siblings, 1 reply; 82+ messages in thread
From: David Howells @ 2016-07-22 10:13 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: ksummit-discuss

Alexey Dobriyan <adobriyan@gmail.com> wrote:

> Indeed, but please call them flags_t and flags64_t.

Fine by me, though I was trying to avoid confusion with the argument to IRQ
munging functions.

> test_bit et al. can dispatch with __builtin_choose_expr or even _Generic.

Or by __atomic_fetch_{and,or,xor}().
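
For illustration only, a sketch of what that could look like on a 32-bit
flags word, using the GCC/clang __atomic builtins.  The function names are
hypothetical and the memory ordering is simply the strongest one, not a
considered choice:

```c
#include <stdbool.h>

/* Atomically set bit nr (0..31) and return its previous value. */
bool test_and_set_bit32(unsigned int nr, unsigned int *addr)
{
	unsigned int mask = 1u << (nr & 31);

	return __atomic_fetch_or(addr, mask, __ATOMIC_SEQ_CST) & mask;
}

/* Atomically clear bit nr (0..31) and return its previous value. */
bool test_and_clear_bit32(unsigned int nr, unsigned int *addr)
{
	unsigned int mask = 1u << (nr & 31);

	return __atomic_fetch_and(addr, ~mask, __ATOMIC_SEQ_CST) & mask;
}
```

The builtins are type-generic over integer widths, so the same body would
work for a 64-bit word with only the mask type changed.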

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 10:13     ` David Howells
@ 2016-07-22 10:22       ` Alexey Dobriyan
  2016-07-22 10:53         ` Vlastimil Babka
  2016-07-22 11:05         ` David Howells
  0 siblings, 2 replies; 82+ messages in thread
From: Alexey Dobriyan @ 2016-07-22 10:22 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 1:13 PM, David Howells <dhowells@redhat.com> wrote:
> Alexey Dobriyan <adobriyan@gmail.com> wrote:
>
>> Indeed, but please call them flags_t and flags64_t.
>
> Fine by me, though I was trying to avoid confusion with the argument to IRQ
> munging functions.

I can resurrect irq_flags_t, but there is no procedure for merging 1MB+ diffs
without spending one's whole life splitting them and emailing maintainers.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 10:22       ` Alexey Dobriyan
@ 2016-07-22 10:53         ` Vlastimil Babka
  2016-07-22 11:05         ` David Howells
  1 sibling, 0 replies; 82+ messages in thread
From: Vlastimil Babka @ 2016-07-22 10:53 UTC (permalink / raw)
  To: Alexey Dobriyan, David Howells; +Cc: ksummit-discuss

On 07/22/2016 12:22 PM, Alexey Dobriyan wrote:
> On Fri, Jul 22, 2016 at 1:13 PM, David Howells <dhowells@redhat.com> wrote:
>> Alexey Dobriyan <adobriyan@gmail.com> wrote:
>>
>>> Indeed, but please call them flags_t and flags64_t.
>>
>> Fine by me, though I was trying to avoid confusion with the argument to IRQ
>> munging functions.
>
> I can resurrect irq_flags_t but there is no procedure for merging 1MB+ diffs
> without spending whole life splitting and emailing maintainers.

I think the procedure is that you have a script (e.g. coccinelle) to
make such a patch and Linus applies it himself just before/after rc1.


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 10:22       ` Alexey Dobriyan
  2016-07-22 10:53         ` Vlastimil Babka
@ 2016-07-22 11:05         ` David Howells
  2016-07-22 17:18           ` Julia Lawall
  1 sibling, 1 reply; 82+ messages in thread
From: David Howells @ 2016-07-22 11:05 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: ksummit-discuss

Vlastimil Babka <vbabka@suse.cz> wrote:

> > I can resurrect irq_flags_t but there is no procedure for merging 1MB+ diffs
> > without spending whole life splitting and emailing maintainers.
> 
> I think the procedure is that you have a script (e.g. coccinelle) to make such
> patch and Linus applies it himself just before/after rc1.

Can coccinelle manage changing the local variable declaration, especially if
someone does:

	unsigned long flags, fred;

where fred is unrelated.

But, on the other hand, it should be possible to compile test very easily if
irq_flags_t is a typedef'd struct.
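
A sketch of why the struct makes compile testing easy.  Everything here is a
stand-in: irq_flags_t as shown and the two sketch_* helpers are assumptions
for illustration, not the real local_irq_save()/restore() interfaces.

```c
/* Hypothetical wrapper type; a struct is not assignable to an integer. */
typedef struct {
	unsigned long val;
} irq_flags_t;

irq_flags_t sketch_irq_save(void)
{
	irq_flags_t flags = { .val = 1 };	/* pretend saved state */

	return flags;
}

unsigned long sketch_irq_restore(irq_flags_t flags)
{
	return flags.val;
}

/*
 * An unconverted caller stops compiling, so missed conversions show up
 * in a build rather than at run time:
 *
 *	unsigned long flags, fred;
 *	flags = sketch_irq_save();	// error: incompatible types
 */
```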

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
                   ` (3 preceding siblings ...)
  2016-07-21 15:05 ` David Howells
@ 2016-07-22 11:19 ` David Howells
  2016-07-22 12:44   ` Linus Walleij
  2016-07-22 13:26   ` David Howells
  2016-08-12  4:42 ` Michael S. Tsirkin
  5 siblings, 2 replies; 82+ messages in thread
From: David Howells @ 2016-07-22 11:19 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

Mark Brown <broonie@kernel.org> wrote:

> There's a push from certain quarters to move away from GCC to LLVM.  Not
> that these things are unachievable in LLVM but it's a thing.

I currently maintain a set of 26 cross-compilers (ppc64 is a box of symlinks)
for building the kernel and bootloaders with gcc on Fedora:

	gcc-aarch64-linux-gnu.x86_64                 5.3.1-2.fc23   updates
	gcc-alpha-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-arm-linux-gnu.x86_64                     5.3.1-2.fc23   updates
	gcc-avr32-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-bfin-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-c6x-linux-gnu.x86_64                     5.3.1-2.fc23   updates
	gcc-cris-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-frv-linux-gnu.x86_64                     5.3.1-2.fc23   updates
	gcc-h8300-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-hppa-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-hppa64-linux-gnu.x86_64                  5.3.1-2.fc23   updates
	gcc-ia64-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-m32r-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-m68k-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-microblaze-linux-gnu.x86_64              5.3.1-2.fc23   updates
	gcc-mips64-linux-gnu.x86_64                  5.3.1-2.fc23   updates
	gcc-mn10300-linux-gnu.x86_64                 5.3.1-2.fc23   updates
	gcc-nios2-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-powerpc64-linux-gnu.x86_64               5.3.1-2.fc23   updates
	gcc-ppc64-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-s390x-linux-gnu.x86_64                   5.3.1-2.fc23   updates
	gcc-sh-linux-gnu.x86_64                      5.3.1-2.fc23   updates
	gcc-sh64-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-sparc64-linux-gnu.x86_64                 5.3.1-2.fc23   updates
	gcc-tile-linux-gnu.x86_64                    5.3.1-2.fc23   updates
	gcc-x86_64-linux-gnu.x86_64                  5.3.1-2.fc23   updates
	gcc-xtensa-linux-gnu.x86_64                  5.3.1-2.fc23   updates

These are gcc-6.1.1-based in fc24.  Can LLVM cover all of these, plus the
missing arches that gcc doesn't support yet?

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 11:19 ` David Howells
@ 2016-07-22 12:44   ` Linus Walleij
  2016-07-22 13:26   ` David Howells
  1 sibling, 0 replies; 82+ messages in thread
From: Linus Walleij @ 2016-07-22 12:44 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 1:19 PM, David Howells <dhowells@redhat.com> wrote:

I'm just gonna hijack this to ask a question...

> I currently maintain a set of 26 cross-compilers (ppc64 is a box of symlinks)
> for building the kernel and bootloaders with gcc on Fedora:
(...)
>         gcc-arm-linux-gnu.x86_64                     5.3.1-2.fc23   updates

I've tested this thing and it works flawlessly with the kernel.
Also, U-Boot recently started supplying its own div64 routine and dropped
the dependency on libgcc.a (IIRC).

So it works, but still of course this thing is there:

rpm -ql gcc-arm-linux-gnu-6.1.1-1.fc24.x86_64
(...)
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/libgcc.a

Also this:

/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtbegin.o
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtbeginS.o
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtbeginT.o
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtend.o
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtendS.o
/usr/lib/gcc/arm-linux-gnueabi/6.1.1/crtfastmath.o

I suspect these things are implicitly compiled for one and only one ISA?
I honestly don't know how to even check which one. Does it use
hardfloat? Or has this problem of targeting several ISAs with a .a
file been fixed since I looked at it last?

I need to cross-compile ARMv4, ARMv5, ARMv6, ARMv7 and
ARMv8 for my systems. (Sorry for all the old crap I'm keeping; it's just
my job...)

That ABI problem is then multiplied if I want to do anything in
userspace, so I keep a whole range of tailored cross compilers and
prebuilt C libraries around that I know "just work", but I guess they will
grow increasingly hard to maintain, and it feels pretty inelegant at
times.

Is there a way for a distribution to provide a proper set of
ISA-specific stuff alongside a crosscompiler, if the crosscompiler
supports several different ISAs, like the ARMvN variants do?

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 11:19 ` David Howells
  2016-07-22 12:44   ` Linus Walleij
@ 2016-07-22 13:26   ` David Howells
  1 sibling, 0 replies; 82+ messages in thread
From: David Howells @ 2016-07-22 13:26 UTC (permalink / raw)
  To: Linus Walleij; +Cc: ksummit-discuss

Linus Walleij <linus.walleij@linaro.org> wrote:

> So it works, but still of course this thing is there:
> 
> rpm -ql gcc-arm-linux-gnu-6.1.1-1.fc24.x86_64
> (...)
> /usr/lib/gcc/arm-linux-gnueabi/6.1.1/libgcc.a

Some people really wanted libgcc libraries.  Even some kernel arches require
them.

> I suspect these things are implicitly compiled for one and only one ISA?
> I honestly don't know how to even check which one. Does it use
> hardfloat? Or has this problem of targeting several ISAs with a .a
> file been fixed since I looked at it last?
> 
> I need to cross-compile ARMv4, ARMv5, ARMv6, ARMv7 and
> ARMv8 for my systems. (Sorry for all old crap I'm keeping it's just
> my job...)

The arm- compiler is built with:

	CONFIG_FLAGS="--with-tune=cortex-a8 --with-arch=armv7-a \
		--with-float=hard --with-fpu=vfpv3-d16 --with-abi=aapcs-linux"

I can probably add a --with-multilib-list= option to build multiple variants
as I do for sh and sh64:

	sh-*)
	    CONFIG_FLAGS=--with-multilib-list=m1,m2,m2e,m2a,m2a-single,m4,m4-single,m4-single-only,m4-nofpu
	    ;;
	sh4-*)
	    CONFIG_FLAGS=--with-multilib-list=m4,m4-single,m4-single-only,m4-nofpu
	    ;;

if someone tells me what they'd like to see on there.  In the sh case, this
gives me all of:

	/usr/lib/gcc/sh-linux-gnu/5.3.1/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m2/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m2e/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m4-nofpu/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m4-single-only/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m4-single/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/m4/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m2/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m2a-single/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m2a/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m2e/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m4-nofpu/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m4-single-only/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m4-single/libgcc.a
	/usr/lib/gcc/sh-linux-gnu/5.3.1/mb/m4/libgcc.a

> That ABI problem is then manifolded if I would want to do anything
> userspace, so I keep a whole range of tailored cross compilers and
> prebuilt C libraries around that I know "just work", but I guess will
> grow increasingly hard to maintain and it feels pretty inelegant at
> times.

Yeah.  It's made more fun by the fact that if I want to do a general compiler
SRPM that covers all the arches fully bootstrapped with C libraries, not every
arch is supported by glibc, so I have to include uClibc as well for those
arches.

Take sh as mentioned above, that has at least 16 different potential
userspaces by the libgcc count.

> Is there a way for a distribution to provide a proper set of
> ISA-specific stuff alongside a crosscompiler, if the crosscompiler
> supports several different ISAs, like the ARMvN variants do?

It *ought* to be possible to build C libraries separately, given an available
cross-compiler - except that the gcc build wants stuff from the C library
you're using.  Having talked to some gcc people about this before, IIRC, it's
something that could be managed without - if someone's willing to do the work
to make gcc no longer require it.

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  6:14     ` Julia Lawall
@ 2016-07-22 13:57       ` Hannes Reinecke
  2016-07-22 14:40         ` Julia Lawall
                           ` (3 more replies)
  2016-08-04  7:15       ` NeilBrown
  1 sibling, 4 replies; 82+ messages in thread
From: Hannes Reinecke @ 2016-07-22 13:57 UTC (permalink / raw)
  To: Julia Lawall; +Cc: ksummit-discuss

On 07/22/2016 08:14 AM, Julia Lawall wrote:
> 
> 
> On Fri, 22 Jul 2016, Hannes Reinecke wrote:
> 
>> On 07/21/2016 05:05 PM, David Howells wrote:
>>> I know it's not precisely what you're asking about, but there are a number of
>>> types I would like to see:
>>>
>>>  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
>>>      long when you want to deploy a flags field with which you're going to use
>>>      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
>>>      because you can't use bits 32-63 as they might not exist.
>>>
>>>      Some arches, x86_64 and ppc64 for example, can do 32-bit atomic ops, so
>>>      we could make the field smaller in some cases.
>>>
>>>      We have a *lot* of flags fields in the kernel, so I wonder if we could
>>>      actually save any space.
>>>
>>>      I seem to remember that the argument is (or was) that the type must be
>>>      the natural word size of the machine, but how true is that in actuality?
>>>
>>>  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
>>>
>>>      It seems like you can't mix BH and non-BH ops on a spinlock without
>>>      lockdep barking.  If that's the case, let's make this a compile-time
>>>      check.
>>>
>>>  (3) Let's use bool a lot more for boolean values as the compiler might be
>>>      able to make better choices with it.
>>>
>>>
>>> And whilst we're at it, a function that I'd like to see:
>>>
>>>  (1) on_list() in addition to list_empty() (and similar for other list types).
>>>
>>>      I know this would be kind of redundant as is would be implemented exactly
>>>      the same as list_empty() - but semantically you're asking a different
>>>      question.
>>
>> Actually that's one thing on my long-term to-do list: stricter range
>> checking for kernel functions.
>> enums are an obvious way to address that,
> 
> In C, enums are ints.  Is there a Gcc option that checks them?  For
> example, the following program compiles fine with -Wall:
> 
> enum one {ONE=1, TWO=2, THREE=3};
> enum two {ONEX=7, TWOX=8, THREEX=9};
> 
> int f (int x) {
>   enum one o = ONE;
>   enum two t = THREEX;
>   if (x) o = t; else t = o;
>   return 0;
> }
> 
From what I know, GCC only checks enums if they are used in a switch()
statement. Other types are not checked.
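
As a small illustration of the switch() case, reusing the enum from the
program above (the function name_of is made up for this sketch): with -Wall,
GCC's -Wswitch warns if one of the enumerators has no case label and there
is no default.

```c
enum one { ONE = 1, TWO = 2, THREE = 3 };

const char *name_of(enum one o)
{
	/* Deleting any case below draws a -Wswitch warning under -Wall. */
	switch (o) {
	case ONE:
		return "one";
	case TWO:
		return "two";
	case THREE:
		return "three";
	}
	/* Still reachable: nothing stops a caller passing another int. */
	return "unknown";
}
```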

>> but there is a broad range of
>> functions which return an errno value without ever exhausting the entire
>> errno range.
> 
> I guess that almost all functions return only a few possible error codes?

Precisely. If we had a way of specifying "the return value is an errno
with the possible values '0', '-EIO', and '-EINVAL'" that would be
_so_ cool.
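
No such annotation exists today, so the following is only a sketch of the
information it would have to carry, written as ordinary C data.  struct
errno_set, frob_errnos and errno_set_contains() are all hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: the documented return values of one function. */
struct errno_set {
	const int *vals;
	size_t     n;
};

static const int frob_vals[] = { 0, -EIO, -EINVAL };
const struct errno_set frob_errnos = { frob_vals, 3 };

/* True if ret is one of the values the function claims it can return. */
bool errno_set_contains(const struct errno_set *set, int ret)
{
	for (size_t i = 0; i < set->n; i++)
		if (set->vals[i] == ret)
			return true;
	return false;
}
```

A checker or an error-injection tool could consume such a table; a compiler
attribute would presumably express the same set in the function declaration.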

> But only a few call sites check the values either.
> 
I know. But this is primarily because the caller itself isn't quite sure
which values to expect.
ATM one has to rely on the function documentation, if it exists.
And even that might be outdated.

> In terms of collecting this information, one would quickly run into
> function pointers, but that would probably not be an insurmountable
> obstacle.
> 
But then even the functions pointed to could / should use this syntax,
so we would have a set of all possible return codes.
The important bit here is that we would be able to check
programmatically which values can be returned.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 13:57       ` Hannes Reinecke
@ 2016-07-22 14:40         ` Julia Lawall
  2016-07-22 19:12         ` Arnd Bergmann
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 82+ messages in thread
From: Julia Lawall @ 2016-07-22 14:40 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss



On Fri, 22 Jul 2016, Hannes Reinecke wrote:

> On 07/22/2016 08:14 AM, Julia Lawall wrote:
> >
> >
> > On Fri, 22 Jul 2016, Hannes Reinecke wrote:
> >
> >> On 07/21/2016 05:05 PM, David Howells wrote:
> >>> I know it's not precisely what you're asking about, but there are a number of
> >>> types I would like to see:
> >>>
> >>>  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
> >>>      long when you want to deploy a flags field with which you're going to use
> >>>      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
> >>>      because you can't use bits 32-63 as they might not exist.
> >>>
> >>>      Some arches, x86_64 and ppc64 for example, can do 32-bit atomic ops, so
> >>>      we could make the field smaller in some cases.
> >>>
> >>>      We have a *lot* of flags fields in the kernel, so I wonder if we could
> >>>      actually save any space.
> >>>
> >>>      I seem to remember that the argument is (or was) that the type must be
> >>>      the natural word size of the machine, but how true is that in actuality?
> >>>
> >>>  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
> >>>
> >>>      It seems like you can't mix BH and non-BH ops on a spinlock without
> >>>      lockdep barking.  If that's the case, let's make this a compile-time
> >>>      check.
> >>>
> >>>  (3) Let's use bool a lot more for boolean values as the compiler might be
> >>>      able to make better choices with it.
> >>>
> >>>
> >>> And whilst we're at it, a function that I'd like to see:
> >>>
> >>>  (1) on_list() in addition to list_empty() (and similar for other list types).
> >>>
> >>>      I know this would be kind of redundant as is would be implemented exactly
> >>>      the same as list_empty() - but semantically you're asking a different
> >>>      question.
> >>
> >> Actually that's one thing on my long-term to-do list: stricter range
> >> checking for kernel functions.
> >> enums are an obvious way to address that,
> >
> > In C, enums are ints.  Is there a Gcc option that checks them?  For
> > example, the following program compiles fine with -Wall:
> >
> > enum one {ONE=1, TWO=2, THREE=3};
> > enum two {ONEX=7, TWOX=8, THREEX=9};
> >
> > int f (int x) {
> >   enum one o = ONE;
> >   enum two t = THREEX;
> >   if (x) o = t; else t = o;
> >   return 0;
> > }
> >
> From what I know GCC only checks enums if they are used in a switch()
> statement. Other types are not checked.
>
> >> but there is a broad range of
> >> functions which return an errno value without ever exhausting the entire
> >> errno range.
> >
> > I guess that almost all functions return only a few possible error codes?
>
> Precisely. If we had a way of specifying "the return value is an errno
> with the possible values '0', '-EIO', and '-EINVAL'" that would be
> _so_ cool.
>
> > But only a few call sites check the values either.
> >
> I know. But this is primarily because the caller itself isn't quite sure
> which values to expect.
> ATM one has to rely on the function documentation if existing.
> And even that might be outdated.
>
> > In terms of collecting this information, one would quickly run into
> > function pointers, but that would probably not be an insurmountable
> > obstacle.
> >
> But then even the functions pointed to could / should use this syntax,
> so we would have a set of all possible return codes.
> The important bit here is that we would be able to check
> programmatically which values can be returned.

Thanks for the clarifications.  Perhaps it could be possible.

julia

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 11:05         ` David Howells
@ 2016-07-22 17:18           ` Julia Lawall
  0 siblings, 0 replies; 82+ messages in thread
From: Julia Lawall @ 2016-07-22 17:18 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss, Vlastimil Babka



On Fri, 22 Jul 2016, David Howells wrote:

> Vlastimil Babka <vbabka@suse.cz> wrote:
>
> > > I can resurrect irq_flags_t but there is no procedure for merging 1MB+ diffs
> > > without spending whole life splitting and emailing maintainers.
> >
> > I think the procedure is that you have a script (e.g. coccinelle) to make such
> > patch and Linus applies it himself just before/after rc1.
>
> Can coccinelle manage changing the local variable declaration, especially if
> someone does:
>
> 	unsigned long flags, fred;
>
> where fred is unrelated.

Recent versions of Coccinelle should allow this.  Older versions would
crash, so you would at least know that there was a problem.

julia

>
> But, on the other hand, it should be possible to compile test very easily if
> irq_flags_t is a typedef'd struct.
>
> David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  7:03   ` David Howells
  2016-07-22 10:10     ` Alexey Dobriyan
  2016-07-22 10:13     ` David Howells
@ 2016-07-22 18:19     ` Dmitry Torokhov
  2016-07-22 19:43       ` Guenter Roeck
  2 siblings, 1 reply; 82+ messages in thread
From: Dmitry Torokhov @ 2016-07-22 18:19 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 08:03:06AM +0100, David Howells wrote:
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
> 
> > >  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
> > >      long when you want to deploy a flags field with which you're going to use
> > >      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
> > >      because you can't use bits 32-63 as they might not exist.
> > 
> > What is wrong with using DECLARE_BITMAP()? It will allocate exactly as
> > many unsigned longs
> 
> You missed my point.  *unsigned long* is the issue.  The majority of the time
> that wastes 32 bits on a 64-bit machine - especially when we don't need that
> many flags.  On some 64-bit arches we could use unsigned int instead.

I was responding to your statement where you were saying you could not
use bits 32 - 63 as they might not exist. If you have so many bits then
DECLARE_BITMAP is useful. Otherwise we can simply use u{8|16|32} and
BIT() macro.
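
A minimal user-space sketch of that pattern.  BIT() is redefined here as a
stand-in for the kernel macro, and the flag names and dev_* helpers are
hypothetical.  Note these are plain read-modify-write operations, not the
atomic set_bit()/test_bit() family.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(nr) (1UL << (nr))	/* user-space stand-in for the kernel macro */

#define DEV_READY BIT(0)	/* hypothetical flag names */
#define DEV_BUSY  BIT(1)

struct dev_state {
	uint32_t flags;		/* only as wide as the flags need */
};

void dev_set(struct dev_state *d, uint32_t flag)
{
	d->flags |= flag;
}

void dev_clear(struct dev_state *d, uint32_t flag)
{
	d->flags &= ~flag;
}

bool dev_test(const struct dev_state *d, uint32_t flag)
{
	return d->flags & flag;
}
```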

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 13:57       ` Hannes Reinecke
  2016-07-22 14:40         ` Julia Lawall
@ 2016-07-22 19:12         ` Arnd Bergmann
  2016-07-26 11:48         ` David Woodhouse
  2016-08-11 15:44         ` Dan Carpenter
  3 siblings, 0 replies; 82+ messages in thread
From: Arnd Bergmann @ 2016-07-22 19:12 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: Dan Carpenter

On Friday, July 22, 2016 3:57:40 PM CEST Hannes Reinecke wrote:
> On 07/22/2016 08:14 AM, Julia Lawall wrote:
> > On Fri, 22 Jul 2016, Hannes Reinecke wrote:
> >>
> >> Actually that's one thing on my long-term to-do list: stricter range
> >> checking for kernel functions.
> >> enums are an obvious way to address that,

I think smatch has some useful checks in that area, and using a gcc
plugin to have a similar implementation could help do the checks
by default (if there are not too many false positives). 

> > In C, enums are ints.  Is there a Gcc option that checks them?  For
> > example, the following program compiles fine with -Wall:
> > 
> > enum one {ONE=1, TWO=2, THREE=3};
> > enum two {ONEX=7, TWOX=8, THREEX=9};
> > 
> > int f (int x) {
> >   enum one o = ONE;
> >   enum two t = THREEX;
> >   if (x) o = t; else t = o;
> >   return 0;
> > }
> > 
> From what I know GCC only checks enums if they are used in a switch()
> statement. Other types are not checked.

I think a common bug is having a function that takes an enum argument
type but that is called with integer argument, or a different
enum. This might even be something that could be added to a gcc
version by default.

	Arnd

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 18:19     ` Dmitry Torokhov
@ 2016-07-22 19:43       ` Guenter Roeck
  0 siblings, 0 replies; 82+ messages in thread
From: Guenter Roeck @ 2016-07-22 19:43 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 11:19:46AM -0700, Dmitry Torokhov wrote:
> On Fri, Jul 22, 2016 at 08:03:06AM +0100, David Howells wrote:
> > Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
> > 
> > > >  (1) A 'bits' and maybe a 'bits64' type.  Currently you have to use unsigned
> > > >      long when you want to deploy a flags field with which you're going to use
> > > >      test_bit() and co. - but this typically wastes 32 bits on a 64-bit arch
> > > >      because you can't use bits 32-63 as they might not exist.
> > > 
> > > What is wrong with using DECLARE_BITMAP()? It will allocate exactly as
> > > many unsigned longs
> > 
> > You missed my point.  *unsigned long* is the issue.  The majority of the time
> > that wastes 32 bits on a 64-bit machine - especially when we don't need that
> > many flags.  On some 64-bit arches we could use unsigned int instead.
> 
> I was responding to your statement where you were saying you could not
> use bits 32 - 63 as they might not exist. If you have so many bits then
> DECLARE_BITMAP is useful. Otherwise we can simply use u{8|16|32} and
> BIT() macro.
> 
... unless one needs to statically initialize the bitmap. At least for my part
I have not been able to find a means to initialize a static bitmap with anything
but 0. Effectively this means that such bitmaps have to use u{8|16|32|64}
and thus cannot use any of the bitmap operations, or one has to use unsigned
long and thus can only use 32 bits.
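
One way to see the difficulty, as I understand it: which array word a given
bit lands in depends on the width of unsigned long, so a literal initializer
that is right on 64-bit is wrong on 32-bit.  The helpers below are a
hypothetical sketch that takes the word width as a parameter to show both
cases side by side.

```c
/* Index of the array word holding bit nr, for a given unsigned long width. */
unsigned int bit_word(unsigned int nr, unsigned int bits_per_long)
{
	return nr / bits_per_long;
}

/* Mask of bit nr within its word, for a given unsigned long width. */
unsigned long long bit_mask(unsigned int nr, unsigned int bits_per_long)
{
	return 1ULL << (nr % bits_per_long);
}
```

Bit 40, for example, is word 0 with mask 1 << 40 when longs are 64 bits
wide, but word 1 with mask 1 << 8 when they are 32 bits wide.  Only bits
0-31 land in the same place on both.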

Guenter

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21 16:21               ` Mark Brown
@ 2016-07-23  3:28                 ` Behan Webster
  0 siblings, 0 replies; 82+ messages in thread
From: Behan Webster @ 2016-07-23  3:28 UTC (permalink / raw)
  To: Shuah Khan; +Cc: sumit.semwal, ksummit-discuss

On 2016-07-21 09:21 AM, Mark Brown wrote:
> On Thu, Jul 21, 2016 at 03:02:12PM +0100, David Woodhouse wrote:
>> On Thu, 2016-07-21 at 07:41 -0600, Shuah Khan wrote:
>>> Would you be willing to share your experiences and the nature of bugs
>>> you were able to find using LLVM. Maybe that could be folded into this
>>> discussion as a real life experience.
>> http://llvm.linuxfoundation.org/index.php/Bugs
>> Probably horrifically out of date. Behan is the best person to ask for
>> a current status, I suspect...
> CCing in Sumit who I believe has also been looking at some issues
> building kernels with LLVM.
We're trying to restart the group that was working on this a
couple of years ago. Part of that is figuring out an up-to-date list of
issues. Some have been fixed, some are outstanding, and a few more
things have been unearthed.

Despite other people seemingly interested in this, the traditional 
LLVMLinux people have mostly been moved to different projects and have 
had less time to concentrate on this topic, which is why things have 
fallen behind.

Behan

-- 
Behan Webster
behanw@gmail.com

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM
  2016-07-21 18:38           ` Jiri Kosina
  2016-07-21 20:47             ` Paul Turner
@ 2016-07-26 11:22             ` David Woodhouse
  1 sibling, 0 replies; 82+ messages in thread
From: David Woodhouse @ 2016-07-26 11:22 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]

On Thu, 2016-07-21 at 20:38 +0200, Jiri Kosina wrote:
> On Thu, 21 Jul 2016, David Woodhouse wrote:
> 
> > Apart from resolutely not wanting to implement variable length arrays on 
> > the stack, the LLVM folks actually seem quite keen to make things work. 
> > I'm interested in the problem you report above.. and note the absence of 
> > a bug number. Can you provide it?
> 
> I am currently on vacation and on super-lousy internet connection, so 
> looking through my archives is a bit complicated ... I *think* it started 
> in "[PATCH] usbhid: Fix lockdep unannotated irqs-off warning" thread on 
> lkml.
> 
> In case you're not able to find it from there, I'll do my homework 
> mid-next week when I am back properly online.

OK, thanks. So it looks like that's acknowledged to have been a bug,
there's a patch to fix it at https://reviews.llvm.org/D6629 which may
even have been committed already.

The main sticking point for LLVM seems to be the variable length arrays
in structures (VLAIS), which is a GCC'ism that the LLVM/clang folks
*really* don't want to support, and the fact that
__builtin_constant_p() is basically always false under LLVM because it
doesn't look very hard.
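
For anyone who has not met the builtin, here is a minimal sketch of what
__builtin_constant_p() reports in the two unambiguous cases. The helper
names are made up for illustration and are not kernel code. Anything in
between (for instance an argument of an inlined function) depends on the
compiler and the optimization level, which is where the two compilers
can disagree.

```c
/* Hypothetical helper: a literal is always a compile-time constant. */
static inline int literal_is_constant(void)
{
	return __builtin_constant_p(42) ? 1 : 0;
}

/* Hypothetical helper: a volatile read can never be folded, so the
 * builtin reports 0 with any compiler and any optimization level. */
static inline int volatile_read_is_constant(void)
{
	volatile int v = 42;

	return __builtin_constant_p(v) ? 1 : 0;
}
```

Kernel macros typically use the builtin to pick a constant-folded fast
path and fall back to a run-time helper otherwise, so a compiler that
answers "not constant" more often produces slower but still correct code.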

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 13:57       ` Hannes Reinecke
  2016-07-22 14:40         ` Julia Lawall
  2016-07-22 19:12         ` Arnd Bergmann
@ 2016-07-26 11:48         ` David Woodhouse
  2016-07-26 12:53           ` Hannes Reinecke
                             ` (2 more replies)
  2016-08-11 15:44         ` Dan Carpenter
  3 siblings, 3 replies; 82+ messages in thread
From: David Woodhouse @ 2016-07-26 11:48 UTC (permalink / raw)
  To: Hannes Reinecke, Julia Lawall; +Cc: ksummit-discuss

On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
> 
> > I guess that almost all functions return only a few possible error codes?
> 
> Precisely. If we had a way of specifying "the return value is an errno
> with the possible values '0', '-EIO', and '-EINVAL'" that would be
> _so_ cool.

And perpetually out of date. Because functions call through to *other*
functions which might return an errno outside the 'known' set.

And why would you *want* to know the precise set of errnos that a
function might return, if not to deliberately code your error handling
non-defensively?

I can understand wanting to distinguish between errors and non-errors
and ensure that the ranges cannot overlap. But IS_ERR_VALUE() typically
reserves the whole range to -4095 (-MAX_ERRNO) for that. And I don't
think we'd ever want to do anything different.
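
As a rough userspace illustration of that reserved range (a sketch that
mirrors the idea behind IS_ERR_VALUE(), not the kernel macro itself;
is_err_value() is a name chosen here):

```c
#include <stdbool.h>

/* Same limit as the kernel: error codes occupy -1 .. -MAX_ERRNO. */
#define MAX_ERRNO 4095

/* True when x, viewed as an unsigned long, falls in the top MAX_ERRNO
 * values of the range, i.e. it is one of -1 .. -4095. */
static inline bool is_err_value(long x)
{
	return (unsigned long)x >= (unsigned long)-MAX_ERRNO;
}
```

With this, -1 and -4095 are errors, while 0, 123 and -4096 are ordinary
values, so any function whose valid results stay out of that window can
be checked without knowing which specific errnos it returns.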

In particular I don't want anyone ever saying "oh, -123 is a valid non-
error return but no other negative numbers are. But that's OK because
it'll never *actually* return an error of -ENOMEDIUM so there's no
ambiguity."

Of course, that's a silly example... but it's precisely where it sounds
like you're going, from the above citation :)

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-26 11:48         ` David Woodhouse
@ 2016-07-26 12:53           ` Hannes Reinecke
  2016-07-26 13:59             ` Alexey Dobriyan
  2016-07-26 13:53           ` Alexey Dobriyan
  2016-07-27 12:40           ` Julia Lawall
  2 siblings, 1 reply; 82+ messages in thread
From: Hannes Reinecke @ 2016-07-26 12:53 UTC (permalink / raw)
  To: David Woodhouse, Julia Lawall; +Cc: ksummit-discuss

On 07/26/2016 01:48 PM, David Woodhouse wrote:
> On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
>>
>>> I guess that almost all functions return only a few possible error codes?
>>
>> Precisely. If we had a way of specifying "the return value is an errno
>> with the possible values '0', '-EIO', and '-EINVAL'" that would be
>> _so_ cool.
>
> And perpetually out of date. Because functions call through to *other*
> functions which might return an errno outside the 'known' set.
>
What I want to catch with that are value range collisions: does the 'int'
returned from that function have the same meaning as the 'int' returned
from the next function?
Random example: drivers/net/veth.c:veth_newlink()
'rtnl_nla_parse_ifla()' returns a value which is stored in the same 
variable as the return value from veth_validate(). And that value is 
then used as the return value for the entire function.
ATM we need to do code inspection to figure out if both indeed return an 
errno or not.
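
A minimal sketch of how plain C could express "this int is an errno",
assuming a hypothetical wrapper type kerr_t. The helpers parse_attrs()
and validate_link() are invented stand-ins, not the real veth or rtnl
functions:

```c
#include <errno.h>

/* Hypothetical wrapper: an int that is known to be 0 or -errno. */
typedef struct { int code; } kerr_t;

static inline kerr_t kerr(int code)
{
	kerr_t r = { code };

	return r;
}

/* Invented stand-ins for two helpers whose results share one variable. */
static inline kerr_t parse_attrs(int len)
{
	return kerr(len < 0 ? -EINVAL : 0);
}

static inline kerr_t validate_link(int mtu)
{
	return kerr(mtu > 0 ? 0 : -EINVAL);
}

/* Both results have type kerr_t, so reusing 'err' is checked by the
 * compiler. A helper returning a plain int count could not be assigned
 * to 'err' without an explicit conversion. */
static inline kerr_t new_link(int len, int mtu)
{
	kerr_t err = parse_attrs(len);

	if (err.code)
		return err;
	err = validate_link(mtu);
	return err;
}
```

The struct costs nothing at run time on common ABIs, but it does require
touching every caller, which is the adoption problem in a nutshell.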

> And why would you *want* to know the precise set of errnos that a
> function might return, if not to deliberately code your error handling
> non-defensively?
>
The ultimate goal is to provide a map with the known return codes for 
the various functions. Then we can invert that map and _inject_ errors
via systemtap and friends for those functions to test the error paths.
Using a fuzzer would work, too, but I think it's a bit too generic here
(scanning the entire range of 'int' _does_ take some time).
In general we want to trigger the 'exciting' cases (i.e. values where 
there _is_ an error path coded) to figure out if the error handling 
actually does behave as advertised.
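
To make the idea concrete, here is a small userspace sketch, assuming a
hypothetical per-function table of known return codes. All names here
(struct errno_map, demo_read, handle_result) are invented for
illustration; a real setup would inject the values via systemtap or
similar rather than call the handler directly:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical record: the errno values a function is known to return. */
struct errno_map {
	const char *func;
	const int *codes;
	size_t ncodes;
};

static const int demo_codes[] = { -EIO, -EINVAL };
static const struct errno_map demo_map = { "demo_read", demo_codes, 2 };

/* Hypothetical caller under test: maps each result to a recovery action. */
enum action { ACT_NONE, ACT_RETRY, ACT_REJECT, ACT_UNHANDLED };

static inline enum action handle_result(int ret)
{
	switch (ret) {
	case 0:
		return ACT_NONE;
	case -EIO:
		return ACT_RETRY;
	case -EINVAL:
		return ACT_REJECT;
	default:
		return ACT_UNHANDLED;
	}
}

/* Inject every known code and count those without a dedicated handler. */
static inline size_t count_unhandled(const struct errno_map *m)
{
	size_t i, missing = 0;

	for (i = 0; i < m->ncodes; i++)
		if (handle_result(m->codes[i]) == ACT_UNHANDLED)
			missing++;
	return missing;
}
```

The loop only visits the handful of values in the map instead of the
whole 'int' range, which is the advantage over a generic fuzzer.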

> I can understand wanting to distinguish between errors and non-errors
> and ensure that the ranges cannot overlap. But IS_ERR_VALUE() typically
> reserves the whole range to -4095 (-MAX_ERRNO) for that. And I don't
> think we'd ever want to do anything different.
>
Even that would be fine; even restricting the range from the entire 
'int' to 4096 will make life easier.
But ATM we don't even have a way of expressing that.

> In particular I don't want anyone ever saying "oh, -123 is a valid non-
> error return but no other negative numbers are. But that's OK because
> it'll never *actually* return an error of -ENOMEDIUM so there's no
> ambiguity."
>
Na, of course not.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-26 11:48         ` David Woodhouse
  2016-07-26 12:53           ` Hannes Reinecke
@ 2016-07-26 13:53           ` Alexey Dobriyan
  2016-07-27 12:40           ` Julia Lawall
  2 siblings, 0 replies; 82+ messages in thread
From: Alexey Dobriyan @ 2016-07-26 13:53 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Tue, Jul 26, 2016 at 2:48 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
>>
>> > I guess that almost all functions return only a few possible error codes?
>>
>> Precisely. If we had a way of specifying "the return value is an errno
>> with the possible values '0', '-EIO', and '-EINVAL'" that would be
>> _so_ cool.
>
> And perpetually out of date. Because functions call through to *other*
> functions which might return an errno outside the 'known' set.
>
> And why would you *want* to know the precise set of errnos that a
> function might return, if not to deliberately code your error handling
> non-defensively?

Java has checked exceptions.
Obviously, people "catch (e) {}" them.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-26 12:53           ` Hannes Reinecke
@ 2016-07-26 13:59             ` Alexey Dobriyan
  0 siblings, 0 replies; 82+ messages in thread
From: Alexey Dobriyan @ 2016-07-26 13:59 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Tue, Jul 26, 2016 at 3:53 PM, Hannes Reinecke <hare@suse.com> wrote:
> On 07/26/2016 01:48 PM, David Woodhouse wrote:
>>
>> On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
>>>
>>>
>>>> I guess that almost all functions return only a few possible error
>>>> codes?
>>>
>>>
>>> Precisely. If we had a way of specifying "the return value is an errno
>>> with the possible values '0', '-EIO', and '-EINVAL'" that would be
>>> _so_ cool.
>>
>>
>> And perpetually out of date. Because functions call through to *other*
>> functions which might return an errno outside the 'known' set.
>>
> What I want to catch with that are value range collisions; has the 'int'
> returned from that function the same meaning as the 'int' returned from the
> next function.
> Random example: drivers/net/veth.c:veth_newlink()
> 'rtnl_nla_parse_ifla()' returns a value which is stored in the same variable
> as the return value from veth_validate(). And that value is then used as the
> return value for the entire function.
> ATM we need to do code inspection to figure out if both indeed return an
> errno or not.
>
>> And why would you *want* to know the precise set of errnos that a
>> function might return, if not to deliberately code your error handling
>> non-defensively?
>>
> The ultimate goal is to provide a map with the known return codes for the
> various functions. Then we can invert that map and _inject_ errors
> via systemtap and friends for those functions to test the error paths.
> Using a fuzzer would work, too, but I think it's a bit too generic here
> (scanning the entire range of 'int' _does_ take some time).
> In general we want to trigger the 'exciting' cases (i.e. values where there
> _is_ an error path coded) to figure out if the error handling actually does
> behave as advertised.
>
>> I can understand wanting to distinguish between errors and non-errors
>> and ensure that the ranges cannot overlap. But IS_ERR_VALUE() typically
>> reserves the whole range to -4095 (-MAX_ERRNO) for that. And I don't
>> think we'd ever want to do anything different.
>>
> Even that would be fine; even restricting the range from the entire 'int' to
> 4096 will make life easier.
> But ATM we don't even have a way of expressing that.

This requires rewriting kernel in a programming language
with real type system. :^)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-26 11:48         ` David Woodhouse
  2016-07-26 12:53           ` Hannes Reinecke
  2016-07-26 13:53           ` Alexey Dobriyan
@ 2016-07-27 12:40           ` Julia Lawall
  2016-07-27 13:25             ` James Bottomley
  2 siblings, 1 reply; 82+ messages in thread
From: Julia Lawall @ 2016-07-27 12:40 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss



On Tue, 26 Jul 2016, David Woodhouse wrote:

> On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
> >
> > > I guess that almost all functions return only a few possible error codes?
> >
> > Precisely. If we had a way of specifying "the return value is an errno
> > with the possible values '0', '-EIO', and '-EINVAL'" that would be
> > _so_ cool.
>
> And perpetually out of date. Because functions call through to *other*
> functions which might return an errno outside the 'known' set.

If you have a script to calculate it, it doesn't have to be perpetually
out of date.  The problem is just the time to collect the information for
the whole kernel.  It could be a good intern project.

julia

>
> And why would you *want* to know the precise set of errnos that a
> function might return, if not to deliberately code your error handling
> non-defensively?
>
> I can understand wanting to distinguish between errors and non-errors
> and ensure that the ranges cannot overlap. But IS_ERR_VALUE() typically
> reserves the whole range to -4095 (-MAX_ERRNO) for that. And I don't
> think we'd ever want to do anything different.
>
> In particular I don't want anyone ever saying "oh, -123 is a valid non-
> error return but no other negative numbers are. But that's OK because
> it'll never *actually* return an error of -ENOMEDIUM so there's no
> ambiguity."
>
> Of course, that's a silly example... but it's precisely where it sounds
> like you're going, from the above citation :)
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-27 12:40           ` Julia Lawall
@ 2016-07-27 13:25             ` James Bottomley
  2016-07-27 13:33               ` David Woodhouse
  0 siblings, 1 reply; 82+ messages in thread
From: James Bottomley @ 2016-07-27 13:25 UTC (permalink / raw)
  To: Julia Lawall, David Woodhouse; +Cc: ksummit-discuss

On Wed, 2016-07-27 at 14:40 +0200, Julia Lawall wrote:
> 
> On Tue, 26 Jul 2016, David Woodhouse wrote:
> 
> > On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
> > > 
> > > > I guess that almost all functions return only a few possible
> > > > error codes?
> > > 
> > > Precisely. If we had a way of specifying "the return value is an 
> > > errno with the possible values '0', '-EIO', and '-EINVAL'" that 
> > > would be _so_ cool.
> > 
> > And perpetually out of date. Because functions call through to 
> > *other* functions which might return an errno outside the 'known'
> > set.
> 
> If you have a script to calculate it, it doesn't have to be 
> perpetually out of date.  The problem is just the time to collect the 
> information for the whole kernel.  It could be a good intern project.

It's a lot of pain, for what gain?  What, practically would we get as a
benefit if we did this?  Every time I see proposals about scripting
checks in the kernel, I'm reminded of our section mismatch debacle. 
 Life is so much easier without every kernel release generating 100s of
patches trying to correct section mismatches which didn't matter in the
first place ...

James

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-27 13:25             ` James Bottomley
@ 2016-07-27 13:33               ` David Woodhouse
  2016-07-27 17:21                 ` Bird, Timothy
  0 siblings, 1 reply; 82+ messages in thread
From: David Woodhouse @ 2016-07-27 13:33 UTC (permalink / raw)
  To: James Bottomley, Julia Lawall; +Cc: ksummit-discuss

On Wed, 2016-07-27 at 09:25 -0400, James Bottomley wrote:
> On Wed, 2016-07-27 at 14:40 +0200, Julia Lawall wrote:
> > 
> > On Tue, 26 Jul 2016, David Woodhouse wrote:
> > 
> > > On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
> > > > 
> > > > > I guess that almost all functions return only a few possible
> > > > > error codes?
> > > > 
> > > > Precisely. If we had a way of specifying "the return value is an 
> > > > errno with the possible values '0', '-EIO', and '-EINVAL'" that 
> > > > would be _so_ cool.
> > > 
> > > And perpetually out of date. Because functions call through to 
> > > *other* functions which might return an errno outside the 'known'
> > > set.
> > 
> > If you have a script to calculate it, it doesn't have to be 
> > perpetually out of date.  The problem is just the time to collect the 
> > information for the whole kernel.  It could be a good intern project.
> 
> It's a lot of pain, for what gain?  What, practically would we get as a
> benefit if we did this?  Every time I see proposals about scripting
> checks in the kernel, I'm reminded of our section mismatch debacle. 
>  Life is so much easier without every kernel release generating 100s of
> patches trying to correct section mismatches which didn't matter in the
> first place ...

To find functions where the -errno returns might be ambiguous with real
valid return values, might be beneficial. If it doesn't have too many
false positives.

To tell *which* specific errnos might be returned by a given
function... I can't see any benefit in that ever.

-- 
dwmw2

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-27 13:33               ` David Woodhouse
@ 2016-07-27 17:21                 ` Bird, Timothy
  2016-08-01 22:17                   ` Rob Herring
  0 siblings, 1 reply; 82+ messages in thread
From: Bird, Timothy @ 2016-07-27 17:21 UTC (permalink / raw)
  To: David Woodhouse, James Bottomley, Julia Lawall; +Cc: ksummit-discuss



> -----Original Message-----
> From: ksummit-discuss-bounces@lists.linuxfoundation.org [mailto:ksummit-
> discuss-bounces@lists.linuxfoundation.org] On Behalf Of David Woodhouse
> Sent: Wednesday, July 27, 2016 5:34 AM
> To: James Bottomley <James.Bottomley@HansenPartnership.com>; Julia Lawall
> <julia.lawall@lip6.fr>
> Cc: ksummit-discuss@lists.linuxfoundation.org
> Subject: Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux
> kernel
> 
> On Wed, 2016-07-27 at 09:25 -0400, James Bottomley wrote:
> > On Wed, 2016-07-27 at 14:40 +0200, Julia Lawall wrote:
> > >
> > > On Tue, 26 Jul 2016, David Woodhouse wrote:
> > >
> > > > On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
> > > > >
> > > > > > I guess that almost all functions return only a few possible
> > > > > > error codes?
> > > > >
> > > > > Precisely. If we had a way of specifying "the return value is an
> > > > > errno with the possible values '0', '-EIO', and '-EINVAL'" that
> > > > > would be _so_ cool.
> > > >
> > > > And perpetually out of date. Because functions call through to
> > > > *other* functions which might return an errno outside the 'known'
> > > > set.
> > >
> > > If you have a script to calculate it, it doesn't have to be
> > > perpetually out of date.  The problem is just the time to collect the
> > > information for the whole kernel.  It could be a good intern project.
> >
> > It's a lot of pain, for what gain?  What, practically would we get as a
> > benefit if we did this?  Every time I see proposals about scripting
> > checks in the kernel, I'm reminded of our section mismatch debacle.
> >  Life is so much easier without every kernel release generating 100s of
> > patches trying to correct section mismatches which didn't matter in the
> > first place ...
> 
> To find functions where the -errno returns might be ambiguous with real
> valid return values, might be beneficial. If it doesn't have too many
> false positives.

It might be useful to have a list of possible errno generation points, for a particular routine,
to make it easier to find the origin of a problem.  Sometimes when you're unfamiliar with
some bit of code, manually walking back through the call chain in the source is a hassle.
I'm reminded of that trick where someone (I don't recall who) embedded the line number
in the errno.
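
I don't have the original at hand, so the following is only a guess at
what such a trick could look like: a debug-only encoding that packs the
source line above the errno bits. The names (err_encode, ERR_HERE,
err_code, err_line) and the 12-bit split are assumptions made for this
sketch.

```c
#include <errno.h>

/* Hypothetical debug encoding: the low 12 bits hold the errno, the bits
 * above hold the source line that generated it. */
#define ERR_BITS 12
#define ERR_MASK ((1 << ERR_BITS) - 1)

static inline int err_encode(int err, int line)
{
	return -((line << ERR_BITS) | err);
}

/* Use at the point where the error is first generated. */
#define ERR_HERE(err) err_encode((err), __LINE__)

/* Recover the plain errno (positive) and the originating line. */
static inline int err_code(int ret)
{
	return -ret & ERR_MASK;
}

static inline int err_line(int ret)
{
	return -ret >> ERR_BITS;
}
```

Note that any value encoded this way lies below -MAX_ERRNO, so
IS_ERR_VALUE() style checks would no longer recognise it; something like
this could only ever be a local debugging aid.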


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-20 12:11     ` Jan Kara
@ 2016-07-28  3:33       ` Steven Rostedt
  0 siblings, 0 replies; 82+ messages in thread
From: Steven Rostedt @ 2016-07-28  3:33 UTC (permalink / raw)
  To: Jan Kara; +Cc: James Bottomley, ksummit-discuss

On Wed, 20 Jul 2016 14:11:10 +0200
Jan Kara <jack@suse.cz> wrote:

> > Nothing.  I am just honestly looking at ways that we can get things to
> > always or almost always run.   Sparse isn't getting run regularly now so
> > I was suspect that would not be as good of a solution.  
> 
> Isn't sparse run by 0-day testing? I thought it is...

I get sparse warnings from the 0-day testing, so I believe that it is.

-- Steve

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
                     ` (2 preceding siblings ...)
  2016-07-22  7:03   ` David Howells
@ 2016-07-28  3:40   ` Steven Rostedt
  2016-07-28  7:12   ` David Howells
  2016-08-02 10:48   ` Jani Nikula
  5 siblings, 0 replies; 82+ messages in thread
From: Steven Rostedt @ 2016-07-28  3:40 UTC (permalink / raw)
  To: David Howells; +Cc: ksummit-discuss

On Thu, 21 Jul 2016 16:05:25 +0100
David Howells <dhowells@redhat.com> wrote:


>  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
> 
>      It seems like you can't mix BH and non-BH ops on a spinlock without
>      lockdep barking.  If that's the case, let's make this a compile-time
>      check.
> 

Not sure what you mean here. You can use BH spinlocks as non BH
spinlocks while in a BH, or if BH is already disabled, and lockdep will
not complain.

-- Steve

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
                     ` (3 preceding siblings ...)
  2016-07-28  3:40   ` Steven Rostedt
@ 2016-07-28  7:12   ` David Howells
  2016-08-02 10:48   ` Jani Nikula
  5 siblings, 0 replies; 82+ messages in thread
From: David Howells @ 2016-07-28  7:12 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: ksummit-discuss

Steven Rostedt <rostedt@goodmis.org> wrote:

> >  (2) Differentiate non-BH spinlocks and BH spinlocks by type.
> > 
> >      It seems like you can't mix BH and non-BH ops on a spinlock without
> >      lockdep barking.  If that's the case, let's make this a compile-time
> >      check.
> > 
> 
> Not sure what you mean here. You can use BH spinlocks as non BH
> spinlocks while in a BH, or if BH is already disabled, and lockdep will
> not complain.

Okay, fair enough.

David

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 21:26 ` Josh Triplett
  2016-07-20  2:36   ` Eric W. Biederman
@ 2016-07-30 18:03   ` Eric W. Biederman
  2016-07-30 18:49     ` Josh Triplett
  1 sibling, 1 reply; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-30 18:03 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss


Josh Triplett <josh@joshtriplett.org> writes:

> On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> Would a gcc plugin that checks the most interesting things that sparse
>> checks on every build be interesting? (endianness of integer types for example)
>
> I'd like to see those checks more widely available, ideally not just as
> plugins.  Some exploration of that occurred upstream:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59852 (bitwise/endian types)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59856 (contexts/locking)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59851 (nocast: no implicit conversions)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59850 (address spaces)
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59855 (designated_init; done)
>
> I'd love to see someone pick those up and get them into upstream GCC.
>
>> Would a type system for pointers derived from separation logic that
>> has the concept that a piece of data is owned by a piece of running
>> code rather than another piece of data be interesting?
>
> Interesting, yes, but trying to track "ownership" gets complicated
> *fast* to handle real-world cases.  Rust went through quite a lot of
> work, and multiple iterations, to get to the system it has now.  I don't
> think you'd be able to handle many of the cases in the kernel without
> about that much complexity.

Politely Rust did it the stupid way.  "ownership" or perhaps better said
who is allowed to modify the data is an active piece of code thing
rather than a data thing, and Rust did it as a data thing.

The kernel has some very interesting data structures and all kinds of
hand rolled smp synchronization primitives.   So I don't doubt there
will be challenges, but the situation will look nothing like what Rust
did.

>> I would really like to get a feel among kernel maintainers and
>> developers if this is something that is interesting, and what kind of
>> constraints they think something like this would need to be usable for
>> the kernel?
>
> I think the biggest constraint is that new tools get very slow adoption,
> and it's incredibly difficult to introduce a new *mandatory* tool or
> compiler version (with the exception of tools that ship with the
> kernel).  And optional ones have a tendency to break due to patches from
> people not running them.  Apart from that: false positive rate.
>
> Ideally, build something you can opt into using, such that if you
> explicitly use it, the false positive rate should be *zero* by design.

Which is what makes types as opposed to other things nice.  Types
arguably don't have a false positive rate.  Unfortunately types can
sometimes be very pedantic and if everyone isn't using them things go
splat.
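
For reference, the endianness case mentioned at the top of the thread
looks roughly like this. It is a sketch in the style of the sparse
annotations; the sketch_ and _sketch names are invented here so as not
to clash with the real kernel macros and types:

```c
#include <stdint.h>

/* sparse defines __CHECKER__; a regular compiler sees plain integers. */
#ifdef __CHECKER__
#define sketch_bitwise __attribute__((bitwise))
#define sketch_force   __attribute__((force))
#else
#define sketch_bitwise
#define sketch_force
#endif

/* A 32-bit value stored in little-endian byte order. */
typedef uint32_t sketch_bitwise le32_sketch;

static inline le32_sketch cpu_to_le32_sketch(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	v = __builtin_bswap32(v);	/* swap only on big-endian hosts */
#endif
	return (sketch_force le32_sketch)v;
}

static inline uint32_t le32_to_cpu_sketch(le32_sketch v)
{
	uint32_t raw = (sketch_force uint32_t)v;

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	raw = __builtin_bswap32(raw);
#endif
	return raw;
}
```

Under the checker, mixing le32_sketch with a plain uint32_t without
going through one of the two helpers is a type error. Under a plain
compiler nothing is checked, which is exactly the "if everyone isn't
using them" problem.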

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-30 18:03   ` Eric W. Biederman
@ 2016-07-30 18:49     ` Josh Triplett
  2016-07-30 19:34       ` Eric W. Biederman
  0 siblings, 1 reply; 82+ messages in thread
From: Josh Triplett @ 2016-07-30 18:49 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

On Sat, Jul 30, 2016 at 01:03:30PM -0500, Eric W. Biederman wrote:
> 
> Josh Triplett <josh@joshtriplett.org> writes:
> 
> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> >> Would a gcc plugin that checks the most interesting things that sparse
> >> checks on every build be interesting? (endianness of integer types for example)
> >
> > I'd like to see those checks more widely available, ideally not just as
> > plugins.  Some exploration of that occurred upstream:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59852 (bitwise/endian types)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59856 (contexts/locking)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59851 (nocast: no implicit conversions)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59850 (address spaces)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59855 (designated_init; done)
> >
> > I'd love to see someone pick those up and get them into upstream GCC.
> >
> >> Would a type system for pointers derived from separation logic that
> >> has the concept that a piece of data is owned by a piece of running
> >> code rather than another piece of data be interesting?
> >
> > Interesting, yes, but trying to track "ownership" gets complicated
> > *fast* to handle real-world cases.  Rust went through quite a lot of
> > work, and multiple iterations, to get to the system it has now.  I don't
> > think you'd be able to handle many of the cases in the kernel without
> > about that much complexity.
> 
> Politely Rust did it the stupid way.  "ownership" or perhaps better said
> who is allowed to modify the data is an active piece of code thing
> rather than a data thing, and Rust did it as a data thing.

I'd be interested to hear more details on what you mean by this, because
the way you've described it doesn't make sense to me.  The way lifetimes
are implemented in Rust seems very much like "an active piece of code
thing".  Can you give an example of the distinction you're making?

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-30 18:49     ` Josh Triplett
@ 2016-07-30 19:34       ` Eric W. Biederman
  2016-07-30 20:56         ` Josh Triplett
  0 siblings, 1 reply; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-30 19:34 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

Josh Triplett <josh@joshtriplett.org> writes:

> On Sat, Jul 30, 2016 at 01:03:30PM -0500, Eric W. Biederman wrote:
>> 
>> Josh Triplett <josh@joshtriplett.org> writes:
>> 
>> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> >> Would a type system for pointers derived from separation logic that
>> >> has the concept that a piece of data is owned by a piece of running
>> >> code rather than another piece of data be interesting?
>> >
>> > Interesting, yes, but trying to track "ownership" gets complicated
>> > *fast* to handle real-world cases.  Rust went through quite a lot of
>> > work, and multiple iterations, to get to the system it has now.  I don't
>> > think you'd be able to handle many of the cases in the kernel without
>> > about that much complexity.
>> 
>> Politely Rust did it the stupid way.  "ownership" or perhaps better said
>> who is allowed to modify the data is an active piece of code thing
>> rather than a data thing, and Rust did it as a data thing.
>
> I'd be interested to hear more details on what you mean by this, because
> the way you've described it doesn't make sense to me.  The way lifetimes
> are implemented in Rust seems very much like "an active piece of code
> thing".  Can you give an example of the distinction you're making?

But that is a lifetime of a piece of data, that isn't ownership of data.
The data is still owned by some pointer.

Ownership by code looks roughly like:

	head = acquire_list(&task_list);

	/* At the data level head points to the first element of the list */
        /* The type of head is a recursive type that includes every
         * element on the list.
         */
         for (ptr = head; ptr; ptr = ptr->next) {
         	/* The type of ptr shares with head the type of the list.
                 * Which allows both ptr and head to be valid
                 * and the code of this function to continue owning the list.
                 */
                 ptr->scratch++;
                 /* As the owner any mutation may be performed on the list elements */
	}

	release_list(list);
        /* Accessing list past this point would be a type error */

Where acquire_list grabs the appropriate spinlock and then returns
ownership of the list to the calling function through a pointer to the
first element of the list.

Similarly release_list takes a pointer to the first element of a list
and the ownership of the list, and releases the lock.

In the area in which you have ownership of the list you can use as many
different pointers as you like and they are all valid and all may be
used because the code can see all of the aliases, and nothing special
needs to be done because the code owns the list not any particular
pointer.

Which means that since one element of the list does not contain a
pointer owning the next element of the list, doubly linked lists
are not a problem.
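
For concreteness, a compilable userspace sketch of that discipline,
under some assumptions: the lock is a minimal C11 test-and-set flag
rather than a kernel spinlock, release_list() takes the list as well as
the head, and bump_all() is an invented caller. Plain C does not check
any of the ownership rules; the comments mark where the proposed type
system would.

```c
#include <stdatomic.h>
#include <stddef.h>

struct task {
	struct task *next, *prev;	/* doubly linked: no element owns another */
	int scratch;
};

struct task_list {
	atomic_flag lock;		/* minimal test-and-set spinlock */
	struct task *first;
};

/* Take the lock and hand the whole list to the caller via its first element. */
static inline struct task *acquire_list(struct task_list *list)
{
	while (atomic_flag_test_and_set_explicit(&list->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
	return list->first;
}

/* Give the list back; the caller must not use head afterwards. */
static inline void release_list(struct task_list *list, struct task *head)
{
	list->first = head;
	atomic_flag_clear_explicit(&list->lock, memory_order_release);
}

/* While the list is held, head and ptr alias freely and either may mutate. */
static inline int bump_all(struct task_list *list)
{
	struct task *head = acquire_list(list);
	struct task *ptr;
	int n = 0;

	for (ptr = head; ptr; ptr = ptr->next) {
		ptr->scratch++;
		n++;
	}
	release_list(list, head);
	/* Under the proposed types, touching head or ptr here would be an error. */
	return n;
}
```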

This is not to say there are not tricky bits, but things like borrows
are replaced by simpler concepts.

I hope that is enough to clarify where I am coming from.  If not I
suspect it will have to wait until I can find the time to release my
proof of concept and documentation of all of this.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-30 19:34       ` Eric W. Biederman
@ 2016-07-30 20:56         ` Josh Triplett
  2016-07-30 22:21           ` Eric W. Biederman
  0 siblings, 1 reply; 82+ messages in thread
From: Josh Triplett @ 2016-07-30 20:56 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

On Sat, Jul 30, 2016 at 02:34:33PM -0500, Eric W. Biederman wrote:
> Josh Triplett <josh@joshtriplett.org> writes:
> 
> > On Sat, Jul 30, 2016 at 01:03:30PM -0500, Eric W. Biederman wrote:
> >> 
> >> Josh Triplett <josh@joshtriplett.org> writes:
> >> 
> >> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> >> >> Would a type system for pointers derived from separation logic that
> >> >> has the concept that a piece of data is owned by a piece of running
> >> >> code rather than another piece of data be interesting?
> >> >
> >> > Interesting, yes, but trying to track "ownership" gets complicated
> >> > *fast* to handle real-world cases.  Rust went through quite a lot of
> >> > work, and multiple iterations, to get to the system it has now.  I don't
> >> > think you'd be able to handle many of the cases in the kernel without
> >> > about that much complexity.
> >> 
> >> Politely Rust did it the stupid way.  "ownership" or perhaps better said
> >> who is allowed to modify the data is an active piece of code thing
> >> rather than a data thing, and Rust did it as a data thing.
> >
> > I'd be interested to hear more details on what you mean by this, because
> > the way you've described it doesn't make sense to me.  The way lifetimes
> > are implemented in Rust seems very much like "an active piece of code
> > thing".  Can you give an example of the distinction you're making?
> 
> But that is a lifetime of a piece of data, that isn't ownership of data.
> The data is still owned by some pointer.
> 
> Ownership by code looks roughly like:
> 
> 	head = acquire_list(&task_list);
> 
> 	/* At the data level head points to the first element of the list */
>         /* The type of head is a recursive type that includes every
>          * element on the list.
>          */
>          for (ptr = head; ptr; ptr = ptr->next) {
>          	/* The type of ptr shares with head the type of the list.
>                  * Which allows both ptr and head to be valid
>                  * and the code of this function to continue owning the list.
>                  */
>                  ptr->scratch++;
>                  /* As the owner any mutation may be performed on the list elements */
> 	}
> 
> 	release_list(list);
>         /* Accessing list past this point would be a type error */
> 
> Where acquire_list grabs the appropriate spinlock and then returns
> ownership of the list to the calling function through a pointer to the
> first element of the list.

(Nit: I think in that last line you wanted release_list(head);)

What happens if the code saves a copy of either head, ptr, or something
accessed via ptr, and then some other distant code accesses that after
release_list?  (For instance, consider a function that takes a pointer
to "scratch" and retains that pointer.)  In the type system you
envision, what prevents that?

What happens if the code in that region, while looking at one value of
ptr, changes the list in a way that invalidates ptr?  What prevents
that?

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-30 20:56         ` Josh Triplett
@ 2016-07-30 22:21           ` Eric W. Biederman
  0 siblings, 0 replies; 82+ messages in thread
From: Eric W. Biederman @ 2016-07-30 22:21 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

Josh Triplett <josh@joshtriplett.org> writes:

> On Sat, Jul 30, 2016 at 02:34:33PM -0500, Eric W. Biederman wrote:
>> Josh Triplett <josh@joshtriplett.org> writes:
>> 
>> > On Sat, Jul 30, 2016 at 01:03:30PM -0500, Eric W. Biederman wrote:
>> >> 
>> >> Josh Triplett <josh@joshtriplett.org> writes:
>> >> 
>> >> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> >> >> Would a type system for pointers derived from separation logic that
>> >> >> has the concept that a piece of data is owned by a piece of running
>> >> >> code rather than another piece of data be interesting?
>> >> >
>> >> > Interesting, yes, but trying to track "ownership" gets complicated
>> >> > *fast* to handle real-world cases.  Rust went through quite a lot of
>> >> > work, and multiple iterations, to get to the system it has now.  I don't
>> >> > think you'd be able to handle many of the cases in the kernel without
>> >> > about that much complexity.
>> >> 
>> >> Politely Rust did it the stupid way.  "ownership" or perhaps better said
>> >> who is allowed to modify the data is an active piece of code thing
>> >> rather than a data thing, and Rust did it as a data thing.
>> >
>> > I'd be interested to hear more details on what you mean by this, because
>> > the way you've described it doesn't make sense to me.  The way lifetimes
>> > are implemented in Rust seems very much like "an active piece of code
>> > thing".  Can you give an example of the distinction you're making?
>> 
>> But that is a lifetime of a piece of data, that isn't ownership of data.
>> The data is still owned by some pointer.
>> 
>> Ownership by code looks roughly like:
>> 
>> 	head = acquire_list(&task_list);
>> 
>> 	/* At the data level head points to the first element of the list */
>>         /* The type of head is a recursive type that includes every
>>          * element on the list.
>>          */
>>          for (ptr = head; ptr; ptr = ptr->next) {
>>          	/* The type of ptr shares with head the type of the list.
>>                  * Which allows both ptr and head to be valid
>>                  * and the code of this function to continue owning the list.
>>                  */
>>                  ptr->scratch++;
>>                  /* As the owner any mutation may be performed on the list elements */
>> 	}
>> 
>> 	release_list(list);
>>         /* Accessing list past this point would be a type error */
>> 
>> Where acquire_list grabs the appropriate spinlock and then returns
>> ownership of the list to the calling function through a pointer to the
>> first element of the list.
>
> (Nit: I think in that last line you wanted release_list(head);)

Yes.

> What happens if the code saves a copy of either head, ptr, or something
> accessed via ptr, and then some other distant code accesses that after
> release_list?  (For instance, consider a function that takes a pointer
> to "scratch" and retains that pointer.)  In the type system you
> envision, what prevents that?

It depends a bit on how the copy is saved.  Without something special,
when you reach the line release_list(head), the return of release_list
consumes the list as free does.

Which type-wise means that the pointers you talk of remain valid things,
but you cannot dereference them because the type they point to has
become unusable.  Which makes all future dereferences impossible.

> What happens if the code in that region, while looking at one value of
> ptr, changes the list in a way that invalidates ptr?  What prevents
> that?

Depends.  If it invalidates ptr, ptr becomes unusable (at least
undereferenceable), and dereferencing the pointer past that point
does not type check.

Which is a long way of saying that functions are allowed to change the
destination types of pointers that are passed into them.  This changing
what pointers point to is the type equivalent of having preconditions
and postconditions.

Plus there is the other benefit to being able to change the type of
things by functions.  It is possible to build types that enforce state
machines by changing the type of a variable and only making methods
available when they are valid to use.
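
As a rough approximation in plain C, and only a sketch: C cannot change
the type of an existing variable, so the closest stand-in is one struct
type per state, with transition functions that take one type and return
another.  All names below (conn_closed, conn_open, conn_connect,
conn_send, conn_close) are made up for illustration.

```c
/* Hypothetical two-state object: each state gets its own type, and a
 * transition takes a value of one type and returns the other. */
struct conn_closed { int id; };
struct conn_open   { int id; int bytes_sent; };

/* closed -> open */
static struct conn_open conn_connect(struct conn_closed c)
{
	struct conn_open o = { .id = c.id, .bytes_sent = 0 };
	return o;
}

/* Only accepts an open connection; passing a struct conn_closed
 * here is rejected by the compiler. */
static struct conn_open conn_send(struct conn_open o, int len)
{
	o.bytes_sent += len;
	return o;
}

/* open -> closed */
static struct conn_closed conn_close(struct conn_open o)
{
	struct conn_closed c = { .id = o.id };
	return c;
}
```

Unlike the type system described here, nothing in C stops the caller
from reusing the old value after a transition; that is exactly the part
that needs alias tracking.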

And yes all of this depends upon having enough information to know
(within a small window) where all of the aliases for any value are at
any given time.

Eric

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-27 17:21                 ` Bird, Timothy
@ 2016-08-01 22:17                   ` Rob Herring
  2016-08-12  1:29                     ` Stephen Boyd
  0 siblings, 1 reply; 82+ messages in thread
From: Rob Herring @ 2016-08-01 22:17 UTC (permalink / raw)
  To: Bird, Timothy; +Cc: James Bottomley, ksummit-discuss

On Wed, Jul 27, 2016 at 12:21 PM, Bird, Timothy <Tim.Bird@am.sony.com> wrote:
>
>
>> -----Original Message-----
>> From: ksummit-discuss-bounces@lists.linuxfoundation.org [mailto:ksummit-
>> discuss-bounces@lists.linuxfoundation.org] On Behalf Of David Woodhouse
>> Sent: Wednesday, July 27, 2016 5:34 AM
>> To: James Bottomley <James.Bottomley@HansenPartnership.com>; Julia Lawall
>> <julia.lawall@lip6.fr>
>> Cc: ksummit-discuss@lists.linuxfoundation.org
>> Subject: Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux
>> kernel
>>
>> On Wed, 2016-07-27 at 09:25 -0400, James Bottomley wrote:
>> > On Wed, 2016-07-27 at 14:40 +0200, Julia Lawall wrote:
>> > >
>> > > On Tue, 26 Jul 2016, David Woodhouse wrote:
>> > >
>> > > > On Fri, 2016-07-22 at 15:57 +0200, Hannes Reinecke wrote:
>> > > > >
>> > > > > > I guess that almost all functions return only a few possible
>> > > > > > error codes?
>> > > > >
>> > > > > Precisely. If we had a way of specifying "the return value is an
>> > > > > errno with the possible values '0', '-EIO', and '-EINVAL'" that
>> > > > > would be _so_ cool.
>> > > >
>> > > > And perpetually out of date. Because functions call through to
>> > > > *other* functions which might return an errno outside the 'known'
>> > > > set.
>> > >
>> > > If you have a script to calculate it, it doesn't have to be
>> > > perpetually out of date.  The problem is just the time to collect the
>> > > information for the whole kernel.  It could be a good intern project.
>> >
>> > It's a lot of pain, for what gain?  What, practically would we get as a
>> > benefit if we did this?  Every time I see proposals about scripting
>> > checks in the kernel, I'm reminded of our section mismatch debacle.
>> >  Life is so much easier without every kernel release generating 100s of
>> > patches trying to correct section mismatches which didn't matter in the
>> > first place ...
>>
>> To find functions where the -errno returns might be ambiguous with real
>> valid return values, might be beneficial. If it doesn't have too many
>> false positives.
>
> It might be useful to have a list of possible errno generation points, for a particular routine,
> to make it easier to find the origin of a problem.  Sometimes when you're unfamiliar with
> some bit of code, manually walking back through the call chain in the source is a hassle.
> I'm reminded of that trick where someone (I don't recall who) embedded the line number
> in the errno.

I'd be interested if you could find what you are referring to. I would
love to see something better here than my printk debugger. A list of
possible errnos would help some, but I'd expect we typically have
multiple called functions returning the same possible errnos, so you'd
still be narrowing down the source manually. Debugging a driver probe
that failed because a config option was not enabled is painful.
Maybe I'm the only one that switches between boards/kernels a lot and
does a poor job maintaining working configs.

Wouldn't something as simple as turning all the errnos into backtrace
printks or tracepoints be doable? It might have to be enabled on
probe() entry to avoid any "normal" errors. Any code that initializes
the return value with an error would be a false positive. A major
benefit to something like this is it could get rid of lots of error
path printks in drivers that bloat the kernel.
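
A rough userspace sketch of the idea, not a proposal for actual kernel
code: every error is generated through a macro that notes where it came
from.  The names ERR(), err_note() and demo_probe() are hypothetical,
and fprintf() stands in for a printk or tracepoint.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical record of where the most recent error was generated. */
static const char *err_file;
static int err_line;

/* Note the origin, then yield the usual negative errno value. */
static int err_note(int err, const char *file, int line)
{
	err_file = file;
	err_line = line;
	fprintf(stderr, "error %d generated at %s:%d\n", err, file, line);
	return -err;
}

#define ERR(e) err_note((e), __FILE__, __LINE__)

/* Stand-in for a probe routine that fails when an option is missing. */
static int demo_probe(int option_enabled)
{
	if (!option_enabled)
		return ERR(ENODEV);
	return 0;
}
```

Code that merely initializes a return variable with an error value would
still go through the macro, which is the false positive mentioned above.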

Rob

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-21 15:05 ` David Howells
                     ` (4 preceding siblings ...)
  2016-07-28  7:12   ` David Howells
@ 2016-08-02 10:48   ` Jani Nikula
  2016-08-04 11:31     ` David Woodhouse
  5 siblings, 1 reply; 82+ messages in thread
From: Jani Nikula @ 2016-08-02 10:48 UTC (permalink / raw)
  To: David Howells, Eric W. Biederman; +Cc: ksummit-discuss

On Thu, 21 Jul 2016, David Howells <dhowells@redhat.com> wrote:
>  (3) Let's use bool a lot more for boolean values as the compiler might be
>      able to make better choices with it.

This would be particularly useful for boolean one-bit struct bitfield
flags (not least because assigning any positive even number to unsigned
int foo:1 will result in 0) *but* we've found gcc produces worse code
for bool:1 in our case. Details at [1].
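
To illustrate the truncation, a minimal userspace sketch (C99 or later;
struct flags and set_flags() are made-up names for this example):

```c
#include <stdbool.h>

/* Hypothetical flags struct, only for illustration. */
struct flags {
	unsigned int u_flag:1;	/* keeps only the low bit of the assigned value */
	bool b_flag:1;		/* any non-zero value converts to true first */
};

/* Store the same value into both kinds of one-bit field. */
static struct flags set_flags(unsigned int value)
{
	struct flags f;

	f.u_flag = value;	/* value 2 -> 0, value 3 -> 1 */
	f.b_flag = value;	/* value 2 -> 1, value 3 -> 1 */
	return f;
}
```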

BR,
Jani.


[1] http://patchwork.freedesktop.org/patch/msgid/1463148278-23193-1-git-send-email-jani.nikula@intel.com


-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22  6:14     ` Julia Lawall
  2016-07-22 13:57       ` Hannes Reinecke
@ 2016-08-04  7:15       ` NeilBrown
  2016-08-04 11:19         ` Julia Lawall
  1 sibling, 1 reply; 82+ messages in thread
From: NeilBrown @ 2016-08-04  7:15 UTC (permalink / raw)
  To: Julia Lawall, Hannes Reinecke; +Cc: ksummit-discuss

On Fri, Jul 22 2016, Julia Lawall wrote:
>
> In C, enums are ints.  Is there a Gcc option that checks them?  For
> example, the following program compiles fine with -Wall:
>
> enum one {ONE=1, TWO=2, THREE=3};
> enum two {ONEX=7, TWOX=8, THREEX=9};
>
> int f (int x) {
>   enum one o = ONE;
>   enum two t = THREEX;
>   if (x) o = t; else t = o;
>   return 0;
> }

However with this slight change (and line-numbers added for clarity)

     1	
     2	enum one {ONE=1, TWO=2, THREE=3} __attribute((bitwise));
     3	enum two {ONEX=7, TWOX=8, THREEX=9} __attribute((bitwise));
     4	
     5	int f (int x) {
     6	  enum one o = ONE;
     7	  enum two t = THREEX;
     8	  if (x) o = t; else t = o;
     9	  return 0;
    10	}


sparse complains:

/tmp/test.c:8:14: warning: mixing different enum types
/tmp/test.c:8:14:     int enum two  versus
/tmp/test.c:8:14:     int enum one 
/tmp/test.c:8:26: warning: mixing different enum types
/tmp/test.c:8:26:     int enum one  versus
/tmp/test.c:8:26:     int enum two 

Is that what you were hoping for?

NeilBrown

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-04  7:15       ` NeilBrown
@ 2016-08-04 11:19         ` Julia Lawall
  0 siblings, 0 replies; 82+ messages in thread
From: Julia Lawall @ 2016-08-04 11:19 UTC (permalink / raw)
  To: NeilBrown; +Cc: ksummit-discuss



On Thu, 4 Aug 2016, NeilBrown wrote:

> On Fri, Jul 22 2016, Julia Lawall wrote:
> >
> > In C, enums are ints.  Is there a Gcc option that checks them?  For
> > example, the following program compiles fine with -Wall:
> >
> > enum one {ONE=1, TWO=2, THREE=3};
> > enum two {ONEX=7, TWOX=8, THREEX=9};
> >
> > int f (int x) {
> >   enum one o = ONE;
> >   enum two t = THREEX;
> >   if (x) o = t; else t = o;
> >   return 0;
> > }
>
> However with this slight change (and line-numbers added for clarity)
>
>      1
>      2	enum one {ONE=1, TWO=2, THREE=3} __attribute((bitwise));
>      3	enum two {ONEX=7, TWOX=8, THREEX=9} __attribute((bitwise));
>      4
>      5	int f (int x) {
>      6	  enum one o = ONE;
>      7	  enum two t = THREEX;
>      8	  if (x) o = t; else t = o;
>      9	  return 0;
>     10	}
>
>
> sparse complains:
>
> /tmp/test.c:8:14: warning: mixing different enum types
> /tmp/test.c:8:14:     int enum two  versus
> /tmp/test.c:8:14:     int enum one
> /tmp/test.c:8:26: warning: mixing different enum types
> /tmp/test.c:8:26:     int enum one  versus
> /tmp/test.c:8:26:     int enum two
>
> Is that what you were hoping for?

Yes, perfect, thanks.

julia

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-02 10:48   ` Jani Nikula
@ 2016-08-04 11:31     ` David Woodhouse
  2016-08-04 12:07       ` Jani Nikula
  0 siblings, 1 reply; 82+ messages in thread
From: David Woodhouse @ 2016-08-04 11:31 UTC (permalink / raw)
  To: Jani Nikula, David Howells, Eric W. Biederman; +Cc: ksummit-discuss

On Tue, 2016-08-02 at 13:48 +0300, Jani Nikula wrote:
> On Thu, 21 Jul 2016, David Howells <dhowells@redhat.com> wrote:
> >  (3) Let's use bool a lot more for boolean values as the compiler might be
> >      able to make better choices with it.
> 
> This would be particularly useful for boolean one-bit struct bitfield
> flags (not least because assigning any positive even number to unsigned
> int foo:1 will result in 0) *but* we've found gcc produces worse code
> for bool:1 in our case. Details at [1].
> 
> BR,
> Jani.
> 
> 
> [1] http://patchwork.freedesktop.org/patch/msgid/1463148278-23193-1-git-send-email-jani.nikula@intel.com

In that entire discussion I don't see any mention of a GCC PR being
filed.

Why?

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-04 11:31     ` David Woodhouse
@ 2016-08-04 12:07       ` Jani Nikula
  0 siblings, 0 replies; 82+ messages in thread
From: Jani Nikula @ 2016-08-04 12:07 UTC (permalink / raw)
  To: David Woodhouse, David Howells, Eric W. Biederman
  Cc: Dave Gordon, ksummit-discuss

On Thu, 04 Aug 2016, David Woodhouse <dwmw2@infradead.org> wrote:
> On Tue, 2016-08-02 at 13:48 +0300, Jani Nikula wrote:
>> On Thu, 21 Jul 2016, David Howells <dhowells@redhat.com> wrote:
>> >  (3) Let's use bool a lot more for boolean values as the compiler might be
>> >      able to make better choices with it.
>> 
>> This would be particularly useful for boolean one-bit struct bitfield
>> flags (not least because assigning any positive even number to unsigned
>> int foo:1 will result in 0) *but* we've found gcc produces worse code
>> for bool:1 in our case. Details at [1].
>> 
>> BR,
>> Jani.
>> 
>> 
>> [1] http://patchwork.freedesktop.org/patch/msgid/1463148278-23193-1-git-send-email-jani.nikula@intel.com
>
> In that entire discussion I don't see any mention of a GCC PR being
> filed.
>
> Why?

I guess something along these lines: Everybody was sure that Somebody
would do it. Anybody could have done it, but Nobody did it. Somebody got
angry about that, because it was Everybody's job. Everybody thought
Anybody could do it, but Nobody realized that Everybody wouldn't do
it. It ended up that Everybody blamed Somebody when Nobody did what
Anybody could have.

Dave (Gordon, Cc'd), you seemed to have the best grasp of what was going
on, would you mind filing that GCC bug please? https://gcc.gnu.org/bugs/


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-22 13:57       ` Hannes Reinecke
                           ` (2 preceding siblings ...)
  2016-07-26 11:48         ` David Woodhouse
@ 2016-08-11 15:44         ` Dan Carpenter
  2016-08-12  0:38           ` NeilBrown
  2016-08-12  3:51           ` Matthew Wilcox
  3 siblings, 2 replies; 82+ messages in thread
From: Dan Carpenter @ 2016-08-11 15:44 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Fri, Jul 22, 2016 at 03:57:40PM +0200, Hannes Reinecke wrote:
> > 
> > I guess that almost all functions return only a few possible error codes?
> 
> Precisely. If we had a way of specifying "the return value is an errno
> with the possible values '0', '-EIO', and '-EINVAL'" that would be
> _so_ cool.

I think that's a bad idea.  We should be propagating errors from the
functions we call.  It should be able to change without breaking.

Smatch does a pretty good job of tracking return values, especially
if you rebuild the database over and over once a day like I do.

In some places I hack the database manually.  For legacy reasons there
are a couple of places where that happens, but the main way is through
this file.  The format is function, space, old value, space, new value:

http://repo.or.cz/smatch.git/blob/HEAD:/smatch_data/db/kernel.return_fixes

One thing that causes problems for Smatch is recursion.  We don't know
what the function returns the first time it's called so we record that
it could return anything.  Then the second time we "know" that it can
return anything.  So the unknown propagates recursively. Another thing
that causes problems is when we copy a return value from another thread
or a work queue.  There are a bunch of places like that where the
programmer knows the return value is negative but it's hard for static
analysis.

I need these manual fixes when not knowing the error code causes
problems because a function does this:
	if (ret)
		return ret;
But the caller does:
	if (ret < 0)
		return ret;
There is a mismatch because Smatch thinks any non-zero is an error but
the caller knows only negatives are errors.
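
A small sketch of the two conventions side by side.  The functions
helper(), middle() and caller() are hypothetical; the point is only that
"if (ret)" and "if (ret < 0)" disagree about positive values.

```c
#include <errno.h>

/* Hypothetical helper: negative errno on failure, otherwise a
 * non-negative count of items processed. */
static int helper(int n)
{
	if (n < 0)
		return -EINVAL;
	return n;
}

/* Treats any non-zero return as an error and passes it up. */
static int middle(int n)
{
	int ret = helper(n);

	if (ret)
		return ret;	/* also passes positive counts up */
	return 0;
}

/* Treats only negative returns as errors. */
static int caller(int n)
{
	int ret = middle(n);

	if (ret < 0)
		return ret;
	return 0;	/* a positive ret from middle() ends up here */
}
```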

The other reason for the file is that we want to record that the
scnprintf() return value is less than the size parameter.  Ideally, we
could record that strnlen_user() returns "<= count + 1", but Smatch is
not flexible enough to do that yet.  These upper bounds are needed to
prevent integer overflow and buffer overflow warnings.

One thing I get annoyed about is when functions return positive values
but it's not documented what it means.  For example, ocfs2_plock()
returns negatives, zero or FILE_LOCK_DEFERRED.  Possibly this is a bug.
How should I know?  Also em_sti_clock_event_next() should return zero or
-ETIME but it returns zero or one so I think it's buggy.

One last thing is that it's sometimes impossible to tell when we return
zero unintentionally vs intentionally.  I'm talking about code like
this:
	if (frob_whatever())
		goto out;

It's a missing error code bug 75% of the time and intentional the other
25% of the time.  I feel like this should _always_ have a comment next
to it, just like we _always_ comment /* fall through */ in switch
statements to note the missing break.
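
A sketch of the three variants, with a made-up frob_whatever() and
made-up setup_*() functions, to show how similar the bug and the
intentional case look without a comment:

```c
#include <errno.h>

/* Hypothetical stand-in for a call that returns non-zero on failure. */
static int frob_whatever(int arg)
{
	return arg < 0;
}

/* Likely bug: ret is still 0 when we jump to out. */
static int setup_buggy(int arg)
{
	int ret = 0;

	if (frob_whatever(arg))
		goto out;
	/* ... more setup ... */
out:
	return ret;
}

/* Error code set explicitly before the jump. */
static int setup_fixed(int arg)
{
	int ret = 0;

	if (frob_whatever(arg)) {
		ret = -EINVAL;
		goto out;
	}
out:
	return ret;
}

/* Intentional: the condition is not an error, and the comment says so. */
static int setup_optional(int arg)
{
	int ret = 0;

	if (frob_whatever(arg))
		goto out;	/* not an error: nothing to do, return 0 */
out:
	return ret;
}
```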

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-11 15:44         ` Dan Carpenter
@ 2016-08-12  0:38           ` NeilBrown
  2016-08-12 20:56             ` Dan Carpenter
  2016-08-12  3:51           ` Matthew Wilcox
  1 sibling, 1 reply; 82+ messages in thread
From: NeilBrown @ 2016-08-12  0:38 UTC (permalink / raw)
  To: Dan Carpenter, Hannes Reinecke; +Cc: ksummit-discuss

On Fri, Aug 12 2016, Dan Carpenter wrote:

> On Fri, Jul 22, 2016 at 03:57:40PM +0200, Hannes Reinecke wrote:
>> > 
>> > I guess that almost all functions return only a few possible error codes?
>> 
>> Precisely. If we had a way of specifying "the return value is an errno
>> with the possible values '0', '-EIO', and '-EINVAL'" that would be
>> _so_ cool.
>
> I think that's a bad idea.  We should be propagating errors from the
> functions we call.  It should be able to change without breaking.

Should we?  I recently faced a bug caused by a (proposed) change to
btrfs which returned a different error code to the ->fh_to_dentry()
function. That was being propagated up to nfsd, and out on the wire in
the NFSv4 protocol.
Only the new error was invalid for the protocol and the client
(correctly) reported it to user-space rather than handling it
internally.

This happened because not enough thought/documentation had been given to
which error codes were sensibly meaningful.  I changed
exportfs_decode_fh() so that any error other than ENOMEM became ESTALE,
because that is all nfsd can sensibly handle.
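
A minimal sketch of that kind of mapping.  normalize_decode_error() is a
hypothetical name used for illustration; this is not the actual
exportfs_decode_fh() change.

```c
#include <errno.h>

/* Hypothetical helper: collapse the errors a lower layer may return
 * into the small set the caller can sensibly handle. */
static int normalize_decode_error(int err)
{
	if (err >= 0)
		return err;	/* success passes through */
	if (err == -ENOMEM)
		return err;	/* kept as-is */
	return -ESTALE;		/* everything else becomes ESTALE */
}
```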

I'm sure there are some (many!) paths where error codes should be
propagated transparently, but I don't think we should assume that is
always true.

>
> Smatch does a pretty good job of tracking return values, especially
> if you rebuild the database over and over once a day like I do.

So it is OK to keep a list of valid return values in a database, but not
OK to keep them in the code as documentation, and to alert the
programmer when they make a change so they can declare (and maybe even
document) if it was an intentional change?
Maybe I'm just misunderstanding your point of view.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-01 22:17                   ` Rob Herring
@ 2016-08-12  1:29                     ` Stephen Boyd
  0 siblings, 0 replies; 82+ messages in thread
From: Stephen Boyd @ 2016-08-12  1:29 UTC (permalink / raw)
  To: Rob Herring, Bird, Timothy; +Cc: James Bottomley, ksummit-discuss

On 08/01/2016 03:17 PM, Rob Herring wrote:
> On Wed, Jul 27, 2016 at 12:21 PM, Bird, Timothy <Tim.Bird@am.sony.com> wrote:
>>
>> It might be useful to have a list of possible errno generation points, for a particular routine,
>> to make it easier to find the origin of a problem.  Sometimes when you're unfamiliar with
>> some bit of code, manually walking back through the call chain in the source is a hassle.
>> I'm reminded of that trick where someone (I don't recall who) embedded the line number
>> in the errno.
> I'd be interested if you could find what you are referring to. 

Hugh Dickins:

#undef EINVAL
#define EINVAL __LINE__

https://lwn.net/Articles/614446/

Although it may not work well for error pointers as Linus stated in the
follow up mail.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-11 15:44         ` Dan Carpenter
  2016-08-12  0:38           ` NeilBrown
@ 2016-08-12  3:51           ` Matthew Wilcox
  2016-08-12  4:01             ` Josh Triplett
  1 sibling, 1 reply; 82+ messages in thread
From: Matthew Wilcox @ 2016-08-12  3:51 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: ksummit-discuss

On Aug 11, 2016 8:44 AM, "Dan Carpenter" <dan.carpenter@oracle.com> wrote:
> I need these manual fixes when not knowing the error code causes
> problems because a function does this:
>         if (ret)
>                 return ret;
> But the caller does:
>         if (ret < 0)
>                 return ret;
> There is a mismatch because Smatch thinks any non-zero is an error but
> the caller knows only negatives are errors.

Can we introduce types for this? We have a number of different return type
conventions in the kernel:

bool
errno_t (-4095 to 0 are valid)
count_t (-4095 to INT_MAX)
long_count_t (-4095 to LONG_MAX)
ulong_count_t (-4095 to -4096)
struct foo _err*

I think this is good programmer documentation in addition to being
potentially useful to smatch.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  3:51           ` Matthew Wilcox
@ 2016-08-12  4:01             ` Josh Triplett
  2016-08-12  4:07               ` Matthew Wilcox
  0 siblings, 1 reply; 82+ messages in thread
From: Josh Triplett @ 2016-08-12  4:01 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: ksummit-discuss, Dan Carpenter

On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> On Aug 11, 2016 8:44 AM, "Dan Carpenter" <dan.carpenter@oracle.com> wrote:
> > I need these manual fixes when not knowing the error code causes
> > problems because a function does this:
> >         if (ret)
> >                 return ret;
> > But the caller does:
> >         if (ret < 0)
> >                 return ret;
> > There is a mismatch because Smatch thinks any non-zero is an error but
> > the caller knows only negatives are errors.
> 
> Can we introduce types for this? We have a number of different return type
> conventions in the kernel:
> 
> bool
> errno_t (-4095 to 0 are valid)
> count_t (-4095 to INT_MAX)
> long_count_t (-4095 to LONG_MAX)
> ulong_count_t (-4095 to -4096)
> struct foo _err*
> 
> I think this is good programmer documentation in addition to being
> potentially useful to smatch.

I'd love to see an explicit type distinct from "int" for "potentially an
errno".  And if any code uses "potentially an errno *or* a non-errno
non-zero return value", that should ideally use a distinct type as well.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  4:01             ` Josh Triplett
@ 2016-08-12  4:07               ` Matthew Wilcox
  2016-08-12  5:29                 ` Alexey Dobriyan
  0 siblings, 1 reply; 82+ messages in thread
From: Matthew Wilcox @ 2016-08-12  4:07 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Dan Carpenter, ksummit-discuss

On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > Can we introduce types for this? We have a number of different return type
> > conventions in the kernel:
> >
> > bool
> > errno_t (-4095 to 0 are valid)
> > count_t (-4095 to INT_MAX)
> > long_count_t (-4095 to LONG_MAX)
> > ulong_count_t (-4095 to -4096)
> > struct foo _err*
> >
> > I think this is good programmer documentation in addition to being
> > potentially useful to smatch.
>
> I'd love to see an explicit type distinct from "int" for "potentially an
> errno".  And if any code uses "potentially an errno *or* a non-errno
> non-zero return value", that should ideally use a distinct type as well.

I think the biggest problem is coming up with good names for the types. And
the churn of introducing them, particularly converting function pointers
and all occurrences.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
                   ` (4 preceding siblings ...)
  2016-07-22 11:19 ` David Howells
@ 2016-08-12  4:42 ` Michael S. Tsirkin
       [not found]   ` <871t1ulfvz.fsf@notabene.neil.brown.name>
  2016-08-12  6:23   ` NeilBrown
  5 siblings, 2 replies; 82+ messages in thread
From: Michael S. Tsirkin @ 2016-08-12  4:42 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> I would really like to get a feel among kernel maintainers and
> developers if this is something that is interesting, and what kind of
> constraints they think something like this would need to be usable for
> the kernel?
> 
> Eric

Surprised that no one mentioned this yet - I think tagging
integers/structs as coming from userspace could be useful,
if we can teach e.g. smatch that access to a kernel
pointer through this offset might fault.

-- 
MST

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  4:07               ` Matthew Wilcox
@ 2016-08-12  5:29                 ` Alexey Dobriyan
  2016-08-12  5:38                   ` Michael S. Tsirkin
  2016-08-12  5:50                   ` Matthew Wilcox
  0 siblings, 2 replies; 82+ messages in thread
From: Alexey Dobriyan @ 2016-08-12  5:29 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: ksummit-discuss, Dan Carpenter

On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > Can we introduce types for this? We have a number of different return type
> > > conventions in the kernel:
> > >
> > > bool
> > > errno_t (-4095 to 0 are valid)
> > > count_t (-4095 to INT_MAX)
> > > long_count_t (-4095 to LONG_MAX)
> > > ulong_count_t (-4095 to -4096)
> > > struct foo _err*
> > >
> > > I think this is good programmer documentation in addition to being
> > > potentially useful to smatch.
> >
> > I'd love to see an explicit type distinct from "int" for "potentially an
> > errno".  And if any code uses "potentially an errno *or* a non-errno
> > non-zero return value", that should ideally use a distinct type as well.
> 
> I think the biggest problem is coming up with good names for the types. And
> the churn of introducing them, particularly converting function pointers
> and all occurrences.

Names are the easy part (errno_t is perfect actually). The problem is
that once the error is cleared, the variable doesn't change to a regular
type:

	errno_t rv;

	rv = f();
	if (rv < 0)
		return rv;
	int rv = rv;

which again boils down to a language with a real type system.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
       [not found]   ` <871t1ulfvz.fsf@notabene.neil.brown.name>
@ 2016-08-12  5:34     ` Michael S. Tsirkin
  2016-08-12  6:23       ` NeilBrown
       [not found]       ` <87y442jytb.fsf@notabene.neil.brown.name>
  0 siblings, 2 replies; 82+ messages in thread
From: Michael S. Tsirkin @ 2016-08-12  5:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: ksummit-discuss

On Fri, Aug 12, 2016 at 03:23:28PM +1000, NeilBrown wrote:
> On Fri, Aug 12 2016, Michael S. Tsirkin wrote:
> 
> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> >> I would really like to get a feel among kernel maintainers and
> >> developers if this is something that is interesting, and what kind of
> >> constraints they think something like this would need to be usable for
> >> the kernel?
> >> 
> >> Eric
> >
> > Surprised that no one mentioned this yet - I think tagging
> > integers/structs as coming from userspace could be useful,
> > if we can teach e.g. smatch that access to a kernel
> > pointer through this offset might fault.
> 
> We already have that.
> Sparse recognizes
>     __attribute__((noderef, address_space(1)))
>  to mean "this is a pointer to a different address space which
>  cannot be dereferenced" and Linux has
> 
> # define __user                __attribute__((noderef, address_space(1)))
> 
> so if you mark a pointer as "__user", then sparse will complain
> if you dereference it.
> 
> We've had this for over a decade :-)
> 
>   https://lwn.net/Articles/87538/
> 
> NeilBrown


Of course, everyone uses these.  But what I mean is tagging index types:

	int data[256];

	int foo(u32 __user *ptr)
	{
		u32 i;
		if (get_user(i, ptr))
			return -EFAULT;

		data[i] = 0;
			^^^ security vulnerability

	}

Above, i is coming from userspace and so must always be range-checked
before it's used as an index.

Maybe we could change get_user to return a tagged result: __from_user int.
The code above would then warn, because a __from_user value cannot be
assigned to a plain int.

Then rework the code along the following lines:


	int data[256];

	int force_range(__unsafe u32 value, unsigned idx)
	{
		return ((__force int)value) % idx;
	}

	int foo(u32 __user *ptr)
	{
		__unsafe u32 i;
		int ichecked;
		if (get_user(i, ptr))
			return -EFAULT;

		ichecked = force_range(i, ARRAY_SIZE(data));
		data[ichecked] = 0;
			^^^ ok now

		return 0;
	}
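
As a rough illustration of the rework above, the tagging can be approximated
in plain C by wrapping the value in a one-member struct, so that the compiler
itself (and not only a checker) refuses to use it as an index. This is only a
sketch under assumptions: unsafe_u32 and fake_get_user() are hypothetical
names, fake_get_user() merely stands in for get_user() so the example builds
in userspace, and the limit is passed as an element count.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical wrapper: a value that came from userspace and has not been
 * range-checked yet.  A struct cannot be used as an array subscript, so
 * data[i] does not compile until the value is explicitly unwrapped.
 */
typedef struct { uint32_t raw; } unsafe_u32;

static int data[256];

/* Stand-in for get_user(); it always succeeds in this sketch. */
static int fake_get_user(unsafe_u32 *dst, const uint32_t *src)
{
	dst->raw = *src;
	return 0;
}

/* The only place that opens the wrapper: reduce the value into [0, limit). */
static size_t force_range(unsafe_u32 value, size_t limit)
{
	return value.raw % limit;
}

static int foo(const uint32_t *ptr)
{
	unsafe_u32 i;
	size_t ichecked;

	if (fake_get_user(&i, ptr))
		return -EFAULT;

	ichecked = force_range(i, ARRAY_SIZE(data));
	data[ichecked] = 0;
	return 0;
}
```

The struct costs nothing at run time on common ABIs, but unlike a sparse
attribute it has to be unwrapped by hand at every legitimate use.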


-- 
MST

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  5:29                 ` Alexey Dobriyan
@ 2016-08-12  5:38                   ` Michael S. Tsirkin
  2016-08-12  6:04                     ` Julia Lawall
  2016-08-12  5:50                   ` Matthew Wilcox
  1 sibling, 1 reply; 82+ messages in thread
From: Michael S. Tsirkin @ 2016-08-12  5:38 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: Dan Carpenter, ksummit-discuss

On Fri, Aug 12, 2016 at 08:29:20AM +0300, Alexey Dobriyan wrote:
> On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> > On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> > > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > > Can we introduce types for this? We have a number of different return
> > > > type conventions in the kernel:
> > > >
> > > > bool
> > > > errno_t (-4095 to 0 are valid)
> > > > count_t (-4095 to INT_MAX)
> > > > long_count_t (-4095 to LONG_MAX)
> > > > ulong_count_t (-4095 to -4096)
> > > > struct foo _err*
> > > >
> > > > I think this is good programmer documentation in addition to being
> > > > potentially useful to smatch.
> > >
> > > I'd love to see an explicit type distinct from "int" for "potentially an
> > > errno".  And if any code uses "potentially an errno *or* a non-errno
> > > non-zero return value", that should ideally use a distinct type as well.
> > 
> > I think the biggest problem is coming up with good names for the types. And
> > the churn of introducing them, particularly converting function pointers
> > and all occurrences.
> 
> Names are easy part (errno_t is perfect actually). The problem is that
> once error is cleared, variable doesn't change to regular type anymore:
> 
> 	errno_t rv;
> 
> 	rv = f();
> 	if (rv < 0)
> 		return rv;
> 	int rv = rv;
> 
> which again boils down to a language with a real type system.

We could maybe do

 	errno_t rv;
 
 	rv = f();
 	if (IS_ERR(rv))
 		return rv;
 	int r = CHECKED(rv);


Tools could maybe verify that all paths to CHECKED
are actually going through an IS_ERR test as well.
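
A minimal sketch of how that could look in plain C, assuming a struct-wrapped
type so that the value cannot be used as an int by accident. All names here
are hypothetical: errno_val_t is used instead of errno_t to avoid clashing
with the errno_t some C libraries provide, and IS_ERRNO() is just one of the
macro names that could be chosen. Nothing in the sketch enforces that
CHECKED() is only reached after the test; that part would still be a job for
a tool.

```c
#include <errno.h>

#define MAX_ERRNO 4095

/* Hypothetical: an int that may hold a negative errno value. */
typedef struct { int v; } errno_val_t;

/* True if the wrapped value lies in the errno range [-MAX_ERRNO, -1]. */
static inline int IS_ERRNO(errno_val_t e)
{
	return e.v < 0 && e.v >= -MAX_ERRNO;
}

/* Unwrap the value; the caller is expected to have tested IS_ERRNO(). */
static inline int CHECKED(errno_val_t e)
{
	return e.v;
}

/* Example callee: fails with -EIO on request, otherwise returns 3. */
static errno_val_t f(int fail)
{
	errno_val_t r = { fail ? -EIO : 3 };
	return r;
}

/* Example caller following the pattern from the message above. */
static errno_val_t caller(int fail, int *out)
{
	errno_val_t rv = f(fail);

	if (IS_ERRNO(rv))
		return rv;
	int r = CHECKED(rv);

	*out = r * 2;
	rv.v = 0;
	return rv;
}
```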



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  5:29                 ` Alexey Dobriyan
  2016-08-12  5:38                   ` Michael S. Tsirkin
@ 2016-08-12  5:50                   ` Matthew Wilcox
  1 sibling, 0 replies; 82+ messages in thread
From: Matthew Wilcox @ 2016-08-12  5:50 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: ksummit-discuss, Dan Carpenter

On Fri, Aug 12, 2016 at 1:29 AM, Alexey Dobriyan <adobriyan@gmail.com>
wrote:

> On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> > > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > > Can we introduce types for this? We have a number of different return
> > > > type conventions in the kernel:
> > > >
> > > > bool
> > > > errno_t (-4095 to 0 are valid)
> > > > count_t (-4095 to INT_MAX)
> > > > long_count_t (-4095 to LONG_MAX)
> > > > ulong_count_t (-4095 to -4096)
> > > > struct foo _err*
> >
> > I think the biggest problem is coming up with good names for the types. And
> > the churn of introducing them, particularly converting function pointers
> > and all occurrences.
>
> Names are easy part (errno_t is perfect actually).


I agree that errno_t is perfect, but it's not really part of a nice family
-- count_t, long_count_t and ulong_count_t are all pretty crappy.  They
don't scream out I MIGHT CONTAIN AN ERRNO, BETTER CHECK ME! the way I would
like.  count_might_be_errno_t is not exactly euphonious.  err_count_t,
perhaps?

Of course, it might not be a count ... at least at one point the NVMe
driver had variables which either contained [-4095 to -1] (Linux errno,
usually -ENOMEM), 0 (success) or 1-65535 (NVMe status code indicating some
device failure).  I think those variables are all gone now, but that's not
an unreasonable thing to want and calling that beast a ushort_count_t would
be untrue.  I suspect for that kind of thing, the driver should create its
own type (or if we did something similar in SCSI, the subsystem would
create a scsi_status_t that was pretty much private to the SCSI subsystem).
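
To make the three value ranges concrete, here is a sketch of what such a
driver-private type might look like. It is an illustration only: dev_result_t,
enum result_kind and dev_result_kind() are hypothetical names, not anything
that exists in the NVMe or SCSI code.

```c
/* Hypothetical classification of a combined errno / device status value. */
enum result_kind {
	RESULT_OK,		/* 0: success */
	RESULT_ERRNO,		/* -4095..-1: Linux errno */
	RESULT_DEV_STATUS,	/* 1..65535: device status code */
	RESULT_INVALID		/* anything else */
};

/* Hypothetical driver-private wrapper around the raw value. */
typedef struct { int v; } dev_result_t;

static enum result_kind dev_result_kind(dev_result_t r)
{
	if (r.v == 0)
		return RESULT_OK;
	if (r.v < 0 && r.v >= -4095)
		return RESULT_ERRNO;
	if (r.v > 0 && r.v <= 65535)
		return RESULT_DEV_STATUS;
	return RESULT_INVALID;
}
```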

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  5:38                   ` Michael S. Tsirkin
@ 2016-08-12  6:04                     ` Julia Lawall
  2016-08-12  6:09                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 82+ messages in thread
From: Julia Lawall @ 2016-08-12  6:04 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: ksummit-discuss, Dan Carpenter



On Fri, 12 Aug 2016, Michael S. Tsirkin wrote:

> On Fri, Aug 12, 2016 at 08:29:20AM +0300, Alexey Dobriyan wrote:
> > On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> > > On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> > > > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > > > Can we introduce types for this? We have a number of different return
> > > > > type conventions in the kernel:
> > > > >
> > > > > bool
> > > > > errno_t (-4095 to 0 are valid)
> > > > > count_t (-4095 to INT_MAX)
> > > > > long_count_t (-4095 to LONG_MAX)
> > > > > ulong_count_t (-4095 to -4096)
> > > > > struct foo _err*
> > > > >
> > > > > I think this is good programmer documentation in addition to being
> > > > > potentially useful to smatch.
> > > >
> > > > I'd love to see an explicit type distinct from "int" for "potentially an
> > > > errno".  And if any code uses "potentially an errno *or* a non-errno
> > > > non-zero return value", that should ideally use a distinct type as well.
> > >
> > > I think the biggest problem is coming up with good names for the types. And
> > > the churn of introducing them, particularly converting function pointers
> > > and all occurrences.
> >
> > Names are easy part (errno_t is perfect actually). The problem is that
> > once error is cleared, variable doesn't change to regular type anymore:
> >
> > 	errno_t rv;
> >
> > 	rv = f();
> > 	if (rv < 0)
> > 		return rv;
> > 	int rv = rv;
> >
> > which again boils down to a language with a real type system.
>
> We could maybe do
>
>  	errno_t rv;
>
>  	rv = f();
>  	if (IS_ERR(rv))
>  		return rv;
>  	int r = CHECKED(rv);
>
>
> Tools could maybe verify that all paths to CHECKED
> are actually going through an IS_ERR test as well.

What does the return value of f look like?
Was it intentional to use IS_ERR, which checks for a pointer?  At least in
that case gcc would ensure that CHECKED is used.

julia

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  6:04                     ` Julia Lawall
@ 2016-08-12  6:09                       ` Michael S. Tsirkin
  2016-08-12  6:23                         ` Matthew Wilcox
  2016-08-12  6:37                         ` Julia Lawall
  0 siblings, 2 replies; 82+ messages in thread
From: Michael S. Tsirkin @ 2016-08-12  6:09 UTC (permalink / raw)
  To: Julia Lawall; +Cc: ksummit-discuss, Dan Carpenter

On Fri, Aug 12, 2016 at 08:04:09AM +0200, Julia Lawall wrote:
> 
> 
> On Fri, 12 Aug 2016, Michael S. Tsirkin wrote:
> 
> > On Fri, Aug 12, 2016 at 08:29:20AM +0300, Alexey Dobriyan wrote:
> > > On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> > > > On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> > > > > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > > > > Can we introduce types for this? We have a number of different return
> > > > > > type conventions in the kernel:
> > > > > >
> > > > > > bool
> > > > > > errno_t (-4095 to 0 are valid)
> > > > > > count_t (-4095 to INT_MAX)
> > > > > > long_count_t (-4095 to LONG_MAX)
> > > > > > ulong_count_t (-4095 to -4096)
> > > > > > struct foo _err*
> > > > > >
> > > > > > I think this is good programmer documentation in addition to being
> > > > > > potentially useful to smatch.
> > > > >
> > > > > I'd love to see an explicit type distinct from "int" for "potentially an
> > > > > errno".  And if any code uses "potentially an errno *or* a non-errno
> > > > > non-zero return value", that should ideally use a distinct type as well.
> > > >
> > > > I think the biggest problem is coming up with good names for the types. And
> > > > the churn of introducing them, particularly converting function pointers
> > > > and all occurrences.
> > >
> > > Names are easy part (errno_t is perfect actually). The problem is that
> > > once error is cleared, variable doesn't change to regular type anymore:
> > >
> > > 	errno_t rv;
> > >
> > > 	rv = f();
> > > 	if (rv < 0)
> > > 		return rv;
> > > 	int rv = rv;
> > >
> > > which again boils down to a language with a real type system.
> >
> > We could maybe do
> >
> >  	errno_t rv;
> >
> >  	rv = f();
> >  	if (IS_ERR(rv))
> >  		return rv;
> >  	int r = CHECKED(rv);
> >
> >
> > Tools could maybe verify that all paths to CHECKED
> > are actually going through an IS_ERR test as well.
> 
> What does the return value of f look like?

errno_t f(void);


> Was it intentional to use IS_ERR, which checks for a pointer?

It seems like a nice name but yes, it's already used to test whether
a pointer encodes an error.

Either we find a macro trick to teach IS_ERR to handle
integers as well, or we use a different macro.
TEST_ERR?

>  At least in
> that case gcc would ensure that CHECKED is used.
> 
> julia


I guess
typedef void * errno_t;
is one way, maybe marking it noderef for good measure.

This means the values are 64 bit unfortunately, which is often
overkill.
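
For reference, the pointer encoding that a pointer-typed errno_t would lean
on can be sketched in userspace as follows. These are simplified
re-implementations written for illustration, modelled on the kernel helpers
of the same names rather than copied from them; the cast through intptr_t is
an assumption made to keep the sketch portable.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

The trick relies on the last 4095 addresses of the address space never being
valid object addresses, which is why NULL and ordinary pointers are not
mistaken for errors.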

-- 
MST

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  5:34     ` Michael S. Tsirkin
@ 2016-08-12  6:23       ` NeilBrown
       [not found]       ` <87y442jytb.fsf@notabene.neil.brown.name>
  1 sibling, 0 replies; 82+ messages in thread
From: NeilBrown @ 2016-08-12  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: ksummit-discuss

On Fri, Aug 12 2016, Michael S. Tsirkin wrote:

> On Fri, Aug 12, 2016 at 03:23:28PM +1000, NeilBrown wrote:
>> On Fri, Aug 12 2016, Michael S. Tsirkin wrote:
>> 
>> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> >> I would really like to get a feel among kernel maintainers and
>> >> developers if this is something that is interesting, and what kind of
>> >> constraints they think something like this would need to be usable for
>> >> the kernel?
>> >> 
>> >> Eric
>> >
>> > Surprised that no one mentioned this yet - I think tagging
>> > integers/structs as coming from userspace could be useful,
>> > if we can teach e.g. smatch that access to a kernel
>> > pointer through this offset might fault.
>> 
>> We already have that.
>> Sparse recognizes
>>     __attribute__((noderef, address_space(1)))
>>  to mean "this is a pointer to a different address space which
>>  cannot be dereferenced" and Linux has
>> 
>> # define __user                __attribute__((noderef, address_space(1)))
>> 
>> so if you mark a pointer as "__user", then sparse will complain
>> if you dereference it.
>> 
>> We've had this for over a decade :-)
>> 
>>   https://lwn.net/Articles/87538/
>> 
>> NeilBrown
>
>
> Of course, everyone uses these.  But what I mean is tagging index types:
>
> 	int data[256];
>
> 	int foo(u32 __user *ptr)
> 	{
> 		u32 i;
> 		if (get_user(i, ptr))
> 			return -EFAULT;
>
> 		data[i] = 0;
> 			^^^ security vulnerability
>
> 	}
>
> Above, i is coming from userspace and so must always be range-checked
> before it's used as an index.

Ahhh, I see.  Thanks for spelling it out for me.


>
> Maybe we could change get_user return a tagged result: __from_user int.
> And have above warn because __from_user can not be assigned to plain
> int.
>
> Then rework the code along the following lines:
>
>
> 	int data[256];
>
> 	int force_range(__unsafe u32 value, unsigned idx)
> 	{
> 		return ((__force int)value) % idx;
> 	}
>
> 	int foo(u32 __user *ptr)
> 	{
> 		__unsafe u32 i;
> 		int ichecked;
> 		if (get_user(i, ptr))
> 			return -EFAULT;
>
> 		ichecked = force_range(i, sizeof data);
> 		data[ichecked] = 0;
> 			^^^ ok now
>
> 	}

You could probably do this today using __attribute__((bitwise))

typedef int __attribute__((bitwise)) unsafe32;

Then use "unsafe32" wherever you have "__unsafe u32".

When you try

  data[i] = 0;

sparse says "warning: restricted int degrades to integer"

Of course, changing all the code would be a pain.
I guess you introduce "get_safe_user" then gradually transition code
over.
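
A sketch of what that might look like, assuming the usual pattern of defining
the annotations only when sparse is running (__CHECKER__ is the macro sparse
defines). The typedef uses unsigned int here, and check_index() and
clear_entry() are hypothetical helpers; with a normal compiler the
annotations expand to nothing, so the code builds unchanged.

```c
#ifdef __CHECKER__
# define __bitwise	__attribute__((bitwise))
# define __force	__attribute__((force))
#else
# define __bitwise
# define __force
#endif

/* Restricted type: sparse refuses to mix it with plain integers. */
typedef unsigned int __bitwise unsafe32;

static int data[256];

/* The one place where the restriction is deliberately cast away. */
static unsigned int check_index(unsafe32 i, unsigned int limit)
{
	return (__force unsigned int)i % limit;
}

static void clear_entry(unsafe32 i)
{
	/* Writing data[i] = 0; here is what sparse would warn about. */
	data[check_index(i, 256)] = 0;
}
```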

NeilBrown

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  4:42 ` Michael S. Tsirkin
       [not found]   ` <871t1ulfvz.fsf@notabene.neil.brown.name>
@ 2016-08-12  6:23   ` NeilBrown
  1 sibling, 0 replies; 82+ messages in thread
From: NeilBrown @ 2016-08-12  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin, Eric W. Biederman; +Cc: ksummit-discuss

On Fri, Aug 12 2016, Michael S. Tsirkin wrote:

> On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
>> I would really like to get a feel among kernel maintainers and
>> developers if this is something that is interesting, and what kind of
>> constraints they think something like this would need to be usable for
>> the kernel?
>> 
>> Eric
>
> Surprised that no one mentioned this yet - I think tagging
> integers/structs as coming from userspace could be useful,
> if we can teach e.g. smatch that access to a kernel
> pointer through this offset might fault.

We already have that.
Sparse recognizes
    __attribute__((noderef, address_space(1)))
 to mean "this is a pointer to a different address space which
 cannot be dereferenced" and Linux has

# define __user                __attribute__((noderef, address_space(1)))

so if you mark a pointer as "__user", then sparse will complain
if you dereference it.

We've had this for over a decade :-)

  https://lwn.net/Articles/87538/

NeilBrown

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  6:09                       ` Michael S. Tsirkin
@ 2016-08-12  6:23                         ` Matthew Wilcox
  2016-08-12  6:37                         ` Julia Lawall
  1 sibling, 0 replies; 82+ messages in thread
From: Matthew Wilcox @ 2016-08-12  6:23 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Dan Carpenter, ksummit-discuss

On Fri, Aug 12, 2016 at 2:09 AM, Michael S. Tsirkin <mst@redhat.com> wrote:

> I guess
> typedef void * errno_t;
> is one way, maybe marking it noderef for good measure.
>
> This means the values are 64 bit unfortunately, which is often
> overkill.
>

s/64-bit/native word size/

Does anyone know of an ABI where that's *inefficient*?!  Or at least, less
efficient than returning a 32-bit int?  Pretty much every ABI I ever saw
says something along the lines of 'register FOO contains the return value
if the type is small enough, otherwise for returning aggregates, register
FOO contains ...'

I can't believe anyone would deliberately design an ABI where calling
void *foo(void) is less efficient than calling int bar(void);

(and if you have some weird DSP or the CDC6600 in mind where Linux won't
run anyway, then comp.arch.necrophilia is --> that way)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  6:09                       ` Michael S. Tsirkin
  2016-08-12  6:23                         ` Matthew Wilcox
@ 2016-08-12  6:37                         ` Julia Lawall
  1 sibling, 0 replies; 82+ messages in thread
From: Julia Lawall @ 2016-08-12  6:37 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: ksummit-discuss, Dan Carpenter



On Fri, 12 Aug 2016, Michael S. Tsirkin wrote:

> On Fri, Aug 12, 2016 at 08:04:09AM +0200, Julia Lawall wrote:
> >
> >
> > On Fri, 12 Aug 2016, Michael S. Tsirkin wrote:
> >
> > > On Fri, Aug 12, 2016 at 08:29:20AM +0300, Alexey Dobriyan wrote:
> > > > On Fri, Aug 12, 2016 at 12:07:11AM -0400, Matthew Wilcox wrote:
> > > > > On Aug 11, 2016 9:02 PM, "Josh Triplett" <josh@joshtriplett.org> wrote:
> > > > > > On Thu, Aug 11, 2016 at 11:51:52PM -0400, Matthew Wilcox wrote:
> > > > > > > Can we introduce types for this? We have a number of different return
> > > > > > > type conventions in the kernel:
> > > > > > >
> > > > > > > bool
> > > > > > > errno_t (-4095 to 0 are valid)
> > > > > > > count_t (-4095 to INT_MAX)
> > > > > > > long_count_t (-4095 to LONG_MAX)
> > > > > > > ulong_count_t (-4095 to -4096)
> > > > > > > struct foo _err*
> > > > > > >
> > > > > > > I think this is good programmer documentation in addition to being
> > > > > > > potentially useful to smatch.
> > > > > >
> > > > > > I'd love to see an explicit type distinct from "int" for "potentially an
> > > > > > errno".  And if any code uses "potentially an errno *or* a non-errno
> > > > > > non-zero return value", that should ideally use a distinct type as well.
> > > > >
> > > > > I think the biggest problem is coming up with good names for the types. And
> > > > > the churn of introducing them, particularly converting function pointers
> > > > > and all occurrences.
> > > >
> > > > Names are easy part (errno_t is perfect actually). The problem is that
> > > > once error is cleared, variable doesn't change to regular type anymore:
> > > >
> > > > 	errno_t rv;
> > > >
> > > > 	rv = f();
> > > > 	if (rv < 0)
> > > > 		return rv;
> > > > 	int rv = rv;
> > > >
> > > > which again boils down to a language with a real type system.
> > >
> > > We could maybe do
> > >
> > >  	errno_t rv;
> > >
> > >  	rv = f();
> > >  	if (IS_ERR(rv))
> > >  		return rv;
> > >  	int r = CHECKED(rv);
> > >
> > >
> > > Tools could maybe verify that all paths to CHECKED
> > > are actually going through an IS_ERR test as well.
> >
> > What does the return value of f look like?
>
> errno_t f(void);
>
>
> > Was it intentional to use IS_ERR, which checks for a pointer?
>
> It seems like a nice name but yes, it's already used to convert
> pointers to integers.
>
> Either we find a macro trick to teach IS_ERR to handle
> integers as well, or we use a different macro.
> TEST_ERR?

IS_ERRNO?

julia

>
> >  At least in
> > that case gcc would ensure that CHECKED is used.
> >
> > julia
>
>
> I guess
> typedef void * errno_t;
> is one way, maybe marking it noderef for good measure.
>
> This means the values are 64 bit unfortunately, which is often
> overkill.
>
> --
> MST
>

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
  2016-08-12  0:38           ` NeilBrown
@ 2016-08-12 20:56             ` Dan Carpenter
  0 siblings, 0 replies; 82+ messages in thread
From: Dan Carpenter @ 2016-08-12 20:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: ksummit-discuss

On Fri, Aug 12, 2016 at 10:38:35AM +1000, NeilBrown wrote:
> On Fri, Aug 12 2016, Dan Carpenter wrote:
> 
> > On Fri, Jul 22, 2016 at 03:57:40PM +0200, Hannes Reinecke wrote:
> >> > 
> >> > I guess that almost all functions return only a few possible error codes?
> >> 
> >> Precisely. If we had a way of specifying "the return value is an errno
> >> with the possible values '0', '-EIO', and '-EINVAL'" that would be
> >> _so_ cool.
> >
> > I think that's a bad idea.  We should be propagating errors from the
> > functions we call.  It should be able to change without breaking.
> 
> Should we?  I recently faced a bug caused by a (proposed) change to
> btrfs which returned a different error code to the ->fh_to_dentry()
> function. That was being propagated up to nfsd, and out on the wire in
> the NFSv4 protocol.
> Only the new error was invalid for the protocol and the client
> (correctly) reported it to user-space rather than handling it
> internally.
> 
> This happened because not enough thought/documentation had been given to
> which error codes were sensibly meaningful.  I changed
> exportfs_decode_fh() so that any error other than ENOMEM became ESTALE,
> because that is all nfsd can sensibly handle.
> 
> I'm sure there are some (many!) paths where error codes should be
> propagated transparently, but I don't think we should assume that is
> always true.
> 

My guess is that 97% of the time it should be propagated.
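
For the remaining cases, such as the nfsd boundary described in the quoted
text, the narrowing can be done in one place. The following is only a sketch
of that idea, not the actual exportfs_decode_fh() change;
normalize_decode_error() is a hypothetical helper.

```c
#include <errno.h>

/*
 * Hypothetical boundary filter: let success and -ENOMEM through, and
 * collapse every other error into -ESTALE, the only other code the
 * caller is assumed to handle.
 */
static int normalize_decode_error(int err)
{
	if (err >= 0)
		return err;
	if (err == -ENOMEM)
		return err;
	return -ESTALE;
}
```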

> >
> > Smatch does a pretty good job of tracking return values, especially
> > if you rebuild the database over and over once a day like I do.
> 
> So it is OK to keep a list of valid return values in a database, but not
> OK to keep them in the code as documentation, and to alert the
> programmer when they make a change so they can declare (and maybe even
> document) if it was an intentional change?

What I'm saying is that no one wants a thing where if we change a
return code, we have to update the documentation for all the call
trees.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel
       [not found]       ` <87y442jytb.fsf@notabene.neil.brown.name>
@ 2016-08-15 23:26         ` Michael S. Tsirkin
  0 siblings, 0 replies; 82+ messages in thread
From: Michael S. Tsirkin @ 2016-08-15 23:26 UTC (permalink / raw)
  To: NeilBrown; +Cc: ksummit-discuss

On Fri, Aug 12, 2016 at 04:17:36PM +1000, NeilBrown wrote:
> On Fri, Aug 12 2016, Michael S. Tsirkin wrote:
> 
> > On Fri, Aug 12, 2016 at 03:23:28PM +1000, NeilBrown wrote:
> >> On Fri, Aug 12 2016, Michael S. Tsirkin wrote:
> >> 
> >> > On Tue, Jul 19, 2016 at 10:32:51AM -0500, Eric W. Biederman wrote:
> >> >> I would really like to get a feel among kernel maintainers and
> >> >> developers if this is something that is interesting, and what kind of
> >> >> constraints they think something like this would need to be usable for
> >> >> the kernel?
> >> >> 
> >> >> Eric
> >> >
> >> > Surprised that no one mentioned this yet - I think tagging
> >> > integers/structs as coming from userspace could be useful,
> >> > if we can teach e.g. smatch that access to a kernel
> >> > pointer through this offset might fault.
> >> 
> >> We already have that.
> >> Sparse recognizes
> >>     __attribute__((noderef, address_space(1)))
> >>  to mean "this is a pointer to a different address space which
> >>  cannot be dereferenced" and Linux has
> >> 
> >> # define __user                __attribute__((noderef, address_space(1)))
> >> 
> >> so if you mark a pointer as "__user", then sparse will complain
> >> if you dereference it.
> >> 
> >> We've had this for over a decade :-)
> >> 
> >>   https://lwn.net/Articles/87538/
> >> 
> >> NeilBrown
> >
> >
> > Of course, everyone uses these.  But what I mean is tagging index types:
> >
> > 	int data[256];
> >
> > 	int foo(u32 __user *ptr)
> > 	{
> > 		u32 i;
> > 		if (get_user(i, ptr))
> > 			return -EFAULT;
> >
> > 		data[i] = 0;
> > 			^^^ security vulnerability
> >
> > 	}
> >
> > Above, i is coming from userspace and so must always be range-checked
> > before it's used as an index.
> 
> Ahhh, I see.  Thanks for spelling it out for me.
> 
> 
> >
> > Maybe we could change get_user return a tagged result: __from_user int.
> > And have above warn because __from_user can not be assigned to plain
> > int.
> >
> > Then rework the code along the following lines:
> >
> >
> > 	int data[256];
> >
> > 	int force_range(__unsafe u32 value, unsigned idx)
> > 	{
> > 		return ((__force int)value) % idx;
> > 	}
> >
> > 	int foo(u32 __user *ptr)
> > 	{
> > 		__unsafe u32 i;
> > 		int ichecked;
> > 		if (get_user(i, ptr))
> > 			return -EFAULT;
> >
> > 		ichecked = force_range(i, sizeof data);
> > 		data[ichecked] = 0;
> > 			^^^ ok now
> >
> > 	}
> 
> You could probably do this today using __attribute__((bitwise))
> 
> typedef int __attribute__((bitwise)) unsafe32;
> 
> Then use "unsafe32" wherever you have "__unsafe u32".
> 
> When you try
> 
>   data[i] = 0;
> 
> sparse says "warning: restricted int degrades to integer"

Yes, but the inability to do math on bitwise integers is rather
annoying.

E.g. if the index is in bytes:

	int j = i / sizeof(*data);

will warn even though nothing bad is going on.


Maybe we could benefit from a separate
integer type, such that unsafe integers can interact
with safe (regular) integers, resulting in unsafe
integers.
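
One way to picture such a type in plain C is a wrapper whose arithmetic
helpers always return the wrapper again, with a single checked exit. This is
a sketch under assumptions: unsafe_ulong, unsafe_div(), unsafe_add() and
unsafe_check() are hypothetical names, and the second operand of each helper
is taken to be a trusted (regular) integer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical "unsafe" integer: arithmetic is allowed but stays tainted. */
typedef struct { unsigned long raw; } unsafe_ulong;

/* unsafe / safe -> unsafe.  The divisor is trusted and must be non-zero. */
static inline unsafe_ulong unsafe_div(unsafe_ulong a, unsigned long b)
{
	unsafe_ulong r = { a.raw / b };
	return r;
}

/* unsafe + safe -> unsafe. */
static inline unsafe_ulong unsafe_add(unsafe_ulong a, unsigned long b)
{
	unsafe_ulong r = { a.raw + b };
	return r;
}

/*
 * The only way out of the wrapper: store the value in *out and return 0
 * if it is below limit, otherwise return -ERANGE and leave *out alone.
 */
static inline int unsafe_check(unsafe_ulong v, size_t limit, size_t *out)
{
	if (v.raw >= limit)
		return -ERANGE;
	*out = v.raw;
	return 0;
}
```

With this shape, converting a byte offset into an element index is written
as unsafe_div(i, sizeof(*data)) and produces no complaint, while the result
still cannot be used as a subscript until unsafe_check() has passed.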

> 
> Of course, changing all the code would be a pain.
> I guess you introduce "get_safe_user" then gradually transition code
> over.
> 
> NeilBrown

^ permalink raw reply	[flat|nested] 82+ messages in thread

end of thread, other threads:[~2016-08-15 23:26 UTC | newest]

Thread overview: 82+ messages (download: mbox.gz / follow: Atom feed)
2016-07-19 15:32 [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel Eric W. Biederman
2016-07-19 17:31 ` Mark Brown
2016-07-19 18:52   ` Jiri Kosina
2016-07-19 20:39     ` Eric W. Biederman
2016-07-20 15:53     ` Mark Brown
2016-07-20 17:04       ` [Ksummit-discuss] [CORE TOPIC] [TECH TOPIC] Support (or move towards to) LLVM Jiri Kosina
2016-07-20 18:35         ` Alexey Dobriyan
2016-07-20 18:52           ` Mark Brown
2016-07-21  9:54         ` David Woodhouse
2016-07-21 13:41           ` Shuah Khan
2016-07-21 14:02             ` David Woodhouse
2016-07-21 16:21               ` Mark Brown
2016-07-23  3:28                 ` Behan Webster
2016-07-21 18:38           ` Jiri Kosina
2016-07-21 20:47             ` Paul Turner
2016-07-26 11:22             ` David Woodhouse
2016-07-19 21:08 ` [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel James Bottomley
2016-07-20  0:08   ` Eric W. Biederman
2016-07-20  7:32     ` Julia Lawall
2016-07-20 12:11     ` Jan Kara
2016-07-28  3:33       ` Steven Rostedt
2016-07-19 21:26 ` Josh Triplett
2016-07-20  2:36   ` Eric W. Biederman
2016-07-30 18:03   ` Eric W. Biederman
2016-07-30 18:49     ` Josh Triplett
2016-07-30 19:34       ` Eric W. Biederman
2016-07-30 20:56         ` Josh Triplett
2016-07-30 22:21           ` Eric W. Biederman
2016-07-21 15:05 ` David Howells
2016-07-21 23:33   ` Dmitry Torokhov
2016-07-22  6:00   ` Hannes Reinecke
2016-07-22  6:14     ` Julia Lawall
2016-07-22 13:57       ` Hannes Reinecke
2016-07-22 14:40         ` Julia Lawall
2016-07-22 19:12         ` Arnd Bergmann
2016-07-26 11:48         ` David Woodhouse
2016-07-26 12:53           ` Hannes Reinecke
2016-07-26 13:59             ` Alexey Dobriyan
2016-07-26 13:53           ` Alexey Dobriyan
2016-07-27 12:40           ` Julia Lawall
2016-07-27 13:25             ` James Bottomley
2016-07-27 13:33               ` David Woodhouse
2016-07-27 17:21                 ` Bird, Timothy
2016-08-01 22:17                   ` Rob Herring
2016-08-12  1:29                     ` Stephen Boyd
2016-08-11 15:44         ` Dan Carpenter
2016-08-12  0:38           ` NeilBrown
2016-08-12 20:56             ` Dan Carpenter
2016-08-12  3:51           ` Matthew Wilcox
2016-08-12  4:01             ` Josh Triplett
2016-08-12  4:07               ` Matthew Wilcox
2016-08-12  5:29                 ` Alexey Dobriyan
2016-08-12  5:38                   ` Michael S. Tsirkin
2016-08-12  6:04                     ` Julia Lawall
2016-08-12  6:09                       ` Michael S. Tsirkin
2016-08-12  6:23                         ` Matthew Wilcox
2016-08-12  6:37                         ` Julia Lawall
2016-08-12  5:50                   ` Matthew Wilcox
2016-08-04  7:15       ` NeilBrown
2016-08-04 11:19         ` Julia Lawall
2016-07-22  7:03   ` David Howells
2016-07-22 10:10     ` Alexey Dobriyan
2016-07-22 10:13     ` David Howells
2016-07-22 10:22       ` Alexey Dobriyan
2016-07-22 10:53         ` Vlastimil Babka
2016-07-22 11:05         ` David Howells
2016-07-22 17:18           ` Julia Lawall
2016-07-22 18:19     ` Dmitry Torokhov
2016-07-22 19:43       ` Guenter Roeck
2016-07-28  3:40   ` Steven Rostedt
2016-07-28  7:12   ` David Howells
2016-08-02 10:48   ` Jani Nikula
2016-08-04 11:31     ` David Woodhouse
2016-08-04 12:07       ` Jani Nikula
2016-07-22 11:19 ` David Howells
2016-07-22 12:44   ` Linus Walleij
2016-07-22 13:26   ` David Howells
2016-08-12  4:42 ` Michael S. Tsirkin
     [not found]   ` <871t1ulfvz.fsf@notabene.neil.brown.name>
2016-08-12  5:34     ` Michael S. Tsirkin
2016-08-12  6:23       ` NeilBrown
     [not found]       ` <87y442jytb.fsf@notabene.neil.brown.name>
2016-08-15 23:26         ` Michael S. Tsirkin
2016-08-12  6:23   ` NeilBrown
