Why is the bit size different between a syscall and its wrapper?

linux-kernel.vger.kernel.org archive mirror

* Why is the bit size different between a syscall and its wrapper?
@ 2021-03-12  2:48 Masahiro Yamada
  2021-03-12  3:17 ` Bhaskar Chowdhury
  2021-03-12  3:27 ` Willy Tarreau
  0 siblings, 2 replies; 4+ messages in thread
From: Masahiro Yamada @ 2021-03-12  2:48 UTC (permalink / raw)
  To: Linux Kernel Mailing List, linux-api

Hi.

I think I am missing something, but
is there any particular reason to
use a different bit size between
a syscall and its userspace wrapper?

For example, for the unshare syscall,

unshare(2) says the parameter is int.

SYNOPSIS
       #define _GNU_SOURCE
       #include <sched.h>

       int unshare(int flags);

In the kernel, it is unsigned long.

SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
{
        return ksys_unshare(unshare_flags);
}

I guess the upper 32 bits will be
zeroed out in the C library when
sizeof(int) != sizeof(unsigned long)
(i.e. on a 64-bit system), but I'd like to know
why we do it this way.
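
To make the width question concrete, here is a minimal sketch of the two
ways a 32-bit int can be widened to unsigned long. The helper names
widen_zero and widen_sign are hypothetical, and which of the two a given
libc and ABI actually produces is not established by this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* UINT_MAX, ULONG_MAX */

/* An int never needs more room than a register-sized unsigned long. */
static_assert(sizeof(int) <= sizeof(unsigned long),
              "int must fit in unsigned long");

/* Hypothetical helper: widen by zero extension (upper bits cleared). */
static unsigned long widen_zero(int flags)
{
    return (unsigned long)(unsigned int)flags;
}

/* Hypothetical helper: widen by sign extension (upper bits copy bit 31). */
static unsigned long widen_sign(int flags)
{
    return (unsigned long)(long)flags;
}
```

For non-negative flag values both helpers return the same number; they
differ only when the top bit of the int is set.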

-- 
Best Regards
Masahiro Yamada

* Re: Why is the bit size different between a syscall and its wrapper?
  2021-03-12  2:48 Why is the bit size different between a syscall and its wrapper? Masahiro Yamada
@ 2021-03-12  3:17 ` Bhaskar Chowdhury
  2021-03-12  3:27 ` Willy Tarreau
  1 sibling, 0 replies; 4+ messages in thread
From: Bhaskar Chowdhury @ 2021-03-12  3:17 UTC (permalink / raw)
  To: Masahiro Yamada; +Cc: Linux Kernel Mailing List, linux-api

On 11:48 Fri 12 Mar 2021, Masahiro Yamada wrote:
>Hi.
>
>I think I am missing something, but
>is there any particular reason to
>use a different bit size between
>a syscall and its userspace wrapper?
>
>
>
>For example, for the unshare syscall,
>
>unshare(2) says the parameter is int.
>
>
>SYNOPSIS
>       #define _GNU_SOURCE
>       #include <sched.h>
>
>       int unshare(int flags);
>
>
>
>
>In the kernel, it is unsigned long.
>
>
>SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
>{
>        return ksys_unshare(unshare_flags);
>}
>
>
>
>
>I guess the upper 32-bit will be
>zeroed out in the c library when
>sizeof(int) != sizeof(unsigned long)
>(i.e. 64-bit system), but I'd like to know
>why we do it this way.
>
>
Small nit, never mind, but it is eye-catching, Masahiro :) Are you typing this
on a narrow device that allows only this much line length? It's bloody
narrow, don't you think?

Sorry for the deviation.

~Bhaskar
>--
>Best Regards
>Masahiro Yamada

* Re: Why is the bit size different between a syscall and its wrapper?
  2021-03-12  2:48 Why is the bit size different between a syscall and its wrapper? Masahiro Yamada
  2021-03-12  3:17 ` Bhaskar Chowdhury
@ 2021-03-12  3:27 ` Willy Tarreau
  2021-03-14  5:10   ` Masahiro Yamada
  1 sibling, 1 reply; 4+ messages in thread
From: Willy Tarreau @ 2021-03-12  3:27 UTC (permalink / raw)
  To: Masahiro Yamada; +Cc: Linux Kernel Mailing List, linux-api

On Fri, Mar 12, 2021 at 11:48:11AM +0900, Masahiro Yamada wrote:
> Hi.
> 
> I think I am missing something, but
> is there any particular reason to
> use a different bit size between
> a syscall and its userspace wrapper?
> 
> 
> 
> For example, for the unshare syscall,
> 
> unshare(2) says the parameter is int.
> 
> 
> SYNOPSIS
>        #define _GNU_SOURCE
>        #include <sched.h>
> 
>        int unshare(int flags);
> 
> 
> 
> 
> In the kernel, it is unsigned long.
> 
> 
> SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
> {
>         return ksys_unshare(unshare_flags);
> }

The syscalls must have a well-defined interface for a given architecture.
Thus in practice the ABI will define that arg1 goes into this register,
arg2 into that one, etc., regardless of their type (plenty of them are
pointers, for example). A long is the size of a register, so it can carry
any of the types we care about. So by defining each syscall as a function
taking 1 to 6 fixed-size arguments you can implement nearly all syscalls.

Regarding the libc, it has to offer an interface which is compatible with
the standard definition of the syscalls as defined by POSIX or as commonly
found on other OSes, and this regardless of the platform.

For example, look at recv(): it takes an int, a pointer, a size_t and an
int. It needs to be defined like this for portability, but at the OS
level each of these will typically be passed in a register of its own.
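
As a rough userspace illustration of that split, the sketch below keeps a
typed, portable-looking signature on the outside and hands every argument
to the kernel as a long through syscall(2). It uses write rather than recv
only to keep the example short; the wrapper name raw_write is hypothetical,
and real libc wrappers are not necessarily written this way.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* make sure unistd.h declares syscall() */
#endif
#include <assert.h>       /* static_assert (C11) */
#include <stddef.h>       /* size_t */
#include <sys/syscall.h>  /* SYS_write */
#include <unistd.h>       /* syscall() */

/* Each argument kind must fit in one register-sized slot. */
static_assert(sizeof(int) <= sizeof(long), "int fits in long");
static_assert(sizeof(void *) <= sizeof(long), "pointer fits in long");
static_assert(sizeof(size_t) <= sizeof(long), "size_t fits in long");

/* Hypothetical wrapper: typed signature outside, uniform longs inside. */
static long raw_write(int fd, const void *buf, size_t count)
{
    return syscall(SYS_write, (long)fd, (long)buf, (long)count);
}
```

Calling raw_write(fd, buf, len) behaves like write(fd, buf, len): it
returns the number of bytes written, or -1 with errno set on failure.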

Hoping this helps,
Willy

* Re: Why is the bit size different between a syscall and its wrapper?
  2021-03-12  3:27 ` Willy Tarreau
@ 2021-03-14  5:10   ` Masahiro Yamada
  0 siblings, 0 replies; 4+ messages in thread
From: Masahiro Yamada @ 2021-03-14  5:10 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Linux Kernel Mailing List, linux-api

Willy,

Thanks for the explanation.

On Fri, Mar 12, 2021 at 12:27 PM Willy Tarreau <w@1wt.eu> wrote:
>
> On Fri, Mar 12, 2021 at 11:48:11AM +0900, Masahiro Yamada wrote:
> > Hi.
> >
> > I think I am missing something, but
> > is there any particular reason to
> > use a different bit size between
> > a syscall and its userspace wrapper?
> >
> >
> >
> > For example, for the unshare syscall,
> >
> > unshare(2) says the parameter is int.
> >
> >
> > SYNOPSIS
> >        #define _GNU_SOURCE
> >        #include <sched.h>
> >
> >        int unshare(int flags);
> >
> >
> >
> >
> > In the kernel, it is unsigned long.
> >
> >
> > SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
> > {
> >         return ksys_unshare(unshare_flags);
> > }
>
> The syscalls must have a well defined interface for a given architecture.
> Thus in practice the ABI will define that arg1 goes into this register,
> arg2 into this one etc, regardless of their type (plenty of them are
> pointers for example). The long is the size of a register so it can carry
> any of the types we care about. So by defining each syscall as a function
> taking 1 to 6 fixed-size arguments you can implement about all syscalls.
>
> Regarding the libc, it has to offer an interface which is compatible with
> the standard definition of the syscalls as defined by POSIX or as commonly
> found on other OSes, and this regardless of the platform.
>
> For example look at recv(), it takes an int, a pointer, a size_t and an
> int. It requires to be defined like this for portability, but at the OS
> level all these will typically be passed as a register each.
>

You are right.
Functions in POSIX such as 'recv' should be portable across OSes.
At the syscall ABI level, we have more freedom to choose
parameter types that are more convenient for the kernel.

IIUC, 'unshare' is Linux-specific, so
I think there are no "other OSes" to be compatible with.



Using types that have the same width as registers
avoids the ambiguity about the upper 32 bits
in 64-bit registers anyway. This is a benefit.

Historically, that ambiguity caused an issue:
https://nvd.nist.gov/vuln/detail/CVE-2009-0029

We have not needed to worry about this since
commit 1a94bc34768e463a93cb3751819709ab0ea80a01.
All parameters are properly sign-extended by
forcibly casting to (long).
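
One way to picture the effect is the userspace model below. It is only a
sketch of the general pattern, not a copy of the kernel macros: the names
entry_point and inner_body are hypothetical, and the narrowing cast is
assumed to behave as it does on GCC and Clang, where the upper bits are
simply dropped.

```c
#include <assert.h>   /* static_assert (C11) */

static_assert(sizeof(int) <= sizeof(long), "int must fit in long");

/* Hypothetical inner body, written against the real (narrow) type. */
static long inner_body(int flags)
{
    return flags;   /* widened back to long with sign extension */
}

/* Hypothetical entry point: the argument arrives as a full register. */
static long entry_point(long raw)
{
    /* Only the low bits survive the cast, so stale upper bits are ignored. */
    return inner_body((int)raw);
}
```

Whatever the caller left in the upper half of the register, inner_body
only ever sees a clean int, which is then sign-extended on the way back.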


--
Best Regards
Masahiro Yamada
