linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: Fix x32 System V message queue syscalls
@ 2020-10-12  1:48 Jessica Clarke
  2020-10-12  3:02 ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-10-12  1:48 UTC (permalink / raw)
  To: linux-x86_64
  Cc: Jessica Clarke, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel

POSIX specifies that the first field of the supplied msgp, namely mtype,
is a long, not a __kernel_long_t, and it's a user-defined struct due to
the variable-length mtext field so we can't even bend the spec and make
it a __kernel_long_t even if we wanted to. Thus we must use the compat
syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
msgrcv respectively.

Due to erroneously including the first 4 bytes of mtext in the mtype
this would previously also cause non-zero msgtyp arguments for msgrcv to
search for the wrong messages, and if sharing message queues between x32
and non-x32 (i386 or x86_64) processes this would previously cause mtext
to "move" and, depending on the direction and ABI combination, lose the
first 4 bytes.

Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
---
 arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a..7ee40989e 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -77,8 +77,8 @@
 66	common	semctl			sys_semctl
 67	common	shmdt			sys_shmdt
 68	common	msgget			sys_msgget
-69	common	msgsnd			sys_msgsnd
-70	common	msgrcv			sys_msgrcv
+69	64	msgsnd			sys_msgsnd
+70	64	msgrcv			sys_msgrcv
 71	common	msgctl			sys_msgctl
 72	common	fcntl			sys_fcntl
 73	common	flock			sys_flock
@@ -404,3 +404,5 @@
 545	x32	execveat		compat_sys_execveat
 546	x32	preadv2			compat_sys_preadv64v2
 547	x32	pwritev2		compat_sys_pwritev64v2
+548	x32	msgsnd			compat_sys_msgsnd
+549	x32	msgrcv			compat_sys_msgrcv
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86: Fix x32 System V message queue syscalls
  2020-10-12  1:48 [PATCH] x86: Fix x32 System V message queue syscalls Jessica Clarke
@ 2020-10-12  3:02 ` Andy Lutomirski
  2020-10-12  3:31   ` Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2020-10-12  3:02 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: linux-x86_64, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, X86 ML, H. Peter Anvin, LKML, Brian Gerst

On Sun, Oct 11, 2020 at 6:48 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>
> POSIX specifies that the first field of the supplied msgp, namely mtype,
> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> the variable-length mtext field so we can't even bend the spec and make
> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> msgrcv respectively.
>
> Due to erroneously including the first 4 bytes of mtext in the mtype
> this would previously also cause non-zero msgtyp arguments for msgrcv to
> search for the wrong messages, and if sharing message queues between x32
> and non-x32 (i386 or x86_64) processes this would previously cause mtext
> to "move" and, depending on the direction and ABI combination, lose the
> first 4 bytes.

This has a few problems.

First, the 512-547 range is a legacy wart and we shouldn't extend it.
I thought I had documented this, but I didn't -- oops.  Patch sent.
The correct way to do this nowadays is to use the same number twice,
once for x64 and once for x32.  If you try to do this and encounter
problems with the build, please either fix them or let me know :)

Second, since this is an ABI issue, can you please include a little
test case that compiles for i386, x86_64 and x32, works correctly on
all three with whatever patch you send, and fails on x32 without the
patch?

Third, how does this interact with various libc implementations?
msgsnd(2) and msgrcv(2) have glibc wrappers.  Have they never been
tested on x32?

Fourth, if there is some deployed code that uses the buggy x32 path
and actually works by accident, do we risk breaking it if we fix the
bug?

--Andy

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] x86: Fix x32 System V message queue syscalls
  2020-10-12  3:02 ` Andy Lutomirski
@ 2020-10-12  3:31   ` Jessica Clarke
  2020-10-12 13:44     ` [PATCH v2] " Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-10-12  3:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-x86_64, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	X86 ML, H. Peter Anvin, LKML, Brian Gerst

On 12 Oct 2020, at 04:02, Andy Lutomirski <luto@kernel.org> wrote:
> On Sun, Oct 11, 2020 at 6:48 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>> 
>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>> the variable-length mtext field so we can't even bend the spec and make
>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>> msgrcv respectively.
>> 
>> Due to erroneously including the first 4 bytes of mtext in the mtype
>> this would previously also cause non-zero msgtyp arguments for msgrcv to
>> search for the wrong messages, and if sharing message queues between x32
>> and non-x32 (i386 or x86_64) processes this would previously cause mtext
>> to "move" and, depending on the direction and ABI combination, lose the
>> first 4 bytes.
> 
> This has a few problems.
> 
> First, the 512-547 range is a legacy wart and we shouldn't extend it.
> I thought I had documented this, but I didn't -- oops.  Patch sent.
> The correct way to do this nowadays is to use the same number twice,
> once for x64 and once for x32.  If you try to do this and encounter
> problems with the build, please either fix them or let me know :)

Yeah, that makes more sense. Not sure what I was thinking; changing the
syscall number is clearly a breaking change that's a huge no-no (at
least without also providing old_msgsnd/msgrcv entries, but as
discussed at the end it's unlikely any userspace code actually wants
that behaviour).

> Second, since this is an ABI issue, can you please include a little
> test case that compiles for i386, x86_64 and x32, works correctly on
> all three with whatever patch you send, and fails on x32 without the
> patch?

Test is below. Only verified to break on x32 (and work on i386+amd64)
currently; once I have v2 built I will check all three give the
expected output (NB: for x32 that means only "PAY" should get printed
in the second case since the __kernel_long_t runs over into the start
of the message payload, whereas on the other two you should get the
full "PAYLOAD" since long == __kernel_long_t).

> Third, how does this interact with various libc implementations?
> msgsnd(2) and msgrcv(2) have glibc wrappers.  Have they never been
> tested on x32?

The wrappers aren't anything interesting, they just use SYSCALL_CANCEL
with either msgsnd or irc + IPCOP_msgsnd, so glibc doesn't care about
the struct layout one bit. Of course it does care about the syscall
numbers though...

As far as testing goes, well, I guess 4 byte buffer overflows just
don't get noticed that much. I noticed this not because of that, but
because someone was trying to get fakeroot to work when mixing amd64
and x32 processes, but noticed that mtext was moving around (although
they though that was how it was meant to work, not that this was a
bug). Pure x32 fakeroot though has been in use for the entire lifetime
of the Debian port, as with all other architectures, and I'm not aware
of there having been any issues discovered, despite the fact that its
default implementation is SysV IPC (as opposed to the TCP alternative).
So yeah, it's been tested, but it's not actually quite so easy to
notice.

> Fourth, if there is some deployed code that uses the buggy x32 path
> and actually works by accident, do we risk breaking it if we fix the
> bug?

I highly doubt it. Very few people know of __kernel_long_t, so you
would have to go out of your way to make it work on x32 (and go against
POSIX and the Linux manpage), and that's *after* somehow noticing that
there's a problem which nobody seems to have done, plus msgsnd/msgrcv
is rare in and of itself. A crude search through the Debian archive for
kernel_long_t.*mtype[1] doesn't turn anything up. So yeah, there is a
risk, but I think it's about as minimal as it's ever going to be,
especially given x32 being already rather niche.

Jess

[1] https://codesearch.debian.net/search?q=__kernel_long_t.*mtype&literal=0

Test demonstrating

debian:~ jrtc27% cat sysv-msg-test.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#include <asm/posix_types.h>

struct msg_long {
    long mtype;
    char mtext[8];
};

struct msg_long_ext {
    struct msg_long msg_long;
    char mext[4];
};

struct msg_kern_long {
    __kernel_long_t mtype;
    char mtext[8];
};

struct msg_kern_long_ext {
    struct msg_kern_long msg_kern_long;
    char mext[4];
};

void
do_long(void)
{
    struct msg_long_ext msg_long_ext;
    int msqid;

    msqid = msgget(IPC_PRIVATE, 0600 | IPC_CREAT);

    msg_long_ext.msg_long.mtype = 1;
    strcpy(msg_long_ext.msg_long.mtext, "PAYLOAD");
    strcpy(msg_long_ext.mext, "EXT");
    msgsnd(msqid, &msg_long_ext, 8, 0);

    msg_long_ext.msg_long.mtype = 0;
    memset(msg_long_ext.msg_long.mtext, 0, 8);
    memset(msg_long_ext.mext, 0, 4);
    msgrcv(msqid, &msg_long_ext, 8, 0, 0);
    printf("mtype: %ld\n", msg_long_ext.msg_long.mtype);
    printf("mtext: \"%s\"\n", msg_long_ext.msg_long.mtext);
    printf("mext: \"%s\"\n", msg_long_ext.mext);

    msgctl(msqid, IPC_RMID, NULL);
}

void
do_kern_long(void)
{
    struct msg_kern_long_ext msg_kern_long_ext;
    int msqid;

    msqid = msgget(IPC_PRIVATE, 0600 | IPC_CREAT);

    msg_kern_long_ext.msg_kern_long.mtype = 1;
    strcpy(msg_kern_long_ext.msg_kern_long.mtext, "PAYLOAD");
    strcpy(msg_kern_long_ext.mext, "EXT");
    msgsnd(msqid, &msg_kern_long_ext, 8, 0);

    msg_kern_long_ext.msg_kern_long.mtype = 0;
    memset(msg_kern_long_ext.msg_kern_long.mtext, 0, 8);
    memset(msg_kern_long_ext.mext, 0, 4);
    msgrcv(msqid, &msg_kern_long_ext, 8, 0, 0);
    printf("mtype: %ld\n", msg_kern_long_ext.msg_kern_long.mtype);
    printf("mtext: \"%s\"\n", msg_kern_long_ext.msg_kern_long.mtext);
    printf("mext: \"%s\"\n", msg_kern_long_ext.mext);

    msgctl(msqid, IPC_RMID, NULL);
}

int
main(void)
{

    do_long();
    do_kern_long();
    return 0;
}
debian:~ jrtc27% ./sysv-msg-test
mtype: 1
mtext: "PAYLOAD"
mext: "EXT"
mtype: 1
mtext: "PAYLOAD"
mext: ""


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-10-12  3:31   ` Jessica Clarke
@ 2020-10-12 13:44     ` Jessica Clarke
  2020-10-30 19:21       ` Jessica Clarke
  2020-10-31 23:30       ` Andy Lutomirski
  0 siblings, 2 replies; 26+ messages in thread
From: Jessica Clarke @ 2020-10-12 13:44 UTC (permalink / raw)
  To: linux-x86_64
  Cc: Jessica Clarke, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel

POSIX specifies that the first field of the supplied msgp, namely mtype,
is a long, not a __kernel_long_t, and it's a user-defined struct due to
the variable-length mtext field so we can't even bend the spec and make
it a __kernel_long_t even if we wanted to. Thus we must use the compat
syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
msgrcv respectively.

Due to erroneously including the first 4 bytes of mtext in the mtype
this would previously also cause non-zero msgtyp arguments for msgrcv to
search for the wrong messages, and if sharing message queues between x32
and non-x32 (i386 or x86_64) processes this would previously cause mtext
to "move" and, depending on the direction and ABI combination, lose the
first 4 bytes.

Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
---

I have verified that the test at the end of [1] now gives the correct
result on x32 ("PAYL" not "PAY" as I erroneously claimed it should be in
the above email) and that both i386 and amd64 give the same output with
that test as before.

[1] <1156938F-A9A3-4EE9-B059-2294A0B9FBFE@jrtc27.com>

Changes since v1:
 * Uses the same syscall numbers for x32 as amd64 and the current x32
   rather than (further) breaking ABI by allocating new ones from the
   legacy x32 range

 arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a..f462123f3 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -77,8 +77,10 @@
 66	common	semctl			sys_semctl
 67	common	shmdt			sys_shmdt
 68	common	msgget			sys_msgget
-69	common	msgsnd			sys_msgsnd
-70	common	msgrcv			sys_msgrcv
+69	64	msgsnd			sys_msgsnd
+69	x32	msgsnd			compat_sys_msgsnd
+70	64	msgrcv			sys_msgrcv
+70	x32	msgrcv			compat_sys_msgrcv
 71	common	msgctl			sys_msgctl
 72	common	fcntl			sys_fcntl
 73	common	flock			sys_flock
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-10-12 13:44     ` [PATCH v2] " Jessica Clarke
@ 2020-10-30 19:21       ` Jessica Clarke
  2020-10-31 23:30       ` Andy Lutomirski
  1 sibling, 0 replies; 26+ messages in thread
From: Jessica Clarke @ 2020-10-30 19:21 UTC (permalink / raw)
  To: linux-x86_64
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, linux-kernel

> On 12 Oct 2020, at 14:44, Jessica Clarke <jrtc27@jrtc27.com> wrote:
> 
> POSIX specifies that the first field of the supplied msgp, namely mtype,
> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> the variable-length mtext field so we can't even bend the spec and make
> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> msgrcv respectively.
> 
> Due to erroneously including the first 4 bytes of mtext in the mtype
> this would previously also cause non-zero msgtyp arguments for msgrcv to
> search for the wrong messages, and if sharing message queues between x32
> and non-x32 (i386 or x86_64) processes this would previously cause mtext
> to "move" and, depending on the direction and ABI combination, lose the
> first 4 bytes.
> 
> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
> ---

Ping?

Jess

> 
> I have verified that the test at the end of [1] now gives the correct
> result on x32 ("PAYL" not "PAY" as I erroneously claimed it should be in
> the above email) and that both i386 and amd64 give the same output with
> that test as before.
> 
> [1] <1156938F-A9A3-4EE9-B059-2294A0B9FBFE@jrtc27.com>
> 
> Changes since v1:
> * Uses the same syscall numbers for x32 as amd64 and the current x32
>   rather than (further) breaking ABI by allocating new ones from the
>   legacy x32 range
> 
> arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index f30d6ae9a..f462123f3 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -77,8 +77,10 @@
> 66	common	semctl			sys_semctl
> 67	common	shmdt			sys_shmdt
> 68	common	msgget			sys_msgget
> -69	common	msgsnd			sys_msgsnd
> -70	common	msgrcv			sys_msgrcv
> +69	64	msgsnd			sys_msgsnd
> +69	x32	msgsnd			compat_sys_msgsnd
> +70	64	msgrcv			sys_msgrcv
> +70	x32	msgrcv			compat_sys_msgrcv
> 71	common	msgctl			sys_msgctl
> 72	common	fcntl			sys_fcntl
> 73	common	flock			sys_flock
> -- 
> 2.28.0
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-10-12 13:44     ` [PATCH v2] " Jessica Clarke
  2020-10-30 19:21       ` Jessica Clarke
@ 2020-10-31 23:30       ` Andy Lutomirski
  2020-11-01  0:09         ` Jessica Clarke
  2020-11-01  1:22         ` Rich Felker
  1 sibling, 2 replies; 26+ messages in thread
From: Andy Lutomirski @ 2020-10-31 23:30 UTC (permalink / raw)
  To: Jessica Clarke, Florian Weimer, Rich Felker
  Cc: linux-x86_64, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, X86 ML, H. Peter Anvin, LKML

cc: some libc folks

On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>
> POSIX specifies that the first field of the supplied msgp, namely mtype,
> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> the variable-length mtext field so we can't even bend the spec and make
> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> msgrcv respectively.

This is a mess.

include/uapi/linux/msg.h has:

/* message buffer for msgsnd and msgrcv calls */
struct msgbuf {
        __kernel_long_t mtype;          /* type of message */
        char mtext[1];                  /* message text */
};

Your test has:

struct msg_long {
    long mtype;
    char mtext[8];
};

struct msg_long_ext {
    struct msg_long msg_long;
    char mext[4];
};

and I'm unclear as to exactly what you're trying to do there with the
"mext" part.

POSIX says:

       The application shall ensure that the argument msgp points to  a  user-
       defined  buffer that contains first a field of type long specifying the
       type of the message, and then a data portion that holds the data  bytes
       of the message. The structure below is an example of what this user-de‐
       fined buffer might look like:

           struct mymsg {
               long   mtype;       /* Message type. */
               char   mtext[1];    /* Message text. */
           }

NTP has this delightful piece of code:

   44 typedef union {
   45   struct msgbuf msgp;
   46   struct {
   47     long mtype;
   48     int code;
   49     struct timeval tv;
   50   } msgb;
   51 } MsgBuf;

bluefish has:

struct small_msgbuf {
long mtype;
char mtext[MSQ_QUEUE_SMALL_SIZE];
} small_msgp;


My laptop has nothing at all in /dev/mqueue.

So I don't really know what the right thing to do is.  Certainly if
we're going to apply this patch, we should also fix the header.  I
almost think we should *delete* struct msgbuf from the headers, since
it's all kinds of busted, but that will break the NTP build.  Ideally
we would go back in time and remove it from the headers.

Libc people, any insight?  We can probably fix the bug without
annoying anyone given how lightly x32 is used and how lightly POSIX
message queues are used.

--Andy

>
> Due to erroneously including the first 4 bytes of mtext in the mtype
> this would previously also cause non-zero msgtyp arguments for msgrcv to
> search for the wrong messages, and if sharing message queues between x32
> and non-x32 (i386 or x86_64) processes this would previously cause mtext
> to "move" and, depending on the direction and ABI combination, lose the
> first 4 bytes.
>
> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
> ---
>
> I have verified that the test at the end of [1] now gives the correct
> result on x32 ("PAYL" not "PAY" as I erroneously claimed it should be in
> the above email) and that both i386 and amd64 give the same output with
> that test as before.
>
> [1] <1156938F-A9A3-4EE9-B059-2294A0B9FBFE@jrtc27.com>
>
> Changes since v1:
>  * Uses the same syscall numbers for x32 as amd64 and the current x32
>    rather than (further) breaking ABI by allocating new ones from the
>    legacy x32 range
>
>  arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index f30d6ae9a..f462123f3 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -77,8 +77,10 @@
>  66     common  semctl                  sys_semctl
>  67     common  shmdt                   sys_shmdt
>  68     common  msgget                  sys_msgget
> -69     common  msgsnd                  sys_msgsnd
> -70     common  msgrcv                  sys_msgrcv
> +69     64      msgsnd                  sys_msgsnd
> +69     x32     msgsnd                  compat_sys_msgsnd
> +70     64      msgrcv                  sys_msgrcv
> +70     x32     msgrcv                  compat_sys_msgrcv
>  71     common  msgctl                  sys_msgctl
>  72     common  fcntl                   sys_fcntl
>  73     common  flock                   sys_flock
> --
> 2.28.0
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-10-31 23:30       ` Andy Lutomirski
@ 2020-11-01  0:09         ` Jessica Clarke
  2020-11-01  1:22         ` Rich Felker
  1 sibling, 0 replies; 26+ messages in thread
From: Jessica Clarke @ 2020-11-01  0:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Florian Weimer, Rich Felker, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 31 Oct 2020, at 23:30, Andy Lutomirski <luto@kernel.org> wrote:
> 
> cc: some libc folks
> 
> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>> 
>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>> the variable-length mtext field so we can't even bend the spec and make
>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>> msgrcv respectively.
> 
> This is a mess.
> 
> include/uapi/linux/msg.h has:
> 
> /* message buffer for msgsnd and msgrcv calls */
> struct msgbuf {
>        __kernel_long_t mtype;          /* type of message */
>        char mtext[1];                  /* message text */
> };

Yeah that's just wrong, POSIX is very clear it's a plain long.

> Your test has:
> 
> struct msg_long {
>    long mtype;
>    char mtext[8];
> };
> 
> struct msg_long_ext {
>    struct msg_long msg_long;
>    char mext[4];
> };
> 
> and I'm unclear as to exactly what you're trying to do there with the
> "mext" part.

The intent was to show that, despite the 8 being passed to
msgsnd/msgrcv (due to the `char mtext[8]`), an extra 4 bytes were being
copied when using long, namely mext, and that using __kernel_long_t was
giving the behaviour that should be observed for long, since the 4
bytes of mext in the destination buffer were left untouched and thus
still 0 from the memset.

> POSIX says:
> 
>       The application shall ensure that the argument msgp points to  a  user-
>       defined  buffer that contains first a field of type long specifying the
>       type of the message, and then a data portion that holds the data  bytes
>       of the message. The structure below is an example of what this user-de‐
>       fined buffer might look like:
> 
>           struct mymsg {
>               long   mtype;       /* Message type. */
>               char   mtext[1];    /* Message text. */
>           }
> 
> NTP has this delightful piece of code:
> 
>   44 typedef union {
>   45   struct msgbuf msgp;
>   46   struct {
>   47     long mtype;
>   48     int code;
>   49     struct timeval tv;
>   50   } msgb;
>   51 } MsgBuf;

I was initially concerned by that, since msgbuf is a thing that FreeBSD
also defines but is completely different (used for console messages),
but it appears that adjtime.h, where that is defined, is only used in
libntp/adjtime.c and adjtimed/adjtimed.c, with the former only
including it if NEED_HPUX_ADJTIME, and the latter only being enabled by
configure.ac for HP-UX. So I do not believe that code has ever been
compiled on Linux (though that doesn't mean it's good code for HP-UX!).

> bluefish has:
> 
> struct small_msgbuf {
> long mtype;
> char mtext[MSQ_QUEUE_SMALL_SIZE];
> } small_msgp;

Which is standards-conforming but unfortunately currently causes memory
corruption for the 4 bytes after it, just like any other correct user
of msgsend/msgrecv on x32.

> My laptop has nothing at all in /dev/mqueue.
> 
> So I don't really know what the right thing to do is.  Certainly if
> we're going to apply this patch, we should also fix the header.  I
> almost think we should *delete* struct msgbuf from the headers, since
> it's all kinds of busted, but that will break the NTP build.  Ideally
> we would go back in time and remove it from the headers.

I agree that it should not be in the header. Failing that though we
could change the type as it is just wrong on x32 and as you say nobody
should notice (and if anything it'll be because things *start* working).

> Libc people, any insight?  We can probably fix the bug without
> annoying anyone given how lightly x32 is used and how lightly POSIX
> message queues are used.

That is my hope...

Jess

>> 
>> Due to erroneously including the first 4 bytes of mtext in the mtype
>> this would previously also cause non-zero msgtyp arguments for msgrcv to
>> search for the wrong messages, and if sharing message queues between x32
>> and non-x32 (i386 or x86_64) processes this would previously cause mtext
>> to "move" and, depending on the direction and ABI combination, lose the
>> first 4 bytes.
>> 
>> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
>> ---
>> 
>> I have verified that the test at the end of [1] now gives the correct
>> result on x32 ("PAYL" not "PAY" as I erroneously claimed it should be in
>> the above email) and that both i386 and amd64 give the same output with
>> that test as before.
>> 
>> [1] <1156938F-A9A3-4EE9-B059-2294A0B9FBFE@jrtc27.com>
>> 
>> Changes since v1:
>> * Uses the same syscall numbers for x32 as amd64 and the current x32
>>   rather than (further) breaking ABI by allocating new ones from the
>>   legacy x32 range
>> 
>> arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
>> index f30d6ae9a..f462123f3 100644
>> --- a/arch/x86/entry/syscalls/syscall_64.tbl
>> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
>> @@ -77,8 +77,10 @@
>> 66     common  semctl                  sys_semctl
>> 67     common  shmdt                   sys_shmdt
>> 68     common  msgget                  sys_msgget
>> -69     common  msgsnd                  sys_msgsnd
>> -70     common  msgrcv                  sys_msgrcv
>> +69     64      msgsnd                  sys_msgsnd
>> +69     x32     msgsnd                  compat_sys_msgsnd
>> +70     64      msgrcv                  sys_msgrcv
>> +70     x32     msgrcv                  compat_sys_msgrcv
>> 71     common  msgctl                  sys_msgctl
>> 72     common  fcntl                   sys_fcntl
>> 73     common  flock                   sys_flock
>> --
>> 2.28.0
>> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-10-31 23:30       ` Andy Lutomirski
  2020-11-01  0:09         ` Jessica Clarke
@ 2020-11-01  1:22         ` Rich Felker
  2020-11-01  1:27           ` Jessica Clarke
  1 sibling, 1 reply; 26+ messages in thread
From: Rich Felker @ 2020-11-01  1:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jessica Clarke, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
> cc: some libc folks
> 
> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> >
> > POSIX specifies that the first field of the supplied msgp, namely mtype,
> > is a long, not a __kernel_long_t, and it's a user-defined struct due to
> > the variable-length mtext field so we can't even bend the spec and make
> > it a __kernel_long_t even if we wanted to. Thus we must use the compat
> > syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> > msgrcv respectively.
> 
> This is a mess.
> 
> include/uapi/linux/msg.h has:
> 
> /* message buffer for msgsnd and msgrcv calls */
> struct msgbuf {
>         __kernel_long_t mtype;          /* type of message */
>         char mtext[1];                  /* message text */
> };
> 
> Your test has:
> 
> struct msg_long {
>     long mtype;
>     char mtext[8];
> };
> 
> struct msg_long_ext {
>     struct msg_long msg_long;
>     char mext[4];
> };
> 
> and I'm unclear as to exactly what you're trying to do there with the
> "mext" part.
> 
> POSIX says:
> 
>        The application shall ensure that the argument msgp points to  a  user-
>        defined  buffer that contains first a field of type long specifying the
>        type of the message, and then a data portion that holds the data  bytes
>        of the message. The structure below is an example of what this user-de‐
>        fined buffer might look like:
> 
>            struct mymsg {
>                long   mtype;       /* Message type. */
>                char   mtext[1];    /* Message text. */
>            }
> 
> NTP has this delightful piece of code:
> 
>    44 typedef union {
>    45   struct msgbuf msgp;
>    46   struct {
>    47     long mtype;
>    48     int code;
>    49     struct timeval tv;
>    50   } msgb;
>    51 } MsgBuf;
> 
> bluefish has:
> 
> struct small_msgbuf {
> long mtype;
> char mtext[MSQ_QUEUE_SMALL_SIZE];
> } small_msgp;
> 
> 
> My laptop has nothing at all in /dev/mqueue.
> 
> So I don't really know what the right thing to do is.  Certainly if
> we're going to apply this patch, we should also fix the header.  I
> almost think we should *delete* struct msgbuf from the headers, since
> it's all kinds of busted, but that will break the NTP build.  Ideally
> we would go back in time and remove it from the headers.
> 
> Libc people, any insight?  We can probably fix the bug without
> annoying anyone given how lightly x32 is used and how lightly POSIX
> message queues are used.

If it's that outright wrong and always has been, I feel like the old
syscall numbers should just be deprecated and new ones assigned.
Otherwise, there's no way for userspace to be safe against data
corruption when run on older kernels. If there's a new syscall number,
libc can just use the new one unconditionally (giving ENOSYS on
kernels where it would be broken) or have a x32-specific
implementation that makes the old syscall and performs translation if
the new one fails with ENOSYS.

Rich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01  1:22         ` Rich Felker
@ 2020-11-01  1:27           ` Jessica Clarke
  2020-11-01  1:50             ` Rich Felker
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-11-01  1:27 UTC (permalink / raw)
  To: Rich Felker
  Cc: Andy Lutomirski, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>> cc: some libc folks
>> 
>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>> 
>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>> the variable-length mtext field so we can't even bend the spec and make
>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>> msgrcv respectively.
>> 
>> This is a mess.
>> 
>> include/uapi/linux/msg.h has:
>> 
>> /* message buffer for msgsnd and msgrcv calls */
>> struct msgbuf {
>>        __kernel_long_t mtype;          /* type of message */
>>        char mtext[1];                  /* message text */
>> };
>> 
>> Your test has:
>> 
>> struct msg_long {
>>    long mtype;
>>    char mtext[8];
>> };
>> 
>> struct msg_long_ext {
>>    struct msg_long msg_long;
>>    char mext[4];
>> };
>> 
>> and I'm unclear as to exactly what you're trying to do there with the
>> "mext" part.
>> 
>> POSIX says:
>> 
>>       The application shall ensure that the argument msgp points to  a  user-
>>       defined  buffer that contains first a field of type long specifying the
>>       type of the message, and then a data portion that holds the data  bytes
>>       of the message. The structure below is an example of what this user-de‐
>>       fined buffer might look like:
>> 
>>           struct mymsg {
>>               long   mtype;       /* Message type. */
>>               char   mtext[1];    /* Message text. */
>>           }
>> 
>> NTP has this delightful piece of code:
>> 
>>   44 typedef union {
>>   45   struct msgbuf msgp;
>>   46   struct {
>>   47     long mtype;
>>   48     int code;
>>   49     struct timeval tv;
>>   50   } msgb;
>>   51 } MsgBuf;
>> 
>> bluefish has:
>> 
>> struct small_msgbuf {
>> long mtype;
>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>> } small_msgp;
>> 
>> 
>> My laptop has nothing at all in /dev/mqueue.
>> 
>> So I don't really know what the right thing to do is.  Certainly if
>> we're going to apply this patch, we should also fix the header.  I
>> almost think we should *delete* struct msgbuf from the headers, since
>> it's all kinds of busted, but that will break the NTP build.  Ideally
>> we would go back in time and remove it from the headers.
>> 
>> Libc people, any insight?  We can probably fix the bug without
>> annoying anyone given how lightly x32 is used and how lightly POSIX
>> message queues are used.
> 
> If it's that outright wrong and always has been, I feel like the old
> syscall numbers should just be deprecated and new ones assigned.
> Otherwise, there's no way for userspace to be safe against data
> corruption when run on older kernels. If there's a new syscall number,
> libc can just use the new one unconditionally (giving ENOSYS on
> kernels where it would be broken) or have a x32-specific
> implementation that makes the old syscall and performs translation if
> the new one fails with ENOSYS.

That doesn't really help broken code continue to work reliably, as
upgrading libc will just pull in the new syscall for a binary that's
expecting the broken behaviour, unless you do symbol versioning, but
then it'll just break when you next recompile the code, and there's no
way for that to be diagnosed given the *application* has to define the
type. But given it's application-defined I really struggle to see how
any code out there is actually expecting the current x32 behaviour as
you'd have to go really out of your way to find out that x32 is broken
and needs __kernel_long_t. I don't think there's any way around just
technically breaking ABI whilst likely really fixing ABI in 99.999% of
cases (maybe 100%).

Jess


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01  1:27           ` Jessica Clarke
@ 2020-11-01  1:50             ` Rich Felker
  2020-11-01 18:07               ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Rich Felker @ 2020-11-01  1:50 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Andy Lutomirski, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
> > On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
> >> cc: some libc folks
> >> 
> >> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> >>> 
> >>> POSIX specifies that the first field of the supplied msgp, namely mtype,
> >>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> >>> the variable-length mtext field so we can't even bend the spec and make
> >>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> >>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> >>> msgrcv respectively.
> >> 
> >> This is a mess.
> >> 
> >> include/uapi/linux/msg.h has:
> >> 
> >> /* message buffer for msgsnd and msgrcv calls */
> >> struct msgbuf {
> >>        __kernel_long_t mtype;          /* type of message */
> >>        char mtext[1];                  /* message text */
> >> };
> >> 
> >> Your test has:
> >> 
> >> struct msg_long {
> >>    long mtype;
> >>    char mtext[8];
> >> };
> >> 
> >> struct msg_long_ext {
> >>    struct msg_long msg_long;
> >>    char mext[4];
> >> };
> >> 
> >> and I'm unclear as to exactly what you're trying to do there with the
> >> "mext" part.
> >> 
> >> POSIX says:
> >> 
> >>       The application shall ensure that the argument msgp points to  a  user-
> >>       defined  buffer that contains first a field of type long specifying the
> >>       type of the message, and then a data portion that holds the data  bytes
> >>       of the message. The structure below is an example of what this user-de‐
> >>       fined buffer might look like:
> >> 
> >>           struct mymsg {
> >>               long   mtype;       /* Message type. */
> >>               char   mtext[1];    /* Message text. */
> >>           }
> >> 
> >> NTP has this delightful piece of code:
> >> 
> >>   44 typedef union {
> >>   45   struct msgbuf msgp;
> >>   46   struct {
> >>   47     long mtype;
> >>   48     int code;
> >>   49     struct timeval tv;
> >>   50   } msgb;
> >>   51 } MsgBuf;
> >> 
> >> bluefish has:
> >> 
> >> struct small_msgbuf {
> >> long mtype;
> >> char mtext[MSQ_QUEUE_SMALL_SIZE];
> >> } small_msgp;
> >> 
> >> 
> >> My laptop has nothing at all in /dev/mqueue.
> >> 
> >> So I don't really know what the right thing to do is.  Certainly if
> >> we're going to apply this patch, we should also fix the header.  I
> >> almost think we should *delete* struct msgbuf from the headers, since
> >> it's all kinds of busted, but that will break the NTP build.  Ideally
> >> we would go back in time and remove it from the headers.
> >> 
> >> Libc people, any insight?  We can probably fix the bug without
> >> annoying anyone given how lightly x32 is used and how lightly POSIX
> >> message queues are used.
> > 
> > If it's that outright wrong and always has been, I feel like the old
> > syscall numbers should just be deprecated and new ones assigned.
> > Otherwise, there's no way for userspace to be safe against data
> > corruption when run on older kernels. If there's a new syscall number,
> > libc can just use the new one unconditionally (giving ENOSYS on
> > kernels where it would be broken) or have a x32-specific
> > implementation that makes the old syscall and performs translation if
> > the new one fails with ENOSYS.
> 
> That doesn't really help broken code continue to work reliably, as
> upgrading libc will just pull in the new syscall for a binary that's
> expecting the broken behaviour, unless you do symbol versioning, but
> then it'll just break when you next recompile the code, and there's no
> way for that to be diagnosed given the *application* has to define the
> type. But given it's application-defined I really struggle to see how
> any code out there is actually expecting the current x32 behaviour as
> you'd have to go really out of your way to find out that x32 is broken
> and needs __kernel_long_t. I don't think there's any way around just
> technically breaking ABI whilst likely really fixing ABI in 99.999% of
> cases (maybe 100%).

I'm not opposed to "breaking ABI" here because the current syscall
doesn't work unless someone wrote bogus x32-specific code to work
around it being wrong. I don't particularly want to preserve any of
the current behavior.

What I am somewhat opposed to is making a situation where an updated
libc can't be safe against getting run on a kernel with a broken
version of the syscall and silently corrupting data. I'm flexible
about how avoiding tha tis achieved.

Rich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01  1:50             ` Rich Felker
@ 2020-11-01 18:07               ` Andy Lutomirski
  2020-11-01 18:15                 ` Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2020-11-01 18:07 UTC (permalink / raw)
  To: Rich Felker
  Cc: Jessica Clarke, Andy Lutomirski, Florian Weimer, linux-x86_64,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
>
> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
> > On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
> > > On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
> > >> cc: some libc folks
> > >>
> > >> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > >>>
> > >>> POSIX specifies that the first field of the supplied msgp, namely mtype,
> > >>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> > >>> the variable-length mtext field so we can't even bend the spec and make
> > >>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> > >>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> > >>> msgrcv respectively.
> > >>
> > >> This is a mess.
> > >>
> > >> include/uapi/linux/msg.h has:
> > >>
> > >> /* message buffer for msgsnd and msgrcv calls */
> > >> struct msgbuf {
> > >>        __kernel_long_t mtype;          /* type of message */
> > >>        char mtext[1];                  /* message text */
> > >> };
> > >>
> > >> Your test has:
> > >>
> > >> struct msg_long {
> > >>    long mtype;
> > >>    char mtext[8];
> > >> };
> > >>
> > >> struct msg_long_ext {
> > >>    struct msg_long msg_long;
> > >>    char mext[4];
> > >> };
> > >>
> > >> and I'm unclear as to exactly what you're trying to do there with the
> > >> "mext" part.
> > >>
> > >> POSIX says:
> > >>
> > >>       The application shall ensure that the argument msgp points to  a  user-
> > >>       defined  buffer that contains first a field of type long specifying the
> > >>       type of the message, and then a data portion that holds the data  bytes
> > >>       of the message. The structure below is an example of what this user-de‐
> > >>       fined buffer might look like:
> > >>
> > >>           struct mymsg {
> > >>               long   mtype;       /* Message type. */
> > >>               char   mtext[1];    /* Message text. */
> > >>           }
> > >>
> > >> NTP has this delightful piece of code:
> > >>
> > >>   44 typedef union {
> > >>   45   struct msgbuf msgp;
> > >>   46   struct {
> > >>   47     long mtype;
> > >>   48     int code;
> > >>   49     struct timeval tv;
> > >>   50   } msgb;
> > >>   51 } MsgBuf;
> > >>
> > >> bluefish has:
> > >>
> > >> struct small_msgbuf {
> > >> long mtype;
> > >> char mtext[MSQ_QUEUE_SMALL_SIZE];
> > >> } small_msgp;
> > >>
> > >>
> > >> My laptop has nothing at all in /dev/mqueue.
> > >>
> > >> So I don't really know what the right thing to do is.  Certainly if
> > >> we're going to apply this patch, we should also fix the header.  I
> > >> almost think we should *delete* struct msgbuf from the headers, since
> > >> it's all kinds of busted, but that will break the NTP build.  Ideally
> > >> we would go back in time and remove it from the headers.
> > >>
> > >> Libc people, any insight?  We can probably fix the bug without
> > >> annoying anyone given how lightly x32 is used and how lightly POSIX
> > >> message queues are used.
> > >
> > > If it's that outright wrong and always has been, I feel like the old
> > > syscall numbers should just be deprecated and new ones assigned.
> > > Otherwise, there's no way for userspace to be safe against data
> > > corruption when run on older kernels. If there's a new syscall number,
> > > libc can just use the new one unconditionally (giving ENOSYS on
> > > kernels where it would be broken) or have a x32-specific
> > > implementation that makes the old syscall and performs translation if
> > > the new one fails with ENOSYS.
> >
> > That doesn't really help broken code continue to work reliably, as
> > upgrading libc will just pull in the new syscall for a binary that's
> > expecting the broken behaviour, unless you do symbol versioning, but
> > then it'll just break when you next recompile the code, and there's no
> > way for that to be diagnosed given the *application* has to define the
> > type. But given it's application-defined I really struggle to see how
> > any code out there is actually expecting the current x32 behaviour as
> > you'd have to go really out of your way to find out that x32 is broken
> > and needs __kernel_long_t. I don't think there's any way around just
> > technically breaking ABI whilst likely really fixing ABI in 99.999% of
> > cases (maybe 100%).
>
> I'm not opposed to "breaking ABI" here because the current syscall
> doesn't work unless someone wrote bogus x32-specific code to work
> around it being wrong. I don't particularly want to preserve any of
> the current behavior.
>
> What I am somewhat opposed to is making a situation where an updated
> libc can't be safe against getting run on a kernel with a broken
> version of the syscall and silently corrupting data. I'm flexible
> about how avoiding tha tis achieved.

If we're sufficiently confident that we won't regress anything by
fixing the bug, I propose we do the following.  First, we commit a fix
that's Jessica's patch plus a fix to struct msghdr, and we mark that
for -stable.  Then we commit another patch that removes 'struct
msghdr' from uapi entirely, but we don't mark that for -stable.  If
people complain about the latter, we revert it.

After all, we could argue that the x32 bug here isn't really that
different from any of the huge number of various syscall bugs we've
had in the past and should just be fixed in the kernel.  If you run
user programs on a buggy kernel, you get buggy behavior...

--Andy

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01 18:07               ` Andy Lutomirski
@ 2020-11-01 18:15                 ` Jessica Clarke
  2020-11-01 18:27                   ` Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-11-01 18:15 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Rich Felker, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@kernel.org> wrote:
> 
> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
>> 
>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>>>>> cc: some libc folks
>>>>> 
>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>>> 
>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>>>>> the variable-length mtext field so we can't even bend the spec and make
>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>>>>> msgrcv respectively.
>>>>> 
>>>>> This is a mess.
>>>>> 
>>>>> include/uapi/linux/msg.h has:
>>>>> 
>>>>> /* message buffer for msgsnd and msgrcv calls */
>>>>> struct msgbuf {
>>>>>       __kernel_long_t mtype;          /* type of message */
>>>>>       char mtext[1];                  /* message text */
>>>>> };
>>>>> 
>>>>> Your test has:
>>>>> 
>>>>> struct msg_long {
>>>>>   long mtype;
>>>>>   char mtext[8];
>>>>> };
>>>>> 
>>>>> struct msg_long_ext {
>>>>>   struct msg_long msg_long;
>>>>>   char mext[4];
>>>>> };
>>>>> 
>>>>> and I'm unclear as to exactly what you're trying to do there with the
>>>>> "mext" part.
>>>>> 
>>>>> POSIX says:
>>>>> 
>>>>>      The application shall ensure that the argument msgp points to  a  user-
>>>>>      defined  buffer that contains first a field of type long specifying the
>>>>>      type of the message, and then a data portion that holds the data  bytes
>>>>>      of the message. The structure below is an example of what this user-de‐
>>>>>      fined buffer might look like:
>>>>> 
>>>>>          struct mymsg {
>>>>>              long   mtype;       /* Message type. */
>>>>>              char   mtext[1];    /* Message text. */
>>>>>          }
>>>>> 
>>>>> NTP has this delightful piece of code:
>>>>> 
>>>>>  44 typedef union {
>>>>>  45   struct msgbuf msgp;
>>>>>  46   struct {
>>>>>  47     long mtype;
>>>>>  48     int code;
>>>>>  49     struct timeval tv;
>>>>>  50   } msgb;
>>>>>  51 } MsgBuf;
>>>>> 
>>>>> bluefish has:
>>>>> 
>>>>> struct small_msgbuf {
>>>>> long mtype;
>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>>>>> } small_msgp;
>>>>> 
>>>>> 
>>>>> My laptop has nothing at all in /dev/mqueue.
>>>>> 
>>>>> So I don't really know what the right thing to do is.  Certainly if
>>>>> we're going to apply this patch, we should also fix the header.  I
>>>>> almost think we should *delete* struct msgbuf from the headers, since
>>>>> it's all kinds of busted, but that will break the NTP build.  Ideally
>>>>> we would go back in time and remove it from the headers.
>>>>> 
>>>>> Libc people, any insight?  We can probably fix the bug without
>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX
>>>>> message queues are used.
>>>> 
>>>> If it's that outright wrong and always has been, I feel like the old
>>>> syscall numbers should just be deprecated and new ones assigned.
>>>> Otherwise, there's no way for userspace to be safe against data
>>>> corruption when run on older kernels. If there's a new syscall number,
>>>> libc can just use the new one unconditionally (giving ENOSYS on
>>>> kernels where it would be broken) or have a x32-specific
>>>> implementation that makes the old syscall and performs translation if
>>>> the new one fails with ENOSYS.
>>> 
>>> That doesn't really help broken code continue to work reliably, as
>>> upgrading libc will just pull in the new syscall for a binary that's
>>> expecting the broken behaviour, unless you do symbol versioning, but
>>> then it'll just break when you next recompile the code, and there's no
>>> way for that to be diagnosed given the *application* has to define the
>>> type. But given it's application-defined I really struggle to see how
>>> any code out there is actually expecting the current x32 behaviour as
>>> you'd have to go really out of your way to find out that x32 is broken
>>> and needs __kernel_long_t. I don't think there's any way around just
>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of
>>> cases (maybe 100%).
>> 
>> I'm not opposed to "breaking ABI" here because the current syscall
>> doesn't work unless someone wrote bogus x32-specific code to work
>> around it being wrong. I don't particularly want to preserve any of
>> the current behavior.
>> 
>> What I am somewhat opposed to is making a situation where an updated
>> libc can't be safe against getting run on a kernel with a broken
>> version of the syscall and silently corrupting data. I'm flexible
>> about how avoiding tha tis achieved.
> 
> If we're sufficiently confident that we won't regress anything by
> fixing the bug, I propose we do the following.  First, we commit a fix
> that's Jessica's patch plus a fix to struct msghdr, and we mark that
> for -stable.  Then we commit another patch that removes 'struct
> msghdr' from uapi entirely, but we don't mark that for -stable.  If
> people complain about the latter, we revert it.

Thinking about this more, MIPS n32 is also affected by that header. In
fact the n32 syscalls currently do the right thing and use the compat
implementations, so the header is currently out-of-sync with the kernel
there*. This should be noted when committing the change to msg.h.

Jess

* Of course that gets particularly exciting on big-endian n32...


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01 18:15                 ` Jessica Clarke
@ 2020-11-01 18:27                   ` Jessica Clarke
  2020-11-01 21:01                     ` Rich Felker
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-11-01 18:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Rich Felker, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 1 Nov 2020, at 18:15, Jessica Clarke <jrtc27@jrtc27.com> wrote:
> 
> On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@kernel.org> wrote:
>> 
>> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
>>> 
>>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
>>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
>>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>>>>>> cc: some libc folks
>>>>>> 
>>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>>>> 
>>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>>>>>> the variable-length mtext field so we can't even bend the spec and make
>>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>>>>>> msgrcv respectively.
>>>>>> 
>>>>>> This is a mess.
>>>>>> 
>>>>>> include/uapi/linux/msg.h has:
>>>>>> 
>>>>>> /* message buffer for msgsnd and msgrcv calls */
>>>>>> struct msgbuf {
>>>>>>      __kernel_long_t mtype;          /* type of message */
>>>>>>      char mtext[1];                  /* message text */
>>>>>> };
>>>>>> 
>>>>>> Your test has:
>>>>>> 
>>>>>> struct msg_long {
>>>>>>  long mtype;
>>>>>>  char mtext[8];
>>>>>> };
>>>>>> 
>>>>>> struct msg_long_ext {
>>>>>>  struct msg_long msg_long;
>>>>>>  char mext[4];
>>>>>> };
>>>>>> 
>>>>>> and I'm unclear as to exactly what you're trying to do there with the
>>>>>> "mext" part.
>>>>>> 
>>>>>> POSIX says:
>>>>>> 
>>>>>>     The application shall ensure that the argument msgp points to  a  user-
>>>>>>     defined  buffer that contains first a field of type long specifying the
>>>>>>     type of the message, and then a data portion that holds the data  bytes
>>>>>>     of the message. The structure below is an example of what this user-de‐
>>>>>>     fined buffer might look like:
>>>>>> 
>>>>>>         struct mymsg {
>>>>>>             long   mtype;       /* Message type. */
>>>>>>             char   mtext[1];    /* Message text. */
>>>>>>         }
>>>>>> 
>>>>>> NTP has this delightful piece of code:
>>>>>> 
>>>>>> 44 typedef union {
>>>>>> 45   struct msgbuf msgp;
>>>>>> 46   struct {
>>>>>> 47     long mtype;
>>>>>> 48     int code;
>>>>>> 49     struct timeval tv;
>>>>>> 50   } msgb;
>>>>>> 51 } MsgBuf;
>>>>>> 
>>>>>> bluefish has:
>>>>>> 
>>>>>> struct small_msgbuf {
>>>>>> long mtype;
>>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>>>>>> } small_msgp;
>>>>>> 
>>>>>> 
>>>>>> My laptop has nothing at all in /dev/mqueue.
>>>>>> 
>>>>>> So I don't really know what the right thing to do is.  Certainly if
>>>>>> we're going to apply this patch, we should also fix the header.  I
>>>>>> almost think we should *delete* struct msgbuf from the headers, since
>>>>>> it's all kinds of busted, but that will break the NTP build.  Ideally
>>>>>> we would go back in time and remove it from the headers.
>>>>>> 
>>>>>> Libc people, any insight?  We can probably fix the bug without
>>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX
>>>>>> message queues are used.
>>>>> 
>>>>> If it's that outright wrong and always has been, I feel like the old
>>>>> syscall numbers should just be deprecated and new ones assigned.
>>>>> Otherwise, there's no way for userspace to be safe against data
>>>>> corruption when run on older kernels. If there's a new syscall number,
>>>>> libc can just use the new one unconditionally (giving ENOSYS on
>>>>> kernels where it would be broken) or have a x32-specific
>>>>> implementation that makes the old syscall and performs translation if
>>>>> the new one fails with ENOSYS.
>>>> 
>>>> That doesn't really help broken code continue to work reliably, as
>>>> upgrading libc will just pull in the new syscall for a binary that's
>>>> expecting the broken behaviour, unless you do symbol versioning, but
>>>> then it'll just break when you next recompile the code, and there's no
>>>> way for that to be diagnosed given the *application* has to define the
>>>> type. But given it's application-defined I really struggle to see how
>>>> any code out there is actually expecting the current x32 behaviour as
>>>> you'd have to go really out of your way to find out that x32 is broken
>>>> and needs __kernel_long_t. I don't think there's any way around just
>>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of
>>>> cases (maybe 100%).
>>> 
>>> I'm not opposed to "breaking ABI" here because the current syscall
>>> doesn't work unless someone wrote bogus x32-specific code to work
>>> around it being wrong. I don't particularly want to preserve any of
>>> the current behavior.
>>> 
>>> What I am somewhat opposed to is making a situation where an updated
>>> libc can't be safe against getting run on a kernel with a broken
>>> version of the syscall and silently corrupting data. I'm flexible
>>> about how avoiding tha tis achieved.
>> 
>> If we're sufficiently confident that we won't regress anything by
>> fixing the bug, I propose we do the following.  First, we commit a fix
>> that's Jessica's patch plus a fix to struct msghdr, and we mark that
>> for -stable.  Then we commit another patch that removes 'struct
>> msghdr' from uapi entirely, but we don't mark that for -stable.  If
>> people complain about the latter, we revert it.
> 
> Thinking about this more, MIPS n32 is also affected by that header. In
> fact the n32 syscalls currently do the right thing and use the compat
> implementations, so the header is currently out-of-sync with the kernel
> there*. This should be noted when committing the change to msg.h.

Never mind, it seems MIPS n32 is weird and leaves __kernel_long_t as a
normal long despite being an ILP32-on-64-bit ABI, I guess because it's
inherited from IRIX rather than being invented by the GNU world.

Jess


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01 18:27                   ` Jessica Clarke
@ 2020-11-01 21:01                     ` Rich Felker
  2020-11-16  0:55                       ` Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Rich Felker @ 2020-11-01 21:01 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Andy Lutomirski, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On Sun, Nov 01, 2020 at 06:27:10PM +0000, Jessica Clarke wrote:
> On 1 Nov 2020, at 18:15, Jessica Clarke <jrtc27@jrtc27.com> wrote:
> > 
> > On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@kernel.org> wrote:
> >> 
> >> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
> >>> 
> >>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
> >>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
> >>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
> >>>>>> cc: some libc folks
> >>>>>> 
> >>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> >>>>>>> 
> >>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
> >>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
> >>>>>>> the variable-length mtext field so we can't even bend the spec and make
> >>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
> >>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
> >>>>>>> msgrcv respectively.
> >>>>>> 
> >>>>>> This is a mess.
> >>>>>> 
> >>>>>> include/uapi/linux/msg.h has:
> >>>>>> 
> >>>>>> /* message buffer for msgsnd and msgrcv calls */
> >>>>>> struct msgbuf {
> >>>>>>      __kernel_long_t mtype;          /* type of message */
> >>>>>>      char mtext[1];                  /* message text */
> >>>>>> };
> >>>>>> 
> >>>>>> Your test has:
> >>>>>> 
> >>>>>> struct msg_long {
> >>>>>>  long mtype;
> >>>>>>  char mtext[8];
> >>>>>> };
> >>>>>> 
> >>>>>> struct msg_long_ext {
> >>>>>>  struct msg_long msg_long;
> >>>>>>  char mext[4];
> >>>>>> };
> >>>>>> 
> >>>>>> and I'm unclear as to exactly what you're trying to do there with the
> >>>>>> "mext" part.
> >>>>>> 
> >>>>>> POSIX says:
> >>>>>> 
> >>>>>>     The application shall ensure that the argument msgp points to  a  user-
> >>>>>>     defined  buffer that contains first a field of type long specifying the
> >>>>>>     type of the message, and then a data portion that holds the data  bytes
> >>>>>>     of the message. The structure below is an example of what this user-de‐
> >>>>>>     fined buffer might look like:
> >>>>>> 
> >>>>>>         struct mymsg {
> >>>>>>             long   mtype;       /* Message type. */
> >>>>>>             char   mtext[1];    /* Message text. */
> >>>>>>         }
> >>>>>> 
> >>>>>> NTP has this delightful piece of code:
> >>>>>> 
> >>>>>> 44 typedef union {
> >>>>>> 45   struct msgbuf msgp;
> >>>>>> 46   struct {
> >>>>>> 47     long mtype;
> >>>>>> 48     int code;
> >>>>>> 49     struct timeval tv;
> >>>>>> 50   } msgb;
> >>>>>> 51 } MsgBuf;
> >>>>>> 
> >>>>>> bluefish has:
> >>>>>> 
> >>>>>> struct small_msgbuf {
> >>>>>> long mtype;
> >>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE];
> >>>>>> } small_msgp;
> >>>>>> 
> >>>>>> 
> >>>>>> My laptop has nothing at all in /dev/mqueue.
> >>>>>> 
> >>>>>> So I don't really know what the right thing to do is.  Certainly if
> >>>>>> we're going to apply this patch, we should also fix the header.  I
> >>>>>> almost think we should *delete* struct msgbuf from the headers, since
> >>>>>> it's all kinds of busted, but that will break the NTP build.  Ideally
> >>>>>> we would go back in time and remove it from the headers.
> >>>>>> 
> >>>>>> Libc people, any insight?  We can probably fix the bug without
> >>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX
> >>>>>> message queues are used.
> >>>>> 
> >>>>> If it's that outright wrong and always has been, I feel like the old
> >>>>> syscall numbers should just be deprecated and new ones assigned.
> >>>>> Otherwise, there's no way for userspace to be safe against data
> >>>>> corruption when run on older kernels. If there's a new syscall number,
> >>>>> libc can just use the new one unconditionally (giving ENOSYS on
> >>>>> kernels where it would be broken) or have a x32-specific
> >>>>> implementation that makes the old syscall and performs translation if
> >>>>> the new one fails with ENOSYS.
> >>>> 
> >>>> That doesn't really help broken code continue to work reliably, as
> >>>> upgrading libc will just pull in the new syscall for a binary that's
> >>>> expecting the broken behaviour, unless you do symbol versioning, but
> >>>> then it'll just break when you next recompile the code, and there's no
> >>>> way for that to be diagnosed given the *application* has to define the
> >>>> type. But given it's application-defined I really struggle to see how
> >>>> any code out there is actually expecting the current x32 behaviour as
> >>>> you'd have to go really out of your way to find out that x32 is broken
> >>>> and needs __kernel_long_t. I don't think there's any way around just
> >>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of
> >>>> cases (maybe 100%).
> >>> 
> >>> I'm not opposed to "breaking ABI" here because the current syscall
> >>> doesn't work unless someone wrote bogus x32-specific code to work
> >>> around it being wrong. I don't particularly want to preserve any of
> >>> the current behavior.
> >>> 
> >>> What I am somewhat opposed to is making a situation where an updated
> >>> libc can't be safe against getting run on a kernel with a broken
> >>> version of the syscall and silently corrupting data. I'm flexible
> >>> about how avoiding tha tis achieved.
> >> 
> >> If we're sufficiently confident that we won't regress anything by
> >> fixing the bug, I propose we do the following.  First, we commit a fix
> >> that's Jessica's patch plus a fix to struct msghdr, and we mark that
> >> for -stable.  Then we commit another patch that removes 'struct
> >> msghdr' from uapi entirely, but we don't mark that for -stable.  If
> >> people complain about the latter, we revert it.
> > 
> > Thinking about this more, MIPS n32 is also affected by that header. In
> > fact the n32 syscalls currently do the right thing and use the compat
> > implementations, so the header is currently out-of-sync with the kernel
> > there*. This should be noted when committing the change to msg.h.
> 
> Never mind, it seems MIPS n32 is weird and leaves __kernel_long_t as a
> normal long despite being an ILP32-on-64-bit ABI, I guess because it's
> inherited from IRIX rather than being invented by the GNU world.

Yes, the whole __kernel_long_t invention is largely x32-only (maybe
theoretically on aarch64-ilp32 too? if that even really exists?) and
is pretty much entirely a mistake from lacking the proper
infrastructure to do time64 when x32 was introduced (note that n32 has
32-bit old-time_t). I hope effort will be made to keep the same
mistake from creeping into future ilp32-on-64 ABIs if there are any.

Rich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-01 21:01                     ` Rich Felker
@ 2020-11-16  0:55                       ` Jessica Clarke
  2020-12-06  0:01                         ` Jessica Clarke
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-11-16  0:55 UTC (permalink / raw)
  To: Rich Felker
  Cc: Andy Lutomirski, Florian Weimer, linux-x86_64, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 1 Nov 2020, at 21:01, Rich Felker <dalias@libc.org> wrote:
> 
> On Sun, Nov 01, 2020 at 06:27:10PM +0000, Jessica Clarke wrote:
>> On 1 Nov 2020, at 18:15, Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>> 
>>> On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@kernel.org> wrote:
>>>> 
>>>> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
>>>>> 
>>>>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
>>>>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
>>>>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>>>>>>>> cc: some libc folks
>>>>>>>> 
>>>>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>>>>>> 
>>>>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>>>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>>>>>>>> the variable-length mtext field so we can't even bend the spec and make
>>>>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>>>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>>>>>>>> msgrcv respectively.
>>>>>>>> 
>>>>>>>> This is a mess.
>>>>>>>> 
>>>>>>>> include/uapi/linux/msg.h has:
>>>>>>>> 
>>>>>>>> /* message buffer for msgsnd and msgrcv calls */
>>>>>>>> struct msgbuf {
>>>>>>>>     __kernel_long_t mtype;          /* type of message */
>>>>>>>>     char mtext[1];                  /* message text */
>>>>>>>> };
>>>>>>>> 
>>>>>>>> Your test has:
>>>>>>>> 
>>>>>>>> struct msg_long {
>>>>>>>> long mtype;
>>>>>>>> char mtext[8];
>>>>>>>> };
>>>>>>>> 
>>>>>>>> struct msg_long_ext {
>>>>>>>> struct msg_long msg_long;
>>>>>>>> char mext[4];
>>>>>>>> };
>>>>>>>> 
>>>>>>>> and I'm unclear as to exactly what you're trying to do there with the
>>>>>>>> "mext" part.
>>>>>>>> 
>>>>>>>> POSIX says:
>>>>>>>> 
>>>>>>>>    The application shall ensure that the argument msgp points to  a  user-
>>>>>>>>    defined  buffer that contains first a field of type long specifying the
>>>>>>>>    type of the message, and then a data portion that holds the data  bytes
>>>>>>>>    of the message. The structure below is an example of what this user-de‐
>>>>>>>>    fined buffer might look like:
>>>>>>>> 
>>>>>>>>        struct mymsg {
>>>>>>>>            long   mtype;       /* Message type. */
>>>>>>>>            char   mtext[1];    /* Message text. */
>>>>>>>>        }
>>>>>>>> 
>>>>>>>> NTP has this delightful piece of code:
>>>>>>>> 
>>>>>>>> 44 typedef union {
>>>>>>>> 45   struct msgbuf msgp;
>>>>>>>> 46   struct {
>>>>>>>> 47     long mtype;
>>>>>>>> 48     int code;
>>>>>>>> 49     struct timeval tv;
>>>>>>>> 50   } msgb;
>>>>>>>> 51 } MsgBuf;
>>>>>>>> 
>>>>>>>> bluefish has:
>>>>>>>> 
>>>>>>>> struct small_msgbuf {
>>>>>>>> long mtype;
>>>>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>>>>>>>> } small_msgp;
>>>>>>>> 
>>>>>>>> 
>>>>>>>> My laptop has nothing at all in /dev/mqueue.
>>>>>>>> 
>>>>>>>> So I don't really know what the right thing to do is.  Certainly if
>>>>>>>> we're going to apply this patch, we should also fix the header.  I
>>>>>>>> almost think we should *delete* struct msgbuf from the headers, since
>>>>>>>> it's all kinds of busted, but that will break the NTP build.  Ideally
>>>>>>>> we would go back in time and remove it from the headers.
>>>>>>>> 
>>>>>>>> Libc people, any insight?  We can probably fix the bug without
>>>>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX
>>>>>>>> message queues are used.
>>>>>>> 
>>>>>>> If it's that outright wrong and always has been, I feel like the old
>>>>>>> syscall numbers should just be deprecated and new ones assigned.
>>>>>>> Otherwise, there's no way for userspace to be safe against data
>>>>>>> corruption when run on older kernels. If there's a new syscall number,
>>>>>>> libc can just use the new one unconditionally (giving ENOSYS on
>>>>>>> kernels where it would be broken) or have a x32-specific
>>>>>>> implementation that makes the old syscall and performs translation if
>>>>>>> the new one fails with ENOSYS.
>>>>>> 
>>>>>> That doesn't really help broken code continue to work reliably, as
>>>>>> upgrading libc will just pull in the new syscall for a binary that's
>>>>>> expecting the broken behaviour, unless you do symbol versioning, but
>>>>>> then it'll just break when you next recompile the code, and there's no
>>>>>> way for that to be diagnosed given the *application* has to define the
>>>>>> type. But given it's application-defined I really struggle to see how
>>>>>> any code out there is actually expecting the current x32 behaviour as
>>>>>> you'd have to go really out of your way to find out that x32 is broken
>>>>>> and needs __kernel_long_t. I don't think there's any way around just
>>>>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of
>>>>>> cases (maybe 100%).
>>>>> 
>>>>> I'm not opposed to "breaking ABI" here because the current syscall
>>>>> doesn't work unless someone wrote bogus x32-specific code to work
>>>>> around it being wrong. I don't particularly want to preserve any of
>>>>> the current behavior.
>>>>> 
>>>>> What I am somewhat opposed to is making a situation where an updated
>>>>> libc can't be safe against getting run on a kernel with a broken
>>>>> version of the syscall and silently corrupting data. I'm flexible
>>>>> about how avoiding tha tis achieved.
>>>> 
>>>> If we're sufficiently confident that we won't regress anything by
>>>> fixing the bug, I propose we do the following.  First, we commit a fix
>>>> that's Jessica's patch plus a fix to struct msghdr, and we mark that
>>>> for -stable.  Then we commit another patch that removes 'struct
>>>> msghdr' from uapi entirely, but we don't mark that for -stable.  If
>>>> people complain about the latter, we revert it.
>>> 
>>> Thinking about this more, MIPS n32 is also affected by that header. In
>>> fact the n32 syscalls currently do the right thing and use the compat
>>> implementations, so the header is currently out-of-sync with the kernel
>>> there*. This should be noted when committing the change to msg.h.
>> 
>> Never mind, it seems MIPS n32 is weird and leaves __kernel_long_t as a
>> normal long despite being an ILP32-on-64-bit ABI, I guess because it's
>> inherited from IRIX rather than being invented by the GNU world.
> 
> Yes, the whole __kernel_long_t invention is largely x32-only (maybe
> theoretically on aarch64-ilp32 too? if that even really exists?) and
> is pretty much entirely a mistake from lacking the proper
> infrastructure to do time64 when x32 was introduced (note that n32 has
> 32-bit old-time_t). I hope effort will be made to keep the same
> mistake from creeping into future ilp32-on-64 ABIs if there are any.

Ping? Does anyone have further thoughts/are people happy for this to
land?

Jess


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-11-16  0:55                       ` Jessica Clarke
@ 2020-12-06  0:01                         ` Jessica Clarke
  2020-12-06 22:55                           ` Andy Lutomirski
  0 siblings, 1 reply; 26+ messages in thread
From: Jessica Clarke @ 2020-12-06  0:01 UTC (permalink / raw)
  To: Rich Felker, Andy Lutomirski, linux-x86_64
  Cc: Florian Weimer, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	X86 ML, H. Peter Anvin, LKML

On 16 Nov 2020, at 00:55, Jessica Clarke <jrtc27@jrtc27.com> wrote:
> 
> On 1 Nov 2020, at 21:01, Rich Felker <dalias@libc.org> wrote:
>> 
>> On Sun, Nov 01, 2020 at 06:27:10PM +0000, Jessica Clarke wrote:
>>> On 1 Nov 2020, at 18:15, Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>> 
>>>> On 1 Nov 2020, at 18:07, Andy Lutomirski <luto@kernel.org> wrote:
>>>>> 
>>>>> On Sat, Oct 31, 2020 at 6:50 PM Rich Felker <dalias@libc.org> wrote:
>>>>>> 
>>>>>> On Sun, Nov 01, 2020 at 01:27:35AM +0000, Jessica Clarke wrote:
>>>>>>> On 1 Nov 2020, at 01:22, Rich Felker <dalias@libc.org> wrote:
>>>>>>>> On Sat, Oct 31, 2020 at 04:30:44PM -0700, Andy Lutomirski wrote:
>>>>>>>>> cc: some libc folks
>>>>>>>>> 
>>>>>>>>> On Mon, Oct 12, 2020 at 6:45 AM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> POSIX specifies that the first field of the supplied msgp, namely mtype,
>>>>>>>>>> is a long, not a __kernel_long_t, and it's a user-defined struct due to
>>>>>>>>>> the variable-length mtext field so we can't even bend the spec and make
>>>>>>>>>> it a __kernel_long_t even if we wanted to. Thus we must use the compat
>>>>>>>>>> syscalls on x32 to avoid buffer overreads and overflows in msgsnd and
>>>>>>>>>> msgrcv respectively.
>>>>>>>>> 
>>>>>>>>> This is a mess.
>>>>>>>>> 
>>>>>>>>> include/uapi/linux/msg.h has:
>>>>>>>>> 
>>>>>>>>> /* message buffer for msgsnd and msgrcv calls */
>>>>>>>>> struct msgbuf {
>>>>>>>>>    __kernel_long_t mtype;          /* type of message */
>>>>>>>>>    char mtext[1];                  /* message text */
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> Your test has:
>>>>>>>>> 
>>>>>>>>> struct msg_long {
>>>>>>>>> long mtype;
>>>>>>>>> char mtext[8];
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> struct msg_long_ext {
>>>>>>>>> struct msg_long msg_long;
>>>>>>>>> char mext[4];
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> and I'm unclear as to exactly what you're trying to do there with the
>>>>>>>>> "mext" part.
>>>>>>>>> 
>>>>>>>>> POSIX says:
>>>>>>>>> 
>>>>>>>>>   The application shall ensure that the argument msgp points to  a  user-
>>>>>>>>>   defined  buffer that contains first a field of type long specifying the
>>>>>>>>>   type of the message, and then a data portion that holds the data  bytes
>>>>>>>>>   of the message. The structure below is an example of what this user-de‐
>>>>>>>>>   fined buffer might look like:
>>>>>>>>> 
>>>>>>>>>       struct mymsg {
>>>>>>>>>           long   mtype;       /* Message type. */
>>>>>>>>>           char   mtext[1];    /* Message text. */
>>>>>>>>>       }
>>>>>>>>> 
>>>>>>>>> NTP has this delightful piece of code:
>>>>>>>>> 
>>>>>>>>> 44 typedef union {
>>>>>>>>> 45   struct msgbuf msgp;
>>>>>>>>> 46   struct {
>>>>>>>>> 47     long mtype;
>>>>>>>>> 48     int code;
>>>>>>>>> 49     struct timeval tv;
>>>>>>>>> 50   } msgb;
>>>>>>>>> 51 } MsgBuf;
>>>>>>>>> 
>>>>>>>>> bluefish has:
>>>>>>>>> 
>>>>>>>>> struct small_msgbuf {
>>>>>>>>> long mtype;
>>>>>>>>> char mtext[MSQ_QUEUE_SMALL_SIZE];
>>>>>>>>> } small_msgp;
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> My laptop has nothing at all in /dev/mqueue.
>>>>>>>>> 
>>>>>>>>> So I don't really know what the right thing to do is.  Certainly if
>>>>>>>>> we're going to apply this patch, we should also fix the header.  I
>>>>>>>>> almost think we should *delete* struct msgbuf from the headers, since
>>>>>>>>> it's all kinds of busted, but that will break the NTP build.  Ideally
>>>>>>>>> we would go back in time and remove it from the headers.
>>>>>>>>> 
>>>>>>>>> Libc people, any insight?  We can probably fix the bug without
>>>>>>>>> annoying anyone given how lightly x32 is used and how lightly POSIX
>>>>>>>>> message queues are used.
>>>>>>>> 
>>>>>>>> If it's that outright wrong and always has been, I feel like the old
>>>>>>>> syscall numbers should just be deprecated and new ones assigned.
>>>>>>>> Otherwise, there's no way for userspace to be safe against data
>>>>>>>> corruption when run on older kernels. If there's a new syscall number,
>>>>>>>> libc can just use the new one unconditionally (giving ENOSYS on
>>>>>>>> kernels where it would be broken) or have a x32-specific
>>>>>>>> implementation that makes the old syscall and performs translation if
>>>>>>>> the new one fails with ENOSYS.
>>>>>>> 
>>>>>>> That doesn't really help broken code continue to work reliably, as
>>>>>>> upgrading libc will just pull in the new syscall for a binary that's
>>>>>>> expecting the broken behaviour, unless you do symbol versioning, but
>>>>>>> then it'll just break when you next recompile the code, and there's no
>>>>>>> way for that to be diagnosed given the *application* has to define the
>>>>>>> type. But given it's application-defined I really struggle to see how
>>>>>>> any code out there is actually expecting the current x32 behaviour as
>>>>>>> you'd have to go really out of your way to find out that x32 is broken
>>>>>>> and needs __kernel_long_t. I don't think there's any way around just
>>>>>>> technically breaking ABI whilst likely really fixing ABI in 99.999% of
>>>>>>> cases (maybe 100%).
>>>>>> 
>>>>>> I'm not opposed to "breaking ABI" here because the current syscall
>>>>>> doesn't work unless someone wrote bogus x32-specific code to work
>>>>>> around it being wrong. I don't particularly want to preserve any of
>>>>>> the current behavior.
>>>>>> 
>>>>>> What I am somewhat opposed to is making a situation where an updated
>>>>>> libc can't be safe against getting run on a kernel with a broken
>>>>>> version of the syscall and silently corrupting data. I'm flexible
>>>>>> about how avoiding tha tis achieved.
>>>>> 
>>>>> If we're sufficiently confident that we won't regress anything by
>>>>> fixing the bug, I propose we do the following.  First, we commit a fix
>>>>> that's Jessica's patch plus a fix to struct msghdr, and we mark that
>>>>> for -stable.  Then we commit another patch that removes 'struct
>>>>> msghdr' from uapi entirely, but we don't mark that for -stable.  If
>>>>> people complain about the latter, we revert it.
>>>> 
>>>> Thinking about this more, MIPS n32 is also affected by that header. In
>>>> fact the n32 syscalls currently do the right thing and use the compat
>>>> implementations, so the header is currently out-of-sync with the kernel
>>>> there*. This should be noted when committing the change to msg.h.
>>> 
>>> Never mind, it seems MIPS n32 is weird and leaves __kernel_long_t as a
>>> normal long despite being an ILP32-on-64-bit ABI, I guess because it's
>>> inherited from IRIX rather than being invented by the GNU world.
>> 
>> Yes, the whole __kernel_long_t invention is largely x32-only (maybe
>> theoretically on aarch64-ilp32 too? if that even really exists?) and
>> is pretty much entirely a mistake from lacking the proper
>> infrastructure to do time64 when x32 was introduced (note that n32 has
>> 32-bit old-time_t). I hope effort will be made to keep the same
>> mistake from creeping into future ilp32-on-64 ABIs if there are any.
> 
> Ping? Does anyone have further thoughts/are people happy for this to
> land?

Ping?

Jess


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-12-06  0:01                         ` Jessica Clarke
@ 2020-12-06 22:55                           ` Andy Lutomirski
  2023-08-01  0:43                             ` Harald van Dijk
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Lutomirski @ 2020-12-06 22:55 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Rich Felker, Andy Lutomirski, linux-x86_64, Florian Weimer,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>

> Ping?

Can you submit patches implementing my proposal?  One is your existing
patch plus fixing struct msghdr, with Cc: stable@vger.kernel.org at
the bottom.  The second is a removal of struct msghdr from uapi,
moving it into include/inux (no uapi) if needed.  The second should
not cc stable.


--Andy

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2020-12-06 22:55                           ` Andy Lutomirski
@ 2023-08-01  0:43                             ` Harald van Dijk
  2023-08-01  1:38                               ` Jessica Clarke
  2023-08-01  7:15                               ` [PATCH v2] x86: Fix x32 System V message queue syscalls Florian Weimer
  0 siblings, 2 replies; 26+ messages in thread
From: Harald van Dijk @ 2023-08-01  0:43 UTC (permalink / raw)
  To: Andy Lutomirski, Jessica Clarke
  Cc: Rich Felker, linux-x86_64, Florian Weimer, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 06/12/2020 22:55, Andy Lutomirski wrote:
> On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>
> 
>> Ping?
> 
> Can you submit patches implementing my proposal?  One is your existing
> patch plus fixing struct msghdr, with Cc: stable@vger.kernel.org at
> the bottom.  The second is a removal of struct msghdr from uapi,
> moving it into include/inux (no uapi) if needed.  The second should
> not cc stable.

Hi,

This looks like it was forgotten, but it is still needed. Jessica, are 
you interested in submitting the requested change? If not, would it be 
okay if I do so? I have been running this locally for a long time now.

There is one complication that I think has not been mentioned yet: when 
_GNU_SOURCE is defined, glibc does provide a definition of struct msghdr 
in <sys/msg.h> with a field "__syscall_slong_t mtype;". This makes it 
slightly more likely that there is code out there in the wild that works 
fine with current kernels and would be broken by the fix. Given how rare 
x32 is, and how rare message queues are, this may still be acceptable, 
but I am mentioning it just in case this would cause a different 
approach to be preferred. And whatever is done, a fix should also be 
submitted to glibc.

(musl define struct msghdr as well, but defines mtype unconditionally as 
having type long, so if this approach is still preferred, needs no changes.)

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01  0:43                             ` Harald van Dijk
@ 2023-08-01  1:38                               ` Jessica Clarke
  2023-08-01  2:53                                 ` Rich Felker
                                                   ` (2 more replies)
  2023-08-01  7:15                               ` [PATCH v2] x86: Fix x32 System V message queue syscalls Florian Weimer
  1 sibling, 3 replies; 26+ messages in thread
From: Jessica Clarke @ 2023-08-01  1:38 UTC (permalink / raw)
  To: Harald van Dijk
  Cc: Andy Lutomirski, Rich Felker, linux-x86_64, Florian Weimer,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On 1 Aug 2023, at 01:43, Harald van Dijk <harald@gigawatt.nl> wrote:
> 
> On 06/12/2020 22:55, Andy Lutomirski wrote:
>> On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>> 
>>> Ping?
>> Can you submit patches implementing my proposal?  One is your existing
>> patch plus fixing struct msghdr, with Cc: stable@vger.kernel.org at
>> the bottom.  The second is a removal of struct msghdr from uapi,
>> moving it into include/inux (no uapi) if needed.  The second should
>> not cc stable.
> 
> Hi,
> 
> This looks like it was forgotten, but it is still needed. Jessica, are you interested in submitting the requested change? If not, would it be okay if I do so? I have been running this locally for a long time now.

Hi,
Please feel free to; sorry that it dropped off my radar. Part of the
issue is my laptop no longer being x86, making it more annoying to test.

> There is one complication that I think has not been mentioned yet: when _GNU_SOURCE is defined, glibc does provide a definition of struct msghdr in <sys/msg.h> with a field "__syscall_slong_t mtype;". This makes it slightly more likely that there is code out there in the wild that works fine with current kernels and would be broken by the fix. Given how rare x32 is, and how rare message queues are, this may still be acceptable, but I am mentioning it just in case this would cause a different approach to be preferred. And whatever is done, a fix should also be submitted to glibc.

Given POSIX is very clear on how msghdr works I think we have to break
whatever oddball code out there might be using this. The alternative is
violating POSIX in a way that makes correct code compile fine but fail
at run time on x32, which is a terrible place to be, especially when
the “fix” is to special-case x32 to go against what POSIX says. I just
can’t see how that’s a good place to stay in, even if something might
break when we fix this bug.

Thanks,
Jess

> (musl define struct msghdr as well, but defines mtype unconditionally as having type long, so if this approach is still preferred, needs no changes.)
> 
> Cheers,
> Harald van Dijk


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01  1:38                               ` Jessica Clarke
@ 2023-08-01  2:53                                 ` Rich Felker
  2023-08-01 12:13                                   ` Harald van Dijk
  2023-09-10 23:33                                 ` [PATCH 1/2] uapi: Stop using __kernel_long_t in struct msgbuf Harald van Dijk
  2023-09-10 23:33                                 ` [PATCH 2/2] uapi: Remove struct msgbuf, struct ipc_kludge Harald van Dijk
  2 siblings, 1 reply; 26+ messages in thread
From: Rich Felker @ 2023-08-01  2:53 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Harald van Dijk, Andy Lutomirski, linux-x86_64, Florian Weimer,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On Tue, Aug 01, 2023 at 02:38:47AM +0100, Jessica Clarke wrote:
> On 1 Aug 2023, at 01:43, Harald van Dijk <harald@gigawatt.nl> wrote:
> > 
> > On 06/12/2020 22:55, Andy Lutomirski wrote:
> >> On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
> >>> 
> >>> Ping?
> >> Can you submit patches implementing my proposal?  One is your existing
> >> patch plus fixing struct msghdr, with Cc: stable@vger.kernel.org at
> >> the bottom.  The second is a removal of struct msghdr from uapi,
> >> moving it into include/inux (no uapi) if needed.  The second should
> >> not cc stable.
> > 
> > Hi,
> > 
> > This looks like it was forgotten, but it is still needed. Jessica,
> > are you interested in submitting the requested change? If not,
> > would it be okay if I do so? I have been running this locally for
> > a long time now.
> 
> Hi,
> Please feel free to; sorry that it dropped off my radar. Part of the
> issue is my laptop no longer being x86, making it more annoying to test.
> 
> > There is one complication that I think has not been mentioned yet:
> > when _GNU_SOURCE is defined, glibc does provide a definition of
> > struct msghdr in <sys/msg.h> with a field "__syscall_slong_t
> > mtype;". This makes it slightly more likely that there is code out
> > there in the wild that works fine with current kernels and would
> > be broken by the fix. Given how rare x32 is, and how rare message
> > queues are, this may still be acceptable, but I am mentioning it
> > just in case this would cause a different approach to be
> > preferred. And whatever is done, a fix should also be submitted to
> > glibc.
> 
> Given POSIX is very clear on how msghdr works I think we have to break
> whatever oddball code out there might be using this. The alternative is
> violating POSIX in a way that makes correct code compile fine but fail
> at run time on x32, which is a terrible place to be, especially when
> the “fix” is to special-case x32 to go against what POSIX says. I just
> can’t see how that’s a good place to stay in, even if something might
> break when we fix this bug.

Absolutely. The application-facing API absolutely needs to have the
type of mtype be whatever long is in the application-facing C ABI.
However, I'm not sure how best to fix this. A fix now still leaves
applications broken on all existing kernels in the wild. This might be
a place where libc should have x32-specific translation code to work
around the wrong kernel ABI that became the contract with the kernel.
I'm not sure how practical this is, since it seems like it would
require a temp buffer. Is the message size sufficiently bounded to
make that reasonable? Should there me a new x32-specific syscall that
takes the right ABI so that translation is only needed on old kernels?

Rich

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01  0:43                             ` Harald van Dijk
  2023-08-01  1:38                               ` Jessica Clarke
@ 2023-08-01  7:15                               ` Florian Weimer
  2023-08-01 12:15                                 ` Harald van Dijk
  1 sibling, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2023-08-01  7:15 UTC (permalink / raw)
  To: Harald van Dijk
  Cc: Andy Lutomirski, Jessica Clarke, Rich Felker, linux-x86_64,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

* Harald van Dijk:

> There is one complication that I think has not been mentioned yet:
> when _GNU_SOURCE is defined, glibc does provide a definition of struct
> msghdr in <sys/msg.h> with a field "__syscall_slong_t mtype;". This
> makes it slightly more likely that there is code out there in the wild
> that works fine with current kernels and would be broken by the
> fix. Given how rare x32 is, and how rare message queues are, this may
> still be acceptable, but I am mentioning it just in case this would
> cause a different approach to be preferred. And whatever is done, a
> fix should also be submitted to glibc.

What should glibc do here?  Just change the definition in the header to
long and ignore the breakage?

Thanks,
Florian


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01  2:53                                 ` Rich Felker
@ 2023-08-01 12:13                                   ` Harald van Dijk
  0 siblings, 0 replies; 26+ messages in thread
From: Harald van Dijk @ 2023-08-01 12:13 UTC (permalink / raw)
  To: Rich Felker, Jessica Clarke
  Cc: Andy Lutomirski, linux-x86_64, Florian Weimer, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML

On 01/08/2023 03:53, Rich Felker wrote:
> On Tue, Aug 01, 2023 at 02:38:47AM +0100, Jessica Clarke wrote:
>> On 1 Aug 2023, at 01:43, Harald van Dijk <harald@gigawatt.nl> wrote:
>>>
>>> On 06/12/2020 22:55, Andy Lutomirski wrote:
>>>> On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>>>>>
>>>>> Ping?
>>>> Can you submit patches implementing my proposal?  One is your existing
>>>> patch plus fixing struct msghdr, with Cc: stable@vger.kernel.org at
>>>> the bottom.  The second is a removal of struct msghdr from uapi,
>>>> moving it into include/inux (no uapi) if needed.  The second should
>>>> not cc stable.
>>>
>>> Hi,
>>>
>>> This looks like it was forgotten, but it is still needed. Jessica,
>>> are you interested in submitting the requested change? If not,
>>> would it be okay if I do so? I have been running this locally for
>>> a long time now.
>>
>> Hi,
>> Please feel free to; sorry that it dropped off my radar. Part of the
>> issue is my laptop no longer being x86, making it more annoying to test.

No worries and thanks, I will do so.

>>> There is one complication that I think has not been mentioned yet:
>>> when _GNU_SOURCE is defined, glibc does provide a definition of
>>> struct msghdr in <sys/msg.h> with a field "__syscall_slong_t
>>> mtype;". This makes it slightly more likely that there is code out
>>> there in the wild that works fine with current kernels and would
>>> be broken by the fix. Given how rare x32 is, and how rare message
>>> queues are, this may still be acceptable, but I am mentioning it
>>> just in case this would cause a different approach to be
>>> preferred. And whatever is done, a fix should also be submitted to
>>> glibc.
>>
>> Given POSIX is very clear on how msghdr works I think we have to break
>> whatever oddball code out there might be using this. The alternative is
>> violating POSIX in a way that makes correct code compile fine but fail
>> at run time on x32, which is a terrible place to be, especially when
>> the “fix” is to special-case x32 to go against what POSIX says. I just
>> can’t see how that’s a good place to stay in, even if something might
>> break when we fix this bug.
> 
> Absolutely. The application-facing API absolutely needs to have the
> type of mtype be whatever long is in the application-facing C ABI.
> However, I'm not sure how best to fix this.

I shall go with Andy's suggested approach. 
<https://lore.kernel.org/all/CALCETrUuBR3Pt_9NhRZTLzjZzwdsS2OPW4U2r31_1Uq-=poRDw@mail.gmail.com/>

>                                             A fix now still leaves
> applications broken on all existing kernels in the wild.

True, but fixing it any other way also leaves applications broken on all 
existing kernels in the wild, and fixing it this way makes it so that 
existing applications that are currently broken start to work 
automatically once people move to new kernels, rather than requiring 
rebuilds.

>                                                          This might be
> a place where libc should have x32-specific translation code to work
> around the wrong kernel ABI that became the contract with the kernel.

The problem is that there are two conflicting contracts, the de jure 
contract and the de facto contract. The de jure contract has always been 
that the field has type "long" and we have seen from the breakage that 
that is what applications have been using already. The de facto contract 
was different, but we do not know of any application that has made use 
of this. We cannot make it so that both work, so it makes sense to me to 
make it so that what we do know is out there works.

> I'm not sure how practical this is, since it seems like it would
> require a temp buffer. Is the message size sufficiently bounded to
> make that reasonable? Should there me a new x32-specific syscall that
> takes the right ABI so that translation is only needed on old kernels?

If a libc wishes to detect the current kernel behaviour and implement a 
workaround, can it technically not also do so without a new syscall by 
just issuing the syscall with a known payload and seeing what comes back?

But personally, I would be happy to leave that as it is now under Andy's 
rationale: "If you run user programs on a buggy kernel, you get buggy 
behavior..."

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01  7:15                               ` [PATCH v2] x86: Fix x32 System V message queue syscalls Florian Weimer
@ 2023-08-01 12:15                                 ` Harald van Dijk
  2023-09-10 23:40                                   ` Harald van Dijk
  0 siblings, 1 reply; 26+ messages in thread
From: Harald van Dijk @ 2023-08-01 12:15 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andy Lutomirski, Jessica Clarke, Rich Felker, linux-x86_64,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On 01/08/2023 08:15, Florian Weimer wrote:
> * Harald van Dijk:
> 
>> There is one complication that I think has not been mentioned yet:
>> when _GNU_SOURCE is defined, glibc does provide a definition of struct
>> msghdr in <sys/msg.h> with a field "__syscall_slong_t mtype;". This
>> makes it slightly more likely that there is code out there in the wild
>> that works fine with current kernels and would be broken by the
>> fix. Given how rare x32 is, and how rare message queues are, this may
>> still be acceptable, but I am mentioning it just in case this would
>> cause a different approach to be preferred. And whatever is done, a
>> fix should also be submitted to glibc.
> 
> What should glibc do here?  Just change the definition in the header to
> long and ignore the breakage?

Yes, I believe so, but perhaps for glibc it might make sense to wait a 
little bit to see if it does indeed go into the kernel in that form. If 
the kernel ends up preferring something else, the change needed in glibc 
might also be different.

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2] uapi: Stop using __kernel_long_t in struct msgbuf
  2023-08-01  1:38                               ` Jessica Clarke
  2023-08-01  2:53                                 ` Rich Felker
@ 2023-09-10 23:33                                 ` Harald van Dijk
  2023-09-10 23:33                                 ` [PATCH 2/2] uapi: Remove struct msgbuf, struct ipc_kludge Harald van Dijk
  2 siblings, 0 replies; 26+ messages in thread
From: Harald van Dijk @ 2023-09-10 23:33 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Rich Felker, linux-x86_64, Florian Weimer,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, Jessica Clarke

struct msgbuf is mostly an internal type: msgsnd and msgrcv have always
been documented as taking a pointer to a user-defined type starting with
a field of type long.  __kernel_long_t is the same as long for all
platforms other than x32, almost all code out there using msgsnd and
msgrcv uses them as documented, and it is impossible to simultaneously
support existing binaries that use the documented API and existing
binaries that use the implemented API. This change breaks existing x32
binaries that use the implemented API, and fixes existing x32 binaries
that use the documented API.

This reverts commit 443d5670f77aab121cb95f45da60f0aad390bcb5.

Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Cc: stable@kernel.org
---
  arch/x86/entry/syscalls/syscall_64.tbl | 6 ++++--
  include/uapi/linux/msg.h               | 2 +-
  2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 1d6eee30eceb2..f5d81fe2c94ca 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -77,8 +77,10 @@
  66	common	semctl			sys_semctl
  67	common	shmdt			sys_shmdt
  68	common	msgget			sys_msgget
-69	common	msgsnd			sys_msgsnd
-70	common	msgrcv			sys_msgrcv
+69	64	msgsnd			sys_msgsnd
+69	x32	msgsnd			compat_sys_msgsnd
+70	64	msgrcv			sys_msgrcv
+70	x32	msgrcv			compat_sys_msgrcv
  71	common	msgctl			sys_msgctl
  72	common	fcntl			sys_fcntl
  73	common	flock			sys_flock
diff --git a/include/uapi/linux/msg.h b/include/uapi/linux/msg.h
index 01ee8d54c1c8a..3881c94d48eae 100644
--- a/include/uapi/linux/msg.h
+++ b/include/uapi/linux/msg.h
@@ -36,7 +36,7 @@ struct msqid_ds {
  
  /* message buffer for msgsnd and msgrcv calls */
  struct msgbuf {
-	__kernel_long_t mtype;          /* type of message */
+	long mtype;                     /* type of message */
  	char mtext[1];                  /* message text */
  };
  
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/2] uapi: Remove struct msgbuf, struct ipc_kludge.
  2023-08-01  1:38                               ` Jessica Clarke
  2023-08-01  2:53                                 ` Rich Felker
  2023-09-10 23:33                                 ` [PATCH 1/2] uapi: Stop using __kernel_long_t in struct msgbuf Harald van Dijk
@ 2023-09-10 23:33                                 ` Harald van Dijk
  2 siblings, 0 replies; 26+ messages in thread
From: Harald van Dijk @ 2023-09-10 23:33 UTC (permalink / raw)
  To: LKML
  Cc: Jessica Clarke, Andy Lutomirski, Rich Felker, linux-x86_64,
	Florian Weimer, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	X86 ML, H. Peter Anvin

struct msgbuf is mostly an internal type: msgsnd and msgrcv have always
been documented as taking a pointer to a user-defined type.  Any code
that relies on the uapi-defined type can trivially be updated to follow
the documentation and define the type itself.

struct ipc_kludge is a workaround for the limited number of arguments
supported in syscalls, references struct msgbuf, so either needs
updating or also needs removing from uapi.  As no libc that I have been
able to find uses it (almost all use an array of long instead, and
dietlibc defines its own struct ipc_kludge without relying on kernel
headers), this commit removes it too from uapi.

Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
---
  include/linux/ipc.h      | 10 ++++++++++
  include/linux/msg.h      |  6 ++++++
  include/uapi/linux/ipc.h | 10 ----------
  include/uapi/linux/msg.h |  6 ------
  4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/ipc.h b/include/linux/ipc.h
index e1c9eea6015b5..b2f8e4ff554ec 100644
--- a/include/linux/ipc.h
+++ b/include/linux/ipc.h
@@ -8,6 +8,16 @@
  #include <uapi/linux/ipc.h>
  #include <linux/refcount.h>
  
+/*
+ * These are used to wrap system calls.
+ *
+ * See architecture code for ugly details..
+ */
+struct ipc_kludge {
+	struct msgbuf __user *msgp;
+	long msgtyp;
+};
+
  /* used by in-kernel data structures */
  struct kern_ipc_perm {
  	spinlock_t	lock;
diff --git a/include/linux/msg.h b/include/linux/msg.h
index 9a972a296b953..46836a6291af2 100644
--- a/include/linux/msg.h
+++ b/include/linux/msg.h
@@ -5,6 +5,12 @@
  #include <linux/list.h>
  #include <uapi/linux/msg.h>
  
+/* message buffer for msgsnd and msgrcv calls */
+struct msgbuf {
+	long mtype;                     /* type of message */
+	char mtext[1];                  /* message text */
+};
+
  /* one msg_msg structure for each message */
  struct msg_msg {
  	struct list_head m_list;
diff --git a/include/uapi/linux/ipc.h b/include/uapi/linux/ipc.h
index 5995fc9d675ea..2c6e543c18fd8 100644
--- a/include/uapi/linux/ipc.h
+++ b/include/uapi/linux/ipc.h
@@ -50,16 +50,6 @@ struct ipc_perm
  #define IPC_64  0x0100  /* New version (support 32-bit UIDs, bigger
  			   message sizes, etc. */
  
-/*
- * These are used to wrap system calls.
- *
- * See architecture code for ugly details..
- */
-struct ipc_kludge {
-	struct msgbuf __user *msgp;
-	long msgtyp;
-};
-
  #define SEMOP		 1
  #define SEMGET		 2
  #define SEMCTL		 3
diff --git a/include/uapi/linux/msg.h b/include/uapi/linux/msg.h
index 3881c94d48eae..901e2952669e6 100644
--- a/include/uapi/linux/msg.h
+++ b/include/uapi/linux/msg.h
@@ -34,12 +34,6 @@ struct msqid_ds {
  /* Include the definition of msqid64_ds */
  #include <asm/msgbuf.h>
  
-/* message buffer for msgsnd and msgrcv calls */
-struct msgbuf {
-	long mtype;                     /* type of message */
-	char mtext[1];                  /* message text */
-};
-
  /* buffer for msgctl calls IPC_INFO, MSG_INFO */
  struct msginfo {
  	int msgpool;
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] x86: Fix x32 System V message queue syscalls
  2023-08-01 12:15                                 ` Harald van Dijk
@ 2023-09-10 23:40                                   ` Harald van Dijk
  0 siblings, 0 replies; 26+ messages in thread
From: Harald van Dijk @ 2023-09-10 23:40 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andy Lutomirski, Jessica Clarke, Rich Felker, linux-x86_64,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, X86 ML,
	H. Peter Anvin, LKML

On 01/08/2023 13:15, Harald van Dijk wrote:
> On 01/08/2023 08:15, Florian Weimer wrote:
>> * Harald van Dijk:
>>
>>> There is one complication that I think has not been mentioned yet:
>>> when _GNU_SOURCE is defined, glibc does provide a definition of struct
>>> msghdr in <sys/msg.h> with a field "__syscall_slong_t mtype;". This
>>> makes it slightly more likely that there is code out there in the wild
>>> that works fine with current kernels and would be broken by the
>>> fix. Given how rare x32 is, and how rare message queues are, this may
>>> still be acceptable, but I am mentioning it just in case this would
>>> cause a different approach to be preferred. And whatever is done, a
>>> fix should also be submitted to glibc.
>>
>> What should glibc do here?  Just change the definition in the header to
>> long and ignore the breakage?
> 
> Yes, I believe so, but perhaps for glibc it might make sense to wait a 
> little bit to see if it does indeed go into the kernel in that form. If 
> the kernel ends up preferring something else, the change needed in glibc 
> might also be different.

With the patches sent just now, following the guidance from Andy in 
<https://lore.kernel.org/all/CALCETrUuBR3Pt_9NhRZTLzjZzwdsS2OPW4U2r31_1Uq-=poRDw@mail.gmail.com/>, 
assuming they get accepted, the right thing to do in glibc would indeed 
be to just change the header to use long. It will not affect anything 
other than x32, and most code avoids that header anyway and defines the 
type manually as per the documentation.

Cheers,
Harald van Dijk

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-09-10 23:42 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-12  1:48 [PATCH] x86: Fix x32 System V message queue syscalls Jessica Clarke
2020-10-12  3:02 ` Andy Lutomirski
2020-10-12  3:31   ` Jessica Clarke
2020-10-12 13:44     ` [PATCH v2] " Jessica Clarke
2020-10-30 19:21       ` Jessica Clarke
2020-10-31 23:30       ` Andy Lutomirski
2020-11-01  0:09         ` Jessica Clarke
2020-11-01  1:22         ` Rich Felker
2020-11-01  1:27           ` Jessica Clarke
2020-11-01  1:50             ` Rich Felker
2020-11-01 18:07               ` Andy Lutomirski
2020-11-01 18:15                 ` Jessica Clarke
2020-11-01 18:27                   ` Jessica Clarke
2020-11-01 21:01                     ` Rich Felker
2020-11-16  0:55                       ` Jessica Clarke
2020-12-06  0:01                         ` Jessica Clarke
2020-12-06 22:55                           ` Andy Lutomirski
2023-08-01  0:43                             ` Harald van Dijk
2023-08-01  1:38                               ` Jessica Clarke
2023-08-01  2:53                                 ` Rich Felker
2023-08-01 12:13                                   ` Harald van Dijk
2023-09-10 23:33                                 ` [PATCH 1/2] uapi: Stop using __kernel_long_t in struct msgbuf Harald van Dijk
2023-09-10 23:33                                 ` [PATCH 2/2] uapi: Remove struct msgbuf, struct ipc_kludge Harald van Dijk
2023-08-01  7:15                               ` [PATCH v2] x86: Fix x32 System V message queue syscalls Florian Weimer
2023-08-01 12:15                                 ` Harald van Dijk
2023-09-10 23:40                                   ` Harald van Dijk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).