* RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
@ 2021-02-14 15:25 Michael Tokarev
  2021-02-14 15:45 ` no-reply
  2021-03-05 17:53 ` Peter Maydell
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Tokarev @ 2021-02-14 15:25 UTC
  To: qemu-devel

Hi!

It has been known for a long time that qemu's linux-user, when invoked through the binfmt-misc
mechanism, does not preserve the original argv[0], so software which relies on argv[0] does
not work under qemu-user.  When run this way, argv[0] of the program running under
qemu-user points to the qemu-user binary instead of being whatever was used to spawn the
original binary.

Recent kernels have a binfmt interpreter flag, P (preserve-argv[0]), which is meant to pass
three leading arguments to the binfmt interpreter: the path to the interpreter itself, the
full path of the binary being run, and the argv[0] used when spawning that binary.
But linux-user/main.c does not handle this situation; it should be prepared for this unusual
way of being invoked.
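
A minimal sketch of how an interpreter might pick apart its argument vector when it is
registered with P. It follows the layout given in the kernel's binfmt-misc documentation
(interpreter, full path of the binary, original argv[0], then the remaining arguments).
The struct and function names are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for what the interpreter needs to know. */
struct binfmt_p_args {
    const char *interp;      /* argv[0]: the registered interpreter path */
    const char *exec_path;   /* argv[1]: full path of the binary to run */
    const char *orig_argv0;  /* argv[2]: argv[0] as supplied by the caller */
    char **rest;             /* argv[3..]: remaining arguments for the binary */
    int nrest;               /* number of entries in rest */
};

/* Returns 0 on success, -1 if the vector is too short for the P layout. */
static int decode_binfmt_p(int argc, char **argv, struct binfmt_p_args *out)
{
    assert(argv != NULL && out != NULL);

    if (argc < 3) {
        return -1;
    }
    out->interp = argv[0];
    out->exec_path = argv[1];
    out->orig_argv0 = argv[2];
    out->rest = argv + 3;
    out->nrest = argc - 3;
    return 0;
}
```

With interpreter /bin/foo and a caller running "blah -x" from /usr/local/bin, the vector
would be { "/bin/foo", "/usr/local/bin/blah", "blah", "-x" }.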

Downstreams use several hackish solutions to this: they introduce a wrapper program solely
for binfmt registration, register it with the P flag, and have it pass the necessary
information on to qemu-user through the qemu-user -0 (argv0) option.
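
For illustration, a wrapper of the kind described above could assemble the qemu-user
command line roughly as follows. This is a sketch, not any downstream's actual wrapper:
build_qemu_argv is a hypothetical name, the incoming vector is assumed to follow the
kernel's documented P layout (interpreter, full path, original argv[0], arguments), and
-0 is the qemu-user option that forces the target's argv[0].

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Build the vector a wrapper would hand to execv():
 *   qemu -0 <original argv[0]> <full path> <remaining args...>
 * Returns a calloc'ed, NULL-terminated array that the caller frees,
 * or NULL if the input is too short or allocation fails.
 */
static char **build_qemu_argv(const char *qemu, int argc, char **argv)
{
    assert(qemu != NULL && argv != NULL);

    if (argc < 3) {
        return NULL;
    }
    /* four fixed slots, then the trailing arguments, then the NULL terminator */
    size_t n = 4 + (size_t)(argc - 3);
    char **out = calloc(n + 1, sizeof(*out));
    if (out == NULL) {
        return NULL;
    }
    out[0] = (char *)qemu;
    out[1] = "-0";
    out[2] = argv[2];            /* original argv[0] */
    out[3] = argv[1];            /* full path of the binary */
    for (int i = 3; i < argc; i++) {
        out[i + 1] = argv[i];    /* remaining arguments, order preserved */
    }
    return out;
}
```

The extra execv() from the wrapper into qemu is exactly the step that makes such a setup
awkward to combine with the F flag.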

But these wrappers introduce a different issue: since the wrapper executes the qemu binary,
we can no longer use the F binfmt-misc flag without copying the qemu-user binary into every
foreign chroot it is used with.

So a possible solution is to make qemu-user itself aware of this in-kernel binfmt mechanism,
which I have implemented for Debian for now, as a guinea pig :)

Since the original problem is the proper handling of programs which depend on their own
name in argv[0], the proposed solution is also based on the program name: this time
it is the name under which the qemu-user binary is called.

I introduced a special name for qemu-user binaries, to be used _only_ for binfmt registration.
In my case this is /usr/libexec/qemu-binfmt/foo-binfmt-P, where "foo" is the architecture
name and "-binfmt-P" is a literal suffix. This name is just a (sym)link to /usr/bin/qemu-foo:
an alternative name for qemu-foo, nothing more.

I also added a patch for linux-user/main.c which checks the suffix of its argv[0]: if it ends
with the literal "-binfmt-P", we assume we are being called from the in-kernel binfmt-misc
subsystem with the P flag enabled, which means the first 3 args are our own name, the
intended binary's path, and the original argv[0], and the rest are the arguments for the
binary to run.
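
The suffix test can also be written with an explicit helper, which avoids depending on
sizeof("binfmt-P") happening to equal strlen("-binfmt-P"). A minimal sketch; ends_with and
invoked_as_binfmt_p are hypothetical names, not functions from QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if s ends with suffix. */
static bool ends_with(const char *s, const char *suffix)
{
    assert(s != NULL && suffix != NULL);

    size_t slen = strlen(s);
    size_t suffix_len = strlen(suffix);

    if (slen < suffix_len) {
        return false;
    }
    return strcmp(s + slen - suffix_len, suffix) == 0;
}

/* True if the invocation name carries the "-binfmt-P" suffix. */
static bool invoked_as_binfmt_p(const char *argv0)
{
    return ends_with(argv0, "-binfmt-P");
}
```

Unlike a strict "longer than the suffix" test, this version also accepts a name that
consists of the suffix alone; whether that matters is a design choice.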

At first I thought it was a hackish approach, and said so in a comment in the code,
but the more I think about it, the more I like it and the more sense it makes.

Here's the patch I used in Debian (it is not intended for upstream for now), for comments.
What do you think? At least it seems like a good step in the right direction, finally.. :)

And there is still another issue in the same area. Some programs re-execute themselves via
/proc/self/exe. This does not work under linux-user either, since this link, again, points to
the qemu binary, not the original binary being run. But that is a different story.

Thanks,

/mjt

Subject: [PATCH, HACK]: linux-user: handle binfmt-misc P flag as a separate exe name
From: Michael Tokarev <mjt@tls.msk.ru>
Date: Sat, 13 Feb 2021 13:57:52 +0300

A hackish way to detect the case when the qemu-user binary is executed
by the in-kernel binfmt-misc subsystem with the P flag (preserve argv[0]).
We register the binfmt interpreter under the name
/usr/libexec/qemu-binfmt/foo-binfmt-P (which is just a symlink to
../../bin/qemu-foo), and if run like that, the qemu-user binary will
"know" it should interpret argv[1] & argv[2] in a special way.

diff --git a/linux-user/main.c b/linux-user/main.c
index 24d1eb73ad..5596dab9be 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -560,6 +560,27 @@ static int parse_args(int argc, char **argv)
          }
      }

+    /* HACK alert.
+     * when run as an interpreter using kernel's binfmt-misc mechanism,
+     * we have to know where are we (our own binary), where's the binary being run,
+     * and what it's argv[0] element.
+     * Only with the P interpreter flag kernel passes all 3 elements as first 3 argv[],
+     * but we can't distinguish if we were run with or without this P flag.
+     * So we register a special name with binfmt-misc system, a name which ends up
+     * in "-binfmt-P", and if our argv[0] ends up with that, we assume we were run
+     * from kernel's binfmt with P flag and our first 3 args are from kernel.
+     */
+    if (strlen(argv[0]) > sizeof("binfmt-P") &&
+        strcmp(argv[0] + strlen(argv[0]) - sizeof("binfmt-P"), "-binfmt-P") == 0) {
+        if (argc < 3) {
+            (void) fprintf(stderr, "qemu: %s has to be run using kernel binfmt-misc subsystem\n", argv[0]);
+            exit(EXIT_FAILURE);
+        }
+        handle_arg_argv0(argv[2]);
+        exec_path = argv[1];
+        return 2;
+    }
+
      optind = 1;
      for (;;) {
          if (optind >= argc) {



* Re: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
  2021-02-14 15:25 RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc Michael Tokarev
@ 2021-02-14 15:45 ` no-reply
  2021-03-05 17:53 ` Peter Maydell
  1 sibling, 0 replies; 5+ messages in thread
From: no-reply @ 2021-02-14 15:45 UTC
  To: mjt; +Cc: qemu-devel

Patchew URL: https://patchew.org/QEMU/27dfe8eb-adce-8db4-f28b-c42858b086db@msgid.tls.msk.ru/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 27dfe8eb-adce-8db4-f28b-c42858b086db@msgid.tls.msk.ru
Subject: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   eac92d3..f4ceebd  master     -> master
 - [tag update]      patchew/1610505995-144129-1-git-send-email-lei.rao@intel.com -> patchew/1610505995-144129-1-git-send-email-lei.rao@intel.com
 - [tag update]      patchew/20201008043105.21058-1-chengang@emindsoft.com.cn -> patchew/20201008043105.21058-1-chengang@emindsoft.com.cn
 - [tag update]      patchew/20201011195001.3219730-1-f4bug@amsat.org -> patchew/20201011195001.3219730-1-f4bug@amsat.org
 - [tag update]      patchew/20210131061849.12615-1-vfazio@xes-inc.com -> patchew/20210131061849.12615-1-vfazio@xes-inc.com
 - [tag update]      patchew/20210131061948.15990-1-vfazio@xes-inc.com -> patchew/20210131061948.15990-1-vfazio@xes-inc.com
 - [tag update]      patchew/20210201155922.GA18291@ls3530.fritz.box -> patchew/20210201155922.GA18291@ls3530.fritz.box
 - [tag update]      patchew/20210201220551.GA8015@ls3530.fritz.box -> patchew/20210201220551.GA8015@ls3530.fritz.box
 - [tag update]      patchew/20210204153925.2030606-1-Jason@zx2c4.com -> patchew/20210204153925.2030606-1-Jason@zx2c4.com
 - [tag update]      patchew/20210210171537.32932-1-david@redhat.com -> patchew/20210210171537.32932-1-david@redhat.com
 - [tag update]      patchew/20210213130325.14781-1-alex.bennee@linaro.org -> patchew/20210213130325.14781-1-alex.bennee@linaro.org
 * [new tag]         patchew/27dfe8eb-adce-8db4-f28b-c42858b086db@msgid.tls.msk.ru -> patchew/27dfe8eb-adce-8db4-f28b-c42858b086db@msgid.tls.msk.ru
Switched to a new branch 'test'
53c7840 RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc

=== OUTPUT BEGIN ===
WARNING: Block comments use a leading /* on a separate line
#85: FILE: linux-user/main.c:564:
+    /* HACK alert.

WARNING: line over 80 characters
#87: FILE: linux-user/main.c:566:
+     * we have to know where are we (our own binary), where's the binary being run,

WARNING: line over 80 characters
#89: FILE: linux-user/main.c:568:
+     * Only with the P interpreter flag kernel passes all 3 elements as first 3 argv[],

WARNING: line over 80 characters
#91: FILE: linux-user/main.c:570:
+     * So we register a special name with binfmt-misc system, a name which ends up

WARNING: line over 80 characters
#92: FILE: linux-user/main.c:571:
+     * in "-binfmt-P", and if our argv[0] ends up with that, we assume we were run

WARNING: line over 80 characters
#96: FILE: linux-user/main.c:575:
+        strcmp(argv[0] + strlen(argv[0]) - sizeof("binfmt-P"), "-binfmt-P") == 0) {

ERROR: line over 90 characters
#98: FILE: linux-user/main.c:577:
+            (void) fprintf(stderr, "qemu: %s has to be run using kernel binfmt-misc subsystem\n", argv[0]);

ERROR: Missing Signed-off-by: line(s)

total: 2 errors, 6 warnings, 27 lines checked

Commit 53c7840551a1 (RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/27dfe8eb-adce-8db4-f28b-c42858b086db@msgid.tls.msk.ru/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
  2021-02-14 15:25 RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc Michael Tokarev
  2021-02-14 15:45 ` no-reply
@ 2021-03-05 17:53 ` Peter Maydell
  2021-03-05 18:07   ` Laurent Vivier
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2021-03-05 17:53 UTC
  To: Michael Tokarev, Laurent Vivier; +Cc: QEMU Developers

On Sun, 14 Feb 2021 at 15:34, Michael Tokarev <mjt@tls.msk.ru> wrote:
> It has been known for a long time that qemu's linux-user, when invoked through the binfmt-misc
> mechanism, does not preserve the original argv[0], so software which relies on argv[0] does
> not work under qemu-user.  When run this way, argv[0] of the program running under
> qemu-user points to the qemu-user binary instead of being whatever was used to spawn the
> original binary.
>
> Recent kernels have a binfmt interpreter flag, P (preserve-argv[0]), which is meant to pass
> three leading arguments to the binfmt interpreter: the path to the interpreter itself, the
> full path of the binary being run, and the argv[0] used when spawning that binary.
> But linux-user/main.c does not handle this situation; it should be prepared for this unusual
> way of being invoked.
>
> Downstreams use several hackish solutions to this: they introduce a wrapper program solely
> for binfmt registration, register it with the P flag, and have it pass the necessary
> information on to qemu-user through the qemu-user -0 (argv0) option.
>
> But these wrappers introduce a different issue: since the wrapper executes the qemu binary,
> we can no longer use the F binfmt-misc flag without copying the qemu-user binary into every
> foreign chroot it is used with.
>
> So a possible solution is to make qemu-user itself aware of this in-kernel binfmt mechanism,
> which I have implemented for Debian for now, as a guinea pig :)

I've always felt that the fundamental problem is that the kernel has never
provided any way for the binfmt handler to know in what way it is being
invoked. So you can't have a handler that backwards-compatibly says "if the
user/distro/whatever installed me with the P flag then I should expect my
arguments like this, but if it didn't then I should do the other thing".

> Since the original problem is the proper handling of programs which depend on their own
> name in argv[0], the proposed solution is also based on the program name: this time
> it is the name under which the qemu-user binary is called.
>
> I introduced a special name for qemu-user binaries, to be used _only_ for binfmt registration.
> In my case this is /usr/libexec/qemu-binfmt/foo-binfmt-P, where "foo" is the architecture
> name and "-binfmt-P" is a literal suffix. This name is just a (sym)link to /usr/bin/qemu-foo:
> an alternative name for qemu-foo, nothing more.

Mmm, you can work around the kernel's missing feature by using a particular
naming convention. I guess that's better than nothing but I think that if
we want to go this route we should try to get buy-in from more than one
distro that this is the right way to do it...

Alternatively, if anybody has a bright idea for how to get the kernel
to tell us how it's invoking us (ELF auxv entry???) maybe we could make
a proposal to the kernel folks.

thanks
-- PMM



* Re: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
  2021-03-05 17:53 ` Peter Maydell
@ 2021-03-05 18:07   ` Laurent Vivier
  2021-03-05 18:15     ` Peter Maydell
  0 siblings, 1 reply; 5+ messages in thread
From: Laurent Vivier @ 2021-03-05 18:07 UTC
  To: Peter Maydell, Michael Tokarev; +Cc: QEMU Developers

On 05/03/2021 at 18:53, Peter Maydell wrote:
> On Sun, 14 Feb 2021 at 15:34, Michael Tokarev <mjt@tls.msk.ru> wrote:
>> It has been known for a long time that qemu's linux-user, when invoked through the binfmt-misc
>> mechanism, does not preserve the original argv[0], so software which relies on argv[0] does
>> not work under qemu-user.  When run this way, argv[0] of the program running under
>> qemu-user points to the qemu-user binary instead of being whatever was used to spawn the
>> original binary.
>>
>> Recent kernels have a binfmt interpreter flag, P (preserve-argv[0]), which is meant to pass
>> three leading arguments to the binfmt interpreter: the path to the interpreter itself, the
>> full path of the binary being run, and the argv[0] used when spawning that binary.
>> But linux-user/main.c does not handle this situation; it should be prepared for this unusual
>> way of being invoked.
>>
>> Downstreams use several hackish solutions to this: they introduce a wrapper program solely
>> for binfmt registration, register it with the P flag, and have it pass the necessary
>> information on to qemu-user through the qemu-user -0 (argv0) option.
>>
>> But these wrappers introduce a different issue: since the wrapper executes the qemu binary,
>> we can no longer use the F binfmt-misc flag without copying the qemu-user binary into every
>> foreign chroot it is used with.
>>
>> So a possible solution is to make qemu-user itself aware of this in-kernel binfmt mechanism,
>> which I have implemented for Debian for now, as a guinea pig :)
> 
> I've always felt that the fundamental problem is that the kernel has never
> provided any way for the binfmt handler to know in what way it is being
> invoked. So you can't have a handler that backwards-compatibly says "if the
> user/distro/whatever installed me with the P flag then I should expect my
> arguments like this, but if it didn't then I should do the other thing".
> 
>> Since the original problem is the proper handling of programs which depend on their own
>> name in argv[0], the proposed solution is also based on the program name: this time
>> it is the name under which the qemu-user binary is called.
>>
>> I introduced a special name for qemu-user binaries, to be used _only_ for binfmt registration.
>> In my case this is /usr/libexec/qemu-binfmt/foo-binfmt-P, where "foo" is the architecture
>> name and "-binfmt-P" is a literal suffix. This name is just a (sym)link to /usr/bin/qemu-foo:
>> an alternative name for qemu-foo, nothing more.
> 
> Mmm, you can work around the kernel's missing feature by using a particular
> naming convention. I guess that's better than nothing but I think that if
> we want to go this route we should try to get buy-in from more than one
> distro that this is the right way to do it...
> 
> Alternatively, if anybody has a bright idea for how to get the kernel
> to tell us how it's invoking us (ELF auxv entry???) maybe we could make
> a proposal to the kernel folks.

My patch has been merged in v5.12:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2347961b11d4079deace3c81dceed460c08a8fc1

And I will push the qemu part soon:

https://patchew.org/QEMU/20210222105004.1642234-1-laurent@vivier.eu/
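
One way an interpreter could consume the flags exported by that kernel change is through
the ELF auxiliary vector. The sketch below assumes the kernel reports preserve-argv[0] as
bit 0 of AT_FLAGS; the macro is defined locally under that assumption and the function
names are hypothetical, so check both against the kernel headers and the QEMU series
linked above before relying on them.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* Assumption: bit 0 of AT_FLAGS means "registered with P"; verify against the kernel UAPI. */
#ifndef AT_FLAGS_PRESERVE_ARGV0
#define AT_FLAGS_PRESERVE_ARGV0 (1UL << 0)
#endif

/* Pure helper so the bit test can be exercised without a binfmt-misc setup. */
static bool flags_preserve_argv0(unsigned long at_flags)
{
    return (at_flags & AT_FLAGS_PRESERVE_ARGV0) != 0;
}

/* Ask the running process's auxiliary vector whether argv[0] was preserved. */
static bool kernel_preserved_argv0(void)
{
    return flags_preserve_argv0(getauxval(AT_FLAGS));
}
```

With a check like this the interpreter no longer has to infer the calling convention from
its own name: on kernels that do not export the flag the bit is simply not set, and the
legacy argument layout applies.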

Thanks,
Laurent




* Re: RFC: linux-user: preserving argv[0] of the original binary in context of binfmt-misc
  2021-03-05 18:07   ` Laurent Vivier
@ 2021-03-05 18:15     ` Peter Maydell
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2021-03-05 18:15 UTC
  To: Laurent Vivier; +Cc: Michael Tokarev, QEMU Developers

On Fri, 5 Mar 2021 at 18:07, Laurent Vivier <laurent@vivier.eu> wrote:
> On 05/03/2021 at 18:53, Peter Maydell wrote:
> > Alternatively, if anybody has a bright idea for how to get the kernel
> > to tell us how it's invoking us (ELF auxv entry???) maybe we could make
> > a proposal to the kernel folks.
>
> My patch has been merged in v5.12:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2347961b11d4079deace3c81dceed460c08a8fc1

Ha! I must have seen that before and forgotten about it but
my subconscious still remembered the part about the ELF auxv...

thanks
-- PMM

