All of lore.kernel.org
 help / color / mirror / Atom feed
From: George Dunlap <George.Dunlap@citrix.com>
To: Ian Jackson <Ian.Jackson@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Wei Liu <wei.liu2@citrix.com>
Subject: Re: [PATCH 9/9] libxl: Kill QEMU with "reaper" ruid
Date: Wed, 28 Nov 2018 18:33:31 +0000	[thread overview]
Message-ID: <E01AFF18-E01D-4912-8D1D-D3068B86837B@citrix.com> (raw)
In-Reply-To: <23550.51750.957497.310933@mariner.uk.xensource.com>



> On Nov 28, 2018, at 5:02 PM, Ian Jackson <ian.jackson@citrix.com> wrote:
> 
> George Dunlap writes ("[PATCH 9/9] libxl: Kill QEMU with "reaper" ruid"):
>> Using kill(-1) to killing an untrusted dm process with the real uid
>> equal to the dm_uid isn't guaranteed to succeed: the process in
>> question may be able to kill the reaper process after the setresuid()
>> and before the kill().
> ...
>> +static int libxl__get_reaper_uid(libxl__gc *gc)
>> +{
>> +    struct passwd *user_base, user_pwbuf;
>> +    int ret;
>> +    ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
>> +                                         &user_pwbuf, &user_base);
> 
> ret should be r I think.
> 
> Also I think you need to handle errors properly ?  Ie set and check
> errno.

Don’t I want to pass up the errno values set by the getpwnam functions?

Although I suppose I should also make sure that the uid I return is not zero...

> 
>> const char *libxl__domain_device_model(libxl__gc *gc,
>>                                        const libxl_domain_build_info *info)
>> {
>> @@ -2719,12 +2728,62 @@ void libxl__destroy_device_model(libxl__egc *egc,
> ...
>> -        ret = setresuid(dm_uid, dm_uid, 0);
>> +        fd = open(lockfile, O_RDWR|O_CREAT, 0666);
>> +        if (fd < 0) {
>> +            /* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
>> +            LOGED(ERROR, domid,
>> +                  "unexpected error while trying to open lockfile %s, errno=%d",
>> +                  lockfile, errno);
>> +            goto kill;
> 
> More gotos!  I doubt this error handling is right.
> 
> I'm also not convinced that it is sensible to handle the case where we
> have multiple per-domain uids but no reaper uid.  This turns host
> configuration errors into systems that are less secure than they
> should be in a really obscure way.

At this point, we have a target_uid but no way of getting a lock for reaper_uid.  We have three options:

1. Don’t kill(-1) at all.
2. Try to  kill(-1) with setresuid(target_uid, target_uid, 0)
3. kill(-1) with setresuid(reaper_uid, target_uid, 0) without holding the lock.

#1 means that a rogue qemu will not be destroyed.

#2 means that there’s a race, whereby sometimes the rogue qemu is destroyed, and sometimes the reaper process is destroyed by the rogue qemu first.

#3 means there’s a race, whereby sometimes everything works fine, sometimes both the rogue qemu and *the reaper process from another domain* is destroyed, and sometimes this reaper process is killed by the reaper process from another domain (leaving the rogue qemu alive).

I think #1 is obviously the best option.

> 
>> +        /* Try to lock the file */
>> +        while (flock(fd, LOCK_EX)) {
> 
> CODING_STYLE, no call inside the condition please.

I c&p’d this from libxl_internal.c:libxl__lock_domain_userdata().

I take it you’re referring to the “ERROR HANDLING” section of CODING_STYLE?  It wasn’t obvious to me that refers to while() loops.

I take it you’d prefer "while(true) { rc = flock(); if(!rc) break; …}” instead?

And if I may make a minor suggestion: This is the second time in this series you’ve said “don’t do X” for fairly common code idioms without giving me guidance as to what you’d like to see instead. 

> Overall I think this stuff needs a different error handling approach:
> 
> * We should distinguish expected and reasonable configurations, where
>   we fall back to less secure methods, from other unexpected
>   situations.
> 
> * In other unexpected situations (whether bad host configuration or
>   syscall errors or whatever) we should make a best effort to
>   destroy as much as we can.

Which is what the code does.

> 
> * But crucially in such situations (i) overall destroy ao should
>   return a failure error code (ii) the domain itself should not be
>   destroyed in Xen.  This means that `xl destroy' fails, and can be
>   repeated after the problem is corrected.

This means that in such a situation, we might successfully kill the devicemodel, but leave a zombie domain lying around.

I suppose that might be the least-bad option, as 1) would be more likely to alert the administrator to fix the configuration error, and 2) the domain would hold the domid, such that any unkilled qemu processes wouldn’t be able to cause mischief on other domains.

Probably some of these principles should be in a comment somewhere.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2018-11-28 18:34 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-23 17:14 [PATCH 1/9] libxl: Remove redundant pidpath setting George Dunlap
2018-11-23 17:14 ` [PATCH 2/9] libxl: Move dm user determination logic into a helper function George Dunlap
2018-11-28 15:20   ` Wei Liu
2018-11-28 16:33   ` Ian Jackson
2018-11-30 12:46   ` George Dunlap
2018-11-23 17:14 ` [PATCH 3/9] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN) George Dunlap
2018-11-28 15:32   ` Wei Liu
2018-11-30 12:53     ` George Dunlap
2018-11-30 16:56       ` Wei Liu
2018-11-28 16:34   ` Ian Jackson
2018-11-28 17:02     ` George Dunlap
2018-11-29 12:19       ` Ian Jackson
2018-11-23 17:14 ` [PATCH 4/9] dm_depriv: Describe expected usage of device_model_user parameter George Dunlap
2018-11-28 15:37   ` Wei Liu
2018-11-28 16:36   ` Ian Jackson
2018-11-28 17:27     ` George Dunlap
2018-11-29 12:20       ` Ian Jackson
2018-11-23 17:14 ` [PATCH 5/9] libxl: Do root checks once in libxl__domain_get_device_model_uid George Dunlap
2018-11-28 16:39   ` Ian Jackson
2018-11-28 17:38     ` George Dunlap
2018-11-29 17:09       ` Ian Jackson
2018-11-29 17:43         ` George Dunlap
2018-11-23 17:14 ` [PATCH 6/9] libxl: Move qmp cleanup into devicemodel destroy function George Dunlap
2018-11-28 16:39   ` Ian Jackson
2018-11-23 17:15 ` [PATCH 7/9] libxl: Make killing of device model asynchronous George Dunlap
2018-11-28 16:43   ` Ian Jackson
2018-11-30 16:12     ` George Dunlap
2018-11-30 16:14       ` George Dunlap
2018-11-30 16:22         ` Ian Jackson
2018-11-30 16:22       ` Ian Jackson
2018-11-23 17:15 ` [PATCH 8/9] libxl: Kill QEMU by uid when possible George Dunlap
2018-11-23 17:18   ` George Dunlap
2018-11-28 15:57     ` Anthony PERARD
2018-11-29 11:55       ` Wei Liu
2018-11-29 12:00         ` George Dunlap
2018-11-29 12:26           ` Ian Jackson
2018-11-29 12:38             ` George Dunlap
2018-11-29 17:16               ` Ian Jackson
2018-11-28 16:56   ` Ian Jackson
2018-11-28 17:55     ` George Dunlap
2018-11-29 17:01       ` Ian Jackson
2018-11-30 16:37     ` George Dunlap
2018-11-23 17:15 ` [PATCH 9/9] libxl: Kill QEMU with "reaper" ruid George Dunlap
2018-11-28 17:02   ` Ian Jackson
2018-11-28 18:33     ` George Dunlap [this message]
2018-11-29 11:39       ` George Dunlap
2018-11-29 17:15       ` Ian Jackson
2018-11-28 15:20 ` [PATCH 1/9] libxl: Remove redundant pidpath setting Wei Liu
2018-11-28 16:25 ` Ian Jackson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E01AFF18-E01D-4912-8D1D-D3068B86837B@citrix.com \
    --to=george.dunlap@citrix.com \
    --cc=Ian.Jackson@citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.