xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Juergen Gross <jgross@suse.com>
To: Julien Grall <julien@xen.org>, xen-devel@lists.xenproject.org
Cc: Ian Jackson <iwj@xenproject.org>, Wei Liu <wl@xen.org>
Subject: Re: [PATCH] tools/xenstore: claim resources when running as daemon
Date: Mon, 17 May 2021 08:47:13 +0200	[thread overview]
Message-ID: <fe5f1e6a-1a89-ea12-feb5-318f25d4281f@suse.com> (raw)
In-Reply-To: <1e38cce0-6960-ac21-b349-dac8551e23ed@xen.org>


[-- Attachment #1.1.1: Type: text/plain, Size: 5490 bytes --]

On 14.05.21 22:19, Julien Grall wrote:
> Hi Juergen,
> 
> On 14/05/2021 09:41, Juergen Gross wrote:
>> Xenstored is absolutely mandatory for a Xen host and it can't be
>> restarted, so being killed by OOM-killer in case of memory shortage is
>> to be avoided.
>>
>> Set /proc/$pid/oom_score_adj (if available) to -500 in order to allow
>> xenstored to use large amounts of memory without being killed.
>>
>> In order to support large numbers of domains the limit for open file
>> descriptors might need to be raised. Each domain needs 2 file
>> descriptors (one for the event channel and one for the xl per-domain
>> daemon to connect to xenstored).
> 
> Hmmm... AFAICT there is only one file descriptor to handle all the event 
> channels. Could you point out the code showing one event FD per domain?

I have let me fooled by just counting the file descriptors used with one
or two domain active.

So you are right that all event channels only use one fd, but each xl
daemon will use two (which should be fixed, IMO). And thinking more
about it it is even worse: each qemu process will at least require one
additional fd.

> 
>>
>> Try to raise ulimit for open files to 65536. First the hard limit if
>> needed, and then the soft limit.
> 
> I am not sure this is right to impose this limit to everyone. For 
> instance, one admin may know that there will be no more than 100 domains 
> on its system.

Is setting a higher limit really a problem?

> So the admin should be able to configure them. At this point, I think 
> the two limit should be set my the initscript rather than xenstored itself.

But the admin would need to know the Xen internals for selecting the
correct limits. In the end I'd be fine with moving this modification to
the script starting Xenstore (which would be launch-xenstore), but the
configuration item should be "max number of domains to support".

> 
> This would also avoid the problem where Xenstored is not allowed to 
> modify its limit (see more below).
> 
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   tools/xenstore/xenstored_core.c   |  2 ++
>>   tools/xenstore/xenstored_core.h   |  3 ++
>>   tools/xenstore/xenstored_minios.c |  4 +++
>>   tools/xenstore/xenstored_posix.c  | 46 +++++++++++++++++++++++++++++++
>>   4 files changed, 55 insertions(+)
>>
>> diff --git a/tools/xenstore/xenstored_core.c 
>> b/tools/xenstore/xenstored_core.c
>> index b66d119a98..964e693450 100644
>> --- a/tools/xenstore/xenstored_core.c
>> +++ b/tools/xenstore/xenstored_core.c
>> @@ -2243,6 +2243,8 @@ int main(int argc, char *argv[])
>>           xprintf = trace;
>>   #endif
>> +    claim_resources();
>> +
>>       signal(SIGHUP, trigger_reopen_log);
>>       if (tracefile)
>>           tracefile = talloc_strdup(NULL, tracefile);
>> diff --git a/tools/xenstore/xenstored_core.h 
>> b/tools/xenstore/xenstored_core.h
>> index 1467270476..ac26973648 100644
>> --- a/tools/xenstore/xenstored_core.h
>> +++ b/tools/xenstore/xenstored_core.h
>> @@ -255,6 +255,9 @@ void daemonize(void);
>>   /* Close stdin/stdout/stderr to complete daemonize */
>>   void finish_daemonize(void);
>> +/* Set OOM-killer score and raise ulimit. */
>> +void claim_resources(void);
>> +
>>   /* Open a pipe for signal handling */
>>   void init_pipe(int reopen_log_pipe[2]);
>> diff --git a/tools/xenstore/xenstored_minios.c 
>> b/tools/xenstore/xenstored_minios.c
>> index c94493e52a..df8ff580b0 100644
>> --- a/tools/xenstore/xenstored_minios.c
>> +++ b/tools/xenstore/xenstored_minios.c
>> @@ -32,6 +32,10 @@ void finish_daemonize(void)
>>   {
>>   }
>> +void claim_resources(void)
>> +{
>> +}
>> +
>>   void init_pipe(int reopen_log_pipe[2])
>>   {
>>       reopen_log_pipe[0] = -1;
>> diff --git a/tools/xenstore/xenstored_posix.c 
>> b/tools/xenstore/xenstored_posix.c
>> index 48c37ffe3e..0074fbd8b2 100644
>> --- a/tools/xenstore/xenstored_posix.c
>> +++ b/tools/xenstore/xenstored_posix.c
>> @@ -22,6 +22,7 @@
>>   #include <fcntl.h>
>>   #include <stdlib.h>
>>   #include <sys/mman.h>
>> +#include <sys/resource.h>
>>   #include "utils.h"
>>   #include "xenstored_core.h"
>> @@ -87,6 +88,51 @@ void finish_daemonize(void)
>>       close(devnull);
>>   }
>> +static void avoid_oom_killer(void)
>> +{
>> +    char path[32];
>> +    char val[] = "-500";
>> +    int fd;
>> +
>> +    snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", 
>> (int)getpid());
> 
> This looks Linux specific. How about other OSes?

I don't know whether other OSes have an OOM killer, and if they do, how
to configure it. It is a best effort attempt, after all.

> 
>> +
>> +    fd = open(path, O_WRONLY);
>> +    /* Do nothing if file doesn't exist. */
> 
> Your commit message leads to think that we *must* configure the OOM. If 
> not, then we should not continue. But here, this suggest this is 
> optional. In fact...

I can modify the commit message by adding a "Try to".

> 
>> +    if (fd < 0)
>> +        return;
>> +    /* Ignore errors. */
>> +    write(fd, val, sizeof(val));
> 
> ... xenstored may not be allowed to modify its own parameters. So this 
> would continue silently without the admin necessarily knowning the limit 
> wasn't applied.

I can add a line in the Xenstore log in this regard.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

  reply	other threads:[~2021-05-17  6:47 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14  8:41 Juergen Gross
2021-05-14 20:19 ` Julien Grall
2021-05-17  6:47   ` Juergen Gross [this message]
2021-05-17 15:55     ` Julien Grall
2021-05-18  6:43       ` Juergen Gross
2021-05-18 13:09         ` Julien Grall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fe5f1e6a-1a89-ea12-feb5-318f25d4281f@suse.com \
    --to=jgross@suse.com \
    --cc=iwj@xenproject.org \
    --cc=julien@xen.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    --subject='Re: [PATCH] tools/xenstore: claim resources when running as daemon' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).