All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] tools/xenstore: set resource limits of xenstored
@ 2021-07-30 12:26 Juergen Gross
  2021-07-30 12:26 ` [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux Juergen Gross
  2021-07-30 12:26 ` [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored Juergen Gross
  0 siblings, 2 replies; 15+ messages in thread
From: Juergen Gross @ 2021-07-30 12:26 UTC (permalink / raw)
  To: xen-devel; +Cc: julien, Juergen Gross, Ian Jackson, Wei Liu

Set some limits for xenstored in order to avoid it being killed by
OOM killer, or to run out of file descriptors.

Changes in V3:
- make oom score configurable

Changes in V2:
- split into 2 patches
- set limits from start script

Juergen Gross (2):
  tools/xenstore: set oom score for xenstore daemon on Linux
  tools/xenstore: set open file descriptor limit for xenstored

 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 14 ++++++++++++++
 tools/hotplug/Linux/launch-xenstore.in             |  9 +++++++++
 2 files changed, 23 insertions(+)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux
  2021-07-30 12:26 [PATCH v3 0/2] tools/xenstore: set resource limits of xenstored Juergen Gross
@ 2021-07-30 12:26 ` Juergen Gross
  2021-07-30 13:26   ` Ian Jackson
  2021-07-30 12:26 ` [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored Juergen Gross
  1 sibling, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2021-07-30 12:26 UTC (permalink / raw)
  To: xen-devel; +Cc: julien, Juergen Gross, Ian Jackson, Wei Liu

Xenstored is absolutely mandatory for a Xen host and it can't be
restarted, so being killed by OOM-killer in case of memory shortage is
to be avoided.

Set /proc/$pid/oom_score_adj (if available) per default to -500 (this
translates to 50% of dom0 memory size) in order to allow xenstored to
use large amounts of memory without being killed.

The percentage of dom0 memory above which the oom killer is allowed to
kill xenstored can be set via XENSTORED_OOM_MEM_THRESHOLD in
xencommons.

Make sure the pid file isn't a left-over from a previous run delete it
before starting xenstored.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- set oom score from launch script (Julien Grall)
- split off open file descriptor limit setting (Julien Grall)
V3:
- make oom killer threshold configurable (Julien Grall)
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 7 +++++++
 tools/hotplug/Linux/launch-xenstore.in             | 6 ++++++
 2 files changed, 13 insertions(+)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index 00cf7f91d4..5ad4fe0818 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -48,6 +48,13 @@ XENSTORED_ARGS=
 # Only evaluated if XENSTORETYPE is "daemon".
 #XENSTORED_TRACE=[yes|on|1]
 
+## Type: integer
+## Default: 50
+#
+# Percentage of dom0 memory size the xenstore daemon can use before the
+# OOM killer is allowed to kill it.
+#XENSTORED_OOM_MEM_THRESHOLD=50
+
 ## Type: string
 ## Default: @LIBEXEC@/boot/xenstore-stubdom.gz
 #
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 019f9d6f4d..1747c96065 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -59,11 +59,17 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 		echo "No xenstored found"
 		exit 1
 	}
+	[ -z "$XENSTORED_OOM_MEM_THRESHOLD" ] || XENSTORED_OOM_MEM_THRESHOLD=50
+	XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
+
+	rm -f @XEN_RUN_DIR@/xenstored.pid
 
 	echo -n Starting $XENSTORED...
 	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
 
 	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
+	XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
+	echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
 
 	exit 0
 }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-07-30 12:26 [PATCH v3 0/2] tools/xenstore: set resource limits of xenstored Juergen Gross
  2021-07-30 12:26 ` [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux Juergen Gross
@ 2021-07-30 12:26 ` Juergen Gross
  2021-07-30 13:35   ` Ian Jackson
  1 sibling, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2021-07-30 12:26 UTC (permalink / raw)
  To: xen-devel; +Cc: julien, Juergen Gross, Ian Jackson, Wei Liu

Add a configuration item for the maximum number of domains xenstored
should support and set the limit of open file descriptors accordingly.

For HVM domains there are up to 5 socket connections per domain (2 by
the xl daemon process, and 3 by qemu). So set the ulimit for xenstored
to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
like logging, event channel device, etc.).

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- set ulimit form launch script (Julien Grall)
- split off from original patch (Julien Grall)
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 7 +++++++
 tools/hotplug/Linux/launch-xenstore.in             | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index 5ad4fe0818..2b682415f4 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -32,6 +32,13 @@
 # Changing this requires a reboot to take effect.
 #XENSTORED=@XENSTORED@
 
+## Type: integer
+## Default: 32768
+#
+# Select maximum number of domains supported by xenstored.
+# Only evaluated if XENSTORETYPE is "daemon".
+#XENSTORED_MAX_N_DOMAINS=32768
+
 ## Type: string
 ## Default: ""
 #
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 1747c96065..3f8b33dd32 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -54,6 +54,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 
 [ "$XENSTORETYPE" = "daemon" ] && {
 	[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T @XEN_LOG_DIR@/xenstored-trace.log"
+	[ -z "$XENSTORED_MAX_N_DOMAINS" ] && XENSTORED_MAX_N_DOMAINS=32768
 	[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
 	[ -x "$XENSTORED" ] || {
 		echo "No xenstored found"
@@ -63,6 +64,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
 
 	rm -f @XEN_RUN_DIR@/xenstored.pid
+	N_FILES=$(($XENSTORED_MAX_N_DOMAINS * 5 + 100))
 
 	echo -n Starting $XENSTORED...
 	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
@@ -70,6 +72,7 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
 	XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
 	echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
+	prlimit --pid $XS_PID --nofile=$N_FILES
 
 	exit 0
 }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux
  2021-07-30 12:26 ` [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux Juergen Gross
@ 2021-07-30 13:26   ` Ian Jackson
  2021-08-31 12:20     ` Juergen Gross
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Jackson @ 2021-07-30 13:26 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, julien, Wei Liu

Juergen Gross writes ("[PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux"):
> Xenstored is absolutely mandatory for a Xen host and it can't be
> restarted, so being killed by OOM-killer in case of memory shortage is
> to be avoided.
> 
> Set /proc/$pid/oom_score_adj (if available) per default to -500 (this
> translates to 50% of dom0 memory size) in order to allow xenstored to
> use large amounts of memory without being killed.
...
> +## Type: integer
> +## Default: 50
> +#
> +# Percentage of dom0 memory size the xenstore daemon can use before the
> +# OOM killer is allowed to kill it.
> +#XENSTORED_OOM_MEM_THRESHOLD=50
> +
>  ## Type: string
>  ## Default: @LIBEXEC@/boot/xenstore-stubdom.gz

Thanks for working on this.  I approve of the principle.

I have one question about detail:

>  	}
> +	[ -z "$XENSTORED_OOM_MEM_THRESHOLD" ] || XENSTORED_OOM_MEM_THRESHOLD=50
> +	XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
> +
> +	rm -f @XEN_RUN_DIR@/xenstored.pid
...
> +	XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
> +	echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj

The effect of all this is that the value specified in
XENSTORED_OOM_MEM_THRESHOLD is transformed before being echoed into
/proc, by being multiplied by -10.

Of course an alternative would be to ask the user to specify the
tuneable directly but given its rather more obscure semantics I think
it is reasonable to have this done by the script.

But maybe we could add something to the doc comment ?

Eg
  # (The specified value is multiplied by -10 and echoed into
  # /proc/PID/oom_score_adj.)

?

Thanks,
Ian.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-07-30 12:26 ` [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored Juergen Gross
@ 2021-07-30 13:35   ` Ian Jackson
  2021-07-30 17:14     ` Julien Grall
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Jackson @ 2021-07-30 13:35 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, julien, Wei Liu

Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
> Add a configuration item for the maximum number of domains xenstored
> should support and set the limit of open file descriptors accordingly.
> 
> For HVM domains there are up to 5 socket connections per domain (2 by
> the xl daemon process, and 3 by qemu). So set the ulimit for xenstored
> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
> like logging, event channel device, etc.).
...
> +## Type: integer
> +## Default: 32768
> +#
> +# Select maximum number of domains supported by xenstored.
> +# Only evaluated if XENSTORETYPE is "daemon".
> +#XENSTORED_MAX_N_DOMAINS=32768

I approve of doing something about the fd limit.  I have some qualms
about the documentation.

The documentation doesn't say what happens if this limit is exceeded.
Also the default of 32758 suggests that we actually support that many
domains.  I don't think we do...

I didn't find anything in SUPPORT.md about how many guests we support
but I wouldn't want this setting here to imply full support for 32768
domains.

If you don't want to tackle this can of works, maybe add this:

  # This just controls some resource limits for xenstored; if the
  # limit is exceeded, xenstored will stop being able to function
  # properly for additional guests.  The default value is so large
  # that it won't be exceeded in a supported configuration, but
  # should not be taken to mean that the whole Xen system is
  # guaranteed to work properly with that many guests.

Julien, did you ask for this to be made configurable ?  Having written
the text above, I wonder if it wouldn't just be better to
unconditionally set it to "unlimited" rather than offering footgun
dressed up like a tuneable...

If xenstored does go mad or leadk lots of fds, things are basically
stuffed in anycase.  Having its syscalls fail with EMFILE is not
really going to help.

Ian.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-07-30 13:35   ` Ian Jackson
@ 2021-07-30 17:14     ` Julien Grall
  2021-08-31 12:11       ` Juergen Gross
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2021-07-30 17:14 UTC (permalink / raw)
  To: Ian Jackson, Juergen Gross; +Cc: xen-devel, Wei Liu

Hi Ian,

On 30/07/2021 14:35, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
>> Add a configuration item for the maximum number of domains xenstored
>> should support and set the limit of open file descriptors accordingly.
>>
>> For HVM domains there are up to 5 socket connections per domain (2 by
>> the xl daemon process, and 3 by qemu). So set the ulimit for xenstored
>> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
>> like logging, event channel device, etc.).
> ...
>> +## Type: integer
>> +## Default: 32768
>> +#
>> +# Select maximum number of domains supported by xenstored.
>> +# Only evaluated if XENSTORETYPE is "daemon".
>> +#XENSTORED_MAX_N_DOMAINS=32768
> 
> I approve of doing something about the fd limit.  I have some qualms
> about the documentation.
> 
> The documentation doesn't say what happens if this limit is exceeded.
> Also the default of 32758 suggests that we actually support that many
> domains.  I don't think we do...
> 
> I didn't find anything in SUPPORT.md about how many guests we support
> but I wouldn't want this setting here to imply full support for 32768
> domains.
> 
> If you don't want to tackle this can of works, maybe add this:
> 
>    # This just controls some resource limits for xenstored; if the
>    # limit is exceeded, xenstored will stop being able to function
>    # properly for additional guests.  The default value is so large
>    # that it won't be exceeded in a supported configuration, but
>    # should not be taken to mean that the whole Xen system is
>    # guaranteed to work properly with that many guests.
> 
> Julien, did you ask for this to be made configurable ?  Having written
> the text above, I wonder if it wouldn't just be better to
> unconditionally set it to "unlimited" rather than offering footgun
> dressed up like a tuneable...

So in v1 (see [1]), Juergen wanted to raise the limit. I assumed this 
meant that the default limit (configured by the system may not be enough).

I felt this was wrong to impose an higher limit on everyone when an 
admin may know the maximum number of domains.

By "unlimited", do you mean the calling "ulimit" (or whatever is used 
for configuring FDs) with unlimited?

If so, I would be OK with that. My main was was to move the raising the 
limit outside Xenstored because:
  1) This is easier for an admin to tweak it (in particular the OOM)
  2) It feels wrong to me that the daemon chose the limits
  3) An admin can enforce it

Cheers,

[1] 1e38cce0-6960-ac21-b349-dac8551e23ed@xen.org

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-07-30 17:14     ` Julien Grall
@ 2021-08-31 12:11       ` Juergen Gross
  2021-08-31 12:30         ` Julien Grall
  0 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2021-08-31 12:11 UTC (permalink / raw)
  To: Julien Grall, Ian Jackson; +Cc: xen-devel, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 3302 bytes --]

On 30.07.21 19:14, Julien Grall wrote:
> Hi Ian,
> 
> On 30/07/2021 14:35, Ian Jackson wrote:
>> Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file 
>> descriptor limit for xenstored"):
>>> Add a configuration item for the maximum number of domains xenstored
>>> should support and set the limit of open file descriptors accordingly.
>>>
>>> For HVM domains there are up to 5 socket connections per domain (2 by
>>> the xl daemon process, and 3 by qemu). So set the ulimit for xenstored
>>> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
>>> like logging, event channel device, etc.).
>> ...
>>> +## Type: integer
>>> +## Default: 32768
>>> +#
>>> +# Select maximum number of domains supported by xenstored.
>>> +# Only evaluated if XENSTORETYPE is "daemon".
>>> +#XENSTORED_MAX_N_DOMAINS=32768
>>
>> I approve of doing something about the fd limit.  I have some qualms
>> about the documentation.
>>
>> The documentation doesn't say what happens if this limit is exceeded.
>> Also the default of 32758 suggests that we actually support that many
>> domains.  I don't think we do...
>>
>> I didn't find anything in SUPPORT.md about how many guests we support
>> but I wouldn't want this setting here to imply full support for 32768
>> domains.
>>
>> If you don't want to tackle this can of works, maybe add this:
>>
>>    # This just controls some resource limits for xenstored; if the
>>    # limit is exceeded, xenstored will stop being able to function
>>    # properly for additional guests.  The default value is so large
>>    # that it won't be exceeded in a supported configuration, but
>>    # should not be taken to mean that the whole Xen system is
>>    # guaranteed to work properly with that many guests.
>>
>> Julien, did you ask for this to be made configurable ?  Having written
>> the text above, I wonder if it wouldn't just be better to
>> unconditionally set it to "unlimited" rather than offering footgun
>> dressed up like a tuneable...
> 
> So in v1 (see [1]), Juergen wanted to raise the limit. I assumed this 
> meant that the default limit (configured by the system may not be enough).
> 
> I felt this was wrong to impose an higher limit on everyone when an 
> admin may know the maximum number of domains.
> 
> By "unlimited", do you mean the calling "ulimit" (or whatever is used 
> for configuring FDs) with unlimited?
> 
> If so, I would be OK with that. My main was was to move the raising the 
> limit outside Xenstored because:
>   1) This is easier for an admin to tweak it (in particular the OOM)
>   2) It feels wrong to me that the daemon chose the limits
>   3) An admin can enforce it

Coming back to this series, I'm puzzled now.

Julien, you didn't want me to raise the limit to a specific number
covering the maximum possible number of domains, because you thought
this might result in xenstored hogging huge numbers of file descriptors
in case of a bug. Why is unlimited better then? This will make the
possible number even larger.

I'd really like to know how to come to an acceptable end result soon.
Right now the max number of domains supported by xenstored is just
limited by an arbitrary system wide limit.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux
  2021-07-30 13:26   ` Ian Jackson
@ 2021-08-31 12:20     ` Juergen Gross
  2021-09-13 13:57       ` Ian Jackson
  0 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2021-08-31 12:20 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, julien, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 2147 bytes --]

On 30.07.21 15:26, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux"):
>> Xenstored is absolutely mandatory for a Xen host and it can't be
>> restarted, so being killed by OOM-killer in case of memory shortage is
>> to be avoided.
>>
>> Set /proc/$pid/oom_score_adj (if available) per default to -500 (this
>> translates to 50% of dom0 memory size) in order to allow xenstored to
>> use large amounts of memory without being killed.
> ...
>> +## Type: integer
>> +## Default: 50
>> +#
>> +# Percentage of dom0 memory size the xenstore daemon can use before the
>> +# OOM killer is allowed to kill it.
>> +#XENSTORED_OOM_MEM_THRESHOLD=50
>> +
>>   ## Type: string
>>   ## Default: @LIBEXEC@/boot/xenstore-stubdom.gz
> 
> Thanks for working on this.  I approve of the principle.
> 
> I have one question about detail:
> 
>>   	}
>> +	[ -z "$XENSTORED_OOM_MEM_THRESHOLD" ] || XENSTORED_OOM_MEM_THRESHOLD=50
>> +	XS_OOM_SCORE=-$(($XENSTORED_OOM_MEM_THRESHOLD * 10))
>> +
>> +	rm -f @XEN_RUN_DIR@/xenstored.pid
> ...
>> +	XS_PID=`cat @XEN_RUN_DIR@/xenstored.pid`
>> +	echo $XS_OOM_SCORE >/proc/$XS_PID/oom_score_adj
> 
> The effect of all this is that the value specified in
> XENSTORED_OOM_MEM_THRESHOLD is transformed before being echoed into
> /proc, by being multiplied by -10.

Yes.

> Of course an alternative would be to ask the user to specify the
> tuneable directly but given its rather more obscure semantics I think
> it is reasonable to have this done by the script.

Correct. Otherwise the user would need to know about the oom_score_adj
ABI.

> But maybe we could add something to the doc comment ?
> 
> Eg
>    # (The specified value is multiplied by -10 and echoed into
>    # /proc/PID/oom_score_adj.)
> 
> ?

Why? This is an internal implementation detail. I don't see why the
user needs to know how this is accomplished. What is unclear with the
XENSTORED_OOM_MEM_THRESHOLD semantics as described?

There is no other parameter with an explanation how it's semantics are
being accomplished.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-08-31 12:11       ` Juergen Gross
@ 2021-08-31 12:30         ` Julien Grall
  2021-08-31 12:37           ` Andrew Cooper
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2021-08-31 12:30 UTC (permalink / raw)
  To: Juergen Gross, Ian Jackson; +Cc: xen-devel, Wei Liu

Hi Juergen,

On 31/08/2021 13:11, Juergen Gross wrote:
> On 30.07.21 19:14, Julien Grall wrote:
>> Hi Ian,
>>
>> On 30/07/2021 14:35, Ian Jackson wrote:
>>> Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file 
>>> descriptor limit for xenstored"):
>>>> Add a configuration item for the maximum number of domains xenstored
>>>> should support and set the limit of open file descriptors accordingly.
>>>>
>>>> For HVM domains there are up to 5 socket connections per domain (2 by
>>>> the xl daemon process, and 3 by qemu). So set the ulimit for xenstored
>>>> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
>>>> like logging, event channel device, etc.).
>>> ...
>>>> +## Type: integer
>>>> +## Default: 32768
>>>> +#
>>>> +# Select maximum number of domains supported by xenstored.
>>>> +# Only evaluated if XENSTORETYPE is "daemon".
>>>> +#XENSTORED_MAX_N_DOMAINS=32768
>>>
>>> I approve of doing something about the fd limit.  I have some qualms
>>> about the documentation.
>>>
>>> The documentation doesn't say what happens if this limit is exceeded.
>>> Also the default of 32758 suggests that we actually support that many
>>> domains.  I don't think we do...
>>>
>>> I didn't find anything in SUPPORT.md about how many guests we support
>>> but I wouldn't want this setting here to imply full support for 32768
>>> domains.
>>>
>>> If you don't want to tackle this can of works, maybe add this:
>>>
>>>    # This just controls some resource limits for xenstored; if the
>>>    # limit is exceeded, xenstored will stop being able to function
>>>    # properly for additional guests.  The default value is so large
>>>    # that it won't be exceeded in a supported configuration, but
>>>    # should not be taken to mean that the whole Xen system is
>>>    # guaranteed to work properly with that many guests.
>>>
>>> Julien, did you ask for this to be made configurable ?  Having written
>>> the text above, I wonder if it wouldn't just be better to
>>> unconditionally set it to "unlimited" rather than offering footgun
>>> dressed up like a tuneable...
>>
>> So in v1 (see [1]), Juergen wanted to raise the limit. I assumed this 
>> meant that the default limit (configured by the system may not be 
>> enough).
>>
>> I felt this was wrong to impose an higher limit on everyone when an 
>> admin may know the maximum number of domains.
>>
>> By "unlimited", do you mean the calling "ulimit" (or whatever is used 
>> for configuring FDs) with unlimited?
>>
>> If so, I would be OK with that. My main was was to move the raising 
>> the limit outside Xenstored because:
>>   1) This is easier for an admin to tweak it (in particular the OOM)
>>   2) It feels wrong to me that the daemon chose the limits
>>   3) An admin can enforce it
> 
> Coming back to this series, I'm puzzled now.
> 
> Julien, you didn't want me to raise the limit to a specific number
> covering the maximum possible number of domains, because you thought
> this might result in xenstored hogging huge numbers of file descriptors
> in case of a bug. Why is unlimited better then? This will make the
> possible number even larger.

I don't think I suggested the unlimited number is better... My main 
objection in your original approach is you set an arbitrary limit you in 
Xenstored (which may not apply at all) and don't offer a way to the 
admin to tweak it.

If the limit is set outside of Xenstored, then it becomes much easier 
for someone to just tweak the init script. I don't have a strong opinion 
on whether the default limit should be "unlimited" or a fixed number.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-08-31 12:30         ` Julien Grall
@ 2021-08-31 12:37           ` Andrew Cooper
  2021-08-31 14:22             ` Julien Grall
  2021-08-31 14:22             ` Ian Jackson
  0 siblings, 2 replies; 15+ messages in thread
From: Andrew Cooper @ 2021-08-31 12:37 UTC (permalink / raw)
  To: Julien Grall, Juergen Gross, Ian Jackson; +Cc: xen-devel, Wei Liu

On 31/08/2021 13:30, Julien Grall wrote:
> Hi Juergen,
>
> On 31/08/2021 13:11, Juergen Gross wrote:
>> On 30.07.21 19:14, Julien Grall wrote:
>>> Hi Ian,
>>>
>>> On 30/07/2021 14:35, Ian Jackson wrote:
>>>> Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file
>>>> descriptor limit for xenstored"):
>>>>> Add a configuration item for the maximum number of domains xenstored
>>>>> should support and set the limit of open file descriptors
>>>>> accordingly.
>>>>>
>>>>> For HVM domains there are up to 5 socket connections per domain (2 by
>>>>> the xl daemon process, and 3 by qemu). So set the ulimit for
>>>>> xenstored
>>>>> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
>>>>> like logging, event channel device, etc.).
>>>> ...
>>>>> +## Type: integer
>>>>> +## Default: 32768
>>>>> +#
>>>>> +# Select maximum number of domains supported by xenstored.
>>>>> +# Only evaluated if XENSTORETYPE is "daemon".
>>>>> +#XENSTORED_MAX_N_DOMAINS=32768
>>>>
>>>> I approve of doing something about the fd limit.  I have some qualms
>>>> about the documentation.
>>>>
>>>> The documentation doesn't say what happens if this limit is exceeded.
>>>> Also the default of 32758 suggests that we actually support that many
>>>> domains.  I don't think we do...
>>>>
>>>> I didn't find anything in SUPPORT.md about how many guests we support
>>>> but I wouldn't want this setting here to imply full support for 32768
>>>> domains.
>>>>
>>>> If you don't want to tackle this can of works, maybe add this:
>>>>
>>>>    # This just controls some resource limits for xenstored; if the
>>>>    # limit is exceeded, xenstored will stop being able to function
>>>>    # properly for additional guests.  The default value is so large
>>>>    # that it won't be exceeded in a supported configuration, but
>>>>    # should not be taken to mean that the whole Xen system is
>>>>    # guaranteed to work properly with that many guests.
>>>>
>>>> Julien, did you ask for this to be made configurable ?  Having written
>>>> the text above, I wonder if it wouldn't just be better to
>>>> unconditionally set it to "unlimited" rather than offering footgun
>>>> dressed up like a tuneable...
>>>
>>> So in v1 (see [1]), Juergen wanted to raise the limit. I assumed
>>> this meant that the default limit (configured by the system may not
>>> be enough).
>>>
>>> I felt this was wrong to impose an higher limit on everyone when an
>>> admin may know the maximum number of domains.
>>>
>>> By "unlimited", do you mean the calling "ulimit" (or whatever is
>>> used for configuring FDs) with unlimited?
>>>
>>> If so, I would be OK with that. My main was was to move the raising
>>> the limit outside Xenstored because:
>>>   1) This is easier for an admin to tweak it (in particular the OOM)
>>>   2) It feels wrong to me that the daemon chose the limits
>>>   3) An admin can enforce it
>>
>> Coming back to this series, I'm puzzled now.
>>
>> Julien, you didn't want me to raise the limit to a specific number
>> covering the maximum possible number of domains, because you thought
>> this might result in xenstored hogging huge numbers of file descriptors
>> in case of a bug. Why is unlimited better then? This will make the
>> possible number even larger.
>
> I don't think I suggested the unlimited number is better... My main
> objection in your original approach is you set an arbitrary limit you
> in Xenstored (which may not apply at all) and don't offer a way to the
> admin to tweak it.
>
> If the limit is set outside of Xenstored, then it becomes much easier
> for someone to just tweak the init script. I don't have a strong
> opinion on whether the default limit should be "unlimited" or a fixed
> number.

xenstored is TCB.  It needs a large number of FDs, and can be trusted
with unlimited.

Also, like xenconsoled, we can calculate an upper bound, which is
derived from the ABI limit of 32k domids.

All you're haggling over is the error semantics in the case of:
1) the upper bound calculation is wrong, or
2) there is an fd leak

Personally, I think a fixed calculation is right, so fd leaks can be
spotted more obviously.

An admin knob is not helpful - higher than the upper bound is just
wasteful, while lower will cause malfunctions.

~Andrew



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-08-31 12:37           ` Andrew Cooper
@ 2021-08-31 14:22             ` Julien Grall
  2021-08-31 14:22             ` Ian Jackson
  1 sibling, 0 replies; 15+ messages in thread
From: Julien Grall @ 2021-08-31 14:22 UTC (permalink / raw)
  To: Andrew Cooper, Juergen Gross, Ian Jackson; +Cc: xen-devel, Wei Liu

Hi Andrew,

On 31/08/2021 13:37, Andrew Cooper wrote:
> On 31/08/2021 13:30, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 31/08/2021 13:11, Juergen Gross wrote:
>>> On 30.07.21 19:14, Julien Grall wrote:
>>>> Hi Ian,
>>>>
>>>> On 30/07/2021 14:35, Ian Jackson wrote:
>>>>> Juergen Gross writes ("[PATCH v3 2/2] tools/xenstore: set open file
>>>>> descriptor limit for xenstored"):
>>>>>> Add a configuration item for the maximum number of domains xenstored
>>>>>> should support and set the limit of open file descriptors
>>>>>> accordingly.
>>>>>>
>>>>>> For HVM domains there are up to 5 socket connections per domain (2 by
>>>>>> the xl daemon process, and 3 by qemu). So set the ulimit for
>>>>>> xenstored
>>>>>> to 5 * XENSTORED_MAX_DOMAINS + 100 (the "+ 100" is for some headroom,
>>>>>> like logging, event channel device, etc.).
>>>>> ...
>>>>>> +## Type: integer
>>>>>> +## Default: 32768
>>>>>> +#
>>>>>> +# Select maximum number of domains supported by xenstored.
>>>>>> +# Only evaluated if XENSTORETYPE is "daemon".
>>>>>> +#XENSTORED_MAX_N_DOMAINS=32768
>>>>>
>>>>> I approve of doing something about the fd limit.  I have some qualms
>>>>> about the documentation.
>>>>>
>>>>> The documentation doesn't say what happens if this limit is exceeded.
>>>>> Also the default of 32758 suggests that we actually support that many
>>>>> domains.  I don't think we do...
>>>>>
>>>>> I didn't find anything in SUPPORT.md about how many guests we support
>>>>> but I wouldn't want this setting here to imply full support for 32768
>>>>> domains.
>>>>>
>>>>> If you don't want to tackle this can of works, maybe add this:
>>>>>
>>>>>     # This just controls some resource limits for xenstored; if the
>>>>>     # limit is exceeded, xenstored will stop being able to function
>>>>>     # properly for additional guests.  The default value is so large
>>>>>     # that it won't be exceeded in a supported configuration, but
>>>>>     # should not be taken to mean that the whole Xen system is
>>>>>     # guaranteed to work properly with that many guests.
>>>>>
>>>>> Julien, did you ask for this to be made configurable ?  Having written
>>>>> the text above, I wonder if it wouldn't just be better to
>>>>> unconditionally set it to "unlimited" rather than offering footgun
>>>>> dressed up like a tuneable...
>>>>
>>>> So in v1 (see [1]), Juergen wanted to raise the limit. I assumed
>>>> this meant that the default limit (configured by the system may not
>>>> be enough).
>>>>
>>>> I felt this was wrong to impose an higher limit on everyone when an
>>>> admin may know the maximum number of domains.
>>>>
>>>> By "unlimited", do you mean the calling "ulimit" (or whatever is
>>>> used for configuring FDs) with unlimited?
>>>>
>>>> If so, I would be OK with that. My main was was to move the raising
>>>> the limit outside Xenstored because:
>>>>    1) This is easier for an admin to tweak it (in particular the OOM)
>>>>    2) It feels wrong to me that the daemon chose the limits
>>>>    3) An admin can enforce it
>>>
>>> Coming back to this series, I'm puzzled now.
>>>
>>> Julien, you didn't want me to raise the limit to a specific number
>>> covering the maximum possible number of domains, because you thought
>>> this might result in xenstored hogging huge numbers of file descriptors
>>> in case of a bug. Why is unlimited better then? This will make the
>>> possible number even larger.
>>
>> I don't think I suggested the unlimited number is better... My main
>> objection in your original approach is you set an arbitrary limit you
>> in Xenstored (which may not apply at all) and don't offer a way to the
>> admin to tweak it.
>>
>> If the limit is set outside of Xenstored, then it becomes much easier
>> for someone to just tweak the init script. I don't have a strong
>> opinion on whether the default limit should be "unlimited" or a fixed
>> number.
> 
> xenstored is TCB.  It needs a large number of FDs, and can be trusted
> with unlimited.
> 
> Also, like xenconsoled, we can calculate an upper bound, which is
> derived from the ABI limit of 32k domids.
> 
> All you're haggling over is the error semantics in the case of:
> 1) the upper bound calculation is wrong, or
> 2) there is an fd leak
> 
> Personally, I think a fixed calculation is right, so fd leaks can be
> spotted more obviously. >
> An admin knob is not helpful - higher than the upper bound is just
> wasteful, while lower will cause malfunctions.

A lot of users will never run more than a few handful (maybe hundreds?) 
number of domains... So setting the limit to allow 32K of domains sound 
really wasteful and wouldn't really help to spot FD leaks.

I don't really see the problem to allow an admin to say "I will never 
run more than N domains". So the limit can be lower and would still 
function normally. Do you mind explaining why you think otherwise?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-08-31 12:37           ` Andrew Cooper
  2021-08-31 14:22             ` Julien Grall
@ 2021-08-31 14:22             ` Ian Jackson
  2021-09-01  6:59               ` Juergen Gross
  1 sibling, 1 reply; 15+ messages in thread
From: Ian Jackson @ 2021-08-31 14:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Juergen Gross, xen-devel, Wei Liu

Andrew Cooper writes ("Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
> xenstored is TCB.  It needs a large number of FDs, and can be trusted
> with unlimited.

I baseically agree with this.

> Also, like xenconsoled, we can calculate an upper bound, which is
> derived from the ABI limit of 32k domids.

IMO the default should support at leaat this much.

However, I don't think you are right, because xenstored offers console
connections to (possibly arbitrarily many) local socket connections.

> All you're haggling over is the error semantics in the case of:
> 1) the upper bound calculation is wrong, or
> 2) there is an fd leak
> 
> Personally, I think a fixed calculation is right, so fd leaks can be
> spotted more obviously.
> 
> An admin knob is not helpful - higher than the upper bound is just
> wasteful, while lower will cause malfunctions.

I don't agree.  Firstly, on a technical level, your statement is true
only if the admin does not know they will be running a much smaller
number of domains.  Secondly, we have had several people saying they
want this to be configurable.  I think if several people say they want
something to be configurable, we should respect that, even if we think
the use cases for it are marginal.  If there are hazards in bad
settings of such a know, that can be dealt with in the docs.

Julien's point about not having the limit set by xenstored itself is
very well taken.

ISTM that the following scheme is in the intersection of everyone's
requirements:

 * The limit will be adjusted/imposed in the startup script.
 * An /etc/{default,sysconfig} parameter will be provided to
   adjust the setting.
 * The default should be `unlimtied` since we cannot calculate
   a safe upper bound for all configurations.
 * Systems like Citrix Hypervisor (XenServer) which can calculate
   a safe upper bound can do so, and adjust the default, enabling
   them to spot fd leaks.

Ian.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-08-31 14:22             ` Ian Jackson
@ 2021-09-01  6:59               ` Juergen Gross
  2021-09-13 13:59                 ` Ian Jackson
  0 siblings, 1 reply; 15+ messages in thread
From: Juergen Gross @ 2021-09-01  6:59 UTC (permalink / raw)
  To: Ian Jackson, Andrew Cooper; +Cc: Julien Grall, xen-devel, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 2587 bytes --]

On 31.08.21 16:22, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
>> xenstored is TCB.  It needs a large number of FDs, and can be trusted
>> with unlimited.
> 
> I baseically agree with this.
> 
>> Also, like xenconsoled, we can calculate an upper bound, which is
>> derived from the ABI limit of 32k domids.
> 
> IMO the default should support at leaat this much.
> 
> However, I don't think you are right, because xenstored offers console
> connections to (possibly arbitrarily many) local socket connections.
> 
>> All you're haggling over is the error semantics in the case of:
>> 1) the upper bound calculation is wrong, or
>> 2) there is an fd leak
>>
>> Personally, I think a fixed calculation is right, so fd leaks can be
>> spotted more obviously.
>>
>> An admin knob is not helpful - higher than the upper bound is just
>> wasteful, while lower will cause malfunctions.
> 
> I don't agree.  Firstly, on a technical level, your statement is true
> only if the admin does not know they will be running a much smaller
> number of domains.  Secondly, we have had several people saying they
> want this to be configurable.  I think if several people say they want
> something to be configurable, we should respect that, even if we think
> the use cases for it are marginal.  If there are hazards in bad
> settings of such a know, that can be dealt with in the docs.
> 
> Julien's point about not having the limit set by xenstored itself is
> very well taken.
> 
> ISTM that the following scheme is in the intersection of everyone's
> requirements:
> 
>   * The limit will be adjusted/imposed in the startup script.
>   * An /etc/{default,sysconfig} parameter will be provided to
>     adjust the setting.
>   * The default should be `unlimtied` since we cannot calculate
>     a safe upper bound for all configurations.
>   * Systems like Citrix Hypervisor (XenServer) which can calculate
>     a safe upper bound can do so, and adjust the default, enabling
>     them to spot fd leaks.

Makes sense for me.

So this would mean:
- the sysconfig parameter will no longer be "number of domains", but the
   fd limit of the Xenstore daemon
- default should be "unlimited"
- the comment should mention the current number of fds needed per domain
   (5 for HVM, 2 for PV/PVH) plus some headroom in dom0 (without any
   guest running on my test system 21 fds are active, so I guess a value
   of 50 seems appropriate)

Is this okay?


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux
  2021-08-31 12:20     ` Juergen Gross
@ 2021-09-13 13:57       ` Ian Jackson
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Jackson @ 2021-09-13 13:57 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, julien, Wei Liu

Juergen Gross writes ("Re: [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux"):
> Correct. Otherwise the user would need to know about the oom_score_adj
> ABI.

I think they might know about it.  In particular, sysadmins might well
be used to configuring this directly (for other daemons).

> On 30.07.21 15:26, Ian Jackson wrote:
> > But maybe we could add something to the doc comment ?
> > 
> > Eg
> >    # (The specified value is multiplied by -10 and echoed into
> >    # /proc/PID/oom_score_adj.)
> > 
> > ?
> 
> Why? This is an internal implementation detail. I don't see why the
> user needs to know how this is accomplished. What is unclear with the
> XENSTORED_OOM_MEM_THRESHOLD semantics as described?

The underlying interface is both also-publicly-exposed, and has a
nonobvious mapping.  Speaking as a sometime sysadmin, I would often
appreciate something like this.

> There is no other parameter with an explanation how it's semantics are
> being accomplished.

I don't think the other parameters are as strange as this.

Ian.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored
  2021-09-01  6:59               ` Juergen Gross
@ 2021-09-13 13:59                 ` Ian Jackson
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Jackson @ 2021-09-13 13:59 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Ian Jackson, Andrew Cooper, Julien Grall, xen-devel, Wei Liu

Juergen Gross writes ("Re: [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored"):
> On 31.08.21 16:22, Ian Jackson wrote:
> > ISTM that the following scheme is in the intersection of everyone's
> > requirements:
> > 
> >   * The limit will be adjusted/imposed in the startup script.
> >   * An /etc/{default,sysconfig} parameter will be provided to
> >     adjust the setting.
> >   * The default should be `unlimtied` since we cannot calculate
> >     a safe upper bound for all configurations.
> >   * Systems like Citrix Hypervisor (XenServer) which can calculate
> >     a safe upper bound can do so, and adjust the default, enabling
> >     them to spot fd leaks.
> 
> Makes sense for me.
> 
> So this would mean:
> - the sysconfig parameter will no longer be "number of domains", but the
>    fd limit of the Xenstore daemon
> - default should be "unlimited"
> - the comment should mention the current number of fds needed per domain
>    (5 for HVM, 2 for PV/PVH) plus some headroom in dom0 (without any
>    guest running on my test system 21 fds are active, so I guess a value
>    of 50 seems appropriate)
> 
> Is this okay?

Sounds good to me, thanks.

Ian.


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-09-13 13:59 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-30 12:26 [PATCH v3 0/2] tools/xenstore: set resource limits of xenstored Juergen Gross
2021-07-30 12:26 ` [PATCH v3 1/2] tools/xenstore: set oom score for xenstore daemon on Linux Juergen Gross
2021-07-30 13:26   ` Ian Jackson
2021-08-31 12:20     ` Juergen Gross
2021-09-13 13:57       ` Ian Jackson
2021-07-30 12:26 ` [PATCH v3 2/2] tools/xenstore: set open file descriptor limit for xenstored Juergen Gross
2021-07-30 13:35   ` Ian Jackson
2021-07-30 17:14     ` Julien Grall
2021-08-31 12:11       ` Juergen Gross
2021-08-31 12:30         ` Julien Grall
2021-08-31 12:37           ` Andrew Cooper
2021-08-31 14:22             ` Julien Grall
2021-08-31 14:22             ` Ian Jackson
2021-09-01  6:59               ` Juergen Gross
2021-09-13 13:59                 ` Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.