* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
       [not found] <CAKw-m2Pnas6hjZjZ+Lnd3b0hiWAV2T2iYNekimjJtbrS8r5rBA@mail.gmail.com>
@ 2017-05-17  9:04 ` George Dunlap
  2017-05-17  9:45   ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2017-05-17  9:04 UTC (permalink / raw)
  To: Antony Saba
  Cc: xen-users, Ian Jackson, Roger Pau Monné, Wei Liu, xen-devel

cc'ing xen-devel & some relevant people

On Tue, May 16, 2017 at 4:21 PM, Antony Saba <awsaba@gmail.com> wrote:
> Hello xen-users,
>
> We are seeing the following errors repeatedly while trying to create
> domains using a script, with the end result that 2 or 3 out of about
> 20 VMs fail to start, and there are stale entries in the iptables for
> domains that have been destroyed.
>
>
>    2017-05-10 11:45:40 UTC libxl: error:
> libxl_exec.c:118:libxl_report_child_exitstatus:
> /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
>    2017-05-10 11:50:52 UTC libxl: error:
> libxl_exec.c:118:libxl_report_child_exitstatus:
> /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4
>
> I've been testing the following patch of vif-common.sh over the last
> day and it appears to resolve the issue.  iptables exits with status 4
> when "Another app is currently holding the xtables lock."
>
> Does this solution seem reasonable?
>
> Thanks.
>
> --- /etc/xen/scripts/vif-common.sh.bak 2017-05-15 18:57:34.549288900 +0000
> +++ /etc/xen/scripts/vif-common.sh 2017-05-15 18:58:01.361208788 +0000
> @@ -154,12 +154,13 @@
> # binary is not sufficient, because the user may not have the appropriate
> # modules installed. If iptables is not working, then there's no need to do
> # anything with it, so we can just return.
> + claim_lock "iptables"
> if ! iptables -L -n >&/dev/null
> then
> + release_lock "iptables"
> return
> fi
> - claim_lock "iptables"
> if [ "$ip" != "" ]
> then

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17  9:04 ` [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously George Dunlap
@ 2017-05-17  9:45   ` Roger Pau Monné
  2017-05-17 10:10     ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2017-05-17  9:45 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-users, Ian Jackson, Antony Saba, Wei Liu, xen-devel

On Wed, May 17, 2017 at 10:04:40AM +0100, George Dunlap wrote:
> cc'ing xen-devel & some relevant people

Please bear with me, my knowledge of iptables is 0.

> On Tue, May 16, 2017 at 4:21 PM, Antony Saba <awsaba@gmail.com> wrote:
> > Hello xen-users,
> >
> > We are seeing the following errors repeatedly while trying to create
> > domains using a script, with the end result that 2 or 3 out of about
> > 20 VMs fail to start, and there are stale entries in the iptables for
> > domains that have been destroyed.
> >
> >
> >    2017-05-10 11:45:40 UTC libxl: error:
> > libxl_exec.c:118:libxl_report_child_exitstatus:
> > /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
> >    2017-05-10 11:50:52 UTC libxl: error:
> > libxl_exec.c:118:libxl_report_child_exitstatus:
> > /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4
> >
> > I've been testing the following patch of vif-common.sh over the last
> > day and it appears to resolve the issue.  iptables exits with status 4
> > when "Another app is currently holding the xtables lock."

So, an iptables command can fail randomly because there's someone else holding
an iptables internal lock?

Isn't there any way to tell the iptables command to just block until it can get
the lock? This seems extremely racy; aren't people then forced to use something
like:

while true; do
	iptables <...>
	rc=$?
	if [ $rc -eq 0 ]; then
		break
	elif [ $rc -ne 4 ]; then
		error ...
	fi
done

When dealing with iptables?

> > Does this solution seem reasonable?

I'm not sure, this protects you from other hotplug scripts poking concurrently
at iptables, but what about the system administrator? It still seems racy to
me.

> > Thanks.
> >
> > --- /etc/xen/scripts/vif-common.sh.bak 2017-05-15 18:57:34.549288900 +0000
> > +++ /etc/xen/scripts/vif-common.sh 2017-05-15 18:58:01.361208788 +0000
> > @@ -154,12 +154,13 @@
> > # binary is not sufficient, because the user may not have the appropriate
> > # modules installed. If iptables is not working, then there's no need to do
> > # anything with it, so we can just return.
> > + claim_lock "iptables"
> > if ! iptables -L -n >&/dev/null
> > then
> > + release_lock "iptables"
> > return
> > fi
> > - claim_lock "iptables"
> > if [ "$ip" != "" ]
> > then


* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17  9:45   ` Roger Pau Monné
@ 2017-05-17 10:10     ` George Dunlap
  2017-05-17 11:17       ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2017-05-17 10:10 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-users, Ian Jackson, Antony Saba, Wei Liu, xen-devel

On 17/05/17 10:45, Roger Pau Monné wrote:
> On Wed, May 17, 2017 at 10:04:40AM +0100, George Dunlap wrote:
>> cc'ing xen-devel & some relevant people
> 
> Please bear with me, my knowledge of iptables is 0.
> 
>> On Tue, May 16, 2017 at 4:21 PM, Antony Saba <awsaba@gmail.com> wrote:
>>> Hello xen-users,
>>>
>>> We are seeing the following errors repeatedly while trying to create
>>> domains using a script, with the end result that 2 or 3 out of about
>>> 20 VMs fail to start, and there are stale entries in the iptables for
>>> domains that have been destroyed.
>>>
>>>
>>>    2017-05-10 11:45:40 UTC libxl: error:
>>> libxl_exec.c:118:libxl_report_child_exitstatus:
>>> /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
>>>    2017-05-10 11:50:52 UTC libxl: error:
>>> libxl_exec.c:118:libxl_report_child_exitstatus:
>>> /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4
>>>
>>> I've been testing the following patch of vif-common.sh over the last
>>> day and it appears to resolve the issue.  iptables exits with status 4
>>> when "Another app is currently holding the xtables lock."
> 
> So, an iptables command can fail randomly because there's someone else holding
> an iptables internal lock?
> 
> Isn't there any way to tell the iptables command to just block until it can get
> the lock? This seems extremely racy; aren't people then forced to use something
> like:
> 
> while true; do
> 	iptables <...>
> 	rc=$?
> 	if [ $rc -eq 0 ]; then
> 		break
> 	elif [ $rc -ne 4 ]; then
> 		error ...
> 	fi
> done
> 
> When dealing with iptables?

This seems to be a common problem ([1][2][3] come up right away).

The basic solution seems to be to add the '-w' option to have it wait
for the lock.  It does seem like that should be the default though.
Having commands normally run inside of scripts randomly fail unless you
add the special "don't randomly fail" option seems a bit mad.

 -George


[1] https://github.com/kubernetes/kubernetes/issues/7370
[2] https://github.com/docker/for-mac/issues/285
[3]
https://serverfault.com/questions/805718/iptables-another-app-is-currently-holding-the-xtables-lock



* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 10:10     ` George Dunlap
@ 2017-05-17 11:17       ` George Dunlap
  2017-05-17 12:43         ` Ian Jackson
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2017-05-17 11:17 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, xen-devel, Jan Beulich, Ian Jackson, xen-users, Antony Saba

On Wed, May 17, 2017 at 11:10 AM, George Dunlap
<george.dunlap@citrix.com> wrote:
> On 17/05/17 10:45, Roger Pau Monné wrote:
>> On Wed, May 17, 2017 at 10:04:40AM +0100, George Dunlap wrote:
>>> cc'ing xen-devel & some relevant people
>>
>> Please bear with me, my knowledge of iptables is 0.
>>
>>> On Tue, May 16, 2017 at 4:21 PM, Antony Saba <awsaba@gmail.com> wrote:
>>>> Hello xen-users,
>>>>
>>>> We are seeing the following errors repeatedly while trying to create
>>>> domains using a script, with the end result that 2 or 3 out of about
>>>> 20 VMs fail to start, and there are stale entries in the iptables for
>>>> domains that have been destroyed.
>>>>
>>>>
>>>>    2017-05-10 11:45:40 UTC libxl: error:
>>>> libxl_exec.c:118:libxl_report_child_exitstatus:
>>>> /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
>>>>    2017-05-10 11:50:52 UTC libxl: error:
>>>> libxl_exec.c:118:libxl_report_child_exitstatus:
>>>> /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4
>>>>
>>>> I've been testing the following patch of vif-common.sh over the last
>>>> day and it appears to resolve the issue.  iptables exits with status 4
>>>> when "Another app is currently holding the xtables lock."
>>
>> So, an iptables command can fail randomly because there's someone else holding
>> an iptables internal lock?
>>
>> Isn't there any way to tell the iptables command to just block until it can get
>> the lock? This seems extremely racy; aren't people then forced to use something
>> like:
>>
>> while true; do
>>       iptables <...>
>>       rc=$?
>>       if [ $rc -eq 0 ]; then
>>               break
>>       elif [ $rc -ne 4 ]; then
>>               error ...
>>       fi
>> done
>>
>> When dealing with iptables?
>
> This seems to be a common problem ([1][2][3] come up right away).
>
> The basic solution seems to be to add the '-w' option to have it wait
> for the lock.  It does seem like that should be the default though.
> Having commands normally run inside of scripts randomly fail unless you
> add the special "don't randomly fail" option seems a bit mad.

Hmm, looking more into it:

* The -w option was introduced at the same time that the locking was
introduced [1].  So any version that has locking will have the -w
option.

* The bare -w option doesn't introduce a timeout, so in the case that
the xtables lock wasn't released, the script will hang indefinitely.
A '-W' option was introduced in 2016 to introduce a timeout, but this
is on even fewer systems than the -w option.  (My desktop, running
Debian Jessie, doesn't seem to have the -W option for instance.)

* The return code, RESOURCE_PROBLEM, is returned for other reasons too;
but it looks like, for our purposes, retrying might not be a bad
strategy in most of those cases either.

* But the option was only introduced in 2013, so it's
likely there are still old versions of iptables around that don't have
the -w option.

The good news is that versions without the -w option will *also* not
fail with error code 4 (although they may fail in other ways in the
case of concurrent accesses instead).
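
For reference, the exit statuses mentioned in this thread can be given names in a script. This is a minimal sketch, assuming the numbering of enum xtables_exittype in the iptables headers (the thread itself confirms 2 as PARAMETER_PROBLEM and 4 as RESOURCE_PROBLEM). The function names and the "OK" label for status 0 are made up for illustration.

```shell
#!/bin/bash
# Map an iptables exit status to a readable name.
# Assumption: values follow enum xtables_exittype in xtables.h.
# "OK" and "UNKNOWN" are labels invented for this example.
iptables_status_name()
{
    case "$1" in
        0) echo "OK" ;;
        1) echo "OTHER_PROBLEM" ;;
        2) echo "PARAMETER_PROBLEM" ;;
        3) echo "VERSION_PROBLEM" ;;
        4) echo "RESOURCE_PROBLEM" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Only status 4 (which includes a busy xtables lock) is treated as
# worth retrying in this sketch.
is_retryable()
{
    [ "$1" -eq 4 ]
}
```

A caller could log `iptables_status_name $rc` next to the failing command, so that a lock collision is distinguishable from a bad argument.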

So we have three options:

1. Always add -w.  This will effectively drop support for systems
which don't have iptables -w.  It also wouldn't allow us to reliably
set a timeout.

2. Always do a loop.  This should work on all systems, but is
redundant for systems with -w and unnecessary on systems without.  On
the other hand, it would allow us to implement our own timeout even on
systems without the -W option.

3. Try to check to see if the version of iptables we have supports -w,
and use it if available.  This should also work on all systems, but
introduces a bit of complication.  It also doesn't allow us to
reliably use a timeout.
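
Option 2 could be sketched roughly as follows. This is only an illustration, not what was committed: the helper name retry_on_status_4 and its argument order are hypothetical, and the "timeout" is simply a cap on the number of attempts.

```shell
#!/bin/bash
# Hypothetical helper: run a command, retrying while it exits with
# status 4 (RESOURCE_PROBLEM).
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
retry_on_status_4()
{
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1 rc
    while true; do
        rc=0
        "$@" || rc=$?    # capture the status before any test overwrites $?
        if [ "$rc" -ne 4 ]; then
            return "$rc" # success, or a failure that retrying won't fix
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 4     # gave up: still failing with status 4
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

With this, something like `retry_on_status_4 30 1 iptables -L -n` would give up after 30 attempts, about 30 seconds, instead of hanging indefinitely.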

Any thoughts?

 -George

[1] https://git.netfilter.org/iptables/commit/?id=93587a04d0f2511e108bbc4d87a8b9d28a5c5dd8


* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 11:17       ` George Dunlap
@ 2017-05-17 12:43         ` Ian Jackson
  2017-05-17 12:46           ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Jackson @ 2017-05-17 12:43 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, xen-devel, Jan Beulich, xen-users, Antony Saba,
	Roger Pau Monné

George Dunlap writes ("Re: [Xen-devel] [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously"):
> So we have three options:
...
> 3. Try to check to see if the version of iptables we have supports -w,
> and use it if available.  This should also work on all systems, but
> introduces a bit of complication.  It also doesn't allow us to
> reliably use a timeout.

I think this is best.  Eventually we can get rid of the check for -w.

I think a timeout in this context is not very helpful.

Also, a loop, on a busy system, might need to have many attempts,
because it will be polling.

As I said on irc:

  If iptables fails to release its lock, then surely everything is going
  to be bust forever more, at least until someone manages to unstick it
  and get the lock released ?

  I'm not sure it's worth a lot of effort to try to contain the
  consequences.

Ian.


* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 12:43         ` Ian Jackson
@ 2017-05-17 12:46           ` George Dunlap
  2017-05-17 13:44             ` George Dunlap
  0 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2017-05-17 12:46 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Wei Liu, xen-devel, Jan Beulich, xen-users, Antony Saba,
	Roger Pau Monné

On 17/05/17 13:43, Ian Jackson wrote:
> George Dunlap writes ("Re: [Xen-devel] [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously"):
>> So we have three options:
> ...
>> 3. Try to check to see if the version of iptables we have supports -w,
>> and use it if available.  This should also work on all systems, but
>> introduces a bit of complication.  It also doesn't allow us to
>> reliably use a timeout.
> 
> I think this is best.  Eventually we can get rid of the check for -w.
> 
> I think a timeout in this context is not very helpful.
> 
> Also, a loop, on a busy system, might need to have many attempts,
> because it will be polling.

FWIW the iptables internal mechanism will try to grab the lock, and if
it fails (and -w is set), will call sleep(1) before trying again.  My
bash loop would do exactly the same thing.

But I agree that if timeouts are not important, doing it via iptables is
probably cleaner.  Let me work up a patch.
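
The probe for -w support could look roughly like the sketch below. It is not the patch itself: IPTABLES_CMD and the function names are assumptions introduced so the logic can be run against a stub, and unlike the patch it marks the probe as done after the first call in every case.

```shell
#!/bin/bash
# Sketch only. IPTABLES_CMD, probe_wait_flag and run_iptables are
# hypothetical names; IPTABLES_CMD exists so a stub can stand in for
# the real binary.
IPTABLES_CMD=${IPTABLES_CMD:-iptables}
wait_flag_checked=false
wait_flag="-w"

probe_wait_flag()
{
    local rc=0
    $wait_flag_checked && return 0
    "$IPTABLES_CMD" -w -L -n >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 2 ]; then
        # Status 2 is PARAMETER_PROBLEM.  Blame -w only if the same
        # listing without -w does not also fail with status 2.
        rc=0
        "$IPTABLES_CMD" -L -n >/dev/null 2>&1 || rc=$?
        if [ "$rc" -ne 2 ]; then
            wait_flag=""
        fi
    fi
    wait_flag_checked=true
}

run_iptables()
{
    probe_wait_flag
    # $wait_flag is deliberately unquoted so an empty value vanishes.
    "$IPTABLES_CMD" $wait_flag "$@"
}
```

Callers would then use `run_iptables` wherever they previously called iptables directly, and the probe cost is paid once per script run.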

 -George



* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 12:46           ` George Dunlap
@ 2017-05-17 13:44             ` George Dunlap
  2017-05-17 14:57               ` Antony Saba
  2017-05-18 14:47               ` Antony Saba
  0 siblings, 2 replies; 9+ messages in thread
From: George Dunlap @ 2017-05-17 13:44 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Wei Liu, xen-devel, Jan Beulich, xen-users, Antony Saba,
	Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 1343 bytes --]

On Wed, May 17, 2017 at 1:46 PM, George Dunlap <george.dunlap@citrix.com> wrote:
> On 17/05/17 13:43, Ian Jackson wrote:
>> George Dunlap writes ("Re: [Xen-devel] [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously"):
>>> So we have three options:
>> ...
>>> 3. Try to check to see if the version of iptables we have supports -w,
>>> and use it if available.  This should also work on all systems, but
>>> introduces a bit of complication.  It also doesn't allow us to
>>> reliably use a timeout.
>>
>> I think this is best.  Eventually we can get rid of the check for -w.
>>
>> I think a timeout in this context is not very helpful.
>>
>> Also, a loop, on a busy system, might need to have many attempts,
>> because it will be polling.
>
> FWIW the iptables internal mechanism will try to grab the lock, and if
> it fails (and -w is set), will call sleep(1) before trying again.  My
> bash loop would do exactly the same thing.
>
> But I agree that if timeouts are not important, doing it via iptables is
> probably cleaner.  Let me work up a patch.

Antony,

Attached is a patch to add the -w option if it's available.  I've
smoke-tested that it works under normal conditions; but my simplistic
attempts to get the bug to trigger have failed.  Can you give it a try
and see if it works?

Thanks,
 -George

[-- Attachment #2: 0001-vif-common.sh-Have-iptables-wait-for-the-xtables-loc.patch --]
[-- Type: text/x-diff, Size: 3616 bytes --]

From 7ab50acde39a05de664646ba58d5892f0b8fe353 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Wed, 17 May 2017 11:36:25 +0100
Subject: [PATCH] vif-common.sh: Have iptables wait for the xtables lock

iptables has a system-wide lock on the xtables.  Strangely though, in
the case of two concurrent invocations, the default is for the
instance not grabbing the lock to exit out rather than waiting for it.
This means that when starting a large number of guests in parallel,
many will fail out with messages like this:

  2017-05-10 11:45:40 UTC libxl: error: libxl_exec.c:118: libxl_report_child_exitstatus: /etc/xen/scripts/vif-bridge remove [18767] exited with error status 4
  2017-05-10 11:50:52 UTC libxl: error: libxl_exec.c:118: libxl_report_child_exitstatus: /etc/xen/scripts/vif-bridge offline [1554] exited with error status 4

In order to instruct iptables to wait for the lock, you have to
specify '-w'.  Unfortunately, not all versions of iptables have the
'-w' option, so on first invocation check to see if it accepts the -w
option.

Reported-by: Antony Saba <awsaba@gmail.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/hotplug/Linux/vif-common.sh | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/tools/hotplug/Linux/vif-common.sh b/tools/hotplug/Linux/vif-common.sh
index 6e8d584..60ccce8 100644
--- a/tools/hotplug/Linux/vif-common.sh
+++ b/tools/hotplug/Linux/vif-common.sh
@@ -120,6 +120,35 @@ fi
 ip=${ip:-}
 ip=$(xenstore_read_default "$XENBUS_PATH/ip" "$ip")
 
+IPTABLES_WAIT_RUNE="-w"
+IPTABLES_WAIT_RUNE_CHECKED=false
+
+# If iptables tries to grab the xtables lock and fails, instead of
+# waiting for it by default, it exits with error 4.  They have since
+# added an option, `-w`, to specify the more sensible behavior. But it
+# was only introduced in 2013, so there are probably still systems
+# around which don't support it.  So check to see if it's supported
+# the first time we use it.
+iptables_w()
+{
+    if ! $IPTABLES_WAIT_RUNE_CHECKED ; then
+	iptables $IPTABLES_WAIT_RUNE -L -n >& /dev/null
+	# If it fails with -w and succeeds without, remove the rune
+	if [[ $? == 2 ]] ; then
+	    iptables -L -n >& /dev/null
+	    if [[ $? != 2 ]] ; then
+		# If we fail with PARAMETER_PROBLEM with -w and don't fail
+		# with PARAMETER_PROBLEM without it, then it's the -w option
+		IPTABLES_WAIT_RUNE_CHECKED=true
+		IPTABLES_WAIT_RUNE=""
+	    fi
+	else
+	    IPTABLES_WAIT_RUNE_CHECKED=true
+	fi
+    fi
+    iptables $IPTABLES_WAIT_RUNE "$@"
+}
+
 frob_iptable()
 {
   if [ "$command" == "online" -o "$command" == "add" ]
@@ -129,9 +158,9 @@ frob_iptable()
     local c="-D"
   fi
 
-  iptables "$c" FORWARD -m physdev --physdev-is-bridged --physdev-in "$dev" \
+  iptables_w "$c" FORWARD -m physdev --physdev-is-bridged --physdev-in "$dev" \
     "$@" -j ACCEPT 2>/dev/null &&
-  iptables "$c" FORWARD -m physdev --physdev-is-bridged --physdev-out "$dev" \
+  iptables_w "$c" FORWARD -m physdev --physdev-is-bridged --physdev-out "$dev" \
     -j ACCEPT 2>/dev/null
 
   if [ \( "$command" == "online" -o "$command" == "add" \) -a $? -ne 0 ]
@@ -154,7 +183,7 @@ handle_iptable()
   # binary is not sufficient, because the user may not have the appropriate
   # modules installed.  If iptables is not working, then there's no need to do
   # anything with it, so we can just return.
-  if ! iptables -L -n >&/dev/null
+  if ! iptables_w -L -n >&/dev/null
   then
     return
   fi
-- 
2.1.4



* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 13:44             ` George Dunlap
@ 2017-05-17 14:57               ` Antony Saba
  2017-05-18 14:47               ` Antony Saba
  1 sibling, 0 replies; 9+ messages in thread
From: Antony Saba @ 2017-05-17 14:57 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, Ian Jackson, xen-devel, Jan Beulich, xen-users,
	Roger Pau Monné

On Wed, May 17, 2017 at 7:44 AM, George Dunlap <george.dunlap@citrix.com> wrote:
>
> Antony,
>
> Attached is a patch to add the -w option if it's available.  I've
> smoke-tested that it works under normal conditions; but my simplistic
> attempts to get the bug to trigger have failed.  Can you give it a try
> and see if it works?
>
> Thanks,
>  -George

No problem, I'll apply to one of the machines showing the issue and
run it overnight.

Thanks.

-Tony




-- 
Antony Saba, awsaba@gmail.com


* Re: [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously
  2017-05-17 13:44             ` George Dunlap
  2017-05-17 14:57               ` Antony Saba
@ 2017-05-18 14:47               ` Antony Saba
  1 sibling, 0 replies; 9+ messages in thread
From: Antony Saba @ 2017-05-18 14:47 UTC (permalink / raw)
  To: George Dunlap
  Cc: Wei Liu, Ian Jackson, xen-devel, Jan Beulich, xen-users,
	Roger Pau Monné

George,

Patch works as expected, no failures on create and no stale iptables
rules after running under the same load that was producing the errors
previously.

Ubuntu 16.04
Linux 3.13.0-83-generic
iptables v1.6.0
Xen 4.6.5 from distro packages

Thanks!

-Tony

On Wed, May 17, 2017 at 7:44 AM, George Dunlap <george.dunlap@citrix.com> wrote:
> On Wed, May 17, 2017 at 1:46 PM, George Dunlap <george.dunlap@citrix.com> wrote:
>> On 17/05/17 13:43, Ian Jackson wrote:
>>> George Dunlap writes ("Re: [Xen-devel] [Xen-users] vif-bridge errors when creating and destroying dozens of VMs simultaneously"):
>>>> So we have three options:
>>> ...
>>>> 3. Try to check to see if the version of iptables we have supports -w,
>>>> and use it if available.  This should also work on all systems, but
>>>> introduces a bit of complication.  It also doesn't allow us to
>>>> reliably use a timeout.
>>>
>>> I think this is best.  Eventually we can get rid of the check for -w.
>>>
>>> I think a timeout in this context is not very helpful.
>>>
>>> Also, a loop, on a busy system, might need to have many attempts,
>>> because it will be polling.
>>
>> FWIW the iptables internal mechanism will try to grab the lock, and if
>> it fails (and -w is set), will call sleep(1) before trying again.  My
>> bash loop would do exactly the same thing.
>>
>> But I agree that if timeouts are not important, doing it via iptables is
>> probably cleaner.  Let me work up a patch.
>
> Antony,
>
> Attached is a patch to add the -w option if it's available.  I've
> smoke-tested that it works under normal conditions; but my simplistic
> attempts to get the bug to trigger have failed.  Can you give it a try
> and see if it works?
>
> Thanks,
>  -George



-- 
Antony Saba, awsaba@gmail.com
