xen-devel.lists.xenproject.org archive mirror
* [RESEND OSSTEST PATCH 0/5] Fix TCP problem
@ 2020-09-28 13:12 Ian Jackson
  2020-09-28 13:12 ` [OSSTEST PATCH 1/5] daemonlib: Provide a "noop" command Ian Jackson
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel

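This series works around a TCP problem in which the client thinks a
connection is open but the server has no record of it.  The best
reference I found for this is: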
  https://www.evanjones.ca/tcp-stuck-connection-mystery.html

I'm resending this series because the first posting was sent from my
Citrix email address and so probably did not reach many people.





* [OSSTEST PATCH 1/5] daemonlib: Provide a "noop" command
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel; +Cc: Ian Jackson

From: Ian Jackson <ian.jackson@eu.citrix.com>

We are going to want clients to speak before waiting for the server
banner.  A noop command is useful for that.

Putting this in daemonlib makes it available in both ownerdaemon and
queuedaemon.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tcl/daemonlib.tcl | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tcl/daemonlib.tcl b/tcl/daemonlib.tcl
index 1e86d5f4..747deab1 100644
--- a/tcl/daemonlib.tcl
+++ b/tcl/daemonlib.tcl
@@ -124,6 +124,10 @@ proc puts-chan {chan m} {
     puts $chan $m
 }
 
+proc cmd/noop {chan desc} {
+    puts-chan $chan "OK noop"
+}
+
 #---------- data ----------
 
 proc puts-chan-data {chan m data} {
-- 
2.20.1
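
For readers who do not know the daemon protocol, here is a minimal Python sketch, not part of the patch, of how a line-based command dispatcher might answer noop. osstest's daemons are written in Tcl, and the proc name cmd/noop only suggests that commands are looked up by name. The names HANDLERS, cmd_noop and dispatch are hypothetical, and the "ERROR unknown command" text is borrowed from the pattern that patch 3/5 tolerates; how the real daemon treats an empty line is an assumption.

```python
# Illustrative only: the real daemons are Tcl, and every name below is
# hypothetical.

def cmd_noop(args):
    """Do nothing except acknowledge, like the cmd/noop proc."""
    return "OK noop"

# Command name -> handler, loosely analogous to the cmd/<name> procs.
HANDLERS = {"noop": cmd_noop}

def dispatch(line):
    """Return the reply line for one command line sent by a client."""
    words = line.split()
    # Assumption: an empty line is treated like any unknown command.
    name = words[0] if words else ""
    handler = HANDLERS.get(name)
    if handler is None:
        return "ERROR unknown command"
    return handler(words[1:])
```

The point of such a command is only that it is cheap and has no side effects, so a client can send it immediately after connecting.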




* [OSSTEST PATCH 2/5] TCP fix: Do not wait for queuedaemon to speak
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel; +Cc: Ian Jackson

From: Ian Jackson <ian.jackson@eu.citrix.com>

This depends on the preceding daemonlib patch, and on ms-queuedaemon
being restarted so that it knows the noop command: the client now
insists on an "OK noop" reply.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 Osstest/Executive.pm | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Osstest/Executive.pm b/Osstest/Executive.pm
index 61a99bc3..80e70070 100644
--- a/Osstest/Executive.pm
+++ b/Osstest/Executive.pm
@@ -643,7 +643,16 @@ sub tcpconnect_queuedaemon () {
     my $qserv= tcpconnect($c{QueueDaemonHost}, $c{QueueDaemonPort});
     $qserv->autoflush(1);
 
+    # TCP connections can get into a weird state where the client
+    # thinks the connection is open but the server has no record
+    # of it.  To avoid this, have the client speak without waiting
+    # for the server.
+    #
+    # See "A TCP 'stuck' connection mystery"
+    # https://www.evanjones.ca/tcp-stuck-connection-mystery.html
+    print $qserv "noop\n";
     $_= <$qserv>;  defined && m/^OK ms-queuedaemon\s/ or die "$_?";
+    $_= <$qserv>;  defined && m/^OK noop\s/ or die "$_?";
 
     return $qserv;
 }
-- 
2.20.1
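
The ordering in this hunk (write first, then read the banner, then read the noop reply) can be sketched in Python as follows. This is an illustration under stated assumptions, not the osstest code: the function name speak_first_handshake is hypothetical, it works on any pair of text file objects rather than a real socket, and the banner text used in the usage lines is made up.

```python
import io
import re

def speak_first_handshake(rfile, wfile, banner_re):
    """Send "noop" before reading anything, then check both replies."""
    # Speak first, so the handshake does not start by waiting for the
    # server on a connection the server may have no record of.
    wfile.write("noop\n")
    wfile.flush()

    banner = rfile.readline()
    if not re.match(banner_re, banner):
        raise RuntimeError(f"unexpected banner: {banner!r}")

    reply = rfile.readline()
    if not re.match(r"OK noop\s", reply):
        raise RuntimeError(f"unexpected noop reply: {reply!r}")

# Usage with in-memory streams standing in for the socket.
# The banner line here is invented for the example.
from_server = io.StringIO("OK ms-queuedaemon example\nOK noop\n")
to_server = io.StringIO()
speak_first_handshake(from_server, to_server, r"OK ms-queuedaemon\s")
```

Note that the server's banner still arrives before the noop reply, so the client reads two lines where it used to read one.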




* [OSSTEST PATCH 3/5] TCP fix: Do not wait for ownerdaemon to speak
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel; +Cc: Ian Jackson

From: Ian Jackson <ian.jackson@eu.citrix.com>

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tcl/JobDB-Executive.tcl | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index 29c82821..4fe85696 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -414,7 +414,20 @@ proc become-task {comment} {
 
     set ownerqueue [socket $c(OwnerDaemonHost) $c(OwnerDaemonPort)]
     fconfigure $ownerqueue -buffering line -translation lf
+
+    # TCP connections can get into a weird state where the client
+    # thinks the connection is open but the server has no record
+    # of it.  To avoid this, have the client speak without waiting
+    # for the server.  We tolerate "unknown command" errors so
+    # that it is not necessary to restart the ownerdaemon since
+    # that is very disruptive.
+    #
+    # See "A TCP 'stuck' connection mystery"
+    # https://www.evanjones.ca/tcp-stuck-connection-mystery.html
+    puts $ownerqueue noop
     must-gets $ownerqueue {^OK ms-ownerdaemon\M}
+    must-gets $ownerqueue {^OK noop|^ERROR unknown command}
+
     puts $ownerqueue create-task
     must-gets $ownerqueue {^OK created-task (\d+) (\w+ [\[\]:.0-9a-f]+)$} \
         taskid refinfo
-- 
2.20.1
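
The difference from the queuedaemon client is the tolerant second pattern: an ownerdaemon that has not been restarted does not know noop and answers with an error, which the client accepts. A small Python sketch of that acceptance check, with the hypothetical names NOOP_REPLY and noop_reply_ok and the alternation taken from the must-gets pattern above:

```python
import re

# Either reply is acceptable: a daemon that knows noop answers
# "OK noop"; one that predates it answers with an unknown-command error.
NOOP_REPLY = re.compile(r"OK noop|ERROR unknown command")

def noop_reply_ok(line):
    """True if the line is an acceptable answer to the initial noop."""
    # match() anchors at the start of the line, like ^ in the Tcl pattern.
    return NOOP_REPLY.match(line) is not None
```

Accepting the error keeps new clients working against the running daemon, at the cost of not verifying that noop is actually implemented.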




* [OSSTEST PATCH 4/5] TftpDiVersion: Update to latest installer for stretch
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel; +Cc: Ian Jackson, Jan Beulich

The stretch (Debian oldstable) kernel has been updated, causing our
Xen 4.10 tests (which are still using stretch) to break.  This update
seems to fix it.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
---
 production-config | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/production-config b/production-config
index 6055bd18..0c135bcb 100644
--- a/production-config
+++ b/production-config
@@ -90,7 +90,7 @@ TftpNetbootGroup osstest
 # Update with ./mg-debian-installer-update(-all)
 TftpDiVersion_wheezy 2016-06-08
 TftpDiVersion_jessie 2018-06-26
-TftpDiVersion_stretch 2020-02-10
+TftpDiVersion_stretch 2020-09-24
 TftpDiVersion_buster 2020-05-19
 
 DebianSnapshotBackports_jessie http://snapshot.debian.org/archive/debian/20190206T211314Z/
-- 
2.20.1




* [OSSTEST PATCH 5/5] Update TftpDiVersion_buster
From: Ian Jackson @ 2020-09-28 13:12 UTC
  To: xen-devel; +Cc: Ian Jackson

From: Ian Jackson <ian.jackson@eu.citrix.com>

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 production-config | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/production-config b/production-config
index 0c135bcb..6f85a4df 100644
--- a/production-config
+++ b/production-config
@@ -91,7 +91,7 @@ TftpNetbootGroup osstest
 TftpDiVersion_wheezy 2016-06-08
 TftpDiVersion_jessie 2018-06-26
 TftpDiVersion_stretch 2020-09-24
-TftpDiVersion_buster 2020-05-19
+TftpDiVersion_buster 2020-09-28
 
 DebianSnapshotBackports_jessie http://snapshot.debian.org/archive/debian/20190206T211314Z/
 
-- 
2.20.1



