From: Alexander Aring <aahringo@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCHv4 dlm/next 0/8] fs: dlm: introduce dlm re-transmission layer
Date: Fri, 9 Apr 2021 10:48:51 -0400
Message-ID: <20210409144859.48385-1-aahringo@redhat.com>
Hi,
this is the final patch-series to make dlm reliable when re-connection
occurs. You can easily generate a couple of re-connections by running:
tcpkill -9 -i $IFACE port 21064
to test these patches yourself. At some point dlm will detect message
drops and will re-transmit messages if necessary. The series introduces
new dlm protocol behaviour and increases the dlm protocol version. I tested
it with SCTP as well and tried to stay backwards compatible with dlm protocol
version 3.1. However, I don't recommend mixing these versions in one
setup, since dlm version 3.2 fixes long-standing issues.
- Alex
changes since v4:
- remove fast retransmit and the timer; introduce logic to retransmit
all unacknowledged messages on reconnect. The receiver side
will then deliver the next fitting sequence number.
There might still be a problem in that we don't trigger a
reconnection again if we don't transmit anything (but still have
something unacknowledged in midcomms). This is still an issue in the
lowcomms implementation, which I want to fix in due course.
- change comments/commit messages to describe how it works with the new
behaviour.
- keep the messages in the send_queue ordered by seq. This is
necessary for the new behaviour, so that the receiver side can
resolve drops by receiving unacknowledged messages in their order of
delivery. If a sequence such as 1 3 2 is received, the receiver will
not be able to resolve the drop, because 3 will be dropped and not
retransmitted again.
- change the dlm fin waiting mechanism to split the wait into fin ack
received and fin message received. Also change the timeout handling
a little bit there.
- add a missing send_queue flush in midcomms_close
- update patch 04 to not be irqsafe anymore
- fix use-after-free for dlm version 3.1 and recent nodes_srcu changes
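For illustration, the receive-side behaviour described in the list above
can be sketched in userspace C as follows. This is only a minimal sketch
under stated assumptions, not the code in midcomms.c: struct rx_state,
rx_receive(), seq_before() and the RX_* result names are hypothetical and
exist only for this example. It shows why the send_queue has to stay
ordered: a message that arrives ahead of the expected sequence number is
dropped, and a message that was already delivered is treated as a
duplicate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node receive state, for illustration only. */
struct rx_state {
	uint32_t next_seq;	/* next sequence number that will be delivered */
	unsigned int delivered;	/* number of messages delivered so far */
};

enum rx_result { RX_DELIVERED, RX_DUPLICATE, RX_DROPPED };

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Decide what to do with an incoming message carrying sequence number seq. */
static enum rx_result rx_receive(struct rx_state *rx, uint32_t seq)
{
	if (seq == rx->next_seq) {
		/* exactly the message we are waiting for: deliver it */
		rx->next_seq++;
		rx->delivered++;
		return RX_DELIVERED;
	}

	if (seq_before(seq, rx->next_seq))
		/* already delivered, e.g. resent after a reconnect */
		return RX_DUPLICATE;

	/* ahead of what we expect: drop it and rely on a retransmit */
	return RX_DROPPED;
}
```

With next_seq starting at 1, feeding 1 3 2 into rx_receive() delivers 1,
drops 3 and delivers 2. Message 3 is then only delivered if the sender
retransmits it, which is the situation the ordered send_queue avoids.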
I thought about updating patch 08 to drop all pending messages inside
the write queue, because we retransmit all unacknowledged messages
at reconnect anyway. However, that causes very bad behaviour on
reconnects with DLM version 3.1, so I only drop half-transmitted page
buffers to avoid starting the bytestream in the middle of a DLM
message, which is terrible as well. This might send more duplicate messages
at reconnect, but the receiver should resolve these duplicates.
I still have some problems with the synchronization of membership with
DLM_FIN. However, I think my test case is overkill, and I have zero
problems with any synchronization when not running tcpkill. It gets
a lot worse when I don't have any synchronization and the
"midcomms membership" and sequence numbers are out of sync with the
"cluster manager membership". I think such synchronization needs to be
there, but it might need additional handling. (I hope no protocol
changes are needed.)
changes since v3:
- add comment about why queues are unbound
- move rcu usage to version receive handler
changes since v2:
- make the timer pending only if messages are in flight; the sync
isn't quite correct there but doesn't need to be precise
- use before() from tcp to check whether one seq is before another seq
with respect to overflows
- change srcu handling to hold srcu in all places where nodes are
referenced - we should not get a disadvantage from holding that
lock. We should also update lowcomms accordingly.
- add some WARN_ON() to check that nothing in send/recv is going on
anymore, otherwise it's likely an issue.
- add more future work regarding fencing of nodes if the
cluster manager timeout is exceeded or a bad seq happens
- add a note about the missing length check of the tail payload
(resource name length) against the receive buffer
- remove an include which isn't necessary in recoverd.c
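As an aside on the before() item in the list above: the comparison can be
sketched in userspace C as shown here. This is a minimal sketch modelled on
the before() helper in the kernel's include/net/tcp.h; the name
seq_is_before() is made up for this example and is not used in the patches.
The unsigned difference is interpreted as a signed value, so the check keeps
working when the 32-bit sequence counter overflows.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if sequence number a comes before b, taking 32-bit
 * wraparound into account. Hypothetical userspace equivalent of the
 * kernel's before() helper.
 */
static inline bool seq_is_before(uint32_t a, uint32_t b)
{
	/* unsigned subtraction wraps; the sign of the result gives the order */
	return (int32_t)(a - b) < 0;
}
```

For example, seq_is_before(0xfffffff0, 5) is true because 5 is reached
shortly after the counter wraps, whereas a plain `a < b` comparison would
get this case wrong.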
Thanks to Paolo Abeni and Guillaume Nault for their reviews and
recommendations.
Alexander Aring (8):
fs: dlm: public header in out utility
fs: dlm: add more midcomms hooks
fs: dlm: make buffer handling per msg
fs: dlm: add functionality to re-transmit a message
fs: dlm: move out some hash functionality
fs: dlm: add union in dlm header for lockspace id
fs: dlm: add reliable connection if reconnect
fs: dlm: don't allow half transmitted messages
fs/dlm/config.c | 3 +-
fs/dlm/dlm_internal.h | 35 +-
fs/dlm/lock.c | 14 +-
fs/dlm/lockspace.c | 14 +-
fs/dlm/lowcomms.c | 173 ++++++-
fs/dlm/lowcomms.h | 23 +-
fs/dlm/member.c | 12 +-
fs/dlm/midcomms.c | 1101 +++++++++++++++++++++++++++++++++++++++--
fs/dlm/midcomms.h | 11 +
fs/dlm/rcom.c | 63 ++-
fs/dlm/util.c | 10 +-
fs/dlm/util.h | 2 +
12 files changed, 1361 insertions(+), 100 deletions(-)
--
2.26.3