[Cluster-devel] [PATCHv4 dlm/next 00/20] fs: dlm: introduce dlm re-transmission layer

From: Alexander Aring @ 2021-01-11 18:02 UTC
  To: cluster-devel.redhat.com

Hi,

this is the final patch series to make dlm reliable when re-connections
occur. You can easily generate a couple of re-connections by running:

tcpkill -9 -i $IFACE port 21064

on your own to test these patches. At some point dlm will detect message
drops and will re-transmit messages if necessary. The series introduces
a new dlm protocol behaviour and increases the dlm protocol version. I
tested it with SCTP as well and tried to stay backwards compatible with
dlm protocol version 3.1. However, I do not recommend mixing these
versions in a setup, since dlm version 3.2 fixes long-standing issues.
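
To make the idea of detecting drops and re-transmitting more concrete,
here is a minimal user space sketch. It assumes a per-node sequence
number, a cumulative ack and a small fixed window. All names
(sketch_node, sketch_send, sketch_ack, sketch_receive, SKETCH_WINDOW)
are hypothetical and this is not the data layout or the code of the
patches, only an illustration of the general technique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WINDOW 8 /* hypothetical: max number of unacked messages */

/* Hypothetical per-node state, not the layout used by the patches. */
struct sketch_node {
	uint32_t seq_send;               /* seq for the next outgoing message */
	uint32_t seq_next;               /* seq expected next from the peer */
	uint32_t unacked[SKETCH_WINDOW]; /* sent, but not acknowledged yet */
	size_t n_unacked;
};

/* Sender: assign a seq and remember it until the peer acknowledges it.
 * Returns false if the window is full.
 */
static inline bool sketch_send(struct sketch_node *n, uint32_t *seq_out)
{
	if (n->n_unacked == SKETCH_WINDOW)
		return false;

	*seq_out = n->seq_send++;
	n->unacked[n->n_unacked++] = *seq_out;
	return true;
}

/* Sender: the peer has seen everything below ack_seq, forget those.
 * Returns how many messages are still pending. These are the ones a
 * sender would re-transmit, e.g. after a reconnect.
 */
static inline size_t sketch_ack(struct sketch_node *n, uint32_t ack_seq)
{
	size_t i, kept = 0;

	assert(n->n_unacked <= SKETCH_WINDOW);
	for (i = 0; i < n->n_unacked; i++) {
		/* signed difference keeps the compare wrap-around safe */
		if ((int32_t)(n->unacked[i] - ack_seq) >= 0)
			n->unacked[kept++] = n->unacked[i];
	}
	n->n_unacked = kept;
	return kept;
}

/* Receiver: deliver only the expected seq. A duplicate or a gap is
 * dropped and the sender is expected to re-transmit.
 */
static inline bool sketch_receive(struct sketch_node *n, uint32_t seq)
{
	if (seq != n->seq_next)
		return false;

	n->seq_next++;
	return true;
}
```

In this sketch the receiver would send its seq_next back as the ack
value, and the sender would call sketch_ack() with it. Whatever is still
in unacked[] after a reconnect is what needs to go out again.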

- Alex

changes since v4:
 - add a big midcomms file header comment explaining the idea behind
   the midcomms layer and how it works.
 - add the close mutex lock to prevent running the close API call while
   a connection is being terminated. However, when a close call occurs
   it will terminate the current termination wait until the close
   lock is released. If the node is removed from the nodes hash, the
   lowcomms close call will occur anyway.

   I added a define to insert some sleep to test this behaviour.

changes since v3:
 - pad dlm messages to an 8 byte boundary (more pads), because there
   are uint64_t fields and we should be prepared for future 8 byte
   fields. This makes them aligned to 4 and 2 bytes as well.
 - change unaligned memory access handling. I will not fix it yet. It
   seems nobody is using dlm on an architecture which cannot handle
   unaligned memory access at all (panics). However, I added a note that
   this is a known problem. There is a slight performance improvement
   (it depends on many things, e.g. whether another message gets
   allocated after a message with (len % 8) != 0 got allocated).
   However, I saw that such cases rarely occur (for now with some user
   space messages only).

   The receiving side is not the problem here, the sending side is, and
   we run into unaligned memory access in dlm message fields there
   as well. However, fixing the sending side will fix the receiving side,
   and more length checks can then be applied to drop invalid message
   lengths.
 - be sure to remove the node from the hash first on a close call

   I am a little bit worried that the timer is running at exactly the
   time of the midcomms/lowcomms close call and maybe begins to
   re-transmit messages. I thought about stopping/starting the timer,
   but I ended up removing the node from the hash first and making sure
   that no readers are left when calling lowcomms close. I think this
   should be fine because we "should" not receive any dlm messages from
   this node while close is running.

 - add patch "fs: dlm: add per node receive flush"

   As I was worried that the lowcomms close call flushes the receive
   work on a socket close after we already removed the node from the
   hash, I added functionality to flush the receive work right before we
   remove the node. With this functionality we make sure we don't
   receive any messages after we removed the node from the hash.
 - add patch "fs: dlm: remove obsolete code and comment"
 - add patch "fs: dlm: check for invalid namelen"
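
To illustrate the 8 byte padding from the list above, here is a minimal
sketch. It assumes the message length is simply rounded up to the next
multiple of 8. The names sketch_pad_len() and SKETCH_ALIGN are
hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ALIGN 8u /* hypothetical name for the 8 byte boundary */

/* Round len up to the next multiple of 8. A message padded like this
 * lets the message allocated right after it start 8 byte aligned,
 * which implies 4 and 2 byte alignment as well.
 */
static inline size_t sketch_pad_len(size_t len)
{
	size_t padded = (len + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1);

	/* guard against wrap-around for absurdly large lengths */
	assert(padded >= len);
	return padded;
}
```

For example, a 41 byte message would be padded to 48 bytes, while a
message that is already a multiple of 8 keeps its length.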

changes since v2:
 - add patch "fs: dlm: set connected bit after accept"
 - add patch "fs: dlm: set subclass for othercon sock_mutex"
 - change title "fs: dlm: public utils header utils" to
   "fs: dlm: public header in out utility"
 - squash "fs: dlm: add check for minimum allocation length" into
   "fs: dlm: remove unaligned memory access handling"
 - make the midcomms timeout a little bit longer, because I saw
   sometimes it's not enough (I hope that was the reason)
 - midcomms: fix version mismatch handling
 - remove DLM_ACK in invalid sequence handling
 - add additional length check in dlm_opts_check_msglen()
 - use optlen to skip DLM_OPTS header
 - add DLM_MSGLEN_IS_NOT_ALIGNED to check if msglen is properly
   aligned before parsing
 - change dlm_midcomms_close() to close first and then cut queues,
   because lowcomms close may flush some messages which
   need to be dropped afterwards if seq doesn't fit.
 - remove newline in "fs: dlm: add more midcomms hooks"
 - many more changes which I did not keep track of.
 - change defines handling for calculating max application buffer
   size vs max allocation size
 - run aspell on my commit msgs
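
The length and alignment checks mentioned in the list above could look
roughly like the following sketch of a receive side check done before
parsing. It is not the code of dlm_opts_check_msglen() or of
DLM_MSGLEN_IS_NOT_ALIGNED. The function sketch_msglen_ok() and its
parameters are hypothetical, and it assumes the header size itself is a
multiple of 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* msglen: total length claimed by the message header
 * hdrlen: size of the fixed header that must always be present
 * avail:  bytes actually available in the receive buffer
 */
static inline bool sketch_msglen_ok(size_t msglen, size_t hdrlen, size_t avail)
{
	/* assumption of this sketch: the header is 8 byte aligned too */
	assert(hdrlen % 8 == 0);

	if (msglen < hdrlen)
		return false; /* too short to even hold the header */
	if (msglen % 8 != 0)
		return false; /* not padded to the 8 byte boundary */
	if (msglen > avail)
		return false; /* message is not completely in the buffer */

	return true;
}
```

A receiver would run such a check first and drop the message on
failure, so that later field accesses can rely on a sane length.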

Alexander Aring (20):
  fs: dlm: set connected bit after accept
  fs: dlm: set subclass for othercon sock_mutex
  fs: dlm: add errno handling to check callback
  fs: dlm: add check if dlm is currently running
  fs: dlm: change allocation limits
  fs: dlm: public header in out utility
  fs: dlm: use GFP_ZERO for page buffer
  fs: dlm: simplify writequeue handling
  fs: dlm: add more midcomms hooks
  fs: dlm: make buffer handling per msg
  fs: dlm: make new buffer handling softirq ready
  fs: dlm: add functionality to re-transmit a message
  fs: dlm: move out some hash functionality
  fs: dlm: remove unaligned memory access handling
  fs: dlm: add union in dlm header for lockspace id
  fs: dlm: add per node receive flush
  fs: dlm: add reliable connection if reconnect
  fs: dlm: don't allow half transmitted messages
  fs: dlm: remove obsolete code and comment
  fs: dlm: check for invalid namelen

 fs/dlm/config.c       |   60 +-
 fs/dlm/dlm_internal.h |   41 +-
 fs/dlm/lock.c         |   16 +-
 fs/dlm/lockspace.c    |    5 +-
 fs/dlm/lowcomms.c     |  288 +++++++---
 fs/dlm/lowcomms.h     |   27 +-
 fs/dlm/member.c       |   16 +
 fs/dlm/member.h       |    1 +
 fs/dlm/midcomms.c     | 1266 +++++++++++++++++++++++++++++++++++++++--
 fs/dlm/midcomms.h     |   10 +
 fs/dlm/rcom.c         |   61 +-
 fs/dlm/recoverd.c     |    3 +
 fs/dlm/user.c         |    3 +
 fs/dlm/util.c         |   10 +-
 fs/dlm/util.h         |    2 +
 15 files changed, 1628 insertions(+), 181 deletions(-)

-- 
2.26.2
