From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Ahring Oder Aring
Date: Thu, 22 Apr 2021 17:11:56 -0400
Subject: [Cluster-devel] [PATCHv4 dlm/next 4/8] fs: dlm: add functionality to re-transmit a message
In-Reply-To: <20210409144859.48385-5-aahringo@redhat.com>
References: <20210409144859.48385-1-aahringo@redhat.com> <20210409144859.48385-5-aahringo@redhat.com>
Message-ID:
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

On Fri, Apr 9, 2021 at 10:49 AM Alexander Aring wrote:
>
> This patch introduces a retransmit functionality for a lowcomms message
> handle. It's just allocates a new buffer and transmit it again, no
> special handling about prioritize it because keeping bytestream in order.
>
> To avoid another connection look some refactor was done to make a new
> buffer allocation with a preexisting connection pointer.
>
> Signed-off-by: Alexander Aring
> ---
>  fs/dlm/lowcomms.c | 55 ++++++++++++++++++++++++++++++++++-------------
>  fs/dlm/lowcomms.h |  1 +
>  2 files changed, 41 insertions(+), 15 deletions(-)
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index ade1a5266e4a..fb04448c4e48 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -1447,25 +1447,14 @@ static struct writequeue_entry *new_wq_entry(struct connection *con, int len,
>  	return e;
>  };
>
> -void *dlm_lowcomms_new_buffer(int nodeid, int len, gfp_t allocation, char **ppc,
> -			      void (*cb)(void *buf, void *priv), void *priv)
> +static void *dlm_lowcomms_new_buffer_con(struct connection *con, int len,
> +					 gfp_t allocation, char **ppc,
> +					 void (*cb)(void *buf, void *priv),
> +					 void *priv)
>  {
>  	struct writequeue_entry *e;
> -	struct connection *con;
>  	struct dlm_msg *msg;
>
> -	if (len > DEFAULT_BUFFER_SIZE ||
> -	    len < sizeof(struct dlm_header)) {
> -		BUILD_BUG_ON(PAGE_SIZE < DEFAULT_BUFFER_SIZE);
> -		log_print("failed to allocate a buffer of size %d", len);
> -		WARN_ON(1);
> -		return NULL;
> -	}
> -
> -	con = nodeid2con(nodeid, allocation);
> -	if (!con)
> -		return NULL;
> -
>  	msg = kzalloc(sizeof(*msg), allocation);
>  	if (!msg)
>  		return NULL;
> @@ -1485,6 +1474,26 @@ void *dlm_lowcomms_new_buffer(int nodeid, int len, gfp_t allocation, char **ppc,
>  	return msg;
>  }
>
> +void *dlm_lowcomms_new_buffer(int nodeid, int len, gfp_t allocation, char **ppc,
> +			      void (*cb)(void *buf, void *priv), void *priv)
> +{
> +	struct connection *con;
> +
> +	if (len > DEFAULT_BUFFER_SIZE ||
> +	    len < sizeof(struct dlm_header)) {
> +		BUILD_BUG_ON(PAGE_SIZE < DEFAULT_BUFFER_SIZE);
> +		log_print("failed to allocate a buffer of size %d", len);
> +		WARN_ON(1);
> +		return NULL;
> +	}
> +
> +	con = nodeid2con(nodeid, allocation);
> +	if (!con)
> +		return NULL;
> +
> +	return dlm_lowcomms_new_buffer_con(con, len, GFP_ATOMIC, ppc, cb, priv);
> +}
> +
>  void dlm_lowcomms_commit_buffer(void *mh)
>  {
>  	struct dlm_msg *msg = mh;
> @@ -1525,6 +1534,22 @@ void dlm_lowcomms_get_buffer(void *mh)
>  	kref_get(&msg->ref);
>  }
>
> +void dlm_lowcomms_resend_buffer(void *mh)
> +{
> +	struct dlm_msg *msg = mh;
> +	void *mh_new;
> +	char *ppc;
> +
> +	mh_new = dlm_lowcomms_new_buffer_con(msg->entry->con, msg->len,
> +					     GFP_NOFS, &ppc, NULL, NULL);
> +	if (!mh_new)
> +		return;
> +
> +	memcpy(ppc, msg->ppc, msg->len);
> +	dlm_lowcomms_commit_buffer(mh_new);
> +	dlm_lowcomms_put_buffer(mh_new);
> +}

I will change this so that it first checks whether the message is
already in a "retransmit" state. If it is, the message still sits
somewhere in the dlm lowcomms send queue and will not be queued again.
The retransmit state is dropped once the dlm message has been sent out;
after that, a retransmission can be triggered again.

I had a terrible experience with a lot of reconnects during a tcpkill
test: the lowcomms send queue grew bigger and bigger with a bunch of
retransmits, and the other peer had to do a lot of work to filter out
these messages because it had already received them.

- Alex
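
For illustration only, the retransmit-state check described above could
look roughly like the userspace sketch below. It is not the actual
follow-up patch. struct msg, its retransmit flag, the queued counter and
the msg_*() helpers are all hypothetical stand-ins for struct dlm_msg
and the lowcomms send queue; a kernel version would more likely use a
flag bit in the message handle.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dlm_msg; not the real kernel type. */
struct msg {
	atomic_bool retransmit;	/* true while a retransmit copy is queued */
	int queued;		/* models copies sitting in the send queue */
};

static void msg_init(struct msg *m)
{
	atomic_init(&m->retransmit, false);
	m->queued = 0;
}

/*
 * Queue a retransmit copy unless one is already pending.
 * Returns true if a copy was queued, false if the request was skipped.
 */
static bool msg_resend(struct msg *m)
{
	/* atomic_exchange() returns the previous value of the flag */
	if (atomic_exchange(&m->retransmit, true))
		return false;	/* already in retransmit state */

	m->queued++;		/* stands in for allocate + copy + commit */
	return true;
}

/* Called once the queued copy has been written out to the socket. */
static void msg_sent(struct msg *m)
{
	m->queued--;
	/* drop the state so a later retransmit can be triggered again */
	atomic_store(&m->retransmit, false);
}
```

With this model, calling msg_resend() several times during a burst of
reconnects queues a single copy, and only after msg_sent() can the next
retransmit be queued, which keeps the send queue from growing with
duplicates.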