From mboxrd@z Thu Jan 1 00:00:00 1970
From: Neil Horman
Date: Tue, 12 Dec 2017 18:32:04 +0000
Subject: Re: How to restrict SCTP abort during a process crash
Message-Id: <20171212183203.GA1047@hmswarspite.think-freely.org>
List-Id:
References:
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-sctp@vger.kernel.org

On Tue, Dec 12, 2017 at 10:21:31PM +0530, Ashok Kumar wrote:
> Hi,
>
> We are using LKSCTP in our LTE product (HeNBGW). We have
> high-availability support also in our product. In case of any failure
> on active VM, standby VM will take over active role and all the SCTP
> associations will be moved to that new active VM. The associations
> should be moved transparent to the peers (a kind of SCTP reset before
> SCTP heartbeat expires on the peer nodes).
>
> But the problem that we face is that when a process crashes on active
> VM, the LKSCTP stack immediately sends SCTP abort to the peers for all
> associations before the system goes down completely. This creates
> confusion with the peers. Is there any way to avoid sending SCTP abort
> message in this scenario? If yes, please let us know how to do the
> same? If it needs LKSCTP kernel code change, please give pointers on
> what and where to change.
>
> P.S: We tried to block the abort messages by dynamically using
> IPtables through signal handler (for signal 11 and 6). But this did
> not work.
>
> A quick response will be highly appreciated.
>
You're not going to be able to reliably block ABORTs, or any other
packet, only in a crash condition, because parts of the stack operate
asynchronously to the process. About the closest thing I can think of
would be to write a custom iptables rule that matches ABORT packets and
sends them to the NFQUEUE target. Then write a userspace handler process
for the queued packets which simply holds each ABORT for at least one
cluster liveness heartbeat interval (I'm assuming here that, being a
clustered system, it has some sort of liveness check). Holding them may
allow the cluster to shift to the new VM in a failure situation before
your queue handler releases any ABORT packets it has, while in the event
that there is no failover, it will just release the ABORT a little late.

I can't really recommend that approach, mind you (it's a horrid hack,
and will likely cause other protocol issues), but it's all I can think
of at the moment.

Regards
Neil

>
> Thanks,
>
> Ashok
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
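
For illustration only, here is a rough Python sketch of the hold-and-release
logic described in the reply above. It is not from the thread and has not been
run against a real NFQUEUE. It assumes the handler is handed the raw SCTP
packet starting at the SCTP common header, and that something like
"iptables -A OUTPUT -p sctp -m sctp --chunk-types any ABORT -j NFQUEUE --queue-num 0"
(an untested guess at the rule) steers packets to it. The names
contains_abort and AbortHoldQueue are hypothetical, and the binding to the
kernel queue (e.g. via libnetfilter_queue) is deliberately left out; only the
chunk type value 6 for ABORT and the chunk layout come from RFC 4960.

```python
import heapq
import struct

SCTP_COMMON_HEADER_LEN = 12  # src port, dst port, verification tag, checksum
SCTP_CHUNK_ABORT = 6         # ABORT chunk type value (RFC 4960)


def contains_abort(sctp_packet: bytes) -> bool:
    """Return True if any chunk in the packet is an ABORT.

    Assumption: sctp_packet starts at the SCTP common header (IP header
    already stripped by the caller).
    """
    offset = SCTP_COMMON_HEADER_LEN
    while offset + 4 <= len(sctp_packet):
        chunk_type, _flags, length = struct.unpack_from("!BBH", sctp_packet, offset)
        if chunk_type == SCTP_CHUNK_ABORT:
            return True
        if length < 4:
            break  # malformed chunk header; stop walking
        offset += (length + 3) & ~3  # chunks are padded to a 4-byte boundary
    return False


class AbortHoldQueue:
    """Hold packets for a fixed time before they may be released.

    hold_seconds is meant to be at least one cluster liveness heartbeat
    interval; the right value depends on the cluster and is an assumption.
    """

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self._heap = []  # entries are (release_time, sequence, packet)
        self._seq = 0    # tie-breaker so packets are never compared

    def hold(self, packet: bytes, now: float) -> None:
        """Queue a packet; it becomes releasable at now + hold_seconds."""
        heapq.heappush(self._heap, (now + self.hold_seconds, self._seq, packet))
        self._seq += 1

    def release_due(self, now: float) -> list:
        """Pop and return every packet whose hold time has elapsed."""
        released = []
        while self._heap and self._heap[0][0] <= now:
            released.append(heapq.heappop(self._heap)[2])
        return released
```

A real handler would call hold() for each queued packet where
contains_abort() is true, poll release_due() with a monotonic clock, and
issue the accept verdict only for the packets it returns. If the VM goes
down first, the handler dies with it and the held ABORTs are never sent,
which is the whole point of the hack.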