* How to restrict SCTP abort during a process crash
@ 2017-12-12 16:51 Ashok Kumar
  2017-12-12 18:32 ` Neil Horman
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Ashok Kumar @ 2017-12-12 16:51 UTC (permalink / raw)
  To: linux-sctp

Hi,

We are using LKSCTP in our LTE product (HeNBGW), which also supports
high availability. If anything fails on the active VM, the standby VM
takes over the active role and all SCTP associations are moved to the
new active VM. The move should be transparent to the peers (a kind of
SCTP reset before the SCTP heartbeat expires on the peer nodes).

The problem we face is that when a process crashes on the active VM,
the LKSCTP stack immediately sends an SCTP ABORT to the peers for all
associations, before the system goes down completely. This confuses
the peers. Is there any way to avoid sending the SCTP ABORT in this
scenario? If so, please let us know how. If it needs an LKSCTP kernel
code change, please give pointers on what to change and where.

P.S.: We tried to block the ABORT messages by inserting iptables rules
dynamically from a signal handler (for signals 11 and 6, i.e. SIGSEGV
and SIGABRT), but this did not work.

A quick response will be highly appreciated.

Thanks,
Ashok


* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
@ 2017-12-12 18:32 ` Neil Horman
  2017-12-12 19:38 ` Marcelo Ricardo Leitner
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Neil Horman @ 2017-12-12 18:32 UTC (permalink / raw)
  To: linux-sctp

On Tue, Dec 12, 2017 at 10:21:31PM +0530, Ashok Kumar wrote:
> But the problem that we face is that when a process crashes on active
> VM, the LKSCTP stack immediately sends SCTP abort to the peers for all
> associations before the system goes down completely. This creates
> confusion with the peers. Is there any way to avoid sending SCTP abort
> message in this scenario? If yes, please let us know how to do the
> same? If it needs LKSCTP kernel code change, please give pointers on
> what and where to change.
> 
You're not going to be able to reliably block ABORTs, or any other packet, only
on a crash condition, because parts of the stack operate asynchronously to the
process.

About the closest thing I can think of would be to write a custom iptables rule
that matches ABORT packets and sends them to the NFQUEUE target. Then write a
userspace handler process for the queued packets which simply holds each ABORT
for at least one cluster liveness heartbeat interval (I'm assuming here that,
being a clustered system, it has some sort of liveness check). In a failure
situation, this hold may allow the cluster to shift to the new VM before your
queue handler releases any ABORT packets it holds; if there is no failover,
the ABORT is just released a little late.

I can't really recommend that approach, mind you (it's a horrid hack, and will
likely cause other protocol issues), but it's all I can think of at the moment.
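
As a rough illustration of the two pieces such a queue handler would need, here
is a minimal C sketch. It is not a working NFQUEUE client: the netlink plumbing
is left out, and the names sctp_has_abort(), abort_hold_decide() and the
hold_verdict enum are made up for this example. The first function walks the
chunks of an SCTP packet looking for an ABORT (chunk type 6 in RFC 4960); the
second decides whether a held ABORT should stay queued, be released, or be
dropped. Dropping once failover has happened is an assumption about the desired
policy, not something stated above. On the matching side, the iptables sctp
match should be able to select these packets with something like
`-m sctp --chunk-types any ABORT`, though that is worth checking against your
iptables version.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCTP_COMMON_HDR_LEN 12  /* src port, dst port, vtag, checksum */
#define SCTP_CHUNK_HDR_LEN   4  /* type, flags, 16-bit length */
#define SCTP_CID_ABORT       6  /* ABORT chunk type (RFC 4960) */

/* Return true if the SCTP packet (starting at the SCTP common header)
 * carries an ABORT chunk. A malformed chunk length ends the walk. */
static bool sctp_has_abort(const uint8_t *pkt, size_t len)
{
    size_t off = SCTP_COMMON_HDR_LEN;

    while (off + SCTP_CHUNK_HDR_LEN <= len) {
        uint8_t type = pkt[off];
        size_t clen = ((size_t)pkt[off + 2] << 8) | pkt[off + 3];

        if (clen < SCTP_CHUNK_HDR_LEN || off + clen > len)
            return false;
        if (type == SCTP_CID_ABORT)
            return true;
        off += (clen + 3) & ~(size_t)3;  /* chunks are padded to 4 bytes */
    }
    return false;
}

/* Hypothetical verdicts for a held ABORT packet. */
enum hold_verdict { HOLD_KEEP, HOLD_RELEASE, HOLD_DROP };

/* Decide what to do with an ABORT that was queued at held_at_ms.
 * hold_ms is the cluster liveness interval; failed_over is true once the
 * standby has taken over (assumed policy: drop the ABORT in that case). */
static enum hold_verdict abort_hold_decide(uint64_t held_at_ms, uint64_t now_ms,
                                           uint64_t hold_ms, bool failed_over)
{
    if (failed_over)
        return HOLD_DROP;
    if (now_ms - held_at_ms >= hold_ms)
        return HOLD_RELEASE;
    return HOLD_KEEP;
}
```

A real handler would call sctp_has_abort() on each queued payload after
skipping the IP header, and re-run abort_hold_decide() from a timer.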

Regards
Neil


* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
  2017-12-12 18:32 ` Neil Horman
@ 2017-12-12 19:38 ` Marcelo Ricardo Leitner
  2017-12-13  4:50 ` Ashok Kumar
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-12 19:38 UTC (permalink / raw)
  To: linux-sctp

On Tue, Dec 12, 2017 at 10:21:31PM +0530, Ashok Kumar wrote:
> But the problem that we face is that when a process crashes on active
> VM, the LKSCTP stack immediately sends SCTP abort to the peers for all

"... when a *process* crashes..."
Have you considered redesigning your application so that one process
handles one association?
It may not be the optimal solution, but then such a crash won't bring
all the other associations down too.

> associations before the system goes down completely. This creates
> confusion with the peers. Is there any way to avoid sending SCTP abort
> message in this scenario? If yes, please let us know how to do the
> same? If it needs LKSCTP kernel code change, please give pointers on
> what and where to change.

* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
  2017-12-12 18:32 ` Neil Horman
  2017-12-12 19:38 ` Marcelo Ricardo Leitner
@ 2017-12-13  4:50 ` Ashok Kumar
  2017-12-13  6:58 ` Xin Long
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ashok Kumar @ 2017-12-13  4:50 UTC (permalink / raw)
  To: linux-sctp

Thanks, Neil, for the suggestion. Yes, it sounds like a bad hack, but
we will give it a try. Meanwhile, if you can think of some other
solution, please let me know.

Thanks,
Ashok


* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
                   ` (2 preceding siblings ...)
  2017-12-13  4:50 ` Ashok Kumar
@ 2017-12-13  6:58 ` Xin Long
  2017-12-13 12:22 ` Neil Horman
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Xin Long @ 2017-12-13  6:58 UTC (permalink / raw)
  To: linux-sctp

On Wed, Dec 13, 2017 at 12:50 PM, Ashok Kumar <svashok79@gmail.com> wrote:
> Thanks Neil for the suggestion. Yes, it sounds to be a bad hack, but
> we will give it a try. Meanwhile, if you can think of some other
> solution please let me know.

Not sure whether your SCTP server app runs as a systemd service. If it does,
add it to 'After=' in a unit like the one below. systemd stops units in the
reverse of their start order, so this unit's ExecStop inserts the iptables
rule before your SCTP process is killed.

# cat /etc/systemd/system/sctp_no_abort.service
[Unit]
Description=SCTP No Abort Send When Shutdown
After=shutdown.target reboot.target halt.target

[Service]
Type=oneshot
ExecStart=/bin/true
ExecStop=/usr/bin/bash -c "iptables -A OUTPUT -p sctp -j DROP"
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target





* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
                   ` (3 preceding siblings ...)
  2017-12-13  6:58 ` Xin Long
@ 2017-12-13 12:22 ` Neil Horman
  2017-12-14  6:42 ` Ashok Kumar
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Neil Horman @ 2017-12-13 12:22 UTC (permalink / raw)
  To: linux-sctp

On Wed, Dec 13, 2017 at 02:58:34PM +0800, Xin Long wrote:
> On Wed, Dec 13, 2017 at 12:50 PM, Ashok Kumar <svashok79@gmail.com> wrote:
> > Thanks Neil for the suggestion. Yes, it sounds to be a bad hack, but
> > we will give it a try. Meanwhile, if you can think of some other
> > solution please let me know.
> 
> Not sure if your SCTP server app running as a systemd service,
> if yes, just add it to the 'After =', then let systemd insert the
> iptables rule before killing your sctp process.
> 
> # cat /etc/systemd/system/sctp_no_abort.service
> [Unit]
> Description=SCTP No Abort Send When Shutdown
> After=shutdown.target reboot.target halt.target
> 
> [Service]
> Type=oneshot
> ExecStart=/bin/true
> ExecStop=/usr/bin/bash -c "iptables -A OUTPUT -p sctp -j DROP"
> RemainAfterExit=yes
> 
> [Install]
> WantedBy=multi-user.target
> 
This would work for some packets, but those queued and sent by a timer might
make it out.

Neil


* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
                   ` (4 preceding siblings ...)
  2017-12-13 12:22 ` Neil Horman
@ 2017-12-14  6:42 ` Ashok Kumar
  2017-12-14  9:22 ` Xin Long
  2017-12-14 10:40 ` Neil Horman
  7 siblings, 0 replies; 9+ messages in thread
From: Ashok Kumar @ 2017-12-14  6:42 UTC (permalink / raw)
  To: linux-sctp

Neil / Xin,

Is the best way to change the LKSCTP kernel code to handle this
situation and stop sending the SCTP ABORT message?

Can you please give guidance on where to change the code?

Thanks,
Ashok



* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
                   ` (5 preceding siblings ...)
  2017-12-14  6:42 ` Ashok Kumar
@ 2017-12-14  9:22 ` Xin Long
  2017-12-14 10:40 ` Neil Horman
  7 siblings, 0 replies; 9+ messages in thread
From: Xin Long @ 2017-12-14  9:22 UTC (permalink / raw)
  To: linux-sctp

On Thu, Dec 14, 2017 at 2:30 PM, Ashok Kumar <svashok79@gmail.com> wrote:
> Neil / Xin,
>
> The best way is to change the LKSCTP kernel code to handle this
> situation and stop sending SCTP abort message?
>
> Can you please give guidance on where to change the code?

If the ABORT packet was generated by the app crash, try this:

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 1b00a1e..6cc245a 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1526,7 +1526,7 @@ static void sctp_close(struct sock *sk, long timeout)
                    (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime)) {
                        struct sctp_chunk *chunk;

-                       chunk = sctp_make_abort_user(asoc, NULL, 0);
+                       chunk = NULL; /* sctp_make_abort_user(asoc, NULL, 0); */
                        sctp_primitive_ABORT(net, asoc, chunk);
                } else
                        sctp_primitive_SHUTDOWN(net, asoc, NULL);




* Re: How to restrict SCTP abort during a process crash
  2017-12-12 16:51 How to restrict SCTP abort during a process crash Ashok Kumar
                   ` (6 preceding siblings ...)
  2017-12-14  9:22 ` Xin Long
@ 2017-12-14 10:40 ` Neil Horman
  7 siblings, 0 replies; 9+ messages in thread
From: Neil Horman @ 2017-12-14 10:40 UTC (permalink / raw)
  To: linux-sctp

On Thu, Dec 14, 2017 at 12:00:42PM +0530, Ashok Kumar wrote:
> Neil / Xin,
> 
> The best way is to change the LKSCTP kernel code to handle this
> situation and stop sending SCTP abort message?
> 
> Can you please give guidance on where to change the code?
> 
> Thanks,
> Ashok
> 

While Xin's code will have the desired effect, it is absolutely the wrong
thing to do: you don't want to suppress all abort messages, only those sent
in the event that a process needs to be migrated to your backup system.

If you're going to hack up your system like that, I'd suggest two things:

1) Use a systemtap script or a kprobe module to hook into the code in question;
that way you can keep your custom changes isolated.

2) Gate the setting of chunk to NULL on the presence of SIGKILL or SIGSEGV in
the pending signal set of the task you are working on.

Neil

