From mboxrd@z Thu Jan 1 00:00:00 1970 From: Neil Horman Subject: Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs Date: Sat, 26 May 2018 21:01:00 -0400 Message-ID: <20180527010059.GA15798@neilslaptop.think-freely.org> References: <20180525191319.GB392@hmswarspite.think-freely.org> <71FD541D-25FE-4100-980B-C3A0CEAF6CD4@lurchi.franken.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Michael Tuexen , Xin Long , network dev , linux-sctp@vger.kernel.org, David Miller , David Ahern , Eric Dumazet , Marcelo Ricardo Leitner , syzkaller To: Dmitry Vyukov Return-path: Received: from charlotte.tuxdriver.com ([70.61.120.58]:39229 "EHLO smtp.tuxdriver.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751103AbeE0BBv (ORCPT ); Sat, 26 May 2018 21:01:51 -0400 Content-Disposition: inline In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Sat, May 26, 2018 at 05:50:39PM +0200, Dmitry Vyukov wrote: > On Sat, May 26, 2018 at 5:42 PM, Michael Tuexen > wrote: > >> On 25. May 2018, at 21:13, Neil Horman wrote: > >> > >> On Sat, May 26, 2018 at 01:41:02AM +0800, Xin Long wrote: > >>> syzbot reported a rcu_sched self-detected stall on CPU which is caused > >>> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this > >>> value, hb_timer will get stuck there, as in its timer handler it starts > >>> this timer again with this value, then goes to the timer handler again. > >>> > >>> This problem is there since very beginning, and thanks to Eric for the > >>> reproducer shared from a syzbot mail. > >>> > >>> This patch fixes it by not allowing to set rto_min with a value below > >>> 200 msecs, which is based on TCP's, by either setsockopt or sysctl. > >>> > >>> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com > >>> Suggested-by: Marcelo Ricardo Leitner > >>> Signed-off-by: Xin Long > >>> --- > >>> include/net/sctp/constants.h | 1 + > >>> net/sctp/socket.c | 10 +++++++--- > >>> net/sctp/sysctl.c | 3 ++- > >>> 3 files changed, 10 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h > >>> index 20ff237..2ee7a7b 100644 > >>> --- a/include/net/sctp/constants.h > >>> +++ b/include/net/sctp/constants.h > >>> @@ -277,6 +277,7 @@ enum { SCTP_MAX_GABS = 16 }; > >>> #define SCTP_RTO_INITIAL (3 * 1000) > >>> #define SCTP_RTO_MIN (1 * 1000) > >>> #define SCTP_RTO_MAX (60 * 1000) > >>> +#define SCTP_RTO_HARD_MIN 200 > >>> > >>> #define SCTP_RTO_ALPHA 3 /* 1/8 when converted to right shifts. */ > >>> #define SCTP_RTO_BETA 2 /* 1/4 when converted to right shifts. */ > >>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c > >>> index ae7e7c6..6ef12c7 100644 > >>> --- a/net/sctp/socket.c > >>> +++ b/net/sctp/socket.c > >>> @@ -3029,7 +3029,8 @@ static int sctp_setsockopt_nodelay(struct sock *sk, char __user *optval, > >>> * be changed. > >>> * > >>> */ > >>> -static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigned int optlen) > >>> +static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, > >>> + unsigned int optlen) > >>> { > >>> struct sctp_rtoinfo rtoinfo; > >>> struct sctp_association *asoc; > >>> @@ -3056,10 +3057,13 @@ static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigne > >>> else > >>> rto_max = asoc ? asoc->rto_max : sp->rtoinfo.srto_max; > >>> > >>> - if (rto_min) > >>> + if (rto_min) { > >>> + if (rto_min < SCTP_RTO_HARD_MIN) > >>> + return -EINVAL; > >>> rto_min = asoc ? msecs_to_jiffies(rto_min) : rto_min; > >>> - else > >>> + } else { > >>> rto_min = asoc ? asoc->rto_min : sp->rtoinfo.srto_min; > >>> + } > >>> > >>> if (rto_min > rto_max) > >>> return -EINVAL; > >>> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c > >>> index 33ca5b7..7ec854a 100644 > >>> --- a/net/sctp/sysctl.c > >>> +++ b/net/sctp/sysctl.c > >>> @@ -52,6 +52,7 @@ static int rto_alpha_min = 0; > >>> static int rto_beta_min = 0; > >>> static int rto_alpha_max = 1000; > >>> static int rto_beta_max = 1000; > >>> +static int rto_hard_min = SCTP_RTO_HARD_MIN; > >>> > >>> static unsigned long max_autoclose_min = 0; > >>> static unsigned long max_autoclose_max = > >>> @@ -116,7 +117,7 @@ static struct ctl_table sctp_net_table[] = { > >>> .maxlen = sizeof(unsigned int), > >>> .mode = 0644, > >>> .proc_handler = proc_sctp_do_rto_min, > >>> - .extra1 = &one, > >>> + .extra1 = &rto_hard_min, > >>> .extra2 = &init_net.sctp.rto_max > >>> }, > >>> { > >>> -- > >>> 2.1.0 > >>> > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >> Patch looks fine, you probably want to note this hard minimum in man(7) sctp as > >> well > >> > > I'm aware of some signalling networks which use RTO.min of smaller values than 200ms. > > So could this be reduced? > > Hi Michael, > > What value do they use? > > Xin, Neil, is there more principled way of ensuring that a timer won't > cause a hard CPU stall? There are slow machines and there are slow > kernels (in particular syzbot kernel has tons of debug configs > enabled). 200ms _should_ not cause problems because we did not see > them with tcp. But it's hard to say what's the low limit as we are > trying to put a hard upper bound on execution time of a complex > section of code. Is there something like cond_resched for timers? Unfortunately, Theres not really a way to do conditional rescheduling of timers, additionally, we have a problem because the timer is reset as a side effect of the SCTP state machine, and so the execution time between timer updates has a signifcant amount of jitter (meaning its a pretty hard value to calibrate, unless you just select a 'safe' large value for the floor). What we might could do (though this might impact the protocol function is change the timer update side effects to simply set a flag, and consistently update the timers on exit from sctp_do_sm, so they don't re-arm until all state machine processing is complete. Anyone have any thoughts on that? Neil > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Neil Horman Date: Sun, 27 May 2018 01:01:00 +0000 Subject: Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs Message-Id: <20180527010059.GA15798@neilslaptop.think-freely.org> List-Id: References: <20180525191319.GB392@hmswarspite.think-freely.org> <71FD541D-25FE-4100-980B-C3A0CEAF6CD4@lurchi.franken.de> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Dmitry Vyukov Cc: Michael Tuexen , Xin Long , network dev , linux-sctp@vger.kernel.org, David Miller , David Ahern , Eric Dumazet , Marcelo Ricardo Leitner , syzkaller On Sat, May 26, 2018 at 05:50:39PM +0200, Dmitry Vyukov wrote: > On Sat, May 26, 2018 at 5:42 PM, Michael Tuexen > wrote: > >> On 25. May 2018, at 21:13, Neil Horman wrote: > >> > >> On Sat, May 26, 2018 at 01:41:02AM +0800, Xin Long wrote: > >>> syzbot reported a rcu_sched self-detected stall on CPU which is caused > >>> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this > >>> value, hb_timer will get stuck there, as in its timer handler it starts > >>> this timer again with this value, then goes to the timer handler again. > >>> > >>> This problem is there since very beginning, and thanks to Eric for the > >>> reproducer shared from a syzbot mail. > >>> > >>> This patch fixes it by not allowing to set rto_min with a value below > >>> 200 msecs, which is based on TCP's, by either setsockopt or sysctl. > >>> > >>> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com > >>> Suggested-by: Marcelo Ricardo Leitner > >>> Signed-off-by: Xin Long > >>> --- > >>> include/net/sctp/constants.h | 1 + > >>> net/sctp/socket.c | 10 +++++++--- > >>> net/sctp/sysctl.c | 3 ++- > >>> 3 files changed, 10 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h > >>> index 20ff237..2ee7a7b 100644 > >>> --- a/include/net/sctp/constants.h > >>> +++ b/include/net/sctp/constants.h > >>> @@ -277,6 +277,7 @@ enum { SCTP_MAX_GABS = 16 }; > >>> #define SCTP_RTO_INITIAL (3 * 1000) > >>> #define SCTP_RTO_MIN (1 * 1000) > >>> #define SCTP_RTO_MAX (60 * 1000) > >>> +#define SCTP_RTO_HARD_MIN 200 > >>> > >>> #define SCTP_RTO_ALPHA 3 /* 1/8 when converted to right shifts. */ > >>> #define SCTP_RTO_BETA 2 /* 1/4 when converted to right shifts. */ > >>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c > >>> index ae7e7c6..6ef12c7 100644 > >>> --- a/net/sctp/socket.c > >>> +++ b/net/sctp/socket.c > >>> @@ -3029,7 +3029,8 @@ static int sctp_setsockopt_nodelay(struct sock *sk, char __user *optval, > >>> * be changed. > >>> * > >>> */ > >>> -static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigned int optlen) > >>> +static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, > >>> + unsigned int optlen) > >>> { > >>> struct sctp_rtoinfo rtoinfo; > >>> struct sctp_association *asoc; > >>> @@ -3056,10 +3057,13 @@ static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigne > >>> else > >>> rto_max = asoc ? asoc->rto_max : sp->rtoinfo.srto_max; > >>> > >>> - if (rto_min) > >>> + if (rto_min) { > >>> + if (rto_min < SCTP_RTO_HARD_MIN) > >>> + return -EINVAL; > >>> rto_min = asoc ? msecs_to_jiffies(rto_min) : rto_min; > >>> - else > >>> + } else { > >>> rto_min = asoc ? asoc->rto_min : sp->rtoinfo.srto_min; > >>> + } > >>> > >>> if (rto_min > rto_max) > >>> return -EINVAL; > >>> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c > >>> index 33ca5b7..7ec854a 100644 > >>> --- a/net/sctp/sysctl.c > >>> +++ b/net/sctp/sysctl.c > >>> @@ -52,6 +52,7 @@ static int rto_alpha_min = 0; > >>> static int rto_beta_min = 0; > >>> static int rto_alpha_max = 1000; > >>> static int rto_beta_max = 1000; > >>> +static int rto_hard_min = SCTP_RTO_HARD_MIN; > >>> > >>> static unsigned long max_autoclose_min = 0; > >>> static unsigned long max_autoclose_max > >>> @@ -116,7 +117,7 @@ static struct ctl_table sctp_net_table[] = { > >>> .maxlen = sizeof(unsigned int), > >>> .mode = 0644, > >>> .proc_handler = proc_sctp_do_rto_min, > >>> - .extra1 = &one, > >>> + .extra1 = &rto_hard_min, > >>> .extra2 = &init_net.sctp.rto_max > >>> }, > >>> { > >>> -- > >>> 2.1.0 > >>> > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >> Patch looks fine, you probably want to note this hard minimum in man(7) sctp as > >> well > >> > > I'm aware of some signalling networks which use RTO.min of smaller values than 200ms. > > So could this be reduced? > > Hi Michael, > > What value do they use? > > Xin, Neil, is there more principled way of ensuring that a timer won't > cause a hard CPU stall? There are slow machines and there are slow > kernels (in particular syzbot kernel has tons of debug configs > enabled). 200ms _should_ not cause problems because we did not see > them with tcp. But it's hard to say what's the low limit as we are > trying to put a hard upper bound on execution time of a complex > section of code. Is there something like cond_resched for timers? Unfortunately, Theres not really a way to do conditional rescheduling of timers, additionally, we have a problem because the timer is reset as a side effect of the SCTP state machine, and so the execution time between timer updates has a signifcant amount of jitter (meaning its a pretty hard value to calibrate, unless you just select a 'safe' large value for the floor). What we might could do (though this might impact the protocol function is change the timer update side effects to simply set a flag, and consistently update the timers on exit from sctp_do_sm, so they don't re-arm until all state machine processing is complete. Anyone have any thoughts on that? Neil >