netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
@ 2017-08-18 18:44 Vallish Vaidyeshwara
  2017-08-18 18:44 ` [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets Vallish Vaidyeshwara
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-18 18:44 UTC (permalink / raw)
  To: davem, shuah, netdev, linux-kernel; +Cc: eduval, anchalag, vallish

Hello Dave,

Resending the patch series to include netdev mailing list with a
cover letter.

I am submitting 2 patch series to enable hires timer to timeout
datagram sockets (AF_UNIX & AF_INET domain) and test code to test
timeout accuracy on these sockets.

There has been a behavior change in 4.9 kernel with refactoring of Kernel
timer wheel in 4.8. We have a use case wherein our datagram socket
application is sensitive to socket timeout including long timeouts.

One of the test runs with a timeout value of 180 seconds timed out at
190 seconds.
[root@]# ./datagram_sock_timeout 180000
datagram_sock_timeout failed: took 190.00 seconds
[root@]#
The same program when run on a 4.4 kernel would timeout more accurately and
the kernel added slack was not noticeable to user application.

Patch 1: Has core code change of enabling hires timer to timeout datagram
	 socket on AF_UNIX and AF_INET domain
Patch 2: Test code to report regression in timeout behavior related to
	 patch 1

Vallish Vaidyeshwara (2):
  net: enable high resolution timer mode to timeout datagram sockets
  selftests/net: add test to verify datagram socket timeout

 net/core/datagram.c                                |  11 +-
 tools/testing/selftests/net/Makefile               |   3 +-
 .../testing/selftests/net/datagram_sock_timeout.c  | 119 +++++++++++++++++++++
 .../selftests/net/run_datagram_sock_timeout.sh     |  12 +++
 4 files changed, 143 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/net/datagram_sock_timeout.c
 create mode 100755 tools/testing/selftests/net/run_datagram_sock_timeout.sh

-- 
2.7.3.AMZN

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets
  2017-08-18 18:44 [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Vallish Vaidyeshwara
@ 2017-08-18 18:44 ` Vallish Vaidyeshwara
  2017-08-21 20:10   ` Cong Wang
  2017-08-18 18:44 ` [PATCH RESEND 2/2] selftests/net: add test to verify datagram socket timeout Vallish Vaidyeshwara
  2017-08-18 20:18 ` [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Richard Cochran
  2 siblings, 1 reply; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-18 18:44 UTC (permalink / raw)
  To: davem, shuah, netdev, linux-kernel; +Cc: eduval, anchalag, vallish

Enable high resolution timer mode to time SO_RCVTIMEO value used with
setsockopt(2) on AF_UNIX and AF_INET datagram sockets. By default,
SO_RCVTIMEO uses low resolution timer which is good for most of socket
use cases.

Background:
Kernel timer wheel was refactored in 4.8 to avoid drawbacks with previous
implementation:
https://lwn.net/Articles/691064/
Unlike the previous "kernel timer wheel" implementation in 4.4 which aimed
for accuracy by paying cost for cascading tracked timers at the boundary of
256 jiffies, the new timer wheel implementation gets rid of cascading
latency by paying a price for being less accurate for far off timers.

Use Case:
New implementation is good for most of socket use cases. However we have a
use case where our application is sensitive to socket timeout including
long timeouts.  Please refer to test code as part of this patch series.
One of the test runs with a timeout value of 180 seconds timed out at
190 seconds.
[root@]# ./datagram_sock_timeout 180000
datagram_sock_timeout failed: took 190.00 seconds
[root@]#
The same program when run on a 4.4 kernel would timeout more acurately and
the kernel added slack was not noticeable to user application.

Interesting text:
a) Standards for setsockopt:
http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html
<snip>
SO_RCVTIMEO
Sets the timeout value that specifies the maximum amount of time an input
function waits until it completes. It accepts a timeval structure with the
number of seconds and microseconds specifying the limit on how long to wait
for an input operation to complete. If a receive operation has blocked for
this much time without receiving additional data, it shall return with a
partial count or errno set to [EAGAIN] or [EWOULDBLOCK] if no data is
received. The default for this option is zero, which indicates that a
receive operation shall not time out. This option takes a timeval
structure. Note that not all implementations allow this option to be set.
<end snip>
This only talks about the maximum time and the current behavior indeed
follows this standard. System call does not return before the time
specified and it does return EAGAIN.
b) Man page for SETSOCKOPT(3P):
<snip>
The  option_name  argument  specifies  a  single  option to set. It can be
one of the socket-level options defined in <sys_socket.h> and described in
Section 2.10.16, Use of Options.  If option_name is equal to SO_RCVTIMEO
or SO_SNDTIMEO and the implementation supports setting the option, it is
unspecified whether the struct timeval  pointed  to by  option_value  is
stored  as  provided by this function or is rounded up to align with the
resolution of the clock being used. If setsockopt() is called with
option_name equal to SO_ACCEPTCONN, SO_ERROR, or SO_TYPE, the behavior is
unspecified.
<end snip>
Behavior is unspecified.
3) Man page for SELECT(2):
<snip>
Note  that  the  timeout  interval  will  be  rounded up to the system
clock granularity, and kernel scheduling delays mean that the blocking
interval may overrun by a small amount.  If both fields of the timeval
structure are zero, then select() returns immediately.  (This is useful
for polling.)  If timeout is NULL (no timeout),  select()  can block
indefinitely.
<end snip>
Select system call guarantees timeout interval and inturn uses highres
timer.

Reported-by: Manjula Peiris <thelgep@amazon.com>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
---
 net/core/datagram.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index ee5647b..c89a104 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -80,6 +80,7 @@ static int receiver_wake_function(wait_queue_entry_t *wait, unsigned int mode, i
 		return 0;
 	return autoremove_wake_function(wait, mode, sync, key);
 }
+
 /*
  * Wait for the last received packet to be different from skb
  */
@@ -87,6 +88,8 @@ int __skb_wait_for_more_packets(struct sock *sk, int *err, long *timeo_p,
 				const struct sk_buff *skb)
 {
 	int error;
+	ktime_t expires;
+	unsigned long pre_sched_time;
 	DEFINE_WAIT_FUNC(wait, receiver_wake_function);
 
 	prepare_to_wait_exclusive(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
@@ -116,7 +119,13 @@ int __skb_wait_for_more_packets(struct sock *sk, int *err, long *timeo_p,
 		goto interrupted;
 
 	error = 0;
-	*timeo_p = schedule_timeout(*timeo_p);
+	/* Wait using highres timer */
+	expires = ktime_add_ns(ktime_get(), jiffies_to_nsecs(*timeo_p));
+	pre_sched_time = jiffies;
+	if (schedule_hrtimeout(&expires, HRTIMER_MODE_ABS))
+		*timeo_p = jiffies - pre_sched_time;
+	else
+		*timeo_p = 0;
 out:
 	finish_wait(sk_sleep(sk), &wait);
 	return error;
-- 
2.7.3.AMZN

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH RESEND 2/2] selftests/net: add test to verify datagram socket timeout
  2017-08-18 18:44 [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Vallish Vaidyeshwara
  2017-08-18 18:44 ` [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets Vallish Vaidyeshwara
@ 2017-08-18 18:44 ` Vallish Vaidyeshwara
  2017-08-18 20:18 ` [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Richard Cochran
  2 siblings, 0 replies; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-18 18:44 UTC (permalink / raw)
  To: davem, shuah, netdev, linux-kernel; +Cc: eduval, anchalag, vallish

AF_UNIX and AF_INET datagram sockets use high resolution timer to time
SO_RCVTIMEO value used with setsockopt(2). This test checks for the
accuracy of kernel notifying these sockets timeout to application. Test
program has code to check AF_UNIX socket, however the kernel function used
to timeout AF_INET socket is the same kernel function used by AF_UNIX as
well which is __skb_wait_for_more_packets().

Reported-by: Manjula Peiris <thelgep@amazon.com>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
---
 tools/testing/selftests/net/Makefile               |   3 +-
 .../testing/selftests/net/datagram_sock_timeout.c  | 119 +++++++++++++++++++++
 .../selftests/net/run_datagram_sock_timeout.sh     |  12 +++
 3 files changed, 133 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/net/datagram_sock_timeout.c
 create mode 100755 tools/testing/selftests/net/run_datagram_sock_timeout.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index f6c9dbf..eb5a8c7 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -3,11 +3,12 @@
 CFLAGS =  -Wall -Wl,--no-as-needed -O2 -g
 CFLAGS += -I../../../../usr/include/
 
-TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh
+TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh run_datagram_sock_timeout.sh
 TEST_GEN_FILES =  socket
 TEST_GEN_FILES += psock_fanout psock_tpacket
 TEST_GEN_FILES += reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
 TEST_GEN_FILES += reuseport_dualstack
+TEST_GEN_FILES += datagram_sock_timeout
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/net/datagram_sock_timeout.c b/tools/testing/selftests/net/datagram_sock_timeout.c
new file mode 100644
index 0000000..2549be5
--- /dev/null
+++ b/tools/testing/selftests/net/datagram_sock_timeout.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright 2017 Amazon.com, Inc.
+ * Author: Manjula Peiris <thelgep@amazon.com>
+ *         Vallish Vaidyeshwara <vallish@amazon.com>
+ *
+ * selftests/net: test to verify datagram socket timeout
+ *
+ * AF_UNIX and AF_INET datagram sockets use high resolution timer to time
+ * SO_RCVTIMEO value used with setsockopt(2). This test checks for the accuracy
+ * of kernel notifying these sockets timeout to application. Test program has
+ * code to check AF_UNIX socket, however the kernel function used to timeout
+ * AF_INET socket is the same kernel function used by AF_UNIX as well which is
+ * __skb_wait_for_more_packets().
+ *
+ * License (GPLv2):
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.
+ */
+
+#include <stdlib.h>
+#include <sys/socket.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <err.h>
+#include <errno.h>
+#include <sys/un.h>
+#include <time.h>
+#include <assert.h>
+
+#define BUF_SIZE 128
+#define KB 1024
+#define NUM_FD 2
+
+static int set_socket_timeout(int sockfd, unsigned int ms)
+{
+	int ret;
+	struct timeval timeout;
+	socklen_t cb = sizeof(timeout);
+
+	timeout.tv_sec = ms / 1000;
+	timeout.tv_usec = (ms % 1000) * 1000;
+	ret = setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &timeout, cb);
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char err[BUF_SIZE];
+	int ret;
+	int fds[NUM_FD];
+	struct msghdr message;
+	char buffer[KB];
+	struct sockaddr_storage src_addr;
+	struct iovec iov[1];
+	time_t start, end;
+	unsigned int timeout;
+
+	iov[0].iov_base = buffer;
+	iov[0].iov_len = sizeof(buffer);
+	message.msg_name = &src_addr;
+	message.msg_namelen = sizeof(src_addr);
+	message.msg_iov = iov;
+	message.msg_iovlen = 1;
+	message.msg_control = 0;
+	message.msg_controllen = 0;
+
+	if (argc != 2) {
+		fprintf(stderr,
+			"datagram_sock_timeout failed: no timeout specified\n");
+		return -1;
+	}
+	timeout = (unsigned int)(atoi(argv[1]));
+
+	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0) {
+		strerror_r(errno, err, BUF_SIZE);
+		fprintf(stderr, "socketpair() call failed with %s\n", err);
+		return -1;
+	}
+
+	if (set_socket_timeout(fds[0], timeout) != 0) {
+		strerror_r(errno, err, BUF_SIZE);
+		fprintf(stderr, "setsockopt() call failed with %s\n", err);
+		return -1;
+	}
+
+	start = time(NULL);
+	ret = (int)recvmsg(fds[0], &message, 0);
+	end = time(NULL);
+	if (!(ret == -1 && errno == 11)) {
+		fprintf(stderr,
+			"datagram_sock_timeout failed: test was interrupted\n");
+		return -1;
+	}
+
+	if (((int)(end - start)) != (timeout / 1000)) {
+		fprintf(stderr,
+			"datagram_sock_timeout failed: took %.2f seconds\n",
+			(double)(end - start));
+		return -1;
+	}
+
+	close(fds[0]);
+	close(fds[1]);
+
+	fprintf(stderr, "datagram_sock_timeout passed\n");
+	return 0;
+}
diff --git a/tools/testing/selftests/net/run_datagram_sock_timeout.sh b/tools/testing/selftests/net/run_datagram_sock_timeout.sh
new file mode 100755
index 0000000..d5f4f82
--- /dev/null
+++ b/tools/testing/selftests/net/run_datagram_sock_timeout.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# Runs datagram socket timeout test
+
+echo "--------------------"
+echo "running run_datagram_sock_timeout test"
+echo "--------------------"
+./datagram_sock_timeout 180000
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+else
+	echo "[PASS]"
+fi
-- 
2.7.3.AMZN

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-18 18:44 [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Vallish Vaidyeshwara
  2017-08-18 18:44 ` [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets Vallish Vaidyeshwara
  2017-08-18 18:44 ` [PATCH RESEND 2/2] selftests/net: add test to verify datagram socket timeout Vallish Vaidyeshwara
@ 2017-08-18 20:18 ` Richard Cochran
  2017-08-18 22:27   ` Vallish Vaidyeshwara
  2 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2017-08-18 20:18 UTC (permalink / raw)
  To: Vallish Vaidyeshwara
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Fri, Aug 18, 2017 at 06:44:08PM +0000, Vallish Vaidyeshwara wrote:
> There has been a behavior change in 4.9 kernel with refactoring of Kernel
> timer wheel in 4.8. We have a use case wherein our datagram socket
> application is sensitive to socket timeout including long timeouts.
> 
> One of the test runs with a timeout value of 180 seconds timed out at
> 190 seconds.

So the whole premise of the new timer wheel is that long timeouts need
not be very accurate.

> Patch 1: Has core code change of enabling hires timer to timeout datagram
> 	 socket on AF_UNIX and AF_INET domain

Using hrtimers will hurt performance for most applications.  Can you
please explain your use case and why is it so important?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-18 20:18 ` [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Richard Cochran
@ 2017-08-18 22:27   ` Vallish Vaidyeshwara
  2017-08-19  6:21     ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-18 22:27 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Fri, Aug 18, 2017 at 10:18:54PM +0200, Richard Cochran wrote:
> On Fri, Aug 18, 2017 at 06:44:08PM +0000, Vallish Vaidyeshwara wrote:
> > There has been a behavior change in 4.9 kernel with refactoring of Kernel
> > timer wheel in 4.8. We have a use case wherein our datagram socket
> > application is sensitive to socket timeout including long timeouts.
> > 
> > One of the test runs with a timeout value of 180 seconds timed out at
> > 190 seconds.
> 
> So the whole premise of the new timer wheel is that long timeouts need
> not be very accurate.
> 
> > Patch 1: Has core code change of enabling hires timer to timeout datagram
> > 	 socket on AF_UNIX and AF_INET domain
>

Hello Richard,

> Using hrtimers will hurt performance for most applications.  Can you
> please explain your use case and why is it so important?
>

We have a on-demand application that uses long timeouts and needs to react to
events within milliseconds. With this change in behavior, we notice from 4.4 to
4.9, the legacy application is now exhibiting incorrect behavior with same set
of system calls.

Thanks.
-Vallish

> Thanks,
> Richard
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-18 22:27   ` Vallish Vaidyeshwara
@ 2017-08-19  6:21     ` Richard Cochran
  2017-08-20  1:47       ` Vallish Vaidyeshwara
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2017-08-19  6:21 UTC (permalink / raw)
  To: Vallish Vaidyeshwara
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Fri, Aug 18, 2017 at 10:27:56PM +0000, Vallish Vaidyeshwara wrote:
> We have a on-demand application that uses long timeouts and needs to react to
> events within milliseconds.

Huh?  The test program you posted does not react to any event.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-19  6:21     ` Richard Cochran
@ 2017-08-20  1:47       ` Vallish Vaidyeshwara
  2017-08-21 18:22         ` Vallish Vaidyeshwara
  0 siblings, 1 reply; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-20  1:47 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Sat, Aug 19, 2017 at 08:21:45AM +0200, Richard Cochran wrote:
> On Fri, Aug 18, 2017 at 10:27:56PM +0000, Vallish Vaidyeshwara wrote:
> > We have a on-demand application that uses long timeouts and needs to react to
> > events within milliseconds.
>

Hello Richard,

> Huh?  The test program you posted does not react to any event.
>

Application has logic for complex events and test program is kept simple to
highlight the change in behavior seen with system calls.

Thanks.
-Vallish

> Thanks,
> Richard
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-20  1:47       ` Vallish Vaidyeshwara
@ 2017-08-21 18:22         ` Vallish Vaidyeshwara
  2017-08-22  6:23           ` Richard Cochran
  0 siblings, 1 reply; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-21 18:22 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Sun, Aug 20, 2017 at 01:47:45AM +0000, Vallish Vaidyeshwara wrote:
> On Sat, Aug 19, 2017 at 08:21:45AM +0200, Richard Cochran wrote:
> > On Fri, Aug 18, 2017 at 10:27:56PM +0000, Vallish Vaidyeshwara wrote:
> > > We have a on-demand application that uses long timeouts and needs to react to
> > > events within milliseconds.
> >
> 
> Hello Richard,
> 
> > Huh?  The test program you posted does not react to any event.
> >
> 
> Application has logic for complex events and test program is kept simple to
> highlight the change in behavior seen with system calls.
>

Hello Richard,

AWS Lambda is affected by this change in behavior in
system call. Following links has more information:
https://en.wikipedia.org/wiki/AWS_Lambda
https://aws.amazon.com/lambda/

Thanks.
-Vallish

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets
  2017-08-18 18:44 ` [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets Vallish Vaidyeshwara
@ 2017-08-21 20:10   ` Cong Wang
  2017-08-22 11:14     ` Vallish Vaidyeshwara
  0 siblings, 1 reply; 12+ messages in thread
From: Cong Wang @ 2017-08-21 20:10 UTC (permalink / raw)
  To: Vallish Vaidyeshwara
  Cc: David Miller, shuah, Linux Kernel Network Developers, LKML,
	Eduardo Valentin, anchalag

On Fri, Aug 18, 2017 at 11:44 AM, Vallish Vaidyeshwara
<vallish@amazon.com> wrote:
> -       *timeo_p = schedule_timeout(*timeo_p);
> +       /* Wait using highres timer */
> +       expires = ktime_add_ns(ktime_get(), jiffies_to_nsecs(*timeo_p));
> +       pre_sched_time = jiffies;
> +       if (schedule_hrtimeout(&expires, HRTIMER_MODE_ABS))

Does this work with MAX_SCHEDULE_TIMEOUT too??

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-21 18:22         ` Vallish Vaidyeshwara
@ 2017-08-22  6:23           ` Richard Cochran
  2017-08-22 11:17             ` Vallish Vaidyeshwara
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Cochran @ 2017-08-22  6:23 UTC (permalink / raw)
  To: Vallish Vaidyeshwara
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Mon, Aug 21, 2017 at 06:22:10PM +0000, Vallish Vaidyeshwara wrote:
> AWS Lambda is affected by this change in behavior in
> system call. Following links has more information:
> https://en.wikipedia.org/wiki/AWS_Lambda

Quote:

    Unlike Amazon EC2, which is priced by the hour, AWS Lambda is
    metered in increments of 100 milliseconds.

So I guess you want the accurate timeout in order to support billing?
In any case, even with the old wheel you didn't have guarantees WRT
timeout latency, and so the proper way for the application to handle
this is to use a timerfd together with HIGH_RES_TIMERS, and PREEMPT_RT
in order to have sub-millisecond latency.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets
  2017-08-21 20:10   ` Cong Wang
@ 2017-08-22 11:14     ` Vallish Vaidyeshwara
  0 siblings, 0 replies; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-22 11:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, shuah, Linux Kernel Network Developers, LKML,
	Eduardo Valentin, anchalag

On Mon, Aug 21, 2017 at 01:10:34PM -0700, Cong Wang wrote:
> On Fri, Aug 18, 2017 at 11:44 AM, Vallish Vaidyeshwara
> <vallish@amazon.com> wrote:
> > -       *timeo_p = schedule_timeout(*timeo_p);
> > +       /* Wait using highres timer */
> > +       expires = ktime_add_ns(ktime_get(), jiffies_to_nsecs(*timeo_p));
> > +       pre_sched_time = jiffies;
> > +       if (schedule_hrtimeout(&expires, HRTIMER_MODE_ABS))
>

Hello Cong,

> Does this work with MAX_SCHEDULE_TIMEOUT too??
> 

Thanks for pointing out MAX_SCHEDULE_TIMEOUT. I have made minor change to
accommodate MAX_SCHEDULE_TIMEOUT and will send out next version of the patch
for review.

Thanks.
-Vallish

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/2] enable hires timer to timeout datagram socket
  2017-08-22  6:23           ` Richard Cochran
@ 2017-08-22 11:17             ` Vallish Vaidyeshwara
  0 siblings, 0 replies; 12+ messages in thread
From: Vallish Vaidyeshwara @ 2017-08-22 11:17 UTC (permalink / raw)
  To: Richard Cochran
  Cc: davem, shuah, netdev, linux-kernel, eduval, anchalag, tglx

On Tue, Aug 22, 2017 at 08:23:11AM +0200, Richard Cochran wrote:
> On Mon, Aug 21, 2017 at 06:22:10PM +0000, Vallish Vaidyeshwara wrote:
> > AWS Lambda is affected by this change in behavior in
> > system call. Following links has more information:
> > https://en.wikipedia.org/wiki/AWS_Lambda
> 
> Quote:
> 
>     Unlike Amazon EC2, which is priced by the hour, AWS Lambda is
>     metered in increments of 100 milliseconds.
>
> So I guess you want the accurate timeout in order to support billing?
> In any case, even with the old wheel you didn't have guarantees WRT
> timeout latency, and so the proper way for the application to handle
> this is to use a timerfd together with HIGH_RES_TIMERS, and PREEMPT_RT
> in order to have sub-millisecond latency.
> 
> Thanks,
> Richard

Hello Richard,

4.4 kernel implementation of datagram socket wait code is calling
schedule_timeout() which in-turn calls __mod_timer(). __mod_timer()
does not add any slack. mod_timer() is the function that adds slack.
This gives good consistent results for event handling response time
on datagram socket timeouts.

strace from 4.4 test run of waiting for 180 seconds:
10:25:48.239685 setsockopt(3, SOL_SOCKET, SO_RCVTIMEO, "\264\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
10:25:48.239755 recvmsg(3, 0x7ffd0a3beec0, 0) = -1 EAGAIN (Resource temporarily unavailable)
10:28:48.236989 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0

strace from 4.9 test run of waiting for 180 seconds times out close to 195 seconds:
setsockopt(3, SOL_SOCKET, SO_RCVTIMEO, "\264\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 <0.000028>
recvmsg(3, 0x7ffd6a2c4380, 0)           = -1 EAGAIN (Resource temporarily unavailable) <194.852000>
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0 <0.000018>

This change of behavior in system call is breaking the application logic and
response time.

Thanks.
-Vallish

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-08-22 11:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-18 18:44 [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Vallish Vaidyeshwara
2017-08-18 18:44 ` [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets Vallish Vaidyeshwara
2017-08-21 20:10   ` Cong Wang
2017-08-22 11:14     ` Vallish Vaidyeshwara
2017-08-18 18:44 ` [PATCH RESEND 2/2] selftests/net: add test to verify datagram socket timeout Vallish Vaidyeshwara
2017-08-18 20:18 ` [PATCH RESEND 0/2] enable hires timer to timeout datagram socket Richard Cochran
2017-08-18 22:27   ` Vallish Vaidyeshwara
2017-08-19  6:21     ` Richard Cochran
2017-08-20  1:47       ` Vallish Vaidyeshwara
2017-08-21 18:22         ` Vallish Vaidyeshwara
2017-08-22  6:23           ` Richard Cochran
2017-08-22 11:17             ` Vallish Vaidyeshwara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).