netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] net: low latency sockets follow ups
@ 2013-06-11 14:24 Eliezer Tamir
  2013-06-11 14:24 ` [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue Eliezer Tamir
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-11 14:24 UTC (permalink / raw)
  To: David Miller
  Cc: Willem de Bruijn, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, linux-kernel, Eliezer Tamir, Jesse Brandeburg,
	Andi Kleen, Ben Hutchings, Eric Dumazet, Eilon Greenstien

David,

Here are two more patches.

Patch 1 removes the config menu for NET_LL_RX_POLL and defaults to y.
Patch 2 adds an SO_LL socket option to allow per-socket control of busy polling.

The patches do not depend on each other.
If one of them needs more work, please consider applying the other.

I will reply to this thread with a patch to sockperf that enables using
the socket option.

Thanks,
Eliezer


------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-11 14:24 [PATCH net-next 0/2] net: low latency sockets follow ups Eliezer Tamir
@ 2013-06-11 14:24 ` Eliezer Tamir
  2013-06-12 22:12   ` David Miller
  2013-06-11 14:24 ` [PATCH net-next 2/2] net:add socket option for low latency polling Eliezer Tamir
  2013-06-11 14:26 ` [PATCH] sockperf: add SO_LL socketop support Eliezer Tamir
  2 siblings, 1 reply; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-11 14:24 UTC (permalink / raw)
  To: David Miller
  Cc: Willem de Bruijn, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, linux-kernel, Eliezer Tamir, Jesse Brandeburg,
	Andi Kleen, Ben Hutchings, Eric Dumazet, Eilon Greenstien

Remove NET_LL_RX_POLL from the config menu.
Change default to y.
Busy polling still needs to be enabled at runtime via sysctl.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 net/Kconfig |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/net/Kconfig b/net/Kconfig
index d6a9ce6..8fe8845 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -244,16 +244,9 @@ config NETPRIO_CGROUP
 	  a per-interface basis
 
 config NET_LL_RX_POLL
-	bool "Low Latency Receive Poll"
+	boolean
 	depends on X86_TSC
-	default n
-	---help---
-	  Support Low Latency Receive Queue Poll.
-	  (For network card drivers which support this option.)
-	  When waiting for data in read or poll call directly into the the device driver
-	  to flush packets which may be pending on the device queues into the stack.
-
-	  If unsure, say N.
+	default y
 
 config BQL
 	boolean


------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 14:24 [PATCH net-next 0/2] net: low latency sockets follow ups Eliezer Tamir
  2013-06-11 14:24 ` [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue Eliezer Tamir
@ 2013-06-11 14:24 ` Eliezer Tamir
  2013-06-11 14:45   ` Eric Dumazet
  2013-06-11 20:24   ` Ben Hutchings
  2013-06-11 14:26 ` [PATCH] sockperf: add SO_LL socketop support Eliezer Tamir
  2 siblings, 2 replies; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-11 14:24 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, Jesse Brandeburg, Don Skidmore,
	e1000-devel, Willem de Bruijn, Eric Dumazet, Ben Hutchings,
	Andi Kleen, HPA, Eilon Greenstien, Or Gerlitz, Amir Vadai,
	Eliezer Tamir

adds a socket option for low latency polling.
This allows overriding the global sysctl value with a per-socket one.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 arch/alpha/include/uapi/asm/socket.h   |    2 ++
 arch/avr32/include/uapi/asm/socket.h   |    2 ++
 arch/cris/include/uapi/asm/socket.h    |    2 ++
 arch/frv/include/uapi/asm/socket.h     |    2 ++
 arch/h8300/include/uapi/asm/socket.h   |    2 ++
 arch/ia64/include/uapi/asm/socket.h    |    2 ++
 arch/m32r/include/uapi/asm/socket.h    |    2 ++
 arch/mips/include/uapi/asm/socket.h    |    2 ++
 arch/mn10300/include/uapi/asm/socket.h |    2 ++
 arch/parisc/include/uapi/asm/socket.h  |    2 ++
 arch/powerpc/include/uapi/asm/socket.h |    2 ++
 arch/s390/include/uapi/asm/socket.h    |    2 ++
 arch/sparc/include/uapi/asm/socket.h   |    2 ++
 arch/xtensa/include/uapi/asm/socket.h  |    2 ++
 include/net/ll_poll.h                  |   10 +++++-----
 include/net/sock.h                     |    2 ++
 include/uapi/asm-generic/socket.h      |    2 ++
 net/core/sock.c                        |   26 ++++++++++++++++++++++++++
 18 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index eee6ea7..4885825 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -81,4 +81,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/avr32/include/uapi/asm/socket.h b/arch/avr32/include/uapi/asm/socket.h
index 37401f5..79b6179 100644
--- a/arch/avr32/include/uapi/asm/socket.h
+++ b/arch/avr32/include/uapi/asm/socket.h
@@ -74,4 +74,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* __ASM_AVR32_SOCKET_H */
diff --git a/arch/cris/include/uapi/asm/socket.h b/arch/cris/include/uapi/asm/socket.h
index ba409c9..47b1ec5 100644
--- a/arch/cris/include/uapi/asm/socket.h
+++ b/arch/cris/include/uapi/asm/socket.h
@@ -76,6 +76,8 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_SOCKET_H */
 
 
diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h
index 31dbb5d..dbc0852 100644
--- a/arch/frv/include/uapi/asm/socket.h
+++ b/arch/frv/include/uapi/asm/socket.h
@@ -74,5 +74,7 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_SOCKET_H */
 
diff --git a/arch/h8300/include/uapi/asm/socket.h b/arch/h8300/include/uapi/asm/socket.h
index 5d1c6d0..a38d38a 100644
--- a/arch/h8300/include/uapi/asm/socket.h
+++ b/arch/h8300/include/uapi/asm/socket.h
@@ -74,4 +74,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h
index 6b4329f..d3358b7 100644
--- a/arch/ia64/include/uapi/asm/socket.h
+++ b/arch/ia64/include/uapi/asm/socket.h
@@ -83,4 +83,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_IA64_SOCKET_H */
diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h
index 2a3b59e..44aaf46 100644
--- a/arch/m32r/include/uapi/asm/socket.h
+++ b/arch/m32r/include/uapi/asm/socket.h
@@ -74,4 +74,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_M32R_SOCKET_H */
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 3b21150..6a07992 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -92,4 +92,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h
index b4ce844..db80fd3 100644
--- a/arch/mn10300/include/uapi/asm/socket.h
+++ b/arch/mn10300/include/uapi/asm/socket.h
@@ -74,4 +74,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index 70c512a..f866fff 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -73,6 +73,8 @@
 
 #define SO_SELECT_ERR_QUEUE	0x4026
 
+#define SO_LL			0x4027
+
 /* O_NONBLOCK clashes with the bits used for socket types.  Therefore we
  * have to define SOCK_NONBLOCK to a different value here.
  */
diff --git a/arch/powerpc/include/uapi/asm/socket.h b/arch/powerpc/include/uapi/asm/socket.h
index a36daf3..405fb09 100644
--- a/arch/powerpc/include/uapi/asm/socket.h
+++ b/arch/powerpc/include/uapi/asm/socket.h
@@ -81,4 +81,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif	/* _ASM_POWERPC_SOCKET_H */
diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h
index 2dacb306..0c5105fb 100644
--- a/arch/s390/include/uapi/asm/socket.h
+++ b/arch/s390/include/uapi/asm/socket.h
@@ -80,4 +80,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 89f49b6..b46c3fa 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -70,6 +70,8 @@
 
 #define SO_SELECT_ERR_QUEUE	0x0029
 
+#define SO_LL			0x0030
+
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define SO_SECURITY_AUTHENTICATION		0x5001
 #define SO_SECURITY_ENCRYPTION_TRANSPORT	0x5002
diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h
index a8f44f5..b21ace4 100644
--- a/arch/xtensa/include/uapi/asm/socket.h
+++ b/arch/xtensa/include/uapi/asm/socket.h
@@ -85,4 +85,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif	/* _XTENSA_SOCKET_H */
diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index bc262f8..4dafd52 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -43,14 +43,14 @@ extern unsigned long sysctl_net_ll_poll __read_mostly;
 /* we don't mind a ~2.5% imprecision */
 #define TSC_MHZ (tsc_khz >> 10)
 
-static inline cycles_t ll_end_time(void)
+static inline cycles_t ll_end_time(struct sock *sk)
 {
-	return TSC_MHZ * ACCESS_ONCE(sysctl_net_ll_poll) + get_cycles();
+	return TSC_MHZ * ACCESS_ONCE(sk->sk_ll_usec) + get_cycles();
 }
 
 static inline bool sk_valid_ll(struct sock *sk)
 {
-	return sysctl_net_ll_poll && sk->sk_napi_id &&
+	return sk->sk_ll_usec && sk->sk_napi_id &&
 	       !need_resched() && !signal_pending(current);
 }
 
@@ -62,7 +62,7 @@ static inline bool can_poll_ll(cycles_t end_time)
 
 static inline bool sk_poll_ll(struct sock *sk, int nonblock)
 {
-	cycles_t end_time = ll_end_time();
+	cycles_t end_time = ll_end_time(sk);
 	const struct net_device_ops *ops;
 	struct napi_struct *napi;
 	int rc = false;
@@ -116,7 +116,7 @@ static inline void sk_mark_ll(struct sock *sk, struct sk_buff *skb)
 
 #else /* CONFIG_NET_LL_RX_POLL */
 
-static inline cycles_t ll_end_time(void)
+static inline cycles_t ll_end_time(struct sock *sk)
 {
 	return 0;
 }
diff --git a/include/net/sock.h b/include/net/sock.h
index ac8e181..21db792 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -230,6 +230,7 @@ struct cg_proto;
   *	@sk_wmem_queued: persistent queue size
   *	@sk_forward_alloc: space allocated forward
   *	@sk_napi_id: id of the last napi context to receive data for sk
+  *	@sk_ll_usec: usecs to busypoll when there is no data
   *	@sk_allocation: allocation mode
   *	@sk_sndbuf: size of send buffer in bytes
   *	@sk_flags: %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE,
@@ -328,6 +329,7 @@ struct sock {
 #endif
 #ifdef CONFIG_NET_LL_RX_POLL
 	unsigned int		sk_napi_id;
+	unsigned int		sk_ll_usec;
 #endif
 	atomic_t		sk_drops;
 	int			sk_rcvbuf;
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index c5d2e3a..ca3a20d 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -76,4 +76,6 @@
 
 #define SO_SELECT_ERR_QUEUE	45
 
+#define SO_LL			46
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index 788c0da..bddc962 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -913,6 +913,23 @@ set_rcvbuf:
 		sock_valbool_flag(sk, SOCK_SELECT_ERR_QUEUE, valbool);
 		break;
 
+#ifdef CONFIG_NET_LL_RX_POLL
+	case SO_LL:
+		if (!capable(CAP_NET_ADMIN))
+			ret = -EACCES;
+		else {
+			unsigned long ulval;
+
+			if (!copy_from_user(&ulval, optval, sizeof(ulval))
+							&& ulval >= 0)
+				sk->sk_ll_usec = ulval;
+
+			else
+				ret = -EINVAL;
+
+		}
+		break;
+#endif
 	default:
 		ret = -ENOPROTOOPT;
 		break;
@@ -944,6 +961,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 
 	union {
 		int val;
+		unsigned long ulval;
 		struct linger ling;
 		struct timeval tm;
 	} v;
@@ -1170,6 +1188,13 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sock_flag(sk, SOCK_SELECT_ERR_QUEUE);
 		break;
 
+#ifdef CONFIG_NET_LL_RX_POLL
+	case SO_LL:
+		len = sizeof(v.ulval);
+		v.ulval = sk->sk_ll_usec;
+		break;
+#endif
+
 	default:
 		return -ENOPROTOOPT;
 	}
@@ -2288,6 +2313,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 #ifdef CONFIG_NET_LL_RX_POLL
 	sk->sk_napi_id		=	0;
+	sk->sk_ll_usec		=	sysctl_net_ll_poll;
 #endif
 
 	/*

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH] sockperf: add SO_LL socketop support
  2013-06-11 14:24 [PATCH net-next 0/2] net: low latency sockets follow ups Eliezer Tamir
  2013-06-11 14:24 ` [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue Eliezer Tamir
  2013-06-11 14:24 ` [PATCH net-next 2/2] net:add socket option for low latency polling Eliezer Tamir
@ 2013-06-11 14:26 ` Eliezer Tamir
  2013-06-12  8:36   ` Amir Vadai
  2 siblings, 1 reply; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-11 14:26 UTC (permalink / raw)
  To: David Miller
  Cc: Willem de Bruijn, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, linux-kernel, Eliezer Tamir, Jesse Brandeburg,
	Andi Kleen, Ben Hutchings, Eric Dumazet, Eilon Greenstien

Add lls socket option support to sockperf.
Right now we always get the option before set to show the option is 
working properly. We should probably remove that in an official release.
use --lls (value in usecs) to override global setting.
---

  src/Defs.h       |    3 +++
  src/SockPerf.cpp |   54 
+++++++++++++++++++++++++++++++++++++++++++++++++++++-
  2 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/src/Defs.h b/src/Defs.h
index e38e3a4..87b45a0 100644
--- a/src/Defs.h
+++ b/src/Defs.h
@@ -161,6 +161,7 @@ enum {
      OPT_OUTPUT_PRECISION,           //35
      OPT_CLIENTPORT,                 //36
      OPT_CLIENTIP,                   //37
+    OPT_LLS,            //38
  };

  #define MODULE_NAME            "sockperf"
@@ -527,6 +528,8 @@ struct user_params_t {
  //    bool stream_mode; - use b_stream instead
      int mthread_server;
      struct timeval* select_timeout;
+    unsigned long lls_usecs;
+    bool lls_is_set;
      int sock_buff_size;
      int threads_num;
      char threads_affinity[MAX_ARGV_SIZE];
diff --git a/src/SockPerf.cpp b/src/SockPerf.cpp
index 41daf95..d76320f 100644
--- a/src/SockPerf.cpp
+++ b/src/SockPerf.cpp
@@ -207,6 +207,10 @@ static const AOPT_DESC  common_opt_desc[] =
          "Limit the lifetime of the message (default 2)."
      },
      {
+        OPT_LLS, AOPT_ARG, aopt_set_literal( 0 ), aopt_set_string( "lls" ),
+        "Turn on LLS via socket option (value = us to poll)."
+    },
+    {
          OPT_BUFFER_SIZE, AOPT_ARG, aopt_set_literal( 0 ), 
aopt_set_string( "buffer-size" ),
          "Set total socket receive/send buffer <size> in bytes (system 
defined by default)."
      },
@@ -292,7 +296,7 @@ static int proc_mode_help( int id, int argc, const 
char **argv )
      int   i = 0;

      printf(MODULE_NAME " is a tool for testing network latency and 
throughput.\n");
-    printf("version %s\n", STR(VERSION));
+    printf("version %s-lls\n", STR(VERSION));
      printf("\n");
      printf("Usage: " MODULE_NAME " <subcommand> [options] [args]\n");
      printf("Type: \'" MODULE_NAME " <subcommand> --help\' for help on 
a specific subcommand.\n");
@@ -1789,6 +1793,26 @@ static int parse_common_opt( const AOPT_OBJECT 
*common_obj )
              s_user_params.is_nonblocked_send = true;
          }

+        if ( !rc && aopt_check(common_obj, OPT_LLS) ) {
+            const char* optarg = aopt_value(common_obj, OPT_LLS);
+            if (optarg) {
+                errno = 0;
+                int value = strtoul(optarg, NULL, 0);
+                if (errno != 0 || value < 0) {
+                    log_msg("'-%d' Invalid LLS value: %s", OPT_LLS, 
optarg);
+                    rc = SOCKPERF_ERR_BAD_ARGUMENT;
+                }
+                else {
+                    s_user_params.lls_usecs = value;
+                    s_user_params.lls_is_set = true;
+                }
+            }
+            else {
+                log_msg("'-%d' Invalid value", OPT_LLS);
+                rc = SOCKPERF_ERR_BAD_ARGUMENT;
+            }
+        }
+
          if ( !rc && aopt_check(common_obj, OPT_RECV_LOOPING) ) {

              const char* optarg = aopt_value(common_obj, OPT_RECV_LOOPING);
@@ -2296,6 +2320,29 @@ int sock_set_reuseaddr(int fd)
      return rc;
  }

+#ifndef SO_LL
+#define SO_LL 46
+#endif
+int sock_set_lls(int fd)
+{
+    int rc = SOCKPERF_ERR_NONE;
+    unsigned long lls;
+    int size = sizeof(lls);
+
+    if(getsockopt(fd, SOL_SOCKET, SO_LL, &lls, (socklen_t *)&size) < 0){
+        log_err("getsockopt(SO_LL) failed");
+        return SOCKPERF_ERR_SOCKET;
+    } else
+        log_msg("socket option SO_LL default was %lu, changing to %lu", 
lls, s_user_params.lls_usecs);
+
+    if (setsockopt(fd, SOL_SOCKET, SO_LL, &(s_user_params.lls_usecs), 
sizeof(s_user_params.lls_usecs)) < 0) {
+        log_err("setsockopt(SO_LL) failed");
+        rc = SOCKPERF_ERR_SOCKET;
+    }
+    return rc;
+}
+
+
  int sock_set_snd_rcv_bufs(int fd)
  {
      /*
@@ -2460,6 +2507,11 @@ int prepare_socket(int fd, struct fds_data *p_data)
      }

      if (!rc &&
+            (s_user_params.lls_is_set == true))
+    {
+        rc = sock_set_lls(fd);
+    }
+    if (!rc &&
              (s_user_params.sock_buff_size > 0))
      {
          rc = sock_set_snd_rcv_bufs(fd);






------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 14:24 ` [PATCH net-next 2/2] net:add socket option for low latency polling Eliezer Tamir
@ 2013-06-11 14:45   ` Eric Dumazet
  2013-06-11 15:37     ` Eliezer Tamir
  2013-06-11 20:24   ` Ben Hutchings
  1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2013-06-11 14:45 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: Willem de Bruijn, Don, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Andi Kleen, Ben Hutchings, Eilon Greenstien, David Miller

On Tue, 2013-06-11 at 17:24 +0300, Eliezer Tamir wrote:
> adds a socket option for low latency polling.
> This allows overriding the global sysctl value with a per-socket one.
> 
> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> ---
> 
>  
> -static inline cycles_t ll_end_time(void)
> +static inline cycles_t ll_end_time(struct sock *sk)
>  {
> -	return TSC_MHZ * ACCESS_ONCE(sysctl_net_ll_poll) + get_cycles();
> +	return TSC_MHZ * ACCESS_ONCE(sk->sk_ll_usec) + get_cycles();
>  }

Hmm, sk_ll_usec is an unsigned int.

(cycles_t)TSC_MHZ * ACCESS_ONCE(sk->sk_ll_usec) + get_cycles();

>  
> +#ifdef CONFIG_NET_LL_RX_POLL
> +	case SO_LL:
> +		if (!capable(CAP_NET_ADMIN))
> +			ret = -EACCES;
> +		else {
> +			unsigned long ulval;

Why an "unsigned long" ?


> +
> +			if (!copy_from_user(&ulval, optval, sizeof(ulval))
> +							&& ulval >= 0)
> +				sk->sk_ll_usec = ulval;
> +
> +			else
> +				ret = -EINVAL;

Just use
	sk->sk_ll_usec = val;

> +
> +		}
> +		break;
> +#endif
>  	default:
>  		ret = -ENOPROTOOPT;
>  		break;
> @@ -944,6 +961,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>  
>  	union {
>  		int val;
> +		unsigned long ulval;

Why an unsigned long ? 32bit are more than enough.

>  		struct linger ling;
>  		struct timeval tm;
>  	} v;
> @@ -1170,6 +1188,13 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>  		v.val = sock_flag(sk, SOCK_SELECT_ERR_QUEUE);
>  		break;
>  
> +#ifdef CONFIG_NET_LL_RX_POLL
> +	case SO_LL:
> +		len = sizeof(v.ulval);
> +		v.ulval = sk->sk_ll_usec;
> +		break;
> +#endif
> +




------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 14:45   ` Eric Dumazet
@ 2013-06-11 15:37     ` Eliezer Tamir
  2013-06-11 15:58       ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-11 15:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Willem de Bruijn, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Andi Kleen, Ben Hutchings, Eilon Greenstien, David Miller

On 11/06/2013 17:45, Eric Dumazet wrote:
> On Tue, 2013-06-11 at 17:24 +0300, Eliezer Tamir wrote:
>> adds a socket option for low latency polling.
>> This allows overriding the global sysctl value with a per-socket one.
>>
>> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> ---
>>
>>
>> -static inline cycles_t ll_end_time(void)
>> +static inline cycles_t ll_end_time(struct sock *sk)
>>   {
>> -	return TSC_MHZ * ACCESS_ONCE(sysctl_net_ll_poll) + get_cycles();
>> +	return TSC_MHZ * ACCESS_ONCE(sk->sk_ll_usec) + get_cycles();
>>   }
>
> Hmm, sk_ll_usec is an unsigned int.

We changed it to an unsigned long in v7, I guess that was gratuitous.
Re-reading your comments on v6 2/5 I realize a cast would have sufficed.
Should I send a patch to revert it to an unsigned int?

Thanks,
Eliezer


------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 15:37     ` Eliezer Tamir
@ 2013-06-11 15:58       ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2013-06-11 15:58 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: Willem de Bruijn, Don, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Andi Kleen, Ben Hutchings, Eilon Greenstien, David Miller

On Tue, 2013-06-11 at 18:37 +0300, Eliezer Tamir wrote:
> On 11/06/2013 17:45, Eric Dumazet wrote:
> > On Tue, 2013-06-11 at 17:24 +0300, Eliezer Tamir wrote:
> >> adds a socket option for low latency polling.
> >> This allows overriding the global sysctl value with a per-socket one.
> >>
> >> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> >> ---
> >>
> >>
> >> -static inline cycles_t ll_end_time(void)
> >> +static inline cycles_t ll_end_time(struct sock *sk)
> >>   {
> >> -	return TSC_MHZ * ACCESS_ONCE(sysctl_net_ll_poll) + get_cycles();
> >> +	return TSC_MHZ * ACCESS_ONCE(sk->sk_ll_usec) + get_cycles();
> >>   }
> >
> > Hmm, sk_ll_usec is an unsigned int.
> 
> We changed it to an unsigned long in v7, I guess that was gratuitous.
> Re-reading your comments on v6 2/5 I realize a cast would have sufficed.
> Should I send a patch to revert it to an unsigned int?

One sysctl as unsigned long was not a big deal so I was ok with your
change ;)

unsigned int for sk_ll_usec is enough, but using a 32x32->64bit multiply
is probably safer.





------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 14:24 ` [PATCH net-next 2/2] net:add socket option for low latency polling Eliezer Tamir
  2013-06-11 14:45   ` Eric Dumazet
@ 2013-06-11 20:24   ` Ben Hutchings
  2013-06-12  6:39     ` Eliezer Tamir
  1 sibling, 1 reply; 16+ messages in thread
From: Ben Hutchings @ 2013-06-11 20:24 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: Willem de Bruijn, Don, Or Gerlitz, e1000-devel, netdev, HPA,
	Amir Vadai, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Andi Kleen, Eric Dumazet, Eilon Greenstien, David Miller

On Tue, 2013-06-11 at 17:24 +0300, Eliezer Tamir wrote:
> adds a socket option for low latency polling.
> This allows overriding the global sysctl value with a per-socket one.
[...]
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -913,6 +913,23 @@ set_rcvbuf:
>  		sock_valbool_flag(sk, SOCK_SELECT_ERR_QUEUE, valbool);
>  		break;
>  
> +#ifdef CONFIG_NET_LL_RX_POLL
> +	case SO_LL:
> +		if (!capable(CAP_NET_ADMIN))
> +			ret = -EACCES;
[...]

Failed capability checks normally result in EPERM whereas EACCES usually
results from a file permission/ACL/label check.

Perhaps unprivileged users should be allowed to set a value as long as
it's less than or equal to the global value?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] net:add socket option for low latency polling
  2013-06-11 20:24   ` Ben Hutchings
@ 2013-06-12  6:39     ` Eliezer Tamir
  0 siblings, 0 replies; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-12  6:39 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, linux-kernel, netdev, Jesse Brandeburg,
	Don Skidmore, e1000-devel, Willem de Bruijn, Eric Dumazet,
	Andi Kleen, HPA, Eilon Greenstien, Or Gerlitz, Amir Vadai,
	Eliezer Tamir

On 11/06/2013 23:24, Ben Hutchings wrote:
> On Tue, 2013-06-11 at 17:24 +0300, Eliezer Tamir wrote:
>> adds a socket option for low latency polling.
>> This allows overriding the global sysctl value with a per-socket one.
> [...]
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -913,6 +913,23 @@ set_rcvbuf:
>>   		sock_valbool_flag(sk, SOCK_SELECT_ERR_QUEUE, valbool);
>>   		break;
>>
>> +#ifdef CONFIG_NET_LL_RX_POLL
>> +	case SO_LL:
>> +		if (!capable(CAP_NET_ADMIN))
>> +			ret = -EACCES;
> [...]
>
> Failed capability checks normally result in EPERM whereas EACCES usually
> results from a file permission/ACL/label check.

OK

> Perhaps unprivileged users should be allowed to set a value as long as
> it's less than or equal to the global value?

I thought of allowing to disable even if you are not privileged, but 
this sounds better.

Thanks!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] sockperf: add SO_LL socketop support
  2013-06-11 14:26 ` [PATCH] sockperf: add SO_LL socketop support Eliezer Tamir
@ 2013-06-12  8:36   ` Amir Vadai
  2013-06-12  8:45     ` Eliezer Tamir
  0 siblings, 1 reply; 16+ messages in thread
From: Amir Vadai @ 2013-06-12  8:36 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: Willem de Bruijn, Don, Or Kehati, Or Gerlitz, e1000-devel,
	netdev, HPA, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Avner Ben Hanoch, Andi Kleen, Alex Rosenbaum, Ben Hutchings,
	Eric Dumazet, Eilon Greenstien, David Miller

On 11/06/2013 17:26, Eliezer Tamir wrote:
> Add lls socket option support to sockperf.
> Right now we always get the option before set to show the option is
> working properly. We should probably remove that in an official release.
> use --lls (value in usecs) to override global setting.
> ---
> 
>  src/Defs.h       |    3 +++
>  src/SockPerf.cpp |   54
> +++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 56 insertions(+), 1 deletions(-)
> 
> diff --git a/src/Defs.h b/src/Defs.h
> index e38e3a4..87b45a0 100644
> --- a/src/Defs.h
> +++ b/src/Defs.h
> @@ -161,6 +161,7 @@ enum {
>      OPT_OUTPUT_PRECISION,           //35
>      OPT_CLIENTPORT,                 //36
>      OPT_CLIENTIP,                   //37
> +    OPT_LLS,            //38
>  };
> 
>  #define MODULE_NAME            "sockperf"
> @@ -527,6 +528,8 @@ struct user_params_t {
>  //    bool stream_mode; - use b_stream instead
>      int mthread_server;
>      struct timeval* select_timeout;
> +    unsigned long lls_usecs;
> +    bool lls_is_set;
>      int sock_buff_size;
>      int threads_num;
>      char threads_affinity[MAX_ARGV_SIZE];
> diff --git a/src/SockPerf.cpp b/src/SockPerf.cpp
> index 41daf95..d76320f 100644
> --- a/src/SockPerf.cpp
> +++ b/src/SockPerf.cpp
> @@ -207,6 +207,10 @@ static const AOPT_DESC  common_opt_desc[] =
>          "Limit the lifetime of the message (default 2)."
>      },
>      {
> +        OPT_LLS, AOPT_ARG, aopt_set_literal( 0 ), aopt_set_string(
> "lls" ),
> +        "Turn on LLS via socket option (value = us to poll)."
> +    },
> +    {
>          OPT_BUFFER_SIZE, AOPT_ARG, aopt_set_literal( 0 ),
> aopt_set_string( "buffer-size" ),
>          "Set total socket receive/send buffer <size> in bytes (system
> defined by default)."
>      },
> @@ -292,7 +296,7 @@ static int proc_mode_help( int id, int argc, const
> char **argv )
>      int   i = 0;
> 
>      printf(MODULE_NAME " is a tool for testing network latency and
> throughput.\n");
> -    printf("version %s\n", STR(VERSION));
> +    printf("version %s-lls\n", STR(VERSION));
>      printf("\n");
>      printf("Usage: " MODULE_NAME " <subcommand> [options] [args]\n");
>      printf("Type: \'" MODULE_NAME " <subcommand> --help\' for help on a
> specific subcommand.\n");
> @@ -1789,6 +1793,26 @@ static int parse_common_opt( const AOPT_OBJECT
> *common_obj )
>              s_user_params.is_nonblocked_send = true;
>          }
> 
> +        if ( !rc && aopt_check(common_obj, OPT_LLS) ) {
> +            const char* optarg = aopt_value(common_obj, OPT_LLS);
> +            if (optarg) {
> +                errno = 0;
> +                int value = strtoul(optarg, NULL, 0);
> +                if (errno != 0 || value < 0) {
> +                    log_msg("'-%d' Invalid LLS value: %s", OPT_LLS,
> optarg);
> +                    rc = SOCKPERF_ERR_BAD_ARGUMENT;
> +                }
> +                else {
> +                    s_user_params.lls_usecs = value;
> +                    s_user_params.lls_is_set = true;
> +                }
> +            }
> +            else {
> +                log_msg("'-%d' Invalid value", OPT_LLS);
> +                rc = SOCKPERF_ERR_BAD_ARGUMENT;
> +            }
> +        }
> +
>          if ( !rc && aopt_check(common_obj, OPT_RECV_LOOPING) ) {
> 
>              const char* optarg = aopt_value(common_obj, OPT_RECV_LOOPING);
> @@ -2296,6 +2320,29 @@ int sock_set_reuseaddr(int fd)
>      return rc;
>  }
> 
> +#ifndef SO_LL
> +#define SO_LL 46
> +#endif
> +int sock_set_lls(int fd)
> +{
> +    int rc = SOCKPERF_ERR_NONE;
> +    unsigned long lls;
> +    int size = sizeof(lls);
> +
> +    if(getsockopt(fd, SOL_SOCKET, SO_LL, &lls, (socklen_t *)&size) < 0){
> +        log_err("getsockopt(SO_LL) failed");
> +        return SOCKPERF_ERR_SOCKET;
> +    } else
> +        log_msg("socket option SO_LL default was %lu, changing to %lu",
> lls, s_user_params.lls_usecs);
> +
> +    if (setsockopt(fd, SOL_SOCKET, SO_LL, &(s_user_params.lls_usecs),
> sizeof(s_user_params.lls_usecs)) < 0) {
> +        log_err("setsockopt(SO_LL) failed");
> +        rc = SOCKPERF_ERR_SOCKET;
> +    }
> +    return rc;
> +}
> +
> +
>  int sock_set_snd_rcv_bufs(int fd)
>  {
>      /*
> @@ -2460,6 +2507,11 @@ int prepare_socket(int fd, struct fds_data *p_data)
>      }
> 
>      if (!rc &&
> +            (s_user_params.lls_is_set == true))
> +    {
> +        rc = sock_set_lls(fd);
> +    }
> +    if (!rc &&
>              (s_user_params.sock_buff_size > 0))
>      {
>          rc = sock_set_snd_rcv_bufs(fd);
> 
> 
> 
> 
> 

Eliezer Hi,

Added sockperf maintainers to CC list.

To upload changes to sockperf code, please open a ticket on the sockperf
google code and then submit the patch
(https://code.google.com/p/sockperf/issues/).

You will need to wrap it in " #ifndef w-i-n-d-o-w-s" but sockperf
maintainers can help you do that on the sockperf pages.

You should look at the recently added TOS which is very similar
(https://code.google.com/p/sockperf/issues/detail?id=44).

Amir

------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] sockperf: add SO_LL socketop support
  2013-06-12  8:36   ` Amir Vadai
@ 2013-06-12  8:45     ` Eliezer Tamir
  0 siblings, 0 replies; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-12  8:45 UTC (permalink / raw)
  To: Amir Vadai
  Cc: Willem de Bruijn, Or Kehati, Or Gerlitz, e1000-devel, netdev,
	HPA, Jesse Brandeburg, Eliezer Tamir, linux-kernel,
	Avner Ben Hanoch, Andi Kleen, Alex Rosenbaum, Ben Hutchings,
	Eric Dumazet, Eilon Greenstien, David Miller

On 12/06/2013 11:36, Amir Vadai wrote:
> On 11/06/2013 17:26, Eliezer Tamir wrote:
>> Add lls socket option support to sockperf.
>> Right now we always get the option before set to show the option is
>> working properly. We should probably remove that in an official release.
>> use --lls (value in usecs) to override global setting.
>> ---

> Eliezer Hi,
>
> Added sockperf maintainers to CC list.

Thank you.

Please note that I plan on revising this patch, so don't add it just 
yet. (I will CC the maintainers on further versions.)

-Eliezer

------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-11 14:24 ` [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue Eliezer Tamir
@ 2013-06-12 22:12   ` David Miller
  2013-06-13  2:01     ` Stephen Hemminger
  0 siblings, 1 reply; 16+ messages in thread
From: David Miller @ 2013-06-12 22:12 UTC (permalink / raw)
  To: eliezer.tamir
  Cc: linux-kernel, netdev, jesse.brandeburg, donald.c.skidmore,
	e1000-devel, willemb, erdnetdev, bhutchings, andi, hpa, eilong,
	or.gerlitz, amirv, eliezer

From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Tue, 11 Jun 2013 17:24:28 +0300

>  	depends on X86_TSC

Wait a second, I didn't notice this before.  There needs to be a better
way to test for the accuracy you need, or if the issue is lack of a proper
API for cycle counter reading, fix that rather than add ugly arch
specific dependencies to generic networking code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-12 22:12   ` David Miller
@ 2013-06-13  2:01     ` Stephen Hemminger
  2013-06-13  2:13       ` Eliezer Tamir
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2013-06-13  2:01 UTC (permalink / raw)
  To: David Miller
  Cc: eliezer.tamir, linux-kernel, netdev, jesse.brandeburg,
	donald.c.skidmore, e1000-devel, willemb, erdnetdev, bhutchings,
	andi, hpa, eilong, or.gerlitz, amirv, eliezer

On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Date: Tue, 11 Jun 2013 17:24:28 +0300
> 
> >  	depends on X86_TSC
> 
> Wait a second, I didn't notice this before.  There needs to be a better
> way to test for the accuracy you need, or if the issue is lack of a proper
> API for cycle counter reading, fix that rather than add ugly arch
> specific dependencies to generic networking code.

This should be sched_clock(), rather than direct TSC access.
Also any code using TSC or sched_clock has to be carefully audited to deal with
clocks running at different rates on different CPU's. Basically value is only
meaning full on same CPU.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-13  2:01     ` Stephen Hemminger
@ 2013-06-13  2:13       ` Eliezer Tamir
  2013-06-13  8:00         ` Daniel Borkmann
  0 siblings, 1 reply; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-13  2:13 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: willemb, or.gerlitz, e1000-devel, netdev, hpa, amirv,
	jesse.brandeburg, eliezer, linux-kernel, andi, bhutchings,
	erdnetdev, eilong, David Miller

On 13/06/2013 05:01, Stephen Hemminger wrote:
> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>
>>>   	depends on X86_TSC
>>
>> Wait a second, I didn't notice this before.  There needs to be a better
>> way to test for the accuracy you need, or if the issue is lack of a proper
>> API for cycle counter reading, fix that rather than add ugly arch
>> specific dependencies to generic networking code.
>
> This should be sched_clock(), rather than direct TSC access.
> Also any code using TSC or sched_clock has to be carefully audited to deal with
> clocks running at different rates on different CPU's. Basically value is only
> meaning full on same CPU.

OK,

If we covert to sched_clock(), would adding a define such as 
HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high 
precision clock and a 64 bit cycles_t be a good solution?

(if not any other suggestion?)



------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-13  2:13       ` Eliezer Tamir
@ 2013-06-13  8:00         ` Daniel Borkmann
  2013-06-13 10:09           ` Eliezer Tamir
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Borkmann @ 2013-06-13  8:00 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: willemb, or.gerlitz, e1000-devel, netdev, hpa, amirv,
	jesse.brandeburg, eliezer, linux-kernel, andi, bhutchings,
	erdnetdev, eilong, David Miller

On 06/13/2013 04:13 AM, Eliezer Tamir wrote:
> On 13/06/2013 05:01, Stephen Hemminger wrote:
>> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
>> David Miller <davem@davemloft.net> wrote:
>>
>>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>>
>>>>       depends on X86_TSC
>>>
>>> Wait a second, I didn't notice this before.  There needs to be a better
>>> way to test for the accuracy you need, or if the issue is lack of a proper
>>> API for cycle counter reading, fix that rather than add ugly arch
>>> specific dependencies to generic networking code.
>>
>> This should be sched_clock(), rather than direct TSC access.
>> Also any code using TSC or sched_clock has to be carefully audited to deal with
>> clocks running at different rates on different CPU's. Basically value is only
>> meaning full on same CPU.
>
> OK,
>
> If we covert to sched_clock(), would adding a define such as HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high precision clock and a 64 bit cycles_t be a good solution?
>
> (if not any other suggestion?)

Hm, probably cpu_clock() and similar might be better, since they use
sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
(meaning when sched_clock() provides synchronized highres time source from
the architecture), and, quoting ....

  Otherwise it tries to create a semi stable clock from a mixture of other
  clocks, including:

   - GTOD (clock monotomic)
   - sched_clock()
   - explicit idle events

But yeah, it needs to be evaluated regarding the drift between CPUs in
general.

Then, eventually, you could get rid of the entire NET_LL_RX_POLL config
option plus related ifdefs in the code and have it built-in in general?

------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue
  2013-06-13  8:00         ` Daniel Borkmann
@ 2013-06-13 10:09           ` Eliezer Tamir
  0 siblings, 0 replies; 16+ messages in thread
From: Eliezer Tamir @ 2013-06-13 10:09 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Stephen Hemminger, David Miller, linux-kernel, netdev,
	jesse.brandeburg, donald.c.skidmore, e1000-devel, willemb,
	erdnetdev, bhutchings, andi, hpa, eilong, or.gerlitz, amirv,
	eliezer

On 13/06/2013 11:00, Daniel Borkmann wrote:
> On 06/13/2013 04:13 AM, Eliezer Tamir wrote:
>> On 13/06/2013 05:01, Stephen Hemminger wrote:
>>> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
>>> David Miller <davem@davemloft.net> wrote:
>>>
>>>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>>>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>>>
>>>>>       depends on X86_TSC
>>>>
>>>> Wait a second, I didn't notice this before.  There needs to be a better
>>>> way to test for the accuracy you need, or if the issue is lack of a
>>>> proper
>>>> API for cycle counter reading, fix that rather than add ugly arch
>>>> specific dependencies to generic networking code.
>>>
>>> This should be sched_clock(), rather than direct TSC access.
>>> Also any code using TSC or sched_clock has to be carefully audited to
>>> deal with
>>> clocks running at different rates on different CPU's. Basically value
>>> is only
>>> meaning full on same CPU.
>>
>> OK,
>>
>> If we covert to sched_clock(), would adding a define such as
>> HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high
>> precision clock and a 64 bit cycles_t be a good solution?
>>
>> (if not any other suggestion?)
>
> Hm, probably cpu_clock() and similar might be better, since they use
> sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> (meaning when sched_clock() provides synchronized highres time source from
> the architecture), and, quoting ....

I don't think we want the overhead of disabling IRQs
that cpu_clock() adds.

We don't really care about precise measurement.
All we need is a sane cut-off for busy polling.
It's no big deal if on a rare occasion we poll less,
or even poll twice the time.
As long as it's rare it should not matter.

Maybe the answer is not to use cycle counting at all?
Maybe just wait the full sk_rcvtimo?
(resched() when proper, bail out if signal pending, etc.)

This could only be a safe/sane thing to do after we add
a socket option, because this can't be a global setting.

This would of course turn the option into a flag.
If it's set (and !nonblock), busy wait up to sk_recvtimo.

Opinions?

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-06-13 10:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-11 14:24 [PATCH net-next 0/2] net: low latency sockets follow ups Eliezer Tamir
2013-06-11 14:24 ` [PATCH net-next 1/2] net: remove NET_LL_RX_POLL config menue Eliezer Tamir
2013-06-12 22:12   ` David Miller
2013-06-13  2:01     ` Stephen Hemminger
2013-06-13  2:13       ` Eliezer Tamir
2013-06-13  8:00         ` Daniel Borkmann
2013-06-13 10:09           ` Eliezer Tamir
2013-06-11 14:24 ` [PATCH net-next 2/2] net:add socket option for low latency polling Eliezer Tamir
2013-06-11 14:45   ` Eric Dumazet
2013-06-11 15:37     ` Eliezer Tamir
2013-06-11 15:58       ` Eric Dumazet
2013-06-11 20:24   ` Ben Hutchings
2013-06-12  6:39     ` Eliezer Tamir
2013-06-11 14:26 ` [PATCH] sockperf: add SO_LL socketop support Eliezer Tamir
2013-06-12  8:36   ` Amir Vadai
2013-06-12  8:45     ` Eliezer Tamir

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).