* [PATCH net-next 0/2] net: lls cleanup patches
@ 2013-06-28 12:59 Eliezer Tamir
  2013-06-28 12:59 ` [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning Eliezer Tamir
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Eliezer Tamir @ 2013-06-28 12:59 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, Willem de Bruijn, Eric Dumazet, Andi Kleen,
	HPA, Cody P Schafer, Eliezer Tamir

David,

Here are two cleanup patches.

1. fix warning from debug_smp_processor_id().
- reported by Cody P Schafer.

2. avoid calling sched_clock() in select and poll when lls is disabled.
- reported by Andi Kleen.

Thanks to everyone for their comments.

-Eliezer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning
  2013-06-28 12:59 [PATCH net-next 0/2] net: lls cleanup patches Eliezer Tamir
@ 2013-06-28 12:59 ` Eliezer Tamir
  2013-06-28 16:51   ` Using sched_clock() for polling time limit Ben Hutchings
  2013-06-28 12:59 ` [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off Eliezer Tamir
  2013-07-01 21:08 ` [PATCH net-next 0/2] net: lls cleanup patches David Miller
  2 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-06-28 12:59 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, Willem de Bruijn, Eric Dumazet, Andi Kleen,
	HPA, Cody P Schafer, Eliezer Tamir

Our use of sched_clock is OK because we don't mind the side effects
of calling it and occasionally waking up on a different CPU.

When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
sched_clock() so we don't trigger a debug_smp_processor_id() warning.

Reported-by: Cody P Schafer <devel-lists@codyps.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 include/net/ll_poll.h |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index 5bf2b3a..6d45e6f 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -37,20 +37,40 @@ extern unsigned int sysctl_net_ll_poll __read_mostly;
 #define LL_FLUSH_FAILED		-1
 #define LL_FLUSH_BUSY		-2
 
-/* we can use sched_clock() because we don't care much about precision
+/* a wrapper to make debug_smp_processor_id() happy
+ * we can use sched_clock() because we don't care much about precision
  * we only care that the average is bounded
- * we don't mind a ~2.5% imprecision so <<10 instead of *1000
+ */
+#ifdef CONFIG_DEBUG_PREEMPT
+static inline u64 ll_sched_clock(void)
+{
+	u64 rc;
+
+	preempt_disable_notrace();
+	rc = sched_clock();
+	preempt_enable_no_resched_notrace();
+
+	return rc;
+}
+#else /* CONFIG_DEBUG_PREEMPT */
+static inline u64 ll_sched_clock(void)
+{
+	return sched_clock();
+}
+#endif /* CONFIG_DEBUG_PREEMPT */
+
+/* we don't mind a ~2.5% imprecision so <<10 instead of *1000
  * sk->sk_ll_usec is a u_int so this can't overflow
  */
 static inline u64 ll_sk_end_time(struct sock *sk)
 {
-	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + sched_clock();
+	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + ll_sched_clock();
 }
 
 /* in poll/select we use the global sysctl_net_ll_poll value */
 static inline u64 ll_end_time(void)
 {
-	return ((u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10) + sched_clock();
+	return ((u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10) + ll_sched_clock();
 }
 
 static inline bool sk_valid_ll(struct sock *sk)
@@ -61,7 +81,7 @@ static inline bool sk_valid_ll(struct sock *sk)
 
 static inline bool can_poll_ll(u64 end_time)
 {
-	return !time_after64(sched_clock(), end_time);
+	return !time_after64(ll_sched_clock(), end_time);
 }
 
 /* when used in sock_poll() nonblock is known at compile time to be true


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off
  2013-06-28 12:59 [PATCH net-next 0/2] net: lls cleanup patches Eliezer Tamir
  2013-06-28 12:59 ` [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning Eliezer Tamir
@ 2013-06-28 12:59 ` Eliezer Tamir
  2013-06-28 14:38   ` Andi Kleen
  2013-07-01 21:08 ` [PATCH net-next 0/2] net: lls cleanup patches David Miller
  2 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-06-28 12:59 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, Willem de Bruijn, Eric Dumazet, Andi Kleen,
	HPA, Cody P Schafer, Eliezer Tamir

Change Low Latency Sockets code for select and poll so that
when LLS is disabled sched_clock() is never called.

Also, avoid sending POLL_LL to sockets if disabled.

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 fs/select.c           |   11 +++++++----
 include/net/ll_poll.h |   17 +++++++++++++++--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 79b876e..3654075 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -402,7 +402,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
 	poll_table *wait;
 	int retval, i, timed_out = 0;
 	unsigned long slack = 0;
-	unsigned int ll_flag = POLL_LL;
+	unsigned int ll_flag = ll_get_flag();
 	u64 ll_time = ll_end_time();
 
 	rcu_read_lock();
@@ -497,7 +497,8 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
 			break;
 		}
 
-		if (can_ll && can_poll_ll(ll_time))
+		/* only if on, have sockets with POLL_LL and not out of time */
+		if (ll_flag && can_ll && can_poll_ll(ll_time))
 			continue;
 
 		/*
@@ -768,7 +769,7 @@ static int do_poll(unsigned int nfds,  struct poll_list *list,
 	ktime_t expire, *to = NULL;
 	int timed_out = 0, count = 0;
 	unsigned long slack = 0;
-	unsigned int ll_flag = POLL_LL;
+	unsigned int ll_flag = ll_get_flag();
 	u64 ll_time = ll_end_time();
 
 	/* Optimise the no-wait case */
@@ -817,8 +818,10 @@ static int do_poll(unsigned int nfds,  struct poll_list *list,
 		if (count || timed_out)
 			break;
 
-		if (can_ll && can_poll_ll(ll_time))
+		/* only if on, have sockets with POLL_LL and not out of time */
+		if (ll_flag && can_ll && can_poll_ll(ll_time))
 			continue;
+
 		/*
 		 * If this is the first loop and we have a timeout
 		 * given, then we convert to ktime_t and set the to
diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index 6d45e6f..6c06f7c 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -37,6 +37,11 @@ extern unsigned int sysctl_net_ll_poll __read_mostly;
 #define LL_FLUSH_FAILED		-1
 #define LL_FLUSH_BUSY		-2
 
+static inline unsigned int ll_get_flag(void)
+{
+	return sysctl_net_ll_poll ? POLL_LL : 0;
+}
+
 /* a wrapper to make debug_smp_processor_id() happy
  * we can use sched_clock() because we don't care much about precision
  * we only care that the average is bounded
@@ -67,10 +72,14 @@ static inline u64 ll_sk_end_time(struct sock *sk)
 	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + ll_sched_clock();
 }
 
-/* in poll/select we use the global sysctl_net_ll_poll value */
+/* in poll/select we use the global sysctl_net_ll_poll value
+ * only call sched_clock() if enabled
+ */
 static inline u64 ll_end_time(void)
 {
-	return ((u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10) + ll_sched_clock();
+	u64 end_time = ACCESS_ONCE(sysctl_net_ll_poll);
+
+	return end_time ? (end_time << 10) + ll_sched_clock() : 0;
 }
 
 static inline bool sk_valid_ll(struct sock *sk)
@@ -141,6 +150,10 @@ static inline void sk_mark_ll(struct sock *sk, struct sk_buff *skb)
 }
 
 #else /* CONFIG_NET_LL_RX_POLL */
+static inline unsigned long ll_get_flag(void)
+{
+	return 0;
+}
 
 static inline u64 sk_ll_end_time(struct sock *sk)
 {


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off
  2013-06-28 12:59 ` [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off Eliezer Tamir
@ 2013-06-28 14:38   ` Andi Kleen
  2013-06-28 14:54     ` Eliezer Tamir
  0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2013-06-28 14:38 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: David Miller, linux-kernel, netdev, Willem de Bruijn,
	Eric Dumazet, Andi Kleen, HPA, Cody P Schafer, Eliezer Tamir

> diff --git a/fs/select.c b/fs/select.c
> index 79b876e..3654075 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -402,7 +402,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
>  	poll_table *wait;
>  	int retval, i, timed_out = 0;
>  	unsigned long slack = 0;
> -	unsigned int ll_flag = POLL_LL;
> +	unsigned int ll_flag = ll_get_flag();

Is that a global flag? That's still the wrong level. It should
look at something in the file descriptor (preferably without
fetching any new cache lines)

-Andi

>  
> +static inline unsigned int ll_get_flag(void)
> +{
> +	return sysctl_net_ll_poll ? POLL_LL : 0;
> +}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off
  2013-06-28 14:38   ` Andi Kleen
@ 2013-06-28 14:54     ` Eliezer Tamir
  0 siblings, 0 replies; 17+ messages in thread
From: Eliezer Tamir @ 2013-06-28 14:54 UTC (permalink / raw)
  Cc: linux-kernel, netdev, Eliezer Tamir

On 28/06/2013 17:38, Andi Kleen wrote:
>> diff --git a/fs/select.c b/fs/select.c
>> index 79b876e..3654075 100644
>> --- a/fs/select.c
>> +++ b/fs/select.c
>> @@ -402,7 +402,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
>>   	poll_table *wait;
>>   	int retval, i, timed_out = 0;
>>   	unsigned long slack = 0;
>> -	unsigned int ll_flag = POLL_LL;
>> +	unsigned int ll_flag = ll_get_flag();
>
> Is that a global flag? That's still the wrong level. It should
> look at something in the file descriptor (preferably without
> fetching any new cache lines)
>

There is a global flag that decides if we even try to find supported
files (sockets).

If it is on, we look to see if anyone will return POLL_LL.
At the individual socket level, you need both the socket option to
be on and the queue data to be available for the socket to return
POLL_LL.

I wanted to find a way to control this at the time the user calls
poll(), but I was not able to find any simple way of doing it.
I'm open to suggestions.
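
Roughly, the two levels combine like this in do_select()/do_poll()
(a simplified sketch using the names from patch 2):

	unsigned int ll_flag = ll_get_flag();	/* POLL_LL only if the sysctl is on */
	u64 ll_time = ll_end_time();
	...
	/* can_ll is set when some socket's ->poll() returned POLL_LL,
	 * which needs both the socket option and the queue data
	 */
	if (ll_flag && can_ll && can_poll_ll(ll_time))
		continue;	/* busy-poll another round */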

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Using sched_clock() for polling time limit
  2013-06-28 12:59 ` [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning Eliezer Tamir
@ 2013-06-28 16:51   ` Ben Hutchings
  2013-06-29 18:50     ` Eliezer Tamir
  0 siblings, 1 reply; 17+ messages in thread
From: Ben Hutchings @ 2013-06-28 16:51 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: David Miller, linux-kernel, netdev, Willem de Bruijn,
	Eric Dumazet, Andi Kleen, HPA, Cody P Schafer, Eliezer Tamir

On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
> Our use of sched_clock is OK because we don't mind the side effects
> of calling it and occasionally waking up on a different CPU.

Sure about that?  Jitter matters too.

> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
[...]

I think this is papering over a problem.  The warning is there for a
good reason.

Would functions like these make it possible to use sched_clock() safely
for polling?  (I didn't spend much time thinking about the names.)

struct sched_timestamp {
	int cpu;
	unsigned long long clock;
};

static inline struct sched_timestamp sched_get_timestamp(void)
{
	struct sched_timestamp ret;

	preempt_disable_notrace();
	ret.cpu = smp_processor_id();
	ret.clock = sched_clock();
	preempt_enable_no_resched_notrace();

	return ret;
}

static inline bool sched_timeout_or_moved(struct sched_timestamp start,
					  unsigned long long timeout)
{
	bool ret;

	preempt_disable_notrace();
	ret = start.cpu != smp_processor_id() ||
		(sched_clock() - start.clock) > timeout;
	preempt_enable_no_resched_notrace();

	return ret;
}
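
A caller would then do something like this (sketch only, timeout_ns
being whatever limit applies):

	struct sched_timestamp start = sched_get_timestamp();

	do {
		/* poll the device queue */
	} while (!sched_timeout_or_moved(start, timeout_ns));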

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Using sched_clock() for polling time limit
  2013-06-28 16:51   ` Using sched_clock() for polling time limit Ben Hutchings
@ 2013-06-29 18:50     ` Eliezer Tamir
  2013-07-01 19:48       ` Ben Hutchings
  0 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-06-29 18:50 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, linux-kernel, netdev, Willem de Bruijn,
	Eric Dumazet, Andi Kleen, HPA, Cody P Schafer, Eliezer Tamir

On 28/06/2013 19:51, Ben Hutchings wrote:
> On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
>> Our use of sched_clock is OK because we don't mind the side effects
>> of calling it and occasionally waking up on a different CPU.
>
> Sure about that?  Jitter matters too.
>
Pretty sure. This is a limit on how long we poll; it's there for
fairness to the rest of the system, not for the performance of this
code.

What matters is that on average you are bounded by something close to
what the user specified. If once in a while you run less because of
clock jitter, or even twice the specified time, it's no big deal.

So I don't see how jitter would matter.

And if your workload is jitter sensitive, you should probably be
pinning tasks to CPUs anyway.


>> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
>> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
> [...]
>
> I think this is papering over a problem.  The warning is there for a
> good reason.

I think we understand the warning, and that we are OK with the effects.

Looking at how other users in the kernel solved this issue, it seems
like this is what they do; for example
trace/ring_buffer.c:ring_buffer_time_stamp().

Also, kvm_clock_read() and xen_clocksource_read() seem to disable
preempt just to silence this warning.

If they really cared about reading the value on one CPU without
being scheduled onto another, they should have disabled interrupts.
Or am I missing something?
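
(For comparison, really pinning the read to one CPU would take
something like

	unsigned long flags;
	u64 t;

	local_irq_save(flags);
	t = sched_clock();
	local_irq_restore(flags);

and even then the task can migrate right after local_irq_restore(),
so the value may already be stale when it is used.)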

> Would functions like these make it possible to use sched_clock() safely
> for polling?  (I didn't spend much time thinking about the names.)
>
> struct sched_timestamp {
> 	int cpu;
> 	unsigned long long clock;
> };
>
> static inline struct sched_timestamp sched_get_timestamp(void)
> {
> 	struct sched_timestamp ret;
>
> 	preempt_disable_notrace();
> 	ret.cpu = smp_processor_id();
> 	ret.clock = sched_clock();
> 	preempt_enable_no_resched_notrace();
>
> 	return ret;
> }

I don't understand; preempt_disable() only prevents preemption
from taking the CPU away from you, but you could still lose it for
other reasons.
You would really need to disable interrupts in this region to be
sure that it all executed on the same CPU.

-Eliezer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Using sched_clock() for polling time limit
  2013-06-29 18:50     ` Eliezer Tamir
@ 2013-07-01 19:48       ` Ben Hutchings
  0 siblings, 0 replies; 17+ messages in thread
From: Ben Hutchings @ 2013-07-01 19:48 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: David Miller, linux-kernel, netdev, Willem de Bruijn,
	Eric Dumazet, Andi Kleen, HPA, Cody P Schafer, Eliezer Tamir

On Sat, 2013-06-29 at 21:50 +0300, Eliezer Tamir wrote:
> On 28/06/2013 19:51, Ben Hutchings wrote:
> > On Fri, 2013-06-28 at 15:59 +0300, Eliezer Tamir wrote:
> >> Our use of sched_clock is OK because we don't mind the side effects
> >> of calling it and occasionally waking up on a different CPU.
> >
> > Sure about that?  Jitter matters too.
> >
> Pretty sure. This is a limit on how long we poll; it's there for
> fairness to the rest of the system, not for the performance of this
> code.
> 
> What matters is that on average you are bounded by something close to
> what the user specified. If once in a while you run less because of
> clock jitter, or even twice the specified time, it's no big deal.

So what is the maximum time difference in sched_clock() values between
CPUs in the same system?

> So I don't see how jitter would matter.
> 
> And if your workload is jitter sensitive, you should probably be
> pinning tasks to CPUs anyway.

Yes, they should be pinned.  But I think a single task that isn't pinned
could poll for significantly longer than intended and the effect
wouldn't be limited to that one task.

> >> When CONFIG_DEBUG_PREEMPT is on, disable preempt before calling
> >> sched_clock() so we don't trigger a debug_smp_processor_id() warning.
> > [...]
> >
> > I think this is papering over a problem.  The warning is there for a
> > good reason.
> 
> I think we understand the warning, and that we are OK with the effects.
> 
> Looking at how other users in the kernel solved this issue, it seems
> like this is what they do; for example
> trace/ring_buffer.c:ring_buffer_time_stamp().

Well, this seems to be debug information, not actually used for timing.

> Also, kvm_clock_read() and xen_clocksource_read() seem to disable
> preempt just to silence this warning.

I can't see what you're referring to.

> If they really cared about reading the value on one CPU without
> being scheduled onto another, they should have disabled interrupts.
> Or am I missing something?
> 
> > Would functions like these make it possible to use sched_clock() safely
> > for polling?  (I didn't spend much time thinking about the names.)
[...]
> I don't understand; preempt_disable() only prevents preemption
> from taking the CPU away from you, but you could still lose it for
> other reasons.
> You would really need to disable interrupts in this region to be
> sure that it all executed on the same CPU.

Hmm, yes you're right.  Anyway, what I was trying to suggest was that
you would record the current CPU along with the initial timestamp, and
then abort low-latency polling if either the task is moved onto a
different CPU or the time limit was reached.  Checking the CPU first
means that the sched_clock() comparison is valid.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next 0/2] net: lls cleanup patches
  2013-06-28 12:59 [PATCH net-next 0/2] net: lls cleanup patches Eliezer Tamir
  2013-06-28 12:59 ` [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning Eliezer Tamir
  2013-06-28 12:59 ` [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off Eliezer Tamir
@ 2013-07-01 21:08 ` David Miller
  2013-07-02  8:38   ` Eliezer Tamir
  2 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2013-07-01 21:08 UTC (permalink / raw)
  To: eliezer.tamir
  Cc: linux-kernel, netdev, willemb, erdnetdev, andi, hpa, devel-lists,
	eliezer

From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Fri, 28 Jun 2013 15:59:18 +0300

> Here are two cleanup patches.
> 
> 1. fix warning from debug_smp_processor_id().
> - reported by Cody P Schafer.
> 
> 2. avoid calling sched_clock() in select and poll when lls is disabled.
> - reported by Andi Kleen.
> 
> Thanks to everyone for their comments.

Applied, but like Ben said perhaps you want to remember the last cpu you
got the sched_clock() measurement from and abort the ll poll if it changes
on you instead of using a comparison between two cpus.

But then again, since preemption is enabled, the cpu could change
back and forth during the sched_clock() call, so you wouldn't be able
to reliably detect this anyways.

In the grand scheme of things all of this probably doesn't matter at
all.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next 0/2] net: lls cleanup patches
  2013-07-01 21:08 ` [PATCH net-next 0/2] net: lls cleanup patches David Miller
@ 2013-07-02  8:38   ` Eliezer Tamir
  2013-07-02  8:45     ` Eliezer Tamir
  0 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-07-02  8:38 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, willemb, erdnetdev, andi, hpa, devel-lists,
	eliezer

On 02/07/2013 00:08, David Miller wrote:
> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Date: Fri, 28 Jun 2013 15:59:18 +0300
>
>> Here are two cleanup patches.
>>
>> 1. fix warning from debug_smp_processor_id().
>> - reported by Cody P Schafer.
>>

> Applied, but like Ben said perhaps you want to remember the last cpu you
> got the sched_clock() measurement from and abort the ll poll if it changes
> on you instead of using a comparison between two cpus.
>
> But then again, since preemption is enabled, the cpu could change
> back and forth during the sched_clock() call, so you wouldn't be able
> to reliably detect this anyways.
>
> In the grand scheme of things all of this probably doesn't matter at
> all.

The only thing that really worries me is the possibility that the
time on the new cpu is completely random; then we could be back in
the range where time_after() is false again and end up polling for
another year.

A simple way to limit the damage would be to use time_in_range()
instead of time_after(), then if we have a completely random time we
would be out of the range and fail safely.
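
For example, with a 50us limit and a start stamp t taken on the old
cpu: time_after64() keeps polling for any new-cpu clock value below
t + 50us, which could be almost forever if the new cpu's clock is
far behind; time_in_range64(now, t, t + 50us) only keeps polling
while now is inside that one small window, so a wild clock value
fails safely.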

would something like this be an acceptable solution?

---
[PATCH net-next] net: convert lls to use time_in_range()

Time in range will fail safely if we move to a different cpu with an
extremely large clock skew.
Add time_in_range64() and convert lls to use it.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

  fs/select.c             |   10 ++++++----
  include/linux/jiffies.h |    4 ++++
  include/net/ll_poll.h   |   24 +++++++++++++++---------
  3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 3654075..f28a585 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -403,7 +403,8 @@ int do_select(int n, fd_set_bits *fds, struct 
timespec *end_time)
  	int retval, i, timed_out = 0;
  	unsigned long slack = 0;
  	unsigned int ll_flag = ll_get_flag();
-	u64 ll_time = ll_end_time();
+	u64 ll_start = ll_start_time(ll_flag);
+	u64 ll_time = ll_run_time();

  	rcu_read_lock();
  	retval = max_select_fd(n, fds);
@@ -498,7 +499,7 @@ int do_select(int n, fd_set_bits *fds, struct 
timespec *end_time)
  		}

  		/* only if on, have sockets with POLL_LL and not out of time */
-		if (ll_flag && can_ll && can_poll_ll(ll_time))
+		if (ll_flag && can_ll && can_poll_ll(ll_start, ll_time))
  			continue;

  		/*
@@ -770,7 +771,8 @@ static int do_poll(unsigned int nfds,  struct 
poll_list *list,
  	int timed_out = 0, count = 0;
  	unsigned long slack = 0;
  	unsigned int ll_flag = ll_get_flag();
-	u64 ll_time = ll_end_time();
+	u64 ll_start = ll_start_time(ll_flag);
+	u64 ll_time = ll_run_time();

  	/* Optimise the no-wait case */
  	if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
@@ -819,7 +821,7 @@ static int do_poll(unsigned int nfds,  struct 
poll_list *list,
  			break;

  		/* only if on, have sockets with POLL_LL and not out of time */
-		if (ll_flag && can_ll && can_poll_ll(ll_time))
+		if (ll_flag && can_ll && can_poll_ll(ll_start, ll_time))
  			continue;

  		/*
diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 8fb8edf..37b7354 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -139,6 +139,10 @@ static inline u64 get_jiffies_64(void)
  	 ((__s64)(a) - (__s64)(b) >= 0))
  #define time_before_eq64(a,b)	time_after_eq64(b,a)

+#define time_in_range64(a,b,c) \
+	(time_after_eq64(a,b) && \
+	 time_before_eq64(a,c))
+
  /*
   * These four macros compare jiffies and 'a' for convenience.
   */
diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index 6c06f7c..61c2daf 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -67,19 +67,23 @@ static inline u64 ll_sched_clock(void)
  /* we don't mind a ~2.5% imprecision so <<10 instead of *1000
   * sk->sk_ll_usec is a u_int so this can't overflow
   */
-static inline u64 ll_sk_end_time(struct sock *sk)
+static inline u64 ll_sk_run_time(struct sock *sk)
  {
-	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + ll_sched_clock();
+	return (u64)ACCESS_ONCE(sk->sk_ll_usec) << 10;
  }

  /* in poll/select we use the global sysctl_net_ll_poll value
   * only call sched_clock() if enabled
   */
-static inline u64 ll_end_time(void)
+static inline u64 ll_run_time(void)
  {
-	u64 end_time = ACCESS_ONCE(sysctl_net_ll_poll);
+	return (u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10;
+}

-	return end_time ? (end_time << 10) + ll_sched_clock() : 0;
+/* if flag is not set we don't need to know the time */
+static inline u64 ll_start_time(unsigned int flag)
+{
+	return flag ? ll_sched_clock() : 0;
  }

  static inline bool sk_valid_ll(struct sock *sk)
@@ -88,9 +92,10 @@ static inline bool sk_valid_ll(struct sock *sk)
  	       !need_resched() && !signal_pending(current);
  }

-static inline bool can_poll_ll(u64 end_time)
+static inline bool can_poll_ll(u64 start_time, u64 run_time)
  {
-	return !time_after64(ll_sched_clock(), end_time);
+	return time_in_range64(ll_sched_clock(), start_time,
+			       start_time + run_time);
  }

  /* when used in sock_poll() nonblock is known at compile time to be true
@@ -98,7 +103,8 @@ static inline bool can_poll_ll(u64 end_time)
   */
  static inline bool sk_poll_ll(struct sock *sk, int nonblock)
  {
-	u64 end_time = nonblock ? 0 : ll_sk_end_time(sk);
+	u64 start_time = ll_start_time(!nonblock);
+	u64 run_time = ll_sk_run_time(sk);
  	const struct net_device_ops *ops;
  	struct napi_struct *napi;
  	int rc = false;
@@ -129,7 +135,7 @@ static inline bool sk_poll_ll(struct sock *sk, int 
nonblock)
  					 LINUX_MIB_LOWLATENCYRXPACKETS, rc);

  	} while (!nonblock && skb_queue_empty(&sk->sk_receive_queue) &&
-		 can_poll_ll(end_time));
+		 can_poll_ll(start_time, run_time));

  	rc = !skb_queue_empty(&sk->sk_receive_queue);
  out:




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next 0/2] net: lls cleanup patches
  2013-07-02  8:38   ` Eliezer Tamir
@ 2013-07-02  8:45     ` Eliezer Tamir
  2013-07-02  9:49       ` [PATCH v2 net-next] net: convert lls to use time_in_range() Eliezer Tamir
  0 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-07-02  8:45 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, willemb, erdnetdev, andi, hpa, devel-lists,
	eliezer

On 02/07/2013 11:38, Eliezer Tamir wrote:
> On 02/07/2013 00:08, David Miller wrote:
>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> Date: Fri, 28 Jun 2013 15:59:18 +0300
>>
>>> Here are two cleanup patches.
>>>
>>> 1. fix warning from debug_smp_processor_id().
>>> - reported by Cody P Schafer.
>>>
>
>> Applied, but like Ben said perhaps you want to remember the last cpu you
>> got the sched_clock() measurement from and abort the ll poll if it
>> changes
>> on you instead of using a comparison between two cpus.
>>
>> But then again, since preemption is enabled, the cpu could change
>> back and forth during the sched_clock() call, so you wouldn't be able
>> to reliably detect this anyways.
>>
>> In the grand scheme of things all of this probably doesn't matter at
>> all.
>
> The only thing that really worries me is the possibility that the
> time on the new cpu is completely random; then we could be back in
> the range where time_after() is false again and end up polling for
> another year.
>
> A simple way to limit the damage would be to use time_in_range()
> instead of time_after(), then if we have a completely random time we
> would be out of the range and fail safely.
>
> would something like this be an acceptable solution?
>
> ---

actually, this code has a bug.

> -static inline bool can_poll_ll(u64 end_time)
> +static inline bool can_poll_ll(u64 start_time, u64 run_time)
>   {
> -    return !time_after64(ll_sched_clock(), end_time);
> +    return time_in_range64(ll_sched_clock(), start_time,
> +                   start_time + run_time);
>   }
>
this will call sched_clock() twice.
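
(time_in_range64() is a macro, so passing ll_sched_clock() as its
first argument expands into both halves of the check:

	time_after_eq64(ll_sched_clock(), start_time) &&
	time_before_eq64(ll_sched_clock(), start_time + run_time)

so the clock gets read twice; reading it into a local first avoids
the second call.)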

I will send a fix after I test it.

-Eliezer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02  8:45     ` Eliezer Tamir
@ 2013-07-02  9:49       ` Eliezer Tamir
  2013-07-02 19:56         ` David Miller
  2013-07-02 20:10         ` Ben Hutchings
  0 siblings, 2 replies; 17+ messages in thread
From: Eliezer Tamir @ 2013-07-02  9:49 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, netdev, willemb, erdnetdev, andi, hpa, devel-lists,
	eliezer

Time in range will fail safely if we move to a different cpu with an
extremely large clock skew.
Add time_in_range64() and convert lls to use it.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
v1->v2
fixed double call to sched_clock() in can_poll_ll(), checkpatchisms


  fs/select.c             |   10 ++++++----
  include/linux/jiffies.h |    4 ++++
  include/net/ll_poll.h   |   26 +++++++++++++++++---------
  3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 3654075..f28a585 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -403,7 +403,8 @@ int do_select(int n, fd_set_bits *fds, struct 
timespec *end_time)
  	int retval, i, timed_out = 0;
  	unsigned long slack = 0;
  	unsigned int ll_flag = ll_get_flag();
-	u64 ll_time = ll_end_time();
+	u64 ll_start = ll_start_time(ll_flag);
+	u64 ll_time = ll_run_time();

  	rcu_read_lock();
  	retval = max_select_fd(n, fds);
@@ -498,7 +499,7 @@ int do_select(int n, fd_set_bits *fds, struct 
timespec *end_time)
  		}

  		/* only if on, have sockets with POLL_LL and not out of time */
-		if (ll_flag && can_ll && can_poll_ll(ll_time))
+		if (ll_flag && can_ll && can_poll_ll(ll_start, ll_time))
  			continue;

  		/*
@@ -770,7 +771,8 @@ static int do_poll(unsigned int nfds,  struct 
poll_list *list,
  	int timed_out = 0, count = 0;
  	unsigned long slack = 0;
  	unsigned int ll_flag = ll_get_flag();
-	u64 ll_time = ll_end_time();
+	u64 ll_start = ll_start_time(ll_flag);
+	u64 ll_time = ll_run_time();

  	/* Optimise the no-wait case */
  	if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
@@ -819,7 +821,7 @@ static int do_poll(unsigned int nfds,  struct 
poll_list *list,
  			break;

  		/* only if on, have sockets with POLL_LL and not out of time */
-		if (ll_flag && can_ll && can_poll_ll(ll_time))
+		if (ll_flag && can_ll && can_poll_ll(ll_start, ll_time))
  			continue;

  		/*
diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 8fb8edf..97ba4e7 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -139,6 +139,10 @@ static inline u64 get_jiffies_64(void)
  	 ((__s64)(a) - (__s64)(b) >= 0))
  #define time_before_eq64(a,b)	time_after_eq64(b,a)

+#define time_in_range64(a, b, c) \
+	(time_after_eq64(a, b) && \
+	 time_before_eq64(a, c))
+
  /*
   * These four macros compare jiffies and 'a' for convenience.
   */
diff --git a/include/net/ll_poll.h b/include/net/ll_poll.h
index 6c06f7c..b76f004 100644
--- a/include/net/ll_poll.h
+++ b/include/net/ll_poll.h
@@ -67,19 +67,23 @@ static inline u64 ll_sched_clock(void)
  /* we don't mind a ~2.5% imprecision so <<10 instead of *1000
   * sk->sk_ll_usec is a u_int so this can't overflow
   */
-static inline u64 ll_sk_end_time(struct sock *sk)
+static inline u64 ll_sk_run_time(struct sock *sk)
  {
-	return ((u64)ACCESS_ONCE(sk->sk_ll_usec) << 10) + ll_sched_clock();
+	return (u64)ACCESS_ONCE(sk->sk_ll_usec) << 10;
  }

  /* in poll/select we use the global sysctl_net_ll_poll value
   * only call sched_clock() if enabled
   */
-static inline u64 ll_end_time(void)
+static inline u64 ll_run_time(void)
  {
-	u64 end_time = ACCESS_ONCE(sysctl_net_ll_poll);
+	return (u64)ACCESS_ONCE(sysctl_net_ll_poll) << 10;
+}

-	return end_time ? (end_time << 10) + ll_sched_clock() : 0;
+/* if flag is not set we don't need to know the time */
+static inline u64 ll_start_time(unsigned int flag)
+{
+	return flag ? ll_sched_clock() : 0;
  }

  static inline bool sk_valid_ll(struct sock *sk)
@@ -88,9 +92,12 @@ static inline bool sk_valid_ll(struct sock *sk)
  	       !need_resched() && !signal_pending(current);
  }

-static inline bool can_poll_ll(u64 end_time)
+/* careful! time_in_range64 will evaluate now twice */
+static inline bool can_poll_ll(u64 start_time, u64 run_time)
  {
-	return !time_after64(ll_sched_clock(), end_time);
+	u64 now = ll_sched_clock();
+
+	return time_in_range64(now, start_time, start_time + run_time);
  }

  /* when used in sock_poll() nonblock is known at compile time to be true
@@ -98,7 +105,8 @@ static inline bool can_poll_ll(u64 end_time)
   */
  static inline bool sk_poll_ll(struct sock *sk, int nonblock)
  {
-	u64 end_time = nonblock ? 0 : ll_sk_end_time(sk);
+	u64 start_time = ll_start_time(!nonblock);
+	u64 run_time = ll_sk_run_time(sk);
  	const struct net_device_ops *ops;
  	struct napi_struct *napi;
  	int rc = false;
@@ -129,7 +137,7 @@ static inline bool sk_poll_ll(struct sock *sk, int 
nonblock)
  					 LINUX_MIB_LOWLATENCYRXPACKETS, rc);

  	} while (!nonblock && skb_queue_empty(&sk->sk_receive_queue) &&
-		 can_poll_ll(end_time));
+		 can_poll_ll(start_time, run_time));

  	rc = !skb_queue_empty(&sk->sk_receive_queue);
  out:


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02  9:49       ` [PATCH v2 net-next] net: convert lls to use time_in_range() Eliezer Tamir
@ 2013-07-02 19:56         ` David Miller
  2013-07-02 20:10         ` Ben Hutchings
  1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2013-07-02 19:56 UTC (permalink / raw)
  To: eliezer.tamir
  Cc: linux-kernel, netdev, willemb, erdnetdev, andi, hpa, devel-lists,
	eliezer

From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Tue, 02 Jul 2013 12:49:58 +0300

> @@ -403,7 +403,8 @@ int do_select(int n, fd_set_bits *fds, struct
> timespec *end_time)

Your email client has corrupted this patch, please repost with this
issue corrected.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02  9:49       ` [PATCH v2 net-next] net: convert lls to use time_in_range() Eliezer Tamir
  2013-07-02 19:56         ` David Miller
@ 2013-07-02 20:10         ` Ben Hutchings
  2013-07-02 20:28           ` Eliezer Tamir
  1 sibling, 1 reply; 17+ messages in thread
From: Ben Hutchings @ 2013-07-02 20:10 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: David Miller, linux-kernel, netdev, willemb, erdnetdev, andi,
	hpa, devel-lists, eliezer

On Tue, 2013-07-02 at 12:49 +0300, Eliezer Tamir wrote:
> Time in range will fail safely if we move to a different cpu with an
> extremely large clock skew.
> Add time_in_range64() and convert lls to use it.
> 
> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> ---
> v1->v2
> fixed double call to sched_clock() in can_poll_ll(), checkpatchisms
[...]
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -139,6 +139,10 @@ static inline u64 get_jiffies_64(void)
>   	 ((__s64)(a) - (__s64)(b) >= 0))
>   #define time_before_eq64(a,b)	time_after_eq64(b,a)
> 
> +#define time_in_range64(a, b, c) \
> +	(time_after_eq64(a, b) && \
> +	 time_before_eq64(a, c))
[...]

Why not make this an inline function, so the caller doesn't need to
worry about repeated evaluation?
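
Something like this, perhaps (untested sketch):

	static inline bool time_in_range64(u64 a, u64 b, u64 c)
	{
		return time_after_eq64(a, b) && time_before_eq64(a, c);
	}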

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02 20:10         ` Ben Hutchings
@ 2013-07-02 20:28           ` Eliezer Tamir
  2013-07-02 20:42             ` Ben Hutchings
  0 siblings, 1 reply; 17+ messages in thread
From: Eliezer Tamir @ 2013-07-02 20:28 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, linux-kernel, netdev, willemb, erdnetdev, andi,
	hpa, devel-lists, eliezer

On 02/07/2013 23:10, Ben Hutchings wrote:
> On Tue, 2013-07-02 at 12:49 +0300, Eliezer Tamir wrote:
>> Time in range will fail safely if we move to a different cpu with an
>> extremely large clock skew.
>> Add time_in_range64() and convert lls to use it.
>>
>> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> ---
>> v1->v2
>> fixed double call to sched_clock() in can_poll_ll(), checkpatchisms

>> +#define time_in_range64(a, b, c) \
>> +	(time_after_eq64(a, b) && \
>> +	 time_before_eq64(a, c))
> [...]
>
> Why not make this an inline function, so the caller doesn't need to
> worry about repeated evaluation?

I was following the conventions in jiffies.h
(well almost, I did add a few spaces to make checkpatch happy)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02 20:28           ` Eliezer Tamir
@ 2013-07-02 20:42             ` Ben Hutchings
  2013-07-03  7:00               ` Eliezer Tamir
  0 siblings, 1 reply; 17+ messages in thread
From: Ben Hutchings @ 2013-07-02 20:42 UTC (permalink / raw)
  To: Eliezer Tamir
  Cc: David Miller, linux-kernel, netdev, willemb, erdnetdev, andi,
	hpa, devel-lists, eliezer

On Tue, 2013-07-02 at 23:28 +0300, Eliezer Tamir wrote:
> On 02/07/2013 23:10, Ben Hutchings wrote:
> > On Tue, 2013-07-02 at 12:49 +0300, Eliezer Tamir wrote:
> >> Time in range will fail safely if we move to a different cpu with an
> >> extremely large clock skew.
> >> Add time_in_range64() and convert lls to use it.
> >>
> >> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> >> ---
> >> v1->v2
> >> fixed double call to sched_clock() in can_poll_ll(), checkpatchisms
> 
> >> +#define time_in_range64(a, b, c) \
> >> +	(time_after_eq64(a, b) && \
> >> +	 time_before_eq64(a, c))
> > [...]
> >
> > Why not make this an inline function, so the caller doesn't need to
> > worry about repeated evaluation?
> 
> I was following the conventions in jiffies.h
> (well almost, I did add a few spaces to make checkpatch happy)

I see, but now you have a good reason to change that convention.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 net-next] net: convert lls to use time_in_range()
  2013-07-02 20:42             ` Ben Hutchings
@ 2013-07-03  7:00               ` Eliezer Tamir
  0 siblings, 0 replies; 17+ messages in thread
From: Eliezer Tamir @ 2013-07-03  7:00 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, linux-kernel, netdev, willemb, erdnetdev, andi,
	hpa, devel-lists, eliezer

On 02/07/2013 23:42, Ben Hutchings wrote:
> On Tue, 2013-07-02 at 23:28 +0300, Eliezer Tamir wrote:
>> On 02/07/2013 23:10, Ben Hutchings wrote:
>>> On Tue, 2013-07-02 at 12:49 +0300, Eliezer Tamir wrote:
>>>> Time in range will fail safely if we move to a different cpu with an
>>>> extremely large clock skew.
>>>> Add time_in_range64() and convert lls to use it.
>>>>
>>>> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>>>> ---
>>>> v1->v2
>>>> fixed double call to sched_clock() in can_poll_ll(), checkpatchisms
>>
>>>> +#define time_in_range64(a, b, c) \
>>>> +	(time_after_eq64(a, b) && \
>>>> +	 time_before_eq64(a, c))
>>> [...]
>>>
>>> Why not make this an inline function, so the caller doesn't need to
>>> worry about repeated evaluation?
>>
>> I was following the conventions in jiffies.h
>> (well almost, I did add a few spaces to make checkpatch happy)
>
> I see, but now you have a good reason to change that convention.

I'm not sure that an inline function is always a win.
Macros do get evaluated at an earlier stage.

Having just the new function use a different convention is plain
wrong; people are virtually guaranteed not to notice until they
introduce a bug.

Are you suggesting we convert all of jiffies.h into inline functions?

Invoking the principle of least astonishment, I think it's best to keep 
the changes minimal.

Or would you prefer to risk breaking things all over in subtle ways
that are sure to get Linus to go voodoo on your hamster?

I for one care about hamsters.

-Eliezer

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread

Thread overview: 17+ messages
2013-06-28 12:59 [PATCH net-next 0/2] net: lls cleanup patches Eliezer Tamir
2013-06-28 12:59 ` [PATCH net-next 1/2] net: fix LLS debug_smp_processor_id() warning Eliezer Tamir
2013-06-28 16:51   ` Using sched_clock() for polling time limit Ben Hutchings
2013-06-29 18:50     ` Eliezer Tamir
2013-07-01 19:48       ` Ben Hutchings
2013-06-28 12:59 ` [PATCH net-next 2/2] net: avoid calling sched_clock when LLS is off Eliezer Tamir
2013-06-28 14:38   ` Andi Kleen
2013-06-28 14:54     ` Eliezer Tamir
2013-07-01 21:08 ` [PATCH net-next 0/2] net: lls cleanup patches David Miller
2013-07-02  8:38   ` Eliezer Tamir
2013-07-02  8:45     ` Eliezer Tamir
2013-07-02  9:49       ` [PATCH v2 net-next] net: convert lls to use time_in_range() Eliezer Tamir
2013-07-02 19:56         ` David Miller
2013-07-02 20:10         ` Ben Hutchings
2013-07-02 20:28           ` Eliezer Tamir
2013-07-02 20:42             ` Ben Hutchings
2013-07-03  7:00               ` Eliezer Tamir
