All of lore.kernel.org
 help / color / mirror / Atom feed
* Calculate average latencies on the fly
@ 2016-11-14 10:00 Piotr Gregor
  2016-11-14 11:50 ` Piotr Gregor
  0 siblings, 1 reply; 10+ messages in thread
From: Piotr Gregor @ 2016-11-14 10:00 UTC (permalink / raw)
  To: Clark Williams, John Kacur; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 312 bytes --]

Hi,

There is a possibility for average to overflow as it is first accumulated
and only divided when printing statistics. Attached is the patch
to calculate average on the go.

Patch is diffed between stable/v1.0 branch
git format-patch origin/stable/v1.0 --stdout > calc_average_on_the_fly.patch

cheers,
Piotr

[-- Attachment #2: calc_average_on_the_fly.patch --]
[-- Type: text/x-patch, Size: 1753 bytes --]

From cb9a85222dff1a299fde25a96243139cde6a4947 Mon Sep 17 00:00:00 2001
From: Piotr Gregor <piotrgregor@rsyncme.org>
Date: Fri, 11 Nov 2016 11:28:09 +0000
Subject: [PATCH] rt-tests: cyclictest: Calculate average latencies on the fly

Prevents stat->avg variable from overflow.
---
 src/cyclictest/cyclictest.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c
index 3f1bef1..2c4e450 100644
--- a/src/cyclictest/cyclictest.c
+++ b/src/cyclictest/cyclictest.c
@@ -1178,7 +1178,11 @@ static void *timerthread(void *param)
 			if (refresh_on_max)
 				pthread_cond_signal(&refresh_on_max_cond);
 		}
-		stat->avg += (double) diff;
+		if (stat->cycles == 0) {
+			stat->avg = (double) diff;
+		} else {
+			stat->avg = (stat->avg + (double) diff) * ((double) stat->cycles / (double) (stat->cycles + 1));
+		}
 
 		if (trigger && (diff > trigger)) {
 			trigger_update(par, diff, calctime(now));
@@ -2005,7 +2009,7 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	fprintf(fd, "# Avg Latencies:");
 	for (j = 0; j < nthreads; j++)
 		fprintf(fd, " %05lu", par[j]->stats->cycles ?
-		       (long)(par[j]->stats->avg/par[j]->stats->cycles) : 0);
+		       (long)(par[j]->stats->avg) : 0);
 	fprintf(fd, "\n");
 	fprintf(fd, "# Max Latencies:");
 	maxmax = 0;
@@ -2059,7 +2063,7 @@ static void print_stat(FILE *fp, struct thread_param *par, int index, int verbos
 			fprintf(fp, fmt, index, stat->tid, par->prio,
 				par->interval, stat->cycles, stat->min,
 				stat->act, stat->cycles ?
-				(long)(stat->avg/stat->cycles) : 0, stat->max);
+				(long)(stat->avg) : 0, stat->max);
 
 			if (smi)
 				fprintf(fp," SMI:%8ld", stat->smi_count);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-14 10:00 Calculate average latencies on the fly Piotr Gregor
@ 2016-11-14 11:50 ` Piotr Gregor
  2016-11-14 15:42   ` Mathieu Poirier
  2016-11-16 14:47   ` John Kacur
  0 siblings, 2 replies; 10+ messages in thread
From: Piotr Gregor @ 2016-11-14 11:50 UTC (permalink / raw)
  To: Clark Williams, John Kacur; +Cc: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 91 bytes --]

There was an error in previous patch. Please find attached corrected patch.

cheers,
Piotr

[-- Attachment #2: calc_average_on_the_fly.patch --]
[-- Type: text/x-patch, Size: 1777 bytes --]

From 5610e2b01e5302a5ecff7ca368198863a6ce9f78 Mon Sep 17 00:00:00 2001
From: Piotr Gregor <piotrgregor@rsyncme.org>
Date: Fri, 11 Nov 2016 11:28:09 +0000
Subject: [PATCH] rt-tests: cyclictest: Calculate average latencies on the fly

Prevents stat->avg variable from overflow.
---
 src/cyclictest/cyclictest.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c
index 3f1bef1..0d2ddbc 100644
--- a/src/cyclictest/cyclictest.c
+++ b/src/cyclictest/cyclictest.c
@@ -1178,7 +1178,11 @@ static void *timerthread(void *param)
 			if (refresh_on_max)
 				pthread_cond_signal(&refresh_on_max_cond);
 		}
-		stat->avg += (double) diff;
+		if (stat->cycles == 0) {
+			stat->avg = (double) diff;
+		} else {
+			stat->avg = (stat->avg + (double) diff / (double) stat->cycles) * ((double) stat->cycles / (double) (stat->cycles + 1));
+		}
 
 		if (trigger && (diff > trigger)) {
 			trigger_update(par, diff, calctime(now));
@@ -2005,7 +2009,7 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	fprintf(fd, "# Avg Latencies:");
 	for (j = 0; j < nthreads; j++)
 		fprintf(fd, " %05lu", par[j]->stats->cycles ?
-		       (long)(par[j]->stats->avg/par[j]->stats->cycles) : 0);
+		       (long)(par[j]->stats->avg) : 0);
 	fprintf(fd, "\n");
 	fprintf(fd, "# Max Latencies:");
 	maxmax = 0;
@@ -2059,7 +2063,7 @@ static void print_stat(FILE *fp, struct thread_param *par, int index, int verbos
 			fprintf(fp, fmt, index, stat->tid, par->prio,
 				par->interval, stat->cycles, stat->min,
 				stat->act, stat->cycles ?
-				(long)(stat->avg/stat->cycles) : 0, stat->max);
+				(long)(stat->avg) : 0, stat->max);
 
 			if (smi)
 				fprintf(fp," SMI:%8ld", stat->smi_count);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-14 11:50 ` Piotr Gregor
@ 2016-11-14 15:42   ` Mathieu Poirier
  2016-11-14 16:11     ` Piotr Gregor
  2016-11-16 14:47   ` John Kacur
  1 sibling, 1 reply; 10+ messages in thread
From: Mathieu Poirier @ 2016-11-14 15:42 UTC (permalink / raw)
  To: Piotr Gregor; +Cc: Clark Williams, John Kacur, linux-rt-users

On 14 November 2016 at 04:50, Piotr Gregor <piotrgregor@rsyncme.org> wrote:
> There was an error in previous patch. Please find attached corrected patch.
>
> cheers,
> Piotr

Piort,

I am fairly new to this mailing list but if it is anything like the
kernel's, you will want to send your patch using git send-email.  It
is probably a good idea to run checkpatch.pl on it too - that way
you'll avoid all sort of style problems like lines over 80 characters
long.

Mathieu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-14 15:42   ` Mathieu Poirier
@ 2016-11-14 16:11     ` Piotr Gregor
  2016-11-14 16:58       ` Clark Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Piotr Gregor @ 2016-11-14 16:11 UTC (permalink / raw)
  To: Mathieu Poirier; +Cc: Clark Williams, John Kacur, linux-rt-users

Hi Mathieu,

I am newer most likely - many thanks for your advise.

cheers,
Piotr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-14 16:11     ` Piotr Gregor
@ 2016-11-14 16:58       ` Clark Williams
  0 siblings, 0 replies; 10+ messages in thread
From: Clark Williams @ 2016-11-14 16:58 UTC (permalink / raw)
  To: Piotr Gregor; +Cc: Mathieu Poirier, John Kacur, linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]

On Mon, 14 Nov 2016 16:11:58 +0000
Piotr Gregor <piotrgregor@rsyncme.org> wrote:

> Hi Mathieu,
> 
> I am newer most likely - many thanks for your advise.
> 
> cheers,
> Piotr
> --

Luckily this list is *much* lower traffic than LKML and we're not quite as strict as they are, since we don't have the volume of patches to deal with. 

That being said, very good advice and I'll make a note to follow it :)

Piotr, John Kacur or I will look at adding in your patch to cyclictest shortly. Thanks for the submission.

-- 
United States Coast Guard
Ruining Natural Selection since 1790

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-14 11:50 ` Piotr Gregor
  2016-11-14 15:42   ` Mathieu Poirier
@ 2016-11-16 14:47   ` John Kacur
  2016-11-19 23:27     ` Piotr Gregor
  1 sibling, 1 reply; 10+ messages in thread
From: John Kacur @ 2016-11-16 14:47 UTC (permalink / raw)
  To: Piotr Gregor; +Cc: Clark Williams, linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 993 bytes --]


Please inline the patches

There are 1000000 microseconds / second, the default interval is 1000us, 
but let's say we did it more frequently at 100us

1000000 us /  100 intervals = 10000 samples / second * 60 sec/min * 60 
min/hr * 24 hrs/day * 365 days / year * 1000 (very pessimistic diff) =
an accumulated diff of 3.1536×10^14

With a max double of 9x10^15/3.15*10^14 we could run for 30 years before 
we move into normal numbers.

So, the problem with your patch, is not just the problem which it proposes 
to fix, is one I see as mostly theoretical, but there is the potential for 
accumulating small inaccuracies.

Of course, there is always the chance that I made some kind of error in my 
analysis here, please feel free to point it out!

What might be more interesting is how long would it take for the cycles to 
overflow?

Thanks

John

On Mon, 14 Nov 2016, Piotr Gregor wrote:

> There was an error in previous patch. Please find attached corrected patch.
> 
> cheers,
> Piotr
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-16 14:47   ` John Kacur
@ 2016-11-19 23:27     ` Piotr Gregor
  2016-11-20  5:03       ` Tracy Smith
  2016-11-21 18:32       ` John Kacur
  0 siblings, 2 replies; 10+ messages in thread
From: Piotr Gregor @ 2016-11-19 23:27 UTC (permalink / raw)
  To: John Kacur; +Cc: Clark Williams, Linux RT Users

Hi John, Clark,

Recently I observed jitter of > 100 000 during tests of virtualized system.
In that case 30 years would change into 0.3 year which is not as much.
The jitter depends on the hardware/software you test, what if someone
deliberately wishes to measure jitter of more than this or he is just
out of luck and gets huge values?
Also, the limit will change if the type of average changes, but if it
is calculated in the way I proposed, it will always work.
Apart from this, since the benign possibility exists - why not to fix this
if the fix is easy.

cheers,
Piotr

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-19 23:27     ` Piotr Gregor
@ 2016-11-20  5:03       ` Tracy Smith
  2016-11-21 18:32       ` John Kacur
  1 sibling, 0 replies; 10+ messages in thread
From: Tracy Smith @ 2016-11-20  5:03 UTC (permalink / raw)
  To: Piotr Gregor; +Cc: John Kacur, Clark Williams, Linux RT Users

Piotr, please explain what you believe to be the root cause?

> On Nov 19, 2016, at 5:27 PM, Piotr Gregor <piotrgregor@rsyncme.org> wrote:
> 
> Hi John, Clark,
> 
> Recently I observed jitter of > 100 000 during tests of virtualized system.
> In that case 30 years would change into 0.3 year which is not as much.
> The jitter depends on the hardware/software you test, what if someone
> deliberately wishes to measure jitter of more than this or he is just
> out of luck and gets huge values?
> Also, the limit will change if the type of average changes, but if it
> is calculated in the way I proposed, it will always work.
> Apart from this, since the benign possibility exists - why not to fix this
> if the fix is easy.
> 
> cheers,
> Piotr
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
  2016-11-19 23:27     ` Piotr Gregor
  2016-11-20  5:03       ` Tracy Smith
@ 2016-11-21 18:32       ` John Kacur
       [not found]         ` <CAChUvXOj_D2N3o7BoEUjt1A1O0fXZnvGSz5ba=O55h12ns___A@mail.gmail.com>
  1 sibling, 1 reply; 10+ messages in thread
From: John Kacur @ 2016-11-21 18:32 UTC (permalink / raw)
  To: Piotr Gregor; +Cc: Clark Williams, Linux RT Users



On Sat, 19 Nov 2016, Piotr Gregor wrote:

> Hi John, Clark,
> 
> Recently I observed jitter of > 100 000 during tests of virtualized system.
> In that case 30 years would change into 0.3 year which is not as much.
> The jitter depends on the hardware/software you test, what if someone
> deliberately wishes to measure jitter of more than this or he is just
> out of luck and gets huge values?
> Also, the limit will change if the type of average changes, but if it
> is calculated in the way I proposed, it will always work.
> Apart from this, since the benign possibility exists - why not to fix this
> if the fix is easy.
> 

I already explained why. Everytime you do a floating point long division 
you are potentially accumulating small errors. Something is terribly 
wrong if you are getting a diff of greater than 100,000, if that really 
is true you need to fix it right away instead of run cyclictest for a 
long period of time.

John> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Calculate average latencies on the fly
       [not found]         ` <CAChUvXOj_D2N3o7BoEUjt1A1O0fXZnvGSz5ba=O55h12ns___A@mail.gmail.com>
@ 2016-11-22 15:43           ` John Kacur
  0 siblings, 0 replies; 10+ messages in thread
From: John Kacur @ 2016-11-22 15:43 UTC (permalink / raw)
  To: Tracy Smith; +Cc: Piotr Gregor, Clark Williams, Linux RT Users



----- Original Message -----
> John,
> 
> Latency of 100,000 seems to me to be a bug when doing floating point long
> division, which potentially accumulates small errors.
> 
> Is the issue in the cyclictest calculation or in the kernel?  Or, the FPU?
> 
> If in the kernel, where do we need to trace to check for increased latency
> when doing floating point long division?
> 
> Piotr, needs to trace where the latency is occurring and fix.  Any
> recommendations on a possible fix at this stage?
> 
> Thx, Tracy
> 

Sure, you make some good points, it could be a cyclictest bug, it could be in the kernel, 
he also said he was doing this in a virtualized system, it could be somewhere in that stack.
The first step of course is to see if you can come up with a way to reliably reproduce it.

Thanks

John

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-11-22 15:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-14 10:00 Calculate average latencies on the fly Piotr Gregor
2016-11-14 11:50 ` Piotr Gregor
2016-11-14 15:42   ` Mathieu Poirier
2016-11-14 16:11     ` Piotr Gregor
2016-11-14 16:58       ` Clark Williams
2016-11-16 14:47   ` John Kacur
2016-11-19 23:27     ` Piotr Gregor
2016-11-20  5:03       ` Tracy Smith
2016-11-21 18:32       ` John Kacur
     [not found]         ` <CAChUvXOj_D2N3o7BoEUjt1A1O0fXZnvGSz5ba=O55h12ns___A@mail.gmail.com>
2016-11-22 15:43           ` John Kacur

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.