* [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs @ 2019-12-08 21:06 Uwe Kleine-König 2019-12-09 9:40 ` Daniel Wagner ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Uwe Kleine-König @ 2019-12-08 21:06 UTC (permalink / raw) To: Clark Williams, John Kacur; +Cc: linux-rt-users This fixes a build error on arm64, mips*, ppc and several others --- src/queuelat/queuelat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/queuelat/queuelat.c b/src/queuelat/queuelat.c index cccb50ef0cc4..98346f346f82 100644 --- a/src/queuelat/queuelat.c +++ b/src/queuelat/queuelat.c @@ -283,7 +283,7 @@ static inline unsigned long long __rdtscll(void) #define gettick(val) do { (val) = __rdtscll(); } while (0) -#elif defined __arm__ +#else static inline unsigned long long __clock_gettime(void) { -- 2.24.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-08 21:06 [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs Uwe Kleine-König @ 2019-12-09 9:40 ` Daniel Wagner 2019-12-10 11:24 ` John Kacur 2019-12-12 17:31 ` Sebastian Andrzej Siewior 2019-12-10 11:20 ` John Kacur 2019-12-12 17:46 ` Sebastian Andrzej Siewior 2 siblings, 2 replies; 11+ messages in thread From: Daniel Wagner @ 2019-12-09 9:40 UTC (permalink / raw) To: Uwe Kleine-König, Clark Williams, John Kacur; +Cc: linux-rt-users Hi Uwe, On 2019-12-08 22:06, Uwe Kleine-König wrote: > This fixes a build error on arm64, mips*, ppc and several others Just wondering if the tool should print a warning if the fallback is used? IIRC, the code wants to use the TSC and clock_gettime is probably not so precise. Thanks, Daniel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-09 9:40 ` Daniel Wagner @ 2019-12-10 11:24 ` John Kacur 2019-12-12 17:31 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 11+ messages in thread From: John Kacur @ 2019-12-10 11:24 UTC (permalink / raw) To: Daniel Wagner; +Cc: Uwe Kleine-König, Clark Williams, linux-rt-users [-- Attachment #1: Type: text/plain, Size: 393 bytes --] On Mon, 9 Dec 2019, Daniel Wagner wrote: > Hi Uwe, > > On 2019-12-08 22:06, Uwe Kleine-König wrote: > > This fixes a build error on arm64, mips*, ppc and several others > > Just wondering if the tool should print a warning if the fallback is used? > IIRC, the code wants to use the TSC and clock_gettime is probably not so > precise. > > Thanks, > Daniel > I take patches! John Kacur ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-09 9:40 ` Daniel Wagner 2019-12-10 11:24 ` John Kacur @ 2019-12-12 17:31 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 11+ messages in thread From: Sebastian Andrzej Siewior @ 2019-12-12 17:31 UTC (permalink / raw) To: Daniel Wagner Cc: Uwe Kleine-König, Clark Williams, John Kacur, linux-rt-users On 2019-12-09 10:40:29 [+0100], Daniel Wagner wrote: > Just wondering if the tool should print a warning if the fallback is used? > IIRC, the code wants to use the TSC and clock_gettime is probably not so > precise. clock_gettime() is precise but it might have more overhead. With VDSO support the overhead is quite low. > Thanks, > Daniel Sebastian ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-08 21:06 [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs Uwe Kleine-König 2019-12-09 9:40 ` Daniel Wagner @ 2019-12-10 11:20 ` John Kacur 2019-12-12 17:46 ` Sebastian Andrzej Siewior 2 siblings, 0 replies; 11+ messages in thread From: John Kacur @ 2019-12-10 11:20 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Clark Williams, linux-rt-users [-- Attachment #1: Type: text/plain, Size: 685 bytes --] On Sun, 8 Dec 2019, Uwe Kleine-König wrote: > This fixes a build error on arm64, mips*, ppc and several others > --- > src/queuelat/queuelat.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/queuelat/queuelat.c b/src/queuelat/queuelat.c > index cccb50ef0cc4..98346f346f82 100644 > --- a/src/queuelat/queuelat.c > +++ b/src/queuelat/queuelat.c > @@ -283,7 +283,7 @@ static inline unsigned long long __rdtscll(void) > > #define gettick(val) do { (val) = __rdtscll(); } while (0) > > -#elif defined __arm__ > +#else > > static inline unsigned long long __clock_gettime(void) > { > -- > 2.24.0 > > Signed-off-by: John Kacur <jkacur@redhat.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-08 21:06 [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs Uwe Kleine-König 2019-12-09 9:40 ` Daniel Wagner 2019-12-10 11:20 ` John Kacur @ 2019-12-12 17:46 ` Sebastian Andrzej Siewior 2019-12-13 12:41 ` Daniel Wagner 2019-12-13 14:54 ` John Kacur 2 siblings, 2 replies; 11+ messages in thread From: Sebastian Andrzej Siewior @ 2019-12-12 17:46 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Clark Williams, John Kacur, linux-rt-users On 2019-12-08 22:06:25 [+0100], Uwe Kleine-König wrote: > This fixes a build error on arm64, mips*, ppc and several others > --- > src/queuelat/queuelat.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/queuelat/queuelat.c b/src/queuelat/queuelat.c > index cccb50ef0cc4..98346f346f82 100644 > --- a/src/queuelat/queuelat.c > +++ b/src/queuelat/queuelat.c > @@ -283,7 +283,7 @@ static inline unsigned long long __rdtscll(void) > > #define gettick(val) do { (val) = __rdtscll(); } while (0) > > -#elif defined __arm__ > +#else Did actually anyone look at the code? I somehow missed the queuelat thingy completely. Now that I look I think I need further assistance… So what I select as frequency for the !x86 case? And why. That freq. script reports here: |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 I guess I pick one… Could someone please figure out the actual difference of clock_gettime() vs rdtsc() so we know how important it is. Based on its current implementation, if memmove() takes >1sec then it ends up undetected because only the ns of the timestamp are considered for. > static inline unsigned long long __clock_gettime(void) > { > -- > 2.24.0 > Sebastian ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-12 17:46 ` Sebastian Andrzej Siewior @ 2019-12-13 12:41 ` Daniel Wagner 2019-12-13 14:54 ` John Kacur 1 sibling, 0 replies; 11+ messages in thread From: Daniel Wagner @ 2019-12-13 12:41 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Uwe Kleine-König Cc: Clark Williams, John Kacur, linux-rt-users Hi, On 2019-12-12 18:46, Sebastian Andrzej Siewior wrote: > So what I select as frequency for the !x86 case? And why. IMO, an user should be able to run rt-tests without the need to provide special configuration or tuning. queuelat is a bit hard to use at this point. > That freq. script reports here: > |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 > > I guess I pick one… > > Could someone please figure out the actual difference of clock_gettime() > vs rdtsc() so we know how important it is. I didn't really understood what the test is doing. The initial clock_gettime() patch was just to shoutup the compiler. Thanks, Daniel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-12 17:46 ` Sebastian Andrzej Siewior 2019-12-13 12:41 ` Daniel Wagner @ 2019-12-13 14:54 ` John Kacur 2019-12-13 23:02 ` Marcelo Tosatti 1 sibling, 1 reply; 11+ messages in thread From: John Kacur @ 2019-12-13 14:54 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Uwe Kleine-König, Clark Williams, Linux RT Users, Marcelo Tosatti [-- Attachment #1: Type: text/plain, Size: 1631 bytes --] On Thu, 12 Dec 2019, Sebastian Andrzej Siewior wrote: > On 2019-12-08 22:06:25 [+0100], Uwe Kleine-König wrote: > > This fixes a build error on arm64, mips*, ppc and several others > > --- > > src/queuelat/queuelat.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/queuelat/queuelat.c b/src/queuelat/queuelat.c > > index cccb50ef0cc4..98346f346f82 100644 > > --- a/src/queuelat/queuelat.c > > +++ b/src/queuelat/queuelat.c > > @@ -283,7 +283,7 @@ static inline unsigned long long __rdtscll(void) > > > > #define gettick(val) do { (val) = __rdtscll(); } while (0) > > > > -#elif defined __arm__ > > +#else > > Did actually anyone look at the code? I somehow missed the queuelat > thingy completely. Now that I look I think I need further assistance… > > So what I select as frequency for the !x86 case? And why. > > That freq. script reports here: > |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 > > I guess I pick one… > > Could someone please figure out the actual difference of clock_gettime() > vs rdtsc() so we know how important it is. Based on its current > implementation, if memmove() takes >1sec then it ends up undetected > because only the ns of the timestamp are considered for. > > > static inline unsigned long long __clock_gettime(void) > > { > > -- > > 2.24.0 > > > > Sebastian > Adding Marcelo to the cc list ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-13 14:54 ` John Kacur @ 2019-12-13 23:02 ` Marcelo Tosatti 2019-12-16 15:34 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 11+ messages in thread From: Marcelo Tosatti @ 2019-12-13 23:02 UTC (permalink / raw) To: John Kacur Cc: Sebastian Andrzej Siewior, Uwe Kleine-König, Clark Williams, Linux RT Users On Fri, Dec 13, 2019 at 03:54:48PM +0100, John Kacur wrote: > > > On Thu, 12 Dec 2019, Sebastian Andrzej Siewior wrote: > > > On 2019-12-08 22:06:25 [+0100], Uwe Kleine-König wrote: > > > This fixes a build error on arm64, mips*, ppc and several others > > > --- > > > src/queuelat/queuelat.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/src/queuelat/queuelat.c b/src/queuelat/queuelat.c > > > index cccb50ef0cc4..98346f346f82 100644 > > > --- a/src/queuelat/queuelat.c > > > +++ b/src/queuelat/queuelat.c > > > @@ -283,7 +283,7 @@ static inline unsigned long long __rdtscll(void) > > > > > > #define gettick(val) do { (val) = __rdtscll(); } while (0) > > > > > > -#elif defined __arm__ > > > +#else > > > > Did actually anyone look at the code? I somehow missed the queuelat > > thingy completely. Now that I look I think I need further assistance… > > > > So what I select as frequency for the !x86 case? And why. > > > > That freq. script reports here: > > |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 > > > > I guess I pick one… > > > > Could someone please figure out the actual difference of clock_gettime() > > vs rdtsc() so we know how important it is. Based on its current > > implementation, if memmove() takes >1sec then it ends up undetected > > because only the ns of the timestamp are considered for. /* Program parameters: * max_queue_len: maximum latency allowed, in nanoseconds (int). * cycles_per_packet: number of cycles to process one packet (int). * mpps(million-packet-per-sec): million packets per second (float). * tsc_freq_mhz: TSC frequency in MHz, as measured by TSC PIT * calibration * (search for "Detected XXX MHz processor" in dmesg, and use the * integer part). */ So you have to pass it "processor frequency" (you can change the names, its TSC but thats x86 specific). The script grabs the processor frequency (so you have to adjust that the script to ARM). And thats it. Please replace tsc_mhz -> processor_freq. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-13 23:02 ` Marcelo Tosatti @ 2019-12-16 15:34 ` Sebastian Andrzej Siewior 2019-12-16 21:14 ` Marcelo Tosatti 0 siblings, 1 reply; 11+ messages in thread From: Sebastian Andrzej Siewior @ 2019-12-16 15:34 UTC (permalink / raw) To: Marcelo Tosatti Cc: John Kacur, Uwe Kleine-König, Clark Williams, Linux RT Users On 2019-12-13 21:02:56 [-0200], Marcelo Tosatti wrote: > > > That freq. script reports here: > > > |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 > > > > > > I guess I pick one… > > > > > > Could someone please figure out the actual difference of clock_gettime() > > > vs rdtsc() so we know how important it is. Based on its current > > > implementation, if memmove() takes >1sec then it ends up undetected > > > because only the ns of the timestamp are considered for. > > /* Program parameters: > * max_queue_len: maximum latency allowed, in nanoseconds (int). > * cycles_per_packet: number of cycles to process one packet (int). > * mpps(million-packet-per-sec): million packets per second (float). > * tsc_freq_mhz: TSC frequency in MHz, as measured by TSC PIT > * calibration > * (search for "Detected XXX MHz processor" in dmesg, and use the > * integer part). > */ > > So you have to pass it "processor frequency" (you can change the names, > its TSC but thats x86 specific). The script grabs the processor > frequency (so you have to adjust that the script to ARM). > > And thats it. Please replace tsc_mhz -> processor_freq. So the script reports the freq. from 1555.184 Mhz to 2574.546 Mhz. And I doubt the numbers remain steady on today's x86 even with gov=performance. However, I asked for the frequency to be used on !x86 given the code we have. It was fixed in hurry and you would have to use 1000 so that the math keeps working. Then I asked how much benefit of this complicated TSC calculation vs clock_gettime() has. I tried to use it myself but after 30secs I saw no output, it just ate 100% and it seemed to do nothing. So I gave up. Sebastian ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs 2019-12-16 15:34 ` Sebastian Andrzej Siewior @ 2019-12-16 21:14 ` Marcelo Tosatti 0 siblings, 0 replies; 11+ messages in thread From: Marcelo Tosatti @ 2019-12-16 21:14 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: John Kacur, Uwe Kleine-König, Clark Williams, Linux RT Users On Mon, Dec 16, 2019 at 04:34:49PM +0100, Sebastian Andrzej Siewior wrote: > On 2019-12-13 21:02:56 [-0200], Marcelo Tosatti wrote: > > > > That freq. script reports here: > > > > |1555.184 1566.269 1566.498 1560.055 1593.149 1568.185 1583.807 1599.096 2574.546 2572.408 2573.849 2583.862 2619.402 1825.680 1847.264 1870.318 2552.102 1570.552 1589.650 1595.813 1590.253 1573.834 1589.438 1599.439 1770.963 1786.370 1814.918 1811.936 1828.277 1850.905 1861.976 1792.809 > > > > > > > > I guess I pick one… > > > > > > > > Could someone please figure out the actual difference of clock_gettime() > > > > vs rdtsc() so we know how important it is. Based on its current > > > > implementation, if memmove() takes >1sec then it ends up undetected > > > > because only the ns of the timestamp are considered for. > > > > /* Program parameters: > > * max_queue_len: maximum latency allowed, in nanoseconds (int). > > * cycles_per_packet: number of cycles to process one packet (int). > > * mpps(million-packet-per-sec): million packets per second (float). > > * tsc_freq_mhz: TSC frequency in MHz, as measured by TSC PIT > > * calibration > > * (search for "Detected XXX MHz processor" in dmesg, and use the > > * integer part). > > */ > > > > So you have to pass it "processor frequency" (you can change the names, > > its TSC but thats x86 specific). The script grabs the processor > > frequency (so you have to adjust that the script to ARM). > > > > And thats it. Please replace tsc_mhz -> processor_freq. > > So the script reports the freq. from 1555.184 Mhz to 2574.546 Mhz. > And I doubt the numbers remain steady on today's x86 even with > gov=performance. > However, I asked for the frequency to be used on !x86 given the code we > have. It was fixed in hurry and you would have to use 1000 so that the > math keeps working. > Then I asked how much benefit of this complicated TSC calculation vs > clock_gettime() has. > I tried to use it myself but after 30secs I saw no output, it just ate > 100% and it seemed to do nothing. So I gave up. Then the simulated queue size was not exceeded, accordingly to the parameters you specified: * max_queue_len: maximum latency allowed, in nanoseconds (int). * cycles_per_packet: number of cycles to process one packet (int). * mpps(million-packet-per-sec): million packets per second (float). If you increase mpps or increase cycles_per_packet (or both), or decrease max_queue_len, it should fail. Do you see that? ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-12-17 3:35 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-08 21:06 [PATCH rt-tests] queuelat: use ARM implementation of gettick also for all !x86 archs Uwe Kleine-König 2019-12-09 9:40 ` Daniel Wagner 2019-12-10 11:24 ` John Kacur 2019-12-12 17:31 ` Sebastian Andrzej Siewior 2019-12-10 11:20 ` John Kacur 2019-12-12 17:46 ` Sebastian Andrzej Siewior 2019-12-13 12:41 ` Daniel Wagner 2019-12-13 14:54 ` John Kacur 2019-12-13 23:02 ` Marcelo Tosatti 2019-12-16 15:34 ` Sebastian Andrzej Siewior 2019-12-16 21:14 ` Marcelo Tosatti
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).