lttng-dev.lists.lttng.org archive mirror
* [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
@ 2020-11-27  6:39 熊毓华 via lttng-dev
  2020-11-27 14:05 ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 1 reply; 7+ messages in thread
From: 熊毓华 via lttng-dev @ 2020-11-27  6:39 UTC (permalink / raw)
  To: lttng-dev


Hi, dear.

I have been using LTTng to monitor my servers recently, and I noticed something interesting: the CPU usage of LTTng varies with the number of CPU cores on the server.

On each server, I create a tracing session in live mode with "lttng create my-session --live". Then I start babeltrace2 and connect it to the relay daemon using "--input-format=lttng-live".

I used 5 cloud servers: 1-core/4G, 2-core/8G, 4-core/16G, and two 8-core/16G. The same test script was run on each of them to provide the same workload.

LTTng runs 5 processes:

1. lttng-runas    --daemonize
2. lttng-runas      -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing
3. lttng-sessiond --daemonize
4. lttng-relayd -L tcp://localhost:5344
5. lttng-consumerd  -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing

The CPU usage of the first four processes is below 2% on all 5 servers, but the lttng-consumerd process behaves differently. On the 1-core, 2-core, and 4-core servers, its CPU usage is below 2%. On the two 8-core machines, it reaches 10% or more.

The CPU usage of the babeltrace process does not differ much between servers; only the CPU usage of lttng-consumerd varies with the number of CPU cores.

Why is this, and how should I analyze it?

Looking forward to your reply.

Thanks,
yuhua

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-27  6:39 [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process 熊毓华 via lttng-dev
@ 2020-11-27 14:05 ` Jonathan Rajotte-Julien via lttng-dev
  2020-11-27 15:32   ` 熊毓华 via lttng-dev
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Rajotte-Julien via lttng-dev @ 2020-11-27 14:05 UTC (permalink / raw)
  To: 熊毓华; +Cc: lttng-dev

Hi,

On Fri, Nov 27, 2020 at 02:39:28PM +0800, 熊毓华 via lttng-dev wrote:
> Hi, dear.
>
> I have been using LTTng to monitor my servers recently, and I noticed
> something interesting: the CPU usage of LTTng varies with the number of
> CPU cores on the server.

That is somewhat expected: more CPUs mean more data sources from LTTng's
point of view, hence more work overall.

> On each server, I create a tracing session in live mode with
> "lttng create my-session --live".
>
> Then I start babeltrace2 and connect it to the relay daemon using
> "--input-format=lttng-live".
>
> I used 5 cloud servers: 1-core/4G, 2-core/8G, 4-core/16G, and two 8-core/16G.
>
> The same test script was run on each of them to provide the same workload.

We would need the test script to get some context on the workload.

> LTTng runs 5 processes:
>
> 1. lttng-runas    --daemonize
>
> 2. lttng-runas      -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing

Based on this, you are performing kernel tracing.

> 3. lttng-sessiond --daemonize
>
> 4. lttng-relayd -L tcp://localhost:5344
>
> 5. lttng-consumerd  -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing
>
> The CPU usage of the first four processes is below 2% on all 5 servers,
> but the lttng-consumerd process behaves differently.
>
> On the 1-core, 2-core, and 4-core servers, its CPU usage is below 2%.

How is the CPU usage measured here?

> On the two 8-core machines, it reaches 10% or more.

Consumerd is responsible for fetching data from the ring buffers and saving
it either locally (trace on disk) or remotely (streaming/live session). Its CPU
usage should be roughly correlated with the event production rate. Did you look
at the number of events generated over a similar interval?

> The CPU usage of the babeltrace process does not differ much between
> servers; only the CPU usage of lttng-consumerd varies with the number of
> CPU cores.
>
> Why is this, and how should I analyze it?
>
> Looking forward to your reply.
>
> Thanks,
> yuhua


-- 
Jonathan Rajotte-Julien
EfficiOS

* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-27 14:05 ` Jonathan Rajotte-Julien via lttng-dev
@ 2020-11-27 15:32   ` 熊毓华 via lttng-dev
  2020-11-27 16:04     ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 1 reply; 7+ messages in thread
From: 熊毓华 via lttng-dev @ 2020-11-27 15:32 UTC (permalink / raw)
  To: Jonathan Rajotte-Julien, lttng-dev


Hi, Dear.

The test script generates some common file I/O and network I/O events.

On all servers, I use the same tracing configuration when I start LTTng: all file I/O and network I/O events plus some related system calls.
The following table records the number of events generated by the test script per minute, where one babeltrace record represents one event.

The unit is ten thousand events per minute, and the numbers were counted after parsing with babeltrace.
server1 is 1-core/4G, server2 is 2-core/8G, server3 is 4-core/16G, and server4 and server5 are 8-core/16G.

It can be seen that the average amount of data generated per minute is roughly the same on all servers. However, the CPU usage of the lttng-consumerd process differs on server4 and server5, as I mentioned in my last email.

The CPU usage was recorded with the "top" command.

My tests show that, for the same number of events collected, the lttng-consumerd process consumes more CPU on the 8-core servers.

I would like to know why this is. What other information do you need?

Looking forward to your reply.
Thanks,
yuhua.



* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-27 15:32   ` 熊毓华 via lttng-dev
@ 2020-11-27 16:04     ` Jonathan Rajotte-Julien via lttng-dev
  2020-11-27 17:11       ` Mathieu Desnoyers via lttng-dev
  2020-11-28  6:49       ` 熊毓华 via lttng-dev
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Rajotte-Julien via lttng-dev @ 2020-11-27 16:04 UTC (permalink / raw)
  To: 熊毓华; +Cc: lttng-dev


> From: "熊毓华" <xiongyuhua@zju.edu.cn>
> To: "Jonathan Rajotte-Julien" <jonathan.rajotte-julien@efficios.com>,
> "lttng-dev" <lttng-dev@lists.lttng.org>
> Sent: Friday, November 27, 2020 10:32:07 AM
> Subject: Re: Re: [lttng-dev] Some confusion about cpu usage of the
> lttng-consumerd process

> Hi,Dear.

Side note, you can remove the "Dear" here. ;)

> The test script generates some common file I/O and network I/O events.

Please provide a complete code repository if possible, so that we at least have a baseline for reproduction.

> On all servers, I use the same tracing configuration when I start LTTng:
> all file I/O and network I/O events plus some related system calls.
> The following table records the number of events generated by the test script
> per minute, where one babeltrace record represents one event.

For some reason the image does not load here. Please provide a text-based alternative for this figure.

> The unit is ten thousand events per minute, and the numbers were counted
> after parsing with babeltrace.
> server1 is 1-core/4G, server2 is 2-core/8G, server3 is 4-core/16G, and
> server4 and server5 are 8-core/16G.

> It can be seen that the average amount of data generated per minute is
> roughly the same on all servers. However, the CPU usage of the
> lttng-consumerd process differs on server4 and server5, as I mentioned in
> my last email.

> The CPU usage was recorded with the "top" command.

> My tests show that, for the same number of events collected, the
> lttng-consumerd process consumes more CPU on the 8-core servers.

> I would like to know why this is. What other information do you need?

Well, we also want to know why! Although we develop LTTng, we do not always have a quick and easy answer to every problem, and performance-related problems are always tricky.
Keep in mind, too, that we do not necessarily optimize for low CPU usage on the lttng-consumerd side.

We have to look at what work scales with the number of CPUs on the lttng-consumerd side. One such thing is the live timer, which fires at a fixed interval (1 s, i.e. 1000000 us, by default).

You could test this hypothesis by streaming the trace instead of using the live feature:

lttng create --set-url ....

Cheers

> Looking forward to your reply.
> thanks,
> yuhua.


* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-27 16:04     ` Jonathan Rajotte-Julien via lttng-dev
@ 2020-11-27 17:11       ` Mathieu Desnoyers via lttng-dev
  2020-11-28  6:49       ` 熊毓华 via lttng-dev
  1 sibling, 0 replies; 7+ messages in thread
From: Mathieu Desnoyers via lttng-dev @ 2020-11-27 17:11 UTC (permalink / raw)
  To: Jonathan Rajotte-Julien; +Cc: 熊毓华, lttng-dev

----- On Nov 27, 2020, at 11:04 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

> Well, we also want to know why! Although we develop LTTng, we do not always
> have a quick and easy answer to every problem, and performance-related
> problems are always tricky.
> Keep in mind, too, that we do not necessarily optimize for low CPU usage on
> the lttng-consumerd side.

That being said, we did optimize lttng-consumerd for low CPU usage in use cases
that stream to disk or to the network. The "live" mode, however, was originally
created for use cases where only a few events per second would be emitted, and
no such performance requirements were placed on it. Its use has since grown
well beyond a few events per second, and for those use cases live mode may not
be the appropriate tool for the job. We have introduced the "session rotation"
feature as a more efficient alternative to live mode.

> We have to look at what work scales with the number of CPUs on the
> lttng-consumerd side. One such thing is the live timer, which fires at a
> fixed interval (1 s, i.e. 1000000 us, by default).

> You could test this hypothesis by streaming the trace instead of using the live
> feature:

> lttng create --set-url ....

Yes, I agree with Jonathan's recommendation: you should compare this cpu usage with
that of the streaming mode of lttng by *not* using the "--live" option when creating
the trace session. It will at least help identify whether consumerd also exhibits this
cpu usage increase with number of cores in streaming mode, or if it is an expected
additional overhead of periodically flushing more cpu buffers (because there are more
cores) caused by the live timer.
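
The comparison could be scripted roughly as follows. This is only a sketch: the session names, the net://localhost URL, and the choice of tracing all system calls are assumptions made for illustration, not the original test setup. By default the script only prints the commands; set LTTNG=lttng in the environment to actually run them.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Run with LTTNG=lttng in the environment to execute for real.
LTTNG=${LTTNG:-echo lttng}

# start_session MODE NAME
# Creates a session in live or streaming mode, enables kernel syscall
# events (an assumed stand-in for the real event set), and starts tracing.
start_session() {
    mode=$1
    name=$2
    case "$mode" in
        live)   $LTTNG create "$name" --live ;;
        stream) $LTTNG create "$name" --set-url=net://localhost ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    $LTTNG enable-event --kernel --syscall --all
    $LTTNG start
}

# stop_session NAME: stop tracing and tear the session down.
stop_session() {
    $LTTNG stop "$1"
    $LTTNG destroy "$1"
}

# Run the same workload and measure lttng-consumerd CPU usage (e.g. with top)
# between start and stop, once per mode.
start_session live cmp-live
stop_session cmp-live
start_session stream cmp-stream
stop_session cmp-stream
```

If consumerd CPU usage only grows with the core count in the "live" run, that points at the live timer rather than at the event rate.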

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-27 16:04     ` Jonathan Rajotte-Julien via lttng-dev
  2020-11-27 17:11       ` Mathieu Desnoyers via lttng-dev
@ 2020-11-28  6:49       ` 熊毓华 via lttng-dev
  2020-11-30 14:24         ` Mathieu Desnoyers via lttng-dev
  1 sibling, 1 reply; 7+ messages in thread
From: 熊毓华 via lttng-dev @ 2020-11-28  6:49 UTC (permalink / raw)
  To: Jonathan Rajotte-Julien, mathieu.desnoyers, lttng-dev


Hi,

I have put my test scripts in the attachment.

You can run the script directly. If you create the trace session with the "--live" option on an 8-core server, you will see the CPU usage of the lttng-consumerd process reach 10% or more.

I had already tested LTTng's streaming mode, and it worked well.

When I create the trace session with "lttng create my-session --output=/tmp/my-kernel-trace" or with "lttng create my-session --set-url=net://ip", the number of CPUs does not seem to affect the CPU usage of lttng-consumerd.

It seems that only live mode is affected.

Thanks,

yuhua



[-- Attachment #2: test script.rar --]
[-- Type: application/octet-stream, Size: 1120 bytes --]


* Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
  2020-11-28  6:49       ` 熊毓华 via lttng-dev
@ 2020-11-30 14:24         ` Mathieu Desnoyers via lttng-dev
  0 siblings, 0 replies; 7+ messages in thread
From: Mathieu Desnoyers via lttng-dev @ 2020-11-30 14:24 UTC (permalink / raw)
  To: 熊毓华; +Cc: lttng-dev

----- On Nov 28, 2020, at 1:49 AM, 熊毓华 xiongyuhua@zju.edu.cn wrote:

> Hi,

> I have put my test scripts in the attachment.

> You can run the script directly. If you create the trace session with the
> "--live" option on an 8-core server, you will see the CPU usage of the
> lttng-consumerd process reach 10% or more.

> I had already tested LTTng's streaming mode, and it worked well.

> When I create the trace session with "lttng create my-session
> --output=/tmp/my-kernel-trace" or with "lttng create my-session
> --set-url=net://ip", the number of CPUs does not seem to affect the CPU
> usage of lttng-consumerd.

> It seems that only live mode is affected.

The overhead of the consumer daemon in live mode increases with the number
of CPUs on the system. This is expected.

There is one knob you can use to adjust the overhead caused by the live
timer in the consumer daemon: lttng create --live=NNN, where NNN is the
live timer period in microseconds. The default is 1000000 us (1 second).
Try a larger value and you will probably notice that it lessens the
consumer daemon overhead.

Shipping trace data with relatively low latency so it can be immediately
read adds overhead, and it increases with the number of cpus in the system
because there are then more per-cpu buffers to flush.
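
As a back-of-the-envelope illustration of why this scales with the CPU count, one can count how many per-CPU buffer visits the live timer triggers per minute. The sketch below assumes one ring buffer per CPU per channel and one visit per buffer per timer period. That is a simplification, not a description of the actual consumer daemon implementation, and the function name is made up for the example.

```shell
#!/bin/sh
# live_timer_visits_per_min NCPUS NCHANNELS PERIOD_US
# Simplified model (assumption): every timer period, each per-CPU buffer of
# each channel is visited once.
live_timer_visits_per_min() {
    echo $(( $1 * $2 * 60000000 / $3 ))
}

live_timer_visits_per_min 1 1 1000000   # 1 CPU, default 1 s live timer
live_timer_visits_per_min 8 1 1000000   # 8 CPUs, default 1 s live timer
live_timer_visits_per_min 8 1 5000000   # 8 CPUs, 5 s timer (lttng create --live=5000000)
```

Under these assumptions the three calls print 60, 480, and 96: eight cores mean eight times the timer-driven work of one core at the same event rate, and a longer period brings it back down.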

Note that your instrumentation also targets unrelated kernel and user-space
execution, so whatever runs in the background gets traced as well, and this
generates trace data traffic on each CPU. Having even just a bit of trace
data to send out in live mode can very much explain the overhead you observe
with the LTTng consumer daemon.

If you care that much about overhead of the consumer daemon, don't use live
mode, and use the alternatives we discussed earlier instead, such as the
session rotation mode, if you need to consume the trace data while it is
produced.
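
A minimal sketch of the session rotation alternative, assuming a non-live session written to a local directory and a 10-second rotation period (the session name, path, and period are illustrative, and the enable-rotation command requires an LTTng version that ships session rotation). The commands are printed rather than executed unless LTTNG=lttng is set in the environment.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
LTTNG=${LTTNG:-echo lttng}

# start_rotating_session NAME OUTPUT_DIR PERIOD_US
start_rotating_session() {
    $LTTNG create "$1" --output="$2"
    # Assumption: all system calls stand in for the real event set.
    $LTTNG enable-event --kernel --syscall --all
    # Ask LTTng to archive the current trace chunk every PERIOD_US microseconds.
    $LTTNG enable-rotation --session="$1" --timer="$3"
    $LTTNG start
}

start_rotating_session my-session /tmp/my-kernel-trace 10000000
```

Each archived chunk can then be read while tracing continues, which gives near-live consumption without the live timer.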

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 7+ messages in thread
