* Question to perf spending a large amount of time monitoring a java process
@ 2017-12-01  7:32 zhangmengting
  2017-12-01  8:03 ` Wangnan (F)
  2017-12-05 17:40 ` Andi Kleen
  0 siblings, 2 replies; 4+ messages in thread
From: zhangmengting @ 2017-12-01  7:32 UTC (permalink / raw)
  To: linux-perf-users
  Cc: acme, namhyung, jolsa, huawei.libin, cj.chengjian, zhangmengting

[-- Attachment #1: Type: text/plain, Size: 4196 bytes --]

Hi all,

I found that perf spends a large amount of time attaching to and monitoring
a Java process that uses locks, although the Java process itself runs for
less than one minute.

Attachment 1 (ContextSwitchTest.java) is the Java source code used to
reproduce the problem. The code is compiled and run with the following
commands. The arguments of the process are <number of RUNS>
(how many times the test code is executed) and <lock ITERATES>
(how many times each thread acquires the lock).
With arguments <1, 1000000>, the process runs for less than one minute.

$javac ContextSwitchTest.java

$java ContextSwitchTest
Usage:
java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest  <number of RUNS>  <lock ITERATES>
Example:
java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest 1 1000000

I've tested the problem on both x86 and ARM64 platforms with a 4.14 kernel
and 4.14 perf.
For convenience, I've added timing code to measure the execution time of
perf record.
Attachment 2 is the timing patch
(0001-perf-record-add-execution-time-check-code.patch).

The test results are shown below:
1) On x86 platform
a. The execution time of this java process
$java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest 1 1000000
RUNS : 1, ITERATES : 1000000
Name : Thread-0, 21
Name : Thread-1, 22
parks: 979010
parks: 978929
Average time: 28642ns
Total time: 56081313428ns = 56s
b. The execution time of perf monitoring this process with several event 
groups
$perf record -N -B -T -g -e 
'{cycles,r008,r01b,r10c,r009},{cycles,r100,r102,r107,r108},\
{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108}'\
java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest 1 1000000
record_open_RUN is 0.235386s
RUNS : 1, ITERATES : 1000000
Name : Thread-0, 21
Name : Thread-1, 22
parks: 997895
parks: 998116
Average time: 72197ns
Total time: 144107437593ns = 144s
pollfd_RUN is 169.4294951967s
[ perf record: Woken up 148 times to write data ]
[ perf record: Captured and wrote 0.060 MB perf.data ]
Record_RUN is 170.4294783665s

2) On ARM64 platform
a. The execution time of this java process
$java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest 1 1000000
RUNS : 1, ITERATES : 1000000
Name : Thread-0, 24
Name : Thread-1, 25
parks: 977285
parks: 977279
Average time: 4708ns
Total time: 9203640720ns = 9s
b. The execution time of perf monitoring this process with several event 
groups
$perf record -N -B -T -g -e'{cycles,r008,r01b,r10c,r009,r010,r012},\
{cycles,r100,r102,r107,r108,r076,r078},{cycles,r001,r002,r014,r179,r177},\
{cycles,r121,r122,r123,r124,r125,r126},{cycles,r040,r042,r050,r052,r060,r061},\
{cycles,r003,r004,r005,r016,r017},{cycles,r070,r071,r073,r074,r075,r077},\
{cycles,r112,r113,r12c,r111,r120},{cycles,r06c,r06d,r06e,r07c,r07d,r07e},\
{cycles,r150,r151,r152,r16a,r079,r07a}' \
java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
ContextSwitchTest 1 1000000
record_open_RUN is 0.40954s
RUNS : 1, ITERATES : 1000000
Name : Thread-0, 24
Name : Thread-1, 25
parks: 1008468
parks: 1003826
Average time: 1154505ns
Total time: 2323208237220ns = 2323s
pollfd_RUN is 2326.645806s
[ perf record: Woken up 18463 times to write data ]
[ perf record: Captured and wrote 6263.982 MB perf.data ]
Record_RUN is 2328.4294867157s

The test results show that most of perf's run time is spent in the main
loop of __cmd_record() that polls fds.
In addition, it seems that when tracing a large number of events, perf
greatly extends the execution time of the traced process, especially on
the ARM64 platform.
A process that normally runs for about 9 seconds took about 2323 seconds
(roughly 39 minutes) under perf, which is somewhat insane.

I am confused about how perf affects the traced process, and whether the
final perf.data is still accurate given that perf has perturbed the traced
process.
Is there something wrong with perf?

Thanks,
Mengting Zhang
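
To put the numbers above side by side, here is a minimal sketch in C that
computes the slowdown factor from the "Total time" values printed by the
test program. The function name slowdown_factor is only illustrative and
is not part of perf or of the attached test.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ratio of the wall-clock time measured under perf record to the
 * baseline wall-clock time, both in nanoseconds as printed by the
 * "Total time" line of ContextSwitchTest.
 */
double slowdown_factor(uint64_t baseline_ns, uint64_t profiled_ns)
{
	assert(baseline_ns != 0);
	return (double)profiled_ns / (double)baseline_ns;
}
```

With the values reported above, slowdown_factor(56081313428, 144107437593)
is about 2.6 for the x86 run and slowdown_factor(9203640720, 2323208237220)
is about 252 for the ARM64 run.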


[-- Attachment #2: ContextSwitchTest.java --]
[-- Type: text/java, Size: 2634 bytes --]

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;    

public final class ContextSwitchTest {
    static int RUNS = 1;
    static int ITERATES = 1000;
    static AtomicReference turn = new AtomicReference();

    static final class WorkerThread extends Thread {
        volatile Thread other;
        volatile int nparks;
        public void run() {
            final AtomicReference t = turn;
            final Thread other = this.other;
            
            Thread current = Thread.currentThread();
            System.out.println("Name : " + current.getName() +", " + current.getId( ));

            if (turn == null || other == null)
                throw new NullPointerException();
            int p = 0;
            for (int i = 0; i < ITERATES; ++i) {
                while (!t.compareAndSet(other, this)) {
                    LockSupport.park();
                    ++p;
                }
                LockSupport.unpark(other);
            }
            LockSupport.unpark(other);
            nparks = p;
            System.out.println("parks: " + p);

        }
    }

    static void test() throws Exception {
        WorkerThread a = new WorkerThread();
        WorkerThread b = new WorkerThread();
        a.other = b;
        b.other = a;
        turn.set(a);
        long startTime = System.nanoTime();
        a.start();
        b.start();
        a.join();
        b.join();
        long endTime = System.nanoTime();
        int parkNum = a.nparks + b.nparks;
        System.out.println("Average time: " + ((endTime - startTime) / parkNum)
                           + "ns");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("Usage: \n" +
                               "java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints  ContextSwitchTest  <number of RUNS>  <lock ITERATES>");
            System.out.println("Example: \n" +
                               "java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints  ContextSwitchTest 1 1000000");
            System.exit(0);
        }
        if (args.length == 2) {
            RUNS = Integer.parseInt(args[0]);
            ITERATES = Integer.parseInt(args[1]);
        }

        System.out.println("RUNS : " + RUNS + ", ITERATES : " + ITERATES);
        long startTime = System.nanoTime();
        for (int i = 0; i < RUNS; i++) {
            test();
        }
        long endTime = System.nanoTime();
        System.out.println("Total time: " + ((endTime - startTime)) + "ns = " + (endTime - startTime) / 1000000000 + "s");
    }
}

[-- Attachment #3: 0001-perf-record-add-execution-time-check-code.patch --]
[-- Type: text/plain, Size: 3279 bytes --]

From f21d8b2f7329785da27548e61152d7cd542d9ee1 Mon Sep 17 00:00:00 2001
From: Mengting Zhang <zhangmengting@huawei.com>
Date: Fri, 1 Dec 2017 13:43:57 +0800
Subject: [PATCH] perf record: add execution time check code

"record_open_RUN" means the time of record__open();
"Record_RUN" means the time of cmd_record();
"pollfd_RUN" means the time of main part of __cmd_record()
polling fds;

Test it:
$perf record sleep 1
$record_open_RUN is 1.4294047351s
pollfd_RUN is 1.8617s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data (28 samples) ]
Record_RUN is 2.4294628690s

Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
---
 tools/perf/builtin-record.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56f8142..f0f0dab 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,6 +50,7 @@
 #include <signal.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
+#include <sys/time.h>
 #include <asm/bug.h>
 #include <linux/time64.h>
 
@@ -432,6 +433,8 @@ static int record__open(struct record *rec)
 	struct record_opts *opts = &rec->opts;
 	struct perf_evsel_config_term *err_term;
 	int rc = 0;
+	struct timeval start, end;
+	gettimeofday(&start, NULL);
 
 	perf_evlist__config(evlist, opts, &callchain_param);
 
@@ -475,6 +478,10 @@ static int record__open(struct record *rec)
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
+	gettimeofday(&end, NULL);
+	printf("record_open_RUN is %u.%us\n",
+		(unsigned int)(end.tv_sec - start.tv_sec),
+		(unsigned int)(end.tv_usec - start.tv_usec));
 	return rc;
 }
 
@@ -881,6 +888,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	struct perf_session *session;
 	bool disabled = false, draining = false;
 	int fd;
+	struct timeval start, end;
 
 	rec->progname = argv[0];
 
@@ -1051,6 +1059,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	trigger_ready(&auxtrace_snapshot_trigger);
 	trigger_ready(&switch_output_trigger);
 	perf_hooks__invoke_record_start();
+	gettimeofday(&start, NULL);
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
@@ -1148,6 +1157,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			disabled = true;
 		}
 	}
+	gettimeofday(&end, NULL);
+	printf("pollfd_RUN is %u.%us\n",
+		(unsigned int)(end.tv_sec - start.tv_sec),
+		(unsigned int)(end.tv_usec - start.tv_usec));
+
 	trigger_off(&auxtrace_snapshot_trigger);
 	trigger_off(&switch_output_trigger);
 
@@ -1688,6 +1702,8 @@ int cmd_record(int argc, const char **argv)
 	int err;
 	struct record *rec = &record;
 	char errbuf[BUFSIZ];
+	struct timeval start, end;
+	gettimeofday(&start, NULL);	
 
 #ifndef HAVE_LIBBPF_SUPPORT
 # define set_nobuild(s, l, c) set_option_nobuild(record_options, s, l, "NO_LIBBPF=1", c)
@@ -1884,6 +1900,12 @@ int cmd_record(int argc, const char **argv)
 	perf_evlist__delete(rec->evlist);
 	symbol__exit();
 	auxtrace_record__free(rec->itr);
+
+	gettimeofday(&end, NULL);
+	printf("Record_RUN is %u.%us\n",
+		(unsigned int)(end.tv_sec - start.tv_sec),
+		(unsigned int)(end.tv_usec - start.tv_usec));
+
 	return err;
 }
 
-- 
1.7.12.4
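
A note on reading the timing output: the patch prints the tv_sec and
tv_usec differences separately with "%u.%u". When end.tv_usec is smaller
than start.tv_usec the difference is negative and wraps when cast to
unsigned int, which would explain fields such as "169.4294951967s"
(4294951967 = 2^32 - 15329). The fraction is also not zero-padded, so
"1.8617s" most likely means 1.008617s. The following is only a sketch of
one way to compute and print the elapsed time, assuming a 32-bit unsigned
int; the helper names are hypothetical and not part of perf.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() readings. */
long long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	assert(start != NULL && end != NULL);
	/* Combine both fields before printing, so a negative usec part borrows from sec. */
	return ((long long)end->tv_sec - (long long)start->tv_sec) * 1000000LL +
	       ((long long)end->tv_usec - (long long)start->tv_usec);
}

/* Format a non-negative microsecond count as "<sec>.<6-digit usec>s". */
int format_elapsed(char *buf, size_t len, long long usec)
{
	assert(usec >= 0);
	return snprintf(buf, len, "%lld.%06llds", usec / 1000000, usec % 1000000);
}

/*
 * Recover the elapsed microseconds from a "%u.%u" pair printed by the
 * patch. A second field above 999999 can only come from a negative
 * difference wrapped to 32 bits (assumption: unsigned int is 32 bits).
 */
long long decode_printed_pair(unsigned int sec, unsigned int usec_field)
{
	long long usec = usec_field;

	if (usec_field > 999999u)
		usec -= 4294967296LL;
	return (long long)sec * 1000000LL + usec;
}
```

Under that assumption, decode_printed_pair(169, 4294951967u) gives
168984671 microseconds, so the x86 "pollfd_RUN is 169.4294951967s" line
would correspond to roughly 168.98 seconds.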



* Re: Question to perf spending a large amount of time monitoring a java process
  2017-12-01  7:32 Question to perf spending a large amount of time monitoring a java process zhangmengting
@ 2017-12-01  8:03 ` Wangnan (F)
  2017-12-05 17:40 ` Andi Kleen
  1 sibling, 0 replies; 4+ messages in thread
From: Wangnan (F) @ 2017-12-01  8:03 UTC (permalink / raw)
  To: zhangmengting, linux-perf-users
  Cc: acme, namhyung, jolsa, huawei.libin, cj.chengjian, Andi Kleen

I have spoken with Mengting by phone. My suggestions are listed below.

I think what she wants is to generate data for a top-down analysis on
ARM64. I've added Andi Kleen to the Cc list in case he has any suggestions.

On 2017/12/1 15:32, zhangmengting wrote:
> Hi all,
>
> I found that perf spends a large amount of time attaching to and monitoring
> a Java process that uses locks, although the Java process itself runs for
> less than one minute.
>
> Attachment 1 (ContextSwitchTest.java) is the Java source code used to
> reproduce the problem. The code is compiled and run with the following
> commands. The arguments of the process are <number of RUNS>
> (how many times the test code is executed) and <lock ITERATES>
> (how many times each thread acquires the lock).
> With arguments <1, 1000000>, the process runs for less than one minute.
>
> $javac ContextSwitchTest.java
>
> $java ContextSwitchTest
> Usage:
> java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest  <number of RUNS>  <lock ITERATES>
> Example:
> java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest 1 1000000
>
> I've tested the problem on both x86 and ARM64 platforms with a 4.14 kernel
> and 4.14 perf.
> For convenience, I've added timing code to measure the execution time of
> perf record.
> Attachment 2 is the timing patch
> (0001-perf-record-add-execution-time-check-code.patch).
>
> The test results are shown below:
> 1) On x86 platform
> a. The execution time of this java process
> $java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest 1 1000000
> RUNS : 1, ITERATES : 1000000
> Name : Thread-0, 21
> Name : Thread-1, 22
> parks: 979010
> parks: 978929
> Average time: 28642ns
> Total time: 56081313428ns = 56s
> b. The execution time of perf monitoring this process with several 
> event groups
> $perf record -N -B -T -g -e 
> '{cycles,r008,r01b,r10c,r009},{cycles,r100,r102,r107,r108},\
> {cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
> {cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
> {cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},\
> {cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108},{cycles,r100,r102,r107,r108}'\

There are too many event groups. They add heavy overhead to every context
switch, which is exactly what your benchmark exercises. Using '-a' may help
reduce the event group switching overhead.

> java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest 1 1000000
> record_open_RUN is 0.235386s
> RUNS : 1, ITERATES : 1000000
> Name : Thread-0, 21
> Name : Thread-1, 22
> parks: 997895
> parks: 998116
> Average time: 72197ns
> Total time: 144107437593ns = 144s
> pollfd_RUN is 169.4294951967s
> [ perf record: Woken up 148 times to write data ]
> [ perf record: Captured and wrote 0.060 MB perf.data ]
> Record_RUN is 170.4294783665s
>
> 2) On ARM64 platform
> a. The execution time of this java process
> $java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest 1 1000000
> RUNS : 1, ITERATES : 1000000
> Name : Thread-0, 24
> Name : Thread-1, 25
> parks: 977285
> parks: 977279
> Average time: 4708ns
> Total time: 9203640720ns = 9s
> b. The execution time of perf monitoring this process with several 
> event groups
> $perf record -N -B -T -g -e'{cycles,r008,r01b,r10c,r009,r010,r012},\
> {cycles,r100,r102,r107,r108,r076,r078},{cycles,r001,r002,r014,r179,r177},\
> {cycles,r121,r122,r123,r124,r125,r126},{cycles,r040,r042,r050,r052,r060,r061},\
> {cycles,r003,r004,r005,r016,r017},{cycles,r070,r071,r073,r074,r075,r077},\
> {cycles,r112,r113,r12c,r111,r120},{cycles,r06c,r06d,r06e,r07c,r07d,r07e},\
> {cycles,r150,r151,r152,r16a,r079,r07a}' \
> java -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints 
> ContextSwitchTest 1 1000000
> record_open_RUN is 0.40954s
> RUNS : 1, ITERATES : 1000000
> Name : Thread-0, 24
> Name : Thread-1, 25
> parks: 1008468
> parks: 1003826
> Average time: 1154505ns
> Total time: 2323208237220ns = 2323s
> pollfd_RUN is 2326.645806s
> [ perf record: Woken up 18463 times to write data ]
> [ perf record: Captured and wrote 6263.982 MB perf.data ]

The event frequency seems too high. Please try -F or -c to control the data
rate. When using '-a', you need '--exclude-perf' to filter out the
events generated by perf itself.

To reduce I/O overhead, you can increase the ring buffer size using '-m' or
switch to overwrite mode.
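
As a rough way to quantify the data rate in question, the figures perf
record printed for the ARM64 run can be turned into an average output
rate and an average amount of data per wakeup. This is only a
back-of-the-envelope sketch; the function names are illustrative, and the
duration is taken from the pollfd_RUN line as reported.

```c
#include <assert.h>

/* Average rate at which perf.data grew, in MB per second. */
double output_rate_mb_per_s(double perf_data_mb, double record_seconds)
{
	assert(record_seconds > 0.0);
	return perf_data_mb / record_seconds;
}

/* Average amount of data written per wakeup of perf record, in MB. */
double mb_per_wakeup(double perf_data_mb, unsigned long wakeups)
{
	assert(wakeups > 0);
	return perf_data_mb / (double)wakeups;
}
```

For the ARM64 run quoted above (6263.982 MB, 2326.645806 s, 18463
wakeups) this gives about 2.7 MB/s and about 0.34 MB per wakeup.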

> Record_RUN is 2328.4294867157s
>
> The test results show that most of perf's run time is spent in the main
> loop of __cmd_record() that polls fds.
> In addition, it seems that when tracing a large number of events, perf
> greatly extends the execution time of the traced process, especially on
> the ARM64 platform.
> A process that normally runs for about 9 seconds took about 2323 seconds
> (roughly 39 minutes) under perf, which is somewhat insane.
>
> I am confused about how perf affects the traced process, and whether the
> final perf.data is still accurate given that perf has perturbed the traced
> process.
> Is there something wrong with perf?
>
> Thanks,
> Mengting Zhang
>


* Re: Question to perf spending a large amount of time monitoring a java process
  2017-12-01  7:32 Question to perf spending a large amount of time monitoring a java process zhangmengting
  2017-12-01  8:03 ` Wangnan (F)
@ 2017-12-05 17:40 ` Andi Kleen
  2017-12-05 17:52   ` Peter Zijlstra
  1 sibling, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2017-12-05 17:40 UTC (permalink / raw)
  To: zhangmengting
  Cc: linux-perf-users, acme, namhyung, jolsa, huawei.libin,
	cj.chengjian, peterz, alexey.budankov

zhangmengting <zhangmengting@huawei.com> writes:

> Hi all,
>
> I found that perf spends a large amount of time attaching to and monitoring
> a Java process that uses locks, although the Java process itself runs for
> less than one minute.

You could check if this patchkit helps

https://lkml.org/lkml/2017/9/8/118

Not sure why it's not moved forward. Peter?

-Andi



* Re: Question to perf spending a large amount of time monitoring a java process
  2017-12-05 17:40 ` Andi Kleen
@ 2017-12-05 17:52   ` Peter Zijlstra
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2017-12-05 17:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: zhangmengting, linux-perf-users, acme, namhyung, jolsa,
	huawei.libin, cj.chengjian, alexey.budankov

On Tue, Dec 05, 2017 at 09:40:01AM -0800, Andi Kleen wrote:
> zhangmengting <zhangmengting@huawei.com> writes:
> 
> > Hi all,
> >
> > I found that perf spends a large amount of time attaching to and monitoring
> > a Java process that uses locks, although the Java process itself runs for
> > less than one minute.
> 
> You could check if this patchkit helps
> 
> https://lkml.org/lkml/2017/9/8/118
> 
> Not sure why it's not moved forward. Peter?

Because it's absolutely broken and I haven't had the time to work on
it much. There's a pile of patches here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core

that fix the worst of the fallout.


