* [BUG] Stack overflow when running perf and function tracer
@ 2020-10-30  4:27 Steven Rostedt
  2020-10-30  9:00 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2020-10-30  4:27 UTC (permalink / raw)
  To: LKML; +Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, kan.liang, like.xu

I found a bug in the recursion protection that prevented function
tracing from running in NMI context. Applying this fix to 5.9 worked
fine (tested by running perf record and function tracing at the same
time). But when I applied the patch to 5.10-rc1, it blew up with a
stack overflow:

perf: interrupt took too long (14611 > 13550), lowering kernel.perf_event_max_sample_rate to 13000
perf: interrupt took too long (27965 > 18263), lowering kernel.perf_event_max_sample_rate to 7000
 INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
 INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.087 msecs
 INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.082 msecs
 perf: interrupt took too long (27965 > 18263), lowering kernel.perf_event_max_sample_rate to 7000
 traps: PANIC: double fault, error_code: 0x0
 double fault: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 1655 Comm: perf Not tainted 5.9.0-rc1-test+ #685
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 RIP: 0010:__rb_reserve_next+0xb/0x450
 Code: 4f 10 f0 ff 41 08 0f 0b 83 f8 02 0f 84 97 fb ff ff e9 a5 fc ff ff b8 04 00 00 00 eb e1 66 90 41 57 41 56 41 55 41 54 49 89 f4 <55> 48 89 fd 53 48 83 ec 08 48 8b 9f 98 00 00 00 48 89 5e 28 4c 8b
 RSP: 0018:fffffe000003c000 EFLAGS: 00010046
 RAX: 000000000000001c RBX: ffff928ada27b400 RCX: 0000000000000000
 RDX: ffff928ada07b200 RSI: fffffe000003c028 RDI: ffff928ada27b400
 RBP: ffff928ada27b4f0 R08: 0000000000000001 R09: 0000000000000000
 R10: fffffe000003c440 R11: ffff928a7383cc60 R12: fffffe000003c028
 R13: 00000000000003e8 R14: 0000000000000046 R15: 0000000000110001
 FS:  00007f25d43cf780(0000) GS:ffff928adaa40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: fffffe000003bff8 CR3: 00000000b52a8005 CR4: 00000000001707e0
 Call Trace:
  <NMI>
  ring_buffer_lock_reserve+0x12c/0x380
  ? perf_output_begin+0x4d/0x2d0
  ? rcu_panic+0x20/0x20
  trace_function+0x27/0x130
  ? rcu_panic+0x20/0x20
  ? perf_output_begin+0x4d/0x2d0
  function_trace_call+0x133/0x180
  0xffffffffc02a106a
  ? __rcu_read_lock+0x5/0x20
  __rcu_read_lock+0x5/0x20
  perf_output_begin+0x4d/0x2d0
  ? sched_clock+0x5/0x10
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? sched_clock+0x5/0x10
  ? ring_buffer_lock_reserve+0x12c/0x380
  ? sched_clock+0x5/0x10
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? ring_buffer_lock_reserve+0x12c/0x380
  ? perf_log_throttle+0x9a/0x120
  ? perf_log_throttle+0x9a/0x120
  ? perf_output_begin_backward+0x280/0x280
  ? trace_function+0x88/0x130
  ? ring_buffer_unlock_commit+0x25/0x140
  ? perf_output_begin_backward+0x280/0x280
  ? perf_log_throttle+0x9a/0x120
  ? function_trace_call+0x133/0x180
  ? 0xffffffffc02a106a
  ? perf_log_throttle+0x9a/0x120
  ? perf_output_begin+0x5/0x2d0
  ? __perf_event_header__init_id.isra.0+0x39/0xd0
  ? perf_output_begin+0x5/0x2d0
  perf_log_throttle+0x9a/0x120
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? sched_clock+0x5/0x10
  ? ring_buffer_lock_reserve+0x12c/0x380
  ? sched_clock+0x5/0x10
  ? sched_clock+0x5/0x10
  ? sched_clock+0x5/0x10
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? sched_clock+0x5/0x10
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? setup_pebs_fixed_sample_data+0x365/0x3a0
  ? setup_pebs_fixed_sample_data+0x365/0x3a0
  ? intel_pmu_pebs_event_update_no_drain+0x70/0x70
  ? trace_function+0x88/0x130
  ? ring_buffer_unlock_commit+0x25/0x140
  ? intel_pmu_pebs_event_update_no_drain+0x70/0x70
  ? setup_pebs_fixed_sample_data+0x365/0x3a0
  ? function_trace_call+0x133/0x180
  ? intel_pmu_pebs_fixup_ip+0x32/0x1b0
  ? function_trace_call+0x133/0x180
  ? native_write_msr+0x6/0x20
  __perf_event_account_interrupt+0xa9/0x120
  __perf_event_overflow+0x2b/0xf0
  __intel_pmu_pebs_event+0x2ec/0x3e0
  ? trace_function+0x88/0x130
  ? ring_buffer_unlock_commit+0x25/0x140
  ? setup_pebs_adaptive_sample_data+0x3e0/0x3e0
  ? 0xffffffffc02a106a
  ? 0xffffffffc02a106a
  ? generic_exec_single+0xa6/0xe0
  ? perf_duration_warn+0x20/0x20
  ? generic_exec_single+0xa6/0xe0
  ? native_write_msr+0x6/0x20
  ? ring_buffer_lock_reserve+0x1b9/0x380
  ? ring_buffer_lock_reserve+0x12c/0x380
  ? intel_pmu_drain_pebs_nhm+0x268/0x330
  ? intel_pmu_drain_pebs_nhm+0x268/0x330
  ? release_bts_buffer+0x60/0x60
  ? trace_function+0x88/0x130
  ? ring_buffer_unlock_commit+0x25/0x140
  ? release_bts_buffer+0x60/0x60
  ? intel_pmu_drain_pebs_nhm+0x268/0x330
  ? function_trace_call+0x133/0x180
  ? 0xffffffffc02a106a
  ? ring_buffer_lock_reserve+0x12c/0x380
  intel_pmu_drain_pebs_nhm+0x268/0x330
  ? setup_pebs_adaptive_sample_data+0x3e0/0x3e0
  handle_pmi_common+0xc2/0x2b0
  ? __rb_reserve_next+0x5c/0x450
  ? sched_clock+0x5/0x10
  ? trace_clock_local+0xc/0x20
  ? __rb_reserve_next+0x5c/0x450
  ? ring_buffer_lock_reserve+0x12c/0x380
  ? intel_pmu_handle_irq+0xc8/0x160
  ? intel_pmu_ha
 Lost 135 message(s)!
 ---[ end trace ba215b7ba269800a ]---
 RIP: 0010:__rb_reserve_next+0xb/0x450
 Code: 4f 10 f0 ff 41 08 0f 0b 83 f8 02 0f 84 97 fb ff ff e9 a5 fc ff ff b8 04 00 00 00 eb e1 66 90 41 57 41 56 41 55 41 54 49 89 f4 <55> 48 89 fd 53 48 83 ec 08 48 8b 9f 98 00 00 00 48 89 5e 28 4c 8b
 RSP: 0018:fffffe000003c000 EFLAGS: 00010046
 RAX: 000000000000001c RBX: ffff928ada27b400 RCX: 0000000000000000
 RDX: ffff928ada07b200 RSI: fffffe000003c028 RDI: ffff928ada27b400
 RBP: ffff928ada27b4f0 R08: 0000000000000001 R09: 0000000000000000
 R10: fffffe000003c440 R11: ffff928a7383cc60 R12: fffffe000003c028
 R13: 00000000000003e8 R14: 0000000000000046 R15: 0000000000110001
 FS:  00007f25d43cf780(0000) GS:ffff928adaa40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: fffffe000003bff8 CR3: 00000000b52a8005 CR4: 00000000001707e0
 Kernel panic - not syncing: Fatal exception in interrupt
 Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)


I bisected it down to:

35d1ce6bec133679ff16325d335217f108b84871 ("perf/x86/intel/ds: Fix
x86_pmu_stop warning for large PEBS")

Which looks to be storing an awful lot on the stack:

static void __intel_pmu_pebs_event(struct perf_event *event,
				   struct pt_regs *iregs,
				   void *base, void *top,
				   int bit, int count,
				   void (*setup_sample)(struct perf_event *,
						struct pt_regs *,
						void *,
						struct perf_sample_data *,
						struct pt_regs *))
{
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
	struct hw_perf_event *hwc = &event->hw;
	struct perf_sample_data data;
	struct x86_perf_regs perf_regs;
	struct pt_regs *regs = &perf_regs.regs;
	void *at = get_next_pebs_record_by_bit(base, top, bit);
	struct pt_regs dummy_iregs;

Reverting this patch from 5.10-rc1 makes the crash go away.

-- Steve

PS. Here's the patch to fix NMI function tracing:

From c1c8a30be7c138a5c3519c12c8ef3b80288d5514 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Date: Thu, 29 Oct 2020 17:31:45 -0400
Subject: [PATCH] ftrace: Fix recursion check for NMI test

The recursion-check code is designed to do the check only once when checks
are nested. The top-level check does the work; the nested checks see that
recursion was already checked and return zero as their "bit". On the
return side, nothing is done if the "bit" is zero.

The problem is that zero is also the valid "good" bit for NMI context.
Returning zero there means the NMI bit gets set by the check but is never
cleared, making it look like *all* subsequent NMI tracing is recursing,
and preventing tracing of anything in NMI context!

The simple fix is to return "bit + 1" and subtract that bit on the end to
get the real bit.

Cc: stable@vger.kernel.org
Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index f3f5e77123ad..fee535a89560 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -698,7 +698,7 @@ static __always_inline int trace_test_and_set_recursion(int start, int max)
 	current->trace_recursion = val;
 	barrier();
 
-	return bit;
+	return bit + 1;
 }
 
 static __always_inline void trace_clear_recursion(int bit)
@@ -708,6 +708,7 @@ static __always_inline void trace_clear_recursion(int bit)
 	if (!bit)
 		return;
 
+	bit--;
 	bit = 1 << bit;
 	val &= ~bit;
 
-- 
2.25.4
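
For readers following along, here is a toy userspace model of the ambiguity
the patch removes. This is a sketch, not the kernel code; it assumes (as in
the kernel of this era) that tracing from NMI context maps to context bit 0
when "start" is 0:

#include <stdio.h>

static unsigned int recursion;	/* stands in for current->trace_recursion */

/* simplified trace_test_and_set_recursion() */
static int test_and_set(int bit)
{
	if (recursion & (1U << bit))
		return -1;		/* genuine recursion, bail out */
	recursion |= 1U << bit;
	return bit + 1;			/* the fix: 0 now always means "nested check" */
}

/* simplified trace_clear_recursion() */
static void clear(int bit)
{
	if (!bit)
		return;			/* nested check: the outer level owns the bit */
	bit--;				/* undo the +1 to recover the real bit */
	recursion &= ~(1U << bit);
}

int main(void)
{
	int bit = test_and_set(0);	/* tracing in NMI context: bit 0 */

	/*
	 * Before the fix, test_and_set() returned 0 here, so clear()
	 * bailed out and bit 0 stayed set forever -- every later trace
	 * from NMI context looked recursive and was dropped.
	 */
	clear(bit);
	printf("recursion mask after clear: %#x\n", recursion);	/* prints 0 */
	return 0;
}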



* Re: [BUG] Stack overflow when running perf and function tracer
  2020-10-30  4:27 [BUG] Stack overflow when running perf and function tracer Steven Rostedt
@ 2020-10-30  9:00 ` Peter Zijlstra
  2020-10-30 10:26   ` Thomas Gleixner
  2020-10-30 11:19   ` [PATCH] perf/x86: Make dummy_iregs static Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-30  9:00 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30, 2020 at 12:27:22AM -0400, Steven Rostedt wrote:
> I found a bug in the recursion protection that prevented function
> tracing from running in NMI context. Applying this fix to 5.9 worked
> fine (tested by running perf record and function tracing at the same
> time). But when I applied the patch to 5.10-rc1, it blew up with a
> stack overflow:

So we just blew away our NMI stack, right?

> perf: interrupt took too long (14611 > 13550), lowering kernel.perf_event_max_sample_rate to 13000
> perf: interrupt took too long (27965 > 18263), lowering kernel.perf_event_max_sample_rate to 7000
>  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
>  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.087 msecs
>  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.082 msecs
>  perf: interrupt took too long (27965 > 18263), lowering kernel.perf_event_max_sample_rate to 7000
>  traps: PANIC: double fault, error_code: 0x0
>  double fault: 0000 [#1] PREEMPT SMP PTI
>  CPU: 1 PID: 1655 Comm: perf Not tainted 5.9.0-rc1-test+ #685
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
>  RIP: 0010:__rb_reserve_next+0xb/0x450
>  Code: 4f 10 f0 ff 41 08 0f 0b 83 f8 02 0f 84 97 fb ff ff e9 a5 fc ff ff b8 04 00 00 00 eb e1 66 90 41 57 41 56 41 55 41 54 49 89 f4 <55> 48 89 fd 53 48 83 ec 08 48 8b 9f 98 00 00 00 48 89 5e 28 4c 8b
>  RSP: 0018:fffffe000003c000 EFLAGS: 00010046
>  RAX: 000000000000001c RBX: ffff928ada27b400 RCX: 0000000000000000
>  RDX: ffff928ada07b200 RSI: fffffe000003c028 RDI: ffff928ada27b400
>  RBP: ffff928ada27b4f0 R08: 0000000000000001 R09: 0000000000000000
>  R10: fffffe000003c440 R11: ffff928a7383cc60 R12: fffffe000003c028
>  R13: 00000000000003e8 R14: 0000000000000046 R15: 0000000000110001
>  FS:  00007f25d43cf780(0000) GS:ffff928adaa40000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: fffffe000003bff8 CR3: 00000000b52a8005 CR4: 00000000001707e0
>  Call Trace:
>   <NMI>
>   ring_buffer_lock_reserve+0x12c/0x380
>   ? perf_output_begin+0x4d/0x2d0
>   ? rcu_panic+0x20/0x20
>   trace_function+0x27/0x130
>   ? rcu_panic+0x20/0x20
>   ? perf_output_begin+0x4d/0x2d0
>   function_trace_call+0x133/0x180
>   0xffffffffc02a106a
>   ? __rcu_read_lock+0x5/0x20
>   __rcu_read_lock+0x5/0x20
>   perf_output_begin+0x4d/0x2d0
>   ? sched_clock+0x5/0x10
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? sched_clock+0x5/0x10
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   ? sched_clock+0x5/0x10
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   ? perf_log_throttle+0x9a/0x120
>   ? perf_log_throttle+0x9a/0x120
>   ? perf_output_begin_backward+0x280/0x280
>   ? trace_function+0x88/0x130
>   ? ring_buffer_unlock_commit+0x25/0x140
>   ? perf_output_begin_backward+0x280/0x280
>   ? perf_log_throttle+0x9a/0x120
>   ? function_trace_call+0x133/0x180
>   ? 0xffffffffc02a106a
>   ? perf_log_throttle+0x9a/0x120
>   ? perf_output_begin+0x5/0x2d0
>   ? __perf_event_header__init_id.isra.0+0x39/0xd0
>   ? perf_output_begin+0x5/0x2d0
>   perf_log_throttle+0x9a/0x120
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? sched_clock+0x5/0x10
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   ? sched_clock+0x5/0x10
>   ? sched_clock+0x5/0x10
>   ? sched_clock+0x5/0x10
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? sched_clock+0x5/0x10
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? setup_pebs_fixed_sample_data+0x365/0x3a0
>   ? setup_pebs_fixed_sample_data+0x365/0x3a0
>   ? intel_pmu_pebs_event_update_no_drain+0x70/0x70
>   ? trace_function+0x88/0x130
>   ? ring_buffer_unlock_commit+0x25/0x140
>   ? intel_pmu_pebs_event_update_no_drain+0x70/0x70
>   ? setup_pebs_fixed_sample_data+0x365/0x3a0
>   ? function_trace_call+0x133/0x180
>   ? intel_pmu_pebs_fixup_ip+0x32/0x1b0
>   ? function_trace_call+0x133/0x180
>   ? native_write_msr+0x6/0x20
>   __perf_event_account_interrupt+0xa9/0x120
>   __perf_event_overflow+0x2b/0xf0
>   __intel_pmu_pebs_event+0x2ec/0x3e0
>   ? trace_function+0x88/0x130
>   ? ring_buffer_unlock_commit+0x25/0x140
>   ? setup_pebs_adaptive_sample_data+0x3e0/0x3e0
>   ? 0xffffffffc02a106a
>   ? 0xffffffffc02a106a
>   ? generic_exec_single+0xa6/0xe0
>   ? perf_duration_warn+0x20/0x20
>   ? generic_exec_single+0xa6/0xe0
>   ? native_write_msr+0x6/0x20
>   ? ring_buffer_lock_reserve+0x1b9/0x380
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   ? intel_pmu_drain_pebs_nhm+0x268/0x330
>   ? intel_pmu_drain_pebs_nhm+0x268/0x330
>   ? release_bts_buffer+0x60/0x60
>   ? trace_function+0x88/0x130
>   ? ring_buffer_unlock_commit+0x25/0x140
>   ? release_bts_buffer+0x60/0x60
>   ? intel_pmu_drain_pebs_nhm+0x268/0x330
>   ? function_trace_call+0x133/0x180
>   ? 0xffffffffc02a106a
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   intel_pmu_drain_pebs_nhm+0x268/0x330
>   ? setup_pebs_adaptive_sample_data+0x3e0/0x3e0
>   handle_pmi_common+0xc2/0x2b0
>   ? __rb_reserve_next+0x5c/0x450
>   ? sched_clock+0x5/0x10
>   ? trace_clock_local+0xc/0x20
>   ? __rb_reserve_next+0x5c/0x450
>   ? ring_buffer_lock_reserve+0x12c/0x380
>   ? intel_pmu_handle_irq+0xc8/0x160
>   ? intel_pmu_ha
>  Lost 135 message(s)!

What kind of shit console is that?

Anyway, I can't seem to see any actual recursion happening, possibly due
to losing those 135 entries, which seems to suggest we genuinely blew
away the NMI stack.

>  ---[ end trace ba215b7ba269800a ]---
>  RIP: 0010:__rb_reserve_next+0xb/0x450
>  Code: 4f 10 f0 ff 41 08 0f 0b 83 f8 02 0f 84 97 fb ff ff e9 a5 fc ff ff b8 04 00 00 00 eb e1 66 90 41 57 41 56 41 55 41 54 49 89 f4 <55> 48 89 fd 53 48 83 ec 08 48 8b 9f 98 00 00 00 48 89 5e 28 4c 8b
>  RSP: 0018:fffffe000003c000 EFLAGS: 00010046
>  RAX: 000000000000001c RBX: ffff928ada27b400 RCX: 0000000000000000
>  RDX: ffff928ada07b200 RSI: fffffe000003c028 RDI: ffff928ada27b400
>  RBP: ffff928ada27b4f0 R08: 0000000000000001 R09: 0000000000000000
>  R10: fffffe000003c440 R11: ffff928a7383cc60 R12: fffffe000003c028
>  R13: 00000000000003e8 R14: 0000000000000046 R15: 0000000000110001
>  FS:  00007f25d43cf780(0000) GS:ffff928adaa40000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: fffffe000003bff8 CR3: 00000000b52a8005 CR4: 00000000001707e0
>  Kernel panic - not syncing: Fatal exception in interrupt
>  Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> 
> I bisected it down to:
> 
> 35d1ce6bec133679ff16325d335217f108b84871 ("perf/x86/intel/ds: Fix
> x86_pmu_stop warning for large PEBS")
> 
> Which looks to be storing an awful lot on the stack:
> 
> static void __intel_pmu_pebs_event(struct perf_event *event,
> 				   struct pt_regs *iregs,
> 				   void *base, void *top,
> 				   int bit, int count,
> 				   void (*setup_sample)(struct perf_event *,
> 						struct pt_regs *,
> 						void *,
> 						struct perf_sample_data *,
> 						struct pt_regs *))
> {
> 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> 	struct hw_perf_event *hwc = &event->hw;
> 	struct perf_sample_data data;
> 	struct x86_perf_regs perf_regs;
> 	struct pt_regs *regs = &perf_regs.regs;
> 	void *at = get_next_pebs_record_by_bit(base, top, bit);
> 	struct pt_regs dummy_iregs;

The only thing I can come up with in a hurry is that that dummy_iregs
thing really should be static. That's 168 bytes of stack out the window
right there.

Still, this seems to suggest (barring some actual issue hiding in those
135 lost lines) we're very close to the limit on the NMI stack, which is
a single 4k page IIRC.


* Re: [BUG] Stack overflow when running perf and function tracer
  2020-10-30  9:00 ` Peter Zijlstra
@ 2020-10-30 10:26   ` Thomas Gleixner
  2020-10-30 10:32     ` Peter Zijlstra
  2020-10-30 10:59     ` [PATCH] perf/arch: Remove perf_sample_data::regs_user_copy Peter Zijlstra
  2020-10-30 11:19   ` [PATCH] perf/x86: Make dummy_iregs static Peter Zijlstra
  1 sibling, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2020-10-30 10:26 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt; +Cc: LKML, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30 2020 at 10:00, Peter Zijlstra wrote:
> On Fri, Oct 30, 2020 at 12:27:22AM -0400, Steven Rostedt wrote:
>> I found a bug in the recursion protection that prevented function
>> tracing from running in NMI context. Applying this fix to 5.9 worked
>> fine (tested by running perf record and function tracing at the same
>> time). But when I applied the patch to 5.10-rc1, it blew up with a
>> stack overflow:
>
> So we just blew away our NMI stack, right?

Looks like that:

>>  RSP: 0018:fffffe000003c000 EFLAGS: 00010046

Clearly a page boundary.

>>  RAX: 000000000000001c RBX: ffff928ada27b400 RCX: 0000000000000000
>>  RDX: ffff928ada07b200 RSI: fffffe000003c028 RDI: ffff928ada27b400
>>  RBP: ffff928ada27b4f0 R08: 0000000000000001 R09: 0000000000000000
>>  R10: fffffe000003c440 R11: ffff928a7383cc60 R12: fffffe000003c028
>>  R13: 00000000000003e8 R14: 0000000000000046 R15: 0000000000110001
>>  FS:  00007f25d43cf780(0000) GS:ffff928adaa40000(0000) knlGS:0000000000000000
>>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>  CR2: fffffe000003bff8 CR3: 00000000b52a8005 CR4: 00000000001707e0

and CR2 says it tried below.
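
(Worth spelling out: CR2 = fffffe000003bff8 is exactly RSP - 8, and the
faulting byte in the Code: line, <55>, is "push %rbp" -- the prologue's
next push stepped one word past the bottom of the NMI stack into the
guard page.)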

>> I bisected it down to:
>> 
>> 35d1ce6bec133679ff16325d335217f108b84871 ("perf/x86/intel/ds: Fix
>> x86_pmu_stop warning for large PEBS")
>> 
>> Which looks to be storing an awful lot on the stack:
>> 
>> static void __intel_pmu_pebs_event(struct perf_event *event,
>> 				   struct pt_regs *iregs,
>> 				   void *base, void *top,
>> 				   int bit, int count,
>> 				   void (*setup_sample)(struct perf_event *,
>> 						struct pt_regs *,
>> 						void *,
>> 						struct perf_sample_data *,
>> 						struct pt_regs *))
>> {
>> 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> 	struct hw_perf_event *hwc = &event->hw;
>> 	struct perf_sample_data data;
>> 	struct x86_perf_regs perf_regs;
>> 	struct pt_regs *regs = &perf_regs.regs;
>> 	void *at = get_next_pebs_record_by_bit(base, top, bit);
>> 	struct pt_regs dummy_iregs;
>
> The only thing I can come up with in a hurry is that that dummy_iregs
> thing really should be static. That's 168 bytes of stack out the window
> right there.

What's worse is perf_sample_data which is 384 bytes and is 64 bytes aligned.

> Still, this seems to suggest (barring some actual issue hiding in those
> 135 lost lines) we're very close to the limit on the NMI stack, which is
> a single 4k page IIRC.

Yes, unless KASAN is enabled
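
For reference, a from-memory sketch of the sizing in
arch/x86/include/asm/page_64_types.h around this era (exact lines may
differ):

#ifdef CONFIG_KASAN
#define KASAN_STACK_ORDER 1
#else
#define KASAN_STACK_ORDER 0
#endif

#define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER)
#define EXCEPTION_STKSZ (PAGE_SIZE << EXCEPTION_STACK_ORDER)

i.e. the NMI stack is one 4k page, or two with KASAN.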

Thanks,

        tglx






* Re: [BUG] Stack overflow when running perf and function tracer
  2020-10-30 10:26   ` Thomas Gleixner
@ 2020-10-30 10:32     ` Peter Zijlstra
  2020-10-30 11:36       ` Thomas Gleixner
  2020-10-30 10:59     ` [PATCH] perf/arch: Remove perf_sample_data::regs_user_copy Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-30 10:32 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Steven Rostedt, LKML, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30, 2020 at 11:26:01AM +0100, Thomas Gleixner wrote:

> > The only thing I can come up with in a hurry is that that dummy_iregs
> > thing really should be static. That's 168 bytes of stack out the window
> > right there.
> 
> What's worse is perf_sample_data which is 384 bytes and is 64 bytes aligned.

Urgh, that thing just keeps on growing :/

I'll try and have a look if we can both shrink the thing and move it off
stack.


* [PATCH] perf/arch: Remove perf_sample_data::regs_user_copy
  2020-10-30 10:26   ` Thomas Gleixner
  2020-10-30 10:32     ` Peter Zijlstra
@ 2020-10-30 10:59     ` Peter Zijlstra
  2020-10-30 12:33       ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-30 10:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, LKML, Ingo Molnar, kan.liang, like.xu, Andy Lutomirski

On Fri, Oct 30, 2020 at 11:26:01AM +0100, Thomas Gleixner wrote:
> What's worse is perf_sample_data which is 384 bytes and is 64 bytes aligned.

Here; this shrinks it to 192 bytes. Combined with the static dummy this
should reduce stack usage by 360 bytes.
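
(That is: 384 - 192 = 192 bytes off perf_sample_data, plus the 168-byte
pt_regs made static, for 360 bytes total.)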

---
Subject: perf/arch: Remove perf_sample_data::regs_user_copy

struct perf_sample_data lives on-stack; we should be careful about its
size. Furthermore, the pt_regs copy in there is only because x86_64 is a
trainwreck; solve it differently.

Halves sizeof(perf_sample_data).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm/kernel/perf_regs.c   |  3 +--
 arch/arm64/kernel/perf_regs.c |  3 +--
 arch/csky/kernel/perf_regs.c  |  3 +--
 arch/powerpc/perf/perf_regs.c |  3 +--
 arch/riscv/kernel/perf_regs.c |  3 +--
 arch/s390/kernel/perf_regs.c  |  3 +--
 arch/x86/kernel/perf_regs.c   | 15 +++++++++++----
 include/linux/perf_event.h    |  6 ------
 include/linux/perf_regs.h     |  6 ++----
 kernel/events/core.c          |  8 +++-----
 10 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/arch/arm/kernel/perf_regs.c b/arch/arm/kernel/perf_regs.c
index 05fe92aa7d98..0529f90395c9 100644
--- a/arch/arm/kernel/perf_regs.c
+++ b/arch/arm/kernel/perf_regs.c
@@ -32,8 +32,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index 94e8718e7229..f6f58e6265df 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -73,8 +73,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
diff --git a/arch/csky/kernel/perf_regs.c b/arch/csky/kernel/perf_regs.c
index eb32838b8210..09b7f88a2d6a 100644
--- a/arch/csky/kernel/perf_regs.c
+++ b/arch/csky/kernel/perf_regs.c
@@ -32,8 +32,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index 8e53f2fc3fe0..6f681b105eec 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -144,8 +144,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = (regs_user->regs) ? perf_reg_abi(current) :
diff --git a/arch/riscv/kernel/perf_regs.c b/arch/riscv/kernel/perf_regs.c
index 04a38fbeb9c7..fd304a248de6 100644
--- a/arch/riscv/kernel/perf_regs.c
+++ b/arch/riscv/kernel/perf_regs.c
@@ -36,8 +36,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
diff --git a/arch/s390/kernel/perf_regs.c b/arch/s390/kernel/perf_regs.c
index 4352a504f235..6e9e5d5e927e 100644
--- a/arch/s390/kernel/perf_regs.c
+++ b/arch/s390/kernel/perf_regs.c
@@ -53,8 +53,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	/*
 	 * Use the regs from the first interruption and let
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index bb7e1132290b..750129bf539c 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -101,8 +101,7 @@ u64 perf_reg_abi(struct task_struct *task)
 }
 
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
@@ -129,12 +128,20 @@ u64 perf_reg_abi(struct task_struct *task)
 		return PERF_SAMPLE_REGS_ABI_64;
 }
 
+static DEFINE_PER_CPU(struct pt_regs, nmi_user_regs);
+
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy)
+			struct pt_regs *regs)
 {
+	struct pt_regs *regs_user_copy = this_cpu_ptr(&nmi_user_regs);
 	struct pt_regs *user_regs = task_pt_regs(current);
 
+	if (!in_nmi()) {
+		regs_user->regs = task_pt_regs(current);
+		regs_user->abi = perf_reg_abi(current);
+		return;
+	}
+
 	/*
 	 * If we're in an NMI that interrupted task_pt_regs setup, then
 	 * we can't sample user regs at all.  This check isn't really
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0c19d279b97f..e58f4e6b6e1c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1022,13 +1022,7 @@ struct perf_sample_data {
 	struct perf_callchain_entry	*callchain;
 	u64				aux_size;
 
-	/*
-	 * regs_user may point to task_pt_regs or to regs_user_copy, depending
-	 * on arch details.
-	 */
 	struct perf_regs		regs_user;
-	struct pt_regs			regs_user_copy;
-
 	struct perf_regs		regs_intr;
 	u64				stack_user_size;
 
diff --git a/include/linux/perf_regs.h b/include/linux/perf_regs.h
index 2d12e97d5e7b..5efcdb768c75 100644
--- a/include/linux/perf_regs.h
+++ b/include/linux/perf_regs.h
@@ -20,8 +20,7 @@ u64 perf_reg_value(struct pt_regs *regs, int idx);
 int perf_reg_validate(u64 mask);
 u64 perf_reg_abi(struct task_struct *task);
 void perf_get_regs_user(struct perf_regs *regs_user,
-			struct pt_regs *regs,
-			struct pt_regs *regs_user_copy);
+			struct pt_regs *regs);
 #else
 
 #define PERF_REG_EXTENDED_MASK	0
@@ -42,8 +41,7 @@ static inline u64 perf_reg_abi(struct task_struct *task)
 }
 
 static inline void perf_get_regs_user(struct perf_regs *regs_user,
-				      struct pt_regs *regs,
-				      struct pt_regs *regs_user_copy)
+				      struct pt_regs *regs)
 {
 	regs_user->regs = task_pt_regs(current);
 	regs_user->abi = perf_reg_abi(current);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index da467e1dd49a..14709182d94e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6374,14 +6374,13 @@ perf_output_sample_regs(struct perf_output_handle *handle,
 }
 
 static void perf_sample_regs_user(struct perf_regs *regs_user,
-				  struct pt_regs *regs,
-				  struct pt_regs *regs_user_copy)
+				  struct pt_regs *regs)
 {
 	if (user_mode(regs)) {
 		regs_user->abi = perf_reg_abi(current);
 		regs_user->regs = regs;
 	} else if (!(current->flags & PF_KTHREAD)) {
-		perf_get_regs_user(regs_user, regs, regs_user_copy);
+		perf_get_regs_user(regs_user, regs);
 	} else {
 		regs_user->abi = PERF_SAMPLE_REGS_ABI_NONE;
 		regs_user->regs = NULL;
@@ -7083,8 +7082,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 	}
 
 	if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
-		perf_sample_regs_user(&data->regs_user, regs,
-				      &data->regs_user_copy);
+		perf_sample_regs_user(&data->regs_user, regs);
 
 	if (sample_type & PERF_SAMPLE_REGS_USER) {
 		/* regs dump ABI info */
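
(A note on the approach, as I read it: a single per-CPU nmi_user_regs slot
suffices because the copy is only consumed from the in_nmi() path, and the
perf NMI handler does not nest within itself on a CPU.)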


* [PATCH] perf/x86: Make dummy_iregs static
  2020-10-30  9:00 ` Peter Zijlstra
  2020-10-30 10:26   ` Thomas Gleixner
@ 2020-10-30 11:19   ` Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-30 11:19 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30, 2020 at 10:00:37AM +0100, Peter Zijlstra wrote:
> > static void __intel_pmu_pebs_event(struct perf_event *event,
> > 				   struct pt_regs *iregs,
> > 				   void *base, void *top,
> > 				   int bit, int count,
> > 				   void (*setup_sample)(struct perf_event *,
> > 						struct pt_regs *,
> > 						void *,
> > 						struct perf_sample_data *,
> > 						struct pt_regs *))
> > {
> > 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > 	struct hw_perf_event *hwc = &event->hw;
> > 	struct perf_sample_data data;
> > 	struct x86_perf_regs perf_regs;
> > 	struct pt_regs *regs = &perf_regs.regs;
> > 	void *at = get_next_pebs_record_by_bit(base, top, bit);
> > 	struct pt_regs dummy_iregs;
> 
> The only thing I can come up with in a hurry is that that dummy_iregs
> thing really should be static. That's 168 bytes of stack out the window
> right there.

---
Subject: perf/x86: Make dummy_iregs static
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri Oct 30 12:15:06 CET 2020

Having pt_regs on-stack is unfortunate, it's 168 bytes. Since it isn't
actually used, make it a static variable. This both gets if off the
stack and ensures it gets 0 initialized, just in case someone does
look at it.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/ds.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1735,7 +1735,7 @@ static void __intel_pmu_pebs_event(struc
 	struct x86_perf_regs perf_regs;
 	struct pt_regs *regs = &perf_regs.regs;
 	void *at = get_next_pebs_record_by_bit(base, top, bit);
-	struct pt_regs dummy_iregs;
+	static struct pt_regs dummy_iregs;
 
 	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
 		/*


* Re: [BUG] Stack overflow when running perf and function tracer
  2020-10-30 10:32     ` Peter Zijlstra
@ 2020-10-30 11:36       ` Thomas Gleixner
  2020-10-30 13:42         ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2020-10-30 11:36 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Steven Rostedt, LKML, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30 2020 at 11:32, Peter Zijlstra wrote:
> On Fri, Oct 30, 2020 at 11:26:01AM +0100, Thomas Gleixner wrote:
>
>> > The only thing I can come up with in a hurry is that that dummy_iregs
>> > thing really should be static. That's 168 bytes of stack out the window
>> > right there.
>> 
>> What's worse is perf_sample_data which is 384 bytes and is 64 bytes
>> aligned.

And there is also x86_perf_regs which is another 176 bytes ....
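
Adding it up with the numbers so far: 168 (pt_regs) + 384 (perf_sample_data,
before alignment padding) + 176 (x86_perf_regs) is already ~730 bytes of
locals in that one frame.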

> Urgh, that thing just keeps on growing :/
>
> I'll try and have a look if we can both shrink the thing and move it off
> stack.

Even then we still need to worry about the stack depth, because the
tracer code was not even done yet at that point. Throw some BPF at it and
it goes south again.

So the real question is what else is on that stack which blows it up
close to 4k? Btw, it would be massively helpful for this kind of crash
to print the actual stack depth per entry in the backtrace.

Here is the partial stack trace:
                                                Stack usage
  ring_buffer_lock_reserve+0x12c/0x380          
  trace_function+0x27/0x130
  function_trace_call+0x133/0x180
  perf_output_begin+0x4d/0x2d0                   64+
  perf_log_throttle+0x9a/0x120                  470+
  __perf_event_account_interrupt+0xa9/0x120
  __perf_event_overflow+0x2b/0xf0               
  __intel_pmu_pebs_event+0x2ec/0x3e0            760+
  intel_pmu_drain_pebs_nhm+0x268/0x330          200+
  handle_pmi_common+0xc2/0x2b0                  

The missing call chain is:

 intel_pmu_handle_irq or intel_pmu_handle_irq_v4
 perf_event_nmi_handler
 nmi
 
So the larger offenders accumulate to ~1.5k of data on the stack, but
where is the rest of the 4k?

It's about 15 calls. So to fill up the stack you'd need about 230 bytes
per call to go up to 4k.

Something is fishy.

Thanks,

        tglx


* Re: [PATCH] perf/arch: Remove perf_sample_data::regs_user_copy
  2020-10-30 10:59     ` [PATCH] perf/arch: Remove perf_sample_data::regs_user_copy Peter Zijlstra
@ 2020-10-30 12:33       ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2020-10-30 12:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, LKML, Ingo Molnar, kan.liang, like.xu, Andy Lutomirski

On Fri, 30 Oct 2020 11:59:17 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Oct 30, 2020 at 11:26:01AM +0100, Thomas Gleixner wrote:
> > What's worse is perf_sample_data which is 384 bytes and is 64 bytes aligned.  
> 
> Here; this shrinks it to 192 bytes. Combined with the static dummy this
> should reduce stack usage by 360 bytes.
> 
> ---
> Subject: perf/arch: Remove perf_sample_data::regs_user_copy
> 
> struct perf_sample_data lives on-stack; we should be careful about its
> size. Furthermore, the pt_regs copy in there is only because x86_64 is a
> trainwreck; solve it differently.
> 
> Halves sizeof(perf_sample_data).
> 
> 


I tested with just this patch and it doesn't crash when running
function tracing and perf simultaneously.

-- Steve


* Re: [BUG] Stack overflow when running perf and function tracer
  2020-10-30 11:36       ` Thomas Gleixner
@ 2020-10-30 13:42         ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2020-10-30 13:42 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Steven Rostedt, LKML, Ingo Molnar, kan.liang, like.xu

On Fri, Oct 30 2020 at 12:36, Thomas Gleixner wrote:
> On Fri, Oct 30 2020 at 11:32, Peter Zijlstra wrote:
> So the real question is what else is on that stack which blows it up
> close to 4k? Btw, it would be massively helpful for this kind of crash
> to print the actual stack depth per entry in the backtrace.
>
> Here is the partial stack trace:
>                                                 Stack usage
>   ring_buffer_lock_reserve+0x12c/0x380          
>   trace_function+0x27/0x130
>   function_trace_call+0x133/0x180
>   perf_output_begin+0x4d/0x2d0                   64+
>   perf_log_throttle+0x9a/0x120                  470+
>   __perf_event_account_interrupt+0xa9/0x120
>   __perf_event_overflow+0x2b/0xf0               
>   __intel_pmu_pebs_event+0x2ec/0x3e0            760+
>   intel_pmu_drain_pebs_nhm+0x268/0x330          200+
>   handle_pmi_common+0xc2/0x2b0                  

So Steven provided a backtrace with the actual stack depth printed:

ring_buffer_lock_reserve+0x12c/0x380		0030  104
trace_function+0x27/0xf0			0098   56
function_trace_call+0x124/0x190			00d0  224 
__rcu_read_lock+0x5/0x20			01b0    8
perf_output_begin+0x4d/0x2d0			01b8  640
perf_log_throttle+0x9a/0x120			0438  624
__perf_event_account_interrupt+0xa6/0x120	06a8   32
__perf_event_overflow+0x2b/0xf0			06c8   48
__intel_pmu_pebs_event+0x2ec/0x3e0		06f8  960
intel_pmu_drain_pebs_nhm+0x268/0x330		0ab8  256
handle_pmi_common+0xc2/0x2b0			0bb8  584
intel_pmu_handle_irq+0xc8/0x160			0e00   64
perf_event_nmi_handler+0x28/0x50		0e40   32
nmi_handle+0x80/0x190				0e60   64
default_do_nmi+0x6b/0x170			0ea0   40
exc_nmi+0x15d/0x1a0				0ec8   40
end_repeat_nmi+0x16/0x55			0ef0  272
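
(Reading the table: the first column is the frame's offset within the 4k
NMI stack page -- lower means deeper, 0x0000 is where it overflowed -- and
the second column is the bytes that frame consumes, i.e. the gap to the
next-outer frame: perf_log_throttle at 0x0438 = 0x01b8 + 640. The outermost
frame ends at 0x0ef0 + 272 = 0x1000: the full 4k page.)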

So I missed perf_output_begin and handle_pmi_common ...

