* mmiotracer hangs the system
@ 2016-08-02 10:08 Andy Shevchenko
  2016-08-02 10:36 ` Andy Shevchenko
  2016-08-02 15:05 ` Andy Shevchenko
  0 siblings, 2 replies; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 10:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney, Steven Rostedt

Hi!

I'm trying to use the mmiotrace tracer with recent kernels (in this
particular case, today's linux-next).

# mount -t debugfs none /sys/kernel/debug/

# echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
[  869.673145] in mmio_trace_init
[  869.714170] mmiotrace: Disabling non-boot CPUs...
[  869.729938] Cannot set affinity for irq 169
[  869.735765] smpboot: CPU 1 is now offline
[  869.746662] mmiotrace: CPU1 is down.
[  869.757896] smpboot: CPU 2 is now offline
[  869.773572] mmiotrace: CPU2 is down.
[  869.781768] smpboot: CPU 3 is now offline
[  869.789495] mmiotrace: CPU3 is down.
[  869.793515] mmiotrace: enabled.

#  echo 1 > /sys/kernel/debug/tracing/tracing_on
[  869.802634] in mmio_trace_start

# echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/unbind
[  883.625744] mmiotrace: Unmapping ffffc90000854000.
[  883.633925] mmiotrace: Unmapping ffffc90000852000.
[  883.644580] mmiotrace: Unmapping ffffc90000850000.

# echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/bind
[  889.525125] mmiotrace: ioremap_*(0x9242e200, 0x100) = ffffc90000856200

[  910.533911] INFO: rcu_sched detected stalls on CPUs/tasks:
[  910.540052]  (detected by 0, t=21002 jiffies, g=348, c=347, q=0)
[  910.546790] All QSes seen, last rcu_sched kthread activity 21002 (4295577777-4295556775), jiffies_till_next_fqs=3, root ->qsmask 0x0
[  910.560142] sh              R  running task    12336  1289      1 0x20020008
[  910.568055]  ffffffff81e422c0 ffff88017fc03e20 ffffffff81085296 ffff88017fc18500
[  910.576366]  ffffffff81e422c0 ffff88017fc03e88 ffffffff810b547d 0000000000000000
[  910.584675]  ffffffff810f67ec 000000000000015c 0000000000000000 000000000000015c
[  910.592980] Call Trace:
[  910.595715]  <IRQ>  [<ffffffff81085296>] sched_show_task+0xb6/0x110
[  910.602748]  [<ffffffff810b547d>] rcu_check_callbacks+0x84d/0x850
[  910.609573]  [<ffffffff810f67ec>] ? __acct_update_integrals+0x2c/0xb0
[  910.616788]  [<ffffffff810c9150>] ? tick_sched_do_timer+0x30/0x30
[  910.623613]  [<ffffffff810ba34a>] update_process_times+0x2a/0x50
[  910.630343]  [<ffffffff810c8bb1>] tick_sched_handle.isra.12+0x31/0x40
[  910.637560]  [<ffffffff810c9188>] tick_sched_timer+0x38/0x70
[  910.643902]  [<ffffffff810bacba>] __hrtimer_run_queues+0xda/0x250
[  910.650734]  [<ffffffff810bb3f3>] hrtimer_interrupt+0xa3/0x190
[  910.657272]  [<ffffffff8103ead3>] local_apic_timer_interrupt+0x33/0x50
[  910.664584]  [<ffffffff8103f588>] smp_apic_timer_interrupt+0x38/0x50
[  910.671705]  [<ffffffff8190dd6f>] apic_timer_interrupt+0x7f/0x90
[  910.678427]  <EOI>  [<ffffffff814a717f>] ? intel_lpss_probe+0x7f/0x5f0
[  910.685739]  [<ffffffff814a716b>] ? intel_lpss_probe+0x6b/0x5f0
[  910.692364]  [<ffffffff8170e5df>] ? raw_pci_write+0x1f/0x40
[  910.698610]  [<ffffffff8136e825>] ? pci_bus_write_config_byte+0x55/0x70
[  910.706022]  [<ffffffff813781b1>] ? pcibios_set_master+0x51/0x80
[  910.712753]  [<ffffffff814a7836>] intel_lpss_pci_probe+0x76/0xb0
[  910.719479]  [<ffffffff813797e0>] local_pci_probe+0x40/0xa0
[  910.725719]  [<ffffffff811fce44>] ? sysfs_do_create_link_sd.isra.2+0x64/0xa0
[  910.733617]  [<ffffffff8137ab46>] pci_device_probe+0xd6/0x120
[  910.740058]  [<ffffffff8148679f>] driver_probe_device+0x21f/0x430
[  910.746883]  [<ffffffff81484c4f>] bind_store+0x10f/0x160
[  910.752836]  [<ffffffff81484150>] drv_attr_store+0x20/0x30
[  910.758983]  [<ffffffff811fc312>] sysfs_kf_write+0x32/0x40
[  910.765129]  [<ffffffff811fb863>] kernfs_fop_write+0x113/0x190
[  910.771663]  [<ffffffff81185343>] __vfs_write+0x23/0x120
[  910.777607]  [<ffffffff812cfd46>] ? security_file_permission+0x36/0xb0
[  910.784918]  [<ffffffff810998dd>] ? percpu_down_read+0xd/0x50
[  910.791351]  [<ffffffff81186403>] vfs_write+0xb3/0x1b0
[  910.797103]  [<ffffffff81187711>] SyS_write+0x41/0xa0
[  910.802758]  [<ffffffff81002b2e>] do_int80_syscall_32+0x4e/0xa0
[  910.809389]  [<ffffffff8190f2aa>] entry_INT80_compat+0x2a/0x40
[  910.815925] rcu_sched kthread starved for 21002 jiffies! g348 c347 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[  910.826351] rcu_sched       R  running task    14896     7      2 0x00000000
[  910.834260]  ffff88017ab7bd88 ffff880179c17080 ffff88017ab34b00 0000000000000000
[  910.842550]  ffff88017ab7c000 ffff88017ab7bdd0 00000000ffffffff 0000000000000000
[  910.850836]  ffff88017fc0ec40 ffff88017ab7bda0 ffffffff81909370 000000010008feaa
[  910.859118] Call Trace:
[  910.861852]  [<ffffffff81909370>] schedule+0x30/0x80
[  910.867408]  [<ffffffff8190c3b9>] schedule_timeout+0x209/0x410
[  910.873938]  [<ffffffff810b8760>] ? init_timer_key+0xa0/0xa0
[  910.880267]  [<ffffffff81097aab>] ? prepare_to_swait+0x5b/0x80
[  910.886793]  [<ffffffff810b3e09>] rcu_gp_kthread+0x479/0x800
[  910.893124]  [<ffffffff810b3990>] ? call_rcu_sched+0x20/0x20
[  910.899458]  [<ffffffff81079f54>] kthread+0xc4/0xe0
[  910.904917]  [<ffffffff8190d3cf>] ret_from_fork+0x1f/0x40
[  910.910961]  [<ffffffff81079e90>] ? kthread_worker_fn+0x160/0x160
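
As a sanity check on the numbers in the stall report above, the two
jiffies values in parentheses can be subtracted by hand. The sketch
below does that in shell. The variable names are mine, and HZ=1000 is
an assumption (the report does not state CONFIG_HZ).

```shell
#!/bin/sh
# Values copied from the "All QSes seen" line of the stall report.
now=4295577777     # jiffies when the stall was detected
last=4295556775    # jiffies at the last rcu_sched kthread activity
HZ=1000            # assumption: CONFIG_HZ=1000, not stated in the report

stall_jiffies=$((now - last))
stall_seconds=$((stall_jiffies / HZ))

echo "stall: $stall_jiffies jiffies, about $stall_seconds s at HZ=$HZ"
```

The difference is the 21002 jiffies printed in the report. Under the
HZ assumption that is about 21 seconds, which is consistent with the
gap between the ioremap line at 889.5 and the stall message at 910.5.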


Is it a bug in the driver or somewhere else?

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-02 10:08 mmiotracer hangs the system Andy Shevchenko
@ 2016-08-02 10:36 ` Andy Shevchenko
  2016-08-02 15:05 ` Andy Shevchenko
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 10:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney, Steven Rostedt

On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> Hi!
>
> I'm trying to use mmio tracer with recent kernels (in this particular
> case today's linux-next).

Additional info: I took v4.4.16 and added the following to the default
x86_64_defconfig:
+CONFIG_MMIOTRACE=y
+CONFIG_SERIAL_8250_DW=y
+CONFIG_MFD_INTEL_LPSS=y
+CONFIG_MFD_INTEL_LPSS_PCI=y

The problem still reproduces.
I can't test earlier kernels because the driver in question
(intel-lpss) was introduced in v4.4.
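
For anyone who wants to rebuild the same configuration, here is one
possible way to apply those four options on top of the defconfig. It
is only a sketch: the function name config_fragment and the file name
mmiotrace-lpss.config are made up for illustration, and the commands
in the comments assume an unmodified kernel source tree.

```shell
#!/bin/sh
# Print the four options that were added on top of x86_64_defconfig.
config_fragment() {
    cat <<'EOF'
CONFIG_MMIOTRACE=y
CONFIG_SERIAL_8250_DW=y
CONFIG_MFD_INTEL_LPSS=y
CONFIG_MFD_INTEL_LPSS_PCI=y
EOF
}

# Inside the kernel tree, one way to apply the fragment (not run here):
#   make x86_64_defconfig
#   config_fragment > mmiotrace-lpss.config   # hypothetical file name
#   ./scripts/kconfig/merge_config.sh -m .config mmiotrace-lpss.config
#   make olddefconfig

config_fragment
```

Running olddefconfig afterwards lets Kconfig resolve any dependencies
of the added options, so the final .config is worth checking.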

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-02 10:08 mmiotracer hangs the system Andy Shevchenko
  2016-08-02 10:36 ` Andy Shevchenko
@ 2016-08-02 15:05 ` Andy Shevchenko
  2016-08-02 15:07   ` Andy Shevchenko
  1 sibling, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 15:05 UTC (permalink / raw)
  To: linux-kernel, Karol Herbst; +Cc: Paul E. McKenney, Steven Rostedt, Ingo Molnar

+Cc: Karol, Ingo

On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> Hi!
>
> I'm trying to use mmio tracer with recent kernels (in this particular
> case today's linux-next).

Tested on another board and found that v4.5 works while v4.5.7 doesn't.
Bisecting points to

commit d62a28a60562a8ba82e67e13c268245f37e796cb
Author: Karol Herbst <nouveau@karolherbst.de>
Date:   Thu Mar 3 02:03:11 2016 +0100

    x86/mm/kmmio: Fix mmiotrace for hugepages

    commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.

Reverting it _helps_ for both x86 and x86_64 builds.

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-02 15:05 ` Andy Shevchenko
@ 2016-08-02 15:07   ` Andy Shevchenko
  2016-08-02 15:31     ` Steven Rostedt
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 15:07 UTC (permalink / raw)
  To: linux-kernel, Karol Herbst; +Cc: Paul E. McKenney, Steven Rostedt, Ingo Molnar

Using another address for Karol (found in MAINTAINERS).

On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> +Cc: Karol, Ingo
>
> On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
>> Hi!
>>
>> I'm trying to use mmio tracer with recent kernels (in this particular
>> case today's linux-next).
>
> Tested on other board and found that v4.5 works while v4.5.7 doesn't.
> Bisecting to
>
> commit d62a28a60562a8ba82e67e13c268245f37e796cb
> Author: Karol Herbst <nouveau@karolherbst.de>
> Date:   Thu Mar 3 02:03:11 2016 +0100
>
>     x86/mm/kmmio: Fix mmiotrace for hugepages
>
>     commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
>
> Reverting _helps_ for x86 and x86_64 builds.

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-02 15:07   ` Andy Shevchenko
@ 2016-08-02 15:31     ` Steven Rostedt
  2016-08-02 16:08       ` Andy Shevchenko
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2016-08-02 15:31 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-kernel, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, 2 Aug 2016 18:07:38 +0300
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> Use another Karol's address (found in MAINTAINERS)
> 
> On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > +Cc: Karol, Ingo
> >
> > On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
> > <andy.shevchenko@gmail.com> wrote:  
> >> Hi!
> >>
> >> I'm trying to use mmio tracer with recent kernels (in this particular
> >> case today's linux-next).  
> >
> > Tested on other board and found that v4.5 works while v4.5.7 doesn't.
> > Bisecting to
> >
> > commit d62a28a60562a8ba82e67e13c268245f37e796cb
> > Author: Karol Herbst <nouveau@karolherbst.de>
> > Date:   Thu Mar 3 02:03:11 2016 +0100
> >
> >     x86/mm/kmmio: Fix mmiotrace for hugepages
> >
> >     commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
> >
> > Reverting _helps_ for x86 and x86_64 builds.
> >  

That commit was added in 4.6. Does that kernel work? Maybe it was a bad
backport?

-- Steve


* Re: mmiotracer hangs the system
  2016-08-02 15:31     ` Steven Rostedt
@ 2016-08-02 16:08       ` Andy Shevchenko
  2016-08-02 16:13         ` Steven Rostedt
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 16:08 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, Aug 2, 2016 at 6:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2 Aug 2016 18:07:38 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> Use another Karol's address (found in MAINTAINERS)
>>
>> On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
>> <andy.shevchenko@gmail.com> wrote:
>> > +Cc: Karol, Ingo
>> >
>> > On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
>> > <andy.shevchenko@gmail.com> wrote:
>> >> Hi!
>> >>
>> >> I'm trying to use mmio tracer with recent kernels (in this particular
>> >> case today's linux-next).
>> >
>> > Tested on other board and found that v4.5 works while v4.5.7 doesn't.
>> > Bisecting to
>> >
>> > commit d62a28a60562a8ba82e67e13c268245f37e796cb
>> > Author: Karol Herbst <nouveau@karolherbst.de>
>> > Date:   Thu Mar 3 02:03:11 2016 +0100
>> >
>> >     x86/mm/kmmio: Fix mmiotrace for hugepages
>> >
>> >     commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
>> >
>> > Reverting _helps_ for x86 and x86_64 builds.
>> >
>
> That commit was added in 4.6. Does that kernel work? Maybe it was a bad
> backport?

I don't think so, since linux-next doesn't work until I revert this commit.

I can try exactly v4.6 if you would still like me to. (Yep, I tried
stable versions, including v4.4.16; that's why all of them failed for
me.)


-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-02 16:08       ` Andy Shevchenko
@ 2016-08-02 16:13         ` Steven Rostedt
  2016-08-03 18:24           ` karol herbst
  2016-08-19 10:35           ` karol herbst
  0 siblings, 2 replies; 20+ messages in thread
From: Steven Rostedt @ 2016-08-02 16:13 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-kernel, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, 2 Aug 2016 19:08:24 +0300
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> I don't think so, since linux-next doesn't work until I revert this commit.
> 
> I can try exactly v4.6 (yep, I tried stable versions, including
> v4.4.16 that's why all of them failed to me) if you still would like
> me to do so.

If linux-next doesn't work, then don't bother.

That commit obviously broke something and you'll probably need help
from Karol to fix it.

Thanks,

-- Steve


* Re: mmiotracer hangs the system
  2016-08-02 16:13         ` Steven Rostedt
@ 2016-08-03 18:24           ` karol herbst
  2016-08-19 10:35           ` karol herbst
  1 sibling, 0 replies; 20+ messages in thread
From: karol herbst @ 2016-08-03 18:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andy Shevchenko, linux-kernel, Paul E. McKenney, Ingo Molnar

Hi all,

Hmm, this exact commit fixed mmiotrace for me and a few other
nouveau devs on x86_64.

Also, the error log doesn't really show a problem inside the tracer.
Maybe it would be helpful to provide the full dmesg from a failing
run, just to rule silly things out.

I will try to figure out how that bug could happen at all.

Greetings

Karol

2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> On Tue, 2 Aug 2016 19:08:24 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> I don't think so, since linux-next doesn't work until I revert this commit.
>>
>> I can try exactly v4.6 (yep, I tried stable versions, including
>> v4.4.16 that's why all of them failed to me) if you still would like
>> me to do so.
>
> If linux-next doesn't work, then don't bother.
>
> That commit obviously broke something and you'll probably need help
> from Karol to fix it.
>
> Thanks,
>
> -- Steve


* Re: mmiotracer hangs the system
  2016-08-02 16:13         ` Steven Rostedt
  2016-08-03 18:24           ` karol herbst
@ 2016-08-19 10:35           ` karol herbst
  2016-08-19 13:02             ` Andy Shevchenko
  2016-08-19 13:34             ` Steven Rostedt
  1 sibling, 2 replies; 20+ messages in thread
From: karol herbst @ 2016-08-19 10:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andy Shevchenko, linux-kernel, Paul E. McKenney, Ingo Molnar

Hi everybody,

Is there any update on that issue that I missed somehow? I really
don't want to leave the mmiotracer in a state where it breaks
something while fixing other issues.

But for now, without being able to even reproduce the issue, I can't
really do much, because the code in its current state looks sane to
me. Maybe this case involves the mmiotracer cleaning things up and
then arming a new region for mmiotracing, and that's why it fails?
Besides that, I have no idea and no way to reproduce this, so I can't
help much.

Greetings

2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> On Tue, 2 Aug 2016 19:08:24 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> I don't think so, since linux-next doesn't work until I revert this commit.
>>
>> I can try exactly v4.6 (yep, I tried stable versions, including
>> v4.4.16 that's why all of them failed to me) if you still would like
>> me to do so.
>
> If linux-next doesn't work, then don't bother.
>
> That commit obviously broke something and you'll probably need help
> from Karol to fix it.
>
> Thanks,
>
> -- Steve


* Re: mmiotracer hangs the system
  2016-08-19 10:35           ` karol herbst
@ 2016-08-19 13:02             ` Andy Shevchenko
  2016-08-19 15:08               ` karol herbst
  2016-08-19 13:34             ` Steven Rostedt
  1 sibling, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-19 13:02 UTC (permalink / raw)
  To: karol herbst; +Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
> is there any update on that issue I missed somehow? I really don't
> want to leave the mmiotracer in a state, where it breaks something
> while fixing other issues.

No updates. I'm busy right now with higher-priority tasks, and the
revert works for me. The issue is 100% reproducible in my case.

I am able to attach dmesg in case it would be helpful.
Otherwise, tell me exactly how to debug the issue.

Here you are:
http://pastebin.com/raw/VfTZENt7

> But for now, without being able to even reproduce the issue, I can't
> really do much, because the code in the current state looks sane to
> me. Maybe this case includes the mmiotracer cleaning things up and
> arms new region for mmiotracing and that's why it fails? Besides that,
> I have no idea and no way to reproduce this, so I can't help this way.

Maybe. The first thing that happens is iounmap().

> 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> > On Tue, 2 Aug 2016 19:08:24 +0300
> > Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> >
> >> I don't think so, since linux-next doesn't work until I revert this commit.
> >>
> >> I can try exactly v4.6 (yep, I tried stable versions, including
> >> v4.4.16 that's why all of them failed to me) if you still would like
> >> me to do so.
> >
> > If linux-next doesn't work, then don't bother.
> >
> > That commit obviously broke something and you'll probably need help
> > from Karol to fix it.

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 10:35           ` karol herbst
  2016-08-19 13:02             ` Andy Shevchenko
@ 2016-08-19 13:34             ` Steven Rostedt
  1 sibling, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2016-08-19 13:34 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: karol herbst, linux-kernel, Paul E. McKenney, Ingo Molnar


Andy,

OK, the ball is in your court. Karol can't reproduce it, so it will
require you to send debug information back so we can get this solved.

-- Steve


On Fri, 19 Aug 2016 12:35:24 +0200
karol herbst <karolherbst@gmail.com> wrote:

> Hi everybody,
> 
> is there any update on that issue I missed somehow? I really don't
> want to leave the mmiotracer in a state, where it breaks something
> while fixing other issues.
> 
> But for now, without being able to even reproduce the issue, I can't
> really do much, because the code in the current state looks sane to
> me. Maybe this case includes the mmiotracer cleaning things up and
> arms new region for mmiotracing and that's why it fails? Besides that,
> I have no idea and no way to reproduce this, so I can't help this way.
> 
> Greetings
> 
> 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> > On Tue, 2 Aug 2016 19:08:24 +0300
> > Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> >  
> >> I don't think so, since linux-next doesn't work until I revert this commit.
> >>
> >> I can try exactly v4.6 (yep, I tried stable versions, including
> >> v4.4.16 that's why all of them failed to me) if you still would like
> >> me to do so.  
> >
> > If linux-next doesn't work, then don't bother.
> >
> > That commit obviously broke something and you'll probably need help
> > from Karol to fix it.
> >
> > Thanks,
> >
> > -- Steve  


* Re: mmiotracer hangs the system
  2016-08-19 13:02             ` Andy Shevchenko
@ 2016-08-19 15:08               ` karol herbst
  2016-08-19 15:35                 ` Andy Shevchenko
  0 siblings, 1 reply; 20+ messages in thread
From: karol herbst @ 2016-08-19 15:08 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>> is there any update on that issue I missed somehow? I really don't
>> want to leave the mmiotracer in a state, where it breaks something
>> while fixing other issues.
>
> No updates. I'm busy right now with more priority tasks and revert
> works for me. Issue is reproducible in my case 100%.
>

Is there something I could do with a "normal" Haswell desktop system
to reproduce this issue?

I'll try to play around a bit over the next few days and maybe I'll
find something that triggers it here as well. It seems to be related
to unmapping-mapping cycles.

Because if this only happens with the pwm-lpss driver, it may be
really troublesome to debug, because I don't know that code well
enough to be sure where the issue might be.
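
One way to exercise unmap/map cycles on other hardware might be to
repeat the unbind/bind step from the original report for some PCI
driver that is present on the test machine. The sketch below assumes
mmiotrace has already been enabled as in the report. The names
DRY_RUN, sysfs_write and rebind_cycles are hypothetical, and by
default it only prints the writes instead of performing them.

```shell
#!/bin/sh
# Sketch: repeat unbind/bind cycles for one PCI device.
# DRY_RUN=1 (the default here) prints the sysfs writes instead of doing them.
DRY_RUN="${DRY_RUN:-1}"

# sysfs_write VALUE FILE: write VALUE to FILE, or print the command in dry-run mode.
sysfs_write() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# rebind_cycles DRIVER DEVICE COUNT: unbind and rebind DEVICE COUNT times.
rebind_cycles() {
    drv="/sys/bus/pci/drivers/$1"
    i=0
    while [ "$i" -lt "$3" ]; do
        sysfs_write "$2" "$drv/unbind"   # tears down the driver's mappings
        sysfs_write "$2" "$drv/bind"     # probes again, which maps the BARs again
        i=$((i + 1))
    done
}

# Example: two cycles on the device from the original report.
rebind_cycles intel-lpss 0000:00:18.1 2
```

Setting DRY_RUN=0 performs the writes for real, which needs root and
may hang the machine if the problem reproduces.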

> So, I would able to attach dmesg in case it would be helpful.
> Otherwise tell me exact instructions how to debug the issue.
>
> Here you are:
> http://pastebin.com/raw/VfTZENt7
>
>> But for now, without being able to even reproduce the issue, I can't
>> really do much, because the code in the current state looks sane to
>> me. Maybe this case includes the mmiotracer cleaning things up and
>> arms new region for mmiotracing and that's why it fails? Besides that,
>> I have no idea and no way to reproduce this, so I can't help this way.
>
> Maybe. First thing happened is iounmap().
>
>> 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
>> > On Tue, 2 Aug 2016 19:08:24 +0300
>> > Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>> >
>> >> I don't think so, since linux-next doesn't work until I revert this commit.
>> >>
>> >> I can try exactly v4.6 (yep, I tried stable versions, including
>> >> v4.4.16 that's why all of them failed to me) if you still would like
>> >> me to do so.
>> >
>> > If linux-next doesn't work, then don't bother.
>> >
>> > That commit obviously broke something and you'll probably need help
>> > from Karol to fix it.
>
> --
> With Best Regards,
> Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 15:08               ` karol herbst
@ 2016-08-19 15:35                 ` Andy Shevchenko
  2016-08-19 18:23                   ` karol herbst
  2016-08-19 20:46                   ` Karol Herbst
  0 siblings, 2 replies; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-19 15:35 UTC (permalink / raw)
  To: karol herbst; +Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>> is there any update on that issue I missed somehow? I really don't
>>> want to leave the mmiotracer in a state, where it breaks something
>>> while fixing other issues.
>>
>> No updates. I'm busy right now with more priority tasks and revert
>> works for me. Issue is reproducible in my case 100%.
>>
>
> Is there something I could do with a "normal" haswell desktop system
> to reproduce this issue?

Try LPSS UART device(s)

>
> I'll try to play around the next days a bit and maybe I find something
> that works out here as well. It seems to be related to
> unmapping-mapping cycles.

That is the only thing I would think of.

>
> Because if this only happens with the pwm-lpss driver,

It has nothing to do with pwm-lpss, since it's an HS UART served by the
intel-lpss driver.

> it may be
> really troublesome to debug, because I don't really know the code that
> well to be sure where the issue might be.
>
>> So, I would able to attach dmesg in case it would be helpful.
>> Otherwise tell me exact instructions how to debug the issue.
>>
>> Here you are:
>> http://pastebin.com/raw/VfTZENt7
>>
>>> But for now, without being able to even reproduce the issue, I can't
>>> really do much, because the code in the current state looks sane to
>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>> I have no idea and no way to reproduce this, so I can't help this way.
>>
>> Maybe. First thing happened is iounmap().


-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 15:35                 ` Andy Shevchenko
@ 2016-08-19 18:23                   ` karol herbst
  2016-08-19 20:46                   ` Karol Herbst
  1 sibling, 0 replies; 20+ messages in thread
From: karol herbst @ 2016-08-19 18:23 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>> is there any update on that issue I missed somehow? I really don't
>>>> want to leave the mmiotracer in a state, where it breaks something
>>>> while fixing other issues.
>>>
>>> No updates. I'm busy right now with more priority tasks and revert
>>> works for me. Issue is reproducible in my case 100%.
>>>
>>
>> Is there something I could do with a "normal" haswell desktop system
>> to reproduce this issue?
>
> Try LPSS UART device(s)
>

Isn't this a Skylake thing? My CPU and motherboard are a bit older
than that.

>>
>> I'll try to play around the next days a bit and maybe I find something
>> that works out here as well. It seems to be related to
>> unmapping-mapping cycles.
>
> That is the only thing I would think of.
>
>>
>> Because if this only happens with the pwm-lpss driver,
>
> It has nothing to do with pwm-lpss since it's a HS UART and served by
> intel-lpss driver.
>
>> it may be
>> really troublesome to debug, because I don't really know the code that
>> well to be sure where the issue might be.
>>
>>> So, I would able to attach dmesg in case it would be helpful.
>>> Otherwise tell me exact instructions how to debug the issue.
>>>
>>> Here you are:
>>> http://pastebin.com/raw/VfTZENt7
>>>
>>>> But for now, without being able to even reproduce the issue, I can't
>>>> really do much, because the code in the current state looks sane to
>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>
>>> Maybe. First thing happened is iounmap().
>
>
> --
> With Best Regards,
> Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 15:35                 ` Andy Shevchenko
  2016-08-19 18:23                   ` karol herbst
@ 2016-08-19 20:46                   ` Karol Herbst
  2016-08-19 21:50                     ` Andy Shevchenko
  2016-10-13 21:12                     ` Karol Herbst
  1 sibling, 2 replies; 20+ messages in thread
From: Karol Herbst @ 2016-08-19 20:46 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

Hi again,

I was able to get a crash/freeze/something while unbinding/binding my
nvidia gpu from nouveau.

Guess that means something is odd. I will investigate this more over
the weekend.

2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>> is there any update on that issue I missed somehow? I really don't
>>>> want to leave the mmiotracer in a state, where it breaks something
>>>> while fixing other issues.
>>>
>>> No updates. I'm busy right now with more priority tasks and revert
>>> works for me. Issue is reproducible in my case 100%.
>>>
>>
>> Is there something I could do with a "normal" haswell desktop system
>> to reproduce this issue?
>
> Try LPSS UART device(s)
>
>>
>> I'll try to play around the next days a bit and maybe I find something
>> that works out here as well. It seems to be related to
>> unmapping-mapping cycles.
>
> That is the only thing I would think of.
>
>>
>> Because if this only happens with the pwm-lpss driver,
>
> It has nothing to do with pwm-lpss since it's a HS UART and served by
> intel-lpss driver.
>
>> it may be
>> really troublesome to debug, because I don't really know the code that
>> well to be sure where the issue might be.
>>
>>> So, I would able to attach dmesg in case it would be helpful.
>>> Otherwise tell me exact instructions how to debug the issue.
>>>
>>> Here you are:
>>> http://pastebin.com/raw/VfTZENt7
>>>
>>>> But for now, without being able to even reproduce the issue, I can't
>>>> really do much, because the code in the current state looks sane to
>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>
>>> Maybe. First thing happened is iounmap().
>
>
> --
> With Best Regards,
> Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 20:46                   ` Karol Herbst
@ 2016-08-19 21:50                     ` Andy Shevchenko
  2016-10-13 21:12                     ` Karol Herbst
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-19 21:50 UTC (permalink / raw)
  To: Karol Herbst; +Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

On Fri, Aug 19, 2016 at 11:46 PM, Karol Herbst <karolherbst@gmail.com> wrote:
> I was able to get a crash/freeze/something while unbinding/binding my
> nvidia gpu from nouveau.

>
> Guess that means something is odd. I will investigate this more over
> the weekend.

Thanks. Will wait for further updates.

-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-08-19 20:46                   ` Karol Herbst
  2016-08-19 21:50                     ` Andy Shevchenko
@ 2016-10-13 21:12                     ` Karol Herbst
  2016-10-22 16:02                       ` Andy Shevchenko
  1 sibling, 1 reply; 20+ messages in thread
From: Karol Herbst @ 2016-10-13 21:12 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

Sorry for the delay in fixing that bug. I got occupied with other things
and didn't really get to the issue again. It is the next item on my todo
list though, and I hope to have a fix ready this weekend. I think I might
know where the issue is, but I haven't confirmed it yet.

Again, sorry for the delay.

Karol

2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>:
> Hi again,
>
> I was able to get a crash/freeze/something while unbinding/binding my
> nvidia gpu from nouveau.
>
> Guess that means something is odd. I will investigate this more over
> the weekend.
>
> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>> is there any update on that issue I missed somehow? I really don't
>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>> while fixing other issues.
>>>>
>>>> No updates. I'm busy right now with more priority tasks and revert
>>>> works for me. Issue is reproducible in my case 100%.
>>>>
>>>
>>> Is there something I could do with a "normal" haswell desktop system
>>> to reproduce this issue?
>>
>> Try LPSS UART device(s)
>>
>>>
>>> I'll try to play around the next days a bit and maybe I find something
>>> that works out here as well. It seems to be related to
>>> unmapping-mapping cycles.
>>
>> That is the only thing I would think of.
>>
>>>
>>> Because if this only happens with the pwm-lpss driver,
>>
>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>> intel-lpss driver.
>>
>>> it may be
>>> really troublesome to debug, because I don't really know the code that
>>> well to be sure where the issue might be.
>>>
>>>> So, I would able to attach dmesg in case it would be helpful.
>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>
>>>> Here you are:
>>>> http://pastebin.com/raw/VfTZENt7
>>>>
>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>> really do much, because the code in the current state looks sane to
>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>
>>>> Maybe. First thing happened is iounmap().
>>
>>
>> --
>> With Best Regards,
>> Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-10-13 21:12                     ` Karol Herbst
@ 2016-10-22 16:02                       ` Andy Shevchenko
  2016-11-19 10:56                         ` Karol Herbst
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Shevchenko @ 2016-10-22 16:02 UTC (permalink / raw)
  To: Karol Herbst; +Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote:
> sorry for the delay fixing that bug. I got occupied with other things
> and didn't really got to the issue again, it is on my todo list as the
> next item though and I hope I will be able to get a fix ready this
> weekend. I think I might know where the issue is, but didn't confirm
> it yet.

Thanks. I'm still using the revert. Feel free to Cc me when you have
some material to test.

>
> Again, sorry for the delay.
>
> Karol
>
> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>:
>> Hi again,
>>
>> I was able to get a crash/freeze/something while unbinding/binding my
>> nvidia gpu from nouveau.
>>
>> Guess that means something is odd. I will investigate this more over
>> the weekend.
>>
>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>> while fixing other issues.
>>>>>
>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>
>>>>
>>>> Is there something I could do with a "normal" haswell desktop system
>>>> to reproduce this issue?
>>>
>>> Try LPSS UART device(s)
>>>
>>>>
>>>> I'll try to play around the next days a bit and maybe I find something
>>>> that works out here as well. It seems to be related to
>>>> unmapping-mapping cycles.
>>>
>>> That is the only thing I would think of.
>>>
>>>>
>>>> Because if this only happens with the pwm-lpss driver,
>>>
>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>> intel-lpss driver.
>>>
>>>> it may be
>>>> really troublesome to debug, because I don't really know the code that
>>>> well to be sure where the issue might be.
>>>>
>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>
>>>>> Here you are:
>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>
>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>> really do much, because the code in the current state looks sane to
>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>
>>>>> Maybe. First thing happened is iounmap().
>>>
>>>
>>> --
>>> With Best Regards,
>>> Andy Shevchenko



-- 
With Best Regards,
Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-10-22 16:02                       ` Andy Shevchenko
@ 2016-11-19 10:56                         ` Karol Herbst
  2016-11-24 20:50                           ` Karol Herbst
  0 siblings, 1 reply; 20+ messages in thread
From: Karol Herbst @ 2016-11-19 10:56 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

This is odd. I found a bug related to nouveau (modprobe/bind doesn't
return). At first I thought it wasn't related to your issue at all, but
maybe it is exactly this: the binding of the device doesn't return and,
depending on the kind of driver, that would hang the system. So yeah,
maybe it is the same issue.

Anyway, could you try to trace with the attached patch? Maybe the
additional output would help me to verify it. Currently I am working
on the bugfix I mentioned above, and this may also fix your issue. I
was still able to get a working mmiotrace file, even if the device
binding didn't finish. Is this the same for you? (Try
"cat /sys/kernel/debug/tracing/trace_pipe > some_file" and see if it
contains anything useful.)

This really looks like an odd issue, because the mmiotracer still
behaves as expected.

2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote:
>> sorry for the delay fixing that bug. I got occupied with other things
>> and didn't really got to the issue again, it is on my todo list as the
>> next item though and I hope I will be able to get a fix ready this
>> weekend. I think I might know where the issue is, but didn't confirm
>> it yet.
>
> Thanks. I'm still using revert. Feel free to Cc me when you will have
> some material to test.
>
>>
>> Again, sorry for the delay.
>>
>> Karol
>>
>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>:
>>> Hi again,
>>>
>>> I was able to get a crash/freeze/something while unbinding/binding my
>>> nvidia gpu from nouveau.
>>>
>>> Guess that means something is odd. I will investigate this more over
>>> the weekend.
>>>
>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>> while fixing other issues.
>>>>>>
>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>
>>>>>
>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>> to reproduce this issue?
>>>>
>>>> Try LPSS UART device(s)
>>>>
>>>>>
>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>> that works out here as well. It seems to be related to
>>>>> unmapping-mapping cycles.
>>>>
>>>> That is the only thing I would think of.
>>>>
>>>>>
>>>>> Because if this only happens with the pwm-lpss driver,
>>>>
>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>> intel-lpss driver.
>>>>
>>>>> it may be
>>>>> really troublesome to debug, because I don't really know the code that
>>>>> well to be sure where the issue might be.
>>>>>
>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>
>>>>>> Here you are:
>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>
>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>
>>>>>> Maybe. First thing happened is iounmap().
>>>>
>>>>
>>>> --
>>>> With Best Regards,
>>>> Andy Shevchenko
>
>
>
> --
> With Best Regards,
> Andy Shevchenko


* Re: mmiotracer hangs the system
  2016-11-19 10:56                         ` Karol Herbst
@ 2016-11-24 20:50                           ` Karol Herbst
  0 siblings, 0 replies; 20+ messages in thread
From: Karol Herbst @ 2016-11-24 20:50 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 4044 bytes --]

Sorry about that, I forgot to attach the patch.

2016-11-19 11:56 GMT+01:00 Karol Herbst <karolherbst@gmail.com>:
> this is odd, I found a bug related to nouveau (modprobe/bind doesn't
> return), but that isn't related to your issue at all or maybe it is
> exactly this, cause the binding of the device doesn't return and
> depending on the kind of driver, it would hang the system... yeah,
> maybe it is the same issue.
>
> anyway, could you try to trace with the attached patch? Maybe the
> additional output would help me to verify it. Currently I am working
> on the bugfix I mentioned above and this may also fix your issue. I
> was still able to get a working mmiotrace file, even if the device
> binding didn't finish. Is this the same for you? (try
> "cat /sys/kernel/debug/tracing/trace_pipe > some_file" and see if this
> contains anything useful).
>
> This really looks like an odd issue, because the mmiotracer still
> behaves as expected.
>
> 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote:
>>> sorry for the delay fixing that bug. I got occupied with other things
>>> and didn't really got to the issue again, it is on my todo list as the
>>> next item though and I hope I will be able to get a fix ready this
>>> weekend. I think I might know where the issue is, but didn't confirm
>>> it yet.
>>
>> Thanks. I'm still using revert. Feel free to Cc me when you will have
>> some material to test.
>>
>>>
>>> Again, sorry for the delay.
>>>
>>> Karol
>>>
>>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>:
>>>> Hi again,
>>>>
>>>> I was able to get a crash/freeze/something while unbinding/binding my
>>>> nvidia gpu from nouveau.
>>>>
>>>> Guess that means something is odd. I will investigate this more over
>>>> the weekend.
>>>>
>>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>>> while fixing other issues.
>>>>>>>
>>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>>
>>>>>>
>>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>>> to reproduce this issue?
>>>>>
>>>>> Try LPSS UART device(s)
>>>>>
>>>>>>
>>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>>> that works out here as well. It seems to be related to
>>>>>> unmapping-mapping cycles.
>>>>>
>>>>> That is the only thing I would think of.
>>>>>
>>>>>>
>>>>>> Because if this only happens with the pwm-lpss driver,
>>>>>
>>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>>> intel-lpss driver.
>>>>>
>>>>>> it may be
>>>>>> really troublesome to debug, because I don't really know the code that
>>>>>> well to be sure where the issue might be.
>>>>>>
>>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>>
>>>>>>> Here you are:
>>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>>
>>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>>
>>>>>>> Maybe. First thing happened is iounmap().
>>>>>
>>>>>
>>>>> --
>>>>> With Best Regards,
>>>>> Andy Shevchenko
>>
>>
>>
>> --
>> With Best Regards,
>> Andy Shevchenko

[-- Attachment #2: 0001-temp-hack.patch --]
[-- Type: text/x-patch, Size: 2760 bytes --]

From 92aea447a776f10aad0a2e971b5f2b208a1161d2 Mon Sep 17 00:00:00 2001
From: Karol Herbst <nouveau@karolherbst.de>
Date: Thu, 24 Nov 2016 21:46:27 +0100
Subject: [PATCH] temp hack

---
 arch/x86/mm/kmmio.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index afc47f5c9531..a002ee314a0c 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -97,11 +97,16 @@ static DEFINE_PER_CPU(struct kmmio_context, kmmio_ctx);
 static struct kmmio_probe *get_kmmio_probe(unsigned long addr)
 {
 	struct kmmio_probe *p;
+	struct kmmio_probe *result = NULL;
 	list_for_each_entry_rcu(p, &kmmio_probes, list) {
-		if (addr >= p->addr && addr < (p->addr + p->len))
-			return p;
+		if (addr >= p->addr && addr < (p->addr + p->len)) {
+			if (!result)
+				result = p;
+			else
+				printk(KERN_ERR " %s collision detected %lx\n", __FUNCTION__, addr);
+		}
 	}
-	return NULL;
+	return result;
 }
 
 /* You must be holding RCU read lock. */
@@ -109,6 +114,7 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr)
 {
 	struct list_head *head;
 	struct kmmio_fault_page *f;
+	struct kmmio_fault_page *result = NULL;
 	unsigned int l;
 	pte_t *pte = lookup_address(addr, &l);
 
@@ -116,11 +122,16 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr)
 		return NULL;
 	addr &= page_level_mask(l);
 	head = kmmio_page_list(addr);
+
 	list_for_each_entry_rcu(f, head, list) {
-		if (f->addr == addr)
-			return f;
+		if (f->addr == addr) {
+			if (!result)
+				result = f;
+			else
+				printk(KERN_ERR " %s collision detected %lx\n", __FUNCTION__, addr);
+		}
 	}
-	return NULL;
+	return result;
 }
 
 static void clear_pmd_presence(pmd_t *pmd, bool clear, pmdval_t *old)
@@ -375,6 +386,7 @@ static int add_kmmio_fault_page(unsigned long addr)
 {
 	struct kmmio_fault_page *f;
 
+	printk(KERN_WARNING " %s %lx\n", __FUNCTION__, addr);
 	f = get_kmmio_fault_page(addr);
 	if (f) {
 		if (!f->count)
@@ -406,6 +418,7 @@ static void release_kmmio_fault_page(unsigned long addr,
 {
 	struct kmmio_fault_page *f;
 
+	printk(KERN_WARNING " %s %lx\n", __FUNCTION__, addr);
 	f = get_kmmio_fault_page(addr);
 	if (!f)
 		return;
@@ -445,6 +458,8 @@ int register_kmmio_probe(struct kmmio_probe *p)
 	}
 
 	pte = lookup_address(p->addr, &l);
+	printk(KERN_WARNING " %s %lx %u\n", __FUNCTION__, p->addr, l);
+
 	if (!pte) {
 		ret = -EINVAL;
 		goto out;
@@ -537,6 +552,8 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 	if (!pte)
 		return;
 
+	printk(KERN_WARNING " %s %lx %u\n", __FUNCTION__, p->addr, l);
+
 	spin_lock_irqsave(&kmmio_lock, flags);
 	while (size < size_lim) {
 		release_kmmio_fault_page(p->addr + size, &release_list);
-- 
2.11.0.rc2



end of thread, other threads:[~2016-11-24 20:50 UTC | newest]

Thread overview: 20+ messages
2016-08-02 10:08 mmiotracer hangs the system Andy Shevchenko
2016-08-02 10:36 ` Andy Shevchenko
2016-08-02 15:05 ` Andy Shevchenko
2016-08-02 15:07   ` Andy Shevchenko
2016-08-02 15:31     ` Steven Rostedt
2016-08-02 16:08       ` Andy Shevchenko
2016-08-02 16:13         ` Steven Rostedt
2016-08-03 18:24           ` karol herbst
2016-08-19 10:35           ` karol herbst
2016-08-19 13:02             ` Andy Shevchenko
2016-08-19 15:08               ` karol herbst
2016-08-19 15:35                 ` Andy Shevchenko
2016-08-19 18:23                   ` karol herbst
2016-08-19 20:46                   ` Karol Herbst
2016-08-19 21:50                     ` Andy Shevchenko
2016-10-13 21:12                     ` Karol Herbst
2016-10-22 16:02                       ` Andy Shevchenko
2016-11-19 10:56                         ` Karol Herbst
2016-11-24 20:50                           ` Karol Herbst
2016-08-19 13:34             ` Steven Rostedt
