* kmemleak complaints in nvme PM
@ 2017-04-09 12:47 Sagi Grimberg
  2017-04-12 17:29 ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2017-04-09 12:47 UTC (permalink / raw)


Hi folks,

I got this kmemleak complaint [1] on nvme-loop (but I'm pretty
confident it should happen with any nvme transport).

An initial stare at the code tells me that
dev_pm_qos_update_user_latency_tolerance() can (and usually will,
during controller initialization) allocate a dev_pm_qos_request, but I
didn't see a matching free of that request.

I'll try to find some time to look into it, but thought it might
be a good idea to throw it out here in the meantime...


[1]:
--
unreferenced object 0xffff9d87f7f65b00 (size 64):
   comm "nvme", pid 1088, jiffies 4295242789 (age 37.036s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffffaf84312a>] kmemleak_alloc+0x4a/0xa0
     [<ffffffffaf1ff780>] kmem_cache_alloc_trace+0x110/0x230
     [<ffffffffaf579fbc>] 
dev_pm_qos_update_user_latency_tolerance+0x7c/0x100
     [<ffffffffc06b9b9c>] nvme_init_ctrl+0x21c/0x250 [nvme_core]
     [<ffffffffc06bc52a>] nvme_probe_ctrl+0x9a/0x1c0 [nvme_core]
     [<ffffffffc0736b9f>] nvme_loop_create_ctrl+0xbf/0x150 [nvme_loop]
     [<ffffffffc053e3f2>] nvmf_dev_write+0x7a2/0x9d7 [nvme_fabrics]
     [<ffffffffaf225fc8>] __vfs_write+0x28/0x140
     [<ffffffffaf2268d8>] vfs_write+0xb8/0x1b0
     [<ffffffffaf227d86>] SyS_write+0x46/0xa0
     [<ffffffffaf84dffb>] entry_SYSCALL_64_fastpath+0x1e/0xad
     [<ffffffffffffffff>] 0xffffffffffffffff
--

^ permalink raw reply	[flat|nested] 8+ messages in thread

* kmemleak complaints in nvme PM
  2017-04-09 12:47 kmemleak complaints in nvme PM Sagi Grimberg
@ 2017-04-12 17:29 ` Christoph Hellwig
  2017-04-12 17:42   ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2017-04-12 17:29 UTC (permalink / raw)


On Sun, Apr 09, 2017 at 03:47:35PM +0300, Sagi Grimberg wrote:
> Hi folks,
>
> I got this kmemleak complaint [1] on nvme-loop (but I'm pretty
> confident it should happen with any nvme transport).
>
> An initial stare at the code tells me that
> dev_pm_qos_update_user_latency_tolerance() can (and usually will,
> during controller initialization) allocate a dev_pm_qos_request, but I
> didn't see a matching free of that request.
>
> I'll try to find some time to look into it, but thought it might
> be a good idea to throw it out here in the meantime...

Seems like the only way to free dev->power.qos->latency_tolerance_req
is to call dev_pm_qos_update_user_latency_tolerance with a negative
val argument.  What an odd API..


* kmemleak complaints in nvme PM
  2017-04-12 17:29 ` Christoph Hellwig
@ 2017-04-12 17:42   ` Andy Lutomirski
  2017-04-13  2:47     ` Rafael J. Wysocki
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2017-04-12 17:42 UTC (permalink / raw)


On Wed, Apr 12, 2017 at 10:29 AM, Christoph Hellwig <hch@lst.de> wrote:
> On Sun, Apr 09, 2017 at 03:47:35PM +0300, Sagi Grimberg wrote:
>> Hi folks,
>>
>> I got this kmemleak complaint [1] on nvme-loop (but I'm pretty
>> confident it should happen with any nvme transport).
>>
>> An initial stare at the code tells me that
>> dev_pm_qos_update_user_latency_tolerance() can (and usually will,
>> during controller initialization) allocate a dev_pm_qos_request, but I
>> didn't see a matching free of that request.
>>
>> I'll try to find some time to look into it, but thought it might
>> be a good idea to throw it out here in the meantime...
>
> Seems like the only way to free dev->power.qos->latency_tolerance_req
> is to call dev_pm_qos_update_user_latency_tolerance with a negative
> val argument.  What an odd API..

Greg and/or Rafael, what's the right way to fix this?  I would imagine
that the driver core should free pm_qos requests when the device goes
away, no?

Also, I suspect this is all quite racy -- I don't see anything
preventing two user threads from writing to the sysfs fields at the
same time.


* kmemleak complaints in nvme PM
  2017-04-12 17:42   ` Andy Lutomirski
@ 2017-04-13  2:47     ` Rafael J. Wysocki
  2017-04-19 21:37       ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: Rafael J. Wysocki @ 2017-04-13  2:47 UTC (permalink / raw)


On Wednesday, April 12, 2017 10:42:51 AM Andy Lutomirski wrote:
> On Wed, Apr 12, 2017 at 10:29 AM, Christoph Hellwig <hch@lst.de> wrote:
> > On Sun, Apr 09, 2017 at 03:47:35PM +0300, Sagi Grimberg wrote:
> >> Hi folks,
> >>
> >> I got this kmemleak complaint [1] on nvme-loop (but I'm pretty
> >> confident it should happen with any nvme transport).
> >>
> >> An initial stare at the code tells me that
> >> dev_pm_qos_update_user_latency_tolerance() can (and usually will,
> >> during controller initialization) allocate a dev_pm_qos_request, but I
> >> didn't see a matching free of that request.
> >>
> >> I'll try to find some time to look into it, but thought it might
> >> be a good idea to throw it out here in the meantime...
> >
> > Seems like the only way to free dev->power.qos->latency_tolerance_req
> > is to call dev_pm_qos_update_user_latency_tolerance with a negative
> > val argument.  What an odd API..
> 
> Greg and/or Rafael, what's the right way to fix this?  I would imagine
> that the driver core should free pm_qos requests when the device goes
> away, no?

Yes, it should.

Thanks,
Rafael


* kmemleak complaints in nvme PM
  2017-04-13  2:47     ` Rafael J. Wysocki
@ 2017-04-19 21:37       ` Andy Lutomirski
  2017-04-20 11:34         ` Sagi Grimberg
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2017-04-19 21:37 UTC (permalink / raw)


On Wed, Apr 12, 2017 at 7:47 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, April 12, 2017 10:42:51 AM Andy Lutomirski wrote:
>> On Wed, Apr 12, 2017 at 10:29 AM, Christoph Hellwig <hch@lst.de> wrote:
>> > On Sun, Apr 09, 2017 at 03:47:35PM +0300, Sagi Grimberg wrote:
>> >> Hi folks,
>> >>
>> >> I got this kmemleak complaint [1] on nvme-loop (but I'm pretty
>> >> confident it should happen with any nvme transport).
>> >>
>> >> An initial stare at the code tells me that
>> >> dev_pm_qos_update_user_latency_tolerance() can (and usually will,
>> >> during controller initialization) allocate a dev_pm_qos_request, but I
>> >> didn't see a matching free of that request.
>> >>
>> >> I'll try to find some time to look into it, but thought it might
>> >> be a good idea to throw it out here in the meantime...
>> >
>> > Seems like the only way to free dev->power.qos->latency_tolerance_req
>> > is to call dev_pm_qos_update_user_latency_tolerance with a negative
>> > val argument.  What an odd API..
>>
>> Greg and/or Rafael, what's the right way to fix this?  I would imagine
>> that the driver core should free pm_qos requests when the device goes
>> away, no?
>
> Yes, it should.
>

I don't see the bug.  dev_pm_qos_constraints_destroy() should take care of it.


* kmemleak complaints in nvme PM
  2017-04-19 21:37       ` Andy Lutomirski
@ 2017-04-20 11:34         ` Sagi Grimberg
  2017-04-20 16:24           ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2017-04-20 11:34 UTC (permalink / raw)



>>> Greg and/or Rafael, what's the right way to fix this?  I would imagine
>>> that the driver core should free pm_qos requests when the device goes
>>> away, no?
>>
>> Yes, it should.
>>
>
> I don't see the bug.  dev_pm_qos_constraints_destroy() should take care of it.

I see it all the time. Where is dev_pm_qos_constraints_destroy() called?

Not in nvme.


* kmemleak complaints in nvme PM
  2017-04-20 11:34         ` Sagi Grimberg
@ 2017-04-20 16:24           ` Andy Lutomirski
  2017-04-20 17:03             ` Sagi Grimberg
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2017-04-20 16:24 UTC (permalink / raw)


On Thu, Apr 20, 2017 at 4:34 AM, Sagi Grimberg <sagi@grimberg.me> wrote:
>
>>>> Greg and/or Rafael, what's the right way to fix this?  I would imagine
>>>> that the driver core should free pm_qos requests when the device goes
>>>> away, no?
>>>
>>>
>>> Yes, it should.
>>>
>>
>> I don't see the bug.  dev_pm_qos_constraints_destroy() should take care of
>> it.
>
>
> I see it all the time. Where is dev_pm_qos_constraints_destroy() called?
>
> Not in nvme.

What I meant was: I don't see the bug in the code.

dev_pm_qos_constraints_destroy() is called from dpm_sysfs_remove(),
which is called from device_del().

Anyway, I tried to reproduce it and I failed:

# nvmetcli
/> cd /ports/
/ports> create 1
module 'nvmet' has no attribute 'Port'

I can manually mkdir '1' in /sys/kernel/config/nvmet/ports, but I have
no idea what I'm supposed to do from there.

Help?  Is there some trivial script I can run to reproduce the problem?


* kmemleak complaints in nvme PM
  2017-04-20 16:24           ` Andy Lutomirski
@ 2017-04-20 17:03             ` Sagi Grimberg
  0 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2017-04-20 17:03 UTC (permalink / raw)



> Anyway, I tried to reproduce it and I failed:
>
> # nvmetcli
> /> cd /ports/
> /ports> create 1
> module 'nvmet' has no attribute 'Port'
>
> I can manually mkdir '1' in /sys/kernel/config/nvmet/ports, but I have
> no idea what I'm supposed to do from there.
>
> Help?  Is there some trivial script I can run to reproduce the problem?

I think you need to pull from Christoph's nvmetcli repo to resolve this.
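For anyone reproducing without nvmetcli, a manual configfs sequence along these lines typically sets up an nvme-loop target. This is a hedged sketch, not a tested recipe: it assumes the nvmet, nvme-loop and null_blk modules are loaded (so /dev/nullb0 exists), nvme-cli is installed, and the "testnqn" subsystem name is arbitrary.

```shell
#!/bin/sh -e
# Create a subsystem backed by /dev/nullb0, expose it over the loop
# transport, then connect to it from the host side.
cd /sys/kernel/config/nvmet

mkdir subsystems/testnqn
echo 1 > subsystems/testnqn/attr_allow_any_host

mkdir subsystems/testnqn/namespaces/1
echo -n /dev/nullb0 > subsystems/testnqn/namespaces/1/device_path
echo 1 > subsystems/testnqn/namespaces/1/enable

mkdir ports/1
echo loop > ports/1/addr_trtype
ln -s /sys/kernel/config/nvmet/subsystems/testnqn \
      ports/1/subsystems/testnqn

nvme connect -t loop -n testnqn
```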

