From: "Zhou, Peng Ju via iommu" <iommu@lists.linux-foundation.org>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Robin Murphy <robin.murphy@arm.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Wang, Yin" <Yin.Wang@amd.com>,
	"Deng, Emily" <Emily.Deng@amd.com>,
	"will@kernel.org" <will@kernel.org>,
	"Chang, HaiJun" <HaiJun.Chang@amd.com>
Subject: RE: [PATCH] iommu/iova: kmemleak when disable SRIOV.
Date: Tue, 3 Aug 2021 05:50:47 +0000	[thread overview]
Message-ID: <DM8PR12MB54785D5E3F31EA4E887B02C3F8F09@DM8PR12MB5478.namprd12.prod.outlook.com> (raw)
In-Reply-To: <e6675688-c5a7-448f-84f4-bba0217d99a6@arm.com>

[AMD Official Use Only]

Hi Chris
I hit a kmemleak with your following patch. Can you help to fix it?

According to the info in this thread, it seems the patch was never merged into the iommu mainline branch, but I can find your patch in my kernel: 5.11.0
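
For what it's worth, a quick way to check this against a mainline checkout (assuming you have one) is something like:

    git tag --contains 48a64dd561a53fb809e3f2210faf5dd442cfc56d

which lists the mainline releases containing a commit; an error or an empty result would confirm the commit only exists in vendor branches.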

 
commit 48a64dd561a53fb809e3f2210faf5dd442cfc56d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Jan 16 11:10:35 2021 +0000

    iommu/iova: Use bottom-up allocation for DMA32

    Since DMA32 allocations are currently allocated top-down from the 4G
    boundary, this causes fragmentation and reduces the maximum allocation
    size. On some systems, the dma address space may be extremely
    constrained by PCIe windows, requiring a stronger anti-fragmentation
    strategy.

    Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2929
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
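
As far as I can tell, the failure mode is this (a sketch pieced together from the patch quoted below, not verbatim kernel code): init links a statically-embedded "head" iova into the rbtree, but put_iova_domain() later passes every rbtree node to free_iova_mem(), so the embedded object reaches kmem_cache_free() even though it never came from the iova kmem cache:

	/* init (per the patch): "head" is embedded in struct iova_domain */
	rb_link_node(&iovad->head.node, NULL, &iovad->rbroot.rb_node);
	rb_insert_color(&iovad->head.node, &iovad->rbroot);

	/* teardown: put_iova_domain() frees every node left in the tree */
	rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node)
		free_iova_mem(iova);	/* kmem_cache_free() on the static head */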


---------------------------------------------------------------------- 
BW
Pengju Zhou





> -----Original Message-----
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Tuesday, July 27, 2021 10:23 PM
> To: Zhou, Peng Ju <PengJu.Zhou@amd.com>; iommu@lists.linux-foundation.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Wang, Yin
> <Yin.Wang@amd.com>; will@kernel.org
> Subject: Re: [PATCH] iommu/iova: kmemleak when disable SRIOV.
> 
> On 2021-07-27 15:05, Zhou, Peng Ju wrote:
> > [AMD Official Use Only]
> >
> > Hi Robin
> > The patch which adds "head":
> >
> > commit 48a64dd561a53fb809e3f2210faf5dd442cfc56d
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Sat Jan 16 11:10:35 2021 +0000
> >
> >      iommu/iova: Use bottom-up allocation for DMA32
> >
> >      Since DMA32 allocations are currently allocated top-down from the 4G
> >      boundary, this causes fragmentation and reduces the maximum allocation
> >      size. On some systems, the dma address space may be extremely
> >      constrained by PCIe windows, requiring a stronger anti-fragmentation
> >      strategy.
> >
> >      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2929
> >      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> ...which is not in mainline. I've never even seen it posted for review.
> In fact two search engines can't seem to find any trace of that SHA or patch
> subject on the internet at all.
> 
> By all means tell Chris that his patch, wherever you got it from, is broken, but
> once again there's nothing the upstream maintainers/reviewers can do about
> code which isn't upstream.
> 
> Thanks,
> Robin.
> 
> > ----------------------------------------------------------------------
> > BW
> > Pengju Zhou
> >
> >
> >
> >> -----Original Message-----
> >> From: Robin Murphy <robin.murphy@arm.com>
> >> Sent: Tuesday, July 27, 2021 4:52 PM
> >> To: Zhou, Peng Ju <PengJu.Zhou@amd.com>; iommu@lists.linux-foundation.org
> >> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Wang, Yin
> >> <Yin.Wang@amd.com>; will@kernel.org
> >> Subject: Re: [PATCH] iommu/iova: kmemleak when disable SRIOV.
> >>
> >> On 2021-07-27 05:46, Zhou, Peng Ju wrote:
> >>> [AMD Official Use Only]
> >>>
> >>> Hi Robin
> >>> 1. it is not good practice to free a statically allocated object (in
> >>> this case, iovad->head) dynamically, even though in most cases the
> >>> free only occurs when shutting down the OS.
> >>> 2. the kmemleak occurred when disabling SRIOV (removing a PCIe
> >>> device). I post the log below. In it, the line "kmemleak: Found
> >>> object by alias at 0xffff9221ae647050" means the OS is freeing an
> >>> object that was never allocated on its own: iovad->head, which was
> >>> added to the rbtree in init_iova_domain(), is embedded inside a
> >>> larger allocation (object 0xffff9221ae647000 later in the log), so
> >>> kmemleak can only find it "by alias".
> >>
> >> Sure, it was apparent enough what the bug was; my point is that the
> >> bug does not exist in mainline. This is the current mainline
> >> definition of struct
> >> iova_domain:
> >>
> >>
> >> /* holds all the iova translations for a domain */
> >> struct iova_domain {
> >> 	spinlock_t	iova_rbtree_lock; /* Lock to protect update of rbtree */
> >> 	struct rb_root	rbroot;		/* iova domain rbtree root */
> >> 	struct rb_node	*cached_node;	/* Save last alloced node */
> >> 	struct rb_node	*cached32_node; /* Save last 32-bit alloced node */
> >> 	unsigned long	granule;	/* pfn granularity for this domain */
> >> 	unsigned long	start_pfn;	/* Lower limit for this domain */
> >> 	unsigned long	dma_32bit_pfn;
> >> 	unsigned long	max32_alloc_size; /* Size of last failed allocation */
> >> 	struct iova_fq __percpu *fq;	/* Flush Queue */
> >> 
> >> 	atomic64_t	fq_flush_start_cnt;	/* Number of TLB flushes that
> >> 						   have been started */
> >> 
> >> 	atomic64_t	fq_flush_finish_cnt;	/* Number of TLB flushes that
> >> 						   have been finished */
> >> 
> >> 	struct iova	anchor;		/* rbtree lookup anchor */
> >> 	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
> >> 
> >> 	iova_flush_cb	flush_cb;	/* Call-Back function to flush IOMMU
> >> 					   TLBs */
> >> 
> >> 	iova_entry_dtor entry_dtor;	/* IOMMU driver specific destructor for
> >> 					   iova entry */
> >> 
> >> 	struct timer_list fq_timer;		/* Timer to regularily empty the
> >> 						   flush-queues */
> >> 	atomic_t fq_timer_on;			/* 1 when timer is active, 0
> >> 						   when not */
> >> 	struct hlist_node	cpuhp_dead;
> >> };
> >>
> >>
> >> Please point to where "head" exists either way ;)
> >>
> >> The mainline code already has a special case to avoid trying to free
> >> the statically-allocated "anchor" node. Whoever modified your kernel
> >> has apparently failed to preserve the equivalent behaviour.
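> >>
> >> For reference, the mainline check in drivers/iommu/iova.c looks
> >> roughly like this (quoting from memory of 5.11; any additional
> >> statically-allocated node would need an equivalent guard):
> >>
> >> static void free_iova_mem(struct iova *iova)
> >> {
> >> 	if (iova->pfn_lo != IOVA_ANCHOR)
> >> 		kmem_cache_free(iova_cache, iova);
> >> }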
> >>
> >> Robin.
> >>
> >>> The patch I sent out may not meet IOMMU requirements because I have
> >>> no knowledge of the IOMMU background, but this patch can fix this
> >>> kmemleak.
> >>>
> >>> The kmemleak log on my server:
> >>> 316613 Jul 19 02:14:20 z-u18 kernel: [  108.967526] pci 0000:83:02.0: Removing from iommu group 108
> >>> 316614 Jul 19 02:14:20 z-u18 kernel: [  108.969032] kmemleak: Found object by alias at 0xffff9221ae647050
> >>> 316615 Jul 19 02:14:20 z-u18 kernel: [  108.969037] CPU: 30 PID: 2768 Comm: modprobe Tainted: G           OE     5.11.0 #12
> >>> 316616 Jul 19 02:14:20 z-u18 kernel: [  108.969042] Hardware name: Supermicro ...
> >>> 316617 Jul 19 02:14:20 z-u18 kernel: [  108.969045] Call Trace:
> >>> 316618 Jul 19 02:14:20 z-u18 kernel: [  108.969049] dump_stack+0x6d/0x88
> >>> 316619 Jul 19 02:14:20 z-u18 kernel: [  108.969061] lookup_object+0x5f/0x70
> >>> 316620 Jul 19 02:14:20 z-u18 kernel: [  108.969070] find_and_remove_object+0x29/0x80
> >>> 316621 Jul 19 02:14:20 z-u18 kernel: [  108.969077] delete_object_full+0xc/0x20
> >>> 316622 Jul 19 02:14:20 z-u18 kernel: [  108.969083] kmem_cache_free+0x22b/0x410
> >>> 316623 Jul 19 02:14:20 z-u18 kernel: [  108.969097] free_iova_mem+0x22/0x60
> >>> 316624 Jul 19 02:14:20 z-u18 kernel: [  108.969103] put_iova_domain+0x1b5/0x1e0
> >>> 316625 Jul 19 02:14:20 z-u18 kernel: [  108.969108] iommu_put_dma_cookie+0x44/0xc0
> >>> 316626 Jul 19 02:14:20 z-u18 kernel: [  108.969118] domain_exit+0xba/0xc0
> >>> 316627 Jul 19 02:14:20 z-u18 kernel: [  108.969123] iommu_group_release+0x51/0x90
> >>> 316628 Jul 19 02:14:20 z-u18 kernel: [  108.969129] kobject_put+0xa7/0x210
> >>> 316629 Jul 19 02:14:20 z-u18 kernel: [  108.969140] iommu_release_device+0x41/0x80
> >>> 316630 Jul 19 02:14:20 z-u18 kernel: [  108.969147] iommu_bus_notifier+0xa0/0xc0
> >>> 316631 Jul 19 02:14:20 z-u18 kernel: [  108.969153] blocking_notifier_call_chain+0x63/0x90
> >>> 316632 Jul 19 02:14:20 z-u18 kernel: [  108.969162] device_del+0x29c/0x3f0
> >>> 316633 Jul 19 02:14:20 z-u18 kernel: [  108.969167] pci_remove_bus_device+0x77/0x100
> >>> 316634 Jul 19 02:14:20 z-u18 kernel: [  108.969178] pci_iov_remove_virtfn+0xbc/0x110
> >>> 316635 Jul 19 02:14:20 z-u18 kernel: [  108.969187] sriov_disable+0x2f/0xe0 ......
> >>> 316651 Jul 19 02:14:20 z-u18 kernel: [  108.969629] RIP: 0033:0x7f6d8ec45047
> >>> 316652 Jul 19 02:14:20 z-u18 kernel: [  108.969634] Code: 73 01 c3 48 8b 0d 41 8e 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 11 8e 2c 00 f7 d8 64 89 01 48
> >>> 316653 Jul 19 02:14:20 z-u18 kernel: [  108.969638] RSP: 002b:00007ffc18dafc48 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> >>> 316654 Jul 19 02:14:20 z-u18 kernel: [  108.969645] RAX: ffffffffffffffda RBX: 000055817f00adc0 RCX: 00007f6d8ec45047
> >>> 316655 Jul 19 02:14:20 z-u18 kernel: [  108.969648] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055817f00ae28
> >>> 316656 Jul 19 02:14:20 z-u18 kernel: [  108.969651] RBP: 000055817f00adc0 R08: 00007ffc18daebf1 R09: 0000000000000000
> >>> 316657 Jul 19 02:14:20 z-u18 kernel: [  108.969654] R10: 00007f6d8ecc1c40 R11: 0000000000000206 R12: 000055817f00ae28
> >>> 316658 Jul 19 02:14:20 z-u18 kernel: [  108.969656] R13: 0000000000000001 R14: 000055817f00ae28 R15: 00007ffc18db1030
> >>> 316659 Jul 19 02:14:20 z-u18 kernel: [  108.969661] kmemleak: Object 0xffff9221ae647000 (size 2048):
> >>> 316660 Jul 19 02:14:20 z-u18 kernel: [  108.969665] kmemleak:   comm "gpu_init_thread", pid 2687, jiffies 4294911321
> >>> 316661 Jul 19 02:14:20 z-u18 kernel: [  108.969669] kmemleak:   min_count = 1
> >>> 316662 Jul 19 02:14:20 z-u18 kernel: [  108.969671] kmemleak:   count = 0
> >>> 316663 Jul 19 02:14:20 z-u18 kernel: [  108.969672] kmemleak:   flags = 0x1
> >>> 316664 Jul 19 02:14:20 z-u18 kernel: [  108.969674] kmemleak:   checksum = 0
> >>> 316665 Jul 19 02:14:20 z-u18 kernel: [  108.969675] kmemleak:   backtrace:
> >>> 316666 Jul 19 02:14:20 z-u18 kernel: [  108.969677] cookie_alloc+0x1f/0x60
> >>> 316667 Jul 19 02:14:20 z-u18 kernel: [  108.969682] iommu_get_dma_cookie+0x17/0x30
> >>> 316668 Jul 19 02:14:20 z-u18 kernel: [  108.969685] intel_iommu_domain_alloc+0xa7/0xe0
> >>> 316669 Jul 19 02:14:20 z-u18 kernel: [  108.969689] iommu_group_alloc_default_domain+0x4c/0x160
> >>> 316670 Jul 19 02:14:20 z-u18 kernel: [  108.969695] iommu_probe_device+0xff/0x130
> >>> 316671 Jul 19 02:14:20 z-u18 kernel: [  108.969702] iommu_bus_notifier+0x7c/0xc0
> >>> 316672 Jul 19 02:14:20 z-u18 kernel: [  108.969708] blocking_notifier_call_chain+0x63/0x90
> >>> 316673 Jul 19 02:14:20 z-u18 kernel: [  108.969713] device_add+0x3f9/0x860
> >>> 316674 Jul 19 02:14:20 z-u18 kernel: [  108.969717] pci_device_add+0x2c3/0x6a0
> >>> 316675 Jul 19 02:14:20 z-u18 kernel: [  108.969723] pci_iov_add_virtfn+0x1b1/0x300
> >>> 316676 Jul 19 02:14:20 z-u18 kernel: [  108.969729] sriov_enable+0x254/0x410
> >>>
> >>>
> >>> ----------------------------------------------------------------------
> >>> BW
> >>> Pengju Zhou
> >>>
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Robin Murphy <robin.murphy@arm.com>
> >>>> Sent: Thursday, July 22, 2021 10:59 PM
> >>>> To: Zhou, Peng Ju <PengJu.Zhou@amd.com>; iommu@lists.linux-foundation.org
> >>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Wang, Yin
> >>>> <Yin.Wang@amd.com>; will@kernel.org
> >>>> Subject: Re: [PATCH] iommu/iova: kmemleak when disable SRIOV.
> >>>>
> >>>> On 2021-07-22 09:16, Peng Ju Zhou via iommu wrote:
> >>>>> the object iovad->head is allocated statically when enabling SRIOV
> >>>>> but freed dynamically when disabling SRIOV, which causes a kmemleak.
> >>>>> Change the allocation from static to dynamic.
> >>>>
> >>>> Thanks for the glimpse into the kind of weird and wonderful things
> >>>> people are doing to the IOVA allocator out-of-tree (the "holes"
> >>>> list sounds like an idea I also thought about a long time ago), but
> >>>> judging by the context this patch is clearly of no use to mainline
> >>>> ;)
> >>>>
> >>>> Robin.
> >>>>
> >>>>> Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
> >>>>> ---
> >>>>>     drivers/iommu/iova.c | 15 ++++++++-------
> >>>>>     include/linux/iova.h |  4 ++--
> >>>>>     2 files changed, 10 insertions(+), 9 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> >>>>> index 2371524796d3..505881d8d97f 100644
> >>>>> --- a/drivers/iommu/iova.c
> >>>>> +++ b/drivers/iommu/iova.c
> >>>>> @@ -26,6 +26,8 @@ static void free_iova_rcaches(struct iova_domain *iovad);
> >>>>>     static void fq_destroy_all_entries(struct iova_domain *iovad);
> >>>>>     static void fq_flush_timeout(struct timer_list *t);
> >>>>>     static void free_global_cached_iovas(struct iova_domain *iovad);
> >>>>> +static inline struct iova *alloc_and_init_iova(unsigned long pfn_lo,
> >>>>> +					       unsigned long pfn_hi);
> >>>>>
> >>>>>     void
> >>>>>     init_iova_domain(struct iova_domain *iovad, unsigned long granule,
> >>>>> @@ -47,17 +49,16 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
> >>>>>
> >>>>>     	INIT_LIST_HEAD(&iovad->holes);
> >>>>>
> >>>>> -	iovad->head.pfn_lo = 0;
> >>>>> -	iovad->head.pfn_hi = start_pfn;
> >>>>> -	rb_link_node(&iovad->head.node, NULL, &iovad->rbroot.rb_node);
> >>>>> -	rb_insert_color(&iovad->head.node, &iovad->rbroot);
> >>>>> -	list_add(&iovad->head.hole, &iovad->holes);
> >>>>> +	iovad->head = alloc_and_init_iova(0, start_pfn);
> >>>>> +	rb_link_node(&iovad->head->node, NULL, &iovad->rbroot.rb_node);
> >>>>> +	rb_insert_color(&iovad->head->node, &iovad->rbroot);
> >>>>> +	list_add(&iovad->head->hole, &iovad->holes);
> >>>>>
> >>>>>     	iovad->tail.pfn_lo = IOVA_ANCHOR;
> >>>>>     	iovad->tail.pfn_hi = IOVA_ANCHOR;
> >>>>>     	rb_link_node(&iovad->tail.node,
> >>>>> -		     &iovad->head.node,
> >>>>> -		     &iovad->head.node.rb_right);
> >>>>> +		     &iovad->head->node,
> >>>>> +		     &iovad->head->node.rb_right);
> >>>>>     	rb_insert_color(&iovad->tail.node, &iovad->rbroot);
> >>>>>
> >>>>>     	init_iova_rcaches(iovad);
> >>>>> diff --git a/include/linux/iova.h b/include/linux/iova.h
> >>>>> index 076eb6cfc613..553905ef41fe 100644
> >>>>> --- a/include/linux/iova.h
> >>>>> +++ b/include/linux/iova.h
> >>>>> @@ -81,7 +81,7 @@ struct iova_domain {
> >>>>>     						   have been finished */
> >>>>>
> >>>>>     	struct list_head holes;
> >>>>> -	struct iova	head, tail;		/* rbtree lookup anchors */
> >>>>> +	struct iova	*head, tail;		/* rbtree lookup anchors */
> >>>>>     	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
> >>>>>
> >>>>>     	iova_flush_cb	flush_cb;	/* Call-Back function to flush IOMMU
> >>>>> @@ -252,7 +252,7 @@ static inline void free_cpu_cached_iovas(unsigned int cpu,
> >>>>>
> >>>>>     static inline unsigned long iovad_start_pfn(struct iova_domain *iovad)
> >>>>>     {
> >>>>> -	return iovad->head.pfn_hi;
> >>>>> +	return iovad->head->pfn_hi;
> >>>>>     }
> >>>>>
> >>>>>     #endif
> >>>>>
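
One more note on the init path in my patch quoted above: alloc_and_init_iova() can fail, so the result needs a NULL check before it is linked into the rbtree. A sketch of the missing guard (hand-written and untested; proper error propagation would also require init_iova_domain() to stop returning void):

	iovad->head = alloc_and_init_iova(0, start_pfn);
	if (!iovad->head)
		return;		/* or report -ENOMEM to the caller */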
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Thread overview: 9+ messages
2021-07-22  8:16 [PATCH] iommu/iova: kmemleak when disable SRIOV Peng Ju Zhou via iommu
2021-07-22 14:58 ` Robin Murphy
2021-07-27  4:46   ` Zhou, Peng Ju via iommu
2021-07-27  8:51     ` Robin Murphy
2021-07-27 14:05       ` Zhou, Peng Ju via iommu
2021-07-27 14:23         ` Robin Murphy
2021-08-03  5:50           ` Zhou, Peng Ju via iommu [this message]
2021-08-03 13:29             ` Deucher, Alexander via iommu
2021-08-04  0:47               ` Zhou, Peng Ju via iommu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM8PR12MB54785D5E3F31EA4E887B02C3F8F09@DM8PR12MB5478.namprd12.prod.outlook.com \
    --to=iommu@lists.linux-foundation.org \
    --cc=Alexander.Deucher@amd.com \
    --cc=Emily.Deng@amd.com \
    --cc=HaiJun.Chang@amd.com \
    --cc=PengJu.Zhou@amd.com \
    --cc=Yin.Wang@amd.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.