Re: [RFC PATCH 1/2] swiotlb: Split up single swiotlb lock

From: Tianyu Lan <ltykernel@gmail.com>
To: Robin Murphy <robin.murphy@arm.com>,
	hch@infradead.org, m.szyprowski@samsung.com,
	michael.h.kelley@microsoft.com, kys@microsoft.com
Cc: parri.andrea@gmail.com, thomas.lendacky@amd.com,
	wei.liu@kernel.org, Andi Kleen <ak@linux.intel.com>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	linux-hyperv@vger.kernel.org, konrad.wilk@oracle.com,
	linux-kernel@vger.kernel.org, kirill.shutemov@intel.com,
	iommu@lists.linux-foundation.org, andi.kleen@intel.com,
	brijesh.singh@amd.com, vkuznets@redhat.com, hch@lst.de
Subject: Re: [RFC PATCH 1/2] swiotlb: Split up single swiotlb lock
Date: Thu, 28 Apr 2022 23:54:31 +0800
Message-ID: <8c390129-4fb3-dd7c-cf83-0451c405d0b9@gmail.com>
In-Reply-To: <e7b644f0-6c90-fe99-792d-75c38505dc54@arm.com>

On 4/28/2022 10:44 PM, Robin Murphy wrote:
> On 2022-04-28 15:14, Tianyu Lan wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> Traditionally swiotlb was not performance critical because it was only
>> used for slow devices. But in some setups, like TDX/SEV confidential
>> guests, all IO has to go through swiotlb. Currently swiotlb only has a
>> single lock. Under high IO load with multiple CPUs this can lead to
>> significant lock contention on the swiotlb lock.
>>
>> This patch splits the swiotlb into individual areas which have their
>> own lock. On a swiotlb map/allocate request, the io tlb buffer is
>> allocated evenly across the areas, and each allocation is freed back
>> to its associated area. This prepares for resolving the overhead
>> of a single spinlock shared among a device's queues. Each device may
>> have its own io tlb mem and bounce buffer pool.
>>
>> This idea is from Andi Kleen's patch
>> (https://github.com/intel/tdx/commit/4529b5784c141782c72ec9bd9a92df2b68cb7d45).
>> This patch reworks it so that it may also work for an individual
>> device's io tlb mem. The device driver may determine the number of
>> areas according to the number of device queues.
> 
> Rather than introduce this extra level of allocator complexity, how 
> about just dividing up the initial SWIOTLB allocation into multiple 
> io_tlb_mem instances?
> 
> Robin.

Agreed. Thanks for the suggestion. That will be more generic, and I will
update this in the next version.
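
For illustration only, a rough userspace sketch of the direction Robin
suggests: the bounce-buffer slots are divided into several independent
pools, each with its own lock, so CPUs that start in different pools do
not contend. This is not kernel code and not the planned patch. All names
(bounce_pool, slot_alloc, slot_free, NR_POOLS, SLOTS_PER_POOL) are
hypothetical, a C11 atomic_flag stands in for a kernel spinlock, and
choosing the starting pool by CPU number is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_POOLS       4	/* hypothetical: number of independent pools */
#define SLOTS_PER_POOL 8	/* hypothetical: bounce slots per pool */

/* One independent pool: its own lock and its own slot bookkeeping. */
struct bounce_pool {
	atomic_flag lock;	/* stand-in for a kernel spinlock */
	unsigned char used[SLOTS_PER_POOL];
};

static struct bounce_pool pools[NR_POOLS];

static void pool_lock(struct bounce_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void pool_unlock(struct bounce_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Reset every pool: unlocked, all slots free. */
static void pools_init(void)
{
	for (int i = 0; i < NR_POOLS; i++) {
		atomic_flag_clear(&pools[i].lock);
		for (int s = 0; s < SLOTS_PER_POOL; s++)
			pools[i].used[s] = 0;
	}
}

/*
 * Allocate one slot. Start at the pool selected by the CPU number and
 * fall back to the following pools when that one is full. Only one pool
 * lock is held at a time. Returns a global slot index, or -1 if all
 * pools are full.
 */
static int slot_alloc(unsigned int cpu)
{
	for (unsigned int n = 0; n < NR_POOLS; n++) {
		unsigned int idx = (cpu + n) % NR_POOLS;
		struct bounce_pool *p = &pools[idx];
		int found = -1;

		pool_lock(p);
		for (int s = 0; s < SLOTS_PER_POOL; s++) {
			if (!p->used[s]) {
				p->used[s] = 1;
				found = (int)idx * SLOTS_PER_POOL + s;
				break;
			}
		}
		pool_unlock(p);

		if (found >= 0)
			return found;
	}
	return -1;
}

/* Free a slot back to the pool it came from, taking only that pool's lock. */
static void slot_free(int slot)
{
	assert(slot >= 0 && slot < NR_POOLS * SLOTS_PER_POOL);
	struct bounce_pool *p = &pools[slot / SLOTS_PER_POOL];

	pool_lock(p);
	p->used[slot % SLOTS_PER_POOL] = 0;
	pool_unlock(p);
}
```

In this sketch a free never needs to know which CPU did the allocation,
because the owning pool is recovered from the slot index itself.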

Thanks.