All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
@ 2021-02-04  2:13 kernel test robot
  0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2021-02-04  2:13 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 16893 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20210203233709.19819-5-dongli.zhang@oracle.com>
References: <20210203233709.19819-5-dongli.zhang@oracle.com>
TO: Dongli Zhang <dongli.zhang@oracle.com>

Hi Dongli,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.11-rc6 next-20210125]
[cannot apply to swiotlb/linux-next xen-tip/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Dongli-Zhang/swiotlb-64-bit-DMA-buffer/20210204-074426
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
:::::: branch date: 2 hours ago
:::::: commit date: 2 hours ago
config: i386-randconfig-m021-20210202 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
kernel/dma/swiotlb.c:524 swiotlb_tbl_map_single() warn: impossible condition '(dma_mask == ((((64) == 64)) ?~0:((1 << (64)) - 1))) => (0-u32max == u64max)'

vim +524 kernel/dma/swiotlb.c

1b548f667c1487 lib/swiotlb.c           Jeremy Fitzhardinge   2008-12-16  506  
fc0021aa340af6 kernel/dma/swiotlb.c    Christoph Hellwig     2020-10-23  507  phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
fc0021aa340af6 kernel/dma/swiotlb.c    Christoph Hellwig     2020-10-23  508  		size_t mapping_size, size_t alloc_size,
fc0021aa340af6 kernel/dma/swiotlb.c    Christoph Hellwig     2020-10-23  509  		enum dma_data_direction dir, unsigned long attrs)
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  510  {
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  511  	dma_addr_t tbl_dma_addr;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  512  	unsigned long flags;
e05ed4d1fad9e7 lib/swiotlb.c           Alexander Duyck       2012-10-15  513  	phys_addr_t tlb_addr;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  514  	unsigned int nslots, stride, index, wrap;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  515  	int i;
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  516  	unsigned long mask;
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  517  	unsigned long offset_slots;
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  518  	unsigned long max_slots;
53b29c336830db kernel/dma/swiotlb.c    Dongli Zhang          2019-04-12  519  	unsigned long tmp_io_tlb_used;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  520  	unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  521  					      hwdev->bus_dma_limit);
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  522  	int type;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  523  
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03 @524  	if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  525  		type = SWIOTLB_HI;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  526  	else
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  527  		type = SWIOTLB_LO;
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  528  
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  529  	if (no_iotlb_memory[type])
ac2cbab21f318e lib/swiotlb.c           Yinghai Lu            2013-01-24  530  		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
ac2cbab21f318e lib/swiotlb.c           Yinghai Lu            2013-01-24  531  
d7b417fa08d118 lib/swiotlb.c           Tom Lendacky          2017-10-20  532  	if (mem_encrypt_active())
47e5d8f9ed34a5 kernel/dma/swiotlb.c    Thiago Jung Bauermann 2019-08-06  533  		pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
648babb7078c63 lib/swiotlb.c           Tom Lendacky          2017-07-17  534  
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  535  	if (mapping_size > alloc_size) {
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  536  		dev_warn_once(hwdev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  537  			      mapping_size, alloc_size);
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  538  		return (phys_addr_t)DMA_MAPPING_ERROR;
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  539  	}
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  540  
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  541  	tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  542  
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  543  	mask = dma_get_seg_boundary(hwdev);
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  544  
eb605a5754d050 lib/swiotlb.c           FUJITA Tomonori       2010-05-10  545  	tbl_dma_addr &= mask;
eb605a5754d050 lib/swiotlb.c           FUJITA Tomonori       2010-05-10  546  
eb605a5754d050 lib/swiotlb.c           FUJITA Tomonori       2010-05-10  547  	offset_slots = ALIGN(tbl_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
a5ddde4a558b3b lib/swiotlb.c           Ian Campbell          2008-12-16  548  
a5ddde4a558b3b lib/swiotlb.c           Ian Campbell          2008-12-16  549  	/*
a5ddde4a558b3b lib/swiotlb.c           Ian Campbell          2008-12-16  550  	 * Carefully handle integer overflow which can occur when mask == ~0UL.
a5ddde4a558b3b lib/swiotlb.c           Ian Campbell          2008-12-16  551  	 */
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  552  	max_slots = mask + 1
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  553  		    ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  554  		    : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  555  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  556  	/*
602d9858f07c72 lib/swiotlb.c           Nikita Yushchenko     2017-01-11  557  	 * For mappings greater than or equal to a page, we limit the stride
602d9858f07c72 lib/swiotlb.c           Nikita Yushchenko     2017-01-11  558  	 * (and hence alignment) to a page size.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  559  	 */
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  560  	nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  561  	if (alloc_size >= PAGE_SIZE)
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  562  		stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  563  	else
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  564  		stride = 1;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  565  
34814545890db6 lib/swiotlb.c           Eric Sesterhenn       2006-03-24  566  	BUG_ON(!nslots);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  567  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  568  	/*
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  569  	 * Find suitable number of IO TLB entries size that will fit this
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  570  	 * request and allocate a buffer from that IO TLB pool.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  571  	 */
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  572  	spin_lock_irqsave(&io_tlb_lock, flags);
60513ed06a4104 kernel/dma/swiotlb.c    Dongli Zhang          2019-01-18  573  
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  574  	if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
60513ed06a4104 kernel/dma/swiotlb.c    Dongli Zhang          2019-01-18  575  		goto not_found;
60513ed06a4104 kernel/dma/swiotlb.c    Dongli Zhang          2019-01-18  576  
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  577  	index = ALIGN(io_tlb_index[type], stride);
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  578  	if (index >= io_tlb_nslabs[type])
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  579  		index = 0;
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  580  	wrap = index;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  581  
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  582  	do {
a8522509200b46 lib/swiotlb.c           FUJITA Tomonori       2008-04-29  583  		while (iommu_is_span_boundary(index, nslots, offset_slots,
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  584  					      max_slots)) {
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  585  			index += stride;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  586  			if (index >= io_tlb_nslabs[type])
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  587  				index = 0;
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  588  			if (index == wrap)
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  589  				goto not_found;
681cc5cd3efbea lib/swiotlb.c           FUJITA Tomonori       2008-02-04  590  		}
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  591  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  592  		/*
a7133a15587b89 lib/swiotlb.c           Andrew Morton         2008-04-29  593  		 * If we find a slot that indicates we have 'nslots' number of
a7133a15587b89 lib/swiotlb.c           Andrew Morton         2008-04-29  594  		 * contiguous buffers, we allocate the buffers from that slot
a7133a15587b89 lib/swiotlb.c           Andrew Morton         2008-04-29  595  		 * and mark the entries as '0' indicating unavailable.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  596  		 */
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  597  		if (io_tlb_list[type][index] >= nslots) {
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  598  			int count = 0;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  599  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  600  			for (i = index; i < (int) (index + nslots); i++)
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  601  				io_tlb_list[type][i] = 0;
2e7a0d517bf27a kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  602  			for (i = index - 1;
2e7a0d517bf27a kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  603  			     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  604  			     io_tlb_list[type][i];
2e7a0d517bf27a kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  605  			     i--)
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  606  				io_tlb_list[type][i] = ++count;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  607  			tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  608  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  609  			/*
a7133a15587b89 lib/swiotlb.c           Andrew Morton         2008-04-29  610  			 * Update the indices to avoid searching in the next
a7133a15587b89 lib/swiotlb.c           Andrew Morton         2008-04-29  611  			 * round.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  612  			 */
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  613  			io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  614  					? (index + nslots) : 0);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  615  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  616  			goto found;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  617  		}
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  618  		index += stride;
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  619  		if (index >= io_tlb_nslabs[type])
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  620  			index = 0;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  621  	} while (index != wrap);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  622  
b15a3891c916f3 lib/swiotlb.c           Jan Beulich           2008-03-13  623  not_found:
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  624  	tmp_io_tlb_used = io_tlb_used[type];
53b29c336830db kernel/dma/swiotlb.c    Dongli Zhang          2019-04-12  625  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  626  	spin_unlock_irqrestore(&io_tlb_lock, flags);
d0bc0c2a31c950 lib/swiotlb.c           Christian König       2018-01-04  627  	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  628  		dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  629  			 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
9c106119f6538f kernel/dma/swiotlb.c    Arnd Bergmann         2019-06-17  630  	return (phys_addr_t)DMA_MAPPING_ERROR;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  631  found:
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  632  	io_tlb_used[type] += nslots;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  633  	spin_unlock_irqrestore(&io_tlb_lock, flags);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  634  
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  635  	/*
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  636  	 * Save away the mapping from the original address to the DMA address.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  637  	 * This is needed when we sync the memory.  Then we sync the buffer if
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  638  	 * needed.
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  639  	 */
bc40ac66988a77 lib/swiotlb.c           Becky Bruce           2008-12-22  640  	for (i = 0; i < nslots; i++)
bb094f75bbb66b kernel/dma/swiotlb.c    Dongli Zhang          2021-02-03  641  		io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
0443fa003fa199 lib/swiotlb.c           Alexander Duyck       2016-11-02  642  	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
0443fa003fa199 lib/swiotlb.c           Alexander Duyck       2016-11-02  643  	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
3fc1ca00653db6 kernel/dma/swiotlb.c    Lu Baolu              2019-09-06  644  		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  645  
e05ed4d1fad9e7 lib/swiotlb.c           Alexander Duyck       2012-10-15  646  	return tlb_addr;
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  647  }
^1da177e4c3f41 arch/ia64/lib/swiotlb.c Linus Torvalds        2005-04-16  648  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 38572 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
  2021-02-03 23:37   ` Dongli Zhang
                     ` (2 preceding siblings ...)
  (?)
@ 2021-02-04  4:10   ` kernel test robot
  -1 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2021-02-04  4:10 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 9990 bytes --]

Hi Dongli,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on powerpc/next]
[also build test WARNING on linus/master v5.11-rc6 next-20210125]
[cannot apply to swiotlb/linux-next xen-tip/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Dongli-Zhang/swiotlb-64-bit-DMA-buffer/20210204-074426
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-r036-20210202 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 275c6af7d7f1ed63a03d05b4484413e447133269)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/bb094f75bbb66bf693fc13a9990702b3a88ae93f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Dongli-Zhang/swiotlb-64-bit-DMA-buffer/20210204-074426
        git checkout bb094f75bbb66bf693fc13a9990702b3a88ae93f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> kernel/dma/swiotlb.c:524:33: warning: result of comparison of constant 18446744073709551615 with expression of type 'unsigned long' is always false [-Wtautological-constant-out-of-range-compare]
           if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                       ~~~~~~~~~~~~~~~~~^~~~~
   include/linux/compiler.h:69:3: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
            ^~~~
>> kernel/dma/swiotlb.c:524:33: warning: result of comparison of constant 18446744073709551615 with expression of type 'unsigned long' is always false [-Wtautological-constant-out-of-range-compare]
           if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
>> kernel/dma/swiotlb.c:524:33: warning: result of comparison of constant 18446744073709551615 with expression of type 'unsigned long' is always false [-Wtautological-constant-out-of-range-compare]
           if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:61: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                               ^~~~
   3 warnings generated.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for HOTPLUG_CPU
   Depends on SMP && (PPC_PSERIES || PPC_PMAC || PPC_POWERNV || FSL_SOC_BOOKE
   Selected by
   - PM_SLEEP_SMP && SMP && (ARCH_SUSPEND_POSSIBLE || ARCH_HIBERNATION_POSSIBLE && PM_SLEEP


vim +524 kernel/dma/swiotlb.c

   506	
   507	phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
   508			size_t mapping_size, size_t alloc_size,
   509			enum dma_data_direction dir, unsigned long attrs)
   510	{
   511		dma_addr_t tbl_dma_addr;
   512		unsigned long flags;
   513		phys_addr_t tlb_addr;
   514		unsigned int nslots, stride, index, wrap;
   515		int i;
   516		unsigned long mask;
   517		unsigned long offset_slots;
   518		unsigned long max_slots;
   519		unsigned long tmp_io_tlb_used;
   520		unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
   521						      hwdev->bus_dma_limit);
   522		int type;
   523	
 > 524		if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
   525			type = SWIOTLB_HI;
   526		else
   527			type = SWIOTLB_LO;
   528	
   529		if (no_iotlb_memory[type])
   530			panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
   531	
   532		if (mem_encrypt_active())
   533			pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
   534	
   535		if (mapping_size > alloc_size) {
   536			dev_warn_once(hwdev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
   537				      mapping_size, alloc_size);
   538			return (phys_addr_t)DMA_MAPPING_ERROR;
   539		}
   540	
   541		tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
   542	
   543		mask = dma_get_seg_boundary(hwdev);
   544	
   545		tbl_dma_addr &= mask;
   546	
   547		offset_slots = ALIGN(tbl_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
   548	
   549		/*
   550		 * Carefully handle integer overflow which can occur when mask == ~0UL.
   551		 */
   552		max_slots = mask + 1
   553			    ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT
   554			    : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
   555	
   556		/*
   557		 * For mappings greater than or equal to a page, we limit the stride
   558		 * (and hence alignment) to a page size.
   559		 */
   560		nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
   561		if (alloc_size >= PAGE_SIZE)
   562			stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
   563		else
   564			stride = 1;
   565	
   566		BUG_ON(!nslots);
   567	
   568		/*
   569		 * Find suitable number of IO TLB entries size that will fit this
   570		 * request and allocate a buffer from that IO TLB pool.
   571		 */
   572		spin_lock_irqsave(&io_tlb_lock, flags);
   573	
   574		if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
   575			goto not_found;
   576	
   577		index = ALIGN(io_tlb_index[type], stride);
   578		if (index >= io_tlb_nslabs[type])
   579			index = 0;
   580		wrap = index;
   581	
   582		do {
   583			while (iommu_is_span_boundary(index, nslots, offset_slots,
   584						      max_slots)) {
   585				index += stride;
   586				if (index >= io_tlb_nslabs[type])
   587					index = 0;
   588				if (index == wrap)
   589					goto not_found;
   590			}
   591	
   592			/*
   593			 * If we find a slot that indicates we have 'nslots' number of
   594			 * contiguous buffers, we allocate the buffers from that slot
   595			 * and mark the entries as '0' indicating unavailable.
   596			 */
   597			if (io_tlb_list[type][index] >= nslots) {
   598				int count = 0;
   599	
   600				for (i = index; i < (int) (index + nslots); i++)
   601					io_tlb_list[type][i] = 0;
   602				for (i = index - 1;
   603				     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
   604				     io_tlb_list[type][i];
   605				     i--)
   606					io_tlb_list[type][i] = ++count;
   607				tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
   608	
   609				/*
   610				 * Update the indices to avoid searching in the next
   611				 * round.
   612				 */
   613				io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
   614						? (index + nslots) : 0);
   615	
   616				goto found;
   617			}
   618			index += stride;
   619			if (index >= io_tlb_nslabs[type])
   620				index = 0;
   621		} while (index != wrap);
   622	
   623	not_found:
   624		tmp_io_tlb_used = io_tlb_used[type];
   625	
   626		spin_unlock_irqrestore(&io_tlb_lock, flags);
   627		if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
   628			dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
   629				 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
   630		return (phys_addr_t)DMA_MAPPING_ERROR;
   631	found:
   632		io_tlb_used[type] += nslots;
   633		spin_unlock_irqrestore(&io_tlb_lock, flags);
   634	
   635		/*
   636		 * Save away the mapping from the original address to the DMA address.
   637		 * This is needed when we sync the memory.  Then we sync the buffer if
   638		 * needed.
   639		 */
   640		for (i = 0; i < nslots; i++)
   641			io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
   642		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
   643		    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
   644			swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
   645	
   646		return tlb_addr;
   647	}
   648	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 29864 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
  2021-02-03 23:37 [PATCH RFC v1 0/6] swiotlb: 64-bit DMA buffer Dongli Zhang
  2021-02-03 23:37   ` Dongli Zhang
  (?)
@ 2021-02-03 23:37   ` Dongli Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2021-02-03 23:37 UTC (permalink / raw)
  To: dri-devel, intel-gfx, iommu, linux-mips, linux-mmc, linux-pci,
	linuxppc-dev, nouveau, x86, xen-devel
  Cc: linux-kernel, adrian.hunter, akpm, benh, bskeggs, bhelgaas, bp,
	boris.ostrovsky, hch, chris, daniel, airlied, hpa, mingo, mingo,
	jani.nikula, joonas.lahtinen, jgross, konrad.wilk, m.szyprowski,
	matthew.auld, mpe, rppt, paulus, peterz, robin.murphy,
	rodrigo.vivi, sstabellini, bauerman, tsbogend, tglx, ulf.hansson,
	joe.jin, thomas.lendacky

This patch is to enable the 64-bit swiotlb buffer.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy device. Considering most
devices nowadays support 64-bit DMA and IOMMU is available, the swiotlb is
not used for most of the times, except:

1. The xen PVM domain requires the DMA addresses to both (1) less than the
dma mask, and (2) continuous in machine address. Therefore, the 64-bit
device may still require swiotlb on PVM domain.

2. From source code the AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from swiotlb buffer even the device dma
mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;

Therefore, this patch introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, the swiotlb_tbl_map_single() allocates from the 2nd
64-bit buffer if the device DMA mask is
min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit).

The example to configure 64-bit swiotlb is "swiotlb=65536,524288,force"
or "swiotlb=,524288,force", where 524288 is the size of 64-bit buffer.

With the patch, the kernel will be able to allocate >4GB swiotlb buffer.
This patch is only for swiotlb, not including xen-swiotlb.

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/mips/cavium-octeon/dma-octeon.c         |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c            |   2 +-
 arch/powerpc/platforms/pseries/svm.c         |   2 +-
 arch/x86/kernel/pci-swiotlb.c                |   5 +-
 arch/x86/pci/sta2x11-fixup.c                 |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |   4 +-
 drivers/gpu/drm/i915/i915_scatterlist.h      |   2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c        |   2 +-
 drivers/mmc/host/sdhci.c                     |   2 +-
 drivers/pci/xen-pcifront.c                   |   2 +-
 drivers/xen/swiotlb-xen.c                    |   9 +-
 include/linux/swiotlb.h                      |  28 +-
 kernel/dma/swiotlb.c                         | 339 +++++++++++--------
 13 files changed, 238 insertions(+), 164 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..3480555d908a 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -245,6 +245,7 @@ void __init plat_swiotlb_setup(void)
 		panic("%s: Failed to allocate %zu bytes align=%lx\n",
 		      __func__, swiotlbsize, PAGE_SIZE);
 
-	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
+	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs,
+				  SWIOTLB_LO, 1) == -ENOMEM)
 		panic("Cannot allocate SWIOTLB buffer");
 }
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index fc7816126a40..88113318c53f 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,7 +20,7 @@ void __init swiotlb_detect_4g(void)
 static int __init check_swiotlb_enabled(void)
 {
 	if (ppc_swiotlb_enable)
-		swiotlb_print_info();
+		swiotlb_print_info(SWIOTLB_LO);
 	else
 		swiotlb_exit();
 
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 9f8842d0da1f..77910e4ffad8 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -52,7 +52,7 @@ void __init svm_swiotlb_init(void)
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, SWIOTLB_LO, false))
 		return;
 
 	if (io_tlb_start[SWIOTLB_LO])
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..950624fd95a4 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -67,12 +67,15 @@ void __init pci_swiotlb_init(void)
 
 void __init pci_swiotlb_late_init(void)
 {
+	int i;
+
 	/* An IOMMU turned us off. */
 	if (!swiotlb)
 		swiotlb_exit();
 	else {
 		printk(KERN_INFO "PCI-DMA: "
 		       "Using software bounce buffering for IO (SWIOTLB)\n");
-		swiotlb_print_info();
+		for (i = 0; i < swiotlb_nr; i++)
+			swiotlb_print_info(i);
 	}
 }
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index 7d2525691854..c440520b2055 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -57,7 +57,7 @@ static void sta2x11_new_instance(struct pci_dev *pdev)
 		int size = STA2X11_SWIOTLB_SIZE;
 		/* First instance: register your own swiotlb area */
 		dev_info(&pdev->dev, "Using SWIOTLB (size %i)\n", size);
-		if (swiotlb_late_init_with_default_size(size))
+		if (swiotlb_late_init_with_default_size(size), SWIOTLB_LO)
 			dev_emerg(&pdev->dev, "init swiotlb failed\n");
 	}
 	list_add(&instance->list, &sta2x11_instance_list);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index ad22f42541bd..947683f2e568 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -42,10 +42,10 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 
 	max_order = MAX_ORDER;
 #ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
+	if (swiotlb_nr_tbl(SWIOTLB_LO)) {
 		unsigned int max_segment;
 
-		max_segment = swiotlb_max_segment();
+		max_segment = swiotlb_max_segment(SWIOTLB_LO);
 		if (max_segment) {
 			max_segment = max_t(unsigned int, max_segment,
 					    PAGE_SIZE) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9cb26a224034..c63c7f6941f6 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -118,7 +118,7 @@ static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
 
 static inline unsigned int i915_sg_segment_size(void)
 {
-	unsigned int size = swiotlb_max_segment();
+	unsigned int size = swiotlb_max_segment(SWIOTLB_LO);
 
 	if (size == 0)
 		size = UINT_MAX;
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index a37bc3d7b38b..0919b207ac47 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	}
 
 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
-	need_swiotlb = !!swiotlb_nr_tbl();
+	need_swiotlb = !!swiotlb_nr_tbl(SWIOTLB_LO);
 #endif
 
 	ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 646823ddd317..1f7fb912d5a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -4582,7 +4582,7 @@ int sdhci_setup_host(struct sdhci_host *host)
 		mmc->max_segs = SDHCI_MAX_SEGS;
 	} else if (host->flags & SDHCI_USE_SDMA) {
 		mmc->max_segs = 1;
-		if (swiotlb_max_segment()) {
+		if (swiotlb_max_segment(SWIOTLB_LO)) {
 			unsigned int max_req_size = (1 << IO_TLB_SHIFT) *
 						IO_TLB_SEGSIZE;
 			mmc->max_req_size = min(mmc->max_req_size,
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index c6fe0cfec0f6..9509ed9b4126 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 
 	spin_unlock(&pcifront_dev_lock);
 
-	if (!err && !swiotlb_nr_tbl()) {
+	if (!err && !swiotlb_nr_tbl(SWIOTLB_LO)) {
 		err = pci_xen_swiotlb_init_late();
 		if (err)
 			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 3261880ad859..662638093542 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -184,7 +184,7 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
 	unsigned int repeat = 3;
 
-	xen_io_tlb_nslabs = swiotlb_nr_tbl();
+	xen_io_tlb_nslabs = swiotlb_nr_tbl(SWIOTLB_LO);
 retry:
 	bytes = xen_set_nslabs(xen_io_tlb_nslabs);
 	order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
@@ -245,16 +245,17 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	}
 	if (early) {
 		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
-			 verbose))
+					  SWIOTLB_LO, verbose))
 			panic("Cannot allocate SWIOTLB buffer");
 		rc = 0;
 	} else
-		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs);
+		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start,
+						xen_io_tlb_nslabs, SWIOTLB_LO);
 
 end:
 	xen_io_tlb_end = xen_io_tlb_start + bytes;
 	if (!rc)
-		swiotlb_set_max_segment(PAGE_SIZE);
+		swiotlb_set_max_segment(PAGE_SIZE, SWIOTLB_LO);
 
 	return rc;
 error:
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3d5980d77810..8ba45fddfb14 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -43,11 +43,14 @@ extern int swiotlb_nr;
 #define IO_TLB_DEFAULT_SIZE (64UL<<20)
 
 extern void swiotlb_init(int verbose);
-int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
-extern unsigned long swiotlb_nr_tbl(void);
+int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+			  enum swiotlb_t type, int verbose);
+extern unsigned long swiotlb_nr_tbl(enum swiotlb_t type);
 unsigned long swiotlb_size_or_default(void);
-extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
-extern int swiotlb_late_init_with_default_size(size_t default_size);
+extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs,
+				      enum swiotlb_t type);
+extern int swiotlb_late_init_with_default_size(size_t default_size,
+					       enum swiotlb_t type);
 extern void __init swiotlb_update_mem_attributes(void);
 
 /*
@@ -83,8 +86,13 @@ extern phys_addr_t io_tlb_start[], io_tlb_end[];
 
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
-	return paddr >= io_tlb_start[SWIOTLB_LO] &&
-	       paddr < io_tlb_end[SWIOTLB_LO];
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		if (paddr >= io_tlb_start[i] && paddr < io_tlb_end[i])
+			return true;
+
+	return false;
 }
 
 static inline int swiotlb_get_type(phys_addr_t paddr)
@@ -99,7 +107,7 @@ static inline int swiotlb_get_type(phys_addr_t paddr)
 }
 
 void __init swiotlb_exit(void);
-unsigned int swiotlb_max_segment(void);
+unsigned int swiotlb_max_segment(enum swiotlb_t type);
 size_t swiotlb_max_mapping_size(struct device *dev);
 bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long new_size);
@@ -112,7 +120,7 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 static inline void swiotlb_exit(void)
 {
 }
-static inline unsigned int swiotlb_max_segment(void)
+static inline unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
 	return 0;
 }
@@ -131,7 +139,7 @@ static inline void swiotlb_adjust_size(unsigned long new_size)
 }
 #endif /* CONFIG_SWIOTLB */
 
-extern void swiotlb_print_info(void);
-extern void swiotlb_set_max_segment(unsigned int);
+extern void swiotlb_print_info(enum swiotlb_t type);
+extern void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type);
 
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c91d3d2c3936..cd28db5b016a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -111,6 +111,11 @@ static int late_alloc;
 
 int swiotlb_nr = 1;
 
+static const char * const swiotlb_name[] = {
+	"Low Mem",
+	"High Mem"
+};
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -121,11 +126,25 @@ setup_io_tlb_npages(char *str)
 	}
 	if (*str == ',')
 		++str;
+
+	if (isdigit(*str)) {
+		io_tlb_nslabs[SWIOTLB_HI] = simple_strtoul(str, &str, 0);
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */
+		io_tlb_nslabs[SWIOTLB_HI] = ALIGN(io_tlb_nslabs[SWIOTLB_HI], IO_TLB_SEGSIZE);
+
+		swiotlb_nr = 2;
+	}
+
+	if (*str == ',')
+		++str;
+
 	if (!strcmp(str, "force")) {
 		swiotlb_force = SWIOTLB_FORCE;
 	} else if (!strcmp(str, "noforce")) {
 		swiotlb_force = SWIOTLB_NO_FORCE;
 		io_tlb_nslabs[SWIOTLB_LO] = 1;
+		if (swiotlb_nr > 1)
+			io_tlb_nslabs[SWIOTLB_HI] = 1;
 	}
 
 	return 0;
@@ -134,24 +153,24 @@ early_param("swiotlb", setup_io_tlb_npages);
 
 static bool no_iotlb_memory[SWIOTLB_MAX];
 
-unsigned long swiotlb_nr_tbl(void)
+unsigned long swiotlb_nr_tbl(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : io_tlb_nslabs[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : io_tlb_nslabs[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 
-unsigned int swiotlb_max_segment(void)
+unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : max_segment[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : max_segment[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_max_segment);
 
-void swiotlb_set_max_segment(unsigned int val)
+void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type)
 {
 	if (swiotlb_force == SWIOTLB_FORCE)
-		max_segment[SWIOTLB_LO] = 1;
+		max_segment[type] = 1;
 	else
-		max_segment[SWIOTLB_LO] = rounddown(val, PAGE_SIZE);
+		max_segment[type] = rounddown(val, PAGE_SIZE);
 }
 
 unsigned long swiotlb_size_or_default(void)
@@ -181,18 +200,18 @@ void __init swiotlb_adjust_size(unsigned long new_size)
 	}
 }
 
-void swiotlb_print_info(void)
+void swiotlb_print_info(enum swiotlb_t type)
 {
-	unsigned long bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	unsigned long bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
-	if (no_iotlb_memory[SWIOTLB_LO]) {
-		pr_warn("No low mem\n");
+	if (no_iotlb_memory[type]) {
+		pr_warn("No low mem with %s\n", swiotlb_name[type]);
 		return;
 	}
 
-	pr_info("mapped [mem %pa-%pa] (%luMB)\n",
-		&io_tlb_start[SWIOTLB_LO], &io_tlb_end[SWIOTLB_LO],
-		bytes >> 20);
+	pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+		swiotlb_name[type],
+		&io_tlb_start[type], &io_tlb_end[type], bytes >> 20);
 }
 
 /*
@@ -205,88 +224,104 @@ void __init swiotlb_update_mem_attributes(void)
 {
 	void *vaddr;
 	unsigned long bytes;
+	int i;
 
-	if (no_iotlb_memory[SWIOTLB_LO] || late_alloc)
-		return;
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (no_iotlb_memory[i] || late_alloc)
+			continue;
 
-	vaddr = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
-	bytes = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+		vaddr = phys_to_virt(io_tlb_start[i]);
+		bytes = PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT);
+		set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
+		memset(vaddr, 0, bytes);
+	}
 }
 
-int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
+int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+				 enum swiotlb_t type, int verbose)
 {
 	unsigned long i, bytes;
 	size_t alloc_size;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = __pa(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = __pa(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int));
-	io_tlb_list[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_list[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(int));
+	io_tlb_list[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_list[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t));
-	io_tlb_orig_addr[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(phys_addr_t));
+	io_tlb_orig_addr[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_orig_addr[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
 	if (verbose)
-		swiotlb_print_info();
+		swiotlb_print_info(type);
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 	return 0;
 }
 
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the DMA API.
- */
-void  __init
-swiotlb_init(int verbose)
+static void __init
+swiotlb_init_type(enum swiotlb_t type, int verbose)
 {
 	size_t default_size = IO_TLB_DEFAULT_SIZE;
 	unsigned char *vstart;
 	unsigned long bytes;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
+
+	if (type == SWIOTLB_LO)
+		vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
+	else
+		vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
 
-	/* Get IO TLB memory from the low pages */
-	vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO], verbose))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[type], type, verbose))
 		return;
 
-	if (io_tlb_start[SWIOTLB_LO]) {
-		memblock_free_early(io_tlb_start[SWIOTLB_LO],
-				    PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-		io_tlb_start[SWIOTLB_LO] = 0;
+	if (io_tlb_start[type]) {
+		memblock_free_early(io_tlb_start[type],
+				    PAGE_ALIGN(io_tlb_nslabs[type] << IO_TLB_SHIFT));
+		io_tlb_start[type] = 0;
 	}
-	pr_warn("Cannot allocate buffer");
-	no_iotlb_memory[SWIOTLB_LO] = true;
+	pr_warn("Cannot allocate buffer %s\n", swiotlb_name[type]);
+	no_iotlb_memory[type] = true;
+}
+
+/*
+ * Statically reserve bounce buffer space and initialize bounce buffer data
+ * structures for the software IO TLB used to implement the DMA API.
+ */
+void  __init
+swiotlb_init(int verbose)
+{
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		swiotlb_init_type(i, verbose);
 }
 
 /*
@@ -295,67 +330,68 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size, enum swiotlb_t type)
 {
-	unsigned long bytes, req_nslabs = io_tlb_nslabs[SWIOTLB_LO];
+	unsigned long bytes, req_nslabs = io_tlb_nslabs[type];
 	unsigned char *vstart = NULL;
 	unsigned int order;
 	int rc = 0;
+	gfp_t gfp_mask = (type == SWIOTLB_LO) ? GFP_DMA | __GFP_NOWARN :
+						GFP_KERNEL | __GFP_NOWARN;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	order = get_order(io_tlb_nslabs[type] << IO_TLB_SHIFT);
+	io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-		vstart = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
-						  order);
+		vstart = (void *)__get_free_pages(gfp_mask, order);
 		if (vstart)
 			break;
 		order--;
 	}
 
 	if (!vstart) {
-		io_tlb_nslabs[SWIOTLB_LO] = req_nslabs;
+		io_tlb_nslabs[type] = req_nslabs;
 		return -ENOMEM;
 	}
 	if (order != get_order(bytes)) {
 		pr_warn("only able to allocate %ld MB\n",
 			(PAGE_SIZE << order) >> 20);
-		io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
+		io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
 	}
-	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO]);
+	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[type], type);
 	if (rc)
 		free_pages((unsigned long)vstart, order);
 
 	return rc;
 }
 
-static void swiotlb_cleanup(void)
+static void swiotlb_cleanup(enum swiotlb_t type)
 {
-	io_tlb_end[SWIOTLB_LO] = 0;
-	io_tlb_start[SWIOTLB_LO] = 0;
-	io_tlb_nslabs[SWIOTLB_LO] = 0;
-	max_segment[SWIOTLB_LO] = 0;
+	io_tlb_end[type] = 0;
+	io_tlb_start[type] = 0;
+	io_tlb_nslabs[type] = 0;
+	max_segment[type] = 0;
 }
 
 int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs, enum swiotlb_t type)
 {
 	unsigned long i, bytes;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = virt_to_phys(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = virt_to_phys(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
 	memset(tlb, 0, bytes);
@@ -365,63 +401,67 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	io_tlb_list[SWIOTLB_LO] = (unsigned int *)__get_free_pages(GFP_KERNEL,
-			get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-	if (!io_tlb_list[SWIOTLB_LO])
+	io_tlb_list[type] = (unsigned int *)__get_free_pages(GFP_KERNEL,
+			get_order(io_tlb_nslabs[type] * sizeof(int)));
+	if (!io_tlb_list[type])
 		goto cleanup3;
 
-	io_tlb_orig_addr[SWIOTLB_LO] = (phys_addr_t *)
+	io_tlb_orig_addr[type] = (phys_addr_t *)
 		__get_free_pages(GFP_KERNEL,
-				 get_order(io_tlb_nslabs[SWIOTLB_LO] *
+				 get_order(io_tlb_nslabs[type] *
 					   sizeof(phys_addr_t)));
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	if (!io_tlb_orig_addr[type])
 		goto cleanup4;
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
-	swiotlb_print_info();
+	swiotlb_print_info(type);
 
 	late_alloc = 1;
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_list[SWIOTLB_LO], get_order(io_tlb_nslabs[SWIOTLB_LO] *
+	free_pages((unsigned long)io_tlb_list[type], get_order(io_tlb_nslabs[type] *
 	                                                 sizeof(int)));
-	io_tlb_list[SWIOTLB_LO] = NULL;
+	io_tlb_list[type] = NULL;
 cleanup3:
-	swiotlb_cleanup();
+	swiotlb_cleanup(type);
 	return -ENOMEM;
 }
 
 void __init swiotlb_exit(void)
 {
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
-		return;
+	int i;
 
-	if (late_alloc) {
-		free_pages((unsigned long)io_tlb_orig_addr[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		free_pages((unsigned long)io_tlb_list[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		free_pages((unsigned long)phys_to_virt(io_tlb_start[SWIOTLB_LO]),
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-	} else {
-		memblock_free_late(__pa(io_tlb_orig_addr[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		memblock_free_late(__pa(io_tlb_list[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		memblock_free_late(io_tlb_start[SWIOTLB_LO],
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (!io_tlb_orig_addr[i])
+			continue;
+
+		if (late_alloc) {
+			free_pages((unsigned long)io_tlb_orig_addr[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			free_pages((unsigned long)io_tlb_list[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(int)));
+			free_pages((unsigned long)phys_to_virt(io_tlb_start[i]),
+				   get_order(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		} else {
+			memblock_free_late(__pa(io_tlb_orig_addr[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			memblock_free_late(__pa(io_tlb_list[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(int)));
+			memblock_free_late(io_tlb_start[i],
+					   PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		}
+		swiotlb_cleanup(i);
 	}
-	swiotlb_cleanup();
 }
 
 /*
@@ -468,7 +508,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		size_t mapping_size, size_t alloc_size,
 		enum dma_data_direction dir, unsigned long attrs)
 {
-	dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[SWIOTLB_LO]);
+	dma_addr_t tbl_dma_addr;
 	unsigned long flags;
 	phys_addr_t tlb_addr;
 	unsigned int nslots, stride, index, wrap;
@@ -477,8 +517,16 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	unsigned long offset_slots;
 	unsigned long max_slots;
 	unsigned long tmp_io_tlb_used;
+	unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
+					      hwdev->bus_dma_limit);
+	int type;
 
-	if (no_iotlb_memory[SWIOTLB_LO])
+	if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
+		type = SWIOTLB_HI;
+	else
+		type = SWIOTLB_LO;
+
+	if (no_iotlb_memory[type])
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
 
 	if (mem_encrypt_active())
@@ -490,6 +538,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
+	tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
+
 	mask = dma_get_seg_boundary(hwdev);
 
 	tbl_dma_addr &= mask;
@@ -521,11 +571,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 */
 	spin_lock_irqsave(&io_tlb_lock, flags);
 
-	if (unlikely(nslots > io_tlb_nslabs[SWIOTLB_LO] - io_tlb_used[SWIOTLB_LO]))
+	if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
 		goto not_found;
 
-	index = ALIGN(io_tlb_index[SWIOTLB_LO], stride);
-	if (index >= io_tlb_nslabs[SWIOTLB_LO])
+	index = ALIGN(io_tlb_index[type], stride);
+	if (index >= io_tlb_nslabs[type])
 		index = 0;
 	wrap = index;
 
@@ -533,7 +583,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		while (iommu_is_span_boundary(index, nslots, offset_slots,
 					      max_slots)) {
 			index += stride;
-			if (index >= io_tlb_nslabs[SWIOTLB_LO])
+			if (index >= io_tlb_nslabs[type])
 				index = 0;
 			if (index == wrap)
 				goto not_found;
@@ -544,42 +594,42 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		 * contiguous buffers, we allocate the buffers from that slot
 		 * and mark the entries as '0' indicating unavailable.
 		 */
-		if (io_tlb_list[SWIOTLB_LO][index] >= nslots) {
+		if (io_tlb_list[type][index] >= nslots) {
 			int count = 0;
 
 			for (i = index; i < (int) (index + nslots); i++)
-				io_tlb_list[SWIOTLB_LO][i] = 0;
+				io_tlb_list[type][i] = 0;
 			for (i = index - 1;
 			     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-			     io_tlb_list[SWIOTLB_LO][i];
+			     io_tlb_list[type][i];
 			     i--)
-				io_tlb_list[SWIOTLB_LO][i] = ++count;
-			tlb_addr = io_tlb_start[SWIOTLB_LO] + (index << IO_TLB_SHIFT);
+				io_tlb_list[type][i] = ++count;
+			tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
 
 			/*
 			 * Update the indices to avoid searching in the next
 			 * round.
 			 */
-			io_tlb_index[SWIOTLB_LO] = ((index + nslots) < io_tlb_nslabs[SWIOTLB_LO]
+			io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
 					? (index + nslots) : 0);
 
 			goto found;
 		}
 		index += stride;
-		if (index >= io_tlb_nslabs[SWIOTLB_LO])
+		if (index >= io_tlb_nslabs[type])
 			index = 0;
 	} while (index != wrap);
 
 not_found:
-	tmp_io_tlb_used = io_tlb_used[SWIOTLB_LO];
+	tmp_io_tlb_used = io_tlb_used[type];
 
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-			 alloc_size, io_tlb_nslabs[SWIOTLB_LO], tmp_io_tlb_used);
+		dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+			 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
 	return (phys_addr_t)DMA_MAPPING_ERROR;
 found:
-	io_tlb_used[SWIOTLB_LO] += nslots;
+	io_tlb_used[type] += nslots;
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
 	/*
@@ -588,7 +638,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 * needed.
 	 */
 	for (i = 0; i < nslots; i++)
-		io_tlb_orig_addr[SWIOTLB_LO][index+i] = orig_addr + (i << IO_TLB_SHIFT);
+		io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
@@ -605,8 +655,9 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 {
 	unsigned long flags;
 	int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
@@ -625,14 +676,14 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	spin_lock_irqsave(&io_tlb_lock, flags);
 	{
 		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[SWIOTLB_LO][index + nslots] : 0);
+			 io_tlb_list[type][index + nslots] : 0);
 		/*
 		 * Step 1: return the slots to the free list, merging the
 		 * slots with superceeding slots
 		 */
 		for (i = index + nslots - 1; i >= index; i--) {
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
-			io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+			io_tlb_list[type][i] = ++count;
+			io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 		}
 		/*
 		 * Step 2: merge the returned slots with the preceding slots,
@@ -640,11 +691,11 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 		 */
 		for (i = index - 1;
 		     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-		     io_tlb_list[SWIOTLB_LO][i];
+		     io_tlb_list[type][i];
 		     i--)
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
+			io_tlb_list[type][i] = ++count;
 
-		io_tlb_used[SWIOTLB_LO] -= nslots;
+		io_tlb_used[type] -= nslots;
 	}
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -653,8 +704,9 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 			     size_t size, enum dma_data_direction dir,
 			     enum dma_sync_target target)
 {
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	if (orig_addr == INVALID_PHYS_ADDR)
 		return;
@@ -737,6 +789,15 @@ static int __init swiotlb_create_debugfs(void)
 	root = debugfs_create_dir("swiotlb", NULL);
 	debugfs_create_ulong("io_tlb_nslabs", 0400, root, &io_tlb_nslabs[SWIOTLB_LO]);
 	debugfs_create_ulong("io_tlb_used", 0400, root, &io_tlb_used[SWIOTLB_LO]);
+
+	if (swiotlb_nr == 1)
+		return 0;
+
+	debugfs_create_ulong("io_tlb_nslabs-highmem", 0400,
+			     root, &io_tlb_nslabs[SWIOTLB_HI]);
+	debugfs_create_ulong("io_tlb_used-highmem", 0400,
+			     root, &io_tlb_used[SWIOTLB_HI]);
+
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
@ 2021-02-03 23:37   ` Dongli Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2021-02-03 23:37 UTC (permalink / raw)
  To: dri-devel, intel-gfx, iommu, linux-mips, linux-mmc, linux-pci,
	linuxppc-dev, nouveau, x86, xen-devel
  Cc: ulf.hansson, airlied, joonas.lahtinen, adrian.hunter, paulus,
	hpa, mingo, m.szyprowski, sstabellini, joe.jin, hch, peterz,
	mingo, matthew.auld, bskeggs, thomas.lendacky, konrad.wilk,
	jani.nikula, bp, rodrigo.vivi, bhelgaas, boris.ostrovsky, chris,
	jgross, tsbogend, linux-kernel, tglx, bauerman, daniel, akpm,
	robin.murphy, rppt

This patch is to enable the 64-bit swiotlb buffer.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy device. Considering most
devices nowadays support 64-bit DMA and IOMMU is available, the swiotlb is
not used for most of the times, except:

1. The xen PVM domain requires the DMA addresses to both (1) less than the
dma mask, and (2) continuous in machine address. Therefore, the 64-bit
device may still require swiotlb on PVM domain.

2. From source code the AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from swiotlb buffer even the device dma
mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;

Therefore, this patch introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, the swiotlb_tbl_map_single() allocates from the 2nd
64-bit buffer if the device DMA mask is
min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit).

The example to configure 64-bit swiotlb is "swiotlb=65536,524288,force"
or "swiotlb=,524288,force", where 524288 is the size of 64-bit buffer.

With the patch, the kernel will be able to allocate >4GB swiotlb buffer.
This patch is only for swiotlb, not including xen-swiotlb.

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/mips/cavium-octeon/dma-octeon.c         |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c            |   2 +-
 arch/powerpc/platforms/pseries/svm.c         |   2 +-
 arch/x86/kernel/pci-swiotlb.c                |   5 +-
 arch/x86/pci/sta2x11-fixup.c                 |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |   4 +-
 drivers/gpu/drm/i915/i915_scatterlist.h      |   2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c        |   2 +-
 drivers/mmc/host/sdhci.c                     |   2 +-
 drivers/pci/xen-pcifront.c                   |   2 +-
 drivers/xen/swiotlb-xen.c                    |   9 +-
 include/linux/swiotlb.h                      |  28 +-
 kernel/dma/swiotlb.c                         | 339 +++++++++++--------
 13 files changed, 238 insertions(+), 164 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..3480555d908a 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -245,6 +245,7 @@ void __init plat_swiotlb_setup(void)
 		panic("%s: Failed to allocate %zu bytes align=%lx\n",
 		      __func__, swiotlbsize, PAGE_SIZE);
 
-	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
+	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs,
+				  SWIOTLB_LO, 1) == -ENOMEM)
 		panic("Cannot allocate SWIOTLB buffer");
 }
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index fc7816126a40..88113318c53f 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,7 +20,7 @@ void __init swiotlb_detect_4g(void)
 static int __init check_swiotlb_enabled(void)
 {
 	if (ppc_swiotlb_enable)
-		swiotlb_print_info();
+		swiotlb_print_info(SWIOTLB_LO);
 	else
 		swiotlb_exit();
 
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 9f8842d0da1f..77910e4ffad8 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -52,7 +52,7 @@ void __init svm_swiotlb_init(void)
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, SWIOTLB_LO, false))
 		return;
 
 	if (io_tlb_start[SWIOTLB_LO])
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..950624fd95a4 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -67,12 +67,15 @@ void __init pci_swiotlb_init(void)
 
 void __init pci_swiotlb_late_init(void)
 {
+	int i;
+
 	/* An IOMMU turned us off. */
 	if (!swiotlb)
 		swiotlb_exit();
 	else {
 		printk(KERN_INFO "PCI-DMA: "
 		       "Using software bounce buffering for IO (SWIOTLB)\n");
-		swiotlb_print_info();
+		for (i = 0; i < swiotlb_nr; i++)
+			swiotlb_print_info(i);
 	}
 }
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index 7d2525691854..c440520b2055 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -57,7 +57,7 @@ static void sta2x11_new_instance(struct pci_dev *pdev)
 		int size = STA2X11_SWIOTLB_SIZE;
 		/* First instance: register your own swiotlb area */
 		dev_info(&pdev->dev, "Using SWIOTLB (size %i)\n", size);
-		if (swiotlb_late_init_with_default_size(size))
+		if (swiotlb_late_init_with_default_size(size), SWIOTLB_LO)
 			dev_emerg(&pdev->dev, "init swiotlb failed\n");
 	}
 	list_add(&instance->list, &sta2x11_instance_list);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index ad22f42541bd..947683f2e568 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -42,10 +42,10 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 
 	max_order = MAX_ORDER;
 #ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
+	if (swiotlb_nr_tbl(SWIOTLB_LO)) {
 		unsigned int max_segment;
 
-		max_segment = swiotlb_max_segment();
+		max_segment = swiotlb_max_segment(SWIOTLB_LO);
 		if (max_segment) {
 			max_segment = max_t(unsigned int, max_segment,
 					    PAGE_SIZE) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9cb26a224034..c63c7f6941f6 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -118,7 +118,7 @@ static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
 
 static inline unsigned int i915_sg_segment_size(void)
 {
-	unsigned int size = swiotlb_max_segment();
+	unsigned int size = swiotlb_max_segment(SWIOTLB_LO);
 
 	if (size == 0)
 		size = UINT_MAX;
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index a37bc3d7b38b..0919b207ac47 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	}
 
 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
-	need_swiotlb = !!swiotlb_nr_tbl();
+	need_swiotlb = !!swiotlb_nr_tbl(SWIOTLB_LO);
 #endif
 
 	ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 646823ddd317..1f7fb912d5a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -4582,7 +4582,7 @@ int sdhci_setup_host(struct sdhci_host *host)
 		mmc->max_segs = SDHCI_MAX_SEGS;
 	} else if (host->flags & SDHCI_USE_SDMA) {
 		mmc->max_segs = 1;
-		if (swiotlb_max_segment()) {
+		if (swiotlb_max_segment(SWIOTLB_LO)) {
 			unsigned int max_req_size = (1 << IO_TLB_SHIFT) *
 						IO_TLB_SEGSIZE;
 			mmc->max_req_size = min(mmc->max_req_size,
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index c6fe0cfec0f6..9509ed9b4126 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 
 	spin_unlock(&pcifront_dev_lock);
 
-	if (!err && !swiotlb_nr_tbl()) {
+	if (!err && !swiotlb_nr_tbl(SWIOTLB_LO)) {
 		err = pci_xen_swiotlb_init_late();
 		if (err)
 			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 3261880ad859..662638093542 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -184,7 +184,7 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
 	unsigned int repeat = 3;
 
-	xen_io_tlb_nslabs = swiotlb_nr_tbl();
+	xen_io_tlb_nslabs = swiotlb_nr_tbl(SWIOTLB_LO);
 retry:
 	bytes = xen_set_nslabs(xen_io_tlb_nslabs);
 	order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
@@ -245,16 +245,17 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	}
 	if (early) {
 		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
-			 verbose))
+					  SWIOTLB_LO, verbose))
 			panic("Cannot allocate SWIOTLB buffer");
 		rc = 0;
 	} else
-		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs);
+		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start,
+						xen_io_tlb_nslabs, SWIOTLB_LO);
 
 end:
 	xen_io_tlb_end = xen_io_tlb_start + bytes;
 	if (!rc)
-		swiotlb_set_max_segment(PAGE_SIZE);
+		swiotlb_set_max_segment(PAGE_SIZE, SWIOTLB_LO);
 
 	return rc;
 error:
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3d5980d77810..8ba45fddfb14 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -43,11 +43,14 @@ extern int swiotlb_nr;
 #define IO_TLB_DEFAULT_SIZE (64UL<<20)
 
 extern void swiotlb_init(int verbose);
-int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
-extern unsigned long swiotlb_nr_tbl(void);
+int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+			  enum swiotlb_t type, int verbose);
+extern unsigned long swiotlb_nr_tbl(enum swiotlb_t type);
 unsigned long swiotlb_size_or_default(void);
-extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
-extern int swiotlb_late_init_with_default_size(size_t default_size);
+extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs,
+				      enum swiotlb_t type);
+extern int swiotlb_late_init_with_default_size(size_t default_size,
+					       enum swiotlb_t type);
 extern void __init swiotlb_update_mem_attributes(void);
 
 /*
@@ -83,8 +86,13 @@ extern phys_addr_t io_tlb_start[], io_tlb_end[];
 
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
-	return paddr >= io_tlb_start[SWIOTLB_LO] &&
-	       paddr < io_tlb_end[SWIOTLB_LO];
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		if (paddr >= io_tlb_start[i] && paddr < io_tlb_end[i])
+			return true;
+
+	return false;
 }
 
 static inline int swiotlb_get_type(phys_addr_t paddr)
@@ -99,7 +107,7 @@ static inline int swiotlb_get_type(phys_addr_t paddr)
 }
 
 void __init swiotlb_exit(void);
-unsigned int swiotlb_max_segment(void);
+unsigned int swiotlb_max_segment(enum swiotlb_t type);
 size_t swiotlb_max_mapping_size(struct device *dev);
 bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long new_size);
@@ -112,7 +120,7 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 static inline void swiotlb_exit(void)
 {
 }
-static inline unsigned int swiotlb_max_segment(void)
+static inline unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
 	return 0;
 }
@@ -131,7 +139,7 @@ static inline void swiotlb_adjust_size(unsigned long new_size)
 }
 #endif /* CONFIG_SWIOTLB */
 
-extern void swiotlb_print_info(void);
-extern void swiotlb_set_max_segment(unsigned int);
+extern void swiotlb_print_info(enum swiotlb_t type);
+extern void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type);
 
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c91d3d2c3936..cd28db5b016a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -111,6 +111,11 @@ static int late_alloc;
 
 int swiotlb_nr = 1;
 
+static const char * const swiotlb_name[] = {
+	"Low Mem",
+	"High Mem"
+};
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -121,11 +126,25 @@ setup_io_tlb_npages(char *str)
 	}
 	if (*str == ',')
 		++str;
+
+	if (isdigit(*str)) {
+		io_tlb_nslabs[SWIOTLB_HI] = simple_strtoul(str, &str, 0);
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */
+		io_tlb_nslabs[SWIOTLB_HI] = ALIGN(io_tlb_nslabs[SWIOTLB_HI], IO_TLB_SEGSIZE);
+
+		swiotlb_nr = 2;
+	}
+
+	if (*str == ',')
+		++str;
+
 	if (!strcmp(str, "force")) {
 		swiotlb_force = SWIOTLB_FORCE;
 	} else if (!strcmp(str, "noforce")) {
 		swiotlb_force = SWIOTLB_NO_FORCE;
 		io_tlb_nslabs[SWIOTLB_LO] = 1;
+		if (swiotlb_nr > 1)
+			io_tlb_nslabs[SWIOTLB_HI] = 1;
 	}
 
 	return 0;
@@ -134,24 +153,24 @@ early_param("swiotlb", setup_io_tlb_npages);
 
 static bool no_iotlb_memory[SWIOTLB_MAX];
 
-unsigned long swiotlb_nr_tbl(void)
+unsigned long swiotlb_nr_tbl(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : io_tlb_nslabs[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : io_tlb_nslabs[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 
-unsigned int swiotlb_max_segment(void)
+unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : max_segment[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : max_segment[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_max_segment);
 
-void swiotlb_set_max_segment(unsigned int val)
+void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type)
 {
 	if (swiotlb_force == SWIOTLB_FORCE)
-		max_segment[SWIOTLB_LO] = 1;
+		max_segment[type] = 1;
 	else
-		max_segment[SWIOTLB_LO] = rounddown(val, PAGE_SIZE);
+		max_segment[type] = rounddown(val, PAGE_SIZE);
 }
 
 unsigned long swiotlb_size_or_default(void)
@@ -181,18 +200,18 @@ void __init swiotlb_adjust_size(unsigned long new_size)
 	}
 }
 
-void swiotlb_print_info(void)
+void swiotlb_print_info(enum swiotlb_t type)
 {
-	unsigned long bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	unsigned long bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
-	if (no_iotlb_memory[SWIOTLB_LO]) {
-		pr_warn("No low mem\n");
+	if (no_iotlb_memory[type]) {
+		pr_warn("No low mem with %s\n", swiotlb_name[type]);
 		return;
 	}
 
-	pr_info("mapped [mem %pa-%pa] (%luMB)\n",
-		&io_tlb_start[SWIOTLB_LO], &io_tlb_end[SWIOTLB_LO],
-		bytes >> 20);
+	pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+		swiotlb_name[type],
+		&io_tlb_start[type], &io_tlb_end[type], bytes >> 20);
 }
 
 /*
@@ -205,88 +224,104 @@ void __init swiotlb_update_mem_attributes(void)
 {
 	void *vaddr;
 	unsigned long bytes;
+	int i;
 
-	if (no_iotlb_memory[SWIOTLB_LO] || late_alloc)
-		return;
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (no_iotlb_memory[i] || late_alloc)
+			continue;
 
-	vaddr = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
-	bytes = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+		vaddr = phys_to_virt(io_tlb_start[i]);
+		bytes = PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT);
+		set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
+		memset(vaddr, 0, bytes);
+	}
 }
 
-int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
+int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+				 enum swiotlb_t type, int verbose)
 {
 	unsigned long i, bytes;
 	size_t alloc_size;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = __pa(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = __pa(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int));
-	io_tlb_list[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_list[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(int));
+	io_tlb_list[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_list[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t));
-	io_tlb_orig_addr[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(phys_addr_t));
+	io_tlb_orig_addr[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_orig_addr[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
 	if (verbose)
-		swiotlb_print_info();
+		swiotlb_print_info(type);
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 	return 0;
 }
 
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the DMA API.
- */
-void  __init
-swiotlb_init(int verbose)
+static void __init
+swiotlb_init_type(enum swiotlb_t type, int verbose)
 {
 	size_t default_size = IO_TLB_DEFAULT_SIZE;
 	unsigned char *vstart;
 	unsigned long bytes;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
+
+	if (type == SWIOTLB_LO)
+		vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
+	else
+		vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
 
-	/* Get IO TLB memory from the low pages */
-	vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO], verbose))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[type], type, verbose))
 		return;
 
-	if (io_tlb_start[SWIOTLB_LO]) {
-		memblock_free_early(io_tlb_start[SWIOTLB_LO],
-				    PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-		io_tlb_start[SWIOTLB_LO] = 0;
+	if (io_tlb_start[type]) {
+		memblock_free_early(io_tlb_start[type],
+				    PAGE_ALIGN(io_tlb_nslabs[type] << IO_TLB_SHIFT));
+		io_tlb_start[type] = 0;
 	}
-	pr_warn("Cannot allocate buffer");
-	no_iotlb_memory[SWIOTLB_LO] = true;
+	pr_warn("Cannot allocate buffer %s\n", swiotlb_name[type]);
+	no_iotlb_memory[type] = true;
+}
+
+/*
+ * Statically reserve bounce buffer space and initialize bounce buffer data
+ * structures for the software IO TLB used to implement the DMA API.
+ */
+void  __init
+swiotlb_init(int verbose)
+{
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		swiotlb_init_type(i, verbose);
 }
 
 /*
@@ -295,67 +330,68 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size, enum swiotlb_t type)
 {
-	unsigned long bytes, req_nslabs = io_tlb_nslabs[SWIOTLB_LO];
+	unsigned long bytes, req_nslabs = io_tlb_nslabs[type];
 	unsigned char *vstart = NULL;
 	unsigned int order;
 	int rc = 0;
+	gfp_t gfp_mask = (type == SWIOTLB_LO) ? GFP_DMA | __GFP_NOWARN :
+						GFP_KERNEL | __GFP_NOWARN;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	order = get_order(io_tlb_nslabs[type] << IO_TLB_SHIFT);
+	io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-		vstart = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
-						  order);
+		vstart = (void *)__get_free_pages(gfp_mask, order);
 		if (vstart)
 			break;
 		order--;
 	}
 
 	if (!vstart) {
-		io_tlb_nslabs[SWIOTLB_LO] = req_nslabs;
+		io_tlb_nslabs[type] = req_nslabs;
 		return -ENOMEM;
 	}
 	if (order != get_order(bytes)) {
 		pr_warn("only able to allocate %ld MB\n",
 			(PAGE_SIZE << order) >> 20);
-		io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
+		io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
 	}
-	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO]);
+	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[type], type);
 	if (rc)
 		free_pages((unsigned long)vstart, order);
 
 	return rc;
 }
 
-static void swiotlb_cleanup(void)
+static void swiotlb_cleanup(enum swiotlb_t type)
 {
-	io_tlb_end[SWIOTLB_LO] = 0;
-	io_tlb_start[SWIOTLB_LO] = 0;
-	io_tlb_nslabs[SWIOTLB_LO] = 0;
-	max_segment[SWIOTLB_LO] = 0;
+	io_tlb_end[type] = 0;
+	io_tlb_start[type] = 0;
+	io_tlb_nslabs[type] = 0;
+	max_segment[type] = 0;
 }
 
 int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs, enum swiotlb_t type)
 {
 	unsigned long i, bytes;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = virt_to_phys(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = virt_to_phys(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
 	memset(tlb, 0, bytes);
@@ -365,63 +401,67 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	io_tlb_list[SWIOTLB_LO] = (unsigned int *)__get_free_pages(GFP_KERNEL,
-			get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-	if (!io_tlb_list[SWIOTLB_LO])
+	io_tlb_list[type] = (unsigned int *)__get_free_pages(GFP_KERNEL,
+			get_order(io_tlb_nslabs[type] * sizeof(int)));
+	if (!io_tlb_list[type])
 		goto cleanup3;
 
-	io_tlb_orig_addr[SWIOTLB_LO] = (phys_addr_t *)
+	io_tlb_orig_addr[type] = (phys_addr_t *)
 		__get_free_pages(GFP_KERNEL,
-				 get_order(io_tlb_nslabs[SWIOTLB_LO] *
+				 get_order(io_tlb_nslabs[type] *
 					   sizeof(phys_addr_t)));
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	if (!io_tlb_orig_addr[type])
 		goto cleanup4;
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
-	swiotlb_print_info();
+	swiotlb_print_info(type);
 
 	late_alloc = 1;
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_list[SWIOTLB_LO], get_order(io_tlb_nslabs[SWIOTLB_LO] *
+	free_pages((unsigned long)io_tlb_list[type], get_order(io_tlb_nslabs[type] *
 	                                                 sizeof(int)));
-	io_tlb_list[SWIOTLB_LO] = NULL;
+	io_tlb_list[type] = NULL;
 cleanup3:
-	swiotlb_cleanup();
+	swiotlb_cleanup(type);
 	return -ENOMEM;
 }
 
 void __init swiotlb_exit(void)
 {
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
-		return;
+	int i;
 
-	if (late_alloc) {
-		free_pages((unsigned long)io_tlb_orig_addr[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		free_pages((unsigned long)io_tlb_list[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		free_pages((unsigned long)phys_to_virt(io_tlb_start[SWIOTLB_LO]),
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-	} else {
-		memblock_free_late(__pa(io_tlb_orig_addr[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		memblock_free_late(__pa(io_tlb_list[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		memblock_free_late(io_tlb_start[SWIOTLB_LO],
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (!io_tlb_orig_addr[i])
+			continue;
+
+		if (late_alloc) {
+			free_pages((unsigned long)io_tlb_orig_addr[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			free_pages((unsigned long)io_tlb_list[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(int)));
+			free_pages((unsigned long)phys_to_virt(io_tlb_start[i]),
+				   get_order(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		} else {
+			memblock_free_late(__pa(io_tlb_orig_addr[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			memblock_free_late(__pa(io_tlb_list[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(int)));
+			memblock_free_late(io_tlb_start[i],
+					   PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		}
+		swiotlb_cleanup(i);
 	}
-	swiotlb_cleanup();
 }
 
 /*
@@ -468,7 +508,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		size_t mapping_size, size_t alloc_size,
 		enum dma_data_direction dir, unsigned long attrs)
 {
-	dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[SWIOTLB_LO]);
+	dma_addr_t tbl_dma_addr;
 	unsigned long flags;
 	phys_addr_t tlb_addr;
 	unsigned int nslots, stride, index, wrap;
@@ -477,8 +517,16 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	unsigned long offset_slots;
 	unsigned long max_slots;
 	unsigned long tmp_io_tlb_used;
+	unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
+					      hwdev->bus_dma_limit);
+	int type;
 
-	if (no_iotlb_memory[SWIOTLB_LO])
+	if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
+		type = SWIOTLB_HI;
+	else
+		type = SWIOTLB_LO;
+
+	if (no_iotlb_memory[type])
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
 
 	if (mem_encrypt_active())
@@ -490,6 +538,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
+	tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
+
 	mask = dma_get_seg_boundary(hwdev);
 
 	tbl_dma_addr &= mask;
@@ -521,11 +571,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 */
 	spin_lock_irqsave(&io_tlb_lock, flags);
 
-	if (unlikely(nslots > io_tlb_nslabs[SWIOTLB_LO] - io_tlb_used[SWIOTLB_LO]))
+	if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
 		goto not_found;
 
-	index = ALIGN(io_tlb_index[SWIOTLB_LO], stride);
-	if (index >= io_tlb_nslabs[SWIOTLB_LO])
+	index = ALIGN(io_tlb_index[type], stride);
+	if (index >= io_tlb_nslabs[type])
 		index = 0;
 	wrap = index;
 
@@ -533,7 +583,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		while (iommu_is_span_boundary(index, nslots, offset_slots,
 					      max_slots)) {
 			index += stride;
-			if (index >= io_tlb_nslabs[SWIOTLB_LO])
+			if (index >= io_tlb_nslabs[type])
 				index = 0;
 			if (index == wrap)
 				goto not_found;
@@ -544,42 +594,42 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		 * contiguous buffers, we allocate the buffers from that slot
 		 * and mark the entries as '0' indicating unavailable.
 		 */
-		if (io_tlb_list[SWIOTLB_LO][index] >= nslots) {
+		if (io_tlb_list[type][index] >= nslots) {
 			int count = 0;
 
 			for (i = index; i < (int) (index + nslots); i++)
-				io_tlb_list[SWIOTLB_LO][i] = 0;
+				io_tlb_list[type][i] = 0;
 			for (i = index - 1;
 			     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-			     io_tlb_list[SWIOTLB_LO][i];
+			     io_tlb_list[type][i];
 			     i--)
-				io_tlb_list[SWIOTLB_LO][i] = ++count;
-			tlb_addr = io_tlb_start[SWIOTLB_LO] + (index << IO_TLB_SHIFT);
+				io_tlb_list[type][i] = ++count;
+			tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
 
 			/*
 			 * Update the indices to avoid searching in the next
 			 * round.
 			 */
-			io_tlb_index[SWIOTLB_LO] = ((index + nslots) < io_tlb_nslabs[SWIOTLB_LO]
+			io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
 					? (index + nslots) : 0);
 
 			goto found;
 		}
 		index += stride;
-		if (index >= io_tlb_nslabs[SWIOTLB_LO])
+		if (index >= io_tlb_nslabs[type])
 			index = 0;
 	} while (index != wrap);
 
 not_found:
-	tmp_io_tlb_used = io_tlb_used[SWIOTLB_LO];
+	tmp_io_tlb_used = io_tlb_used[type];
 
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-			 alloc_size, io_tlb_nslabs[SWIOTLB_LO], tmp_io_tlb_used);
+		dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+			 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
 	return (phys_addr_t)DMA_MAPPING_ERROR;
 found:
-	io_tlb_used[SWIOTLB_LO] += nslots;
+	io_tlb_used[type] += nslots;
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
 	/*
@@ -588,7 +638,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 * needed.
 	 */
 	for (i = 0; i < nslots; i++)
-		io_tlb_orig_addr[SWIOTLB_LO][index+i] = orig_addr + (i << IO_TLB_SHIFT);
+		io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
@@ -605,8 +655,9 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 {
 	unsigned long flags;
 	int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
@@ -625,14 +676,14 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	spin_lock_irqsave(&io_tlb_lock, flags);
 	{
 		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[SWIOTLB_LO][index + nslots] : 0);
+			 io_tlb_list[type][index + nslots] : 0);
 		/*
 		 * Step 1: return the slots to the free list, merging the
 		 * slots with superceeding slots
 		 */
 		for (i = index + nslots - 1; i >= index; i--) {
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
-			io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+			io_tlb_list[type][i] = ++count;
+			io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 		}
 		/*
 		 * Step 2: merge the returned slots with the preceding slots,
@@ -640,11 +691,11 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 		 */
 		for (i = index - 1;
 		     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-		     io_tlb_list[SWIOTLB_LO][i];
+		     io_tlb_list[type][i];
 		     i--)
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
+			io_tlb_list[type][i] = ++count;
 
-		io_tlb_used[SWIOTLB_LO] -= nslots;
+		io_tlb_used[type] -= nslots;
 	}
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -653,8 +704,9 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 			     size_t size, enum dma_data_direction dir,
 			     enum dma_sync_target target)
 {
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	if (orig_addr == INVALID_PHYS_ADDR)
 		return;
@@ -737,6 +789,15 @@ static int __init swiotlb_create_debugfs(void)
 	root = debugfs_create_dir("swiotlb", NULL);
 	debugfs_create_ulong("io_tlb_nslabs", 0400, root, &io_tlb_nslabs[SWIOTLB_LO]);
 	debugfs_create_ulong("io_tlb_used", 0400, root, &io_tlb_used[SWIOTLB_LO]);
+
+	if (swiotlb_nr == 1)
+		return 0;
+
+	debugfs_create_ulong("io_tlb_nslabs-highmem", 0400,
+			     root, &io_tlb_nslabs[SWIOTLB_HI]);
+	debugfs_create_ulong("io_tlb_used-highmem", 0400,
+			     root, &io_tlb_used[SWIOTLB_HI]);
+
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
@ 2021-02-03 23:37   ` Dongli Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2021-02-03 23:37 UTC (permalink / raw)
  To: dri-devel, intel-gfx, iommu, linux-mips, linux-mmc, linux-pci,
	linuxppc-dev, nouveau, x86, xen-devel
  Cc: ulf.hansson, airlied, benh, joonas.lahtinen, adrian.hunter,
	paulus, hpa, mingo, sstabellini, mpe, joe.jin, hch, peterz,
	mingo, matthew.auld, bskeggs, thomas.lendacky, konrad.wilk,
	jani.nikula, bp, rodrigo.vivi, bhelgaas, boris.ostrovsky, chris,
	jgross, tsbogend, linux-kernel, tglx, daniel, akpm, robin.murphy,
	rppt

This patch is to enable the 64-bit swiotlb buffer.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy device. Considering most
devices nowadays support 64-bit DMA and IOMMU is available, the swiotlb is
not used for most of the times, except:

1. The xen PVM domain requires the DMA addresses to both (1) less than the
dma mask, and (2) continuous in machine address. Therefore, the 64-bit
device may still require swiotlb on PVM domain.

2. From source code the AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from swiotlb buffer even the device dma
mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;

Therefore, this patch introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, the swiotlb_tbl_map_single() allocates from the 2nd
64-bit buffer if the device DMA mask is
min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit).

The example to configure 64-bit swiotlb is "swiotlb=65536,524288,force"
or "swiotlb=,524288,force", where 524288 is the size of 64-bit buffer.

With the patch, the kernel will be able to allocate >4GB swiotlb buffer.
This patch is only for swiotlb, not including xen-swiotlb.

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/mips/cavium-octeon/dma-octeon.c         |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c            |   2 +-
 arch/powerpc/platforms/pseries/svm.c         |   2 +-
 arch/x86/kernel/pci-swiotlb.c                |   5 +-
 arch/x86/pci/sta2x11-fixup.c                 |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |   4 +-
 drivers/gpu/drm/i915/i915_scatterlist.h      |   2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c        |   2 +-
 drivers/mmc/host/sdhci.c                     |   2 +-
 drivers/pci/xen-pcifront.c                   |   2 +-
 drivers/xen/swiotlb-xen.c                    |   9 +-
 include/linux/swiotlb.h                      |  28 +-
 kernel/dma/swiotlb.c                         | 339 +++++++++++--------
 13 files changed, 238 insertions(+), 164 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..3480555d908a 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -245,6 +245,7 @@ void __init plat_swiotlb_setup(void)
 		panic("%s: Failed to allocate %zu bytes align=%lx\n",
 		      __func__, swiotlbsize, PAGE_SIZE);
 
-	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
+	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs,
+				  SWIOTLB_LO, 1) == -ENOMEM)
 		panic("Cannot allocate SWIOTLB buffer");
 }
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index fc7816126a40..88113318c53f 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,7 +20,7 @@ void __init swiotlb_detect_4g(void)
 static int __init check_swiotlb_enabled(void)
 {
 	if (ppc_swiotlb_enable)
-		swiotlb_print_info();
+		swiotlb_print_info(SWIOTLB_LO);
 	else
 		swiotlb_exit();
 
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 9f8842d0da1f..77910e4ffad8 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -52,7 +52,7 @@ void __init svm_swiotlb_init(void)
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, SWIOTLB_LO, false))
 		return;
 
 	if (io_tlb_start[SWIOTLB_LO])
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..950624fd95a4 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -67,12 +67,15 @@ void __init pci_swiotlb_init(void)
 
 void __init pci_swiotlb_late_init(void)
 {
+	int i;
+
 	/* An IOMMU turned us off. */
 	if (!swiotlb)
 		swiotlb_exit();
 	else {
 		printk(KERN_INFO "PCI-DMA: "
 		       "Using software bounce buffering for IO (SWIOTLB)\n");
-		swiotlb_print_info();
+		for (i = 0; i < swiotlb_nr; i++)
+			swiotlb_print_info(i);
 	}
 }
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index 7d2525691854..c440520b2055 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -57,7 +57,7 @@ static void sta2x11_new_instance(struct pci_dev *pdev)
 		int size = STA2X11_SWIOTLB_SIZE;
 		/* First instance: register your own swiotlb area */
 		dev_info(&pdev->dev, "Using SWIOTLB (size %i)\n", size);
-		if (swiotlb_late_init_with_default_size(size))
+		if (swiotlb_late_init_with_default_size(size), SWIOTLB_LO)
 			dev_emerg(&pdev->dev, "init swiotlb failed\n");
 	}
 	list_add(&instance->list, &sta2x11_instance_list);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index ad22f42541bd..947683f2e568 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -42,10 +42,10 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 
 	max_order = MAX_ORDER;
 #ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
+	if (swiotlb_nr_tbl(SWIOTLB_LO)) {
 		unsigned int max_segment;
 
-		max_segment = swiotlb_max_segment();
+		max_segment = swiotlb_max_segment(SWIOTLB_LO);
 		if (max_segment) {
 			max_segment = max_t(unsigned int, max_segment,
 					    PAGE_SIZE) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9cb26a224034..c63c7f6941f6 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -118,7 +118,7 @@ static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
 
 static inline unsigned int i915_sg_segment_size(void)
 {
-	unsigned int size = swiotlb_max_segment();
+	unsigned int size = swiotlb_max_segment(SWIOTLB_LO);
 
 	if (size == 0)
 		size = UINT_MAX;
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index a37bc3d7b38b..0919b207ac47 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	}
 
 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
-	need_swiotlb = !!swiotlb_nr_tbl();
+	need_swiotlb = !!swiotlb_nr_tbl(SWIOTLB_LO);
 #endif
 
 	ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 646823ddd317..1f7fb912d5a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -4582,7 +4582,7 @@ int sdhci_setup_host(struct sdhci_host *host)
 		mmc->max_segs = SDHCI_MAX_SEGS;
 	} else if (host->flags & SDHCI_USE_SDMA) {
 		mmc->max_segs = 1;
-		if (swiotlb_max_segment()) {
+		if (swiotlb_max_segment(SWIOTLB_LO)) {
 			unsigned int max_req_size = (1 << IO_TLB_SHIFT) *
 						IO_TLB_SEGSIZE;
 			mmc->max_req_size = min(mmc->max_req_size,
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index c6fe0cfec0f6..9509ed9b4126 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 
 	spin_unlock(&pcifront_dev_lock);
 
-	if (!err && !swiotlb_nr_tbl()) {
+	if (!err && !swiotlb_nr_tbl(SWIOTLB_LO)) {
 		err = pci_xen_swiotlb_init_late();
 		if (err)
 			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 3261880ad859..662638093542 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -184,7 +184,7 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
 	unsigned int repeat = 3;
 
-	xen_io_tlb_nslabs = swiotlb_nr_tbl();
+	xen_io_tlb_nslabs = swiotlb_nr_tbl(SWIOTLB_LO);
 retry:
 	bytes = xen_set_nslabs(xen_io_tlb_nslabs);
 	order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
@@ -245,16 +245,17 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	}
 	if (early) {
 		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
-			 verbose))
+					  SWIOTLB_LO, verbose))
 			panic("Cannot allocate SWIOTLB buffer");
 		rc = 0;
 	} else
-		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs);
+		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start,
+						xen_io_tlb_nslabs, SWIOTLB_LO);
 
 end:
 	xen_io_tlb_end = xen_io_tlb_start + bytes;
 	if (!rc)
-		swiotlb_set_max_segment(PAGE_SIZE);
+		swiotlb_set_max_segment(PAGE_SIZE, SWIOTLB_LO);
 
 	return rc;
 error:
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3d5980d77810..8ba45fddfb14 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -43,11 +43,14 @@ extern int swiotlb_nr;
 #define IO_TLB_DEFAULT_SIZE (64UL<<20)
 
 extern void swiotlb_init(int verbose);
-int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
-extern unsigned long swiotlb_nr_tbl(void);
+int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+			  enum swiotlb_t type, int verbose);
+extern unsigned long swiotlb_nr_tbl(enum swiotlb_t type);
 unsigned long swiotlb_size_or_default(void);
-extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
-extern int swiotlb_late_init_with_default_size(size_t default_size);
+extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs,
+				      enum swiotlb_t type);
+extern int swiotlb_late_init_with_default_size(size_t default_size,
+					       enum swiotlb_t type);
 extern void __init swiotlb_update_mem_attributes(void);
 
 /*
@@ -83,8 +86,13 @@ extern phys_addr_t io_tlb_start[], io_tlb_end[];
 
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
-	return paddr >= io_tlb_start[SWIOTLB_LO] &&
-	       paddr < io_tlb_end[SWIOTLB_LO];
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		if (paddr >= io_tlb_start[i] && paddr < io_tlb_end[i])
+			return true;
+
+	return false;
 }
 
 static inline int swiotlb_get_type(phys_addr_t paddr)
@@ -99,7 +107,7 @@ static inline int swiotlb_get_type(phys_addr_t paddr)
 }
 
 void __init swiotlb_exit(void);
-unsigned int swiotlb_max_segment(void);
+unsigned int swiotlb_max_segment(enum swiotlb_t type);
 size_t swiotlb_max_mapping_size(struct device *dev);
 bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long new_size);
@@ -112,7 +120,7 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 static inline void swiotlb_exit(void)
 {
 }
-static inline unsigned int swiotlb_max_segment(void)
+static inline unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
 	return 0;
 }
@@ -131,7 +139,7 @@ static inline void swiotlb_adjust_size(unsigned long new_size)
 }
 #endif /* CONFIG_SWIOTLB */
 
-extern void swiotlb_print_info(void);
-extern void swiotlb_set_max_segment(unsigned int);
+extern void swiotlb_print_info(enum swiotlb_t type);
+extern void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type);
 
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c91d3d2c3936..cd28db5b016a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -111,6 +111,11 @@ static int late_alloc;
 
 int swiotlb_nr = 1;
 
+static const char * const swiotlb_name[] = {
+	"Low Mem",
+	"High Mem"
+};
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -121,11 +126,25 @@ setup_io_tlb_npages(char *str)
 	}
 	if (*str == ',')
 		++str;
+
+	if (isdigit(*str)) {
+		io_tlb_nslabs[SWIOTLB_HI] = simple_strtoul(str, &str, 0);
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */
+		io_tlb_nslabs[SWIOTLB_HI] = ALIGN(io_tlb_nslabs[SWIOTLB_HI], IO_TLB_SEGSIZE);
+
+		swiotlb_nr = 2;
+	}
+
+	if (*str == ',')
+		++str;
+
 	if (!strcmp(str, "force")) {
 		swiotlb_force = SWIOTLB_FORCE;
 	} else if (!strcmp(str, "noforce")) {
 		swiotlb_force = SWIOTLB_NO_FORCE;
 		io_tlb_nslabs[SWIOTLB_LO] = 1;
+		if (swiotlb_nr > 1)
+			io_tlb_nslabs[SWIOTLB_HI] = 1;
 	}
 
 	return 0;
@@ -134,24 +153,24 @@ early_param("swiotlb", setup_io_tlb_npages);
 
 static bool no_iotlb_memory[SWIOTLB_MAX];
 
-unsigned long swiotlb_nr_tbl(void)
+unsigned long swiotlb_nr_tbl(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : io_tlb_nslabs[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : io_tlb_nslabs[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 
-unsigned int swiotlb_max_segment(void)
+unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : max_segment[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : max_segment[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_max_segment);
 
-void swiotlb_set_max_segment(unsigned int val)
+void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type)
 {
 	if (swiotlb_force == SWIOTLB_FORCE)
-		max_segment[SWIOTLB_LO] = 1;
+		max_segment[type] = 1;
 	else
-		max_segment[SWIOTLB_LO] = rounddown(val, PAGE_SIZE);
+		max_segment[type] = rounddown(val, PAGE_SIZE);
 }
 
 unsigned long swiotlb_size_or_default(void)
@@ -181,18 +200,18 @@ void __init swiotlb_adjust_size(unsigned long new_size)
 	}
 }
 
-void swiotlb_print_info(void)
+void swiotlb_print_info(enum swiotlb_t type)
 {
-	unsigned long bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	unsigned long bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
-	if (no_iotlb_memory[SWIOTLB_LO]) {
-		pr_warn("No low mem\n");
+	if (no_iotlb_memory[type]) {
+		pr_warn("No low mem with %s\n", swiotlb_name[type]);
 		return;
 	}
 
-	pr_info("mapped [mem %pa-%pa] (%luMB)\n",
-		&io_tlb_start[SWIOTLB_LO], &io_tlb_end[SWIOTLB_LO],
-		bytes >> 20);
+	pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+		swiotlb_name[type],
+		&io_tlb_start[type], &io_tlb_end[type], bytes >> 20);
 }
 
 /*
@@ -205,88 +224,104 @@ void __init swiotlb_update_mem_attributes(void)
 {
 	void *vaddr;
 	unsigned long bytes;
+	int i;
 
-	if (no_iotlb_memory[SWIOTLB_LO] || late_alloc)
-		return;
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (no_iotlb_memory[i] || late_alloc)
+			continue;
 
-	vaddr = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
-	bytes = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+		vaddr = phys_to_virt(io_tlb_start[i]);
+		bytes = PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT);
+		set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
+		memset(vaddr, 0, bytes);
+	}
 }
 
-int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
+int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+				 enum swiotlb_t type, int verbose)
 {
 	unsigned long i, bytes;
 	size_t alloc_size;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = __pa(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = __pa(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int));
-	io_tlb_list[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_list[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(int));
+	io_tlb_list[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_list[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t));
-	io_tlb_orig_addr[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(phys_addr_t));
+	io_tlb_orig_addr[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_orig_addr[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
 	if (verbose)
-		swiotlb_print_info();
+		swiotlb_print_info(type);
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 	return 0;
 }
 
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the DMA API.
- */
-void  __init
-swiotlb_init(int verbose)
+static void __init
+swiotlb_init_type(enum swiotlb_t type, int verbose)
 {
 	size_t default_size = IO_TLB_DEFAULT_SIZE;
 	unsigned char *vstart;
 	unsigned long bytes;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
+
+	if (type == SWIOTLB_LO)
+		vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
+	else
+		vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
 
-	/* Get IO TLB memory from the low pages */
-	vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO], verbose))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[type], type, verbose))
 		return;
 
-	if (io_tlb_start[SWIOTLB_LO]) {
-		memblock_free_early(io_tlb_start[SWIOTLB_LO],
-				    PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-		io_tlb_start[SWIOTLB_LO] = 0;
+	if (io_tlb_start[type]) {
+		memblock_free_early(io_tlb_start[type],
+				    PAGE_ALIGN(io_tlb_nslabs[type] << IO_TLB_SHIFT));
+		io_tlb_start[type] = 0;
 	}
-	pr_warn("Cannot allocate buffer");
-	no_iotlb_memory[SWIOTLB_LO] = true;
+	pr_warn("Cannot allocate buffer %s\n", swiotlb_name[type]);
+	no_iotlb_memory[type] = true;
+}
+
+/*
+ * Statically reserve bounce buffer space and initialize bounce buffer data
+ * structures for the software IO TLB used to implement the DMA API.
+ */
+void  __init
+swiotlb_init(int verbose)
+{
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		swiotlb_init_type(i, verbose);
 }
 
 /*
@@ -295,67 +330,68 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size, enum swiotlb_t type)
 {
-	unsigned long bytes, req_nslabs = io_tlb_nslabs[SWIOTLB_LO];
+	unsigned long bytes, req_nslabs = io_tlb_nslabs[type];
 	unsigned char *vstart = NULL;
 	unsigned int order;
 	int rc = 0;
+	gfp_t gfp_mask = (type == SWIOTLB_LO) ? GFP_DMA | __GFP_NOWARN :
+						GFP_KERNEL | __GFP_NOWARN;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	order = get_order(io_tlb_nslabs[type] << IO_TLB_SHIFT);
+	io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-		vstart = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
-						  order);
+		vstart = (void *)__get_free_pages(gfp_mask, order);
 		if (vstart)
 			break;
 		order--;
 	}
 
 	if (!vstart) {
-		io_tlb_nslabs[SWIOTLB_LO] = req_nslabs;
+		io_tlb_nslabs[type] = req_nslabs;
 		return -ENOMEM;
 	}
 	if (order != get_order(bytes)) {
 		pr_warn("only able to allocate %ld MB\n",
 			(PAGE_SIZE << order) >> 20);
-		io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
+		io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
 	}
-	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO]);
+	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[type], type);
 	if (rc)
 		free_pages((unsigned long)vstart, order);
 
 	return rc;
 }
 
-static void swiotlb_cleanup(void)
+static void swiotlb_cleanup(enum swiotlb_t type)
 {
-	io_tlb_end[SWIOTLB_LO] = 0;
-	io_tlb_start[SWIOTLB_LO] = 0;
-	io_tlb_nslabs[SWIOTLB_LO] = 0;
-	max_segment[SWIOTLB_LO] = 0;
+	io_tlb_end[type] = 0;
+	io_tlb_start[type] = 0;
+	io_tlb_nslabs[type] = 0;
+	max_segment[type] = 0;
 }
 
 int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs, enum swiotlb_t type)
 {
 	unsigned long i, bytes;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = virt_to_phys(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = virt_to_phys(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
 	memset(tlb, 0, bytes);
@@ -365,63 +401,67 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	io_tlb_list[SWIOTLB_LO] = (unsigned int *)__get_free_pages(GFP_KERNEL,
-			get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-	if (!io_tlb_list[SWIOTLB_LO])
+	io_tlb_list[type] = (unsigned int *)__get_free_pages(GFP_KERNEL,
+			get_order(io_tlb_nslabs[type] * sizeof(int)));
+	if (!io_tlb_list[type])
 		goto cleanup3;
 
-	io_tlb_orig_addr[SWIOTLB_LO] = (phys_addr_t *)
+	io_tlb_orig_addr[type] = (phys_addr_t *)
 		__get_free_pages(GFP_KERNEL,
-				 get_order(io_tlb_nslabs[SWIOTLB_LO] *
+				 get_order(io_tlb_nslabs[type] *
 					   sizeof(phys_addr_t)));
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	if (!io_tlb_orig_addr[type])
 		goto cleanup4;
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
-	swiotlb_print_info();
+	swiotlb_print_info(type);
 
 	late_alloc = 1;
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_list[SWIOTLB_LO], get_order(io_tlb_nslabs[SWIOTLB_LO] *
+	free_pages((unsigned long)io_tlb_list[type], get_order(io_tlb_nslabs[type] *
 	                                                 sizeof(int)));
-	io_tlb_list[SWIOTLB_LO] = NULL;
+	io_tlb_list[type] = NULL;
 cleanup3:
-	swiotlb_cleanup();
+	swiotlb_cleanup(type);
 	return -ENOMEM;
 }
 
 void __init swiotlb_exit(void)
 {
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
-		return;
+	int i;
 
-	if (late_alloc) {
-		free_pages((unsigned long)io_tlb_orig_addr[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		free_pages((unsigned long)io_tlb_list[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		free_pages((unsigned long)phys_to_virt(io_tlb_start[SWIOTLB_LO]),
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-	} else {
-		memblock_free_late(__pa(io_tlb_orig_addr[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		memblock_free_late(__pa(io_tlb_list[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		memblock_free_late(io_tlb_start[SWIOTLB_LO],
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (!io_tlb_orig_addr[i])
+			continue;
+
+		if (late_alloc) {
+			free_pages((unsigned long)io_tlb_orig_addr[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			free_pages((unsigned long)io_tlb_list[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(int)));
+			free_pages((unsigned long)phys_to_virt(io_tlb_start[i]),
+				   get_order(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		} else {
+			memblock_free_late(__pa(io_tlb_orig_addr[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			memblock_free_late(__pa(io_tlb_list[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(int)));
+			memblock_free_late(io_tlb_start[i],
+					   PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		}
+		swiotlb_cleanup(i);
 	}
-	swiotlb_cleanup();
 }
 
 /*
@@ -468,7 +508,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		size_t mapping_size, size_t alloc_size,
 		enum dma_data_direction dir, unsigned long attrs)
 {
-	dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[SWIOTLB_LO]);
+	dma_addr_t tbl_dma_addr;
 	unsigned long flags;
 	phys_addr_t tlb_addr;
 	unsigned int nslots, stride, index, wrap;
@@ -477,8 +517,16 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	unsigned long offset_slots;
 	unsigned long max_slots;
 	unsigned long tmp_io_tlb_used;
+	unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
+					      hwdev->bus_dma_limit);
+	int type;
 
-	if (no_iotlb_memory[SWIOTLB_LO])
+	if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
+		type = SWIOTLB_HI;
+	else
+		type = SWIOTLB_LO;
+
+	if (no_iotlb_memory[type])
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
 
 	if (mem_encrypt_active())
@@ -490,6 +538,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
+	tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
+
 	mask = dma_get_seg_boundary(hwdev);
 
 	tbl_dma_addr &= mask;
@@ -521,11 +571,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 */
 	spin_lock_irqsave(&io_tlb_lock, flags);
 
-	if (unlikely(nslots > io_tlb_nslabs[SWIOTLB_LO] - io_tlb_used[SWIOTLB_LO]))
+	if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
 		goto not_found;
 
-	index = ALIGN(io_tlb_index[SWIOTLB_LO], stride);
-	if (index >= io_tlb_nslabs[SWIOTLB_LO])
+	index = ALIGN(io_tlb_index[type], stride);
+	if (index >= io_tlb_nslabs[type])
 		index = 0;
 	wrap = index;
 
@@ -533,7 +583,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		while (iommu_is_span_boundary(index, nslots, offset_slots,
 					      max_slots)) {
 			index += stride;
-			if (index >= io_tlb_nslabs[SWIOTLB_LO])
+			if (index >= io_tlb_nslabs[type])
 				index = 0;
 			if (index == wrap)
 				goto not_found;
@@ -544,42 +594,42 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		 * contiguous buffers, we allocate the buffers from that slot
 		 * and mark the entries as '0' indicating unavailable.
 		 */
-		if (io_tlb_list[SWIOTLB_LO][index] >= nslots) {
+		if (io_tlb_list[type][index] >= nslots) {
 			int count = 0;
 
 			for (i = index; i < (int) (index + nslots); i++)
-				io_tlb_list[SWIOTLB_LO][i] = 0;
+				io_tlb_list[type][i] = 0;
 			for (i = index - 1;
 			     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-			     io_tlb_list[SWIOTLB_LO][i];
+			     io_tlb_list[type][i];
 			     i--)
-				io_tlb_list[SWIOTLB_LO][i] = ++count;
-			tlb_addr = io_tlb_start[SWIOTLB_LO] + (index << IO_TLB_SHIFT);
+				io_tlb_list[type][i] = ++count;
+			tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
 
 			/*
 			 * Update the indices to avoid searching in the next
 			 * round.
 			 */
-			io_tlb_index[SWIOTLB_LO] = ((index + nslots) < io_tlb_nslabs[SWIOTLB_LO]
+			io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
 					? (index + nslots) : 0);
 
 			goto found;
 		}
 		index += stride;
-		if (index >= io_tlb_nslabs[SWIOTLB_LO])
+		if (index >= io_tlb_nslabs[type])
 			index = 0;
 	} while (index != wrap);
 
 not_found:
-	tmp_io_tlb_used = io_tlb_used[SWIOTLB_LO];
+	tmp_io_tlb_used = io_tlb_used[type];
 
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-			 alloc_size, io_tlb_nslabs[SWIOTLB_LO], tmp_io_tlb_used);
+		dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+			 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
 	return (phys_addr_t)DMA_MAPPING_ERROR;
 found:
-	io_tlb_used[SWIOTLB_LO] += nslots;
+	io_tlb_used[type] += nslots;
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
 	/*
@@ -588,7 +638,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 * needed.
 	 */
 	for (i = 0; i < nslots; i++)
-		io_tlb_orig_addr[SWIOTLB_LO][index+i] = orig_addr + (i << IO_TLB_SHIFT);
+		io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
@@ -605,8 +655,9 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 {
 	unsigned long flags;
 	int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
@@ -625,14 +676,14 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	spin_lock_irqsave(&io_tlb_lock, flags);
 	{
 		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[SWIOTLB_LO][index + nslots] : 0);
+			 io_tlb_list[type][index + nslots] : 0);
 		/*
 		 * Step 1: return the slots to the free list, merging the
 		 * slots with superceeding slots
 		 */
 		for (i = index + nslots - 1; i >= index; i--) {
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
-			io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+			io_tlb_list[type][i] = ++count;
+			io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 		}
 		/*
 		 * Step 2: merge the returned slots with the preceding slots,
@@ -640,11 +691,11 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 		 */
 		for (i = index - 1;
 		     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-		     io_tlb_list[SWIOTLB_LO][i];
+		     io_tlb_list[type][i];
 		     i--)
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
+			io_tlb_list[type][i] = ++count;
 
-		io_tlb_used[SWIOTLB_LO] -= nslots;
+		io_tlb_used[type] -= nslots;
 	}
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -653,8 +704,9 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 			     size_t size, enum dma_data_direction dir,
 			     enum dma_sync_target target)
 {
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	if (orig_addr == INVALID_PHYS_ADDR)
 		return;
@@ -737,6 +789,15 @@ static int __init swiotlb_create_debugfs(void)
 	root = debugfs_create_dir("swiotlb", NULL);
 	debugfs_create_ulong("io_tlb_nslabs", 0400, root, &io_tlb_nslabs[SWIOTLB_LO]);
 	debugfs_create_ulong("io_tlb_used", 0400, root, &io_tlb_used[SWIOTLB_LO]);
+
+	if (swiotlb_nr == 1)
+		return 0;
+
+	debugfs_create_ulong("io_tlb_nslabs-highmem", 0400,
+			     root, &io_tlb_nslabs[SWIOTLB_HI]);
+	debugfs_create_ulong("io_tlb_used-highmem", 0400,
+			     root, &io_tlb_used[SWIOTLB_HI]);
+
 	return 0;
 }
 
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb
@ 2021-02-03 23:37   ` Dongli Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Dongli Zhang @ 2021-02-03 23:37 UTC (permalink / raw)
  To: dri-devel, intel-gfx, iommu, linux-mips, linux-mmc, linux-pci,
	linuxppc-dev, nouveau, x86, xen-devel
  Cc: ulf.hansson, airlied, adrian.hunter, paulus, hpa, mingo,
	m.szyprowski, sstabellini, mpe, joe.jin, hch, peterz, mingo,
	matthew.auld, bskeggs, thomas.lendacky, konrad.wilk, bp,
	rodrigo.vivi, bhelgaas, boris.ostrovsky, chris, jgross, tsbogend,
	linux-kernel, tglx, bauerman, akpm, robin.murphy, rppt

This patch is to enable the 64-bit swiotlb buffer.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy device. Considering most
devices nowadays support 64-bit DMA and IOMMU is available, the swiotlb is
not used for most of the times, except:

1. The xen PVM domain requires the DMA addresses to both (1) less than the
dma mask, and (2) continuous in machine address. Therefore, the 64-bit
device may still require swiotlb on PVM domain.

2. From source code the AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from swiotlb buffer even the device dma
mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;

Therefore, this patch introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, the swiotlb_tbl_map_single() allocates from the 2nd
64-bit buffer if the device DMA mask is
min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit).

The example to configure 64-bit swiotlb is "swiotlb=65536,524288,force"
or "swiotlb=,524288,force", where 524288 is the size of 64-bit buffer.

With the patch, the kernel will be able to allocate >4GB swiotlb buffer.
This patch is only for swiotlb, not including xen-swiotlb.

Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/mips/cavium-octeon/dma-octeon.c         |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c            |   2 +-
 arch/powerpc/platforms/pseries/svm.c         |   2 +-
 arch/x86/kernel/pci-swiotlb.c                |   5 +-
 arch/x86/pci/sta2x11-fixup.c                 |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |   4 +-
 drivers/gpu/drm/i915/i915_scatterlist.h      |   2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c        |   2 +-
 drivers/mmc/host/sdhci.c                     |   2 +-
 drivers/pci/xen-pcifront.c                   |   2 +-
 drivers/xen/swiotlb-xen.c                    |   9 +-
 include/linux/swiotlb.h                      |  28 +-
 kernel/dma/swiotlb.c                         | 339 +++++++++++--------
 13 files changed, 238 insertions(+), 164 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..3480555d908a 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -245,6 +245,7 @@ void __init plat_swiotlb_setup(void)
 		panic("%s: Failed to allocate %zu bytes align=%lx\n",
 		      __func__, swiotlbsize, PAGE_SIZE);
 
-	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
+	if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs,
+				  SWIOTLB_LO, 1) == -ENOMEM)
 		panic("Cannot allocate SWIOTLB buffer");
 }
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index fc7816126a40..88113318c53f 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,7 +20,7 @@ void __init swiotlb_detect_4g(void)
 static int __init check_swiotlb_enabled(void)
 {
 	if (ppc_swiotlb_enable)
-		swiotlb_print_info();
+		swiotlb_print_info(SWIOTLB_LO);
 	else
 		swiotlb_exit();
 
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 9f8842d0da1f..77910e4ffad8 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -52,7 +52,7 @@ void __init svm_swiotlb_init(void)
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, SWIOTLB_LO, false))
 		return;
 
 	if (io_tlb_start[SWIOTLB_LO])
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..950624fd95a4 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -67,12 +67,15 @@ void __init pci_swiotlb_init(void)
 
 void __init pci_swiotlb_late_init(void)
 {
+	int i;
+
 	/* An IOMMU turned us off. */
 	if (!swiotlb)
 		swiotlb_exit();
 	else {
 		printk(KERN_INFO "PCI-DMA: "
 		       "Using software bounce buffering for IO (SWIOTLB)\n");
-		swiotlb_print_info();
+		for (i = 0; i < swiotlb_nr; i++)
+			swiotlb_print_info(i);
 	}
 }
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index 7d2525691854..c440520b2055 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -57,7 +57,7 @@ static void sta2x11_new_instance(struct pci_dev *pdev)
 		int size = STA2X11_SWIOTLB_SIZE;
 		/* First instance: register your own swiotlb area */
 		dev_info(&pdev->dev, "Using SWIOTLB (size %i)\n", size);
-		if (swiotlb_late_init_with_default_size(size))
+		if (swiotlb_late_init_with_default_size(size), SWIOTLB_LO)
 			dev_emerg(&pdev->dev, "init swiotlb failed\n");
 	}
 	list_add(&instance->list, &sta2x11_instance_list);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index ad22f42541bd..947683f2e568 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -42,10 +42,10 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 
 	max_order = MAX_ORDER;
 #ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
+	if (swiotlb_nr_tbl(SWIOTLB_LO)) {
 		unsigned int max_segment;
 
-		max_segment = swiotlb_max_segment();
+		max_segment = swiotlb_max_segment(SWIOTLB_LO);
 		if (max_segment) {
 			max_segment = max_t(unsigned int, max_segment,
 					    PAGE_SIZE) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9cb26a224034..c63c7f6941f6 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -118,7 +118,7 @@ static inline unsigned int i915_sg_page_sizes(struct scatterlist *sg)
 
 static inline unsigned int i915_sg_segment_size(void)
 {
-	unsigned int size = swiotlb_max_segment();
+	unsigned int size = swiotlb_max_segment(SWIOTLB_LO);
 
 	if (size == 0)
 		size = UINT_MAX;
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index a37bc3d7b38b..0919b207ac47 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
 	}
 
 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
-	need_swiotlb = !!swiotlb_nr_tbl();
+	need_swiotlb = !!swiotlb_nr_tbl(SWIOTLB_LO);
 #endif
 
 	ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 646823ddd317..1f7fb912d5a9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -4582,7 +4582,7 @@ int sdhci_setup_host(struct sdhci_host *host)
 		mmc->max_segs = SDHCI_MAX_SEGS;
 	} else if (host->flags & SDHCI_USE_SDMA) {
 		mmc->max_segs = 1;
-		if (swiotlb_max_segment()) {
+		if (swiotlb_max_segment(SWIOTLB_LO)) {
 			unsigned int max_req_size = (1 << IO_TLB_SHIFT) *
 						IO_TLB_SEGSIZE;
 			mmc->max_req_size = min(mmc->max_req_size,
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index c6fe0cfec0f6..9509ed9b4126 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 
 	spin_unlock(&pcifront_dev_lock);
 
-	if (!err && !swiotlb_nr_tbl()) {
+	if (!err && !swiotlb_nr_tbl(SWIOTLB_LO)) {
 		err = pci_xen_swiotlb_init_late();
 		if (err)
 			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 3261880ad859..662638093542 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -184,7 +184,7 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
 	unsigned int repeat = 3;
 
-	xen_io_tlb_nslabs = swiotlb_nr_tbl();
+	xen_io_tlb_nslabs = swiotlb_nr_tbl(SWIOTLB_LO);
 retry:
 	bytes = xen_set_nslabs(xen_io_tlb_nslabs);
 	order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
@@ -245,16 +245,17 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 	}
 	if (early) {
 		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
-			 verbose))
+					  SWIOTLB_LO, verbose))
 			panic("Cannot allocate SWIOTLB buffer");
 		rc = 0;
 	} else
-		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs);
+		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start,
+						xen_io_tlb_nslabs, SWIOTLB_LO);
 
 end:
 	xen_io_tlb_end = xen_io_tlb_start + bytes;
 	if (!rc)
-		swiotlb_set_max_segment(PAGE_SIZE);
+		swiotlb_set_max_segment(PAGE_SIZE, SWIOTLB_LO);
 
 	return rc;
 error:
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3d5980d77810..8ba45fddfb14 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -43,11 +43,14 @@ extern int swiotlb_nr;
 #define IO_TLB_DEFAULT_SIZE (64UL<<20)
 
 extern void swiotlb_init(int verbose);
-int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
-extern unsigned long swiotlb_nr_tbl(void);
+int swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+			  enum swiotlb_t type, int verbose);
+extern unsigned long swiotlb_nr_tbl(enum swiotlb_t type);
 unsigned long swiotlb_size_or_default(void);
-extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
-extern int swiotlb_late_init_with_default_size(size_t default_size);
+extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs,
+				      enum swiotlb_t type);
+extern int swiotlb_late_init_with_default_size(size_t default_size,
+					       enum swiotlb_t type);
 extern void __init swiotlb_update_mem_attributes(void);
 
 /*
@@ -83,8 +86,13 @@ extern phys_addr_t io_tlb_start[], io_tlb_end[];
 
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
-	return paddr >= io_tlb_start[SWIOTLB_LO] &&
-	       paddr < io_tlb_end[SWIOTLB_LO];
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		if (paddr >= io_tlb_start[i] && paddr < io_tlb_end[i])
+			return true;
+
+	return false;
 }
 
 static inline int swiotlb_get_type(phys_addr_t paddr)
@@ -99,7 +107,7 @@ static inline int swiotlb_get_type(phys_addr_t paddr)
 }
 
 void __init swiotlb_exit(void);
-unsigned int swiotlb_max_segment(void);
+unsigned int swiotlb_max_segment(enum swiotlb_t type);
 size_t swiotlb_max_mapping_size(struct device *dev);
 bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long new_size);
@@ -112,7 +120,7 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 static inline void swiotlb_exit(void)
 {
 }
-static inline unsigned int swiotlb_max_segment(void)
+static inline unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
 	return 0;
 }
@@ -131,7 +139,7 @@ static inline void swiotlb_adjust_size(unsigned long new_size)
 }
 #endif /* CONFIG_SWIOTLB */
 
-extern void swiotlb_print_info(void);
-extern void swiotlb_set_max_segment(unsigned int);
+extern void swiotlb_print_info(enum swiotlb_t type);
+extern void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type);
 
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c91d3d2c3936..cd28db5b016a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -111,6 +111,11 @@ static int late_alloc;
 
 int swiotlb_nr = 1;
 
+static const char * const swiotlb_name[] = {
+	"Low Mem",
+	"High Mem"
+};
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -121,11 +126,25 @@ setup_io_tlb_npages(char *str)
 	}
 	if (*str == ',')
 		++str;
+
+	if (isdigit(*str)) {
+		io_tlb_nslabs[SWIOTLB_HI] = simple_strtoul(str, &str, 0);
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */
+		io_tlb_nslabs[SWIOTLB_HI] = ALIGN(io_tlb_nslabs[SWIOTLB_HI], IO_TLB_SEGSIZE);
+
+		swiotlb_nr = 2;
+	}
+
+	if (*str == ',')
+		++str;
+
 	if (!strcmp(str, "force")) {
 		swiotlb_force = SWIOTLB_FORCE;
 	} else if (!strcmp(str, "noforce")) {
 		swiotlb_force = SWIOTLB_NO_FORCE;
 		io_tlb_nslabs[SWIOTLB_LO] = 1;
+		if (swiotlb_nr > 1)
+			io_tlb_nslabs[SWIOTLB_HI] = 1;
 	}
 
 	return 0;
@@ -134,24 +153,24 @@ early_param("swiotlb", setup_io_tlb_npages);
 
 static bool no_iotlb_memory[SWIOTLB_MAX];
 
-unsigned long swiotlb_nr_tbl(void)
+unsigned long swiotlb_nr_tbl(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : io_tlb_nslabs[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : io_tlb_nslabs[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 
-unsigned int swiotlb_max_segment(void)
+unsigned int swiotlb_max_segment(enum swiotlb_t type)
 {
-	return unlikely(no_iotlb_memory[SWIOTLB_LO]) ? 0 : max_segment[SWIOTLB_LO];
+	return unlikely(no_iotlb_memory[type]) ? 0 : max_segment[type];
 }
 EXPORT_SYMBOL_GPL(swiotlb_max_segment);
 
-void swiotlb_set_max_segment(unsigned int val)
+void swiotlb_set_max_segment(unsigned int val, enum swiotlb_t type)
 {
 	if (swiotlb_force == SWIOTLB_FORCE)
-		max_segment[SWIOTLB_LO] = 1;
+		max_segment[type] = 1;
 	else
-		max_segment[SWIOTLB_LO] = rounddown(val, PAGE_SIZE);
+		max_segment[type] = rounddown(val, PAGE_SIZE);
 }
 
 unsigned long swiotlb_size_or_default(void)
@@ -181,18 +200,18 @@ void __init swiotlb_adjust_size(unsigned long new_size)
 	}
 }
 
-void swiotlb_print_info(void)
+void swiotlb_print_info(enum swiotlb_t type)
 {
-	unsigned long bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	unsigned long bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
-	if (no_iotlb_memory[SWIOTLB_LO]) {
-		pr_warn("No low mem\n");
+	if (no_iotlb_memory[type]) {
+		pr_warn("No low mem with %s\n", swiotlb_name[type]);
 		return;
 	}
 
-	pr_info("mapped [mem %pa-%pa] (%luMB)\n",
-		&io_tlb_start[SWIOTLB_LO], &io_tlb_end[SWIOTLB_LO],
-		bytes >> 20);
+	pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+		swiotlb_name[type],
+		&io_tlb_start[type], &io_tlb_end[type], bytes >> 20);
 }
 
 /*
@@ -205,88 +224,104 @@ void __init swiotlb_update_mem_attributes(void)
 {
 	void *vaddr;
 	unsigned long bytes;
+	int i;
 
-	if (no_iotlb_memory[SWIOTLB_LO] || late_alloc)
-		return;
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (no_iotlb_memory[i] || late_alloc)
+			continue;
 
-	vaddr = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
-	bytes = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+		vaddr = phys_to_virt(io_tlb_start[i]);
+		bytes = PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT);
+		set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
+		memset(vaddr, 0, bytes);
+	}
 }
 
-int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
+int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs,
+				 enum swiotlb_t type, int verbose)
 {
 	unsigned long i, bytes;
 	size_t alloc_size;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = __pa(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = __pa(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int));
-	io_tlb_list[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_list[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(int));
+	io_tlb_list[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_list[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	alloc_size = PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t));
-	io_tlb_orig_addr[SWIOTLB_LO] = memblock_alloc(alloc_size, PAGE_SIZE);
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	alloc_size = PAGE_ALIGN(io_tlb_nslabs[type] * sizeof(phys_addr_t));
+	io_tlb_orig_addr[type] = memblock_alloc(alloc_size, PAGE_SIZE);
+	if (!io_tlb_orig_addr[type])
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
 	if (verbose)
-		swiotlb_print_info();
+		swiotlb_print_info(type);
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 	return 0;
 }
 
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the DMA API.
- */
-void  __init
-swiotlb_init(int verbose)
+static void __init
+swiotlb_init_type(enum swiotlb_t type, int verbose)
 {
 	size_t default_size = IO_TLB_DEFAULT_SIZE;
 	unsigned char *vstart;
 	unsigned long bytes;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
+
+	if (type == SWIOTLB_LO)
+		vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
+	else
+		vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
 
-	/* Get IO TLB memory from the low pages */
-	vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
-	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO], verbose))
+	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs[type], type, verbose))
 		return;
 
-	if (io_tlb_start[SWIOTLB_LO]) {
-		memblock_free_early(io_tlb_start[SWIOTLB_LO],
-				    PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-		io_tlb_start[SWIOTLB_LO] = 0;
+	if (io_tlb_start[type]) {
+		memblock_free_early(io_tlb_start[type],
+				    PAGE_ALIGN(io_tlb_nslabs[type] << IO_TLB_SHIFT));
+		io_tlb_start[type] = 0;
 	}
-	pr_warn("Cannot allocate buffer");
-	no_iotlb_memory[SWIOTLB_LO] = true;
+	pr_warn("Cannot allocate buffer %s\n", swiotlb_name[type]);
+	no_iotlb_memory[type] = true;
+}
+
+/*
+ * Statically reserve bounce buffer space and initialize bounce buffer data
+ * structures for the software IO TLB used to implement the DMA API.
+ */
+void  __init
+swiotlb_init(int verbose)
+{
+	int i;
+
+	for (i = 0; i < swiotlb_nr; i++)
+		swiotlb_init_type(i, verbose);
 }
 
 /*
@@ -295,67 +330,68 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size, enum swiotlb_t type)
 {
-	unsigned long bytes, req_nslabs = io_tlb_nslabs[SWIOTLB_LO];
+	unsigned long bytes, req_nslabs = io_tlb_nslabs[type];
 	unsigned char *vstart = NULL;
 	unsigned int order;
 	int rc = 0;
+	gfp_t gfp_mask = (type == SWIOTLB_LO) ? GFP_DMA | __GFP_NOWARN :
+						GFP_KERNEL | __GFP_NOWARN;
 
-	if (!io_tlb_nslabs[SWIOTLB_LO]) {
-		io_tlb_nslabs[SWIOTLB_LO] = (default_size >> IO_TLB_SHIFT);
-		io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
+	if (!io_tlb_nslabs[type]) {
+		io_tlb_nslabs[type] = (default_size >> IO_TLB_SHIFT);
+		io_tlb_nslabs[type] = ALIGN(io_tlb_nslabs[type], IO_TLB_SEGSIZE);
 	}
 
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
-	io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
-	bytes = io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+	order = get_order(io_tlb_nslabs[type] << IO_TLB_SHIFT);
+	io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs[type] << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-		vstart = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
-						  order);
+		vstart = (void *)__get_free_pages(gfp_mask, order);
 		if (vstart)
 			break;
 		order--;
 	}
 
 	if (!vstart) {
-		io_tlb_nslabs[SWIOTLB_LO] = req_nslabs;
+		io_tlb_nslabs[type] = req_nslabs;
 		return -ENOMEM;
 	}
 	if (order != get_order(bytes)) {
 		pr_warn("only able to allocate %ld MB\n",
 			(PAGE_SIZE << order) >> 20);
-		io_tlb_nslabs[SWIOTLB_LO] = SLABS_PER_PAGE << order;
+		io_tlb_nslabs[type] = SLABS_PER_PAGE << order;
 	}
-	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[SWIOTLB_LO]);
+	rc = swiotlb_late_init_with_tbl(vstart, io_tlb_nslabs[type], type);
 	if (rc)
 		free_pages((unsigned long)vstart, order);
 
 	return rc;
 }
 
-static void swiotlb_cleanup(void)
+static void swiotlb_cleanup(enum swiotlb_t type)
 {
-	io_tlb_end[SWIOTLB_LO] = 0;
-	io_tlb_start[SWIOTLB_LO] = 0;
-	io_tlb_nslabs[SWIOTLB_LO] = 0;
-	max_segment[SWIOTLB_LO] = 0;
+	io_tlb_end[type] = 0;
+	io_tlb_start[type] = 0;
+	io_tlb_nslabs[type] = 0;
+	max_segment[type] = 0;
 }
 
 int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs, enum swiotlb_t type)
 {
 	unsigned long i, bytes;
 
 	bytes = nslabs << IO_TLB_SHIFT;
 
-	io_tlb_nslabs[SWIOTLB_LO] = nslabs;
-	io_tlb_start[SWIOTLB_LO] = virt_to_phys(tlb);
-	io_tlb_end[SWIOTLB_LO] = io_tlb_start[SWIOTLB_LO] + bytes;
+	io_tlb_nslabs[type] = nslabs;
+	io_tlb_start[type] = virt_to_phys(tlb);
+	io_tlb_end[type] = io_tlb_start[type] + bytes;
 
 	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
 	memset(tlb, 0, bytes);
@@ -365,63 +401,67 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE
 	 * between io_tlb_start and io_tlb_end.
 	 */
-	io_tlb_list[SWIOTLB_LO] = (unsigned int *)__get_free_pages(GFP_KERNEL,
-			get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-	if (!io_tlb_list[SWIOTLB_LO])
+	io_tlb_list[type] = (unsigned int *)__get_free_pages(GFP_KERNEL,
+			get_order(io_tlb_nslabs[type] * sizeof(int)));
+	if (!io_tlb_list[type])
 		goto cleanup3;
 
-	io_tlb_orig_addr[SWIOTLB_LO] = (phys_addr_t *)
+	io_tlb_orig_addr[type] = (phys_addr_t *)
 		__get_free_pages(GFP_KERNEL,
-				 get_order(io_tlb_nslabs[SWIOTLB_LO] *
+				 get_order(io_tlb_nslabs[type] *
 					   sizeof(phys_addr_t)));
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
+	if (!io_tlb_orig_addr[type])
 		goto cleanup4;
 
-	for (i = 0; i < io_tlb_nslabs[SWIOTLB_LO]; i++) {
-		io_tlb_list[SWIOTLB_LO][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-		io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+	for (i = 0; i < io_tlb_nslabs[type]; i++) {
+		io_tlb_list[type][i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+		io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 	}
-	io_tlb_index[SWIOTLB_LO] = 0;
-	no_iotlb_memory[SWIOTLB_LO] = false;
+	io_tlb_index[type] = 0;
+	no_iotlb_memory[type] = false;
 
-	swiotlb_print_info();
+	swiotlb_print_info(type);
 
 	late_alloc = 1;
 
-	swiotlb_set_max_segment(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+	swiotlb_set_max_segment(io_tlb_nslabs[type] << IO_TLB_SHIFT, type);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_list[SWIOTLB_LO], get_order(io_tlb_nslabs[SWIOTLB_LO] *
+	free_pages((unsigned long)io_tlb_list[type], get_order(io_tlb_nslabs[type] *
 	                                                 sizeof(int)));
-	io_tlb_list[SWIOTLB_LO] = NULL;
+	io_tlb_list[type] = NULL;
 cleanup3:
-	swiotlb_cleanup();
+	swiotlb_cleanup(type);
 	return -ENOMEM;
 }
 
 void __init swiotlb_exit(void)
 {
-	if (!io_tlb_orig_addr[SWIOTLB_LO])
-		return;
+	int i;
 
-	if (late_alloc) {
-		free_pages((unsigned long)io_tlb_orig_addr[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		free_pages((unsigned long)io_tlb_list[SWIOTLB_LO],
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		free_pages((unsigned long)phys_to_virt(io_tlb_start[SWIOTLB_LO]),
-			   get_order(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
-	} else {
-		memblock_free_late(__pa(io_tlb_orig_addr[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(phys_addr_t)));
-		memblock_free_late(__pa(io_tlb_list[SWIOTLB_LO]),
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] * sizeof(int)));
-		memblock_free_late(io_tlb_start[SWIOTLB_LO],
-				   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
+	for (i = 0; i < swiotlb_nr; i++) {
+		if (!io_tlb_orig_addr[i])
+			continue;
+
+		if (late_alloc) {
+			free_pages((unsigned long)io_tlb_orig_addr[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			free_pages((unsigned long)io_tlb_list[i],
+				   get_order(io_tlb_nslabs[i] * sizeof(int)));
+			free_pages((unsigned long)phys_to_virt(io_tlb_start[i]),
+				   get_order(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		} else {
+			memblock_free_late(__pa(io_tlb_orig_addr[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(phys_addr_t)));
+			memblock_free_late(__pa(io_tlb_list[i]),
+					   PAGE_ALIGN(io_tlb_nslabs[i] * sizeof(int)));
+			memblock_free_late(io_tlb_start[i],
+					   PAGE_ALIGN(io_tlb_nslabs[i] << IO_TLB_SHIFT));
+		}
+		swiotlb_cleanup(i);
 	}
-	swiotlb_cleanup();
 }
 
 /*
@@ -468,7 +508,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		size_t mapping_size, size_t alloc_size,
 		enum dma_data_direction dir, unsigned long attrs)
 {
-	dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[SWIOTLB_LO]);
+	dma_addr_t tbl_dma_addr;
 	unsigned long flags;
 	phys_addr_t tlb_addr;
 	unsigned int nslots, stride, index, wrap;
@@ -477,8 +517,16 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	unsigned long offset_slots;
 	unsigned long max_slots;
 	unsigned long tmp_io_tlb_used;
+	unsigned long dma_mask = min_not_zero(*hwdev->dma_mask,
+					      hwdev->bus_dma_limit);
+	int type;
 
-	if (no_iotlb_memory[SWIOTLB_LO])
+	if (swiotlb_nr > 1 && dma_mask == DMA_BIT_MASK(64))
+		type = SWIOTLB_HI;
+	else
+		type = SWIOTLB_LO;
+
+	if (no_iotlb_memory[type])
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
 
 	if (mem_encrypt_active())
@@ -490,6 +538,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
+	tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start[type]);
+
 	mask = dma_get_seg_boundary(hwdev);
 
 	tbl_dma_addr &= mask;
@@ -521,11 +571,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 */
 	spin_lock_irqsave(&io_tlb_lock, flags);
 
-	if (unlikely(nslots > io_tlb_nslabs[SWIOTLB_LO] - io_tlb_used[SWIOTLB_LO]))
+	if (unlikely(nslots > io_tlb_nslabs[type] - io_tlb_used[type]))
 		goto not_found;
 
-	index = ALIGN(io_tlb_index[SWIOTLB_LO], stride);
-	if (index >= io_tlb_nslabs[SWIOTLB_LO])
+	index = ALIGN(io_tlb_index[type], stride);
+	if (index >= io_tlb_nslabs[type])
 		index = 0;
 	wrap = index;
 
@@ -533,7 +583,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		while (iommu_is_span_boundary(index, nslots, offset_slots,
 					      max_slots)) {
 			index += stride;
-			if (index >= io_tlb_nslabs[SWIOTLB_LO])
+			if (index >= io_tlb_nslabs[type])
 				index = 0;
 			if (index == wrap)
 				goto not_found;
@@ -544,42 +594,42 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 		 * contiguous buffers, we allocate the buffers from that slot
 		 * and mark the entries as '0' indicating unavailable.
 		 */
-		if (io_tlb_list[SWIOTLB_LO][index] >= nslots) {
+		if (io_tlb_list[type][index] >= nslots) {
 			int count = 0;
 
 			for (i = index; i < (int) (index + nslots); i++)
-				io_tlb_list[SWIOTLB_LO][i] = 0;
+				io_tlb_list[type][i] = 0;
 			for (i = index - 1;
 			     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-			     io_tlb_list[SWIOTLB_LO][i];
+			     io_tlb_list[type][i];
 			     i--)
-				io_tlb_list[SWIOTLB_LO][i] = ++count;
-			tlb_addr = io_tlb_start[SWIOTLB_LO] + (index << IO_TLB_SHIFT);
+				io_tlb_list[type][i] = ++count;
+			tlb_addr = io_tlb_start[type] + (index << IO_TLB_SHIFT);
 
 			/*
 			 * Update the indices to avoid searching in the next
 			 * round.
 			 */
-			io_tlb_index[SWIOTLB_LO] = ((index + nslots) < io_tlb_nslabs[SWIOTLB_LO]
+			io_tlb_index[type] = ((index + nslots) < io_tlb_nslabs[type]
 					? (index + nslots) : 0);
 
 			goto found;
 		}
 		index += stride;
-		if (index >= io_tlb_nslabs[SWIOTLB_LO])
+		if (index >= io_tlb_nslabs[type])
 			index = 0;
 	} while (index != wrap);
 
 not_found:
-	tmp_io_tlb_used = io_tlb_used[SWIOTLB_LO];
+	tmp_io_tlb_used = io_tlb_used[type];
 
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-			 alloc_size, io_tlb_nslabs[SWIOTLB_LO], tmp_io_tlb_used);
+		dev_warn(hwdev, "%s swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+			 swiotlb_name[type], alloc_size, io_tlb_nslabs[type], tmp_io_tlb_used);
 	return (phys_addr_t)DMA_MAPPING_ERROR;
 found:
-	io_tlb_used[SWIOTLB_LO] += nslots;
+	io_tlb_used[type] += nslots;
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
 	/*
@@ -588,7 +638,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
 	 * needed.
 	 */
 	for (i = 0; i < nslots; i++)
-		io_tlb_orig_addr[SWIOTLB_LO][index+i] = orig_addr + (i << IO_TLB_SHIFT);
+		io_tlb_orig_addr[type][index+i] = orig_addr + (i << IO_TLB_SHIFT);
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
@@ -605,8 +655,9 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 {
 	unsigned long flags;
 	int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
@@ -625,14 +676,14 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	spin_lock_irqsave(&io_tlb_lock, flags);
 	{
 		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[SWIOTLB_LO][index + nslots] : 0);
+			 io_tlb_list[type][index + nslots] : 0);
 		/*
 		 * Step 1: return the slots to the free list, merging the
 		 * slots with superceeding slots
 		 */
 		for (i = index + nslots - 1; i >= index; i--) {
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
-			io_tlb_orig_addr[SWIOTLB_LO][i] = INVALID_PHYS_ADDR;
+			io_tlb_list[type][i] = ++count;
+			io_tlb_orig_addr[type][i] = INVALID_PHYS_ADDR;
 		}
 		/*
 		 * Step 2: merge the returned slots with the preceding slots,
@@ -640,11 +691,11 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 		 */
 		for (i = index - 1;
 		     (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) &&
-		     io_tlb_list[SWIOTLB_LO][i];
+		     io_tlb_list[type][i];
 		     i--)
-			io_tlb_list[SWIOTLB_LO][i] = ++count;
+			io_tlb_list[type][i] = ++count;
 
-		io_tlb_used[SWIOTLB_LO] -= nslots;
+		io_tlb_used[type] -= nslots;
 	}
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -653,8 +704,9 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 			     size_t size, enum dma_data_direction dir,
 			     enum dma_sync_target target)
 {
-	int index = (tlb_addr - io_tlb_start[SWIOTLB_LO]) >> IO_TLB_SHIFT;
-	phys_addr_t orig_addr = io_tlb_orig_addr[SWIOTLB_LO][index];
+	int type = swiotlb_get_type(tlb_addr);
+	int index = (tlb_addr - io_tlb_start[type]) >> IO_TLB_SHIFT;
+	phys_addr_t orig_addr = io_tlb_orig_addr[type][index];
 
 	if (orig_addr == INVALID_PHYS_ADDR)
 		return;
@@ -737,6 +789,15 @@ static int __init swiotlb_create_debugfs(void)
 	root = debugfs_create_dir("swiotlb", NULL);
 	debugfs_create_ulong("io_tlb_nslabs", 0400, root, &io_tlb_nslabs[SWIOTLB_LO]);
 	debugfs_create_ulong("io_tlb_used", 0400, root, &io_tlb_used[SWIOTLB_LO]);
+
+	if (swiotlb_nr == 1)
+		return 0;
+
+	debugfs_create_ulong("io_tlb_nslabs-highmem", 0400,
+			     root, &io_tlb_nslabs[SWIOTLB_HI]);
+	debugfs_create_ulong("io_tlb_used-highmem", 0400,
+			     root, &io_tlb_used[SWIOTLB_HI]);
+
 	return 0;
 }
 
-- 
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-02-04  8:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-04  2:13 [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2021-02-03 23:37 [PATCH RFC v1 0/6] swiotlb: 64-bit DMA buffer Dongli Zhang
2021-02-03 23:37 ` [PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb Dongli Zhang
2021-02-03 23:37   ` Dongli Zhang
2021-02-03 23:37   ` Dongli Zhang
2021-02-03 23:37   ` Dongli Zhang
2021-02-04  4:10   ` kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.