* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-04 16:04 kernel test robot
0 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2022-06-04 16:04 UTC (permalink / raw)
To: kbuild
[-- Attachment #1: Type: text/plain, Size: 33368 bytes --]
::::::
:::::: Manual check reason: "low confidence static check warning: drivers/vhost/vdpa.c:614:2: warning: Value stored to 'range_size' is never read [clang-analyzer-deadcode.DeadStores]"
::::::
CC: llvm(a)lists.linux.dev
CC: kbuild-all(a)lists.01.org
BCC: lkp(a)intel.com
In-Reply-To: <20220601012019.1102186-1-lulu@redhat.com>
References: <20220601012019.1102186-1-lulu@redhat.com>
TO: Cindy Lu <lulu@redhat.com>
Hi Cindy,
[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on v5.18]
[cannot apply to mst-vhost/linux-next next-20220603]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/intel-lab-lkp/linux/commits/Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
base: 4b0986a3613c92f4ec1bdc7f60ec66fea135991f
:::::: branch date: 4 days ago
:::::: commit date: 4 days ago
config: i386-randconfig-c001 (https://download.01.org/0day-ci/archive/20220604/202206042319.frgBEFEG-lkp(a)intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c825abd6b0198fb088d9752f556a70705bc99dfd)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/8f0b25afe9ac570a70eafb5c285747e3bdd2471d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
git checkout 8f0b25afe9ac570a70eafb5c285747e3bdd2471d
# save the config file
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=i386 clang-analyzer
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
clang-analyzer warnings: (new ones prefixed by >>)
^
arch/x86/include/asm/string_32.h:150:25: note: expanded from macro 'memcpy'
#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
^~~~~~~~~~~~~~~~
net/ipv6/addrconf.c:7150:4: note: Call to function 'memcpy' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memcpy_s' in case of C11
memcpy(dflt,
^
arch/x86/include/asm/string_32.h:150:25: note: expanded from macro 'memcpy'
#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
^~~~~~~~~~~~~~~~
Suppressed 106 warnings (106 in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
47 warnings generated.
drivers/md/dm-thin.c:1269:2: warning: Call to function 'memset' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memset_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
memset(m, 0, sizeof(struct dm_thin_new_mapping));
^
arch/x86/include/asm/string_32.h:195:29: note: expanded from macro 'memset'
#define memset(s, c, count) __builtin_memset(s, c, count)
^~~~~~~~~~~~~~~~
drivers/md/dm-thin.c:1269:2: note: Call to function 'memset' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memset_s' in case of C11
memset(m, 0, sizeof(struct dm_thin_new_mapping));
^
arch/x86/include/asm/string_32.h:195:29: note: expanded from macro 'memset'
#define memset(s, c, count) __builtin_memset(s, c, count)
^~~~~~~~~~~~~~~~
drivers/md/dm-thin.c:3990:10: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
format_dev_t(buf, pt->metadata_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:3990:10: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
format_dev_t(buf, pt->metadata_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:3991:10: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
format_dev_t(buf2, pt->data_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:3991:10: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
format_dev_t(buf2, pt->data_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:4407:11: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
format_dev_t(buf, tc->pool_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:4407:11: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
format_dev_t(buf, tc->pool_dev->bdev->bd_dev),
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:4410:19: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
DMEMIT(" %s", format_dev_t(buf, tc->origin_dev->bdev->bd_dev));
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
drivers/md/dm-thin.c:4410:19: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
DMEMIT(" %s", format_dev_t(buf, tc->origin_dev->bdev->bd_dev));
^
include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \
^~~~~~~
include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
0 : scnprintf(result + sz, maxlen - sz, x))
^
Suppressed 42 warnings (42 in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
84 warnings generated.
>> drivers/vhost/vdpa.c:614:2: warning: Value stored to 'range_size' is never read [clang-analyzer-deadcode.DeadStores]
range_size = range_start - range_last;
^ ~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:614:2: note: Value stored to 'range_size' is never read
range_size = range_start - range_last;
^ ~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:719:40: warning: The right operand of '>=' is a garbage value [clang-analyzer-core.UndefinedBinaryOperatorResult]
if ((range_size >= 0) && (range_start >= node_last) &&
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
drivers/vhost/vdpa.c:596:2: note: 'node_last' declared without an initial value
u64 node_last;
^~~~~~~~~~~~~
drivers/vhost/vdpa.c:604:6: note: Assuming 'node_number' is not equal to 0
if (node_number == 0) {
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
drivers/vhost/vdpa.c:604:2: note: '?' condition is false
if (node_number == 0) {
^
include/linux/compiler.h:56:28: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^
drivers/vhost/vdpa.c:604:6: note: 'node_number' is not equal to 0
if (node_number == 0) {
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
include/linux/compiler.h:69:3: note: expanded from macro '__trace_if_value'
(cond) ? \
^~~~
drivers/vhost/vdpa.c:604:2: note: '?' condition is false
if (node_number == 0) {
^
include/linux/compiler.h:56:28: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
include/linux/compiler.h:58:69: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^
include/linux/compiler.h:69:2: note: expanded from macro '__trace_if_value'
(cond) ? \
^
drivers/vhost/vdpa.c:604:2: note: Taking false branch
if (node_number == 0) {
^
include/linux/compiler.h:56:23: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
drivers/vhost/vdpa.c:615:14: note: Assuming 'i' is >= 'node_number'
for (i = 0; i < node_number; i++) {
^~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:615:2: note: Loop condition is false. Execution continues on line 715
for (i = 0; i < node_number; i++) {
^
drivers/vhost/vdpa.c:719:7: note: Assuming 'range_size' is >= 0
if ((range_size >= 0) && (range_start >= node_last) &&
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
drivers/vhost/vdpa.c:719:6: note: Left side of '&&' is true
if ((range_size >= 0) && (range_start >= node_last) &&
^
drivers/vhost/vdpa.c:719:40: note: The right operand of '>=' is a garbage value
if ((range_size >= 0) && (range_start >= node_last) &&
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
>> drivers/vhost/vdpa.c:978:16: warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign]
pre_link_node = link_head_tmp;
^
drivers/vhost/vdpa.c:1589:2: note: Calling 'vhost_vdpa_iotlb_free'
vhost_vdpa_iotlb_free(v);
^~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:1065:2: note: Calling 'vhost_vdpa_iotlb_unmap'
vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:1055:2: note: Assuming field 'use_va' is false
if (vdpa->use_va)
^
include/linux/compiler.h:56:45: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
drivers/vhost/vdpa.c:1055:2: note: '?' condition is false
if (vdpa->use_va)
^
include/linux/compiler.h:56:28: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^
drivers/vhost/vdpa.c:1055:12: note: Field 'use_va' is false
if (vdpa->use_va)
^
drivers/vhost/vdpa.c:1055:2: note: '?' condition is false
if (vdpa->use_va)
^
include/linux/compiler.h:56:28: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
include/linux/compiler.h:58:69: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^
include/linux/compiler.h:69:2: note: expanded from macro '__trace_if_value'
(cond) ? \
^
drivers/vhost/vdpa.c:1055:2: note: Taking false branch
if (vdpa->use_va)
^
include/linux/compiler.h:56:23: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
drivers/vhost/vdpa.c:1058:9: note: Calling 'vhost_vdpa_pa_unmap'
return vhost_vdpa_pa_unmap(v, start, last);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:1018:9: note: Assuming the condition is true
while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:1018:2: note: Loop condition is true. Entering loop body
while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
^
drivers/vhost/vdpa.c:1021:8: note: Assuming 'pinned' is <= 0
pinned > 0; pfn++, pinned--) {
^~~~~~~~~~
drivers/vhost/vdpa.c:1020:3: note: Loop condition is false. Execution continues on line 1028
for (pfn = PFN_DOWN(map->addr);
^
drivers/vhost/vdpa.c:1028:10: note: Calling 'vhost_vdpa_search_range_del'
size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:1004:9: note: Calling 'vhost_vdpa_range_ops'
size = vhost_vdpa_range_ops(root, start, last, false);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:941:2: note: 'link_head_tmp' declared without an initial value
struct vdpa_link_node *link_head_tmp;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:946:2: note: Loop condition is true. Entering loop body
for (node = interval_tree_iter_first(root, start, last); node;
^
drivers/vhost/vdpa.c:949:7: note: Assuming 'link_node' is equal to null
if (link_node == NULL) {
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^~~~
drivers/vhost/vdpa.c:949:3: note: '?' condition is false
if (link_node == NULL) {
^
include/linux/compiler.h:56:28: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^
include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
^
drivers/vhost/vdpa.c:949:7: note: 'link_node' is equal to null
if (link_node == NULL) {
^
include/linux/compiler.h:56:47: note: expanded from macro 'if'
#define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
^~~~
include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
#define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
vim +/range_size +614 drivers/vhost/vdpa.c
8f0b25afe9ac57 Cindy Lu 2022-06-01 587
8f0b25afe9ac57 Cindy Lu 2022-06-01 588 u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
8f0b25afe9ac57 Cindy Lu 2022-06-01 589 struct vdpa_link_node *link_head, int node_number,
8f0b25afe9ac57 Cindy Lu 2022-06-01 590 u64 start, u64 last)
8f0b25afe9ac57 Cindy Lu 2022-06-01 591 {
8f0b25afe9ac57 Cindy Lu 2022-06-01 592 int i = 0;
8f0b25afe9ac57 Cindy Lu 2022-06-01 593 u64 size = 0;
8f0b25afe9ac57 Cindy Lu 2022-06-01 594 int new_ref;
8f0b25afe9ac57 Cindy Lu 2022-06-01 595 u64 node_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01 596 u64 node_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 597 u64 range_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01 598 u64 range_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 599 int range_size;
8f0b25afe9ac57 Cindy Lu 2022-06-01 600 struct vdpa_link_node *link_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01 601 struct vdpa_tree_node *vdpa_node = NULL;
8f0b25afe9ac57 Cindy Lu 2022-06-01 602 struct interval_tree_node *node = NULL;
8f0b25afe9ac57 Cindy Lu 2022-06-01 603
8f0b25afe9ac57 Cindy Lu 2022-06-01 604 if (node_number == 0) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 605 vhost_vdpa_add_range_ctx(root, start, last, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 606
8f0b25afe9ac57 Cindy Lu 2022-06-01 607 size = last - start + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 608 return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01 609 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 610
8f0b25afe9ac57 Cindy Lu 2022-06-01 611 link_node = link_head;
8f0b25afe9ac57 Cindy Lu 2022-06-01 612 range_start = start;
8f0b25afe9ac57 Cindy Lu 2022-06-01 613 range_last = last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 @614 range_size = range_start - range_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 615 for (i = 0; i < node_number; i++) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 616 vdpa_node = link_node->vdpa_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01 617 link_node = link_node->next;
8f0b25afe9ac57 Cindy Lu 2022-06-01 618 node = &vdpa_node->tree_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01 619 new_ref = vdpa_node->ref;
8f0b25afe9ac57 Cindy Lu 2022-06-01 620 node_start = node->start;
8f0b25afe9ac57 Cindy Lu 2022-06-01 621 node_last = node->last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 622
8f0b25afe9ac57 Cindy Lu 2022-06-01 623 if (range_start == node_start) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 624 if (node_last < range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 625 /* range_start= node->start--- node->last--range_last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 626 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 627 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 628 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 629 /*count the next range */
8f0b25afe9ac57 Cindy Lu 2022-06-01 630 } else if (node_last > range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 631 /* range_start= node->start --- last -- node->last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 632 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 633 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 634 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 635 vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 636 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 637 } else {
8f0b25afe9ac57 Cindy Lu 2022-06-01 638 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 639 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 640 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 641 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 642 } else if (node_start < range_start) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 643 if (range_last < node_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 644 /* node->start--- start--- last--- node->last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 645 /* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 646
8f0b25afe9ac57 Cindy Lu 2022-06-01 647 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 648 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 649 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 650 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 651 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 652 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 653 vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 654 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 655
8f0b25afe9ac57 Cindy Lu 2022-06-01 656 } else if (range_last > node_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 657 /* node->start--- start--- node->last-- last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 658
8f0b25afe9ac57 Cindy Lu 2022-06-01 659 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 660 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 661 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 662 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 663 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 664 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 665 } else {
8f0b25afe9ac57 Cindy Lu 2022-06-01 666 /* node->start--- start--- node->last= last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 667 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 668 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 669 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 670 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 671 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 672 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 673 /* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 674 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 675 } else {
8f0b25afe9ac57 Cindy Lu 2022-06-01 676 if (node_last < range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 677 /* range_start --- node->start --- node->last ----last */
8f0b25afe9ac57 Cindy Lu 2022-06-01 678
8f0b25afe9ac57 Cindy Lu 2022-06-01 679 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 680 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 681 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 682 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 683 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 684 size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 685 } else if (node_last > range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 686 /* range_start--- node->start --- last -- node->last */
8f0b25afe9ac57 Cindy Lu 2022-06-01 687 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 688 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 689 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 690 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 691 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 692 vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01 693 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01 694 size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 695
8f0b25afe9ac57 Cindy Lu 2022-06-01 696 /* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 697 } else {
8f0b25afe9ac57 Cindy Lu 2022-06-01 698 /* range_start--- node->start --- last = node->last */
8f0b25afe9ac57 Cindy Lu 2022-06-01 699 vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 700 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 701 vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 702 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01 703 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 704 size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 705
8f0b25afe9ac57 Cindy Lu 2022-06-01 706 /* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 707 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 708 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 709 /* work in next node*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 710 range_start = node_last + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 711 if (range_start > range_last)
8f0b25afe9ac57 Cindy Lu 2022-06-01 712 break;
8f0b25afe9ac57 Cindy Lu 2022-06-01 713 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 714
8f0b25afe9ac57 Cindy Lu 2022-06-01 715 range_size = range_last - range_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01 716
8f0b25afe9ac57 Cindy Lu 2022-06-01 717 /* last round and still some range*/
8f0b25afe9ac57 Cindy Lu 2022-06-01 718
8f0b25afe9ac57 Cindy Lu 2022-06-01 719 if ((range_size >= 0) && (range_start >= node_last) &&
8f0b25afe9ac57 Cindy Lu 2022-06-01 720 (node_number == i + 1)) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 721 vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01 722 size = size + (range_last - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01 723 } else if ((range_size == -1) && (node_number == i + 1)) {
8f0b25afe9ac57 Cindy Lu 2022-06-01 724 return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01 725 } else {
8f0b25afe9ac57 Cindy Lu 2022-06-01 726 printk(KERN_WARNING,
8f0b25afe9ac57 Cindy Lu 2022-06-01 727 "%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
8f0b25afe9ac57 Cindy Lu 2022-06-01 728 __func__, __LINE__, range_start, range_last, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01 729 node_last, i);
8f0b25afe9ac57 Cindy Lu 2022-06-01 730 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 731
8f0b25afe9ac57 Cindy Lu 2022-06-01 732 return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01 733 }
8f0b25afe9ac57 Cindy Lu 2022-06-01 734
--
0-DAY CI Kernel Test Service
https://01.org/lkp
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
2022-06-01 1:20 Cindy Lu
@ 2022-06-01 5:51 ` Jason Wang
2022-06-01 5:51 ` Jason Wang
1 sibling, 0 replies; 5+ messages in thread
From: Jason Wang @ 2022-06-01 5:51 UTC (permalink / raw)
To: Cindy Lu; +Cc: mst, kvm, virtualization, netdev, linux-kernel
On Wed, Jun 1, 2022 at 9:20 AM Cindy Lu <lulu@redhat.com> wrote:
>
> We count pinned_vm as follow in vhost-vDPA
>
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
> ret = -ENOMEM;
> goto unlock;
> }
> This means if we have two vDPA devices for the same VM the pages would be counted twice
> So we add a tree to save the page that counted and we will not count it
> again.
The code is not easy to be reviewed, some suggestions:
- It's better to explain in general the algorithm you used here
- Add more comment in the codes to explain the rationale
And I still see the above check against the RLIMIT in the code, is it
intentional?
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
> drivers/vhost/vdpa.c | 542 +++++++++++++++++++++++++++++++++++++++++-
> drivers/vhost/vhost.h | 1 +
> 2 files changed, 539 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 05f5fd2af58f..1b0da0735efd 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -24,6 +24,10 @@
> #include <linux/vhost.h>
>
> #include "vhost.h"
> +#include <linux/rbtree.h>
> +#include <linux/interval_tree.h>
> +#include <linux/interval_tree_generic.h>
> +#include <linux/hashtable.h>
>
> enum {
> VHOST_VDPA_BACKEND_FEATURES =
> @@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> return r;
> }
>
> +struct vdpa_tree_node {
> + struct interval_tree_node tree_node;
Can we simply reuse the vhost_iotlb tree? Note that vhost_iotlb_map
can be associated with a opaque as token which could be used as
reference count.
> + int ref;
If it's a refcount, let's use unsigned here.
> +};
> +struct vdpa_link_node {
> + struct vdpa_tree_node *vdpa_node;
> + struct vdpa_link_node *next;
Need to explain why we need a linked list here considering we've
already structured it as an interval tree.
Btw, unless it's performance critical, let's try to re-use kernel list.h.
> + u64 node_start;
> + u64 node_last;
Let's add a comment to explain each member here.
> +};
> +
> +int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
> + int ref)
We don't want to export this symbol, so let's make the function
static, so did other functions.
> +{
> + struct interval_tree_node *new_node;
> + struct vdpa_tree_node *vdpa_node;
> +
> + if (last < start)
> + return -EFAULT;
> +
> + /* If the range being mapped is [0, ULONG_MAX], split it into two entries
> + * otherwise its size would overflow u64.
> + */
> + if (start == 0 && last == ULONG_MAX) {
> + u64 mid = last / 2;
> +
> + vhost_vdpa_add_range_ctx(root, start, mid, ref);
> + start = mid + 1;
> + }
> + vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
> +
Let's check if kmalloc succeeds here.
> + new_node = &vdpa_node->tree_node;
> + if (!new_node)
> + return -ENOMEM;
> +
> + new_node->start = start;
> + new_node->last = last;
> + vdpa_node->ref = ref;
> +
> + interval_tree_insert(new_node, root);
> +
> + return 0;
> +}
> +
> +u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
> + struct vdpa_link_node *link_head, int node_number,
> + u64 start, u64 last)
> +{
> + int i = 0;
> + u64 size = 0;
> + int new_ref;
> + u64 node_start;
> + u64 node_last;
> + u64 range_start;
> + u64 range_last;
> + int range_size;
> + struct vdpa_link_node *link_node;
> + struct vdpa_tree_node *vdpa_node = NULL;
> + struct interval_tree_node *node = NULL;
> +
> + if (node_number == 0) {
> + vhost_vdpa_add_range_ctx(root, start, last, 1);
> +
> + size = last - start + 1;
> + return size;
> + }
> +
> + link_node = link_head;
> + range_start = start;
> + range_last = last;
> + range_size = range_start - range_last;
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = link_node->vdpa_node;
> + link_node = link_node->next;
> + node = &vdpa_node->tree_node;
> + new_ref = vdpa_node->ref;
> + node_start = node->start;
> + node_last = node->last;
> +
> + if (range_start == node_start) {
> + if (node_last < range_last) {
> + /* range_start= node->start--- node->last--range_last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + /*count the next range */
> + } else if (node_last > range_last) {
> + /* range_start= node->start --- last -- node->last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + } else {
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + }
> + } else if (node_start < range_start) {
> + if (range_last < node_last) {
> + /* node->start--- start--- last--- node->last*/
> + /* should the end rang*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> +
> + } else if (range_last > node_last) {
> + /* node->start--- start--- node->last-- last*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_last,
> + new_ref + 1);
> + } else {
> + /* node->start--- start--- node->last= last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_last,
> + new_ref + 1);
> + /* should the end rang*/
> + }
> + } else {
> + if (node_last < range_last) {
> + /* range_start --- node->start --- node->last ----last */
> +
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + size += ((node_start - 1) - range_start) + 1;
> + } else if (node_last > range_last) {
> + /* range_start--- node->start --- last -- node->last */
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + size += ((node_start - 1) - range_start) + 1;
> +
> + /* should the end rang*/
> + } else {
> + /* range_start--- node->start --- last = node->last */
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + size += ((node_start - 1) - range_start) + 1;
> +
> + /* should the end rang*/
> + }
> + }
> + /* work in next node*/
> + range_start = node_last + 1;
> + if (range_start > range_last)
> + break;
> + }
> +
> + range_size = range_last - range_start;
> +
> + /* last round and still some range*/
> +
> + if ((range_size >= 0) && (range_start >= node_last) &&
> + (node_number == i + 1)) {
> + vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
> + size = size + (range_last - range_start) + 1;
> + } else if ((range_size == -1) && (node_number == i + 1)) {
> + return size;
> + } else {
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
> + __func__, __LINE__, range_start, range_last, node_start,
> + node_last, i);
> + }
> +
> + return size;
> +}
> +
> +u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
> + struct vdpa_link_node *link_head, int node_number,
> + u64 start, u64 last)
> +{
> + int i = 0;
> + u64 size = 0;
> + int new_ref;
> + u64 node_start;
> + u64 node_last;
> + u64 range_start;
> + u64 range_last;
> + int range_size;
> + struct vdpa_link_node *link_node;
> + struct vdpa_tree_node *vdpa_node = NULL;
> + struct interval_tree_node *node = NULL;
> +
> + if (node_number == 0)
> + return 0;
> +
> + link_node = link_head;
> + range_start = start;
> + range_last = last;
> +
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = link_node->vdpa_node;
> + link_node = link_node->next;
> + node = &vdpa_node->tree_node;
> + new_ref = vdpa_node->ref;
> + node_start = node->start;
> + node_last = node->last;
> +
> + if (range_start == node_start) {
> + if (node_last < range_last) {
> + /* range_start =node->start --- node->last ----last*/
The comment needs some tweaking to be understood by the reviewers easily.
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + node_last,
> + new_ref - 1);
> + /*count the next range */
> + } else {
> + /* if the ref =0, do not need add it back, count size*/
> + size += (node_last - node_start) + 1;
> + }
> +
> + } else if (node_last > range_last) {
> + /* range_start= node->start --- last -- node->last*/
> +
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - node_start) + 1;
> + }
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + } else {
> + /* range_start= node->start --- last = node->last*/
> +
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - node_start) + 1;
> + }
> + /* should be the end */
> + }
> + } else if (node_start < range_start) {
> + if (range_last < node_last) {
> + /* node->start--- start--- last--- node->last*/
> + /* should the end rang*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - range_start) + 1;
> + }
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> +
> + } else if (range_last > node_last) {
> + /* node->start--- start--- node->last--- last*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + node_last,
> + new_ref - 1);
> + } else {
> + size += (node_last - range_start) + 1;
> + }
> + } else {
> + /* node->start--- start--- node->last= last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - range_start) + 1;
> + }
> + /* should be the end */
> + }
> + } else {
> + /* some range not in the node, error*/
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld new_ref %d",
> + __func__, __LINE__, range_start, range_last,
> + node_start, node_last, new_ref);
> + }
> +
> + range_start = node_last + 1;
> + if (range_start > range_last)
> + break;
> + }
> +
> + range_size = range_last - range_start;
> +
> + /* last round and still some range*/
> +
> + if ((range_size > 0) && (node_number == i + 1)) {
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld range_size %d",
> + __func__, __LINE__, range_start, range_last, node_start,
> + node_last, range_size);
> + }
> + return size;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
> + struct vdpa_link_node *list2)
> +{
> + struct vdpa_link_node dummy_head;
> + struct vdpa_link_node *ptr = &dummy_head;
> +
> + while (list1 && list2) {
> + if (list1->node_start < list2->node_start) {
> + ptr->next = list1;
> + list1 = list1->next;
> + } else {
> + ptr->next = list2;
> + list2 = list2->next;
> + }
> + ptr = ptr->next;
> + }
> + if (list1)
> + ptr->next = list1;
> + else
> + ptr->next = list2;
> +
> + return dummy_head.next;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
> +{
> + struct vdpa_link_node *mid_prev = NULL;
> + struct vdpa_link_node *mid;
> +
> + while (head && head->next) {
> + mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
> + head = head->next->next;
> + }
> + mid = mid_prev->next;
> + mid_prev->next = NULL;
> + return mid;
> +}
> +struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
> +{
> + struct vdpa_link_node *mid;
> + struct vdpa_link_node *left;
> + struct vdpa_link_node *right;
> +
> + if (!head || !head->next)
> + return head;
> +
> + mid = vhost_vdpa_get_mid(head);
> + left = vhost_vdpa_sort_list(head);
> + right = vhost_vdpa_sort_list(mid);
> + return vhost_vdpa_merge_list(left, right);
> +}
> +
> +u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
> + bool ops)
> +{
> + struct interval_tree_node *node = NULL;
> + struct vdpa_tree_node *vdpa_node;
> + int node_number = 0;
> + int i = 0;
> + u64 size = 0;
> + struct vdpa_link_node dummy_head = { 0 };
> + struct vdpa_link_node *link_node;
> + struct vdpa_link_node *link_head_tmp;
> + struct vdpa_link_node *pre_link_node;
> +
> + pre_link_node = &dummy_head;
> + /*search the rang overlaped, and del from the tree*/
> + for (node = interval_tree_iter_first(root, start, last); node;
> + node = interval_tree_iter_next(node, start, last)) {
> + link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
> + if (link_node == NULL) {
> + goto out;
> + }
> + vdpa_node =
> + container_of(node, struct vdpa_tree_node, tree_node);
> + link_node->vdpa_node = vdpa_node;
> + link_node->node_start = node->start;
> + link_node->node_last = node->last;
> +
> + pre_link_node->next = link_node;
> + pre_link_node = link_node;
> + pre_link_node->next = NULL;
> +
> + node_number++;
> +
> + interval_tree_remove(node, root);
> + }
> + /* sorting the node */
The code explains itself, let's explain why we need the sort.
> + link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
> +
> + /* these link node are have overlap with range, check the ref and add back to the tree*/
> + if (ops == true) {
> + size = vhost_vdpa_range_ref_add(root, link_head_tmp,
> + node_number, start, last);
> + } else {
> + size = vhost_vdpa_range_ref_del(root, link_head_tmp,
> + node_number, start, last);
> + }
> +out:
> + pre_link_node = link_head_tmp;
> +
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = pre_link_node->vdpa_node;
> + link_node = pre_link_node->next;
> + kfree(vdpa_node);
> + kfree(pre_link_node);
> + pre_link_node = link_node;
> + }
> + return size;
> +}
> +u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
> + u64 last)
> +{
> + u64 size;
> +
> + size = vhost_vdpa_range_ops(root, start, last, true);
> +
> + return size;
> +}
> +
> +u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
> + u64 last)
> +{
> + u64 size;
> +
> + size = vhost_vdpa_range_ops(root, start, last, false);
> +
> + return size;
> +}
> +
> static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> {
> struct vhost_dev *dev = &v->vdev;
> struct vhost_iotlb *iotlb = dev->iotlb;
> struct vhost_iotlb_map *map;
> struct page *page;
> + u64 size;
> unsigned long pfn, pinned;
>
> while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
> @@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> set_page_dirty_lock(page);
> unpin_user_page(page);
> }
> - atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
> +
> + size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
> + map->start,
> + map->start + map->size - 1);
> + atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
> vhost_iotlb_map_free(iotlb, map);
> }
> }
> @@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> int r = 0;
> + u64 size_count;
>
> r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
> pa, perm, opaque);
> @@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
> vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
> return r;
> }
> -
> - if (!vdpa->use_va)
> - atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> + if (!vdpa->use_va) {
> + size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
> + iova, iova + size - 1);
> + atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
> + }
>
> return 0;
> }
> @@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
> }
> }
>
> +struct root_for_vdpa_node {
> + struct hlist_node hlist;
> + struct rb_root_cached vdpa_mem_tree;
> + pid_t pid_using;
> +};
> +static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
> +int status_for_vdpa_tree = 0;
> +
> +struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
> +{
Any reason we get the tree via task_struct instead of mm_struct?
> + struct root_for_vdpa_node *root_get_tmp = NULL;
> + pid_t pid_using = task_pid_nr(task);
> +
> + /* No hased table, init one */
> + if (status_for_vdpa_tree == 0) {
> + hash_init(root_for_vdpa_node_list);
> + status_for_vdpa_tree = 1;
> + }
> +
> + hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> + pid_using) {
> + if (root_get_tmp->pid_using == pid_using)
> + return root_get_tmp;
> + }
> +
> + root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
> + root_get_tmp->pid_using = pid_using;
> +
> + root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
> +
> + hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
> + root_get_tmp->pid_using);
> +
> + return root_get_tmp;
> +}
> +
> +void vhost_vdpa_relase_mem_tree(struct task_struct *task)
> +{
> + struct root_for_vdpa_node *root_get_tmp = NULL;
> + pid_t pid_using = task_pid_nr(task);
> +
> + /* No hased table, init one */
> + hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> + pid_using) {
> + if (root_get_tmp->pid_using == pid_using)
> + kfree(root_get_tmp);
> + return;
> + }
> +
> + return;
> +}
> +
> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> {
> struct vhost_vdpa *v;
> @@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> vhost_vdpa_set_iova_range(v);
>
> filep->private_data = v;
> + struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
This looks wrong, the mapping contains VA so it is bound to the owner.
This means
1) Need to get and put the accounting tree via mm_struct in set_owner
2) Need to release all mappings during reset owner
Thanks
> + dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
>
> return 0;
>
> err_init_iotlb:
> + vhost_vdpa_relase_mem_tree(current);
> vhost_dev_cleanup(&v->vdev);
> kfree(vqs);
> err:
> @@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
> struct vhost_dev *d = &v->vdev;
>
> mutex_lock(&d->mutex);
> + vhost_vdpa_relase_mem_tree(current);
> +
> filep->private_data = NULL;
> vhost_vdpa_clean_irq(v);
> vhost_vdpa_reset(v);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 638bb640d6b4..d1c662eb4f26 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -161,6 +161,7 @@ struct vhost_dev {
> int byte_weight;
> u64 kcov_handle;
> bool use_worker;
> + struct rb_root_cached *vdpa_mem_tree;
> int (*msg_handler)(struct vhost_dev *dev,
> struct vhost_iotlb_msg *msg);
> };
> --
> 2.34.3
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-01 5:51 ` Jason Wang
0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2022-06-01 5:51 UTC (permalink / raw)
To: Cindy Lu; +Cc: netdev, virtualization, linux-kernel, kvm, mst
On Wed, Jun 1, 2022 at 9:20 AM Cindy Lu <lulu@redhat.com> wrote:
>
> We count pinned_vm as follow in vhost-vDPA
>
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
> ret = -ENOMEM;
> goto unlock;
> }
> This means if we have two vDPA devices for the same VM the pages would be counted twice
> So we add a tree to save the page that counted and we will not count it
> again.
The code is not easy to be reviewed, some suggestions:
- It's better to explain in general the algorithm you used here
- Add more comment in the codes to explain the rationale
And I still see the above check against the RLIMIT in the code, is it
intentional?
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
> drivers/vhost/vdpa.c | 542 +++++++++++++++++++++++++++++++++++++++++-
> drivers/vhost/vhost.h | 1 +
> 2 files changed, 539 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 05f5fd2af58f..1b0da0735efd 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -24,6 +24,10 @@
> #include <linux/vhost.h>
>
> #include "vhost.h"
> +#include <linux/rbtree.h>
> +#include <linux/interval_tree.h>
> +#include <linux/interval_tree_generic.h>
> +#include <linux/hashtable.h>
>
> enum {
> VHOST_VDPA_BACKEND_FEATURES =
> @@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> return r;
> }
>
> +struct vdpa_tree_node {
> + struct interval_tree_node tree_node;
Can we simply reuse the vhost_iotlb tree? Note that vhost_iotlb_map
can be associated with a opaque as token which could be used as
reference count.
> + int ref;
If it's a refcount, let's use unsigned here.
> +};
> +struct vdpa_link_node {
> + struct vdpa_tree_node *vdpa_node;
> + struct vdpa_link_node *next;
Need to explain why we need a linked list here considering we've
already structured it as an interval tree.
Btw, unless it's performance critical, let's try to re-use kernel list.h.
> + u64 node_start;
> + u64 node_last;
Let's add a comment to explain each member here.
> +};
> +
> +int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
> + int ref)
We don't want to export this symbol, so let's make the function
static, so did other functions.
> +{
> + struct interval_tree_node *new_node;
> + struct vdpa_tree_node *vdpa_node;
> +
> + if (last < start)
> + return -EFAULT;
> +
> + /* If the range being mapped is [0, ULONG_MAX], split it into two entries
> + * otherwise its size would overflow u64.
> + */
> + if (start == 0 && last == ULONG_MAX) {
> + u64 mid = last / 2;
> +
> + vhost_vdpa_add_range_ctx(root, start, mid, ref);
> + start = mid + 1;
> + }
> + vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
> +
Let's check if kmalloc succeeds here.
> + new_node = &vdpa_node->tree_node;
> + if (!new_node)
> + return -ENOMEM;
> +
> + new_node->start = start;
> + new_node->last = last;
> + vdpa_node->ref = ref;
> +
> + interval_tree_insert(new_node, root);
> +
> + return 0;
> +}
> +
> +u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
> + struct vdpa_link_node *link_head, int node_number,
> + u64 start, u64 last)
> +{
> + int i = 0;
> + u64 size = 0;
> + int new_ref;
> + u64 node_start;
> + u64 node_last;
> + u64 range_start;
> + u64 range_last;
> + int range_size;
> + struct vdpa_link_node *link_node;
> + struct vdpa_tree_node *vdpa_node = NULL;
> + struct interval_tree_node *node = NULL;
> +
> + if (node_number == 0) {
> + vhost_vdpa_add_range_ctx(root, start, last, 1);
> +
> + size = last - start + 1;
> + return size;
> + }
> +
> + link_node = link_head;
> + range_start = start;
> + range_last = last;
> + range_size = range_start - range_last;
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = link_node->vdpa_node;
> + link_node = link_node->next;
> + node = &vdpa_node->tree_node;
> + new_ref = vdpa_node->ref;
> + node_start = node->start;
> + node_last = node->last;
> +
> + if (range_start == node_start) {
> + if (node_last < range_last) {
> + /* range_start= node->start--- node->last--range_last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + /*count the next range */
> + } else if (node_last > range_last) {
> + /* range_start= node->start --- last -- node->last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + } else {
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + }
> + } else if (node_start < range_start) {
> + if (range_last < node_last) {
> + /* node->start--- start--- last--- node->last*/
> + /* should the end rang*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> +
> + } else if (range_last > node_last) {
> + /* node->start--- start--- node->last-- last*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_last,
> + new_ref + 1);
> + } else {
> + /* node->start--- start--- node->last= last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_last,
> + new_ref + 1);
> + /* should the end rang*/
> + }
> + } else {
> + if (node_last < range_last) {
> + /* range_start --- node->start --- node->last ----last */
> +
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + size += ((node_start - 1) - range_start) + 1;
> + } else if (node_last > range_last) {
> + /* range_start--- node->start --- last -- node->last */
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_last,
> + new_ref + 1);
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + size += ((node_start - 1) - range_start) + 1;
> +
> + /* should the end rang*/
> + } else {
> + /* range_start--- node->start --- last = node->last */
> + vhost_vdpa_add_range_ctx(root, range_start,
> + node_start - 1, 1);
> + vhost_vdpa_add_range_ctx(root, node_start,
> + node_last,
> + new_ref + 1);
> + size += ((node_start - 1) - range_start) + 1;
> +
> + /* should the end rang*/
> + }
> + }
> + /* work in next node*/
> + range_start = node_last + 1;
> + if (range_start > range_last)
> + break;
> + }
> +
> + range_size = range_last - range_start;
> +
> + /* last round and still some range*/
> +
> + if ((range_size >= 0) && (range_start >= node_last) &&
> + (node_number == i + 1)) {
> + vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
> + size = size + (range_last - range_start) + 1;
> + } else if ((range_size == -1) && (node_number == i + 1)) {
> + return size;
> + } else {
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
> + __func__, __LINE__, range_start, range_last, node_start,
> + node_last, i);
> + }
> +
> + return size;
> +}
> +
> +u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
> + struct vdpa_link_node *link_head, int node_number,
> + u64 start, u64 last)
> +{
> + int i = 0;
> + u64 size = 0;
> + int new_ref;
> + u64 node_start;
> + u64 node_last;
> + u64 range_start;
> + u64 range_last;
> + int range_size;
> + struct vdpa_link_node *link_node;
> + struct vdpa_tree_node *vdpa_node = NULL;
> + struct interval_tree_node *node = NULL;
> +
> + if (node_number == 0)
> + return 0;
> +
> + link_node = link_head;
> + range_start = start;
> + range_last = last;
> +
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = link_node->vdpa_node;
> + link_node = link_node->next;
> + node = &vdpa_node->tree_node;
> + new_ref = vdpa_node->ref;
> + node_start = node->start;
> + node_last = node->last;
> +
> + if (range_start == node_start) {
> + if (node_last < range_last) {
> + /* range_start =node->start --- node->last ----last*/
The comment needs some tweaking to be understood by the reviewers easily.
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + node_last,
> + new_ref - 1);
> + /*count the next range */
> + } else {
> + /* if the ref =0, do not need add it back, count size*/
> + size += (node_last - node_start) + 1;
> + }
> +
> + } else if (node_last > range_last) {
> + /* range_start= node->start --- last -- node->last*/
> +
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - node_start) + 1;
> + }
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> + } else {
> + /* range_start= node->start --- last = node->last*/
> +
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + node_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - node_start) + 1;
> + }
> + /* should be the end */
> + }
> + } else if (node_start < range_start) {
> + if (range_last < node_last) {
> + /* node->start--- start--- last--- node->last*/
> + /* should the end rang*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - range_start) + 1;
> + }
> + vhost_vdpa_add_range_ctx(root, range_last + 1,
> + node_last, new_ref);
> +
> + } else if (range_last > node_last) {
> + /* node->start--- start--- node->last--- last*/
> +
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + node_last,
> + new_ref - 1);
> + } else {
> + size += (node_last - range_start) + 1;
> + }
> + } else {
> + /* node->start--- start--- node->last= last*/
> + vhost_vdpa_add_range_ctx(root, node_start,
> + range_start - 1,
> + new_ref);
> + if (new_ref > 1) {
> + vhost_vdpa_add_range_ctx(root,
> + range_start,
> + range_last,
> + new_ref - 1);
> + } else {
> + size += (range_last - range_start) + 1;
> + }
> + /* should be the end */
> + }
> + } else {
> + /* some range not in the node, error*/
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld new_ref %d",
> + __func__, __LINE__, range_start, range_last,
> + node_start, node_last, new_ref);
> + }
> +
> + range_start = node_last + 1;
> + if (range_start > range_last)
> + break;
> + }
> +
> + range_size = range_last - range_start;
> +
> + /* last round and still some range*/
> +
> + if ((range_size > 0) && (node_number == i + 1)) {
> + printk(KERN_WARNING,
> + "%s %d FAIL start %lld last %lld node->start %lld node->last %lld range_size %d",
> + __func__, __LINE__, range_start, range_last, node_start,
> + node_last, range_size);
> + }
> + return size;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
> + struct vdpa_link_node *list2)
> +{
> + struct vdpa_link_node dummy_head;
> + struct vdpa_link_node *ptr = &dummy_head;
> +
> + while (list1 && list2) {
> + if (list1->node_start < list2->node_start) {
> + ptr->next = list1;
> + list1 = list1->next;
> + } else {
> + ptr->next = list2;
> + list2 = list2->next;
> + }
> + ptr = ptr->next;
> + }
> + if (list1)
> + ptr->next = list1;
> + else
> + ptr->next = list2;
> +
> + return dummy_head.next;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
> +{
> + struct vdpa_link_node *mid_prev = NULL;
> + struct vdpa_link_node *mid;
> +
> + while (head && head->next) {
> + mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
> + head = head->next->next;
> + }
> + mid = mid_prev->next;
> + mid_prev->next = NULL;
> + return mid;
> +}
> +struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
> +{
> + struct vdpa_link_node *mid;
> + struct vdpa_link_node *left;
> + struct vdpa_link_node *right;
> +
> + if (!head || !head->next)
> + return head;
> +
> + mid = vhost_vdpa_get_mid(head);
> + left = vhost_vdpa_sort_list(head);
> + right = vhost_vdpa_sort_list(mid);
> + return vhost_vdpa_merge_list(left, right);
> +}
> +
> +u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
> + bool ops)
> +{
> + struct interval_tree_node *node = NULL;
> + struct vdpa_tree_node *vdpa_node;
> + int node_number = 0;
> + int i = 0;
> + u64 size = 0;
> + struct vdpa_link_node dummy_head = { 0 };
> + struct vdpa_link_node *link_node;
> + struct vdpa_link_node *link_head_tmp;
> + struct vdpa_link_node *pre_link_node;
> +
> + pre_link_node = &dummy_head;
> + /*search the rang overlaped, and del from the tree*/
> + for (node = interval_tree_iter_first(root, start, last); node;
> + node = interval_tree_iter_next(node, start, last)) {
> + link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
> + if (link_node == NULL) {
> + goto out;
> + }
> + vdpa_node =
> + container_of(node, struct vdpa_tree_node, tree_node);
> + link_node->vdpa_node = vdpa_node;
> + link_node->node_start = node->start;
> + link_node->node_last = node->last;
> +
> + pre_link_node->next = link_node;
> + pre_link_node = link_node;
> + pre_link_node->next = NULL;
> +
> + node_number++;
> +
> + interval_tree_remove(node, root);
> + }
> + /* sorting the node */
The code explains itself, let's explain why we need the sort.
> + link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
> +
> + /* these link node are have overlap with range, check the ref and add back to the tree*/
> + if (ops == true) {
> + size = vhost_vdpa_range_ref_add(root, link_head_tmp,
> + node_number, start, last);
> + } else {
> + size = vhost_vdpa_range_ref_del(root, link_head_tmp,
> + node_number, start, last);
> + }
> +out:
> + pre_link_node = link_head_tmp;
> +
> + for (i = 0; i < node_number; i++) {
> + vdpa_node = pre_link_node->vdpa_node;
> + link_node = pre_link_node->next;
> + kfree(vdpa_node);
> + kfree(pre_link_node);
> + pre_link_node = link_node;
> + }
> + return size;
> +}
> +u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
> + u64 last)
> +{
> + u64 size;
> +
> + size = vhost_vdpa_range_ops(root, start, last, true);
> +
> + return size;
> +}
> +
> +u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
> + u64 last)
> +{
> + u64 size;
> +
> + size = vhost_vdpa_range_ops(root, start, last, false);
> +
> + return size;
> +}
> +
> static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> {
> struct vhost_dev *dev = &v->vdev;
> struct vhost_iotlb *iotlb = dev->iotlb;
> struct vhost_iotlb_map *map;
> struct page *page;
> + u64 size;
> unsigned long pfn, pinned;
>
> while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
> @@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> set_page_dirty_lock(page);
> unpin_user_page(page);
> }
> - atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
> +
> + size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
> + map->start,
> + map->start + map->size - 1);
> + atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
> vhost_iotlb_map_free(iotlb, map);
> }
> }
> @@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> int r = 0;
> + u64 size_count;
>
> r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
> pa, perm, opaque);
> @@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
> vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
> return r;
> }
> -
> - if (!vdpa->use_va)
> - atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> + if (!vdpa->use_va) {
> + size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
> + iova, iova + size - 1);
> + atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
> + }
>
> return 0;
> }
> @@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
> }
> }
>
> +struct root_for_vdpa_node {
> + struct hlist_node hlist;
> + struct rb_root_cached vdpa_mem_tree;
> + pid_t pid_using;
> +};
> +static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
> +int status_for_vdpa_tree = 0;
> +
> +struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
> +{
Any reason we get the tree via task_struct instead of mm_struct?
> + struct root_for_vdpa_node *root_get_tmp = NULL;
> + pid_t pid_using = task_pid_nr(task);
> +
> + /* No hased table, init one */
> + if (status_for_vdpa_tree == 0) {
> + hash_init(root_for_vdpa_node_list);
> + status_for_vdpa_tree = 1;
> + }
> +
> + hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> + pid_using) {
> + if (root_get_tmp->pid_using == pid_using)
> + return root_get_tmp;
> + }
> +
> + root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
> + root_get_tmp->pid_using = pid_using;
> +
> + root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
> +
> + hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
> + root_get_tmp->pid_using);
> +
> + return root_get_tmp;
> +}
> +
> +void vhost_vdpa_relase_mem_tree(struct task_struct *task)
> +{
> + struct root_for_vdpa_node *root_get_tmp = NULL;
> + pid_t pid_using = task_pid_nr(task);
> +
> + /* No hased table, init one */
> + hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> + pid_using) {
> + if (root_get_tmp->pid_using == pid_using)
> + kfree(root_get_tmp);
> + return;
> + }
> +
> + return;
> +}
> +
> static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> {
> struct vhost_vdpa *v;
> @@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> vhost_vdpa_set_iova_range(v);
>
> filep->private_data = v;
> + struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
This looks wrong, the mapping contains VA so it is bound to the owner.
This means
1) Need to get and put the accounting tree via mm_struct in set_owner
2) Need to release all mappings during reset owner
Thanks
> + dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
>
> return 0;
>
> err_init_iotlb:
> + vhost_vdpa_relase_mem_tree(current);
> vhost_dev_cleanup(&v->vdev);
> kfree(vqs);
> err:
> @@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
> struct vhost_dev *d = &v->vdev;
>
> mutex_lock(&d->mutex);
> + vhost_vdpa_relase_mem_tree(current);
> +
> filep->private_data = NULL;
> vhost_vdpa_clean_irq(v);
> vhost_vdpa_reset(v);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 638bb640d6b4..d1c662eb4f26 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -161,6 +161,7 @@ struct vhost_dev {
> int byte_weight;
> u64 kcov_handle;
> bool use_worker;
> + struct rb_root_cached *vdpa_mem_tree;
> int (*msg_handler)(struct vhost_dev *dev,
> struct vhost_iotlb_msg *msg);
> };
> --
> 2.34.3
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
2022-06-01 1:20 Cindy Lu
@ 2022-06-01 5:03 ` kernel test robot
2022-06-01 5:51 ` Jason Wang
1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2022-06-01 5:03 UTC (permalink / raw)
To: Cindy Lu; +Cc: llvm, kbuild-all
Hi Cindy,
[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on v5.18]
[cannot apply to mst-vhost/linux-next next-20220601]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/intel-lab-lkp/linux/commits/Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
base: 4b0986a3613c92f4ec1bdc7f60ec66fea135991f
config: x86_64-randconfig-a005 (https://download.01.org/0day-ci/archive/20220601/202206011237.d34xUNB3-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c825abd6b0198fb088d9752f556a70705bc99dfd)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/8f0b25afe9ac570a70eafb5c285747e3bdd2471d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
git checkout 8f0b25afe9ac570a70eafb5c285747e3bdd2471d
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/vhost/
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> drivers/vhost/vdpa.c:555:5: warning: no previous prototype for function 'vhost_vdpa_add_range_ctx' [-Wmissing-prototypes]
int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
^
drivers/vhost/vdpa.c:555:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
^
static
>> drivers/vhost/vdpa.c:727:10: warning: data argument not used by format string [-Wformat-extra-args]
"%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/printk.h:446:60: note: expanded from macro 'printk'
#define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
_p_func(_fmt, ##__VA_ARGS__); \
~~~~ ^
>> drivers/vhost/vdpa.c:588:5: warning: no previous prototype for function 'vhost_vdpa_range_ref_add' [-Wmissing-prototypes]
u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
^
drivers/vhost/vdpa.c:588:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
^
static
drivers/vhost/vdpa.c:856:11: warning: data argument not used by format string [-Wformat-extra-args]
"%s %d FAIL start %lld last %lld node->start %lld node->last %lld new_ref %d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/printk.h:446:60: note: expanded from macro 'printk'
#define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
_p_func(_fmt, ##__VA_ARGS__); \
~~~~ ^
drivers/vhost/vdpa.c:872:10: warning: data argument not used by format string [-Wformat-extra-args]
"%s %d FAIL start %lld last %lld node->start %lld node->last %lld range_size %d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/printk.h:446:60: note: expanded from macro 'printk'
#define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
_p_func(_fmt, ##__VA_ARGS__); \
~~~~ ^
>> drivers/vhost/vdpa.c:735:5: warning: no previous prototype for function 'vhost_vdpa_range_ref_del' [-Wmissing-prototypes]
u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
^
drivers/vhost/vdpa.c:735:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
^
static
>> drivers/vhost/vdpa.c:879:24: warning: no previous prototype for function 'vhost_vdpa_merge_list' [-Wmissing-prototypes]
struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
^
drivers/vhost/vdpa.c:879:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
^
static
>> drivers/vhost/vdpa.c:903:24: warning: no previous prototype for function 'vhost_vdpa_get_mid' [-Wmissing-prototypes]
struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
^
drivers/vhost/vdpa.c:903:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
^
static
>> drivers/vhost/vdpa.c:916:24: warning: no previous prototype for function 'vhost_vdpa_sort_list' [-Wmissing-prototypes]
struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
^
drivers/vhost/vdpa.c:916:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
^
static
>> drivers/vhost/vdpa.c:931:5: warning: no previous prototype for function 'vhost_vdpa_range_ops' [-Wmissing-prototypes]
u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
^
drivers/vhost/vdpa.c:931:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
^
static
>> drivers/vhost/vdpa.c:949:7: warning: variable 'link_head_tmp' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (link_node == NULL) {
^~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:978:18: note: uninitialized use occurs here
pre_link_node = link_head_tmp;
^~~~~~~~~~~~~
drivers/vhost/vdpa.c:949:3: note: remove the 'if' if its condition is always false
if (link_node == NULL) {
^~~~~~~~~~~~~~~~~~~~~~~~
drivers/vhost/vdpa.c:941:38: note: initialize the variable 'link_head_tmp' to silence this warning
struct vdpa_link_node *link_head_tmp;
^
= NULL
>> drivers/vhost/vdpa.c:989:5: warning: no previous prototype for function 'vhost_vdpa_search_range_add' [-Wmissing-prototypes]
u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
^
drivers/vhost/vdpa.c:989:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
^
static
>> drivers/vhost/vdpa.c:999:5: warning: no previous prototype for function 'vhost_vdpa_search_range_del' [-Wmissing-prototypes]
u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
^
drivers/vhost/vdpa.c:999:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
^
static
>> drivers/vhost/vdpa.c:1465:28: warning: no previous prototype for function 'vhost_vdpa_get_mem_tree' [-Wmissing-prototypes]
struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
^
drivers/vhost/vdpa.c:1465:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
^
static
>> drivers/vhost/vdpa.c:1493:6: warning: no previous prototype for function 'vhost_vdpa_relase_mem_tree' [-Wmissing-prototypes]
void vhost_vdpa_relase_mem_tree(struct task_struct *task)
^
drivers/vhost/vdpa.c:1493:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void vhost_vdpa_relase_mem_tree(struct task_struct *task)
^
static
>> drivers/vhost/vdpa.c:1555:29: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
^
16 warnings generated.
vim +/vhost_vdpa_add_range_ctx +555 drivers/vhost/vdpa.c
554
> 555 int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
556 int ref)
557 {
558 struct interval_tree_node *new_node;
559 struct vdpa_tree_node *vdpa_node;
560
561 if (last < start)
562 return -EFAULT;
563
564 /* If the range being mapped is [0, ULONG_MAX], split it into two entries
565 * otherwise its size would overflow u64.
566 */
567 if (start == 0 && last == ULONG_MAX) {
568 u64 mid = last / 2;
569
570 vhost_vdpa_add_range_ctx(root, start, mid, ref);
571 start = mid + 1;
572 }
573 vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
574
575 new_node = &vdpa_node->tree_node;
576 if (!new_node)
577 return -ENOMEM;
578
579 new_node->start = start;
580 new_node->last = last;
581 vdpa_node->ref = ref;
582
583 interval_tree_insert(new_node, root);
584
585 return 0;
586 }
587
> 588 u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
589 struct vdpa_link_node *link_head, int node_number,
590 u64 start, u64 last)
591 {
592 int i = 0;
593 u64 size = 0;
594 int new_ref;
595 u64 node_start;
596 u64 node_last;
597 u64 range_start;
598 u64 range_last;
599 int range_size;
600 struct vdpa_link_node *link_node;
601 struct vdpa_tree_node *vdpa_node = NULL;
602 struct interval_tree_node *node = NULL;
603
604 if (node_number == 0) {
605 vhost_vdpa_add_range_ctx(root, start, last, 1);
606
607 size = last - start + 1;
608 return size;
609 }
610
611 link_node = link_head;
612 range_start = start;
613 range_last = last;
614 range_size = range_start - range_last;
615 for (i = 0; i < node_number; i++) {
616 vdpa_node = link_node->vdpa_node;
617 link_node = link_node->next;
618 node = &vdpa_node->tree_node;
619 new_ref = vdpa_node->ref;
620 node_start = node->start;
621 node_last = node->last;
622
623 if (range_start == node_start) {
624 if (node_last < range_last) {
625 /* range_start= node->start--- node->last--range_last*/
626 vhost_vdpa_add_range_ctx(root, node_start,
627 node_last,
628 new_ref + 1);
629 /*count the next range */
630 } else if (node_last > range_last) {
631 /* range_start= node->start --- last -- node->last*/
632 vhost_vdpa_add_range_ctx(root, node_start,
633 range_last,
634 new_ref + 1);
635 vhost_vdpa_add_range_ctx(root, range_last + 1,
636 node_last, new_ref);
637 } else {
638 vhost_vdpa_add_range_ctx(root, node_start,
639 node_last,
640 new_ref + 1);
641 }
642 } else if (node_start < range_start) {
643 if (range_last < node_last) {
644 /* node->start--- start--- last--- node->last*/
645 /* should the end rang*/
646
647 vhost_vdpa_add_range_ctx(root, node_start,
648 range_start - 1,
649 new_ref);
650 vhost_vdpa_add_range_ctx(root, range_start,
651 range_last,
652 new_ref + 1);
653 vhost_vdpa_add_range_ctx(root, range_last + 1,
654 node_last, new_ref);
655
656 } else if (range_last > node_last) {
657 /* node->start--- start--- node->last-- last*/
658
659 vhost_vdpa_add_range_ctx(root, node_start,
660 range_start - 1,
661 new_ref);
662 vhost_vdpa_add_range_ctx(root, range_start,
663 node_last,
664 new_ref + 1);
665 } else {
666 /* node->start--- start--- node->last= last*/
667 vhost_vdpa_add_range_ctx(root, node_start,
668 range_start - 1,
669 new_ref);
670 vhost_vdpa_add_range_ctx(root, range_start,
671 node_last,
672 new_ref + 1);
673 /* should the end rang*/
674 }
675 } else {
676 if (node_last < range_last) {
677 /* range_start --- node->start --- node->last ----last */
678
679 vhost_vdpa_add_range_ctx(root, range_start,
680 node_start - 1, 1);
681 vhost_vdpa_add_range_ctx(root, node_start,
682 node_last,
683 new_ref + 1);
684 size += ((node_start - 1) - range_start) + 1;
685 } else if (node_last > range_last) {
686 /* range_start--- node->start --- last -- node->last */
687 vhost_vdpa_add_range_ctx(root, range_start,
688 node_start - 1, 1);
689 vhost_vdpa_add_range_ctx(root, node_start,
690 range_last,
691 new_ref + 1);
692 vhost_vdpa_add_range_ctx(root, range_last + 1,
693 node_last, new_ref);
694 size += ((node_start - 1) - range_start) + 1;
695
696 /* should the end rang*/
697 } else {
698 /* range_start--- node->start --- last = node->last */
699 vhost_vdpa_add_range_ctx(root, range_start,
700 node_start - 1, 1);
701 vhost_vdpa_add_range_ctx(root, node_start,
702 node_last,
703 new_ref + 1);
704 size += ((node_start - 1) - range_start) + 1;
705
706 /* should the end rang*/
707 }
708 }
709 /* work in next node*/
710 range_start = node_last + 1;
711 if (range_start > range_last)
712 break;
713 }
714
715 range_size = range_last - range_start;
716
717 /* last round and still some range*/
718
719 if ((range_size >= 0) && (range_start >= node_last) &&
720 (node_number == i + 1)) {
721 vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
722 size = size + (range_last - range_start) + 1;
723 } else if ((range_size == -1) && (node_number == i + 1)) {
724 return size;
725 } else {
726 printk(KERN_WARNING,
> 727 "%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
728 __func__, __LINE__, range_start, range_last, node_start,
729 node_last, i);
730 }
731
732 return size;
733 }
734
> 735 u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
736 struct vdpa_link_node *link_head, int node_number,
737 u64 start, u64 last)
738 {
739 int i = 0;
740 u64 size = 0;
741 int new_ref;
742 u64 node_start;
743 u64 node_last;
744 u64 range_start;
745 u64 range_last;
746 int range_size;
747 struct vdpa_link_node *link_node;
748 struct vdpa_tree_node *vdpa_node = NULL;
749 struct interval_tree_node *node = NULL;
750
751 if (node_number == 0)
752 return 0;
753
754 link_node = link_head;
755 range_start = start;
756 range_last = last;
757
758 for (i = 0; i < node_number; i++) {
759 vdpa_node = link_node->vdpa_node;
760 link_node = link_node->next;
761 node = &vdpa_node->tree_node;
762 new_ref = vdpa_node->ref;
763 node_start = node->start;
764 node_last = node->last;
765
766 if (range_start == node_start) {
767 if (node_last < range_last) {
768 /* range_start =node->start --- node->last ----last*/
769 if (new_ref > 1) {
770 vhost_vdpa_add_range_ctx(root,
771 node_start,
772 node_last,
773 new_ref - 1);
774 /*count the next range */
775 } else {
776 /* if the ref =0, do not need add it back, count size*/
777 size += (node_last - node_start) + 1;
778 }
779
780 } else if (node_last > range_last) {
781 /* range_start= node->start --- last -- node->last*/
782
783 if (new_ref > 1) {
784 vhost_vdpa_add_range_ctx(root,
785 node_start,
786 range_last,
787 new_ref - 1);
788 } else {
789 size += (range_last - node_start) + 1;
790 }
791 vhost_vdpa_add_range_ctx(root, range_last + 1,
792 node_last, new_ref);
793 } else {
794 /* range_start= node->start --- last = node->last*/
795
796 if (new_ref > 1) {
797 vhost_vdpa_add_range_ctx(root,
798 node_start,
799 range_last,
800 new_ref - 1);
801 } else {
802 size += (range_last - node_start) + 1;
803 }
804 /* should be the end */
805 }
806 } else if (node_start < range_start) {
807 if (range_last < node_last) {
808 /* node->start--- start--- last--- node->last*/
809 /* should the end rang*/
810 vhost_vdpa_add_range_ctx(root, node_start,
811 range_start - 1,
812 new_ref);
813 if (new_ref > 1) {
814 vhost_vdpa_add_range_ctx(root,
815 range_start,
816 range_last,
817 new_ref - 1);
818 } else {
819 size += (range_last - range_start) + 1;
820 }
821 vhost_vdpa_add_range_ctx(root, range_last + 1,
822 node_last, new_ref);
823
824 } else if (range_last > node_last) {
825 /* node->start--- start--- node->last--- last*/
826
827 vhost_vdpa_add_range_ctx(root, node_start,
828 range_start - 1,
829 new_ref);
830 if (new_ref > 1) {
831 vhost_vdpa_add_range_ctx(root,
832 range_start,
833 node_last,
834 new_ref - 1);
835 } else {
836 size += (node_last - range_start) + 1;
837 }
838 } else {
839 /* node->start--- start--- node->last= last*/
840 vhost_vdpa_add_range_ctx(root, node_start,
841 range_start - 1,
842 new_ref);
843 if (new_ref > 1) {
844 vhost_vdpa_add_range_ctx(root,
845 range_start,
846 range_last,
847 new_ref - 1);
848 } else {
849 size += (range_last - range_start) + 1;
850 }
851 /* should be the end */
852 }
853 } else {
854 /* some range not in the node, error*/
855 printk(KERN_WARNING,
856 "%s %d FAIL start %lld last %lld node->start %lld node->last %lld new_ref %d",
857 __func__, __LINE__, range_start, range_last,
858 node_start, node_last, new_ref);
859 }
860
861 range_start = node_last + 1;
862 if (range_start > range_last)
863 break;
864 }
865
866 range_size = range_last - range_start;
867
868 /* last round and still some range*/
869
870 if ((range_size > 0) && (node_number == i + 1)) {
871 printk(KERN_WARNING,
872 "%s %d FAIL start %lld last %lld node->start %lld node->last %lld range_size %d",
873 __func__, __LINE__, range_start, range_last, node_start,
874 node_last, range_size);
875 }
876 return size;
877 }
878
--
0-DAY CI Kernel Test Service
https://01.org/lkp
^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-01 1:20 Cindy Lu
2022-06-01 5:03 ` kernel test robot
2022-06-01 5:51 ` Jason Wang
0 siblings, 2 replies; 5+ messages in thread
From: Cindy Lu @ 2022-06-01 1:20 UTC (permalink / raw)
To: lulu, mst, jasowang, kvm, virtualization, netdev, linux-kernel
We count pinned_vm as follow in vhost-vDPA
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
ret = -ENOMEM;
goto unlock;
}
This means if we have two vDPA devices for the same VM the pages would be counted twice
So we add a tree to save the page that counted and we will not count it
again.
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
drivers/vhost/vdpa.c | 542 +++++++++++++++++++++++++++++++++++++++++-
drivers/vhost/vhost.h | 1 +
2 files changed, 539 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 05f5fd2af58f..1b0da0735efd 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -24,6 +24,10 @@
#include <linux/vhost.h>
#include "vhost.h"
+#include <linux/rbtree.h>
+#include <linux/interval_tree.h>
+#include <linux/interval_tree_generic.h>
+#include <linux/hashtable.h>
enum {
VHOST_VDPA_BACKEND_FEATURES =
@@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
return r;
}
+struct vdpa_tree_node {
+ struct interval_tree_node tree_node;
+ int ref;
+};
+struct vdpa_link_node {
+ struct vdpa_tree_node *vdpa_node;
+ struct vdpa_link_node *next;
+ u64 node_start;
+ u64 node_last;
+};
+
+int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
+ int ref)
+{
+ struct interval_tree_node *new_node;
+ struct vdpa_tree_node *vdpa_node;
+
+ if (last < start)
+ return -EFAULT;
+
+ /* If the range being mapped is [0, ULONG_MAX], split it into two entries
+ * otherwise its size would overflow u64.
+ */
+ if (start == 0 && last == ULONG_MAX) {
+ u64 mid = last / 2;
+
+ vhost_vdpa_add_range_ctx(root, start, mid, ref);
+ start = mid + 1;
+ }
+ vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
+
+ new_node = &vdpa_node->tree_node;
+ if (!new_node)
+ return -ENOMEM;
+
+ new_node->start = start;
+ new_node->last = last;
+ vdpa_node->ref = ref;
+
+ interval_tree_insert(new_node, root);
+
+ return 0;
+}
+
+u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
+ struct vdpa_link_node *link_head, int node_number,
+ u64 start, u64 last)
+{
+ int i = 0;
+ u64 size = 0;
+ int new_ref;
+ u64 node_start;
+ u64 node_last;
+ u64 range_start;
+ u64 range_last;
+ int range_size;
+ struct vdpa_link_node *link_node;
+ struct vdpa_tree_node *vdpa_node = NULL;
+ struct interval_tree_node *node = NULL;
+
+ if (node_number == 0) {
+ vhost_vdpa_add_range_ctx(root, start, last, 1);
+
+ size = last - start + 1;
+ return size;
+ }
+
+ link_node = link_head;
+ range_start = start;
+ range_last = last;
+ range_size = range_start - range_last;
+ for (i = 0; i < node_number; i++) {
+ vdpa_node = link_node->vdpa_node;
+ link_node = link_node->next;
+ node = &vdpa_node->tree_node;
+ new_ref = vdpa_node->ref;
+ node_start = node->start;
+ node_last = node->last;
+
+ if (range_start == node_start) {
+ if (node_last < range_last) {
+ /* range_start= node->start--- node->last--range_last*/
+ vhost_vdpa_add_range_ctx(root, node_start,
+ node_last,
+ new_ref + 1);
+ /*count the next range */
+ } else if (node_last > range_last) {
+ /* range_start= node->start --- last -- node->last*/
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_last,
+ new_ref + 1);
+ vhost_vdpa_add_range_ctx(root, range_last + 1,
+ node_last, new_ref);
+ } else {
+ vhost_vdpa_add_range_ctx(root, node_start,
+ node_last,
+ new_ref + 1);
+ }
+ } else if (node_start < range_start) {
+ if (range_last < node_last) {
+ /* node->start--- start--- last--- node->last*/
+ /* should the end rang*/
+
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ vhost_vdpa_add_range_ctx(root, range_start,
+ range_last,
+ new_ref + 1);
+ vhost_vdpa_add_range_ctx(root, range_last + 1,
+ node_last, new_ref);
+
+ } else if (range_last > node_last) {
+ /* node->start--- start--- node->last-- last*/
+
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ vhost_vdpa_add_range_ctx(root, range_start,
+ node_last,
+ new_ref + 1);
+ } else {
+ /* node->start--- start--- node->last= last*/
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ vhost_vdpa_add_range_ctx(root, range_start,
+ node_last,
+ new_ref + 1);
+ /* should the end rang*/
+ }
+ } else {
+ if (node_last < range_last) {
+ /* range_start --- node->start --- node->last ----last */
+
+ vhost_vdpa_add_range_ctx(root, range_start,
+ node_start - 1, 1);
+ vhost_vdpa_add_range_ctx(root, node_start,
+ node_last,
+ new_ref + 1);
+ size += ((node_start - 1) - range_start) + 1;
+ } else if (node_last > range_last) {
+ /* range_start--- node->start --- last -- node->last */
+ vhost_vdpa_add_range_ctx(root, range_start,
+ node_start - 1, 1);
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_last,
+ new_ref + 1);
+ vhost_vdpa_add_range_ctx(root, range_last + 1,
+ node_last, new_ref);
+ size += ((node_start - 1) - range_start) + 1;
+
+ /* should the end rang*/
+ } else {
+ /* range_start--- node->start --- last = node->last */
+ vhost_vdpa_add_range_ctx(root, range_start,
+ node_start - 1, 1);
+ vhost_vdpa_add_range_ctx(root, node_start,
+ node_last,
+ new_ref + 1);
+ size += ((node_start - 1) - range_start) + 1;
+
+ /* should the end rang*/
+ }
+ }
+ /* work in next node*/
+ range_start = node_last + 1;
+ if (range_start > range_last)
+ break;
+ }
+
+ range_size = range_last - range_start;
+
+ /* last round and still some range*/
+
+ if ((range_size >= 0) && (range_start >= node_last) &&
+ (node_number == i + 1)) {
+ vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
+ size = size + (range_last - range_start) + 1;
+ } else if ((range_size == -1) && (node_number == i + 1)) {
+ return size;
+ } else {
+ printk(KERN_WARNING,
+ "%s %d FAIL start %lld last %lld node->start %lld node->last %lld i %d",
+ __func__, __LINE__, range_start, range_last, node_start,
+ node_last, i);
+ }
+
+ return size;
+}
+
+u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
+ struct vdpa_link_node *link_head, int node_number,
+ u64 start, u64 last)
+{
+ int i = 0;
+ u64 size = 0;
+ int new_ref;
+ u64 node_start;
+ u64 node_last;
+ u64 range_start;
+ u64 range_last;
+ int range_size;
+ struct vdpa_link_node *link_node;
+ struct vdpa_tree_node *vdpa_node = NULL;
+ struct interval_tree_node *node = NULL;
+
+ if (node_number == 0)
+ return 0;
+
+ link_node = link_head;
+ range_start = start;
+ range_last = last;
+
+ for (i = 0; i < node_number; i++) {
+ vdpa_node = link_node->vdpa_node;
+ link_node = link_node->next;
+ node = &vdpa_node->tree_node;
+ new_ref = vdpa_node->ref;
+ node_start = node->start;
+ node_last = node->last;
+
+ if (range_start == node_start) {
+ if (node_last < range_last) {
+ /* range_start =node->start --- node->last ----last*/
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ node_start,
+ node_last,
+ new_ref - 1);
+ /*count the next range */
+ } else {
+ /* if the ref =0, do not need add it back, count size*/
+ size += (node_last - node_start) + 1;
+ }
+
+ } else if (node_last > range_last) {
+ /* range_start= node->start --- last -- node->last*/
+
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ node_start,
+ range_last,
+ new_ref - 1);
+ } else {
+ size += (range_last - node_start) + 1;
+ }
+ vhost_vdpa_add_range_ctx(root, range_last + 1,
+ node_last, new_ref);
+ } else {
+ /* range_start= node->start --- last = node->last*/
+
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ node_start,
+ range_last,
+ new_ref - 1);
+ } else {
+ size += (range_last - node_start) + 1;
+ }
+ /* should be the end */
+ }
+ } else if (node_start < range_start) {
+ if (range_last < node_last) {
+ /* node->start--- start--- last--- node->last*/
+ /* should the end rang*/
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ range_start,
+ range_last,
+ new_ref - 1);
+ } else {
+ size += (range_last - range_start) + 1;
+ }
+ vhost_vdpa_add_range_ctx(root, range_last + 1,
+ node_last, new_ref);
+
+ } else if (range_last > node_last) {
+ /* node->start--- start--- node->last--- last*/
+
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ range_start,
+ node_last,
+ new_ref - 1);
+ } else {
+ size += (node_last - range_start) + 1;
+ }
+ } else {
+ /* node->start--- start--- node->last= last*/
+ vhost_vdpa_add_range_ctx(root, node_start,
+ range_start - 1,
+ new_ref);
+ if (new_ref > 1) {
+ vhost_vdpa_add_range_ctx(root,
+ range_start,
+ range_last,
+ new_ref - 1);
+ } else {
+ size += (range_last - range_start) + 1;
+ }
+ /* should be the end */
+ }
+ } else {
+ /* some range not in the node, error*/
+ printk(KERN_WARNING,
+ "%s %d FAIL start %lld last %lld node->start %lld node->last %lld new_ref %d",
+ __func__, __LINE__, range_start, range_last,
+ node_start, node_last, new_ref);
+ }
+
+ range_start = node_last + 1;
+ if (range_start > range_last)
+ break;
+ }
+
+ range_size = range_last - range_start;
+
+ /* last round and still some range*/
+
+ if ((range_size > 0) && (node_number == i + 1)) {
+ printk(KERN_WARNING,
+ "%s %d FAIL start %lld last %lld node->start %lld node->last %lld range_size %d",
+ __func__, __LINE__, range_start, range_last, node_start,
+ node_last, range_size);
+ }
+ return size;
+}
+
+struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
+ struct vdpa_link_node *list2)
+{
+ struct vdpa_link_node dummy_head;
+ struct vdpa_link_node *ptr = &dummy_head;
+
+ while (list1 && list2) {
+ if (list1->node_start < list2->node_start) {
+ ptr->next = list1;
+ list1 = list1->next;
+ } else {
+ ptr->next = list2;
+ list2 = list2->next;
+ }
+ ptr = ptr->next;
+ }
+ if (list1)
+ ptr->next = list1;
+ else
+ ptr->next = list2;
+
+ return dummy_head.next;
+}
+
+struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
+{
+ struct vdpa_link_node *mid_prev = NULL;
+ struct vdpa_link_node *mid;
+
+ while (head && head->next) {
+ mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
+ head = head->next->next;
+ }
+ mid = mid_prev->next;
+ mid_prev->next = NULL;
+ return mid;
+}
+struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
+{
+ struct vdpa_link_node *mid;
+ struct vdpa_link_node *left;
+ struct vdpa_link_node *right;
+
+ if (!head || !head->next)
+ return head;
+
+ mid = vhost_vdpa_get_mid(head);
+ left = vhost_vdpa_sort_list(head);
+ right = vhost_vdpa_sort_list(mid);
+ return vhost_vdpa_merge_list(left, right);
+}
+
+u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
+ bool ops)
+{
+ struct interval_tree_node *node = NULL;
+ struct vdpa_tree_node *vdpa_node;
+ int node_number = 0;
+ int i = 0;
+ u64 size = 0;
+ struct vdpa_link_node dummy_head = { 0 };
+ struct vdpa_link_node *link_node;
+ struct vdpa_link_node *link_head_tmp;
+ struct vdpa_link_node *pre_link_node;
+
+ pre_link_node = &dummy_head;
+ /*search the rang overlaped, and del from the tree*/
+ for (node = interval_tree_iter_first(root, start, last); node;
+ node = interval_tree_iter_next(node, start, last)) {
+ link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
+ if (link_node == NULL) {
+ goto out;
+ }
+ vdpa_node =
+ container_of(node, struct vdpa_tree_node, tree_node);
+ link_node->vdpa_node = vdpa_node;
+ link_node->node_start = node->start;
+ link_node->node_last = node->last;
+
+ pre_link_node->next = link_node;
+ pre_link_node = link_node;
+ pre_link_node->next = NULL;
+
+ node_number++;
+
+ interval_tree_remove(node, root);
+ }
+ /* sorting the node */
+ link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
+
+ /* these link node are have overlap with range, check the ref and add back to the tree*/
+ if (ops == true) {
+ size = vhost_vdpa_range_ref_add(root, link_head_tmp,
+ node_number, start, last);
+ } else {
+ size = vhost_vdpa_range_ref_del(root, link_head_tmp,
+ node_number, start, last);
+ }
+out:
+ pre_link_node = link_head_tmp;
+
+ for (i = 0; i < node_number; i++) {
+ vdpa_node = pre_link_node->vdpa_node;
+ link_node = pre_link_node->next;
+ kfree(vdpa_node);
+ kfree(pre_link_node);
+ pre_link_node = link_node;
+ }
+ return size;
+}
+u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
+ u64 last)
+{
+ u64 size;
+
+ size = vhost_vdpa_range_ops(root, start, last, true);
+
+ return size;
+}
+
+u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
+ u64 last)
+{
+ u64 size;
+
+ size = vhost_vdpa_range_ops(root, start, last, false);
+
+ return size;
+}
+
static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
{
struct vhost_dev *dev = &v->vdev;
struct vhost_iotlb *iotlb = dev->iotlb;
struct vhost_iotlb_map *map;
struct page *page;
+ u64 size;
unsigned long pfn, pinned;
while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
@@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
set_page_dirty_lock(page);
unpin_user_page(page);
}
- atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
+
+ size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
+ map->start,
+ map->start + map->size - 1);
+ atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
vhost_iotlb_map_free(iotlb, map);
}
}
@@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
int r = 0;
+ u64 size_count;
r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
pa, perm, opaque);
@@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
return r;
}
-
- if (!vdpa->use_va)
- atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
+ if (!vdpa->use_va) {
+ size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
+ iova, iova + size - 1);
+ atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
+ }
return 0;
}
@@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
}
}
+struct root_for_vdpa_node {
+ struct hlist_node hlist;
+ struct rb_root_cached vdpa_mem_tree;
+ pid_t pid_using;
+};
+static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
+int status_for_vdpa_tree = 0;
+
+struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
+{
+ struct root_for_vdpa_node *root_get_tmp = NULL;
+ pid_t pid_using = task_pid_nr(task);
+
+ /* No hased table, init one */
+ if (status_for_vdpa_tree == 0) {
+ hash_init(root_for_vdpa_node_list);
+ status_for_vdpa_tree = 1;
+ }
+
+ hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
+ pid_using) {
+ if (root_get_tmp->pid_using == pid_using)
+ return root_get_tmp;
+ }
+
+ root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
+ root_get_tmp->pid_using = pid_using;
+
+ root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
+
+ hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
+ root_get_tmp->pid_using);
+
+ return root_get_tmp;
+}
+
+void vhost_vdpa_relase_mem_tree(struct task_struct *task)
+{
+ struct root_for_vdpa_node *root_get_tmp = NULL;
+ pid_t pid_using = task_pid_nr(task);
+
+ /* No hased table, init one */
+ hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
+ pid_using) {
+ if (root_get_tmp->pid_using == pid_using)
+ kfree(root_get_tmp);
+ return;
+ }
+
+ return;
+}
+
static int vhost_vdpa_open(struct inode *inode, struct file *filep)
{
struct vhost_vdpa *v;
@@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
vhost_vdpa_set_iova_range(v);
filep->private_data = v;
+ struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
+ dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
return 0;
err_init_iotlb:
+ vhost_vdpa_relase_mem_tree(current);
vhost_dev_cleanup(&v->vdev);
kfree(vqs);
err:
@@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
struct vhost_dev *d = &v->vdev;
mutex_lock(&d->mutex);
+ vhost_vdpa_relase_mem_tree(current);
+
filep->private_data = NULL;
vhost_vdpa_clean_irq(v);
vhost_vdpa_reset(v);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 638bb640d6b4..d1c662eb4f26 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -161,6 +161,7 @@ struct vhost_dev {
int byte_weight;
u64 kcov_handle;
bool use_worker;
+ struct rb_root_cached *vdpa_mem_tree;
int (*msg_handler)(struct vhost_dev *dev,
struct vhost_iotlb_msg *msg);
};
--
2.34.3
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-06-04 16:04 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-04 16:04 [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA kernel test robot
-- strict thread matches above, loose matches on Subject: below --
2022-06-01 1:20 Cindy Lu
2022-06-01 5:03 ` kernel test robot
2022-06-01 5:51 ` Jason Wang
2022-06-01 5:51 ` Jason Wang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.