All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-04 16:04 kernel test robot
  0 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2022-06-04 16:04 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 33368 bytes --]

:::::: 
:::::: Manual check reason: "low confidence static check warning: drivers/vhost/vdpa.c:614:2: warning: Value stored to 'range_size' is never read [clang-analyzer-deadcode.DeadStores]"
:::::: 

CC: llvm(a)lists.linux.dev
CC: kbuild-all(a)lists.01.org
BCC: lkp(a)intel.com
In-Reply-To: <20220601012019.1102186-1-lulu@redhat.com>
References: <20220601012019.1102186-1-lulu@redhat.com>
TO: Cindy Lu <lulu@redhat.com>

Hi Cindy,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on v5.18]
[cannot apply to mst-vhost/linux-next next-20220603]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
base:    4b0986a3613c92f4ec1bdc7f60ec66fea135991f
:::::: branch date: 4 days ago
:::::: commit date: 4 days ago
config: i386-randconfig-c001 (https://download.01.org/0day-ci/archive/20220604/202206042319.frgBEFEG-lkp(a)intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c825abd6b0198fb088d9752f556a70705bc99dfd)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/8f0b25afe9ac570a70eafb5c285747e3bdd2471d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
        git checkout 8f0b25afe9ac570a70eafb5c285747e3bdd2471d
        # save the config file
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=i386 clang-analyzer 

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>


clang-analyzer warnings: (new ones prefixed by >>)
                           ^
   arch/x86/include/asm/string_32.h:150:25: note: expanded from macro 'memcpy'
   #define memcpy(t, f, n) __builtin_memcpy(t, f, n)
                           ^~~~~~~~~~~~~~~~
   net/ipv6/addrconf.c:7150:4: note: Call to function 'memcpy' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memcpy_s' in case of C11
                           memcpy(dflt,
                           ^
   arch/x86/include/asm/string_32.h:150:25: note: expanded from macro 'memcpy'
   #define memcpy(t, f, n) __builtin_memcpy(t, f, n)
                           ^~~~~~~~~~~~~~~~
   Suppressed 106 warnings (106 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   47 warnings generated.
   drivers/md/dm-thin.c:1269:2: warning: Call to function 'memset' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memset_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
           memset(m, 0, sizeof(struct dm_thin_new_mapping));
           ^
   arch/x86/include/asm/string_32.h:195:29: note: expanded from macro 'memset'
   #define memset(s, c, count) __builtin_memset(s, c, count)
                               ^~~~~~~~~~~~~~~~
   drivers/md/dm-thin.c:1269:2: note: Call to function 'memset' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memset_s' in case of C11
           memset(m, 0, sizeof(struct dm_thin_new_mapping));
           ^
   arch/x86/include/asm/string_32.h:195:29: note: expanded from macro 'memset'
   #define memset(s, c, count) __builtin_memset(s, c, count)
                               ^~~~~~~~~~~~~~~~
   drivers/md/dm-thin.c:3990:10: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
                          format_dev_t(buf, pt->metadata_dev->bdev->bd_dev),
                          ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:3990:10: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
                          format_dev_t(buf, pt->metadata_dev->bdev->bd_dev),
                          ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:3991:10: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
                          format_dev_t(buf2, pt->data_dev->bdev->bd_dev),
                          ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:3991:10: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
                          format_dev_t(buf2, pt->data_dev->bdev->bd_dev),
                          ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:4407:11: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
                                  format_dev_t(buf, tc->pool_dev->bdev->bd_dev),
                                  ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:4407:11: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
                                  format_dev_t(buf, tc->pool_dev->bdev->bd_dev),
                                  ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:4410:19: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
                                   DMEMIT(" %s", format_dev_t(buf, tc->origin_dev->bdev->bd_dev));
                                                 ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   drivers/md/dm-thin.c:4410:19: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
                                   DMEMIT(" %s", format_dev_t(buf, tc->origin_dev->bdev->bd_dev));
                                                 ^
   include/linux/kdev_t.h:19:3: note: expanded from macro 'format_dev_t'
                   sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));       \
                   ^~~~~~~
   include/linux/device-mapper.h:600:46: note: expanded from macro 'DMEMIT'
                             0 : scnprintf(result + sz, maxlen - sz, x))
                                                                     ^
   Suppressed 42 warnings (42 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   84 warnings generated.
>> drivers/vhost/vdpa.c:614:2: warning: Value stored to 'range_size' is never read [clang-analyzer-deadcode.DeadStores]
           range_size = range_start - range_last;
           ^            ~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:614:2: note: Value stored to 'range_size' is never read
           range_size = range_start - range_last;
           ^            ~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:719:40: warning: The right operand of '>=' is a garbage value [clang-analyzer-core.UndefinedBinaryOperatorResult]
           if ((range_size >= 0) && (range_start >= node_last) &&
                                                 ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/vhost/vdpa.c:596:2: note: 'node_last' declared without an initial value
           u64 node_last;
           ^~~~~~~~~~~~~
   drivers/vhost/vdpa.c:604:6: note: Assuming 'node_number' is not equal to 0
           if (node_number == 0) {
               ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/vhost/vdpa.c:604:2: note: '?' condition is false
           if (node_number == 0) {
           ^
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^
   include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                 ^
   drivers/vhost/vdpa.c:604:6: note: 'node_number' is not equal to 0
           if (node_number == 0) {
               ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                                        ^~~~
   include/linux/compiler.h:69:3: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
            ^~~~
   drivers/vhost/vdpa.c:604:2: note: '?' condition is false
           if (node_number == 0) {
           ^
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^
   include/linux/compiler.h:58:69: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                       ^
   include/linux/compiler.h:69:2: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
           ^
   drivers/vhost/vdpa.c:604:2: note: Taking false branch
           if (node_number == 0) {
           ^
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/vhost/vdpa.c:615:14: note: Assuming 'i' is >= 'node_number'
           for (i = 0; i < node_number; i++) {
                       ^~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:615:2: note: Loop condition is false. Execution continues on line 715
           for (i = 0; i < node_number; i++) {
           ^
   drivers/vhost/vdpa.c:719:7: note: Assuming 'range_size' is >= 0
           if ((range_size >= 0) && (range_start >= node_last) &&
                ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/vhost/vdpa.c:719:6: note: Left side of '&&' is true
           if ((range_size >= 0) && (range_start >= node_last) &&
               ^
   drivers/vhost/vdpa.c:719:40: note: The right operand of '>=' is a garbage value
           if ((range_size >= 0) && (range_start >= node_last) &&
                                                 ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
>> drivers/vhost/vdpa.c:978:16: warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign]
           pre_link_node = link_head_tmp;
                         ^
   drivers/vhost/vdpa.c:1589:2: note: Calling 'vhost_vdpa_iotlb_free'
           vhost_vdpa_iotlb_free(v);
           ^~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:1065:2: note: Calling 'vhost_vdpa_iotlb_unmap'
           vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:1055:2: note: Assuming field 'use_va' is false
           if (vdpa->use_va)
           ^
   include/linux/compiler.h:56:45: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/vhost/vdpa.c:1055:2: note: '?' condition is false
           if (vdpa->use_va)
           ^
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^
   include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                 ^
   drivers/vhost/vdpa.c:1055:12: note: Field 'use_va' is false
           if (vdpa->use_va)
                     ^
   drivers/vhost/vdpa.c:1055:2: note: '?' condition is false
           if (vdpa->use_va)
           ^
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^
   include/linux/compiler.h:58:69: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                       ^
   include/linux/compiler.h:69:2: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
           ^
   drivers/vhost/vdpa.c:1055:2: note: Taking false branch
           if (vdpa->use_va)
           ^
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/vhost/vdpa.c:1058:9: note: Calling 'vhost_vdpa_pa_unmap'
           return vhost_vdpa_pa_unmap(v, start, last);
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:1018:9: note: Assuming the condition is true
           while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:1018:2: note: Loop condition is true.  Entering loop body
           while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
           ^
   drivers/vhost/vdpa.c:1021:8: note: Assuming 'pinned' is <= 0
                        pinned > 0; pfn++, pinned--) {
                        ^~~~~~~~~~
   drivers/vhost/vdpa.c:1020:3: note: Loop condition is false. Execution continues on line 1028
                   for (pfn = PFN_DOWN(map->addr);
                   ^
   drivers/vhost/vdpa.c:1028:10: note: Calling 'vhost_vdpa_search_range_del'
                   size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:1004:9: note: Calling 'vhost_vdpa_range_ops'
           size = vhost_vdpa_range_ops(root, start, last, false);
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:941:2: note: 'link_head_tmp' declared without an initial value
           struct vdpa_link_node *link_head_tmp;
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:946:2: note: Loop condition is true.  Entering loop body
           for (node = interval_tree_iter_first(root, start, last); node;
           ^
   drivers/vhost/vdpa.c:949:7: note: Assuming 'link_node' is equal to null
                   if (link_node == NULL) {
                       ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/vhost/vdpa.c:949:3: note: '?' condition is false
                   if (link_node == NULL) {
                   ^
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^
   include/linux/compiler.h:58:31: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                 ^
   drivers/vhost/vdpa.c:949:7: note: 'link_node' is equal to null
                   if (link_node == NULL) {
                       ^
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))

vim +/range_size +614 drivers/vhost/vdpa.c

8f0b25afe9ac57 Cindy Lu 2022-06-01  587  
8f0b25afe9ac57 Cindy Lu 2022-06-01  588  u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
8f0b25afe9ac57 Cindy Lu 2022-06-01  589  			     struct vdpa_link_node *link_head, int node_number,
8f0b25afe9ac57 Cindy Lu 2022-06-01  590  			     u64 start, u64 last)
8f0b25afe9ac57 Cindy Lu 2022-06-01  591  {
8f0b25afe9ac57 Cindy Lu 2022-06-01  592  	int i = 0;
8f0b25afe9ac57 Cindy Lu 2022-06-01  593  	u64 size = 0;
8f0b25afe9ac57 Cindy Lu 2022-06-01  594  	int new_ref;
8f0b25afe9ac57 Cindy Lu 2022-06-01  595  	u64 node_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01  596  	u64 node_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01  597  	u64 range_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01  598  	u64 range_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01  599  	int range_size;
8f0b25afe9ac57 Cindy Lu 2022-06-01  600  	struct vdpa_link_node *link_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01  601  	struct vdpa_tree_node *vdpa_node = NULL;
8f0b25afe9ac57 Cindy Lu 2022-06-01  602  	struct interval_tree_node *node = NULL;
8f0b25afe9ac57 Cindy Lu 2022-06-01  603  
8f0b25afe9ac57 Cindy Lu 2022-06-01  604  	if (node_number == 0) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  605  		vhost_vdpa_add_range_ctx(root, start, last, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  606  
8f0b25afe9ac57 Cindy Lu 2022-06-01  607  		size = last - start + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  608  		return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01  609  	}
8f0b25afe9ac57 Cindy Lu 2022-06-01  610  
8f0b25afe9ac57 Cindy Lu 2022-06-01  611  	link_node = link_head;
8f0b25afe9ac57 Cindy Lu 2022-06-01  612  	range_start = start;
8f0b25afe9ac57 Cindy Lu 2022-06-01  613  	range_last = last;
8f0b25afe9ac57 Cindy Lu 2022-06-01 @614  	range_size = range_start - range_last;
8f0b25afe9ac57 Cindy Lu 2022-06-01  615  	for (i = 0; i < node_number; i++) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  616  		vdpa_node = link_node->vdpa_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01  617  		link_node = link_node->next;
8f0b25afe9ac57 Cindy Lu 2022-06-01  618  		node = &vdpa_node->tree_node;
8f0b25afe9ac57 Cindy Lu 2022-06-01  619  		new_ref = vdpa_node->ref;
8f0b25afe9ac57 Cindy Lu 2022-06-01  620  		node_start = node->start;
8f0b25afe9ac57 Cindy Lu 2022-06-01  621  		node_last = node->last;
8f0b25afe9ac57 Cindy Lu 2022-06-01  622  
8f0b25afe9ac57 Cindy Lu 2022-06-01  623  		if (range_start == node_start) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  624  			if (node_last < range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  625  				/* range_start= node->start--- node->last--range_last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  626  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  627  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  628  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  629  				/*count the next range */
8f0b25afe9ac57 Cindy Lu 2022-06-01  630  			} else if (node_last > range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  631  				/* range_start= node->start	---  last --  node->last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  632  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  633  							 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  634  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  635  				vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  636  							 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  637  			} else {
8f0b25afe9ac57 Cindy Lu 2022-06-01  638  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  639  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  640  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  641  			}
8f0b25afe9ac57 Cindy Lu 2022-06-01  642  		} else if (node_start < range_start) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  643  			if (range_last < node_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  644  				/* node->start---  start--- last--- node->last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  645  				/* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  646  
8f0b25afe9ac57 Cindy Lu 2022-06-01  647  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  648  							 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  649  							 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  650  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  651  							 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  652  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  653  				vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  654  							 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  655  
8f0b25afe9ac57 Cindy Lu 2022-06-01  656  			} else if (range_last > node_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  657  				/* node->start---  start--- node->last-- last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  658  
8f0b25afe9ac57 Cindy Lu 2022-06-01  659  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  660  							 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  661  							 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  662  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  663  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  664  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  665  			} else {
8f0b25afe9ac57 Cindy Lu 2022-06-01  666  				/* node->start---  start--- node->last= last*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  667  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  668  							 range_start - 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  669  							 new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  670  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  671  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  672  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  673  				/* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  674  			}
8f0b25afe9ac57 Cindy Lu 2022-06-01  675  		} else {
8f0b25afe9ac57 Cindy Lu 2022-06-01  676  			if (node_last < range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  677  				/* range_start --- node->start --- node->last ----last  */
8f0b25afe9ac57 Cindy Lu 2022-06-01  678  
8f0b25afe9ac57 Cindy Lu 2022-06-01  679  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  680  							 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  681  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  682  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  683  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  684  				size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  685  			} else if (node_last > range_last) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  686  				/* range_start--- node->start	---  last --  node->last	*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  687  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  688  							 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  689  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  690  							 range_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  691  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  692  				vhost_vdpa_add_range_ctx(root, range_last + 1,
8f0b25afe9ac57 Cindy Lu 2022-06-01  693  							 node_last, new_ref);
8f0b25afe9ac57 Cindy Lu 2022-06-01  694  				size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  695  
8f0b25afe9ac57 Cindy Lu 2022-06-01  696  				/* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  697  			} else {
8f0b25afe9ac57 Cindy Lu 2022-06-01  698  				/* range_start--- node->start	---  last =  node->last	*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  699  				vhost_vdpa_add_range_ctx(root, range_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  700  							 node_start - 1, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  701  				vhost_vdpa_add_range_ctx(root, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  702  							 node_last,
8f0b25afe9ac57 Cindy Lu 2022-06-01  703  							 new_ref + 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  704  				size += ((node_start - 1) - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  705  
8f0b25afe9ac57 Cindy Lu 2022-06-01  706  				/* should the end rang*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  707  			}
8f0b25afe9ac57 Cindy Lu 2022-06-01  708  		}
8f0b25afe9ac57 Cindy Lu 2022-06-01  709  		/* work in next node*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  710  		range_start = node_last + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  711  		if (range_start > range_last)
8f0b25afe9ac57 Cindy Lu 2022-06-01  712  			break;
8f0b25afe9ac57 Cindy Lu 2022-06-01  713  	}
8f0b25afe9ac57 Cindy Lu 2022-06-01  714  
8f0b25afe9ac57 Cindy Lu 2022-06-01  715  	range_size = range_last - range_start;
8f0b25afe9ac57 Cindy Lu 2022-06-01  716  
8f0b25afe9ac57 Cindy Lu 2022-06-01  717  	/* last round and still some range*/
8f0b25afe9ac57 Cindy Lu 2022-06-01  718  
8f0b25afe9ac57 Cindy Lu 2022-06-01  719  	if ((range_size >= 0) && (range_start >= node_last) &&
8f0b25afe9ac57 Cindy Lu 2022-06-01  720  	    (node_number == i + 1)) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  721  		vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
8f0b25afe9ac57 Cindy Lu 2022-06-01  722  		size = size + (range_last - range_start) + 1;
8f0b25afe9ac57 Cindy Lu 2022-06-01  723  	} else if ((range_size == -1) && (node_number == i + 1)) {
8f0b25afe9ac57 Cindy Lu 2022-06-01  724  		return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01  725  	} else {
8f0b25afe9ac57 Cindy Lu 2022-06-01  726  		printk(KERN_WARNING,
8f0b25afe9ac57 Cindy Lu 2022-06-01  727  		       "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
8f0b25afe9ac57 Cindy Lu 2022-06-01  728  		       __func__, __LINE__, range_start, range_last, node_start,
8f0b25afe9ac57 Cindy Lu 2022-06-01  729  		       node_last, i);
8f0b25afe9ac57 Cindy Lu 2022-06-01  730  	}
8f0b25afe9ac57 Cindy Lu 2022-06-01  731  
8f0b25afe9ac57 Cindy Lu 2022-06-01  732  	return size;
8f0b25afe9ac57 Cindy Lu 2022-06-01  733  }
8f0b25afe9ac57 Cindy Lu 2022-06-01  734  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
  2022-06-01  1:20 Cindy Lu
@ 2022-06-01  5:51   ` Jason Wang
  2022-06-01  5:51   ` Jason Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Jason Wang @ 2022-06-01  5:51 UTC (permalink / raw)
  To: Cindy Lu; +Cc: mst, kvm, virtualization, netdev, linux-kernel

On Wed, Jun 1, 2022 at 9:20 AM Cindy Lu <lulu@redhat.com> wrote:
>
> We count pinned_vm as follow in vhost-vDPA
>
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
>          ret = -ENOMEM;
>          goto unlock;
> }
> This means if we have two vDPA devices for the same VM the pages would be counted twice
> So we add a tree to save the page that counted and we will not count it
> again.

The code is not easy to be reviewed, some suggestions:

- It's better to explain in general the algorithm you used here
- Add more comment in the codes to explain the rationale

And I still see the above check against the RLIMIT in the code, is it
intentional?

> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  drivers/vhost/vdpa.c  | 542 +++++++++++++++++++++++++++++++++++++++++-
>  drivers/vhost/vhost.h |   1 +
>  2 files changed, 539 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 05f5fd2af58f..1b0da0735efd 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -24,6 +24,10 @@
>  #include <linux/vhost.h>
>
>  #include "vhost.h"
> +#include <linux/rbtree.h>
> +#include <linux/interval_tree.h>
> +#include <linux/interval_tree_generic.h>
> +#include <linux/hashtable.h>
>
>  enum {
>         VHOST_VDPA_BACKEND_FEATURES =
> @@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>         return r;
>  }
>
> +struct vdpa_tree_node {
> +       struct interval_tree_node tree_node;

Can we simply reuse the vhost_iotlb tree? Note that vhost_iotlb_map
can be associated with a opaque as token which could be used as
reference count.

> +       int ref;

If it's a refcount, let's use unsigned here.

> +};
> +struct vdpa_link_node {
> +       struct vdpa_tree_node *vdpa_node;
> +       struct vdpa_link_node *next;

Need to explain why we need a linked list here considering we've
already structured it as an interval tree.

Btw, unless it's performance critical, let's try to re-use kernel list.h.

> +       u64 node_start;
> +       u64 node_last;

Let's add a comment to explain each member here.

> +};
> +
> +int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
> +                            int ref)

We don't want to export this symbol, so let's make the function
static, so did other functions.

> +{
> +       struct interval_tree_node *new_node;
> +       struct vdpa_tree_node *vdpa_node;
> +
> +       if (last < start)
> +               return -EFAULT;
> +
> +       /* If the range being mapped is [0, ULONG_MAX], split it into two entries
> +        * otherwise its size would overflow u64.
> +        */
> +       if (start == 0 && last == ULONG_MAX) {
> +               u64 mid = last / 2;
> +
> +               vhost_vdpa_add_range_ctx(root, start, mid, ref);
> +               start = mid + 1;
> +       }
> +       vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
> +

Let's check if kmalloc succeeds here.


> +       new_node = &vdpa_node->tree_node;
> +       if (!new_node)
> +               return -ENOMEM;
> +
> +       new_node->start = start;
> +       new_node->last = last;
> +       vdpa_node->ref = ref;
> +
> +       interval_tree_insert(new_node, root);
> +
> +       return 0;
> +}
> +
> +u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
> +                            struct vdpa_link_node *link_head, int node_number,
> +                            u64 start, u64 last)
> +{
> +       int i = 0;
> +       u64 size = 0;
> +       int new_ref;
> +       u64 node_start;
> +       u64 node_last;
> +       u64 range_start;
> +       u64 range_last;
> +       int range_size;
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_tree_node *vdpa_node = NULL;
> +       struct interval_tree_node *node = NULL;
> +
> +       if (node_number == 0) {
> +               vhost_vdpa_add_range_ctx(root, start, last, 1);
> +
> +               size = last - start + 1;
> +               return size;
> +       }
> +
> +       link_node = link_head;
> +       range_start = start;
> +       range_last = last;
> +       range_size = range_start - range_last;
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = link_node->vdpa_node;
> +               link_node = link_node->next;
> +               node = &vdpa_node->tree_node;
> +               new_ref = vdpa_node->ref;
> +               node_start = node->start;
> +               node_last = node->last;
> +
> +               if (range_start == node_start) {
> +                       if (node_last < range_last) {
> +                               /* range_start= node->start--- node->last--range_last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               /*count the next range */
> +                       } else if (node_last > range_last) {
> +                               /* range_start= node->start     ---  last --  node->last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                       } else {
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                       }
> +               } else if (node_start < range_start) {
> +                       if (range_last < node_last) {
> +                               /* node->start---  start--- last--- node->last*/
> +                               /* should the end rang*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +
> +                       } else if (range_last > node_last) {
> +                               /* node->start---  start--- node->last-- last*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                       } else {
> +                               /* node->start---  start--- node->last= last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               /* should the end rang*/
> +                       }
> +               } else {
> +                       if (node_last < range_last) {
> +                               /* range_start --- node->start --- node->last ----last  */
> +
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               size += ((node_start - 1) - range_start) + 1;
> +                       } else if (node_last > range_last) {
> +                               /* range_start--- node->start   ---  last --  node->last        */
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                               size += ((node_start - 1) - range_start) + 1;
> +
> +                               /* should the end rang*/
> +                       } else {
> +                               /* range_start--- node->start   ---  last =  node->last */
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               size += ((node_start - 1) - range_start) + 1;
> +
> +                               /* should the end rang*/
> +                       }
> +               }
> +               /* work in next node*/
> +               range_start = node_last + 1;
> +               if (range_start > range_last)
> +                       break;
> +       }
> +
> +       range_size = range_last - range_start;
> +
> +       /* last round and still some range*/
> +
> +       if ((range_size >= 0) && (range_start >= node_last) &&
> +           (node_number == i + 1)) {
> +               vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
> +               size = size + (range_last - range_start) + 1;
> +       } else if ((range_size == -1) && (node_number == i + 1)) {
> +               return size;
> +       } else {
> +               printk(KERN_WARNING,
> +                      "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
> +                      __func__, __LINE__, range_start, range_last, node_start,
> +                      node_last, i);
> +       }
> +
> +       return size;
> +}
> +
> +u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
> +                            struct vdpa_link_node *link_head, int node_number,
> +                            u64 start, u64 last)
> +{
> +       int i = 0;
> +       u64 size = 0;
> +       int new_ref;
> +       u64 node_start;
> +       u64 node_last;
> +       u64 range_start;
> +       u64 range_last;
> +       int range_size;
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_tree_node *vdpa_node = NULL;
> +       struct interval_tree_node *node = NULL;
> +
> +       if (node_number == 0)
> +               return 0;
> +
> +       link_node = link_head;
> +       range_start = start;
> +       range_last = last;
> +
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = link_node->vdpa_node;
> +               link_node = link_node->next;
> +               node = &vdpa_node->tree_node;
> +               new_ref = vdpa_node->ref;
> +               node_start = node->start;
> +               node_last = node->last;
> +
> +               if (range_start == node_start) {
> +                       if (node_last < range_last) {
> +                               /* range_start =node->start --- node->last ----last*/

The comment needs some tweaking to be understood by the reviewers easily.


> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                node_last,
> +                                                                new_ref - 1);
> +                                       /*count the next range */
> +                               } else {
> +                                       /* if the ref =0, do not need add it back, count size*/
> +                                       size += (node_last - node_start) + 1;
> +                               }
> +
> +                       } else if (node_last > range_last) {
> +                               /* range_start= node->start     ---  last --  node->last*/
> +
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - node_start) + 1;
> +                               }
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                       } else {
> +                               /* range_start= node->start     ---  last = node->last*/
> +
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - node_start) + 1;
> +                               }
> +                               /* should be the end */
> +                       }
> +               } else if (node_start < range_start) {
> +                       if (range_last < node_last) {
> +                               /* node->start---  start--- last--- node->last*/
> +                               /* should the end rang*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - range_start) + 1;
> +                               }
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +
> +                       } else if (range_last > node_last) {
> +                               /* node->start---  start--- node->last--- last*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                node_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (node_last - range_start) + 1;
> +                               }
> +                       } else {
> +                               /* node->start---  start--- node->last= last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - range_start) + 1;
> +                               }
> +                               /* should be the end */
> +                       }
> +               } else {
> +                       /* some range not in the node, error*/
> +                       printk(KERN_WARNING,
> +                              "%s %d FAIL  start %lld last %lld node->start %lld  node->last %lld new_ref %d",
> +                              __func__, __LINE__, range_start, range_last,
> +                              node_start, node_last, new_ref);
> +               }
> +
> +               range_start = node_last + 1;
> +               if (range_start > range_last)
> +                       break;
> +       }
> +
> +       range_size = range_last - range_start;
> +
> +       /* last round and still some range*/
> +
> +       if ((range_size > 0) && (node_number == i + 1)) {
> +               printk(KERN_WARNING,
> +                      "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld range_size  %d",
> +                      __func__, __LINE__, range_start, range_last, node_start,
> +                      node_last, range_size);
> +       }
> +       return size;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
> +                                            struct vdpa_link_node *list2)
> +{
> +       struct vdpa_link_node dummy_head;
> +       struct vdpa_link_node *ptr = &dummy_head;
> +
> +       while (list1 && list2) {
> +               if (list1->node_start < list2->node_start) {
> +                       ptr->next = list1;
> +                       list1 = list1->next;
> +               } else {
> +                       ptr->next = list2;
> +                       list2 = list2->next;
> +               }
> +               ptr = ptr->next;
> +       }
> +       if (list1)
> +               ptr->next = list1;
> +       else
> +               ptr->next = list2;
> +
> +       return dummy_head.next;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
> +{
> +       struct vdpa_link_node *mid_prev = NULL;
> +       struct vdpa_link_node *mid;
> +
> +       while (head && head->next) {
> +               mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
> +               head = head->next->next;
> +       }
> +       mid = mid_prev->next;
> +       mid_prev->next = NULL;
> +       return mid;
> +}
> +struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
> +{
> +       struct vdpa_link_node *mid;
> +       struct vdpa_link_node *left;
> +       struct vdpa_link_node *right;
> +
> +       if (!head || !head->next)
> +               return head;
> +
> +       mid = vhost_vdpa_get_mid(head);
> +       left = vhost_vdpa_sort_list(head);
> +       right = vhost_vdpa_sort_list(mid);
> +       return vhost_vdpa_merge_list(left, right);
> +}
> +
> +u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
> +                        bool ops)
> +{
> +       struct interval_tree_node *node = NULL;
> +       struct vdpa_tree_node *vdpa_node;
> +       int node_number = 0;
> +       int i = 0;
> +       u64 size = 0;
> +       struct vdpa_link_node dummy_head = { 0 };
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_link_node *link_head_tmp;
> +       struct vdpa_link_node *pre_link_node;
> +
> +       pre_link_node = &dummy_head;
> +       /*search the rang overlaped, and del from the tree*/
> +       for (node = interval_tree_iter_first(root, start, last); node;
> +            node = interval_tree_iter_next(node, start, last)) {
> +               link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
> +               if (link_node == NULL) {
> +                       goto out;
> +               }
> +               vdpa_node =
> +                       container_of(node, struct vdpa_tree_node, tree_node);
> +               link_node->vdpa_node = vdpa_node;
> +               link_node->node_start = node->start;
> +               link_node->node_last = node->last;
> +
> +               pre_link_node->next = link_node;
> +               pre_link_node = link_node;
> +               pre_link_node->next = NULL;
> +
> +               node_number++;
> +
> +               interval_tree_remove(node, root);
> +       }
> +       /* sorting the node */

The code explains itself, let's explain why we need the sort.

> +       link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
> +
> +       /* these link node are have overlap with range, check the ref and add back to the tree*/
> +       if (ops == true) {
> +               size = vhost_vdpa_range_ref_add(root, link_head_tmp,
> +                                               node_number, start, last);
> +       } else {
> +               size = vhost_vdpa_range_ref_del(root, link_head_tmp,
> +                                               node_number, start, last);
> +       }
> +out:
> +       pre_link_node = link_head_tmp;
> +
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = pre_link_node->vdpa_node;
> +               link_node = pre_link_node->next;
> +               kfree(vdpa_node);
> +               kfree(pre_link_node);
> +               pre_link_node = link_node;
> +       }
> +       return size;
> +}
> +u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
> +                               u64 last)
> +{
> +       u64 size;
> +
> +       size = vhost_vdpa_range_ops(root, start, last, true);
> +
> +       return size;
> +}
> +
> +u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
> +                               u64 last)
> +{
> +       u64 size;
> +
> +       size = vhost_vdpa_range_ops(root, start, last, false);
> +
> +       return size;
> +}
> +
>  static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>  {
>         struct vhost_dev *dev = &v->vdev;
>         struct vhost_iotlb *iotlb = dev->iotlb;
>         struct vhost_iotlb_map *map;
>         struct page *page;
> +       u64 size;
>         unsigned long pfn, pinned;
>
>         while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
> @@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>                                 set_page_dirty_lock(page);
>                         unpin_user_page(page);
>                 }
> -               atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
> +
> +               size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
> +                                                  map->start,
> +                                                  map->start + map->size - 1);
> +               atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
>                 vhost_iotlb_map_free(iotlb, map);
>         }
>  }
> @@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
>         struct vdpa_device *vdpa = v->vdpa;
>         const struct vdpa_config_ops *ops = vdpa->config;
>         int r = 0;
> +       u64 size_count;
>
>         r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
>                                       pa, perm, opaque);
> @@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
>                 vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
>                 return r;
>         }
> -
> -       if (!vdpa->use_va)
> -               atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> +       if (!vdpa->use_va) {
> +               size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
> +                                                        iova, iova + size - 1);
> +               atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
> +       }
>
>         return 0;
>  }
> @@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
>         }
>  }
>
> +struct root_for_vdpa_node {
> +       struct hlist_node hlist;
> +       struct rb_root_cached vdpa_mem_tree;
> +       pid_t pid_using;
> +};
> +static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
> +int status_for_vdpa_tree = 0;
> +
> +struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
> +{

Any reason we get the tree via task_struct instead of mm_struct?

> +       struct root_for_vdpa_node *root_get_tmp = NULL;
> +       pid_t pid_using = task_pid_nr(task);
> +
> +       /* No hased table, init one */
> +       if (status_for_vdpa_tree == 0) {
> +               hash_init(root_for_vdpa_node_list);
> +               status_for_vdpa_tree = 1;
> +       }
> +
> +       hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> +                               pid_using) {
> +               if (root_get_tmp->pid_using == pid_using)
> +                       return root_get_tmp;
> +       }
> +
> +       root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
> +       root_get_tmp->pid_using = pid_using;
> +
> +       root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
> +
> +       hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
> +                root_get_tmp->pid_using);
> +
> +       return root_get_tmp;
> +}
> +
> +void vhost_vdpa_relase_mem_tree(struct task_struct *task)
> +{
> +       struct root_for_vdpa_node *root_get_tmp = NULL;
> +       pid_t pid_using = task_pid_nr(task);
> +
> +       /* No hased table, init one */
> +       hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> +                               pid_using) {
> +               if (root_get_tmp->pid_using == pid_using)
> +                       kfree(root_get_tmp);
> +               return;
> +       }
> +
> +       return;
> +}
> +
>  static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>  {
>         struct vhost_vdpa *v;
> @@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>         vhost_vdpa_set_iova_range(v);
>
>         filep->private_data = v;
> +       struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);

This looks wrong, the mapping contains VA so it is bound to the owner.
This means

1) Need to get and put the accounting tree via mm_struct in set_owner
2) Need to release all mappings during reset owner

Thanks

> +       dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
>
>         return 0;
>
>  err_init_iotlb:
> +       vhost_vdpa_relase_mem_tree(current);
>         vhost_dev_cleanup(&v->vdev);
>         kfree(vqs);
>  err:
> @@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>         struct vhost_dev *d = &v->vdev;
>
>         mutex_lock(&d->mutex);
> +       vhost_vdpa_relase_mem_tree(current);
> +
>         filep->private_data = NULL;
>         vhost_vdpa_clean_irq(v);
>         vhost_vdpa_reset(v);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 638bb640d6b4..d1c662eb4f26 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -161,6 +161,7 @@ struct vhost_dev {
>         int byte_weight;
>         u64 kcov_handle;
>         bool use_worker;
> +       struct rb_root_cached *vdpa_mem_tree;
>         int (*msg_handler)(struct vhost_dev *dev,
>                            struct vhost_iotlb_msg *msg);
>  };
> --
> 2.34.3
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-01  5:51   ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2022-06-01  5:51 UTC (permalink / raw)
  To: Cindy Lu; +Cc: netdev, virtualization, linux-kernel, kvm, mst

On Wed, Jun 1, 2022 at 9:20 AM Cindy Lu <lulu@redhat.com> wrote:
>
> We count pinned_vm as follow in vhost-vDPA
>
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
>          ret = -ENOMEM;
>          goto unlock;
> }
> This means if we have two vDPA devices for the same VM the pages would be counted twice
> So we add a tree to save the page that counted and we will not count it
> again.

The code is not easy to be reviewed, some suggestions:

- It's better to explain in general the algorithm you used here
- Add more comment in the codes to explain the rationale

And I still see the above check against the RLIMIT in the code, is it
intentional?

> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  drivers/vhost/vdpa.c  | 542 +++++++++++++++++++++++++++++++++++++++++-
>  drivers/vhost/vhost.h |   1 +
>  2 files changed, 539 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 05f5fd2af58f..1b0da0735efd 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -24,6 +24,10 @@
>  #include <linux/vhost.h>
>
>  #include "vhost.h"
> +#include <linux/rbtree.h>
> +#include <linux/interval_tree.h>
> +#include <linux/interval_tree_generic.h>
> +#include <linux/hashtable.h>
>
>  enum {
>         VHOST_VDPA_BACKEND_FEATURES =
> @@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>         return r;
>  }
>
> +struct vdpa_tree_node {
> +       struct interval_tree_node tree_node;

Can we simply reuse the vhost_iotlb tree? Note that vhost_iotlb_map
can be associated with a opaque as token which could be used as
reference count.

> +       int ref;

If it's a refcount, let's use unsigned here.

> +};
> +struct vdpa_link_node {
> +       struct vdpa_tree_node *vdpa_node;
> +       struct vdpa_link_node *next;

Need to explain why we need a linked list here considering we've
already structured it as an interval tree.

Btw, unless it's performance critical, let's try to re-use kernel list.h.

> +       u64 node_start;
> +       u64 node_last;

Let's add a comment to explain each member here.

> +};
> +
> +int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
> +                            int ref)

We don't want to export this symbol, so let's make the function
static, so did other functions.

> +{
> +       struct interval_tree_node *new_node;
> +       struct vdpa_tree_node *vdpa_node;
> +
> +       if (last < start)
> +               return -EFAULT;
> +
> +       /* If the range being mapped is [0, ULONG_MAX], split it into two entries
> +        * otherwise its size would overflow u64.
> +        */
> +       if (start == 0 && last == ULONG_MAX) {
> +               u64 mid = last / 2;
> +
> +               vhost_vdpa_add_range_ctx(root, start, mid, ref);
> +               start = mid + 1;
> +       }
> +       vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
> +

Let's check if kmalloc succeeds here.


> +       new_node = &vdpa_node->tree_node;
> +       if (!new_node)
> +               return -ENOMEM;
> +
> +       new_node->start = start;
> +       new_node->last = last;
> +       vdpa_node->ref = ref;
> +
> +       interval_tree_insert(new_node, root);
> +
> +       return 0;
> +}
> +
> +u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
> +                            struct vdpa_link_node *link_head, int node_number,
> +                            u64 start, u64 last)
> +{
> +       int i = 0;
> +       u64 size = 0;
> +       int new_ref;
> +       u64 node_start;
> +       u64 node_last;
> +       u64 range_start;
> +       u64 range_last;
> +       int range_size;
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_tree_node *vdpa_node = NULL;
> +       struct interval_tree_node *node = NULL;
> +
> +       if (node_number == 0) {
> +               vhost_vdpa_add_range_ctx(root, start, last, 1);
> +
> +               size = last - start + 1;
> +               return size;
> +       }
> +
> +       link_node = link_head;
> +       range_start = start;
> +       range_last = last;
> +       range_size = range_start - range_last;
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = link_node->vdpa_node;
> +               link_node = link_node->next;
> +               node = &vdpa_node->tree_node;
> +               new_ref = vdpa_node->ref;
> +               node_start = node->start;
> +               node_last = node->last;
> +
> +               if (range_start == node_start) {
> +                       if (node_last < range_last) {
> +                               /* range_start= node->start--- node->last--range_last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               /*count the next range */
> +                       } else if (node_last > range_last) {
> +                               /* range_start= node->start     ---  last --  node->last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                       } else {
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                       }
> +               } else if (node_start < range_start) {
> +                       if (range_last < node_last) {
> +                               /* node->start---  start--- last--- node->last*/
> +                               /* should the end rang*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +
> +                       } else if (range_last > node_last) {
> +                               /* node->start---  start--- node->last-- last*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                       } else {
> +                               /* node->start---  start--- node->last= last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               /* should the end rang*/
> +                       }
> +               } else {
> +                       if (node_last < range_last) {
> +                               /* range_start --- node->start --- node->last ----last  */
> +
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               size += ((node_start - 1) - range_start) + 1;
> +                       } else if (node_last > range_last) {
> +                               /* range_start--- node->start   ---  last --  node->last        */
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_last,
> +                                                        new_ref + 1);
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                               size += ((node_start - 1) - range_start) + 1;
> +
> +                               /* should the end rang*/
> +                       } else {
> +                               /* range_start--- node->start   ---  last =  node->last */
> +                               vhost_vdpa_add_range_ctx(root, range_start,
> +                                                        node_start - 1, 1);
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        node_last,
> +                                                        new_ref + 1);
> +                               size += ((node_start - 1) - range_start) + 1;
> +
> +                               /* should the end rang*/
> +                       }
> +               }
> +               /* work in next node*/
> +               range_start = node_last + 1;
> +               if (range_start > range_last)
> +                       break;
> +       }
> +
> +       range_size = range_last - range_start;
> +
> +       /* last round and still some range*/
> +
> +       if ((range_size >= 0) && (range_start >= node_last) &&
> +           (node_number == i + 1)) {
> +               vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
> +               size = size + (range_last - range_start) + 1;
> +       } else if ((range_size == -1) && (node_number == i + 1)) {
> +               return size;
> +       } else {
> +               printk(KERN_WARNING,
> +                      "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
> +                      __func__, __LINE__, range_start, range_last, node_start,
> +                      node_last, i);
> +       }
> +
> +       return size;
> +}
> +
> +u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
> +                            struct vdpa_link_node *link_head, int node_number,
> +                            u64 start, u64 last)
> +{
> +       int i = 0;
> +       u64 size = 0;
> +       int new_ref;
> +       u64 node_start;
> +       u64 node_last;
> +       u64 range_start;
> +       u64 range_last;
> +       int range_size;
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_tree_node *vdpa_node = NULL;
> +       struct interval_tree_node *node = NULL;
> +
> +       if (node_number == 0)
> +               return 0;
> +
> +       link_node = link_head;
> +       range_start = start;
> +       range_last = last;
> +
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = link_node->vdpa_node;
> +               link_node = link_node->next;
> +               node = &vdpa_node->tree_node;
> +               new_ref = vdpa_node->ref;
> +               node_start = node->start;
> +               node_last = node->last;
> +
> +               if (range_start == node_start) {
> +                       if (node_last < range_last) {
> +                               /* range_start =node->start --- node->last ----last*/

The comment needs some tweaking to be understood by the reviewers easily.


> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                node_last,
> +                                                                new_ref - 1);
> +                                       /*count the next range */
> +                               } else {
> +                                       /* if the ref =0, do not need add it back, count size*/
> +                                       size += (node_last - node_start) + 1;
> +                               }
> +
> +                       } else if (node_last > range_last) {
> +                               /* range_start= node->start     ---  last --  node->last*/
> +
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - node_start) + 1;
> +                               }
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +                       } else {
> +                               /* range_start= node->start     ---  last = node->last*/
> +
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                node_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - node_start) + 1;
> +                               }
> +                               /* should be the end */
> +                       }
> +               } else if (node_start < range_start) {
> +                       if (range_last < node_last) {
> +                               /* node->start---  start--- last--- node->last*/
> +                               /* should the end rang*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - range_start) + 1;
> +                               }
> +                               vhost_vdpa_add_range_ctx(root, range_last + 1,
> +                                                        node_last, new_ref);
> +
> +                       } else if (range_last > node_last) {
> +                               /* node->start---  start--- node->last--- last*/
> +
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                node_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (node_last - range_start) + 1;
> +                               }
> +                       } else {
> +                               /* node->start---  start--- node->last= last*/
> +                               vhost_vdpa_add_range_ctx(root, node_start,
> +                                                        range_start - 1,
> +                                                        new_ref);
> +                               if (new_ref > 1) {
> +                                       vhost_vdpa_add_range_ctx(root,
> +                                                                range_start,
> +                                                                range_last,
> +                                                                new_ref - 1);
> +                               } else {
> +                                       size += (range_last - range_start) + 1;
> +                               }
> +                               /* should be the end */
> +                       }
> +               } else {
> +                       /* some range not in the node, error*/
> +                       printk(KERN_WARNING,
> +                              "%s %d FAIL  start %lld last %lld node->start %lld  node->last %lld new_ref %d",
> +                              __func__, __LINE__, range_start, range_last,
> +                              node_start, node_last, new_ref);
> +               }
> +
> +               range_start = node_last + 1;
> +               if (range_start > range_last)
> +                       break;
> +       }
> +
> +       range_size = range_last - range_start;
> +
> +       /* last round and still some range*/
> +
> +       if ((range_size > 0) && (node_number == i + 1)) {
> +               printk(KERN_WARNING,
> +                      "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld range_size  %d",
> +                      __func__, __LINE__, range_start, range_last, node_start,
> +                      node_last, range_size);
> +       }
> +       return size;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
> +                                            struct vdpa_link_node *list2)
> +{
> +       struct vdpa_link_node dummy_head;
> +       struct vdpa_link_node *ptr = &dummy_head;
> +
> +       while (list1 && list2) {
> +               if (list1->node_start < list2->node_start) {
> +                       ptr->next = list1;
> +                       list1 = list1->next;
> +               } else {
> +                       ptr->next = list2;
> +                       list2 = list2->next;
> +               }
> +               ptr = ptr->next;
> +       }
> +       if (list1)
> +               ptr->next = list1;
> +       else
> +               ptr->next = list2;
> +
> +       return dummy_head.next;
> +}
> +
> +struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
> +{
> +       struct vdpa_link_node *mid_prev = NULL;
> +       struct vdpa_link_node *mid;
> +
> +       while (head && head->next) {
> +               mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
> +               head = head->next->next;
> +       }
> +       mid = mid_prev->next;
> +       mid_prev->next = NULL;
> +       return mid;
> +}
> +struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
> +{
> +       struct vdpa_link_node *mid;
> +       struct vdpa_link_node *left;
> +       struct vdpa_link_node *right;
> +
> +       if (!head || !head->next)
> +               return head;
> +
> +       mid = vhost_vdpa_get_mid(head);
> +       left = vhost_vdpa_sort_list(head);
> +       right = vhost_vdpa_sort_list(mid);
> +       return vhost_vdpa_merge_list(left, right);
> +}
> +
> +u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
> +                        bool ops)
> +{
> +       struct interval_tree_node *node = NULL;
> +       struct vdpa_tree_node *vdpa_node;
> +       int node_number = 0;
> +       int i = 0;
> +       u64 size = 0;
> +       struct vdpa_link_node dummy_head = { 0 };
> +       struct vdpa_link_node *link_node;
> +       struct vdpa_link_node *link_head_tmp;
> +       struct vdpa_link_node *pre_link_node;
> +
> +       pre_link_node = &dummy_head;
> +       /*search the rang overlaped, and del from the tree*/
> +       for (node = interval_tree_iter_first(root, start, last); node;
> +            node = interval_tree_iter_next(node, start, last)) {
> +               link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
> +               if (link_node == NULL) {
> +                       goto out;
> +               }
> +               vdpa_node =
> +                       container_of(node, struct vdpa_tree_node, tree_node);
> +               link_node->vdpa_node = vdpa_node;
> +               link_node->node_start = node->start;
> +               link_node->node_last = node->last;
> +
> +               pre_link_node->next = link_node;
> +               pre_link_node = link_node;
> +               pre_link_node->next = NULL;
> +
> +               node_number++;
> +
> +               interval_tree_remove(node, root);
> +       }
> +       /* sorting the node */

The code explains itself, let's explain why we need the sort.

> +       link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
> +
> +       /* these link node are have overlap with range, check the ref and add back to the tree*/
> +       if (ops == true) {
> +               size = vhost_vdpa_range_ref_add(root, link_head_tmp,
> +                                               node_number, start, last);
> +       } else {
> +               size = vhost_vdpa_range_ref_del(root, link_head_tmp,
> +                                               node_number, start, last);
> +       }
> +out:
> +       pre_link_node = link_head_tmp;
> +
> +       for (i = 0; i < node_number; i++) {
> +               vdpa_node = pre_link_node->vdpa_node;
> +               link_node = pre_link_node->next;
> +               kfree(vdpa_node);
> +               kfree(pre_link_node);
> +               pre_link_node = link_node;
> +       }
> +       return size;
> +}
> +u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
> +                               u64 last)
> +{
> +       u64 size;
> +
> +       size = vhost_vdpa_range_ops(root, start, last, true);
> +
> +       return size;
> +}
> +
> +u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
> +                               u64 last)
> +{
> +       u64 size;
> +
> +       size = vhost_vdpa_range_ops(root, start, last, false);
> +
> +       return size;
> +}
> +
>  static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>  {
>         struct vhost_dev *dev = &v->vdev;
>         struct vhost_iotlb *iotlb = dev->iotlb;
>         struct vhost_iotlb_map *map;
>         struct page *page;
> +       u64 size;
>         unsigned long pfn, pinned;
>
>         while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
> @@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>                                 set_page_dirty_lock(page);
>                         unpin_user_page(page);
>                 }
> -               atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
> +
> +               size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
> +                                                  map->start,
> +                                                  map->start + map->size - 1);
> +               atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
>                 vhost_iotlb_map_free(iotlb, map);
>         }
>  }
> @@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
>         struct vdpa_device *vdpa = v->vdpa;
>         const struct vdpa_config_ops *ops = vdpa->config;
>         int r = 0;
> +       u64 size_count;
>
>         r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
>                                       pa, perm, opaque);
> @@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
>                 vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
>                 return r;
>         }
> -
> -       if (!vdpa->use_va)
> -               atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> +       if (!vdpa->use_va) {
> +               size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
> +                                                        iova, iova + size - 1);
> +               atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
> +       }
>
>         return 0;
>  }
> @@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
>         }
>  }
>
> +struct root_for_vdpa_node {
> +       struct hlist_node hlist;
> +       struct rb_root_cached vdpa_mem_tree;
> +       pid_t pid_using;
> +};
> +static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
> +int status_for_vdpa_tree = 0;
> +
> +struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
> +{

Any reason we get the tree via task_struct instead of mm_struct?

> +       struct root_for_vdpa_node *root_get_tmp = NULL;
> +       pid_t pid_using = task_pid_nr(task);
> +
> +       /* No hased table, init one */
> +       if (status_for_vdpa_tree == 0) {
> +               hash_init(root_for_vdpa_node_list);
> +               status_for_vdpa_tree = 1;
> +       }
> +
> +       hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> +                               pid_using) {
> +               if (root_get_tmp->pid_using == pid_using)
> +                       return root_get_tmp;
> +       }
> +
> +       root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
> +       root_get_tmp->pid_using = pid_using;
> +
> +       root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
> +
> +       hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
> +                root_get_tmp->pid_using);
> +
> +       return root_get_tmp;
> +}
> +
> +void vhost_vdpa_relase_mem_tree(struct task_struct *task)
> +{
> +       struct root_for_vdpa_node *root_get_tmp = NULL;
> +       pid_t pid_using = task_pid_nr(task);
> +
> +       /* No hased table, init one */
> +       hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
> +                               pid_using) {
> +               if (root_get_tmp->pid_using == pid_using)
> +                       kfree(root_get_tmp);
> +               return;
> +       }
> +
> +       return;
> +}
> +
>  static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>  {
>         struct vhost_vdpa *v;
> @@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>         vhost_vdpa_set_iova_range(v);
>
>         filep->private_data = v;
> +       struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);

This looks wrong, the mapping contains VA so it is bound to the owner.
This means

1) Need to get and put the accounting tree via mm_struct in set_owner
2) Need to release all mappings during reset owner

Thanks

> +       dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
>
>         return 0;
>
>  err_init_iotlb:
> +       vhost_vdpa_relase_mem_tree(current);
>         vhost_dev_cleanup(&v->vdev);
>         kfree(vqs);
>  err:
> @@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>         struct vhost_dev *d = &v->vdev;
>
>         mutex_lock(&d->mutex);
> +       vhost_vdpa_relase_mem_tree(current);
> +
>         filep->private_data = NULL;
>         vhost_vdpa_clean_irq(v);
>         vhost_vdpa_reset(v);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 638bb640d6b4..d1c662eb4f26 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -161,6 +161,7 @@ struct vhost_dev {
>         int byte_weight;
>         u64 kcov_handle;
>         bool use_worker;
> +       struct rb_root_cached *vdpa_mem_tree;
>         int (*msg_handler)(struct vhost_dev *dev,
>                            struct vhost_iotlb_msg *msg);
>  };
> --
> 2.34.3
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
  2022-06-01  1:20 Cindy Lu
@ 2022-06-01  5:03 ` kernel test robot
  2022-06-01  5:51   ` Jason Wang
  1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2022-06-01  5:03 UTC (permalink / raw)
  To: Cindy Lu; +Cc: llvm, kbuild-all

Hi Cindy,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on v5.18]
[cannot apply to mst-vhost/linux-next next-20220601]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
base:    4b0986a3613c92f4ec1bdc7f60ec66fea135991f
config: x86_64-randconfig-a005 (https://download.01.org/0day-ci/archive/20220601/202206011237.d34xUNB3-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c825abd6b0198fb088d9752f556a70705bc99dfd)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/8f0b25afe9ac570a70eafb5c285747e3bdd2471d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Cindy-Lu/vdpa-Do-not-count-the-pages-that-were-already-pinned-in-the-vhost-vDPA/20220601-092308
        git checkout 8f0b25afe9ac570a70eafb5c285747e3bdd2471d
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/vhost/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/vhost/vdpa.c:555:5: warning: no previous prototype for function 'vhost_vdpa_add_range_ctx' [-Wmissing-prototypes]
   int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
       ^
   drivers/vhost/vdpa.c:555:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
   ^
   static 
>> drivers/vhost/vdpa.c:727:10: warning: data argument not used by format string [-Wformat-extra-args]
                          "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/printk.h:446:60: note: expanded from macro 'printk'
   #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
   include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
                   _p_func(_fmt, ##__VA_ARGS__);                           \
                           ~~~~    ^
>> drivers/vhost/vdpa.c:588:5: warning: no previous prototype for function 'vhost_vdpa_range_ref_add' [-Wmissing-prototypes]
   u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
       ^
   drivers/vhost/vdpa.c:588:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
   ^
   static 
   drivers/vhost/vdpa.c:856:11: warning: data argument not used by format string [-Wformat-extra-args]
                                  "%s %d FAIL  start %lld last %lld node->start %lld  node->last %lld new_ref %d",
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/printk.h:446:60: note: expanded from macro 'printk'
   #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
   include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
                   _p_func(_fmt, ##__VA_ARGS__);                           \
                           ~~~~    ^
   drivers/vhost/vdpa.c:872:10: warning: data argument not used by format string [-Wformat-extra-args]
                          "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld range_size  %d",
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/printk.h:446:60: note: expanded from macro 'printk'
   #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
   include/linux/printk.h:418:19: note: expanded from macro 'printk_index_wrap'
                   _p_func(_fmt, ##__VA_ARGS__);                           \
                           ~~~~    ^
>> drivers/vhost/vdpa.c:735:5: warning: no previous prototype for function 'vhost_vdpa_range_ref_del' [-Wmissing-prototypes]
   u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
       ^
   drivers/vhost/vdpa.c:735:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
   ^
   static 
>> drivers/vhost/vdpa.c:879:24: warning: no previous prototype for function 'vhost_vdpa_merge_list' [-Wmissing-prototypes]
   struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
                          ^
   drivers/vhost/vdpa.c:879:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
   ^
   static 
>> drivers/vhost/vdpa.c:903:24: warning: no previous prototype for function 'vhost_vdpa_get_mid' [-Wmissing-prototypes]
   struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
                          ^
   drivers/vhost/vdpa.c:903:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
   ^
   static 
>> drivers/vhost/vdpa.c:916:24: warning: no previous prototype for function 'vhost_vdpa_sort_list' [-Wmissing-prototypes]
   struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
                          ^
   drivers/vhost/vdpa.c:916:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
   ^
   static 
>> drivers/vhost/vdpa.c:931:5: warning: no previous prototype for function 'vhost_vdpa_range_ops' [-Wmissing-prototypes]
   u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
       ^
   drivers/vhost/vdpa.c:931:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
   ^
   static 
>> drivers/vhost/vdpa.c:949:7: warning: variable 'link_head_tmp' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
                   if (link_node == NULL) {
                       ^~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:978:18: note: uninitialized use occurs here
           pre_link_node = link_head_tmp;
                           ^~~~~~~~~~~~~
   drivers/vhost/vdpa.c:949:3: note: remove the 'if' if its condition is always false
                   if (link_node == NULL) {
                   ^~~~~~~~~~~~~~~~~~~~~~~~
   drivers/vhost/vdpa.c:941:38: note: initialize the variable 'link_head_tmp' to silence this warning
           struct vdpa_link_node *link_head_tmp;
                                               ^
                                                = NULL
>> drivers/vhost/vdpa.c:989:5: warning: no previous prototype for function 'vhost_vdpa_search_range_add' [-Wmissing-prototypes]
   u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
       ^
   drivers/vhost/vdpa.c:989:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
   ^
   static 
>> drivers/vhost/vdpa.c:999:5: warning: no previous prototype for function 'vhost_vdpa_search_range_del' [-Wmissing-prototypes]
   u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
       ^
   drivers/vhost/vdpa.c:999:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
   ^
   static 
>> drivers/vhost/vdpa.c:1465:28: warning: no previous prototype for function 'vhost_vdpa_get_mem_tree' [-Wmissing-prototypes]
   struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
                              ^
   drivers/vhost/vdpa.c:1465:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
   ^
   static 
>> drivers/vhost/vdpa.c:1493:6: warning: no previous prototype for function 'vhost_vdpa_relase_mem_tree' [-Wmissing-prototypes]
   void vhost_vdpa_relase_mem_tree(struct task_struct *task)
        ^
   drivers/vhost/vdpa.c:1493:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void vhost_vdpa_relase_mem_tree(struct task_struct *task)
   ^
   static 
>> drivers/vhost/vdpa.c:1555:29: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
           struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
                                      ^
   16 warnings generated.


vim +/vhost_vdpa_add_range_ctx +555 drivers/vhost/vdpa.c

   554	
 > 555	int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
   556				     int ref)
   557	{
   558		struct interval_tree_node *new_node;
   559		struct vdpa_tree_node *vdpa_node;
   560	
   561		if (last < start)
   562			return -EFAULT;
   563	
   564		/* If the range being mapped is [0, ULONG_MAX], split it into two entries
   565		 * otherwise its size would overflow u64.
   566		 */
   567		if (start == 0 && last == ULONG_MAX) {
   568			u64 mid = last / 2;
   569	
   570			vhost_vdpa_add_range_ctx(root, start, mid, ref);
   571			start = mid + 1;
   572		}
   573		vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
   574	
   575		new_node = &vdpa_node->tree_node;
   576		if (!new_node)
   577			return -ENOMEM;
   578	
   579		new_node->start = start;
   580		new_node->last = last;
   581		vdpa_node->ref = ref;
   582	
   583		interval_tree_insert(new_node, root);
   584	
   585		return 0;
   586	}
   587	
 > 588	u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
   589				     struct vdpa_link_node *link_head, int node_number,
   590				     u64 start, u64 last)
   591	{
   592		int i = 0;
   593		u64 size = 0;
   594		int new_ref;
   595		u64 node_start;
   596		u64 node_last;
   597		u64 range_start;
   598		u64 range_last;
   599		int range_size;
   600		struct vdpa_link_node *link_node;
   601		struct vdpa_tree_node *vdpa_node = NULL;
   602		struct interval_tree_node *node = NULL;
   603	
   604		if (node_number == 0) {
   605			vhost_vdpa_add_range_ctx(root, start, last, 1);
   606	
   607			size = last - start + 1;
   608			return size;
   609		}
   610	
   611		link_node = link_head;
   612		range_start = start;
   613		range_last = last;
   614		range_size = range_start - range_last;
   615		for (i = 0; i < node_number; i++) {
   616			vdpa_node = link_node->vdpa_node;
   617			link_node = link_node->next;
   618			node = &vdpa_node->tree_node;
   619			new_ref = vdpa_node->ref;
   620			node_start = node->start;
   621			node_last = node->last;
   622	
   623			if (range_start == node_start) {
   624				if (node_last < range_last) {
   625					/* range_start= node->start--- node->last--range_last*/
   626					vhost_vdpa_add_range_ctx(root, node_start,
   627								 node_last,
   628								 new_ref + 1);
   629					/*count the next range */
   630				} else if (node_last > range_last) {
   631					/* range_start= node->start	---  last --  node->last*/
   632					vhost_vdpa_add_range_ctx(root, node_start,
   633								 range_last,
   634								 new_ref + 1);
   635					vhost_vdpa_add_range_ctx(root, range_last + 1,
   636								 node_last, new_ref);
   637				} else {
   638					vhost_vdpa_add_range_ctx(root, node_start,
   639								 node_last,
   640								 new_ref + 1);
   641				}
   642			} else if (node_start < range_start) {
   643				if (range_last < node_last) {
   644					/* node->start---  start--- last--- node->last*/
   645					/* should the end rang*/
   646	
   647					vhost_vdpa_add_range_ctx(root, node_start,
   648								 range_start - 1,
   649								 new_ref);
   650					vhost_vdpa_add_range_ctx(root, range_start,
   651								 range_last,
   652								 new_ref + 1);
   653					vhost_vdpa_add_range_ctx(root, range_last + 1,
   654								 node_last, new_ref);
   655	
   656				} else if (range_last > node_last) {
   657					/* node->start---  start--- node->last-- last*/
   658	
   659					vhost_vdpa_add_range_ctx(root, node_start,
   660								 range_start - 1,
   661								 new_ref);
   662					vhost_vdpa_add_range_ctx(root, range_start,
   663								 node_last,
   664								 new_ref + 1);
   665				} else {
   666					/* node->start---  start--- node->last= last*/
   667					vhost_vdpa_add_range_ctx(root, node_start,
   668								 range_start - 1,
   669								 new_ref);
   670					vhost_vdpa_add_range_ctx(root, range_start,
   671								 node_last,
   672								 new_ref + 1);
   673					/* should the end rang*/
   674				}
   675			} else {
   676				if (node_last < range_last) {
   677					/* range_start --- node->start --- node->last ----last  */
   678	
   679					vhost_vdpa_add_range_ctx(root, range_start,
   680								 node_start - 1, 1);
   681					vhost_vdpa_add_range_ctx(root, node_start,
   682								 node_last,
   683								 new_ref + 1);
   684					size += ((node_start - 1) - range_start) + 1;
   685				} else if (node_last > range_last) {
   686					/* range_start--- node->start	---  last --  node->last	*/
   687					vhost_vdpa_add_range_ctx(root, range_start,
   688								 node_start - 1, 1);
   689					vhost_vdpa_add_range_ctx(root, node_start,
   690								 range_last,
   691								 new_ref + 1);
   692					vhost_vdpa_add_range_ctx(root, range_last + 1,
   693								 node_last, new_ref);
   694					size += ((node_start - 1) - range_start) + 1;
   695	
   696					/* should the end rang*/
   697				} else {
   698					/* range_start--- node->start	---  last =  node->last	*/
   699					vhost_vdpa_add_range_ctx(root, range_start,
   700								 node_start - 1, 1);
   701					vhost_vdpa_add_range_ctx(root, node_start,
   702								 node_last,
   703								 new_ref + 1);
   704					size += ((node_start - 1) - range_start) + 1;
   705	
   706					/* should the end rang*/
   707				}
   708			}
   709			/* work in next node*/
   710			range_start = node_last + 1;
   711			if (range_start > range_last)
   712				break;
   713		}
   714	
   715		range_size = range_last - range_start;
   716	
   717		/* last round and still some range*/
   718	
   719		if ((range_size >= 0) && (range_start >= node_last) &&
   720		    (node_number == i + 1)) {
   721			vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
   722			size = size + (range_last - range_start) + 1;
   723		} else if ((range_size == -1) && (node_number == i + 1)) {
   724			return size;
   725		} else {
   726			printk(KERN_WARNING,
 > 727			       "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
   728			       __func__, __LINE__, range_start, range_last, node_start,
   729			       node_last, i);
   730		}
   731	
   732		return size;
   733	}
   734	
 > 735	u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
   736				     struct vdpa_link_node *link_head, int node_number,
   737				     u64 start, u64 last)
   738	{
   739		int i = 0;
   740		u64 size = 0;
   741		int new_ref;
   742		u64 node_start;
   743		u64 node_last;
   744		u64 range_start;
   745		u64 range_last;
   746		int range_size;
   747		struct vdpa_link_node *link_node;
   748		struct vdpa_tree_node *vdpa_node = NULL;
   749		struct interval_tree_node *node = NULL;
   750	
   751		if (node_number == 0)
   752			return 0;
   753	
   754		link_node = link_head;
   755		range_start = start;
   756		range_last = last;
   757	
   758		for (i = 0; i < node_number; i++) {
   759			vdpa_node = link_node->vdpa_node;
   760			link_node = link_node->next;
   761			node = &vdpa_node->tree_node;
   762			new_ref = vdpa_node->ref;
   763			node_start = node->start;
   764			node_last = node->last;
   765	
   766			if (range_start == node_start) {
   767				if (node_last < range_last) {
   768					/* range_start =node->start --- node->last ----last*/
   769					if (new_ref > 1) {
   770						vhost_vdpa_add_range_ctx(root,
   771									 node_start,
   772									 node_last,
   773									 new_ref - 1);
   774						/*count the next range */
   775					} else {
   776						/* if the ref =0, do not need add it back, count size*/
   777						size += (node_last - node_start) + 1;
   778					}
   779	
   780				} else if (node_last > range_last) {
   781					/* range_start= node->start	---  last --  node->last*/
   782	
   783					if (new_ref > 1) {
   784						vhost_vdpa_add_range_ctx(root,
   785									 node_start,
   786									 range_last,
   787									 new_ref - 1);
   788					} else {
   789						size += (range_last - node_start) + 1;
   790					}
   791					vhost_vdpa_add_range_ctx(root, range_last + 1,
   792								 node_last, new_ref);
   793				} else {
   794					/* range_start= node->start	---  last = node->last*/
   795	
   796					if (new_ref > 1) {
   797						vhost_vdpa_add_range_ctx(root,
   798									 node_start,
   799									 range_last,
   800									 new_ref - 1);
   801					} else {
   802						size += (range_last - node_start) + 1;
   803					}
   804					/* should be the end */
   805				}
   806			} else if (node_start < range_start) {
   807				if (range_last < node_last) {
   808					/* node->start---  start--- last--- node->last*/
   809					/* should the end rang*/
   810					vhost_vdpa_add_range_ctx(root, node_start,
   811								 range_start - 1,
   812								 new_ref);
   813					if (new_ref > 1) {
   814						vhost_vdpa_add_range_ctx(root,
   815									 range_start,
   816									 range_last,
   817									 new_ref - 1);
   818					} else {
   819						size += (range_last - range_start) + 1;
   820					}
   821					vhost_vdpa_add_range_ctx(root, range_last + 1,
   822								 node_last, new_ref);
   823	
   824				} else if (range_last > node_last) {
   825					/* node->start---  start--- node->last--- last*/
   826	
   827					vhost_vdpa_add_range_ctx(root, node_start,
   828								 range_start - 1,
   829								 new_ref);
   830					if (new_ref > 1) {
   831						vhost_vdpa_add_range_ctx(root,
   832									 range_start,
   833									 node_last,
   834									 new_ref - 1);
   835					} else {
   836						size += (node_last - range_start) + 1;
   837					}
   838				} else {
   839					/* node->start---  start--- node->last= last*/
   840					vhost_vdpa_add_range_ctx(root, node_start,
   841								 range_start - 1,
   842								 new_ref);
   843					if (new_ref > 1) {
   844						vhost_vdpa_add_range_ctx(root,
   845									 range_start,
   846									 range_last,
   847									 new_ref - 1);
   848					} else {
   849						size += (range_last - range_start) + 1;
   850					}
   851					/* should be the end */
   852				}
   853			} else {
   854				/* some range not in the node, error*/
   855				printk(KERN_WARNING,
   856				       "%s %d FAIL  start %lld last %lld node->start %lld  node->last %lld new_ref %d",
   857				       __func__, __LINE__, range_start, range_last,
   858				       node_start, node_last, new_ref);
   859			}
   860	
   861			range_start = node_last + 1;
   862			if (range_start > range_last)
   863				break;
   864		}
   865	
   866		range_size = range_last - range_start;
   867	
   868		/* last round and still some range*/
   869	
   870		if ((range_size > 0) && (node_number == i + 1)) {
   871			printk(KERN_WARNING,
   872			       "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld range_size  %d",
   873			       __func__, __LINE__, range_start, range_last, node_start,
   874			       node_last, range_size);
   875		}
   876		return size;
   877	}
   878	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA
@ 2022-06-01  1:20 Cindy Lu
  2022-06-01  5:03 ` kernel test robot
  2022-06-01  5:51   ` Jason Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Cindy Lu @ 2022-06-01  1:20 UTC (permalink / raw)
  To: lulu, mst, jasowang, kvm, virtualization, netdev, linux-kernel

We count pinned_vm as follow in vhost-vDPA

lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
         ret = -ENOMEM;
         goto unlock;
}
This means if we have two vDPA devices for the same VM the pages would be counted twice
So we add a tree to save the page that counted and we will not count it
again.
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 drivers/vhost/vdpa.c  | 542 +++++++++++++++++++++++++++++++++++++++++-
 drivers/vhost/vhost.h |   1 +
 2 files changed, 539 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 05f5fd2af58f..1b0da0735efd 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -24,6 +24,10 @@
 #include <linux/vhost.h>
 
 #include "vhost.h"
+#include <linux/rbtree.h>
+#include <linux/interval_tree.h>
+#include <linux/interval_tree_generic.h>
+#include <linux/hashtable.h>
 
 enum {
 	VHOST_VDPA_BACKEND_FEATURES =
@@ -506,12 +510,478 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 	return r;
 }
 
+struct vdpa_tree_node {
+	struct interval_tree_node tree_node;
+	int ref;
+};
+struct vdpa_link_node {
+	struct vdpa_tree_node *vdpa_node;
+	struct vdpa_link_node *next;
+	u64 node_start;
+	u64 node_last;
+};
+
+int vhost_vdpa_add_range_ctx(struct rb_root_cached *root, u64 start, u64 last,
+			     int ref)
+{
+	struct interval_tree_node *new_node;
+	struct vdpa_tree_node *vdpa_node;
+
+	if (last < start)
+		return -EFAULT;
+
+	/* If the range being mapped is [0, ULONG_MAX], split it into two entries
+	 * otherwise its size would overflow u64.
+	 */
+	if (start == 0 && last == ULONG_MAX) {
+		u64 mid = last / 2;
+
+		vhost_vdpa_add_range_ctx(root, start, mid, ref);
+		start = mid + 1;
+	}
+	vdpa_node = kmalloc(sizeof(struct vdpa_tree_node), GFP_ATOMIC);
+
+	new_node = &vdpa_node->tree_node;
+	if (!new_node)
+		return -ENOMEM;
+
+	new_node->start = start;
+	new_node->last = last;
+	vdpa_node->ref = ref;
+
+	interval_tree_insert(new_node, root);
+
+	return 0;
+}
+
+u64 vhost_vdpa_range_ref_add(struct rb_root_cached *root,
+			     struct vdpa_link_node *link_head, int node_number,
+			     u64 start, u64 last)
+{
+	int i = 0;
+	u64 size = 0;
+	int new_ref;
+	u64 node_start;
+	u64 node_last;
+	u64 range_start;
+	u64 range_last;
+	int range_size;
+	struct vdpa_link_node *link_node;
+	struct vdpa_tree_node *vdpa_node = NULL;
+	struct interval_tree_node *node = NULL;
+
+	if (node_number == 0) {
+		vhost_vdpa_add_range_ctx(root, start, last, 1);
+
+		size = last - start + 1;
+		return size;
+	}
+
+	link_node = link_head;
+	range_start = start;
+	range_last = last;
+	range_size = range_start - range_last;
+	for (i = 0; i < node_number; i++) {
+		vdpa_node = link_node->vdpa_node;
+		link_node = link_node->next;
+		node = &vdpa_node->tree_node;
+		new_ref = vdpa_node->ref;
+		node_start = node->start;
+		node_last = node->last;
+
+		if (range_start == node_start) {
+			if (node_last < range_last) {
+				/* range_start= node->start--- node->last--range_last*/
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 node_last,
+							 new_ref + 1);
+				/*count the next range */
+			} else if (node_last > range_last) {
+				/* range_start= node->start	---  last --  node->last*/
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_last,
+							 new_ref + 1);
+				vhost_vdpa_add_range_ctx(root, range_last + 1,
+							 node_last, new_ref);
+			} else {
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 node_last,
+							 new_ref + 1);
+			}
+		} else if (node_start < range_start) {
+			if (range_last < node_last) {
+				/* node->start---  start--- last--- node->last*/
+				/* should the end rang*/
+
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 range_last,
+							 new_ref + 1);
+				vhost_vdpa_add_range_ctx(root, range_last + 1,
+							 node_last, new_ref);
+
+			} else if (range_last > node_last) {
+				/* node->start---  start--- node->last-- last*/
+
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 node_last,
+							 new_ref + 1);
+			} else {
+				/* node->start---  start--- node->last= last*/
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 node_last,
+							 new_ref + 1);
+				/* should the end rang*/
+			}
+		} else {
+			if (node_last < range_last) {
+				/* range_start --- node->start --- node->last ----last  */
+
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 node_start - 1, 1);
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 node_last,
+							 new_ref + 1);
+				size += ((node_start - 1) - range_start) + 1;
+			} else if (node_last > range_last) {
+				/* range_start--- node->start	---  last --  node->last	*/
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 node_start - 1, 1);
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_last,
+							 new_ref + 1);
+				vhost_vdpa_add_range_ctx(root, range_last + 1,
+							 node_last, new_ref);
+				size += ((node_start - 1) - range_start) + 1;
+
+				/* should the end rang*/
+			} else {
+				/* range_start--- node->start	---  last =  node->last	*/
+				vhost_vdpa_add_range_ctx(root, range_start,
+							 node_start - 1, 1);
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 node_last,
+							 new_ref + 1);
+				size += ((node_start - 1) - range_start) + 1;
+
+				/* should the end rang*/
+			}
+		}
+		/* work in next node*/
+		range_start = node_last + 1;
+		if (range_start > range_last)
+			break;
+	}
+
+	range_size = range_last - range_start;
+
+	/* last round and still some range*/
+
+	if ((range_size >= 0) && (range_start >= node_last) &&
+	    (node_number == i + 1)) {
+		vhost_vdpa_add_range_ctx(root, range_start, range_last, 1);
+		size = size + (range_last - range_start) + 1;
+	} else if ((range_size == -1) && (node_number == i + 1)) {
+		return size;
+	} else {
+		printk(KERN_WARNING,
+		       "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld i  %d",
+		       __func__, __LINE__, range_start, range_last, node_start,
+		       node_last, i);
+	}
+
+	return size;
+}
+
+u64 vhost_vdpa_range_ref_del(struct rb_root_cached *root,
+			     struct vdpa_link_node *link_head, int node_number,
+			     u64 start, u64 last)
+{
+	int i = 0;
+	u64 size = 0;
+	int new_ref;
+	u64 node_start;
+	u64 node_last;
+	u64 range_start;
+	u64 range_last;
+	int range_size;
+	struct vdpa_link_node *link_node;
+	struct vdpa_tree_node *vdpa_node = NULL;
+	struct interval_tree_node *node = NULL;
+
+	if (node_number == 0)
+		return 0;
+
+	link_node = link_head;
+	range_start = start;
+	range_last = last;
+
+	for (i = 0; i < node_number; i++) {
+		vdpa_node = link_node->vdpa_node;
+		link_node = link_node->next;
+		node = &vdpa_node->tree_node;
+		new_ref = vdpa_node->ref;
+		node_start = node->start;
+		node_last = node->last;
+
+		if (range_start == node_start) {
+			if (node_last < range_last) {
+				/* range_start =node->start --- node->last ----last*/
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 node_start,
+								 node_last,
+								 new_ref - 1);
+					/*count the next range */
+				} else {
+					/* if the ref =0, do not need add it back, count size*/
+					size += (node_last - node_start) + 1;
+				}
+
+			} else if (node_last > range_last) {
+				/* range_start= node->start	---  last --  node->last*/
+
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 node_start,
+								 range_last,
+								 new_ref - 1);
+				} else {
+					size += (range_last - node_start) + 1;
+				}
+				vhost_vdpa_add_range_ctx(root, range_last + 1,
+							 node_last, new_ref);
+			} else {
+				/* range_start= node->start	---  last = node->last*/
+
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 node_start,
+								 range_last,
+								 new_ref - 1);
+				} else {
+					size += (range_last - node_start) + 1;
+				}
+				/* should be the end */
+			}
+		} else if (node_start < range_start) {
+			if (range_last < node_last) {
+				/* node->start---  start--- last--- node->last*/
+				/* should the end rang*/
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 range_start,
+								 range_last,
+								 new_ref - 1);
+				} else {
+					size += (range_last - range_start) + 1;
+				}
+				vhost_vdpa_add_range_ctx(root, range_last + 1,
+							 node_last, new_ref);
+
+			} else if (range_last > node_last) {
+				/* node->start---  start--- node->last--- last*/
+
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 range_start,
+								 node_last,
+								 new_ref - 1);
+				} else {
+					size += (node_last - range_start) + 1;
+				}
+			} else {
+				/* node->start---  start--- node->last= last*/
+				vhost_vdpa_add_range_ctx(root, node_start,
+							 range_start - 1,
+							 new_ref);
+				if (new_ref > 1) {
+					vhost_vdpa_add_range_ctx(root,
+								 range_start,
+								 range_last,
+								 new_ref - 1);
+				} else {
+					size += (range_last - range_start) + 1;
+				}
+				/* should be the end */
+			}
+		} else {
+			/* some range not in the node, error*/
+			printk(KERN_WARNING,
+			       "%s %d FAIL  start %lld last %lld node->start %lld  node->last %lld new_ref %d",
+			       __func__, __LINE__, range_start, range_last,
+			       node_start, node_last, new_ref);
+		}
+
+		range_start = node_last + 1;
+		if (range_start > range_last)
+			break;
+	}
+
+	range_size = range_last - range_start;
+
+	/* last round and still some range*/
+
+	if ((range_size > 0) && (node_number == i + 1)) {
+		printk(KERN_WARNING,
+		       "%s %d FAIL start %lld last %lld node->start %lld  node->last %lld range_size  %d",
+		       __func__, __LINE__, range_start, range_last, node_start,
+		       node_last, range_size);
+	}
+	return size;
+}
+
+struct vdpa_link_node *vhost_vdpa_merge_list(struct vdpa_link_node *list1,
+					     struct vdpa_link_node *list2)
+{
+	struct vdpa_link_node dummy_head;
+	struct vdpa_link_node *ptr = &dummy_head;
+
+	while (list1 && list2) {
+		if (list1->node_start < list2->node_start) {
+			ptr->next = list1;
+			list1 = list1->next;
+		} else {
+			ptr->next = list2;
+			list2 = list2->next;
+		}
+		ptr = ptr->next;
+	}
+	if (list1)
+		ptr->next = list1;
+	else
+		ptr->next = list2;
+
+	return dummy_head.next;
+}
+
+struct vdpa_link_node *vhost_vdpa_get_mid(struct vdpa_link_node *head)
+{
+	struct vdpa_link_node *mid_prev = NULL;
+	struct vdpa_link_node *mid;
+
+	while (head && head->next) {
+		mid_prev = (mid_prev == NULL) ? head : mid_prev->next;
+		head = head->next->next;
+	}
+	mid = mid_prev->next;
+	mid_prev->next = NULL;
+	return mid;
+}
+struct vdpa_link_node *vhost_vdpa_sort_list(struct vdpa_link_node *head)
+{
+	struct vdpa_link_node *mid;
+	struct vdpa_link_node *left;
+	struct vdpa_link_node *right;
+
+	if (!head || !head->next)
+		return head;
+
+	mid = vhost_vdpa_get_mid(head);
+	left = vhost_vdpa_sort_list(head);
+	right = vhost_vdpa_sort_list(mid);
+	return vhost_vdpa_merge_list(left, right);
+}
+
+u64 vhost_vdpa_range_ops(struct rb_root_cached *root, u64 start, u64 last,
+			 bool ops)
+{
+	struct interval_tree_node *node = NULL;
+	struct vdpa_tree_node *vdpa_node;
+	int node_number = 0;
+	int i = 0;
+	u64 size = 0;
+	struct vdpa_link_node dummy_head = { 0 };
+	struct vdpa_link_node *link_node;
+	struct vdpa_link_node *link_head_tmp;
+	struct vdpa_link_node *pre_link_node;
+
+	pre_link_node = &dummy_head;
+	/*search the rang overlaped, and del from the tree*/
+	for (node = interval_tree_iter_first(root, start, last); node;
+	     node = interval_tree_iter_next(node, start, last)) {
+		link_node = kmalloc(sizeof(struct vdpa_link_node), GFP_ATOMIC);
+		if (link_node == NULL) {
+			goto out;
+		}
+		vdpa_node =
+			container_of(node, struct vdpa_tree_node, tree_node);
+		link_node->vdpa_node = vdpa_node;
+		link_node->node_start = node->start;
+		link_node->node_last = node->last;
+
+		pre_link_node->next = link_node;
+		pre_link_node = link_node;
+		pre_link_node->next = NULL;
+
+		node_number++;
+
+		interval_tree_remove(node, root);
+	}
+	/* sorting the node */
+	link_head_tmp = vhost_vdpa_sort_list(dummy_head.next);
+
+	/* these link node are have overlap with range, check the ref and add back to the tree*/
+	if (ops == true) {
+		size = vhost_vdpa_range_ref_add(root, link_head_tmp,
+						node_number, start, last);
+	} else {
+		size = vhost_vdpa_range_ref_del(root, link_head_tmp,
+						node_number, start, last);
+	}
+out:
+	pre_link_node = link_head_tmp;
+
+	for (i = 0; i < node_number; i++) {
+		vdpa_node = pre_link_node->vdpa_node;
+		link_node = pre_link_node->next;
+		kfree(vdpa_node);
+		kfree(pre_link_node);
+		pre_link_node = link_node;
+	}
+	return size;
+}
+u64 vhost_vdpa_search_range_add(struct rb_root_cached *root, u64 start,
+				u64 last)
+{
+	u64 size;
+
+	size = vhost_vdpa_range_ops(root, start, last, true);
+
+	return size;
+}
+
+u64 vhost_vdpa_search_range_del(struct rb_root_cached *root, u64 start,
+				u64 last)
+{
+	u64 size;
+
+	size = vhost_vdpa_range_ops(root, start, last, false);
+
+	return size;
+}
+
 static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
 {
 	struct vhost_dev *dev = &v->vdev;
 	struct vhost_iotlb *iotlb = dev->iotlb;
 	struct vhost_iotlb_map *map;
 	struct page *page;
+	u64 size;
 	unsigned long pfn, pinned;
 
 	while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) {
@@ -523,7 +993,11 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last)
 				set_page_dirty_lock(page);
 			unpin_user_page(page);
 		}
-		atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm);
+
+		size = vhost_vdpa_search_range_del(dev->vdpa_mem_tree,
+						   map->start,
+						   map->start + map->size - 1);
+		atomic64_sub(PFN_DOWN(size), &dev->mm->pinned_vm);
 		vhost_iotlb_map_free(iotlb, map);
 	}
 }
@@ -591,6 +1065,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
 	struct vdpa_device *vdpa = v->vdpa;
 	const struct vdpa_config_ops *ops = vdpa->config;
 	int r = 0;
+	u64 size_count;
 
 	r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1,
 				      pa, perm, opaque);
@@ -610,9 +1085,11 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova,
 		vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
 		return r;
 	}
-
-	if (!vdpa->use_va)
-		atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
+	if (!vdpa->use_va) {
+		size_count = vhost_vdpa_search_range_add(dev->vdpa_mem_tree,
+							 iova, iova + size - 1);
+		atomic64_add(PFN_DOWN(size_count), &dev->mm->pinned_vm);
+	}
 
 	return 0;
 }
@@ -946,6 +1423,58 @@ static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
 	}
 }
 
+struct root_for_vdpa_node {
+	struct hlist_node hlist;
+	struct rb_root_cached vdpa_mem_tree;
+	pid_t pid_using;
+};
+static DECLARE_HASHTABLE(root_for_vdpa_node_list, 8);
+int status_for_vdpa_tree = 0;
+
+struct root_for_vdpa_node *vhost_vdpa_get_mem_tree(struct task_struct *task)
+{
+	struct root_for_vdpa_node *root_get_tmp = NULL;
+	pid_t pid_using = task_pid_nr(task);
+
+	/* No hased table, init one */
+	if (status_for_vdpa_tree == 0) {
+		hash_init(root_for_vdpa_node_list);
+		status_for_vdpa_tree = 1;
+	}
+
+	hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
+				pid_using) {
+		if (root_get_tmp->pid_using == pid_using)
+			return root_get_tmp;
+	}
+
+	root_get_tmp = kmalloc(sizeof(*root_get_tmp), GFP_KERNEL);
+	root_get_tmp->pid_using = pid_using;
+
+	root_get_tmp->vdpa_mem_tree = RB_ROOT_CACHED;
+
+	hash_add(root_for_vdpa_node_list, &root_get_tmp->hlist,
+		 root_get_tmp->pid_using);
+
+	return root_get_tmp;
+}
+
+void vhost_vdpa_relase_mem_tree(struct task_struct *task)
+{
+	struct root_for_vdpa_node *root_get_tmp = NULL;
+	pid_t pid_using = task_pid_nr(task);
+
+	/* No hased table, init one */
+	hash_for_each_possible (root_for_vdpa_node_list, root_get_tmp, hlist,
+				pid_using) {
+		if (root_get_tmp->pid_using == pid_using)
+			kfree(root_get_tmp);
+		return;
+	}
+
+	return;
+}
+
 static int vhost_vdpa_open(struct inode *inode, struct file *filep)
 {
 	struct vhost_vdpa *v;
@@ -991,10 +1520,13 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
 	vhost_vdpa_set_iova_range(v);
 
 	filep->private_data = v;
+	struct root_for_vdpa_node *tmp = vhost_vdpa_get_mem_tree(current);
+	dev->vdpa_mem_tree = &tmp->vdpa_mem_tree;
 
 	return 0;
 
 err_init_iotlb:
+	vhost_vdpa_relase_mem_tree(current);
 	vhost_dev_cleanup(&v->vdev);
 	kfree(vqs);
 err:
@@ -1016,6 +1548,8 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
 	struct vhost_dev *d = &v->vdev;
 
 	mutex_lock(&d->mutex);
+	vhost_vdpa_relase_mem_tree(current);
+
 	filep->private_data = NULL;
 	vhost_vdpa_clean_irq(v);
 	vhost_vdpa_reset(v);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 638bb640d6b4..d1c662eb4f26 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -161,6 +161,7 @@ struct vhost_dev {
 	int byte_weight;
 	u64 kcov_handle;
 	bool use_worker;
+	struct rb_root_cached *vdpa_mem_tree;
 	int (*msg_handler)(struct vhost_dev *dev,
 			   struct vhost_iotlb_msg *msg);
 };
-- 
2.34.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-06-04 16:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-04 16:04 [RFC PATCH v2] vdpa: Do not count the pages that were already pinned in the vhost-vDPA kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2022-06-01  1:20 Cindy Lu
2022-06-01  5:03 ` kernel test robot
2022-06-01  5:51 ` Jason Wang
2022-06-01  5:51   ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.