* [PATCH for-next] RDMA/rxe: fix regression caused by recent patch
@ 2020-10-29 21:25 Bob Pearson
  2020-10-30  2:39   ` kernel test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Bob Pearson @ 2020-10-29 21:25 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson

The commit referenced below performs additional checking on
devices used for DMA. Specifically it checks that

device->dma_mask != NULL

Rdma_rxe uses this device when pinning MR memory but did not
set the value of dma_mask. In fact rdma_rxe does not perform
any DMA operations so the value is never used but is checked.

This patch gives dma_mask a valid value. Without this patch
rdma_rxe does not function at all.

Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
Signed-off-by: Bob Pearson <rpearson@hpe.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 7652d53af2c1..116a234e92db 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	dev->node_type = RDMA_NODE_IB_CA;
 	dev->phys_port_cnt = 1;
 	dev->num_comp_vectors = num_possible_cpus();
+
+	/* rdma_rxe never does real DMA but does rely on
+	 * pinning user memory in MRs to avoid page faults
+	 * in responder and completer tasklets
+	 */
 	dev->dev.parent = rxe_dma_device(rxe);
+	dev->dev.dma_mask = DMA_BIT_MASK(64);
 	dev->local_dma_lkey = 0;
+
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
 	dev->dev.dma_parms = &rxe->dma_parms;
-- 
2.27.0
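
The check described in the commit message can be modelled in user space
roughly as follows. This is only a sketch: struct mock_device,
mock_map_page() and MOCK_MAPPING_ERROR are hypothetical stand-ins, not
kernel symbols, and the real mapping code does far more. It shows why a
device whose dma_mask pointer was never set is rejected even if no DMA
is ever performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct device; only the
 * field discussed in this thread is modelled. */
struct mock_device {
	uint64_t *dma_mask;	/* pointer to the mask, NULL if never set */
};

/* Hypothetical error value returned when a mapping is refused. */
#define MOCK_MAPPING_ERROR (~(uint64_t)0)

/* Hypothetical mapping entry point. It refuses any device whose
 * dma_mask pointer is NULL, which is the behaviour the commit
 * message attributes to the referenced commit. */
static uint64_t mock_map_page(const struct mock_device *dev, uint64_t phys)
{
	assert(dev != NULL);

	if (dev->dma_mask == NULL)
		return MOCK_MAPPING_ERROR;

	/* Identity mapping; the mask value itself is never consulted,
	 * which mirrors "never used but is checked". */
	return phys;
}
```

In this model only the pointer is tested, so the contents of the mask
are irrelevant to whether the call succeeds.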



* Re: [PATCH for-next] RDMA/rxe: fix regression caused by recent patch
  2020-10-29 21:25 [PATCH for-next] RDMA/rxe: fix regression caused by recent patch Bob Pearson
@ 2020-10-30  2:39   ` kernel test robot
  2020-10-30  2:54 ` Zhu Yanjun
  2020-10-30  5:46 ` Bob Pearson
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2020-10-30  2:39 UTC (permalink / raw)
  To: Bob Pearson, jgg, zyjzyj2000, linux-rdma; +Cc: kbuild-all, Bob Pearson

[-- Attachment #1: Type: text/plain, Size: 2649 bytes --]

Hi Bob,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on v5.10-rc1 next-20201029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/880fe509bd2bdc73c885fd887cb3935000855d49
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
        git checkout 880fe509bd2bdc73c885fd887cb3935000855d49
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/infiniband/sw/rxe/rxe_verbs.c: In function 'rxe_register_device':
>> drivers/infiniband/sw/rxe/rxe_verbs.c:1143:20: warning: assignment to 'u64 *' {aka 'long long unsigned int *'} from 'long long unsigned int' makes pointer from integer without a cast [-Wint-conversion]
    1143 |  dev->dev.dma_mask = DMA_BIT_MASK(64);
         |                    ^

vim +1143 drivers/infiniband/sw/rxe/rxe_verbs.c

  1125	
  1126	int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
  1127	{
  1128		int err;
  1129		struct ib_device *dev = &rxe->ib_dev;
  1130		struct crypto_shash *tfm;
  1131	
  1132		strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
  1133	
  1134		dev->node_type = RDMA_NODE_IB_CA;
  1135		dev->phys_port_cnt = 1;
  1136		dev->num_comp_vectors = num_possible_cpus();
  1137	
  1138		/* rdma_rxe never does real DMA but does rely on
  1139		 * pinning user memory in MRs to avoid page faults
  1140		 * in responder and completer tasklets
  1141		 */
  1142		dev->dev.parent = rxe_dma_device(rxe);
> 1143		dev->dev.dma_mask = DMA_BIT_MASK(64);

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 71486 bytes --]


* Re: [PATCH for-next] RDMA/rxe: fix regression caused by recent patch
  2020-10-29 21:25 [PATCH for-next] RDMA/rxe: fix regression caused by recent patch Bob Pearson
  2020-10-30  2:39   ` kernel test robot
@ 2020-10-30  2:54 ` Zhu Yanjun
  2020-10-30  5:46 ` Bob Pearson
  2 siblings, 0 replies; 5+ messages in thread
From: Zhu Yanjun @ 2020-10-30  2:54 UTC (permalink / raw)
  To: Bob Pearson; +Cc: Jason Gunthorpe, linux-rdma, Bob Pearson

On Fri, Oct 30, 2020 at 5:27 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
> device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>

Thanks a lot.

Zhu Yanjun

> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>         dev->node_type = RDMA_NODE_IB_CA;
>         dev->phys_port_cnt = 1;
>         dev->num_comp_vectors = num_possible_cpus();
> +
> +       /* rdma_rxe never does real DMA but does rely on
> +        * pinning user memory in MRs to avoid page faults
> +        * in responder and completer tasklets
> +        */
>         dev->dev.parent = rxe_dma_device(rxe);
> +       dev->dev.dma_mask = DMA_BIT_MASK(64);
>         dev->local_dma_lkey = 0;
> +
>         addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>                             rxe->ndev->dev_addr);
>         dev->dev.dma_parms = &rxe->dma_parms;
> --
> 2.27.0
>


* Re: [PATCH for-next] RDMA/rxe: fix regression caused by recent patch
  2020-10-29 21:25 [PATCH for-next] RDMA/rxe: fix regression caused by recent patch Bob Pearson
  2020-10-30  2:39   ` kernel test robot
  2020-10-30  2:54 ` Zhu Yanjun
@ 2020-10-30  5:46 ` Bob Pearson
  2 siblings, 0 replies; 5+ messages in thread
From: Bob Pearson @ 2020-10-30  5:46 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson

On 10/29/20 4:25 PM, Bob Pearson wrote:
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
> 
> device->dma_mask != NULL
> 
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
> 
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
> 
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets
> +	 */
>  	dev->dev.parent = rxe_dma_device(rxe);
> +	dev->dev.dma_mask = DMA_BIT_MASK(64);
>  	dev->local_dma_lkey = 0;
> +
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
>

Ignore this patch. It turns out it works because any nonzero value in
dma_mask stops the failing check, and since rxe never uses DMA it
doesn't affect anything. But it doesn't compile cleanly, because
dma_mask is a pointer to the actual mask and not the mask itself.
Somehow I missed the warning. I have a newer version that uses
dma_coerce_mask_and_coherent() and also works. (Works means it gets to
the next problem, as mentioned in the previous note.)

Bob
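
The pointer-versus-value distinction can be illustrated with a small
user-space sketch. It assumes, as the kernel headers appear to define
it, that dma_coerce_mask_and_coherent() first points dev->dma_mask at
dev->coherent_dma_mask and then stores the mask. struct fake_device and
mock_coerce_mask_and_coherent() are hypothetical names used only for
this illustration, and the validation done by the real helper is left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the form of the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical user-space stand-in for struct device. */
struct fake_device {
	uint64_t *dma_mask;		/* pointer to the streaming mask */
	uint64_t coherent_dma_mask;	/* coherent mask, stored by value */
};

/* Models the pattern behind dma_coerce_mask_and_coherent(): give the
 * dma_mask pointer valid storage inside the device, then write the
 * mask value through it. Always returns 0 in this sketch. */
static int mock_coerce_mask_and_coherent(struct fake_device *dev,
					 uint64_t mask)
{
	assert(dev != NULL);

	dev->dma_mask = &dev->coherent_dma_mask;	/* pointer, not value */
	*dev->dma_mask = mask;
	dev->coherent_dma_mask = mask;
	return 0;
}
```

Assigning DMA_BIT_MASK(64) directly to dma_mask, as the patch did,
stores an integer in a pointer field; the sketch instead leaves the
pointer non-NULL and pointing at a real u64.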
