From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com> To: Jason Gunthorpe <jgg@ziepe.ca> Cc: dledford@redhat.com, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, xavier.huwei@tom.com, lijun_nudt@163.com Subject: Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Date: Wed, 23 May 2018 11:49:13 +0800 [thread overview] Message-ID: <5B04E4B9.1050900@huawei.com> (raw) In-Reply-To: <5B04D7FE.70806@huawei.com> On 2018/5/23 10:54, Wei Hu (Xavier) wrote: > > On 2018/5/23 4:26, Jason Gunthorpe wrote: >> On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote: >>> On 2018/5/18 12:15, Jason Gunthorpe wrote: >>>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote: >>>>> On 2018/5/17 23:14, Jason Gunthorpe wrote: >>>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote: >>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >>>>>>> index 86ef15f..e1c44a6 100644 >>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >>>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, >>>>>>> int ret = 0; >>>>>>> int ntc; >>>>>>> >>>>>>> + if (hr_dev->is_reset) >>>>>>> + return 0; >>>>>>> + >>>>>>> spin_lock_bh(&csq->lock); >>>>>>> >>>>>>> if (num > hns_roce_cmq_space(csq)) { >>>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle) >>>>>>> return 0; >>>>>>> >>>>>>> error_failed_get_cfg: >>>>>>> + handle->priv = NULL; >>>>>>> kfree(hr_dev->priv); >>>>>>> >>>>>>> error_failed_kzalloc: >>>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle, >>>>>>> { >>>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv; >>>>>>> >>>>>>> + if (!hr_dev) >>>>>>> + return; >>>>>>> + >>>>>>> hns_roce_exit(hr_dev); >>>>>>> + handle->priv = NULL; >>>>>>> kfree(hr_dev->priv); >>>>>>> ib_dealloc_device(&hr_dev->ib_dev); >>>>>>> } >>>>>> Why are these hunks here? If init fails then uninit should not be >>>>>> called, so why meddle with priv? >>>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with >>>>> hr_dev, >>>>> We want clear the value in hns_roce_hw_v2_uninit_instance function. >>>>> So we can ensure no problem in RoCE driver. >>>> What problem could happen? >>>> >>>> I keep removing unnecessary sets to null and checks of null, so please >>>> don't add them if they cannot happen. >>>> >>>> Eg uninit should never be called with a null priv, that is a serious >>>> logic mis-design someplace if it happens. >>>> >>>> Jason >>> NIC driver call the registered reset_notify() function to finish the >>> part of RoCE reset process. >>> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT, >>> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the >>> resources. >>> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call >>> hns_roce_hw_v2_init_instance. >>> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in >>> the other callback >>> function registered by RoCE driver. >> Don't design things like this. >> >> init/uninit are paired - do not call something uninit if it can be >> called after init fails, or better, arrange to prevent that so things >> are sane. >> >> Jason >> >> . > The current RoCE driver registered 3 callback function to NIC driver as > belows: > 1.init_instance/uninit_instance are paired. > 2.In reset_notify function, RoCE dirver still call > init_instance/uninit_instance function. > but NIC driver does not perceive the behavior. We need to judge in RoCE > driver. > > static const struct hnae3_client_ops hns_roce_hw_v2_ops = { > .init_instance = hns_roce_hw_v2_init_instance, > .uninit_instance = hns_roce_hw_v2_uninit_instance, > .reset_notify = hns_roce_hw_v2_reset_notify, > }; struct hnae3_handle is defined in NIC driver, and handle->priv is used for RoCE driver, NIC driver will not use this member handle->priv. struct hnae3_handle { struct hnae3_client *client; struct pci_dev *pdev; void *priv; struct hnae3_ae_algo *ae_algo; /* the class who provides this handle */ u64 flags; /* Indicate the capabilities for this handle*/ unsigned long last_reset_time; enum hnae3_reset_type reset_level; union { struct net_device *netdev; /* first member */ struct hnae3_knic_private_info kinfo; struct hnae3_unic_private_info uinfo; struct hnae3_roce_private_info rinfo; }; u32 numa_node_mask; /* for multi-chip support */ }; > Wei Hu > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > . >
WARNING: multiple messages have this Message-ID (diff)
From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com> To: Jason Gunthorpe <jgg@ziepe.ca> Cc: <dledford@redhat.com>, <linux-rdma@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <xavier.huwei@tom.com>, <lijun_nudt@163.com> Subject: Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Date: Wed, 23 May 2018 11:49:13 +0800 [thread overview] Message-ID: <5B04E4B9.1050900@huawei.com> (raw) In-Reply-To: <5B04D7FE.70806@huawei.com> On 2018/5/23 10:54, Wei Hu (Xavier) wrote: > > On 2018/5/23 4:26, Jason Gunthorpe wrote: >> On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote: >>> On 2018/5/18 12:15, Jason Gunthorpe wrote: >>>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote: >>>>> On 2018/5/17 23:14, Jason Gunthorpe wrote: >>>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote: >>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >>>>>>> index 86ef15f..e1c44a6 100644 >>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >>>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, >>>>>>> int ret = 0; >>>>>>> int ntc; >>>>>>> >>>>>>> + if (hr_dev->is_reset) >>>>>>> + return 0; >>>>>>> + >>>>>>> spin_lock_bh(&csq->lock); >>>>>>> >>>>>>> if (num > hns_roce_cmq_space(csq)) { >>>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle) >>>>>>> return 0; >>>>>>> >>>>>>> error_failed_get_cfg: >>>>>>> + handle->priv = NULL; >>>>>>> kfree(hr_dev->priv); >>>>>>> >>>>>>> error_failed_kzalloc: >>>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle, >>>>>>> { >>>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv; >>>>>>> >>>>>>> + if (!hr_dev) >>>>>>> + return; >>>>>>> + >>>>>>> hns_roce_exit(hr_dev); >>>>>>> + handle->priv = NULL; >>>>>>> kfree(hr_dev->priv); >>>>>>> ib_dealloc_device(&hr_dev->ib_dev); >>>>>>> } >>>>>> Why are these hunks here? If init fails then uninit should not be >>>>>> called, so why meddle with priv? >>>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with >>>>> hr_dev, >>>>> We want clear the value in hns_roce_hw_v2_uninit_instance function. >>>>> So we can ensure no problem in RoCE driver. >>>> What problem could happen? >>>> >>>> I keep removing unnecessary sets to null and checks of null, so please >>>> don't add them if they cannot happen. >>>> >>>> Eg uninit should never be called with a null priv, that is a serious >>>> logic mis-design someplace if it happens. >>>> >>>> Jason >>> NIC driver call the registered reset_notify() function to finish the >>> part of RoCE reset process. >>> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT, >>> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the >>> resources. >>> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call >>> hns_roce_hw_v2_init_instance. >>> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in >>> the other callback >>> function registered by RoCE driver. >> Don't design things like this. >> >> init/uninit are paired - do not call something uninit if it can be >> called after init fails, or better, arrange to prevent that so things >> are sane. >> >> Jason >> >> . > The current RoCE driver registered 3 callback function to NIC driver as > belows: > 1.init_instance/uninit_instance are paired. > 2.In reset_notify function, RoCE dirver still call > init_instance/uninit_instance function. > but NIC driver does not perceive the behavior. We need to judge in RoCE > driver. > > static const struct hnae3_client_ops hns_roce_hw_v2_ops = { > .init_instance = hns_roce_hw_v2_init_instance, > .uninit_instance = hns_roce_hw_v2_uninit_instance, > .reset_notify = hns_roce_hw_v2_reset_notify, > }; struct hnae3_handle is defined in NIC driver, and handle->priv is used for RoCE driver, NIC driver will not use this member handle->priv. struct hnae3_handle { struct hnae3_client *client; struct pci_dev *pdev; void *priv; struct hnae3_ae_algo *ae_algo; /* the class who provides this handle */ u64 flags; /* Indicate the capabilities for this handle*/ unsigned long last_reset_time; enum hnae3_reset_type reset_level; union { struct net_device *netdev; /* first member */ struct hnae3_knic_private_info kinfo; struct hnae3_unic_private_info uinfo; struct hnae3_roce_private_info rinfo; }; u32 numa_node_mask; /* for multi-chip support */ }; > Wei Hu > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > . >
next prev parent reply other threads:[~2018-05-23 3:49 UTC|newest] Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-17 15:00 ` Jason Gunthorpe 2018-05-19 8:24 ` Wei Hu (Xavier) 2018-05-19 8:24 ` Wei Hu (Xavier) 2018-05-22 20:21 ` Jason Gunthorpe 2018-05-23 9:33 ` Wei Hu (Xavier) 2018-05-23 9:33 ` Wei Hu (Xavier) 2018-05-17 8:02 ` [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-23 6:05 ` Leon Romanovsky 2018-05-23 6:49 ` Wei Hu (Xavier) 2018-05-23 6:49 ` Wei Hu (Xavier) 2018-05-23 7:00 ` Leon Romanovsky 2018-05-23 7:12 ` Wei Hu (Xavier) 2018-05-23 7:12 ` Wei Hu (Xavier) 2018-05-23 7:22 ` Leon Romanovsky 2018-05-17 8:02 ` [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-23 5:49 ` Leon Romanovsky 2018-05-23 6:09 ` Wei Hu (Xavier) 2018-05-23 6:09 ` Wei Hu (Xavier) 2018-05-23 6:15 ` Wei Hu (Xavier) 2018-05-23 6:15 ` Wei Hu (Xavier) 2018-05-23 6:23 ` Leon Romanovsky 2018-05-17 8:02 ` [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-17 15:14 ` Jason Gunthorpe 2018-05-18 3:28 ` Wei Hu (Xavier) 2018-05-18 3:28 ` Wei Hu (Xavier) 2018-05-18 4:15 ` Jason Gunthorpe 2018-05-18 7:23 ` Wei Hu (Xavier) 2018-05-18 7:23 ` Wei Hu (Xavier) 2018-05-22 20:26 ` Jason Gunthorpe 2018-05-23 2:54 ` Wei Hu (Xavier) 2018-05-23 2:54 ` Wei Hu (Xavier) 2018-05-23 3:47 ` Jason Gunthorpe 2018-05-23 9:35 ` Wei Hu (Xavier) 2018-05-23 9:35 ` Wei Hu (Xavier) 2018-05-23 3:49 ` Wei Hu (Xavier) [this message] 2018-05-23 3:49 ` Wei Hu (Xavier) 2018-05-17 8:02 ` [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page Wei Hu (Xavier) 2018-05-17 8:02 ` Wei Hu (Xavier) 2018-05-23 6:17 ` Leon Romanovsky 2018-05-23 6:38 ` Wei Hu (Xavier) 2018-05-23 6:38 ` Wei Hu (Xavier)
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=5B04E4B9.1050900@huawei.com \ --to=xavier.huwei@huawei.com \ --cc=dledford@redhat.com \ --cc=jgg@ziepe.ca \ --cc=lijun_nudt@163.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-rdma@vger.kernel.org \ --cc=xavier.huwei@tom.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.