* [PATCH rfc 0/3] nvmet-rdma automatic port re-activation
From: Sagi Grimberg @ 2018-03-22 19:02 UTC (permalink / raw)

When an RDMA device goes away we must destroy all its associated RDMA
resources. RDMA device resets also manifest as device removal events,
and a short while later the device comes back. We want to re-activate a
port listener on this RDMA device when it comes back into the system.

To make that happen, we save the RDMA device node_guid on a ULP
listener representation (nvmet_rdma_port), and when an RDMA device
comes into the system, we check if there is a listener port that needs
to be re-activated. In addition, we reflect the port state to the
sysadmin.

Israel Rukshin (2):
  nvmet: Add fabrics ops to port
  nvmet: Add port transport active flag

Sagi Grimberg (1):
  nvmet-rdma: automatic listening port re-activation

 drivers/nvme/target/configfs.c |  10 ++
 drivers/nvme/target/core.c     |  17 ++-
 drivers/nvme/target/nvmet.h    |   3 +
 drivers/nvme/target/rdma.c     | 231 +++++++++++++++++++++++++----------------
 4 files changed, 169 insertions(+), 92 deletions(-)

-- 
2.14.1
* [PATCH rfc 1/3] nvmet-rdma: automatic listening port re-activation
From: Sagi Grimberg @ 2018-03-22 19:03 UTC (permalink / raw)

In case the device goes away (or resets) we get a device
removal event (or .remove ib_client callback). So what
we want is to destroy the listening cm_id and re-activate
(or enable) it when the same device comes back. Hence we introduce
nvmet_rdma_port, which stores the ib_device node guid, and when
a new device comes into the system (ib_client .add callback) we
search for an existing listener port on this device and reconfigure
the listener cm_id.

Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/target/rdma.c | 223 +++++++++++++++++++++++++++------------------
 1 file changed, 136 insertions(+), 87 deletions(-)

diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index d7831372e1f9..1b7f72925e9f 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -119,6 +119,15 @@ struct nvmet_rdma_device {
 	struct list_head	entry;
 };
 
+struct nvmet_rdma_port {
+	struct nvmet_port	*nport;
+	struct sockaddr_storage	addr;
+	struct rdma_cm_id	*cm_id;
+	__be64			node_guid;
+	struct list_head	entry;
+	struct delayed_work	enable_work;
+};
+
 static bool nvmet_rdma_use_srq;
 module_param_named(use_srq, nvmet_rdma_use_srq, bool, 0444);
 MODULE_PARM_DESC(use_srq, "Use shared receive queue.");
@@ -130,6 +139,9 @@ static DEFINE_MUTEX(nvmet_rdma_queue_mutex);
 static LIST_HEAD(device_list);
 static DEFINE_MUTEX(device_list_mutex);
 
+static LIST_HEAD(port_list);
+static DEFINE_MUTEX(port_list_mutex);
+
 static bool nvmet_rdma_execute_command(struct nvmet_rdma_rsp *rsp);
 static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc);
 static void nvmet_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc);
@@ -1130,6 +1142,7 @@ static int nvmet_rdma_cm_accept(struct rdma_cm_id *cm_id,
 static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
 		struct rdma_cm_event *event)
 {
+	struct nvmet_rdma_port *port = cm_id->context;
 	struct nvmet_rdma_device *ndev;
 	struct nvmet_rdma_queue *queue;
 	int ret = -EINVAL;
@@ -1145,7 +1158,7 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
 		ret = -ENOMEM;
 		goto put_device;
 	}
-	queue->port = cm_id->context;
+	queue->port = port->nport;
 
 	if (queue->host_qid == 0) {
 		/* Let inflight controller teardown complete */
@@ -1252,53 +1265,6 @@ static void nvmet_rdma_queue_connect_fail(struct rdma_cm_id *cm_id,
 	schedule_work(&queue->release_work);
 }
 
-/**
- * nvme_rdma_device_removal() - Handle RDMA device removal
- * @cm_id:	rdma_cm id, used for nvmet port
- * @queue:	nvmet rdma queue (cm id qp_context)
- *
- * DEVICE_REMOVAL event notifies us that the RDMA device is about
- * to unplug. Note that this event can be generated on a normal
- * queue cm_id and/or a device bound listener cm_id (where in this
- * case queue will be null).
- *
- * We registered an ib_client to handle device removal for queues,
- * so we only need to handle the listening port cm_ids. In this case
- * we nullify the priv to prevent double cm_id destruction and destroying
- * the cm_id implicitely by returning a non-zero rc to the callout.
- */
-static int nvmet_rdma_device_removal(struct rdma_cm_id *cm_id,
-		struct nvmet_rdma_queue *queue)
-{
-	struct nvmet_port *port;
-
-	if (queue) {
-		/*
-		 * This is a queue cm_id. we have registered
-		 * an ib_client to handle queues removal
-		 * so don't interfear and just return.
-		 */
-		return 0;
-	}
-
-	port = cm_id->context;
-
-	/*
-	 * This is a listener cm_id. Make sure that
-	 * future remove_port won't invoke a double
-	 * cm_id destroy. use atomic xchg to make sure
-	 * we don't compete with remove_port.
-	 */
-	if (xchg(&port->priv, NULL) != cm_id)
-		return 0;
-
-	/*
-	 * We need to return 1 so that the core will destroy
-	 * it's own ID. What a great API design..
-	 */
-	return 1;
-}
-
 static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		struct rdma_cm_event *event)
 {
@@ -1331,8 +1297,7 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		nvmet_rdma_queue_disconnect(queue);
 		break;
 	case RDMA_CM_EVENT_DEVICE_REMOVAL:
-		ret = nvmet_rdma_device_removal(cm_id, queue);
-		break;
+		break; /* handled by nvmet_rdma_remove_one */
 	case RDMA_CM_EVENT_REJECTED:
 		pr_debug("Connection rejected: %s\n",
 			 rdma_reject_msg(cm_id, event->status));
@@ -1368,34 +1333,12 @@ static void nvmet_rdma_delete_ctrl(struct nvmet_ctrl *ctrl)
 	mutex_unlock(&nvmet_rdma_queue_mutex);
 }
 
-static int nvmet_rdma_add_port(struct nvmet_port *port)
+static int nvmet_rdma_enable_port(struct nvmet_rdma_port *port)
 {
+	struct sockaddr *addr = (struct sockaddr *)&port->addr;
 	struct rdma_cm_id *cm_id;
-	struct sockaddr_storage addr = { };
-	__kernel_sa_family_t af;
 	int ret;
 
-	switch (port->disc_addr.adrfam) {
-	case NVMF_ADDR_FAMILY_IP4:
-		af = AF_INET;
-		break;
-	case NVMF_ADDR_FAMILY_IP6:
-		af = AF_INET6;
-		break;
-	default:
-		pr_err("address family %d not supported\n",
-				port->disc_addr.adrfam);
-		return -EINVAL;
-	}
-
-	ret = inet_pton_with_scope(&init_net, af, port->disc_addr.traddr,
-			port->disc_addr.trsvcid, &addr);
-	if (ret) {
-		pr_err("malformed ip/port passed: %s:%s\n",
-			port->disc_addr.traddr, port->disc_addr.trsvcid);
-		return ret;
-	}
-
 	cm_id = rdma_create_id(&init_net, nvmet_rdma_cm_handler, port,
 			RDMA_PS_TCP, IB_QPT_RC);
 	if (IS_ERR(cm_id)) {
@@ -1413,23 +1356,22 @@ static int nvmet_rdma_add_port(struct nvmet_port *port)
 		goto out_destroy_id;
 	}
 
-	ret = rdma_bind_addr(cm_id, (struct sockaddr *)&addr);
+	ret = rdma_bind_addr(cm_id, addr);
 	if (ret) {
-		pr_err("binding CM ID to %pISpcs failed (%d)\n",
-			(struct sockaddr *)&addr, ret);
+		pr_err("binding CM ID to %pISpcs failed (%d)\n", addr, ret);
 		goto out_destroy_id;
 	}
 
 	ret = rdma_listen(cm_id, 128);
 	if (ret) {
-		pr_err("listening to %pISpcs failed (%d)\n",
-			(struct sockaddr *)&addr, ret);
+		pr_err("listening to %pISpcs failed (%d)\n", addr, ret);
 		goto out_destroy_id;
 	}
 
-	pr_info("enabling port %d (%pISpcs)\n",
-		le16_to_cpu(port->disc_addr.portid), (struct sockaddr *)&addr);
-	port->priv = cm_id;
+	port->cm_id = cm_id;
+	if (cm_id->device)
+		port->node_guid = cm_id->device->node_guid;
+
 	return 0;
 
 out_destroy_id:
@@ -1437,18 +1379,100 @@ static int nvmet_rdma_add_port(struct nvmet_port *port)
 	return ret;
 }
 
-static void nvmet_rdma_remove_port(struct nvmet_port *port)
+static void nvmet_rdma_enable_port_work(struct work_struct *w)
+{
+	struct nvmet_rdma_port *port = container_of(to_delayed_work(w),
+			struct nvmet_rdma_port, enable_work);
+	int ret;
+
+	ret = nvmet_rdma_enable_port(port);
+	if (ret)
+		schedule_delayed_work(&port->enable_work, HZ);
+}
+
+static int nvmet_rdma_add_port(struct nvmet_port *nport)
+{
+	struct nvmet_rdma_port *port;
+	__kernel_sa_family_t af;
+	int ret;
+
+	port = kzalloc(sizeof(*port), GFP_KERNEL);
+	if (!port)
+		return -ENOMEM;
+
+	switch (nport->disc_addr.adrfam) {
+	case NVMF_ADDR_FAMILY_IP4:
+		af = AF_INET;
+		break;
+	case NVMF_ADDR_FAMILY_IP6:
+		af = AF_INET6;
+		break;
+	default:
+		pr_err("address family %d not supported\n",
+				nport->disc_addr.adrfam);
+		ret = -EINVAL;
+		goto out_free_port;
+	}
+
+	ret = inet_pton_with_scope(&init_net, af, nport->disc_addr.traddr,
+			nport->disc_addr.trsvcid, &port->addr);
+	if (ret) {
+		pr_err("malformed ip/port passed: %s:%s\n",
+			nport->disc_addr.traddr, nport->disc_addr.trsvcid);
+		goto out_free_port;
+	}
+
+	ret = nvmet_rdma_enable_port(port);
+	if (ret)
+		goto out_free_port;
+
+	pr_info("enabling port %d (%pISpcs)\n",
+		le16_to_cpu(nport->disc_addr.portid),
+		(struct sockaddr *)&port->addr);
+
+	nport->priv = port;
+	port->nport = nport;
+	INIT_DELAYED_WORK(&port->enable_work, nvmet_rdma_enable_port_work);
+
+	mutex_lock(&port_list_mutex);
+	list_add_tail(&port->entry, &port_list);
+	mutex_unlock(&port_list_mutex);
+
+	return 0;
+
+out_free_port:
+	kfree(port);
+	return ret;
+}
+
+static void nvmet_rdma_disable_port(struct nvmet_rdma_port *port)
 {
-	struct rdma_cm_id *cm_id = xchg(&port->priv, NULL);
+	struct rdma_cm_id *cm_id = port->cm_id;
 
+	port->cm_id = NULL;
 	if (cm_id)
 		rdma_destroy_id(cm_id);
 }
 
+static void nvmet_rdma_remove_port(struct nvmet_port *nport)
+{
+	struct nvmet_rdma_port *port = nport->priv;
+
+	mutex_lock(&port_list_mutex);
+	list_del(&port->entry);
+	mutex_unlock(&port_list_mutex);
+
+	cancel_delayed_work_sync(&port->enable_work);
+
+	nvmet_rdma_disable_port(port);
+	kfree(port);
+}
+
 static void nvmet_rdma_disc_port_addr(struct nvmet_req *req,
-		struct nvmet_port *port, char *traddr)
+		struct nvmet_port *nport, char *traddr)
 {
-	struct rdma_cm_id *cm_id = port->priv;
+	struct nvmet_rdma_port *port = nport->priv;
+	struct rdma_cm_id *cm_id = port->cm_id;
 
 	if (inet_addr_is_any((struct sockaddr *)&cm_id->route.addr.src_addr)) {
 		struct nvmet_rdma_rsp *rsp =
@@ -1458,7 +1482,7 @@ static void nvmet_rdma_disc_port_addr(struct nvmet_req *req,
 
 		sprintf(traddr, "%pISc", addr);
 	} else {
-		memcpy(traddr, port->disc_addr.traddr, NVMF_TRADDR_SIZE);
+		memcpy(traddr, nport->disc_addr.traddr, NVMF_TRADDR_SIZE);
 	}
 }
 
@@ -1475,9 +1499,24 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
 	.disc_traddr	= nvmet_rdma_disc_port_addr,
 };
 
+static void nvmet_rdma_add_one(struct ib_device *ib_device)
+{
+	struct nvmet_rdma_port *port, *n;
+
+	mutex_lock(&port_list_mutex);
+	list_for_each_entry_safe(port, n, &port_list, entry) {
+		if (port->node_guid != ib_device->node_guid)
+			continue;
+
+		schedule_delayed_work(&port->enable_work, HZ);
+	}
+	mutex_unlock(&port_list_mutex);
+}
+
 static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvmet_rdma_queue *queue, *tmp;
+	struct nvmet_rdma_port *port, *n;
 
 	/* Device is being removed, delete all queues using this device */
 	mutex_lock(&nvmet_rdma_queue_mutex);
@@ -1492,11 +1531,21 @@ static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data
 	}
 	mutex_unlock(&nvmet_rdma_queue_mutex);
 
+	mutex_lock(&port_list_mutex);
+	list_for_each_entry_safe(port, n, &port_list, entry) {
+		if (port->node_guid != ib_device->node_guid)
+			continue;
+
+		nvmet_rdma_disable_port(port);
+	}
+	mutex_unlock(&port_list_mutex);
+
 	flush_scheduled_work();
 }
 
 static struct ib_client nvmet_rdma_ib_client = {
 	.name	= "nvmet_rdma",
+	.add	= nvmet_rdma_add_one,
 	.remove	= nvmet_rdma_remove_one
 };
-- 
2.14.1
* [PATCH rfc 1/3] nvmet-rdma: automatic listening port re-activation
From: Max Gurtovoy @ 2018-03-25 8:25 UTC (permalink / raw)

Hi Sagi,
this is pretty much the same approach (.add_one) that I used in my first
implementation of persistence in NVMe-oF, before the discussion we had
with OrenD on the list. So I'm good with that (after initial review).

We'll run some testing with this code and send the results later.

One thing we should think of is fatal error handling for the target
(I've sent a patch for the initiator using IB events; I'll see how I can
add this to the target too).

Cheers,
-Max

On 3/22/2018 9:03 PM, Sagi Grimberg wrote:
> In case the device goes away (or resets) we get a device
> removal event (or .remove ib_client callback). So what
> we want is to destroy the listening cm_id and re-activate
> (or enable) when the same device comes back. Hence we introduce
> nvmet_rdma_port which stores the ib_device node guid, and when
> a new device comes in to the system (ib_client .add callback) we
> search for an existing listener port on this device and reconfigure
> the listener cm_id.
> [...]
* [PATCH rfc 1/3] nvmet-rdma: automatic listening port re-activation 2018-03-22 19:03 ` [PATCH rfc 1/3] nvmet-rdma: automatic listening " Sagi Grimberg 2018-03-25 8:25 ` Max Gurtovoy @ 2018-03-26 7:01 ` Max Gurtovoy 2018-04-04 13:06 ` Sagi Grimberg 1 sibling, 1 reply; 12+ messages in thread From: Max Gurtovoy @ 2018-03-26 7:01 UTC (permalink / raw) On 3/22/2018 9:03 PM, Sagi Grimberg wrote: > In case the device goes away (or resets) we get a device > removal event (or .remove ib_client callback). So what > we want is to destroy the listening cm_id and re-activate > (or enable) when the same device comes back. Hence we introduce > nvmet_rdma_port which stores the ib_device node guid, and when > a new device comes in to the system (ib_client .add callback) we > search for an existing listener port on this device and reconfigure > the listener cm_id. > > Signed-off-by: Sagi Grimberg <sagi at grimberg.me> > --- > drivers/nvme/target/rdma.c | 223 +++++++++++++++++++++++++++------------------ > 1 file changed, 136 insertions(+), 87 deletions(-) > > diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c > index d7831372e1f9..1b7f72925e9f 100644 > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@ -119,6 +119,15 @@ struct nvmet_rdma_device { > struct list_head entry; > }; > > +struct nvmet_rdma_port { > + struct nvmet_port *nport; > + struct sockaddr_storage addr; > + struct rdma_cm_id *cm_id; > + __be64 node_guid; > + struct list_head entry; > + struct delayed_work enable_work; > +}; > + > static bool nvmet_rdma_use_srq; > module_param_named(use_srq, nvmet_rdma_use_srq, bool, 0444); > MODULE_PARM_DESC(use_srq, "Use shared receive queue."); > @@ -130,6 +139,9 @@ static DEFINE_MUTEX(nvmet_rdma_queue_mutex); > static LIST_HEAD(device_list); > static DEFINE_MUTEX(device_list_mutex); > > +static LIST_HEAD(port_list); > +static DEFINE_MUTEX(port_list_mutex); > + > static bool nvmet_rdma_execute_command(struct nvmet_rdma_rsp *rsp); > static 
void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc); > static void nvmet_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc); > @@ -1130,6 +1142,7 @@ static int nvmet_rdma_cm_accept(struct rdma_cm_id *cm_id, > static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id, > struct rdma_cm_event *event) > { > + struct nvmet_rdma_port *port = cm_id->context; > struct nvmet_rdma_device *ndev; > struct nvmet_rdma_queue *queue; > int ret = -EINVAL; > @@ -1145,7 +1158,7 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id, > ret = -ENOMEM; > goto put_device; > } > - queue->port = cm_id->context; > + queue->port = port->nport; > > if (queue->host_qid == 0) { > /* Let inflight controller teardown complete */ > @@ -1252,53 +1265,6 @@ static void nvmet_rdma_queue_connect_fail(struct rdma_cm_id *cm_id, > schedule_work(&queue->release_work); > } > > -/** > - * nvme_rdma_device_removal() - Handle RDMA device removal > - * @cm_id: rdma_cm id, used for nvmet port > - * @queue: nvmet rdma queue (cm id qp_context) > - * > - * DEVICE_REMOVAL event notifies us that the RDMA device is about > - * to unplug. Note that this event can be generated on a normal > - * queue cm_id and/or a device bound listener cm_id (where in this > - * case queue will be null). > - * > - * We registered an ib_client to handle device removal for queues, > - * so we only need to handle the listening port cm_ids. In this case > - * we nullify the priv to prevent double cm_id destruction and destroying > - * the cm_id implicitely by returning a non-zero rc to the callout. > - */ > -static int nvmet_rdma_device_removal(struct rdma_cm_id *cm_id, > - struct nvmet_rdma_queue *queue) > -{ > - struct nvmet_port *port; > - > - if (queue) { > - /* > - * This is a queue cm_id. we have registered > - * an ib_client to handle queues removal > - * so don't interfear and just return. > - */ > - return 0; > - } > - > - port = cm_id->context; > - > - /* > - * This is a listener cm_id. 
Make sure that > - * future remove_port won't invoke a double > - * cm_id destroy. use atomic xchg to make sure > - * we don't compete with remove_port. > - */ > - if (xchg(&port->priv, NULL) != cm_id) > - return 0; > - > - /* > - * We need to return 1 so that the core will destroy > - * it's own ID. What a great API design.. > - */ > - return 1; > -} > - > static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id, > struct rdma_cm_event *event) > { > @@ -1331,8 +1297,7 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id, > nvmet_rdma_queue_disconnect(queue); > break; > case RDMA_CM_EVENT_DEVICE_REMOVAL: > - ret = nvmet_rdma_device_removal(cm_id, queue); > - break; > + break; /* handled by nvmet_rdma_remove_one */ > case RDMA_CM_EVENT_REJECTED: > pr_debug("Connection rejected: %s\n", > rdma_reject_msg(cm_id, event->status)); > @@ -1368,34 +1333,12 @@ static void nvmet_rdma_delete_ctrl(struct nvmet_ctrl *ctrl) > mutex_unlock(&nvmet_rdma_queue_mutex); > } > > -static int nvmet_rdma_add_port(struct nvmet_port *port) > +static int nvmet_rdma_enable_port(struct nvmet_rdma_port *port) > { > + struct sockaddr *addr = (struct sockaddr *)&port->addr; > struct rdma_cm_id *cm_id; > - struct sockaddr_storage addr = { }; > - __kernel_sa_family_t af; > int ret; > > - switch (port->disc_addr.adrfam) { > - case NVMF_ADDR_FAMILY_IP4: > - af = AF_INET; > - break; > - case NVMF_ADDR_FAMILY_IP6: > - af = AF_INET6; > - break; > - default: > - pr_err("address family %d not supported\n", > - port->disc_addr.adrfam); > - return -EINVAL; > - } > - > - ret = inet_pton_with_scope(&init_net, af, port->disc_addr.traddr, > - port->disc_addr.trsvcid, &addr); > - if (ret) { > - pr_err("malformed ip/port passed: %s:%s\n", > - port->disc_addr.traddr, port->disc_addr.trsvcid); > - return ret; > - } > - > cm_id = rdma_create_id(&init_net, nvmet_rdma_cm_handler, port, > RDMA_PS_TCP, IB_QPT_RC); > if (IS_ERR(cm_id)) { > @@ -1413,23 +1356,22 @@ static int nvmet_rdma_add_port(struct nvmet_port 
*port) > goto out_destroy_id; > } > > - ret = rdma_bind_addr(cm_id, (struct sockaddr *)&addr); > + ret = rdma_bind_addr(cm_id, addr); > if (ret) { > - pr_err("binding CM ID to %pISpcs failed (%d)\n", > - (struct sockaddr *)&addr, ret); > + pr_err("binding CM ID to %pISpcs failed (%d)\n", addr, ret); > goto out_destroy_id; > } > > ret = rdma_listen(cm_id, 128); > if (ret) { > - pr_err("listening to %pISpcs failed (%d)\n", > - (struct sockaddr *)&addr, ret); > + pr_err("listening to %pISpcs failed (%d)\n", addr, ret); > goto out_destroy_id; > } > > - pr_info("enabling port %d (%pISpcs)\n", > - le16_to_cpu(port->disc_addr.portid), (struct sockaddr *)&addr); > - port->priv = cm_id; > + port->cm_id = cm_id; > + if (cm_id->device) can device be NULL here ? > + port->node_guid = cm_id->device->node_guid; > + > return 0; > > out_destroy_id: > @@ -1437,18 +1379,100 @@ static int nvmet_rdma_add_port(struct nvmet_port *port) > return ret; > } > > -static void nvmet_rdma_remove_port(struct nvmet_port *port) > +static void nvmet_rdma_enable_port_work(struct work_struct *w) > +{ > + struct nvmet_rdma_port *port = container_of(to_delayed_work(w), > + struct nvmet_rdma_port, enable_work); > + int ret; > + > + ret = nvmet_rdma_enable_port(port); > + if (ret) > + schedule_delayed_work(&port->enable_work, HZ); I think it's beter to schedule it after 5 seconds since we use the system_wq and since it's a recovery process. 
> +} > + > +static int nvmet_rdma_add_port(struct nvmet_port *nport) > +{ > + struct nvmet_rdma_port *port; > + __kernel_sa_family_t af; > + int ret; > + > + port = kzalloc(sizeof(*port), GFP_KERNEL); > + if (!port) > + return -ENOMEM; > + > + switch (nport->disc_addr.adrfam) { > + case NVMF_ADDR_FAMILY_IP4: > + af = AF_INET; > + break; > + case NVMF_ADDR_FAMILY_IP6: > + af = AF_INET6; > + break; > + default: > + pr_err("address family %d not supported\n", > + nport->disc_addr.adrfam); > + ret = -EINVAL; > + goto out_free_port; > + } > + > + ret = inet_pton_with_scope(&init_net, af, nport->disc_addr.traddr, > + nport->disc_addr.trsvcid, &port->addr); > + if (ret) { > + pr_err("malformed ip/port passed: %s:%s\n", > + nport->disc_addr.traddr, nport->disc_addr.trsvcid); > + goto out_free_port; > + } > + > + ret = nvmet_rdma_enable_port(port); > + if(ret) > + goto out_free_port; > + > + pr_info("enabling port %d (%pISpcs)\n", > + le16_to_cpu(nport->disc_addr.portid), > + (struct sockaddr *)&port->addr); > + > + nport->priv = port; > + port->nport = nport; > + INIT_DELAYED_WORK(&port->enable_work, nvmet_rdma_enable_port_work); > + > + mutex_lock(&port_list_mutex); > + list_add_tail(&port->entry, &port_list); > + mutex_unlock(&port_list_mutex); > + > + return 0; > + > +out_free_port: > + kfree(port); > + return ret; > +} > + > +static void nvmet_rdma_disable_port(struct nvmet_rdma_port *port) > { > - struct rdma_cm_id *cm_id = xchg(&port->priv, NULL); > + struct rdma_cm_id *cm_id = port->cm_id; > > + port->cm_id = NULL; > if (cm_id) > rdma_destroy_id(cm_id); > } > > +static void nvmet_rdma_remove_port(struct nvmet_port *nport) > +{ > + struct nvmet_rdma_port *port = nport->priv; > + > + mutex_lock(&port_list_mutex); > + list_del(&port->entry); > + mutex_unlock(&port_list_mutex); > + > + cancel_delayed_work_sync(&port->enable_work); > + > + nvmet_rdma_disable_port(port); > + kfree(port); > +} > + > static void nvmet_rdma_disc_port_addr(struct nvmet_req *req, > - struct 
nvmet_port *port, char *traddr) > + struct nvmet_port *nport, char *traddr) > { > - struct rdma_cm_id *cm_id = port->priv; > + struct nvmet_rdma_port *port = nport->priv; > + struct rdma_cm_id *cm_id = port->cm_id; > > if (inet_addr_is_any((struct sockaddr *)&cm_id->route.addr.src_addr)) { > struct nvmet_rdma_rsp *rsp = > @@ -1458,7 +1482,7 @@ static void nvmet_rdma_disc_port_addr(struct nvmet_req *req, > > sprintf(traddr, "%pISc", addr); > } else { > - memcpy(traddr, port->disc_addr.traddr, NVMF_TRADDR_SIZE); > + memcpy(traddr, nport->disc_addr.traddr, NVMF_TRADDR_SIZE); > } > } > > @@ -1475,9 +1499,24 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = { > .disc_traddr = nvmet_rdma_disc_port_addr, > }; > > +static void nvmet_rdma_add_one(struct ib_device *ib_device) > +{ > + struct nvmet_rdma_port *port, *n; > + > + mutex_lock(&port_list_mutex); > + list_for_each_entry_safe(port, n, &port_list, entry) { > + if (port->node_guid != ib_device->node_guid) > + continue; > + > + schedule_delayed_work(&port->enable_work, HZ); > + } > + mutex_unlock(&port_list_mutex); > +} > + > static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data) > { > struct nvmet_rdma_queue *queue, *tmp; > + struct nvmet_rdma_port *port, *n; > > /* Device is being removed, delete all queues using this device */ > mutex_lock(&nvmet_rdma_queue_mutex); > @@ -1492,11 +1531,21 @@ static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data > } > mutex_unlock(&nvmet_rdma_queue_mutex); > > + mutex_lock(&port_list_mutex); > + list_for_each_entry_safe(port, n, &port_list, entry) { > + if (port->node_guid != ib_device->node_guid) > + continue; > + > + nvmet_rdma_disable_port(port); > + } > + mutex_unlock(&port_list_mutex); > + I'm not sure it's rebased, but please check my "nvmet-rdma: Don't flush system_wq by default during remove_one" We need to add a condition here: if (found) flush_scheduled_work(); and remove it from above. 
> flush_scheduled_work(); > } > > static struct ib_client nvmet_rdma_ib_client = { > .name = "nvmet_rdma", > + .add = nvmet_rdma_add_one, > .remove = nvmet_rdma_remove_one > }; > > We've started testing this and looks good with the above changes. We'll increase the scale today. Last issue we saw is some tabs/spaces mixture. Cheers, Max. ^ permalink raw reply [flat|nested] 12+ messages in thread
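For readers following the flow of patch 1: each listener remembers the node GUID of the RDMA device it was bound to, so the ib_client `.add`/`.remove` callbacks can find it again when the same device returns. A rough userspace model of that bookkeeping, in Python for brevity (all names illustrative, not the actual kernel symbols; locking and the delayed-work retry are elided):

```python
class Port:
    """Stand-in for nvmet_rdma_port: remembers its device's GUID."""
    def __init__(self, node_guid):
        self.node_guid = node_guid
        self.active = True          # stands in for a live listening cm_id

port_list = []                      # analog of the port_list protected by port_list_mutex

def device_removed(guid):
    """Analog of the .remove ib_client callback: tear down matching listeners."""
    for port in port_list:
        if port.node_guid == guid:
            port.active = False     # kernel: nvmet_rdma_disable_port()

def device_added(guid):
    """Analog of the .add callback: re-activate listeners waiting on this device."""
    for port in port_list:
        if port.node_guid == guid:
            port.active = True      # kernel: schedule_delayed_work(&enable_work, ...)

port_list.append(Port(0x1111))
port_list.append(Port(0x2222))

device_removed(0x1111)              # device 0x1111 resets...
print([p.active for p in port_list])    # [False, True]

device_added(0x1111)                # ...and comes back
print([p.active for p in port_list])    # [True, True]
```

The GUID comparison is the whole matching policy: a port on a different device (0x2222 above) is untouched by another device's removal.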
* [PATCH rfc 1/3] nvmet-rdma: automatic listening port re-activation 2018-03-26 7:01 ` Max Gurtovoy @ 2018-04-04 13:06 ` Sagi Grimberg 0 siblings, 0 replies; 12+ messages in thread From: Sagi Grimberg @ 2018-04-04 13:06 UTC (permalink / raw) >> @@ -1413,23 +1356,22 @@ static int nvmet_rdma_add_port(struct >> nvmet_port *port) >>         goto out_destroy_id; >>     } >> -    ret = rdma_bind_addr(cm_id, (struct sockaddr *)&addr); >> +    ret = rdma_bind_addr(cm_id, addr); >>     if (ret) { >> -        pr_err("binding CM ID to %pISpcs failed (%d)\n", >> -            (struct sockaddr *)&addr, ret); >> +        pr_err("binding CM ID to %pISpcs failed (%d)\n", addr, ret); >>         goto out_destroy_id; >>     } >>     ret = rdma_listen(cm_id, 128); >>     if (ret) { >> -        pr_err("listening to %pISpcs failed (%d)\n", >> -            (struct sockaddr *)&addr, ret); >> +        pr_err("listening to %pISpcs failed (%d)\n", addr, ret); >>         goto out_destroy_id; >>     } >> -    pr_info("enabling port %d (%pISpcs)\n", >> -        le16_to_cpu(port->disc_addr.portid), (struct sockaddr *)&addr); >> -    port->priv = cm_id; >> +    port->cm_id = cm_id; >> +    if (cm_id->device) > > can device be NULL here ? Yes, for unbound listeners. >> +    ret = nvmet_rdma_enable_port(port); >> +    if (ret) >> +        schedule_delayed_work(&port->enable_work, HZ); > > I think it's better to schedule it after 5 seconds since we use the > system_wq and since it's a recovery process. OK... ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH rfc 2/3] nvmet: Add fabrics ops to port 2018-03-22 19:02 [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Sagi Grimberg 2018-03-22 19:03 ` [PATCH rfc 1/3] nvmet-rdma: automatic listening " Sagi Grimberg @ 2018-03-22 19:03 ` Sagi Grimberg 2018-03-22 19:03 ` [PATCH rfc 3/3] nvmet: Add port transport active flag Sagi Grimberg ` (2 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: Sagi Grimberg @ 2018-03-22 19:03 UTC (permalink / raw) From: Israel Rukshin <israelr@mellanox.com> Instead of getting the nvmet_fabrics_ops from nvmet_transports array each time we want to use it, just take it directly from the port. Signed-off-by: Israel Rukshin <israelr at mellanox.com> Reviewed-by: Max Gurtovoy <maxg at mellanox.com> Signed-off-by: Sagi Grimberg <sagi at grimberg.me> --- drivers/nvme/target/core.c | 9 ++++----- drivers/nvme/target/nvmet.h | 1 + 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index a78029e4e5f4..cd97ec93c1a9 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -189,21 +189,20 @@ int nvmet_enable_port(struct nvmet_port *port) return ret; } + port->ops = ops; port->enabled = true; return 0; } void nvmet_disable_port(struct nvmet_port *port) { - struct nvmet_fabrics_ops *ops; - lockdep_assert_held(&nvmet_config_sem); port->enabled = false; - ops = nvmet_transports[port->disc_addr.trtype]; - ops->remove_port(port); - module_put(ops->owner); + port->ops->remove_port(port); + module_put(port->ops->owner); + port->ops = NULL; } static void nvmet_keep_alive_timer(struct work_struct *work) diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 40afb5d6ed91..dfba1a4ab302 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -98,6 +98,7 @@ struct nvmet_port { struct list_head referrals; void *priv; bool enabled; + struct nvmet_fabrics_ops *ops; }; static inline struct nvmet_port 
*to_nvmet_port(struct config_item *item) -- 2.14.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH rfc 3/3] nvmet: Add port transport active flag 2018-03-22 19:02 [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Sagi Grimberg 2018-03-22 19:03 ` [PATCH rfc 1/3] nvmet-rdma: automatic listening " Sagi Grimberg 2018-03-22 19:03 ` [PATCH rfc 2/3] nvmet: Add fabrics ops to port Sagi Grimberg @ 2018-03-22 19:03 ` Sagi Grimberg 2018-03-26 7:52 ` Max Gurtovoy 2018-03-22 19:03 ` [PATCH 4/3 rfc nvmetcli] nvmetcli: expose nvmet port status and state Sagi Grimberg 2018-04-11 16:17 ` [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Max Gurtovoy 4 siblings, 1 reply; 12+ messages in thread From: Sagi Grimberg @ 2018-03-22 19:03 UTC (permalink / raw) From: Israel Rukshin <israelr@mellanox.com> The trstate port flag means that the nvmet transport is active and ready to receive requests from the host. It differs from the enabled port state (a port with a subsystem symbolic link), which doesn't guarantee this. The trstate flag is necessary in case the physical ports go down while nvmet ports are enabled. In this case the port state remains enabled but the tractive flag becomes false. 
Signed-off-by: Israel Rukshin <israelr at mellanox.com> Reviewed-by: Max Gurtovoy <maxg at mellanox.com> Signed-off-by: Sagi Grimberg <sagi at grimberg.me> --- drivers/nvme/target/configfs.c | 10 ++++++++++ drivers/nvme/target/core.c | 8 ++++++++ drivers/nvme/target/nvmet.h | 2 ++ drivers/nvme/target/rdma.c | 8 ++++++++ 4 files changed, 28 insertions(+) diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c index e6b2d2af81b6..460eeb0c5801 100644 --- a/drivers/nvme/target/configfs.c +++ b/drivers/nvme/target/configfs.c @@ -268,6 +268,15 @@ static ssize_t nvmet_addr_trtype_store(struct config_item *item, CONFIGFS_ATTR(nvmet_, addr_trtype); +static ssize_t nvmet_addr_tractive_show(struct config_item *item, char *page) +{ + struct nvmet_port *port = to_nvmet_port(item); + + return sprintf(page, "%d\n", nvmet_is_port_active(port)); +} + +CONFIGFS_ATTR_RO(nvmet_, addr_tractive); + /* * Namespace structures & file operation functions below */ @@ -873,6 +882,7 @@ static struct configfs_attribute *nvmet_port_attrs[] = { &nvmet_attr_addr_traddr, &nvmet_attr_addr_trsvcid, &nvmet_attr_addr_trtype, + &nvmet_attr_addr_tractive, NULL, }; diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index cd97ec93c1a9..fa2a02c6f0c6 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -205,6 +205,14 @@ void nvmet_disable_port(struct nvmet_port *port) port->ops = NULL; } +bool nvmet_is_port_active(struct nvmet_port *port) +{ + if (port->ops && port->ops->is_port_active) + return port->ops->is_port_active(port); + + return port->enabled; +} + static void nvmet_keep_alive_timer(struct work_struct *work) { struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work), diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index dfba1a4ab302..b7ddb38a8210 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -212,6 +212,7 @@ struct nvmet_fabrics_ops { void (*delete_ctrl)(struct nvmet_ctrl *ctrl); 
void (*disc_traddr)(struct nvmet_req *req, struct nvmet_port *port, char *traddr); + bool (*is_port_active)(struct nvmet_port *port); }; #define NVMET_MAX_INLINE_BIOVEC 8 @@ -309,6 +310,7 @@ void nvmet_unregister_transport(struct nvmet_fabrics_ops *ops); int nvmet_enable_port(struct nvmet_port *port); void nvmet_disable_port(struct nvmet_port *port); +bool nvmet_is_port_active(struct nvmet_port *port); void nvmet_referral_enable(struct nvmet_port *parent, struct nvmet_port *port); void nvmet_referral_disable(struct nvmet_port *port); diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index 1b7f72925e9f..80cb7298a838 100644 --- a/drivers/nvme/target/rdma.c +++ b/drivers/nvme/target/rdma.c @@ -1486,6 +1486,13 @@ static void nvmet_rdma_disc_port_addr(struct nvmet_req *req, } } +static bool nvmet_rdma_is_port_active(struct nvmet_port *nport) +{ + struct nvmet_rdma_port *port = nport->priv; + + return port->cm_id ? true : false; +} + static struct nvmet_fabrics_ops nvmet_rdma_ops = { .owner = THIS_MODULE, .type = NVMF_TRTYPE_RDMA, @@ -1493,6 +1500,7 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = { .msdbd = 1, .has_keyed_sgls = 1, .add_port = nvmet_rdma_add_port, + .is_port_active = nvmet_rdma_is_port_active, .remove_port = nvmet_rdma_remove_port, .queue_response = nvmet_rdma_queue_response, .delete_ctrl = nvmet_rdma_delete_ctrl, -- 2.14.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH rfc 3/3] nvmet: Add port transport active flag 2018-03-22 19:03 ` [PATCH rfc 3/3] nvmet: Add port transport active flag Sagi Grimberg @ 2018-03-26 7:52 ` Max Gurtovoy 2018-04-04 13:08 ` Sagi Grimberg 0 siblings, 1 reply; 12+ messages in thread From: Max Gurtovoy @ 2018-03-26 7:52 UTC (permalink / raw) On 3/22/2018 9:03 PM, Sagi Grimberg wrote: > From: Israel Rukshin <israelr at mellanox.com> > > trstate port flag means that nvmet transport is active and ready for *tractive this is the configfs attribute name. > receiving requests from host. It differ from enabled port state (port > with subsystem symbolic link) which doesn't guarantee this. The trstate *tractive > flag is necessary in case the physical ports become down while nvmet ports are enabled. > In this case the port state remains enabled but tractive flag becomes false. > > Signed-off-by: Israel Rukshin <israelr at mellanox.com> > Reviewed-by: Max Gurtovoy <maxg at mellanox.com> > Signed-off-by: Sagi Grimberg <sagi at grimberg.me> > --- > drivers/nvme/target/configfs.c | 10 ++++++++++ > drivers/nvme/target/core.c | 8 ++++++++ > drivers/nvme/target/nvmet.h | 2 ++ > drivers/nvme/target/rdma.c | 8 ++++++++ > 4 files changed, 28 insertions(+) > > diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c > index e6b2d2af81b6..460eeb0c5801 100644 > --- a/drivers/nvme/target/configfs.c > +++ b/drivers/nvme/target/configfs.c > @@ -268,6 +268,15 @@ static ssize_t nvmet_addr_trtype_store(struct config_item *item, > > CONFIGFS_ATTR(nvmet_, addr_trtype); > > +static ssize_t nvmet_addr_tractive_show(struct config_item *item, char *page) > +{ > + struct nvmet_port *port = to_nvmet_port(item); > + > + return sprintf(page, "%d\n", nvmet_is_port_active(port)); > +} > + > +CONFIGFS_ATTR_RO(nvmet_, addr_tractive); > + addr_ prefix is needed for nvmetcli. Since this is a RO attribute, I'm not sure we need this prefix. I guess It will be saved in the config file (nvmetcli save) with no reason. 
And "nvmetcli restore" will try to echo 1 in to a RO attribute. > /* > * Namespace structures & file operation functions below > */ > @@ -873,6 +882,7 @@ static struct configfs_attribute *nvmet_port_attrs[] = { > &nvmet_attr_addr_traddr, > &nvmet_attr_addr_trsvcid, > &nvmet_attr_addr_trtype, > + &nvmet_attr_addr_tractive, > NULL, > }; > > diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c > index cd97ec93c1a9..fa2a02c6f0c6 100644 > --- a/drivers/nvme/target/core.c > +++ b/drivers/nvme/target/core.c > @@ -205,6 +205,14 @@ void nvmet_disable_port(struct nvmet_port *port) > port->ops = NULL; > } > > +bool nvmet_is_port_active(struct nvmet_port *port) > +{ > + if (port->ops && port->ops->is_port_active) > + return port->ops->is_port_active(port); > + > + return port->enabled; > +} > + > static void nvmet_keep_alive_timer(struct work_struct *work) > { > struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work), > diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h > index dfba1a4ab302..b7ddb38a8210 100644 > --- a/drivers/nvme/target/nvmet.h > +++ b/drivers/nvme/target/nvmet.h > @@ -212,6 +212,7 @@ struct nvmet_fabrics_ops { > void (*delete_ctrl)(struct nvmet_ctrl *ctrl); > void (*disc_traddr)(struct nvmet_req *req, > struct nvmet_port *port, char *traddr); > + bool (*is_port_active)(struct nvmet_port *port); > }; > > #define NVMET_MAX_INLINE_BIOVEC 8 > @@ -309,6 +310,7 @@ void nvmet_unregister_transport(struct nvmet_fabrics_ops *ops); > > int nvmet_enable_port(struct nvmet_port *port); > void nvmet_disable_port(struct nvmet_port *port); > +bool nvmet_is_port_active(struct nvmet_port *port); > > void nvmet_referral_enable(struct nvmet_port *parent, struct nvmet_port *port); > void nvmet_referral_disable(struct nvmet_port *port); > diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c > index 1b7f72925e9f..80cb7298a838 100644 > --- a/drivers/nvme/target/rdma.c > +++ b/drivers/nvme/target/rdma.c > @@ -1486,6 +1486,13 
@@ static void nvmet_rdma_disc_port_addr(struct nvmet_req *req, > } > } > > +static bool nvmet_rdma_is_port_active(struct nvmet_port *nport) > +{ > + struct nvmet_rdma_port *port = nport->priv; > + > + return port->cm_id ? true : false; > +} > + > static struct nvmet_fabrics_ops nvmet_rdma_ops = { > .owner = THIS_MODULE, > .type = NVMF_TRTYPE_RDMA, > @@ -1493,6 +1500,7 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = { > .msdbd = 1, > .has_keyed_sgls = 1, > .add_port = nvmet_rdma_add_port, > + .is_port_active = nvmet_rdma_is_port_active, > .remove_port = nvmet_rdma_remove_port, > .queue_response = nvmet_rdma_queue_response, > .delete_ctrl = nvmet_rdma_delete_ctrl, > ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH rfc 3/3] nvmet: Add port transport active flag 2018-03-26 7:52 ` Max Gurtovoy @ 2018-04-04 13:08 ` Sagi Grimberg 0 siblings, 0 replies; 12+ messages in thread From: Sagi Grimberg @ 2018-04-04 13:08 UTC (permalink / raw) > On 3/22/2018 9:03 PM, Sagi Grimberg wrote: >> From: Israel Rukshin <israelr at mellanox.com> >> >> trstate port flag means that nvmet transport is active and ready for > > *tractive > > this is the configfs attribute name. We should change it to trstate then... >> diff --git a/drivers/nvme/target/configfs.c >> b/drivers/nvme/target/configfs.c >> index e6b2d2af81b6..460eeb0c5801 100644 >> --- a/drivers/nvme/target/configfs.c >> +++ b/drivers/nvme/target/configfs.c >> @@ -268,6 +268,15 @@ static ssize_t nvmet_addr_trtype_store(struct >> config_item *item, >>  CONFIGFS_ATTR(nvmet_, addr_trtype); >> +static ssize_t nvmet_addr_tractive_show(struct config_item *item, >> char *page) >> +{ >> +    struct nvmet_port *port = to_nvmet_port(item); >> + >> +    return sprintf(page, "%d\n", nvmet_is_port_active(port)); >> +} >> + >> +CONFIGFS_ATTR_RO(nvmet_, addr_tractive); >> + > > addr_ prefix is needed for nvmetcli. Since this is a RO attribute, I'm > not sure we need this prefix. I guess It will be saved in the config > file (nvmetcli save) with no reason. And "nvmetcli restore" will try to > echo 1 in to a RO attribute. Agreed. ^ permalink raw reply [flat|nested] 12+ messages in thread
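The layering introduced by patches 2 and 3 is small but worth spelling out: the core keeps the transport ops on the port (patch 2), and the active check defers to the transport only if it implements `is_port_active`, otherwise falling back to the generic `enabled` flag. A userspace sketch of that dispatch, in Python for brevity (illustrative stand-ins, not the kernel structs):

```python
class NvmetPort:
    """Stand-in for nvmet_port."""
    def __init__(self, ops=None):
        self.ops = ops
        self.enabled = False
        self.priv = None            # RDMA keeps its nvmet_rdma_port here

def nvmet_is_port_active(port):
    """Mirror of the core helper: ask the transport if it can tell,
    otherwise fall back to the generic enabled flag."""
    if port.ops is not None and getattr(port.ops, "is_port_active", None):
        return port.ops.is_port_active(port)
    return port.enabled

class RdmaOps:
    @staticmethod
    def is_port_active(port):
        # RDMA reports active only while a listening cm_id exists
        # (port.priv here stands in for port->cm_id)
        return port.priv is not None

rdma_port = NvmetPort(ops=RdmaOps)
rdma_port.enabled = True
rdma_port.priv = None               # device removal destroyed the cm_id
print(nvmet_is_port_active(rdma_port))   # False, despite enabled == True

plain_port = NvmetPort()            # transport without the callback
plain_port.enabled = True
print(nvmet_is_port_active(plain_port))  # True: falls back to enabled
```

This is exactly the situation the commit message describes: the port state stays enabled while the transport reports inactive.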
* [PATCH 4/3 rfc nvmetcli] nvmetcli: expose nvmet port status and state 2018-03-22 19:02 [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Sagi Grimberg ` (2 preceding siblings ...) 2018-03-22 19:03 ` [PATCH rfc 3/3] nvmet: Add port transport active flag Sagi Grimberg @ 2018-03-22 19:03 ` Sagi Grimberg 2018-03-27 8:34 ` Nitzan Carmi 2018-04-11 16:17 ` [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Max Gurtovoy 4 siblings, 1 reply; 12+ messages in thread From: Sagi Grimberg @ 2018-03-22 19:03 UTC (permalink / raw) status reflects if any subsystems are bound and can accept existing connections. state reflects the physical state of the port. Signed-off-by: Sagi Grimberg <sagi at grimberg.me> --- nvmetcli | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/nvmetcli b/nvmetcli index 6b102a235450..86f7c01708cb 100755 --- a/nvmetcli +++ b/nvmetcli @@ -388,7 +388,14 @@ class UIPortNode(UINode): if trsvcid != "none": info.append("trsvcid=%s" % trsvcid) enabled = not (not self.cfnode.subsystems and not list(self.cfnode.referrals)) - return (", ".join(info), True if enabled else 0) + info.append("status=" + ("enabled" if enabled else "disabled")) + if not enabled: + ret = 0 + else: + active = not (not int(self.cfnode.get_attr("addr", "tractive"))) + info.append("state=" + ("up" if active else "down")) + ret = True if active else False + return (", ".join(info), ret) class UIPortSubsystemsNode(UINode): def __init__(self, parent): -- 2.14.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/3 rfc nvmetcli] nvmetcli: expose nvmet port status and state 2018-03-22 19:03 ` [PATCH 4/3 rfc nvmetcli] nvmetcli: expose nvmet port status and state Sagi Grimberg @ 2018-03-27 8:34 ` Nitzan Carmi 0 siblings, 0 replies; 12+ messages in thread From: Nitzan Carmi @ 2018-03-27 8:34 UTC (permalink / raw) Hi Sagi, Please see my suggestion below, for a minor improvement to your nvmetcli patch, which makes the built-in "status" UI function available for the port (like in namespaces). Besides, I think the terminology of "status"/"state" might be confusing. Don't you think it is better to stay with "trstate"/"tractive" namings? Nitzan. --- nvmet/nvme.py | 5 +++++ nvmetcli | 15 +++++++-------- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/nvmet/nvme.py b/nvmet/nvme.py index c245a42..dc28b31 100644 --- a/nvmet/nvme.py +++ b/nvmet/nvme.py @@ -625,6 +625,11 @@ class Port(CFSNode): subsystems = property(_list_subsystems, doc="Get the list of Subsystem for this Port.") + def _is_active(self): + return bool(int(self.get_attr("addr", "tractive"))) + + active = property(_is_active, doc="Physical state of the port") + def add_subsystem(self, nqn): ''' Enable access to the Subsystem identified by I{nqn} through this Port. 
diff --git a/nvmetcli b/nvmetcli index 1a7cf67..46e34b2 100755 --- a/nvmetcli +++ b/nvmetcli @@ -380,6 +380,11 @@ class UIPortNode(UINode): UIPortSubsystemsNode(self) UIReferralsNode(self) + def status(self): + status = "enabled" if self.cfnode.subsystems else "disabled" + state = "up" if self.cfnode.active else "down" + return "status=%s, state=%s" % (status, state) + def summary(self): info = [] info.append("trtype=" + self.cfnode.get_attr("addr", "trtype")) @@ -387,14 +392,8 @@ class UIPortNode(UINode): trsvcid = self.cfnode.get_attr("addr", "trsvcid") if trsvcid != "none": info.append("trsvcid=%s" % trsvcid) - enabled = not (not self.cfnode.subsystems and not list(self.cfnode.referrals)) - info.append("status=" + ("enabled" if enabled else "disabled")) - if not enabled: - ret = 0 - else: - active = not (not int(self.cfnode.get_attr("addr", "tractive"))) - info.append("state=" + ("up" if active else "down")) - ret = True if active else False + info.append(self.status()) + ret = self.cfnode.active return (", ".join(info), ret) class UIPortSubsystemsNode(UINode): -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
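The nvmetcli summary logic being debated above reduces to two booleans: whether subsystems are bound ("status") and whether the transport reports the port up ("state"). Extracted as a standalone function for clarity (the configfs plumbing is elided; the patch's `not (not int(...))` is just boolean coercion of the raw attribute string):

```python
def port_summary(has_subsystems, tractive_attr):
    """Build the 'status=..., state=...' part of the port summary line.

    tractive_attr is the raw configfs string ("0" or "1") that
    get_attr("addr", "tractive") would return.
    """
    enabled = bool(has_subsystems)
    info = ["status=" + ("enabled" if enabled else "disabled")]
    if enabled:
        # state is only meaningful once the port is enabled
        active = bool(int(tractive_attr))
        info.append("state=" + ("up" if active else "down"))
    return ", ".join(info)

print(port_summary(True, "1"))    # status=enabled, state=up
print(port_summary(True, "0"))    # status=enabled, state=down
print(port_summary(False, "0"))   # status=disabled
```

Nitzan's variant moves this into a `status()` method and an `active` property, but the underlying decision table is the same.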
* [PATCH rfc 0/3] nvmet-rdma automatic port re-activation 2018-03-22 19:02 [PATCH rfc 0/3] nvmet-rdma automatic port re-activation Sagi Grimberg ` (3 preceding siblings ...) 2018-03-22 19:03 ` [PATCH 4/3 rfc nvmetcli] nvmetcli: expose nvmet port status and state Sagi Grimberg @ 2018-04-11 16:17 ` Max Gurtovoy 4 siblings, 0 replies; 12+ messages in thread From: Max Gurtovoy @ 2018-04-11 16:17 UTC (permalink / raw) Hi Sagi, Can we add it to the v4.17 cycle with the small fixes/updates needed ? On 3/22/2018 9:02 PM, Sagi Grimberg wrote: > When a RDMA device goes away we must destroy all it's associated > RDMA resources. RDMa device resets also manifest as device removal > events and a short while after they come back. We want to re-activate > a port listener on this RDMA device when it comes back in to the system. > > In order to make it happen, we save the RDMA device node_guid on a > ULP listener representation (nvmet_rdma_port) and when a RDMA device > comes into the system, we check if there is a listener port that needs > to be re-activated. > > In addition, reflect the port state to the sysadmin. > > Israel Rukshin (2): > nvmet: Add fabrics ops to port > nvmet: Add port transport active flag > > Sagi Grimberg (1): > nvmet-rdma: automatic listening port re-activation > > drivers/nvme/target/configfs.c | 10 ++ > drivers/nvme/target/core.c | 17 ++- > drivers/nvme/target/nvmet.h | 3 + > drivers/nvme/target/rdma.c | 231 +++++++++++++++++++++++++---------------- > 4 files changed, 169 insertions(+), 92 deletions(-) > ^ permalink raw reply [flat|nested] 12+ messages in thread