* [PATCH V2 0/3] Fix a race condition when performing a controller reset @ 2021-11-15 16:31 Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec Maurizio Lombardi ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-15 16:31 UTC (permalink / raw) To: linux-nvme; +Cc: hch, sagi, chaitanya.kulkarni Memory leaks and kernel panics involving the nvmet driver have been observed when an initiator executes a reset_controller operation while doing I/O. The problem is due to a race condition between io_work and release_queue, the latter may end up destroying the commands while io_work is still running, causing use-after-free and memory leaks. V2: - Use "queue->rcv_state" to prevent the race condition, as suggested by Sagi Grimberg. - Dropped the changes to nvmet_tcp_queue_response() because they are no longer necessary. - Fix the memory leaks in a separate patch (PATCH 3/3). Maurizio Lombardi (3): nvmet-tcp: add an helper to free the iovec nvmet-tcp: fix a race condition between release_queue and io_work nvmet-tcp: fix memory leak when performing a controller reset drivers/nvme/target/tcp.c | 37 ++++++++++++++++++++++++++----------- 1 file changed, 26 insertions(+), 11 deletions(-) -- 2.27.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec 2021-11-15 16:31 [PATCH V2 0/3] Fix a race condition when performing a controller reset Maurizio Lombardi @ 2021-11-15 16:31 ` Maurizio Lombardi 2021-11-16 9:00 ` Sagi Grimberg 2021-11-15 16:31 ` [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset Maurizio Lombardi 2 siblings, 1 reply; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-15 16:31 UTC (permalink / raw) To: linux-nvme; +Cc: hch, sagi, chaitanya.kulkarni Makes the code easier to read and to debug. Sets the freed pointers to NULL, it will be useful when destroying the queues to understand if the iovecs have been released already or not. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> --- drivers/nvme/target/tcp.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 84c387e4bf43..9aa81af84efa 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -166,6 +166,8 @@ static struct workqueue_struct *nvmet_tcp_wq; static const struct nvmet_fabrics_ops nvmet_tcp_ops; static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c); static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd); +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd); +static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd); static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue, struct nvmet_tcp_cmd *cmd) @@ -297,6 +299,16 @@ static int nvmet_tcp_check_ddgst(struct nvmet_tcp_queue *queue, void *pdu) return 0; } +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd) +{ + WARN_ON(unlikely(cmd->nr_mapped > 0)); + + kfree(cmd->iov); + sgl_free(cmd->req.sg); + cmd->iov = NULL; + cmd->req.sg = NULL; +} + static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd) { struct scatterlist *sg; @@ -306,6 +318,8 @@ static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd) for (i = 0; i < cmd->nr_mapped; i++) kunmap(sg_page(&sg[i])); + + cmd->nr_mapped = 0; } static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd) @@ -387,7 +401,7 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd) return 0; err: - sgl_free(cmd->req.sg); + nvmet_tcp_free_iovec(cmd); return NVME_SC_INTERNAL; } @@ -632,10 +646,8 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch) } } - if (queue->nvme_sq.sqhd_disabled) { - kfree(cmd->iov); - sgl_free(cmd->req.sg); - } + if (queue->nvme_sq.sqhd_disabled) + nvmet_tcp_free_iovec(cmd); return 1; @@ -664,8 +676,7 @@ static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd, if (left) return -EAGAIN; - kfree(cmd->iov); - sgl_free(cmd->req.sg); + nvmet_tcp_free_iovec(cmd); cmd->queue->snd_cmd = NULL; nvmet_tcp_put_cmd(cmd); return 1; @@ -1406,8 +1417,7 @@ static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd) { nvmet_req_uninit(&cmd->req); nvmet_tcp_unmap_pdu_iovec(cmd); - kfree(cmd->iov); - sgl_free(cmd->req.sg); + nvmet_tcp_free_iovec(cmd); } static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue) -- 2.27.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec 2021-11-15 16:31 ` [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec Maurizio Lombardi @ 2021-11-16 9:00 ` Sagi Grimberg 2021-11-16 9:05 ` Maurizio Lombardi 0 siblings, 1 reply; 10+ messages in thread From: Sagi Grimberg @ 2021-11-16 9:00 UTC (permalink / raw) To: Maurizio Lombardi, linux-nvme; +Cc: hch, chaitanya.kulkarni On 11/15/21 6:31 PM, Maurizio Lombardi wrote: > Makes the code easier to read and to debug. > > Sets the freed pointers to NULL, it will be useful > when destroying the queues to understand if the > iovecs have been released already or not. > > Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> > --- > drivers/nvme/target/tcp.c | 28 +++++++++++++++++++--------- > 1 file changed, 19 insertions(+), 9 deletions(-) > > diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c > index 84c387e4bf43..9aa81af84efa 100644 > --- a/drivers/nvme/target/tcp.c > +++ b/drivers/nvme/target/tcp.c > @@ -166,6 +166,8 @@ static struct workqueue_struct *nvmet_tcp_wq; > static const struct nvmet_fabrics_ops nvmet_tcp_ops; > static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c); > static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd); > +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd); > +static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd); > > static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue, > struct nvmet_tcp_cmd *cmd) > @@ -297,6 +299,16 @@ static int nvmet_tcp_check_ddgst(struct nvmet_tcp_queue *queue, void *pdu) > return 0; > } > > +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd) I'd call it nvmet_tcp_free_cmd_buffers maybe. It is freeing both the data buffers, sgl vector and iovec. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec 2021-11-16 9:00 ` Sagi Grimberg @ 2021-11-16 9:05 ` Maurizio Lombardi 0 siblings, 0 replies; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-16 9:05 UTC (permalink / raw) To: Sagi Grimberg; +Cc: linux-nvme, hch, chaitanya.kulkarni On Tue, Nov 16, 2021 at 11:00:56AM +0200, Sagi Grimberg wrote: > > > On 11/15/21 6:31 PM, Maurizio Lombardi wrote: > > Makes the code easier to read and to debug. > > > > Sets the freed pointers to NULL, it will be useful > > when destroying the queues to understand if the > > iovecs have been released already or not. > > > > Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> > > --- > > drivers/nvme/target/tcp.c | 28 +++++++++++++++++++--------- > > 1 file changed, 19 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c > > index 84c387e4bf43..9aa81af84efa 100644 > > --- a/drivers/nvme/target/tcp.c > > +++ b/drivers/nvme/target/tcp.c > > @@ -166,6 +166,8 @@ static struct workqueue_struct *nvmet_tcp_wq; > > static const struct nvmet_fabrics_ops nvmet_tcp_ops; > > static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c); > > static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd); > > +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd); > > +static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd); > > static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue, > > struct nvmet_tcp_cmd *cmd) > > @@ -297,6 +299,16 @@ static int nvmet_tcp_check_ddgst(struct nvmet_tcp_queue *queue, void *pdu) > > return 0; > > } > > +static void nvmet_tcp_free_iovec(struct nvmet_tcp_cmd *cmd) > > I'd call it nvmet_tcp_free_cmd_buffers maybe. It is freeing > both the data buffers, sgl vector and iovec. Ok no problem for me, I will resubmit it. Thanks, Maurizio ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work 2021-11-15 16:31 [PATCH V2 0/3] Fix a race condition when performing a controller reset Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec Maurizio Lombardi @ 2021-11-15 16:31 ` Maurizio Lombardi 2021-11-16 8:57 ` Sagi Grimberg 2021-11-15 16:31 ` [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset Maurizio Lombardi 2 siblings, 1 reply; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-15 16:31 UTC (permalink / raw) To: linux-nvme; +Cc: hch, sagi, chaitanya.kulkarni If the initiator executes a reset controller operation while performing I/O, the target kernel will crash because of a race condition between release_queue and io_work; nvmet_tcp_uninit_data_in_cmds() may be executed while io_work is running, calling flush_work() was not sufficient to prevent this because io_work could requeue itself. Fix this bug by using cancel_work_sync() to prevent io_work from requeuing itself and set rcv_state to NVMET_TCP_RECV_ERR to make sure we don't receive any more data from the socket. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> --- drivers/nvme/target/tcp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 9aa81af84efa..028b47ff0f0f 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1447,7 +1447,9 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w) mutex_unlock(&nvmet_tcp_queue_mutex); nvmet_tcp_restore_socket_callbacks(queue); - flush_work(&queue->io_work); + cancel_work_sync(&queue->io_work); + /* stop accepting incoming data */ + queue->rcv_state = NVMET_TCP_RECV_ERR; nvmet_tcp_uninit_data_in_cmds(queue); nvmet_sq_destroy(&queue->nvme_sq); -- 2.27.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work 2021-11-15 16:31 ` [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work Maurizio Lombardi @ 2021-11-16 8:57 ` Sagi Grimberg 0 siblings, 0 replies; 10+ messages in thread From: Sagi Grimberg @ 2021-11-16 8:57 UTC (permalink / raw) To: Maurizio Lombardi, linux-nvme; +Cc: hch, chaitanya.kulkarni This should be patch 1 of the series. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset 2021-11-15 16:31 [PATCH V2 0/3] Fix a race condition when performing a controller reset Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work Maurizio Lombardi @ 2021-11-15 16:31 ` Maurizio Lombardi 2021-11-16 9:07 ` Sagi Grimberg 2 siblings, 1 reply; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-15 16:31 UTC (permalink / raw) To: linux-nvme; +Cc: hch, sagi, chaitanya.kulkarni If a reset controller is executed while the initiator is performing some I/O the driver may leak the memory allocated for the commands' iovec. Make sure that nvmet_tcp_uninit_data_in_cmds() releases all the memory. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> --- drivers/nvme/target/tcp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 028b47ff0f0f..5a238fa067cd 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1427,7 +1427,10 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue) for (i = 0; i < queue->nr_cmds; i++, cmd++) { if (nvmet_tcp_need_data_in(cmd)) - nvmet_tcp_finish_cmd(cmd); + nvmet_req_uninit(&cmd->req); + + nvmet_tcp_unmap_pdu_iovec(cmd); + nvmet_tcp_free_iovec(cmd); } if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) { -- 2.27.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset 2021-11-15 16:31 ` [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset Maurizio Lombardi @ 2021-11-16 9:07 ` Sagi Grimberg 2021-11-16 9:22 ` Maurizio Lombardi 0 siblings, 1 reply; 10+ messages in thread From: Sagi Grimberg @ 2021-11-16 9:07 UTC (permalink / raw) To: Maurizio Lombardi, linux-nvme; +Cc: hch, chaitanya.kulkarni > If a reset controller is executed while the initiator > is performing some I/O the driver may leak the memory allocated > for the commands' iovec. > > Make sure that nvmet_tcp_uninit_data_in_cmds() releases > all the memory. > > Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> > --- > drivers/nvme/target/tcp.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c > index 028b47ff0f0f..5a238fa067cd 100644 > --- a/drivers/nvme/target/tcp.c > +++ b/drivers/nvme/target/tcp.c > @@ -1427,7 +1427,10 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue) > > for (i = 0; i < queue->nr_cmds; i++, cmd++) { > if (nvmet_tcp_need_data_in(cmd)) > - nvmet_tcp_finish_cmd(cmd); > + nvmet_req_uninit(&cmd->req); > + > + nvmet_tcp_unmap_pdu_iovec(cmd); > + nvmet_tcp_free_iovec(cmd); I understand how unmap_pdu_iovec is re-entrance safe, but sgl_free called from free_iovec is not. What prevents it from being called for commands that have already freed their sgls? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset 2021-11-16 9:07 ` Sagi Grimberg @ 2021-11-16 9:22 ` Maurizio Lombardi 2021-11-16 10:11 ` Sagi Grimberg 0 siblings, 1 reply; 10+ messages in thread From: Maurizio Lombardi @ 2021-11-16 9:22 UTC (permalink / raw) To: Sagi Grimberg; +Cc: linux-nvme, hch, chaitanya.kulkarni On Tue, Nov 16, 2021 at 11:07:29AM +0200, Sagi Grimberg wrote: > > > If a reset controller is executed while the initiator > > is performing some I/O the driver may leak the memory allocated > > for the commands' iovec. > > > > Make sure that nvmet_tcp_uninit_data_in_cmds() releases > > all the memory. > > > > Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> > > --- > > drivers/nvme/target/tcp.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c > > index 028b47ff0f0f..5a238fa067cd 100644 > > --- a/drivers/nvme/target/tcp.c > > +++ b/drivers/nvme/target/tcp.c > > @@ -1427,7 +1427,10 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue) > > for (i = 0; i < queue->nr_cmds; i++, cmd++) { > > if (nvmet_tcp_need_data_in(cmd)) > > - nvmet_tcp_finish_cmd(cmd); > > + nvmet_req_uninit(&cmd->req); > > + > > + nvmet_tcp_unmap_pdu_iovec(cmd); > > + nvmet_tcp_free_iovec(cmd); > > I understand how unmap_pdu_iovec is re-entrance safe, but sgl_free > called from free_iovec is not. What prevents it from being called > for commands that have already freed their sgls? The commands that have already freed their sgls will have the pointers set to NULL, calling sgl_free() against a NULL pointer is safe (it does nothing, like kfree()) but if you prefer I will add a check. Maurizio ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset 2021-11-16 9:22 ` Maurizio Lombardi @ 2021-11-16 10:11 ` Sagi Grimberg 0 siblings, 0 replies; 10+ messages in thread From: Sagi Grimberg @ 2021-11-16 10:11 UTC (permalink / raw) To: Maurizio Lombardi; +Cc: linux-nvme, hch, chaitanya.kulkarni >>> If a reset controller is executed while the initiator >>> is performing some I/O the driver may leak the memory allocated >>> for the commands' iovec. >>> >>> Make sure that nvmet_tcp_uninit_data_in_cmds() releases >>> all the memory. >>> >>> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> >>> --- >>> drivers/nvme/target/tcp.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c >>> index 028b47ff0f0f..5a238fa067cd 100644 >>> --- a/drivers/nvme/target/tcp.c >>> +++ b/drivers/nvme/target/tcp.c >>> @@ -1427,7 +1427,10 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue) >>> for (i = 0; i < queue->nr_cmds; i++, cmd++) { >>> if (nvmet_tcp_need_data_in(cmd)) >>> - nvmet_tcp_finish_cmd(cmd); >>> + nvmet_req_uninit(&cmd->req); >>> + >>> + nvmet_tcp_unmap_pdu_iovec(cmd); >>> + nvmet_tcp_free_iovec(cmd); >> >> I understand how unmap_pdu_iovec is re-entrance safe, but sgl_free >> called from free_iovec is not. What prevents it from being called >> for commands that have already freed their sgls? > > The commands that have already freed their sgls > will have the pointers set to NULL, > calling sgl_free() against a NULL pointer is safe (it does nothing, like kfree()) > but if you prefer I will add a check. Oh, right. So, Reviewed-by: Sagi Grimberg <sagi@grimberg.me> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-11-16 10:12 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-11-15 16:31 [PATCH V2 0/3] Fix a race condition when performing a controller reset Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 1/3] nvmet-tcp: add an helper to free the iovec Maurizio Lombardi 2021-11-16 9:00 ` Sagi Grimberg 2021-11-16 9:05 ` Maurizio Lombardi 2021-11-15 16:31 ` [PATCH V2 2/3] nvmet-tcp: fix a race condition between release_queue and io_work Maurizio Lombardi 2021-11-16 8:57 ` Sagi Grimberg 2021-11-15 16:31 ` [PATCH V2 3/3] nvmet-tcp: fix memory leak when performing a controller reset Maurizio Lombardi 2021-11-16 9:07 ` Sagi Grimberg 2021-11-16 9:22 ` Maurizio Lombardi 2021-11-16 10:11 ` Sagi Grimberg
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.