All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
@ 2022-09-02 14:23 Martin Belanger
  2022-09-05  7:40 ` Sagi Grimberg
  2022-09-06 20:12 ` Chaitanya Kulkarni
  0 siblings, 2 replies; 6+ messages in thread
From: Martin Belanger @ 2022-09-02 14:23 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, hare, axboe, hch, sagi, Martin Belanger

From: Martin Belanger <martin.belanger@dell.com>

TCP transport relies on the routing table to determine which source
address and interface to use when making a connection. Currently,
there is no way to tell from userspace where a connection was made.
This patch exposes the actual source address using a new field named
"src_addr=" in the "address" attribute.

This is needed to diagnose and identify connectivity issues. With
the source address we can infer the interface associated with each
connection.

This was tested with nvme-cli 2.0 to verify it does not have any
adverse effect. The new "src_addr=" field will simply be displayed in
the output of the "list-subsys" or "list -v" commands as shown here.

$ nvme list-subsys
nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
\
 +- nvme0 tcp traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live

Changes to v2:
Taken Sagi's comments into account

Changes to original submission:
Reword comments per Chaitanya's suggestion

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
---
 drivers/nvme/host/tcp.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 044da18c06f5..85b767a01402 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2532,6 +2532,25 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	return queue->nr_cqe;
 }
 
+static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
+{
+	int ret, len;
+	struct sockaddr_storage src_addr;
+	struct nvme_tcp_queue *queue = &to_tcp_ctrl(ctrl)->queues[0];
+
+	len = nvmf_get_address(ctrl, buf, size);
+
+	/* Retrieve the actual source address from the socket */
+	ret = kernel_getsockname(queue->sock, (struct sockaddr *)&src_addr);
+	if (ret > 0) {
+		len--; /* strip trailing newline */
+		len += scnprintf(buf + len, size - len, "%ssrc_addr=%pISc\n",
+				(len) ? "," : "", &src_addr);
+	}
+
+	return len;
+}
+
 static const struct blk_mq_ops nvme_tcp_mq_ops = {
 	.queue_rq	= nvme_tcp_queue_rq,
 	.commit_rqs	= nvme_tcp_commit_rqs,
@@ -2563,7 +2582,7 @@ static const struct nvme_ctrl_ops nvme_tcp_ctrl_ops = {
 	.free_ctrl		= nvme_tcp_free_ctrl,
 	.submit_async_event	= nvme_tcp_submit_async_event,
 	.delete_ctrl		= nvme_tcp_delete_ctrl,
-	.get_address		= nvmf_get_address,
+	.get_address		= nvme_tcp_get_address,
 	.stop_ctrl		= nvme_tcp_stop_ctrl,
 };
 
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
  2022-09-02 14:23 [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr Martin Belanger
@ 2022-09-05  7:40 ` Sagi Grimberg
  2022-09-06 12:42   ` Belanger, Martin
  2022-09-06 20:12 ` Chaitanya Kulkarni
  1 sibling, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2022-09-05  7:40 UTC (permalink / raw)
  To: Martin Belanger, linux-nvme; +Cc: kbusch, hare, axboe, hch, Martin Belanger


> From: Martin Belanger <martin.belanger@dell.com>
> 
> TCP transport relies on the routing table to determine which source
> address and interface to use when making a connection. Currently,
> there is no way to tell from userspace where a connection was made.
> This patch exposes the actual source address using a new field named
> "src_addr=" in the "address" attribute.
> 
> This is needed to diagnose and identify connectivity issues. With
> the source address we can infer the interface associated with each
> connection.
> 
> This was tested with nvme-cli 2.0 to verify it does not have any
> adverse effect. The new "src_addr=" field will simply be displayed in
> the output of the "list-subsys" or "list -v" commands as shown here.
> 
> $ nvme list-subsys
> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
> \
>   +- nvme0 tcp traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live

I wonder how this works with udev events etc for auto-connect? and also
for uniqueness checks...


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
  2022-09-05  7:40 ` Sagi Grimberg
@ 2022-09-06 12:42   ` Belanger, Martin
  2022-09-06 13:08     ` Sagi Grimberg
  0 siblings, 1 reply; 6+ messages in thread
From: Belanger, Martin @ 2022-09-06 12:42 UTC (permalink / raw)
  To: Sagi Grimberg, Martin Belanger, linux-nvme; +Cc: kbusch, hare, axboe, hch

> > From: Martin Belanger <martin.belanger@dell.com>
> >
> > TCP transport relies on the routing table to determine which source
> > address and interface to use when making a connection. Currently,
> > there is no way to tell from userspace where a connection was made.
> > This patch exposes the actual source address using a new field named
> > "src_addr=" in the "address" attribute.
> >
> > This is needed to diagnose and identify connectivity issues. With the
> > source address we can infer the interface associated with each
> > connection.
> >
> > This was tested with nvme-cli 2.0 to verify it does not have any
> > adverse effect. The new "src_addr=" field will simply be displayed in
> > the output of the "list-subsys" or "list -v" commands as shown here.
> >
> > $ nvme list-subsys
> > nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
> > \
> >   +- nvme0 tcp
> > traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live
> 
> I wonder how this works with udev events etc for auto-connect? and also for
> uniqueness checks...

Udev events are triggered by device properties (which are not the same as attributes).
The properties are found in the attribute "uevent". For example:

$ cat /sys/class/nvme/nvme1/uevent
MAJOR=238
MINOR=1
DEVNAME=nvme1
NVME_TRTYPE=tcp
NVME_TRADDR=::ffff:192.168.56.1
NVME_TRSVCID=8009
NVME_HOST_TRADDR=none
NVME_HOST_IFACE=enp0s8

The change I'm proposing does not affect device properties and therefore no impact on udev rules.

With regards to uniqueness checks, there is only one place in libnvme where we read and check the "address" attribute (file: tree.c, function: nvme_ctrl_alloc). The code will simply skip "src_address=" and extract all the other known keys from "address".

Martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
  2022-09-06 12:42   ` Belanger, Martin
@ 2022-09-06 13:08     ` Sagi Grimberg
  2022-09-06 15:32       ` Belanger, Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2022-09-06 13:08 UTC (permalink / raw)
  To: Belanger, Martin, Martin Belanger, linux-nvme; +Cc: kbusch, hare, axboe, hch


>>> TCP transport relies on the routing table to determine which source
>>> address and interface to use when making a connection. Currently,
>>> there is no way to tell from userspace where a connection was made.
>>> This patch exposes the actual source address using a new field named
>>> "src_addr=" in the "address" attribute.
>>>
>>> This is needed to diagnose and identify connectivity issues. With the
>>> source address we can infer the interface associated with each
>>> connection.
>>>
>>> This was tested with nvme-cli 2.0 to verify it does not have any
>>> adverse effect. The new "src_addr=" field will simply be displayed in
>>> the output of the "list-subsys" or "list -v" commands as shown here.
>>>
>>> $ nvme list-subsys
>>> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
>>> \
>>>    +- nvme0 tcp
>>> traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live
>>
>> I wonder how this works with udev events etc for auto-connect? and also for
>> uniqueness checks...
> 
> Udev events are triggered by device properties (which are not the same as attributes).
> The properties are found in the attribute "uevent". For example:
> 
> $ cat /sys/class/nvme/nvme1/uevent
> MAJOR=238
> MINOR=1
> DEVNAME=nvme1
> NVME_TRTYPE=tcp
> NVME_TRADDR=::ffff:192.168.56.1
> NVME_TRSVCID=8009
> NVME_HOST_TRADDR=none
> NVME_HOST_IFACE=enp0s8
> 
> The change I'm proposing does not affect device properties and therefore no impact on udev rules.
> 
> With regards to uniqueness checks, there is only one place in libnvme where we read and check the "address" attribute (file: tree.c, function: nvme_ctrl_alloc). The code will simply skip "src_address=" and extract all the other known keys from "address".

I'm concerned that adding this unconditionally may trigger someone to
fail due to this addition... It is also confusing when host_traddr is
provided...

If this would be logged as pr_dbg instead, would this achieve the goal?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
  2022-09-06 13:08     ` Sagi Grimberg
@ 2022-09-06 15:32       ` Belanger, Martin
  0 siblings, 0 replies; 6+ messages in thread
From: Belanger, Martin @ 2022-09-06 15:32 UTC (permalink / raw)
  To: Sagi Grimberg, Martin Belanger, linux-nvme; +Cc: kbusch, hare, axboe, hch

> >>> TCP transport relies on the routing table to determine which source
> >>> address and interface to use when making a connection. Currently,
> >>> there is no way to tell from userspace where a connection was made.
> >>> This patch exposes the actual source address using a new field named
> >>> "src_addr=" in the "address" attribute.
> >>>
> >>> This is needed to diagnose and identify connectivity issues. With
> >>> the source address we can infer the interface associated with each
> >>> connection.
> >>>
> >>> This was tested with nvme-cli 2.0 to verify it does not have any
> >>> adverse effect. The new "src_addr=" field will simply be displayed
> >>> in the output of the "list-subsys" or "list -v" commands as shown here.
> >>>
> >>> $ nvme list-subsys
> >>> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
> >>> \
> >>>    +- nvme0 tcp
> >>> traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live
> >>
> >> I wonder how this works with udev events etc for auto-connect? and
> >> also for uniqueness checks...
> >
> > Udev events are triggered by device properties (which are not the same as
> attributes).
> > The properties are found in the attribute "uevent". For example:
> >
> > $ cat /sys/class/nvme/nvme1/uevent
> > MAJOR=238
> > MINOR=1
> > DEVNAME=nvme1
> > NVME_TRTYPE=tcp
> > NVME_TRADDR=::ffff:192.168.56.1
> > NVME_TRSVCID=8009
> > NVME_HOST_TRADDR=none
> > NVME_HOST_IFACE=enp0s8
> >
> > The change I'm proposing does not affect device properties and therefore
> no impact on udev rules.
> >
> > With regards to uniqueness checks, there is only one place in libnvme where
> we read and check the "address" attribute (file: tree.c, function:
> nvme_ctrl_alloc). The code will simply skip "src_address=" and extract all the
> other known keys from "address".
> 
> I'm concerned that adding this unconditionally may trigger someone to fail
> due to this addition... It is also confusing when host_traddr is provided...

That would happen if people are testing the "address" attribute as a "string".
In that case, I would say it's a bug on their part because there is no guarantee
that all the key=value pairs in the "address" string will always be in the same
order. By the way, I added a new key (host_iface) to the "address" attribute 
over a year ago (May 2021) and have not seen any negative effect. Yes, it can 
be confusing when host_traddr is provided, but in that case src_addr and 
host_traddr will be equal. In fact, that's the reason why Hannes suggested 
only using the host_addr and not introducing a new key (src_addr).

> 
> If this would be logged as pr_dbg instead, would this achieve the goal?

Not really. Yes a log would help in the case someone is debugging a connectivity
issue in real time while looking at the logs, but what if the system has been up 
for days? I know that the syslog can keep several days (weeks) worth of info
but that's not the same as having an attribute you can look at directly.

Also, an attribute is needed for the nvme-stas application I'm working on.
An attribute allows nvme-stas to run a complete audit of all existing connections.
Not just the configuration parameters, but where each connections is actually 
made.

Just this morning I was debugging an issue where an I/O controller connection
was made on the management interface (default route) instead of going over
the interface dedicated for subsystem connections. This happened because the
user forgot to add explicit routes for the subsystems. By having an attribute 
to expose where a connection is made we can then provide tools to help user
diagnose connectivity issues by themselves instead of submitting tickets 
requiring the developer's help. That's the main goal for this change.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
  2022-09-02 14:23 [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr Martin Belanger
  2022-09-05  7:40 ` Sagi Grimberg
@ 2022-09-06 20:12 ` Chaitanya Kulkarni
  1 sibling, 0 replies; 6+ messages in thread
From: Chaitanya Kulkarni @ 2022-09-06 20:12 UTC (permalink / raw)
  To: Martin Belanger, linux-nvme
  Cc: kbusch, hare, axboe, hch, sagi, Martin Belanger


>   
> +static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
> +{
> +	int ret, len;
> +	struct sockaddr_storage src_addr;
> +	struct nvme_tcp_queue *queue = &to_tcp_ctrl(ctrl)->queues[0];
> +

nit:- consider this :

	struct nvme_tcp_queue *queue = &to_tcp_ctrl(ctrl)->queues[0];
	struct sockaddr_storage src_addr;
	int ret, len;

> +	len = nvmf_get_address(ctrl, buf, size);
> +

do we have add check above if len == 0 case ?
I believe you have already checked it, if not it is worth a look.

> +	/* Retrieve the actual source address from the socket */

apart from the ongoing discussion, above comment is redundant
since kernel_getsockname() call is pretty clear.

> +	ret = kernel_getsockname(queue->sock, (struct sockaddr *)&src_addr);
> +	if (ret > 0) {
> +		len--; /* strip trailing newline */
> +		len += scnprintf(buf + len, size - len, "%ssrc_addr=%pISc\n",
> +				(len) ? "," : "", &src_addr);
> +	}

-ck


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-09-06 20:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-02 14:23 [PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr Martin Belanger
2022-09-05  7:40 ` Sagi Grimberg
2022-09-06 12:42   ` Belanger, Martin
2022-09-06 13:08     ` Sagi Grimberg
2022-09-06 15:32       ` Belanger, Martin
2022-09-06 20:12 ` Chaitanya Kulkarni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.