netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] tls: add zerocopy device sendpage
       [not found] ` <20200712.153233.370000904740228888.davem@davemloft.net>
@ 2020-07-13  7:49   ` Boris Pismenny
  2020-07-13 19:05     ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Pismenny @ 2020-07-13  7:49 UTC (permalink / raw)
  To: David Miller; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

On 13/07/2020 1:32, David Miller wrote:
> From: Boris Pismenny <borisp@mellanox.com>
> Date: Sun, 12 Jul 2020 13:44:09 +0300
>
>> This patch adds two configuration knobs to control TLS zerocopy sendfile:
>> 1) socket option named TLS_TX_ZEROCOPY_SENDFILE that enables
>> applications to use zerocopy sendfile on a per-socket basis.
>> 2) global sysctl named tls_zerocopy_sendfile that defines the default
>> for the entire system.
> We already have too many knobs, don't add this until we find that
> it is necessary and that the kernel can't do the optimal thing
> on it's own.

An alternative approach that requires no knobs is to change the
current TLS_DEVICE sendfile flow to skip the copy. It is really
not necessary to copy the data, as the guarantees it provides to
users, namely that users can modify page cache data sent via sendfile
with no error, justifies the performance overhead.
Users that sendfile data from the pagecache while modifying
it cannot reasonably expect data on the other side to be
consistent. TCP sendfile guarantees nothing except that
the TCP checksum is correct. TLS sendfile with copy guarantees
the same, but TLS sendfile zerocopy (with offload) will send
the modified data, and this can trigger an authentication error
on the TLS layer when inconsistent data is received. If the data
is inconsistent, then letting the user know via an error is desirable,
right?

If there are no objections, I'd gladly resubmit it with the approach
described above.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-13  7:49   ` [PATCH] tls: add zerocopy device sendpage Boris Pismenny
@ 2020-07-13 19:05     ` David Miller
  2020-07-13 22:15       ` Boris Pismenny
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2020-07-13 19:05 UTC (permalink / raw)
  To: borisp; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

From: Boris Pismenny <borisp@mellanox.com>
Date: Mon, 13 Jul 2020 10:49:49 +0300

> An alternative approach that requires no knobs is to change the
> current TLS_DEVICE sendfile flow to skip the copy. It is really
> not necessary to copy the data, as the guarantees it provides to
> users, namely that users can modify page cache data sent via sendfile
> with no error, justifies the performance overhead.
> Users that sendfile data from the pagecache while modifying
> it cannot reasonably expect data on the other side to be
> consistent. TCP sendfile guarantees nothing except that
> the TCP checksum is correct. TLS sendfile with copy guarantees
> the same, but TLS sendfile zerocopy (with offload) will send
> the modified data, and this can trigger an authentication error
> on the TLS layer when inconsistent data is received. If the data
> is inconsistent, then letting the user know via an error is desirable,
> right?
> 
> If there are no objections, I'd gladly resubmit it with the approach
> described above.

The TLS signatures are supposed to be even stronger than the protocol
checksum, and therefore we should send out valid ones rather than
incorrect ones.

Why can't the device generate the correct TLS signature when
offloading?  Just like for the protocol checksum, the device should
load the payload into the device over DMA and make it's calculations
on that copy.

For SW kTLS, we must copy.  Potentially sending out garbage signatures
in a packet cannot be an "option".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-13 19:05     ` David Miller
@ 2020-07-13 22:15       ` Boris Pismenny
  2020-07-13 22:59         ` Jakub Kicinski
  2020-07-14  1:02         ` David Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Boris Pismenny @ 2020-07-13 22:15 UTC (permalink / raw)
  To: David Miller; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

On 13/07/2020 22:05, David Miller wrote:
> From: Boris Pismenny <borisp@mellanox.com>
> Date: Mon, 13 Jul 2020 10:49:49 +0300
>
>> An alternative approach that requires no knobs is to change the
>> current TLS_DEVICE sendfile flow to skip the copy. It is really
>> not necessary to copy the data, as the guarantees it provides to
>> users, namely that users can modify page cache data sent via sendfile
>> with no error, justifies the performance overhead.
>> Users that sendfile data from the pagecache while modifying
>> it cannot reasonably expect data on the other side to be
>> consistent. TCP sendfile guarantees nothing except that
>> the TCP checksum is correct. TLS sendfile with copy guarantees
>> the same, but TLS sendfile zerocopy (with offload) will send
>> the modified data, and this can trigger an authentication error
>> on the TLS layer when inconsistent data is received. If the data
>> is inconsistent, then letting the user know via an error is desirable,
>> right?
>>
>> If there are no objections, I'd gladly resubmit it with the approach
>> described above.
> The TLS signatures are supposed to be even stronger than the protocol
> checksum, and therefore we should send out valid ones rather than
> incorrect ones.

Right, but one is on packet payload, while the other is part of the payload.

>
> Why can't the device generate the correct TLS signature when
> offloading?  Just like for the protocol checksum, the device should
> load the payload into the device over DMA and make it's calculations
> on that copy.

Right. The problematic case is when some part of the record is already
received by the other party, and then some (modified) data including
the TLS authentication tag is re-transmitted.
The modified tag is calculated over the new data, while the other party
will use the already received old data, resulting in authentication error.

>
> For SW kTLS, we must copy.  Potentially sending out garbage signatures
> in a packet cannot be an "option".
Obviously, SW kTLS must encrypt the data into a different kernel buffer,
which is the same as copying for that matter. TLS_DEVICE doesn't require this.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-13 22:15       ` Boris Pismenny
@ 2020-07-13 22:59         ` Jakub Kicinski
  2020-07-14  7:31           ` Boris Pismenny
  2020-07-14  1:02         ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2020-07-13 22:59 UTC (permalink / raw)
  To: Boris Pismenny; +Cc: David Miller, john.fastabend, daniel, tariqt, netdev

On Tue, 14 Jul 2020 01:15:26 +0300 Boris Pismenny wrote:
> On 13/07/2020 22:05, David Miller wrote:
> > The TLS signatures are supposed to be even stronger than the protocol
> > checksum, and therefore we should send out valid ones rather than
> > incorrect ones.  
> 
> Right, but one is on packet payload, while the other is part of the payload.
> 
> > Why can't the device generate the correct TLS signature when
> > offloading?  Just like for the protocol checksum, the device should
> > load the payload into the device over DMA and make it's calculations
> > on that copy.  
> 
> Right. The problematic case is when some part of the record is already
> received by the other party, and then some (modified) data including
> the TLS authentication tag is re-transmitted.
> The modified tag is calculated over the new data, while the other party
> will use the already received old data, resulting in authentication error.
> 
> > For SW kTLS, we must copy.  Potentially sending out garbage signatures
> > in a packet cannot be an "option".  
> 
> Obviously, SW kTLS must encrypt the data into a different kernel buffer,
> which is the same as copying for that matter. TLS_DEVICE doesn't require this.

This proposal is one big attrition of requirements, which I personally
dislike quite a bit. Nothing material has changed since the first
version of the code was upstreamed, let's ask ourselves - why was the
knob not part of the initial submission?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-13 22:15       ` Boris Pismenny
  2020-07-13 22:59         ` Jakub Kicinski
@ 2020-07-14  1:02         ` David Miller
  2020-07-14  7:27           ` Boris Pismenny
  1 sibling, 1 reply; 12+ messages in thread
From: David Miller @ 2020-07-14  1:02 UTC (permalink / raw)
  To: borisp; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

From: Boris Pismenny <borisp@mellanox.com>
Date: Tue, 14 Jul 2020 01:15:26 +0300

> On 13/07/2020 22:05, David Miller wrote:
>> From: Boris Pismenny <borisp@mellanox.com>
>> Date: Mon, 13 Jul 2020 10:49:49 +0300
>>
>> Why can't the device generate the correct TLS signature when
>> offloading?  Just like for the protocol checksum, the device should
>> load the payload into the device over DMA and make it's calculations
>> on that copy.
> 
> Right. The problematic case is when some part of the record is already
> received by the other party, and then some (modified) data including
> the TLS authentication tag is re-transmitted.

Then we must copy to avoid this.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-14  1:02         ` David Miller
@ 2020-07-14  7:27           ` Boris Pismenny
  2020-07-14 20:42             ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Pismenny @ 2020-07-14  7:27 UTC (permalink / raw)
  To: David Miller; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

On 14/07/2020 4:02, David Miller wrote:
> From: Boris Pismenny <borisp@mellanox.com>
> Date: Tue, 14 Jul 2020 01:15:26 +0300
>
>> On 13/07/2020 22:05, David Miller wrote:
>>> From: Boris Pismenny <borisp@mellanox.com>
>>> Date: Mon, 13 Jul 2020 10:49:49 +0300
>>>
>>> Why can't the device generate the correct TLS signature when
>>> offloading?  Just like for the protocol checksum, the device should
>>> load the payload into the device over DMA and make it's calculations
>>> on that copy.
>> Right. The problematic case is when some part of the record is already
>> received by the other party, and then some (modified) data including
>> the TLS authentication tag is re-transmitted.
> Then we must copy to avoid this.

Why is it the kernel's role to protect against such an error?
Surely the user that modifies pagecache data while sending it over
sendfile (with TCP) will suffer from consistency bugs that are even worse.

The copy in the TLS_DEVICE sendfile path greatly reduces the value of
all of this work. If we want to get the maximum out of this, then the
copy has to go.

If we can't make this the default (as it is in FreeBSD), and we can't
add a knob to enable this. Then, what should we do here?




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-13 22:59         ` Jakub Kicinski
@ 2020-07-14  7:31           ` Boris Pismenny
  2020-07-14 16:23             ` Jakub Kicinski
  2020-07-14 20:38             ` David Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Boris Pismenny @ 2020-07-14  7:31 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Miller, john.fastabend, daniel, tariqt, netdev

On 14/07/2020 1:59, Jakub Kicinski wrote:
> On Tue, 14 Jul 2020 01:15:26 +0300 Boris Pismenny wrote:
>> On 13/07/2020 22:05, David Miller wrote:
>>> The TLS signatures are supposed to be even stronger than the protocol
>>> checksum, and therefore we should send out valid ones rather than
>>> incorrect ones.  
>> Right, but one is on packet payload, while the other is part of the payload.
>>
>>> Why can't the device generate the correct TLS signature when
>>> offloading?  Just like for the protocol checksum, the device should
>>> load the payload into the device over DMA and make it's calculations
>>> on that copy.  
>> Right. The problematic case is when some part of the record is already
>> received by the other party, and then some (modified) data including
>> the TLS authentication tag is re-transmitted.
>> The modified tag is calculated over the new data, while the other party
>> will use the already received old data, resulting in authentication error.
>>
>>> For SW kTLS, we must copy.  Potentially sending out garbage signatures
>>> in a packet cannot be an "option".  
>> Obviously, SW kTLS must encrypt the data into a different kernel buffer,
>> which is the same as copying for that matter. TLS_DEVICE doesn't require this.
> This proposal is one big attrition of requirements, which I personally
> dislike quite a bit. Nothing material has changed since the first
> version of the code was upstreamed, let's ask ourselves - why was the
> knob not part of the initial submission?

I'm really not convinced that the copy requirement is needed.

At the time, Dave objected when we presented this on the netdev conference,
and we didn't want to delay the entire series just to argue this point. It's
all a matter of timing and priorities. Now we have an ASIC that uses this API,
and I'd like to show the best possible outcome, and not the best possible given
an arbitrary limitation that avoids an error where the user does something
erroneous.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-14  7:31           ` Boris Pismenny
@ 2020-07-14 16:23             ` Jakub Kicinski
  2020-07-14 20:38             ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2020-07-14 16:23 UTC (permalink / raw)
  To: Boris Pismenny; +Cc: David Miller, john.fastabend, daniel, tariqt, netdev

On Tue, 14 Jul 2020 10:31:25 +0300 Boris Pismenny wrote:
> Now we have an ASIC that uses this API, and I'd like to show the best
> possible outcome, and not the best possible given an arbitrary
> limitation that avoids an error where the user does something
> erroneous.

I would not call correctness an arbitrary limitation.

AFAIU page cache is shared, one application thinking that files it
opens can't be modified is no guarantee that the pages themselves 
will remain unchanged.

Isn't support for read-only page cache entries also a problem for 
RDMA when it comes to fs checksums? Sounds like a problem area that
needs real solutions.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-14  7:31           ` Boris Pismenny
  2020-07-14 16:23             ` Jakub Kicinski
@ 2020-07-14 20:38             ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2020-07-14 20:38 UTC (permalink / raw)
  To: borisp; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

From: Boris Pismenny <borisp@mellanox.com>
Date: Tue, 14 Jul 2020 10:31:25 +0300

> At the time, Dave objected when we presented this on the netdev conference,
> and we didn't want to delay the entire series just to argue this point. It's
> all a matter of timing and priorities. Now we have an ASIC that uses this API,
> and I'd like to show the best possible outcome, and not the best possible given
> an arbitrary limitation that avoids an error where the user does something
> erroneous.

It's not arbitrary, and what the user is doing is not "erroneous".

Imagine a userspace fileserver using sendpage, other users in the system can
write to the files while the fileserver sends it off to a client.

And that's perfectly legitimate and fine.  We get the IP checksums
correct, everything works.

And, therefore, if TLS is used, the signatures should be correct too.

And I'm also happy to hear that from the very start I was against an
implementation that would knowingly send incorrect signatures.

I'm not moving on this point, sorry.  Correctness over performance.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-14  7:27           ` Boris Pismenny
@ 2020-07-14 20:42             ` David Miller
  2020-07-14 20:56               ` Boris Pismenny
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2020-07-14 20:42 UTC (permalink / raw)
  To: borisp; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

From: Boris Pismenny <borisp@mellanox.com>
Date: Tue, 14 Jul 2020 10:27:11 +0300

> Why is it the kernel's role to protect against such an error?

Because the kernel should perform it's task correctly no matter what
in the world the user does.

> Surely the user that modifies pagecache data while sending it over
> sendfile (with TCP) will suffer from consistency bugs that are even worse.

No they won't, often times this is completely legitimate.  One task is
doing a sendpage while another process with access to the file writes
to it.

And that's perfectly fine and allowed by the APIs.

And we must set the IP checksums and TLS signatures correctly.

I will not allow for an implementation that can knowingly send corrupt
things onto the wire.

> The copy in the TLS_DEVICE sendfile path greatly reduces the value of
> all of this work. If we want to get the maximum out of this, then the
> copy has to go.
> 
> If we can't make this the default (as it is in FreeBSD), and we can't
> add a knob to enable this. Then, what should we do here?

I have no problem people using FreeBSD if it serves their needs better
than Linux does.  If people want corrupt TLS signatures in legitimate
use cases, and FreeBSD allows it, so be it.

So don't bother using this as a threat or a reason for me to allow a
feature or a change into the Linux networking.  It will never work.

And, let me get this straight, from the very beginning you intended to
try and add this thing even though I was %100 explicitly against it?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tls: add zerocopy device sendpage
  2020-07-14 20:42             ` David Miller
@ 2020-07-14 20:56               ` Boris Pismenny
  0 siblings, 0 replies; 12+ messages in thread
From: Boris Pismenny @ 2020-07-14 20:56 UTC (permalink / raw)
  To: David Miller; +Cc: kuba, john.fastabend, daniel, tariqt, netdev

On 14/07/2020 23:42, David Miller wrote:
> From: Boris Pismenny <borisp@mellanox.com>
> Date: Tue, 14 Jul 2020 10:27:11 +0300
>
>> Why is it the kernel's role to protect against such an error?
> Because the kernel should perform it's task correctly no matter what
> in the world the user does.
>
>> Surely the user that modifies pagecache data while sending it over
>> sendfile (with TCP) will suffer from consistency bugs that are even worse.
> No they won't, often times this is completely legitimate.  One task is
> doing a sendpage while another process with access to the file writes
> to it.
>
> And that's perfectly fine and allowed by the APIs.
>
> And we must set the IP checksums and TLS signatures correctly.
>
> I will not allow for an implementation that can knowingly send corrupt
> things onto the wire.

Not even if the user knows exactly what she is doing. For example, when
serving static files?

>> The copy in the TLS_DEVICE sendfile path greatly reduces the value of
>> all of this work. If we want to get the maximum out of this, then the
>> copy has to go.
>>
>> If we can't make this the default (as it is in FreeBSD), and we can't
>> add a knob to enable this. Then, what should we do here?
> I have no problem people using FreeBSD if it serves their needs better
> than Linux does.  If people want corrupt TLS signatures in legitimate
> use cases, and FreeBSD allows it, so be it.
>
> So don't bother using this as a threat or a reason for me to allow a
> feature or a change into the Linux networking.  It will never work.

This isn't what I intended to convey. I've used the FreeBSD implementation
to emphasize that the performance gain justifies including this despite
the implication on user applications.

> And, let me get this straight, from the very beginning you intended to
> try and add this thing even though I was %100 explicitly against it?

There was no intention to hide the correctness issue here. I've proposed
to expose it via a knob for this very reason. I'm sorry that I haven't
conveyed this more clearly in the commit message.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] tls: add zerocopy device sendpage
@ 2020-07-12 10:53 Boris Pismenny
  0 siblings, 0 replies; 12+ messages in thread
From: Boris Pismenny @ 2020-07-12 10:53 UTC (permalink / raw)
  To: kuba, john.fastabend, daniel, davem; +Cc: borisp, tariqt, netdev

Add support for zerocopy sendfile when using TLS device offload.
Before this patch, TLS device offload would copy sendfile data to a
bounce buffer. This can be avoided when the user knows that page cache
data is not modified. For example, when a serving static files.
Removing this copy improves performance significaintly, as TLS and TCP
sendfile perform the same operations, and the only overhead is TLS
header/trailer insertion.

This patch adds two configuration knobs to control TLS zerocopy sendfile:
1) socket option named TLS_TX_ZEROCOPY_SENDFILE that enables
applications to use zerocopy sendfile on a per-socket basis.
2) global sysctl named tls_zerocopy_sendfile that defines the default
for the entire system.

Non TLS device enabled sockets are not affected by this option,
and attempts to configure it will fail.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
---
 include/net/netns/ipv4.h   |  4 ++++
 include/net/tls.h          |  1 +
 include/uapi/linux/tls.h   |  1 +
 net/ipv4/sysctl_net_ipv4.c |  9 +++++++
 net/tls/tls_device.c       | 39 ++++++++++++++++++++++--------
 net/tls/tls_main.c         | 60 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 9e36738c1fe1..bc828d272151 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -196,6 +196,10 @@ struct netns_ipv4 {
 	int sysctl_igmp_llm_reports;
 	int sysctl_igmp_qrv;
 
+#ifdef CONFIG_TLS_DEVICE
+	int sysctl_tls_zerocopy_sendfile;
+#endif
+
 	struct ping_group_range ping_group_range;
 
 	atomic_t dev_addr_genid;
diff --git a/include/net/tls.h b/include/net/tls.h
index e5dac7e74e79..f80985ac55de 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -172,6 +172,7 @@ struct tls_record_info {
 
 struct tls_offload_context_tx {
 	struct crypto_aead *aead_send;
+	bool zerocopy_sendpage;
 	spinlock_t lock;	/* protects records list */
 	struct list_head records_list;
 	struct tls_record_info *open_record;
diff --git a/include/uapi/linux/tls.h b/include/uapi/linux/tls.h
index bcd2869ed472..d6f65f5d206f 100644
--- a/include/uapi/linux/tls.h
+++ b/include/uapi/linux/tls.h
@@ -39,6 +39,7 @@
 /* TLS socket options */
 #define TLS_TX			1	/* Set transmit parameters */
 #define TLS_RX			2	/* Set receive parameters */
+#define TLS_TX_ZEROCOPY_SENDFILE	3	/* transmit zerocopy sendfile */
 
 /* Supported versions */
 #define TLS_VERSION_MINOR(ver)	((ver) & 0xFF)
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 5653e3b011bf..0a0fc29225a2 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1353,6 +1353,15 @@ static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ONE
 	},
+#ifdef CONFIG_TLS_DEVICE
+	{
+		.procname	= "tls_zerocopy_sendfile",
+		.data		= &init_net.ipv4.sysctl_tls_zerocopy_sendfile,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+#endif
 	{ }
 };
 
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 18fa6067bb7f..092b20428c15 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -413,7 +413,8 @@ static int tls_device_copy_data(void *addr, size_t bytes, struct iov_iter *i)
 static int tls_push_data(struct sock *sk,
 			 struct iov_iter *msg_iter,
 			 size_t size, int flags,
-			 unsigned char record_type)
+			 unsigned char record_type,
+			 struct page *zc_page)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_prot_info *prot = &tls_ctx->prot_info;
@@ -482,11 +483,21 @@ static int tls_push_data(struct sock *sk,
 		copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
 		copy = min_t(size_t, copy, (max_open_record_len - record->len));
 
-		rc = tls_device_copy_data(page_address(pfrag->page) +
-					  pfrag->offset, copy, msg_iter);
-		if (rc)
-			goto handle_error;
-		tls_append_frag(record, pfrag, copy);
+		if (!zc_page) {
+			rc = tls_device_copy_data(page_address(pfrag->page) +
+						  pfrag->offset, copy, msg_iter);
+			if (rc)
+				goto handle_error;
+			tls_append_frag(record, pfrag, copy);
+		} else {
+			struct page_frag _pfrag;
+
+			copy = min_t(size_t, size, (max_open_record_len - record->len));
+			_pfrag.page = zc_page;
+			_pfrag.offset = 0;
+			_pfrag.size = copy;
+			tls_append_frag(record, &_pfrag, copy);
+		}
 
 		size -= copy;
 		if (!size) {
@@ -548,7 +559,7 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	}
 
 	rc = tls_push_data(sk, &msg->msg_iter, size,
-			   msg->msg_flags, record_type);
+			   msg->msg_flags, record_type, NULL);
 
 out:
 	release_sock(sk);
@@ -560,9 +571,10 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
 			int offset, size_t size, int flags)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx);
 	struct iov_iter	msg_iter;
-	char *kaddr = kmap(page);
 	struct kvec iov;
+	char *kaddr;
 	int rc;
 
 	if (flags & MSG_SENDPAGE_NOTLAST)
@@ -576,11 +588,18 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
 		goto out;
 	}
 
+	if (ctx->zerocopy_sendpage) {
+		rc = tls_push_data(sk, &msg_iter, size,
+				   flags, TLS_RECORD_TYPE_DATA, page);
+		goto out;
+	}
+
+	kaddr = kmap(page);
 	iov.iov_base = kaddr + offset;
 	iov.iov_len = size;
 	iov_iter_kvec(&msg_iter, WRITE, &iov, 1, size);
 	rc = tls_push_data(sk, &msg_iter, size,
-			   flags, TLS_RECORD_TYPE_DATA);
+			   flags, TLS_RECORD_TYPE_DATA, NULL);
 	kunmap(page);
 
 out:
@@ -654,7 +673,7 @@ static int tls_device_push_pending_record(struct sock *sk, int flags)
 	struct iov_iter	msg_iter;
 
 	iov_iter_kvec(&msg_iter, WRITE, NULL, 0, 0);
-	return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
+	return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA, NULL);
 }
 
 void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index ec10041c6b7d..b95437c91339 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -422,6 +422,27 @@ static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
 	return rc;
 }
 
+static int do_tls_getsockopt_tx_zc(struct sock *sk, char __user *optval,
+				   int __user *optlen)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context_tx *ctx;
+	int len;
+
+	if (get_user(len, optlen))
+		return -EFAULT;
+
+	len = min_t(unsigned int, len, sizeof(int));
+	if (len < 0)
+		return -EINVAL;
+
+	if (!tls_ctx || tls_ctx->tx_conf != TLS_HW)
+		return -EBUSY;
+
+	ctx = tls_offload_ctx_tx(tls_ctx);
+	return ctx->zerocopy_sendpage;
+}
+
 static int do_tls_getsockopt(struct sock *sk, int optname,
 			     char __user *optval, int __user *optlen)
 {
@@ -431,6 +452,9 @@ static int do_tls_getsockopt(struct sock *sk, int optname,
 	case TLS_TX:
 		rc = do_tls_getsockopt_tx(sk, optval, optlen);
 		break;
+	case TLS_TX_ZEROCOPY_SENDFILE:
+		rc = do_tls_getsockopt_tx_zc(sk, optval, optlen);
+		break;
 	default:
 		rc = -ENOPROTOOPT;
 		break;
@@ -450,6 +474,15 @@ static int tls_getsockopt(struct sock *sk, int level, int optname,
 	return do_tls_getsockopt(sk, optname, optval, optlen);
 }
 
+static void tls_set_tx_zerocopy_sendfile(struct tls_context *tls_ctx,
+					 int val)
+{
+	struct tls_offload_context_tx *ctx;
+
+	ctx = tls_offload_ctx_tx(tls_ctx);
+	ctx->zerocopy_sendpage = val;
+}
+
 static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 				  unsigned int optlen, int tx)
 {
@@ -533,8 +566,11 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 		rc = tls_set_device_offload(sk, ctx);
 		conf = TLS_HW;
 		if (!rc) {
+			int zc = sock_net(sk)->ipv4.sysctl_tls_zerocopy_sendfile;
+
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSTXDEVICE);
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSCURRTXDEVICE);
+			tls_set_tx_zerocopy_sendfile(ctx, zc);
 		} else {
 			rc = tls_set_sw_offload(sk, ctx, 1);
 			if (rc)
@@ -579,6 +615,25 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
 	return rc;
 }
 
+static int do_tls_setsockopt_tx_zc(struct sock *sk, char __user *optval,
+				   unsigned int optlen)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	int val;
+
+	if (!tls_ctx || tls_ctx->tx_conf != TLS_HW)
+		return -EINVAL;
+
+	if (optlen < sizeof(int))
+		return -EINVAL;
+
+	if (get_user(val, (int __user *)optval))
+		return -EFAULT;
+
+	tls_set_tx_zerocopy_sendfile(tls_ctx, val);
+	return 0;
+}
+
 static int do_tls_setsockopt(struct sock *sk, int optname,
 			     char __user *optval, unsigned int optlen)
 {
@@ -592,6 +647,11 @@ static int do_tls_setsockopt(struct sock *sk, int optname,
 					    optname == TLS_TX);
 		release_sock(sk);
 		break;
+	case TLS_TX_ZEROCOPY_SENDFILE:
+		lock_sock(sk);
+		rc = do_tls_setsockopt_tx_zc(sk, optval, optlen);
+		release_sock(sk);
+		break;
 	default:
 		rc = -ENOPROTOOPT;
 		break;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-07-14 20:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1594550649-3097-1-git-send-email-borisp@mellanox.com>
     [not found] ` <20200712.153233.370000904740228888.davem@davemloft.net>
2020-07-13  7:49   ` [PATCH] tls: add zerocopy device sendpage Boris Pismenny
2020-07-13 19:05     ` David Miller
2020-07-13 22:15       ` Boris Pismenny
2020-07-13 22:59         ` Jakub Kicinski
2020-07-14  7:31           ` Boris Pismenny
2020-07-14 16:23             ` Jakub Kicinski
2020-07-14 20:38             ` David Miller
2020-07-14  1:02         ` David Miller
2020-07-14  7:27           ` Boris Pismenny
2020-07-14 20:42             ` David Miller
2020-07-14 20:56               ` Boris Pismenny
2020-07-12 10:53 Boris Pismenny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).