* md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
@ 2017-10-16 6:53 Christophe LEROY
2017-10-16 21:10 ` Stephan Mueller
0 siblings, 1 reply; 6+ messages in thread
From: Christophe LEROY @ 2017-10-16 6:53 UTC (permalink / raw)
To: Stephan Müller; +Cc: linux-crypto
Hi Stephan,
I get an issue with md5sum of a big file with kernel 4.9. It don't get
that issue with kernel 4.13.
When I do an strace, I see a difference in the calls: at the end of the
file, with 4.9 md5sum uses sendmsg() for the last block while with 4.13
it uses splice() as for all the previous blocks.
The problem is that the last block has a size over 32kbytes, which is
the maximum size the talitos driver accepts to hash at once.
It looks like sendmsg() sends the entire block to the crypto driver
while splice() calls the crypto driver with blocks of page size.
Is there a way to workaround this issue on 4.9 ? Is there a way to avoid
talitos ahash being called with more than 32kbytes of data ?
strace with 4.9 kernel:
...
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
558080}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
295936}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33792}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
write(2, "Generation of hash for file test"..., 45) = 45
...
strace with 4.13 kernel:
...
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
558080}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
295936}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33792}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 33792
splice(4, NULL, 7, NULL, 33792, SPLICE_F_MORE) = 33792
recvmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\"\35|\3273\302\35\252\4\276l\210\315B\210^", 64}],
msg_controllen=0, msg_flags=0}, 0) = 16
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(204, 46), ...}) = 0
ioctl(1, TCGETS, {B115200 opost isig icanon echo ...}) = 0
write(1, "221d7cd733c21daa04be6c88cd42885e"..., 39) = 39
...
Thanks
Christophe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
2017-10-16 6:53 md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13 Christophe LEROY
@ 2017-10-16 21:10 ` Stephan Mueller
2017-10-17 7:58 ` Christophe LEROY
0 siblings, 1 reply; 6+ messages in thread
From: Stephan Mueller @ 2017-10-16 21:10 UTC (permalink / raw)
To: Christophe LEROY; +Cc: linux-crypto
Am Montag, 16. Oktober 2017, 08:53:00 CEST schrieb Christophe LEROY:
Hi Christophe,
> Hi Stephan,
>
> I get an issue with md5sum of a big file with kernel 4.9. It don't get
> that issue with kernel 4.13.
The key to the difference in libkcapi is the following code:
static void _kcapi_handle_flags(struct kcapi_handle *handle)
...
/* older interfaces only processed 16 pages in a row */
handle->flags.alg_max_pages = _kcapi_kernver_ge(handle, 4, 11, 0) ?
UINT_MAX : ALG_MAX_PAGES;
...
This check is mainly for skcipher but it affects all cipher types. Thus, on
older kernels, only 16 pages are injected into the kernel with splice and then
it is reverted to sendmsg. Again, this logic is not so much of an issue for
algif_hash, but rather for algif_skcipher and algif_aead.
>
> When I do an strace, I see a difference in the calls: at the end of the
> file, with 4.9 md5sum uses sendmsg() for the last block while with 4.13
> it uses splice() as for all the previous blocks.
>
> The problem is that the last block has a size over 32kbytes, which is
> the maximum size the talitos driver accepts to hash at once.
> It looks like sendmsg() sends the entire block to the crypto driver
> while splice() calls the crypto driver with blocks of page size.
I understand that your driver has this 32kb limit. But I am not sure what the
difference between the handling of the sendmsg and the splice invocation is.
According to the strace, the splice accepted 256kb of data. Looking into the
hash_sendpage function implementing the backend of this splice system call, it
simply performs an update invocation with this buffer size. I.e. it invokes
your driver with 256kb of data. The sendmsg first performs a copy_from_user
with the maximum limit of 16 * PAGE_SIZE (see the limit variable in
hash_sendmsg) and then invokes the update function.
So, I am not sure why the sendmsg call chokes where the sendpage call
succeeds.
If you tamper with the code shown above from libkcapi and set alg_max_pages to
a low value, the library reverts to sendmsg after the given number of pages.
Ciao
Stephan
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
2017-10-16 21:10 ` Stephan Mueller
@ 2017-10-17 7:58 ` Christophe LEROY
2017-10-17 8:51 ` Christophe LEROY
2017-10-17 16:43 ` Stephan Mueller
0 siblings, 2 replies; 6+ messages in thread
From: Christophe LEROY @ 2017-10-17 7:58 UTC (permalink / raw)
To: Stephan Mueller; +Cc: linux-crypto
Hi Stephan,
Le 16/10/2017 à 23:10, Stephan Mueller a écrit :
> Am Montag, 16. Oktober 2017, 08:53:00 CEST schrieb Christophe LEROY:
>
> Hi Christophe,
>
>> Hi Stephan,
>>
>> I get an issue with md5sum of a big file with kernel 4.9. It don't get
>> that issue with kernel 4.13.
>
> The key to the difference in libkcapi is the following code:
>
> static void _kcapi_handle_flags(struct kcapi_handle *handle)
> ...
> /* older interfaces only processed 16 pages in a row */
> handle->flags.alg_max_pages = _kcapi_kernver_ge(handle, 4, 11, 0) ?
> UINT_MAX : ALG_MAX_PAGES;
> ...
>
> This check is mainly for skcipher but it affects all cipher types. Thus, on
> older kernels, only 16 pages are injected into the kernel with splice and then
> it is reverted to sendmsg. Again, this logic is not so much of an issue for
> algif_hash, but rather for algif_skcipher and algif_aead.
>>
>> When I do an strace, I see a difference in the calls: at the end of the
>> file, with 4.9 md5sum uses sendmsg() for the last block while with 4.13
>> it uses splice() as for all the previous blocks.
>>
>> The problem is that the last block has a size over 32kbytes, which is
>> the maximum size the talitos driver accepts to hash at once.
>> It looks like sendmsg() sends the entire block to the crypto driver
>> while splice() calls the crypto driver with blocks of page size.
>
> I understand that your driver has this 32kb limit. But I am not sure what the
> difference between the handling of the sendmsg and the splice invocation is.
> According to the strace, the splice accepted 256kb of data. Looking into the
> hash_sendpage function implementing the backend of this splice system call, it
> simply performs an update invocation with this buffer size. I.e. it invokes
> your driver with 256kb of data. The sendmsg first performs a copy_from_user
> with the maximum limit of 16 * PAGE_SIZE (see the limit variable in
> hash_sendmsg) and then invokes the update function.
As far as I can see, hash_sendpage() is called with one single page, of
size 16kb in my setup. If I understand correct, the loop over every page
is implemented in __splice_from_pipe().
>
> So, I am not sure why the sendmsg call chokes where the sendpage call
> succeeds.
As far as I can see, the sendmsg() calls the update function with the
full buffer, which in my case is over the 32kb limit of the driver.
>
> If you tamper with the code shown above from libkcapi and set alg_max_pages to
> a low value, the library reverts to sendmsg after the given number of pages.
Couldn't we get the libkcapi to splice until the last full page, then
only use sendmsg() for the last chunk once it is smaller than an entire
page ?
Thanks
Christophe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
2017-10-17 7:58 ` Christophe LEROY
@ 2017-10-17 8:51 ` Christophe LEROY
2017-10-17 13:42 ` Stephan Mueller
2017-10-17 16:43 ` Stephan Mueller
1 sibling, 1 reply; 6+ messages in thread
From: Christophe LEROY @ 2017-10-17 8:51 UTC (permalink / raw)
To: Stephan Mueller; +Cc: linux-crypto
Hi Again Stephan
Le 17/10/2017 à 09:58, Christophe LEROY a écrit :
> Hi Stephan,
>
> Le 16/10/2017 à 23:10, Stephan Mueller a écrit :
>> Am Montag, 16. Oktober 2017, 08:53:00 CEST schrieb Christophe LEROY:
>>
>> Hi Christophe,
>>
>>> Hi Stephan,
>>>
>>> I get an issue with md5sum of a big file with kernel 4.9. It don't get
>>> that issue with kernel 4.13.
>>
>> The key to the difference in libkcapi is the following code:
>>
>> static void _kcapi_handle_flags(struct kcapi_handle *handle)
>> ...
>> /* older interfaces only processed 16 pages in a row */
>> handle->flags.alg_max_pages = _kcapi_kernver_ge(handle, 4,
>> 11, 0) ?
>> UINT_MAX : ALG_MAX_PAGES;
>> ...
>>
>> This check is mainly for skcipher but it affects all cipher types.
>> Thus, on
>> older kernels, only 16 pages are injected into the kernel with splice
>> and then
>> it is reverted to sendmsg. Again, this logic is not so much of an
>> issue for
>> algif_hash, but rather for algif_skcipher and algif_aead.
>>>
>>> When I do an strace, I see a difference in the calls: at the end of the
>>> file, with 4.9 md5sum uses sendmsg() for the last block while with 4.13
>>> it uses splice() as for all the previous blocks.
>>>
>>> The problem is that the last block has a size over 32kbytes, which is
>>> the maximum size the talitos driver accepts to hash at once.
>>> It looks like sendmsg() sends the entire block to the crypto driver
>>> while splice() calls the crypto driver with blocks of page size.
>>
>> I understand that your driver has this 32kb limit. But I am not sure
>> what the
>> difference between the handling of the sendmsg and the splice
>> invocation is.
>> According to the strace, the splice accepted 256kb of data. Looking
>> into the
>> hash_sendpage function implementing the backend of this splice system
>> call, it
>> simply performs an update invocation with this buffer size. I.e. it
>> invokes
>> your driver with 256kb of data. The sendmsg first performs a
>> copy_from_user
>> with the maximum limit of 16 * PAGE_SIZE (see the limit variable in
>> hash_sendmsg) and then invokes the update function.
>
> As far as I can see, hash_sendpage() is called with one single page, of
> size 16kb in my setup. If I understand correct, the loop over every page
> is implemented in __splice_from_pipe().
>
>>
>> So, I am not sure why the sendmsg call chokes where the sendpage call
>> succeeds.
>
> As far as I can see, the sendmsg() calls the update function with the
> full buffer, which in my case is over the 32kb limit of the driver.
>
>>
>> If you tamper with the code shown above from libkcapi and set
>> alg_max_pages to
>> a low value, the library reverts to sendmsg after the given number of
>> pages.
>
> Couldn't we get the libkcapi to splice until the last full page, then
> only use sendmsg() for the last chunk once it is smaller than an entire
> page ?
>
Note that the test reported above is done with version 0.14.0
I've now tried a test with 1.0.0, and there seems to be another big
issue: the error returned by sendmsg() is not taken into account anymore:
...
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
558080}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
vmsplice(5,
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
295936}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = 262144
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33792}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33814}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33836}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33858}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33880}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33902}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33924}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33946}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33968}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
33990}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
sendmsg(7, {msg_name(0)=NULL,
msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
34012}], msg_controllen=0, msg_flags=0}, MSG_MORE) = -1 EINVAL (Invalid
argument)
...
Christophe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
2017-10-17 8:51 ` Christophe LEROY
@ 2017-10-17 13:42 ` Stephan Mueller
0 siblings, 0 replies; 6+ messages in thread
From: Stephan Mueller @ 2017-10-17 13:42 UTC (permalink / raw)
To: Christophe LEROY; +Cc: linux-crypto
Am Dienstag, 17. Oktober 2017, 10:51:06 CEST schrieb Christophe LEROY:
Hi Christophe,
>
> I've now tried a test with 1.0.0, and there seems to be another big
> issue: the error returned by sendmsg() is not taken into account anymore:
It seems that this bug was there before. Can you please check whether this one
fixes it?
diff --git a/lib/kcapi-kernel-if.c b/lib/kcapi-kernel-if.c
index e00d328..9053104 100644
--- a/lib/kcapi-kernel-if.c
+++ b/lib/kcapi-kernel-if.c
@@ -321,6 +321,8 @@ int32_t _kcapi_common_vmsplice_chunk_fd(struct
kcapi_handle *handle, int *fdptr,
if ((handle->processed_sg++) > handle->flags.alg_max_pages) {
ret = _kcapi_common_send_data(handle, &iov, 1,
sflags);
+ if (ret < 0)
+ return ret;
} else {
ret = vmsplice(handle->pipes[1], &iov, 1,
SPLICE_F_GIFT|flags);
Thanks
Ciao
Stephan
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13
2017-10-17 7:58 ` Christophe LEROY
2017-10-17 8:51 ` Christophe LEROY
@ 2017-10-17 16:43 ` Stephan Mueller
1 sibling, 0 replies; 6+ messages in thread
From: Stephan Mueller @ 2017-10-17 16:43 UTC (permalink / raw)
To: Christophe LEROY; +Cc: linux-crypto
Am Dienstag, 17. Oktober 2017, 09:58:31 CEST schrieb Christophe LEROY:
Hi Christophe,
>
> > If you tamper with the code shown above from libkcapi and set
> > alg_max_pages to a low value, the library reverts to sendmsg after the
> > given number of pages.
> Couldn't we get the libkcapi to splice until the last full page, then
> only use sendmsg() for the last chunk once it is smaller than an entire
> page ?
Adding such loop to the libkcapi invocation of sendmsg to send page-wise is
easy. Yet, I am wondering whether this is the right thing to do. The
algif_hash uses the standard in-kernel interface to invoke the cipher
implementation.
Thus, when fixing the issue in user space, the issue is still present in the
kernel. Thus, if for some reason the kernel wants to hash more than 32kb in a
row, it will fall with the Talitos driver. Thus, wouldn't it make more sense
that the loop to process the data in chunks inside the Talitos driver?
Ciao
Stephan
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-10-17 16:43 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-16 6:53 md5sum (from libkcapi) fails on kernel 4.9 but not on 4.13 Christophe LEROY
2017-10-16 21:10 ` Stephan Mueller
2017-10-17 7:58 ` Christophe LEROY
2017-10-17 8:51 ` Christophe LEROY
2017-10-17 13:42 ` Stephan Mueller
2017-10-17 16:43 ` Stephan Mueller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).