All of lore.kernel.org
 help / color / mirror / Atom feed
* x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-20 14:10 Alexander Potapenko
  2017-11-20 21:42   ` Eric Biggers
  0 siblings, 1 reply; 61+ messages in thread
From: Alexander Potapenko @ 2017-11-20 14:10 UTC (permalink / raw)
  To: linux-crypto
  Cc: Kostya Serebryany, Dmitriy Vyukov, Andrey Konovalov, Eric Biggers

[-- Attachment #1: Type: text/plain, Size: 5538 bytes --]

Hi all,

TL;DR userspace fuzzing may be very effective for finding bugs in
crypto code, but might require some kernel-side changes.

When the attached binary file,
crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae, is passed as payload
to add_key("asymmetric", "desc", payload, len), it triggers a
slab-out-of-bounds KASAN report (see below). This can be reproduced
with the attached program, add_key.c.

We found this bug by fuzzing asn1_ber_decoder() in the userspace using
libFuzzer (http://llvm.org/docs/LibFuzzer.html).
The trick is to compile and link together several translation units
that provide a transitive closure of asn1_ber_decoder() (some of the
kernel functions need to be re-implemented to run in the userspace).
I've attached the resulting fuzzer as well (asn1_fuzzer.tar.gz). It
can operate on its own, although better results are achieved when
using the fuzzing corpus from
https://github.com/google/boringssl/tree/master/fuzz and
https://github.com/openssl/openssl/tree/master/fuzz.

The question I'd like to raise is the testability of the crypto code
in the kernel.

Right now it's quite hard to extract the implementation of e.g.
message parsers to run them in the userspace, although there's nothing
kernel-specific in those algorithms. For example, I haven't managed to
quickly build a similar fuzzer for decoding PKCS7 messages, because of
too many extra dependencies.
I've started looking into linking libFuzzer into an UML build, but
that's also pretty invasive.

The main reason for testing this code in the userspace is speed.
Syzkaller fuzzing the add_key() syscall in the kernel can operate at
the speed of maybe 2000 executions per second (usually a lot less),
whereas a libFuzzer-based userspace fuzzer can run tens of thousands
tests per second. This will sure shed some light into the dark corners
and perhaps find more bugs. Once we are there, it should be easy to
leverage the power of https://github.com/google/oss-fuzz to automate
the fuzzing.

Could it be possible to decouple the parsing algorithms from the rest
of kernel-specific crypto code? Maybe someone has other ideas about
how this code can be ran in the userspace?

Alex.

KASAN report below:

==================================================================
BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x19f/0x940
Read of size 128 at addr ffff8800334747af by task add_key/2599

CPU: 0 PID: 2599 Comm: add_key Not tainted 4.14.0+ #1289
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17
 dump_stack+0x90/0x10e lib/dump_stack.c:53
 print_address_description+0x65/0x270 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351
 kasan_report+0x251/0x340 mm/kasan/report.c:409
 memcpy+0x1f/0x50 mm/kasan/kasan.c:302
 x509_fabricate_name.constprop.1+0x19f/0x940
crypto/asymmetric_keys/x509_cert_parser.c:366
 asn1_ber_decoder+0x9cf/0x1dc0 lib/asn1_decoder.c:447
 x509_cert_parse+0x1c2/0x600 crypto/asymmetric_keys/x509_cert_parser.c:89
 x509_key_preparse+0x5c/0x850 crypto/asymmetric_keys/x509_public_key.c:174
 asymmetric_key_preparse+0x8e/0xe0 crypto/asymmetric_keys/asymmetric_type.c:388
 key_create_or_update+0x4d3/0x1080 security/keys/key.c:855
 SYSC_add_key security/keys/keyctl.c:122
 SyS_add_key+0x145/0x280 security/keys/keyctl.c:62
 entry_SYSCALL_64_fastpath+0x13/0x6c arch/x86/entry/entry_64.S:203
...
Allocated by task 2599:
 set_track mm/kasan/kasan.c:459
 kasan_kmalloc+0xa0/0xd0 mm/kasan/kasan.c:551
 __kmalloc_node+0x7d/0x240 mm/slub.c:3807
 kvmalloc ./include/linux/mm.h:540
 SYSC_add_key security/keys/keyctl.c:104
...
Freed by task 7:
 set_track mm/kasan/kasan.c:459
 kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
 slab_free_hook mm/slub.c:1391
 slab_free_freelist_hook mm/slub.c:1412
 slab_free mm/slub.c:2968
 kfree+0x88/0x190 mm/slub.c:3899
 __rcu_reclaim kernel/rcu/rcu.h:190
 rcu_do_batch kernel/rcu/tree.c:2758
 invoke_rcu_callbacks kernel/rcu/tree.c:3012
 __rcu_process_callbacks kernel/rcu/tree.c:2979
 rcu_process_callbacks+0x58e/0x1cb0 kernel/rcu/tree.c:2996
 __do_softirq+0x1e6/0x696 kernel/softirq.c:285

The buggy address belongs to the object at ffff880033474780
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 47 bytes inside of
 96-byte region [ffff880033474780, ffff8800334747e0)
The buggy address belongs to the page:
page:ffffea0000cd1d00 count:1 mapcount:0 mapping:          (null) index:0x0
flags: 0x100000000000100(slab)
raw: 0100000000000100 0000000000000000 0000000000000000 0000000180200020
raw: ffffea0000cd35c0 0000000800000008 ffff880035c01780 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff880033474680: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff880033474700: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff880033474780: 00 00 00 00 00 00 00 00 06 fc fc fc fc fc fc fc
                                           ^
 ffff880033474800: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
 ffff880033474880: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

[-- Attachment #2: add_key.c --]
[-- Type: text/x-csrc, Size: 738 bytes --]

#include <sys/types.h>
#include <keyutils.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    key_serial_t key;

    if (argc != 4) {
        fprintf(stderr, "Usage: %s type description payload\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }
    FILE *f = fopen(argv[3], "rb");
    fseek(f, 0, SEEK_END);
    int len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *payload = malloc(len + 1);
    fread(payload, len, 1, f);

    key = add_key(argv[1], argv[2], payload, len,
                KEY_SPEC_SESSION_KEYRING);
    if (key == -1) {
        perror("add_key");
        exit(EXIT_FAILURE);
    }

    printf("Key ID is %lx\n", (long) key);

    exit(EXIT_SUCCESS);
}


[-- Attachment #3: crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae --]
[-- Type: application/octet-stream, Size: 70 bytes --]

[-- Attachment #4: asn1_fuzzer.tar.gz --]
[-- Type: application/gzip, Size: 303197 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-20 14:10 x509 parsing bug + fuzzing crypto in the userspace Alexander Potapenko
@ 2017-11-20 21:42   ` Eric Biggers
  0 siblings, 0 replies; 61+ messages in thread
From: Eric Biggers @ 2017-11-20 21:42 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: linux-crypto, Kostya Serebryany, keyrings, Dmitriy Vyukov,
	Andrey Konovalov

+Cc keyrings@vger.kernel.org (for asymmetric_keys)

First of all, thanks for working on this!  A lot of this code really needs to be
better tested.

On Mon, Nov 20, 2017 at 03:10:55PM +0100, Alexander Potapenko wrote:
> Hi all,
> 
> TL;DR userspace fuzzing may be very effective for finding bugs in
> crypto code, but might require some kernel-side changes.
> 
> When the attached binary file,
> crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae, is passed as payload
> to add_key("asymmetric", "desc", payload, len), it triggers a
> slab-out-of-bounds KASAN report (see below). This can be reproduced
> with the attached program, add_key.c.

The bug is in asn1_ber_decoder(): the special length value of 0x80, which
indicates an "indefinite length", is being passed to one of the "action"
functions as the real length, causing the input buffer to be overread.  Off-hand
I don't know the best fix, but probably it would involve calling
asn1_find_indefinite_length() before the "action" rather than after...

> 
> We found this bug by fuzzing asn1_ber_decoder() in the userspace using
> libFuzzer (http://llvm.org/docs/LibFuzzer.html).
> The trick is to compile and link together several translation units
> that provide a transitive closure of asn1_ber_decoder() (some of the
> kernel functions need to be re-implemented to run in the userspace).
> I've attached the resulting fuzzer as well (asn1_fuzzer.tar.gz). It
> can operate on its own, although better results are achieved when
> using the fuzzing corpus from
> https://github.com/google/boringssl/tree/master/fuzz and
> https://github.com/openssl/openssl/tree/master/fuzz.

I had actually tried almost the same thing recently, but with afl-fuzz instead
of libFuzzer.  But I don't think it was any better than what you did, and I
didn't find the above bug, perhaps because the AFL_HARDEN compiler options don't
include AddressSanitizer.

I did use the original unpreprocessed .c files instead of preprocessed ones, but
it required stubbing out quite a few headers.  Probably using preprocessed files
is a bit easier.

For the fuzzing function I had it call x509_cert_parse() rather than
asn1_ber_decode() directly, but with x509_get_sig_params() and
x509_check_for_self_signed() stubbed out.  It would be nice to get coverage of
x509_check_for_self_signed() because it will call into crypto/rsa.c and, among
other things, run yet another ASN.1 decoder...  But that would have required
porting a lot of the crypto API to userspace.

> 
> Could it be possible to decouple the parsing algorithms from the rest
> of kernel-specific crypto code? Maybe someone has other ideas about
> how this code can be ran in the userspace?
> 

One idea is that the ASN.1 parsers could be run with no "actions", which would
minimize dependencies on other kernel code.  But unfortunately that would omit a
lot of the interesting code.  Many (or most?) bugs will be in how the parsed
data is used, rather than in the parsing itself.  I don't think there is an easy
solution.

Note that separate from asymmetric_keys (which you can think of as being
in-between the keyrings subsystem and the crypto subsystem) there is also the
userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
to port a lot of the crypto API to userspace, but it would require a lot of work
to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
more thoroughly test AF_ALG.  For example it could be made aware of algorithm
templates so that it could try combining them in unusual ways.  (Example:
https://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2 was a NULL pointer
dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
i.e. the mcryptd template wrapping md5.)  Also,
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
self-tests are enabled.

Eric

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-20 21:42   ` Eric Biggers
  0 siblings, 0 replies; 61+ messages in thread
From: Eric Biggers @ 2017-11-20 21:42 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: linux-crypto, Kostya Serebryany, keyrings, Dmitriy Vyukov,
	Andrey Konovalov

+Cc keyrings@vger.kernel.org (for asymmetric_keys)

First of all, thanks for working on this!  A lot of this code really needs to be
better tested.

On Mon, Nov 20, 2017 at 03:10:55PM +0100, Alexander Potapenko wrote:
> Hi all,
> 
> TL;DR userspace fuzzing may be very effective for finding bugs in
> crypto code, but might require some kernel-side changes.
> 
> When the attached binary file,
> crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae, is passed as payload
> to add_key("asymmetric", "desc", payload, len), it triggers a
> slab-out-of-bounds KASAN report (see below). This can be reproduced
> with the attached program, add_key.c.

The bug is in asn1_ber_decoder(): the special length value of 0x80, which
indicates an "indefinite length", is being passed to one of the "action"
functions as the real length, causing the input buffer to be overread.  Off-hand
I don't know the best fix, but probably it would involve calling
asn1_find_indefinite_length() before the "action" rather than after...

> 
> We found this bug by fuzzing asn1_ber_decoder() in the userspace using
> libFuzzer (http://llvm.org/docs/LibFuzzer.html).
> The trick is to compile and link together several translation units
> that provide a transitive closure of asn1_ber_decoder() (some of the
> kernel functions need to be re-implemented to run in the userspace).
> I've attached the resulting fuzzer as well (asn1_fuzzer.tar.gz). It
> can operate on its own, although better results are achieved when
> using the fuzzing corpus from
> https://github.com/google/boringssl/tree/master/fuzz and
> https://github.com/openssl/openssl/tree/master/fuzz.

I had actually tried almost the same thing recently, but with afl-fuzz instead
of libFuzzer.  But I don't think it was any better than what you did, and I
didn't find the above bug, perhaps because the AFL_HARDEN compiler options don't
include AddressSanitizer.

I did use the original unpreprocessed .c files instead of preprocessed ones, but
it required stubbing out quite a few headers.  Probably using preprocessed files
is a bit easier.

For the fuzzing function I had it call x509_cert_parse() rather than
asn1_ber_decode() directly, but with x509_get_sig_params() and
x509_check_for_self_signed() stubbed out.  It would be nice to get coverage of
x509_check_for_self_signed() because it will call into crypto/rsa.c and, among
other things, run yet another ASN.1 decoder...  But that would have required
porting a lot of the crypto API to userspace.

> 
> Could it be possible to decouple the parsing algorithms from the rest
> of kernel-specific crypto code? Maybe someone has other ideas about
> how this code can be ran in the userspace?
> 

One idea is that the ASN.1 parsers could be run with no "actions", which would
minimize dependencies on other kernel code.  But unfortunately that would omit a
lot of the interesting code.  Many (or most?) bugs will be in how the parsed
data is used, rather than in the parsing itself.  I don't think there is an easy
solution.

Note that separate from asymmetric_keys (which you can think of as being
in-between the keyrings subsystem and the crypto subsystem) there is also the
userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
to port a lot of the crypto API to userspace, but it would require a lot of work
to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
more thoroughly test AF_ALG.  For example it could be made aware of algorithm
templates so that it could try combining them in unusual ways.  (Example:
https://marc.info/?l=linux-crypto-vger&m\x148063683310477&w=2 was a NULL pointer
dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
i.e. the mcryptd template wrapping md5.)  Also,
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
self-tests are enabled.

Eric

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-20 21:42   ` Eric Biggers
@ 2017-11-21  8:00     ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-21  8:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Mon, Nov 20, 2017 at 10:42 PM, Eric Biggers <ebiggers@google.com> wrote:
> +Cc keyrings@vger.kernel.org (for asymmetric_keys)
>
> First of all, thanks for working on this!  A lot of this code really needs to be
> better tested.
>
> On Mon, Nov 20, 2017 at 03:10:55PM +0100, Alexander Potapenko wrote:
>> Hi all,
>>
>> TL;DR userspace fuzzing may be very effective for finding bugs in
>> crypto code, but might require some kernel-side changes.
>>
>> When the attached binary file,
>> crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae, is passed as payload
>> to add_key("asymmetric", "desc", payload, len), it triggers a
>> slab-out-of-bounds KASAN report (see below). This can be reproduced
>> with the attached program, add_key.c.
>
> The bug is in asn1_ber_decoder(): the special length value of 0x80, which
> indicates an "indefinite length", is being passed to one of the "action"
> functions as the real length, causing the input buffer to be overread.  Off-hand
> I don't know the best fix, but probably it would involve calling
> asn1_find_indefinite_length() before the "action" rather than after...
>
>>
>> We found this bug by fuzzing asn1_ber_decoder() in the userspace using
>> libFuzzer (http://llvm.org/docs/LibFuzzer.html).
>> The trick is to compile and link together several translation units
>> that provide a transitive closure of asn1_ber_decoder() (some of the
>> kernel functions need to be re-implemented to run in the userspace).
>> I've attached the resulting fuzzer as well (asn1_fuzzer.tar.gz). It
>> can operate on its own, although better results are achieved when
>> using the fuzzing corpus from
>> https://github.com/google/boringssl/tree/master/fuzz and
>> https://github.com/openssl/openssl/tree/master/fuzz.
>
> I had actually tried almost the same thing recently, but with afl-fuzz instead
> of libFuzzer.  But I don't think it was any better than what you did, and I
> didn't find the above bug, perhaps because the AFL_HARDEN compiler options don't
> include AddressSanitizer.
>
> I did use the original unpreprocessed .c files instead of preprocessed ones, but
> it required stubbing out quite a few headers.  Probably using preprocessed files
> is a bit easier.


We've also experimented with building x86_64 user-mode linux and then
linking vmlinux.o to a normal C program with main. Then the program
can directly call any kernel functions.
This required approximately the following:
 - weaken main in vmlinux.o (there is some binutil for this)
 - weaken all actual slab function and define them in the main program
forwarding to malloc/free
 - do the same for other called function that use global kernel
state/can't work without kernel initialization
 - add symbols denoting sections start/end (added by linker script in
linux build) to the main program (we've added them as just int's
because we just need symbols)

We did not get to the point where it actually works, but it looks like
this can work.
The advantage of this approach is that we reuse kernel Makefile to get
working vmlinux.o, and that almost all of this can be reused for other
fuzzers.


> For the fuzzing function I had it call x509_cert_parse() rather than
> asn1_ber_decode() directly, but with x509_get_sig_params() and
> x509_check_for_self_signed() stubbed out.  It would be nice to get coverage of
> x509_check_for_self_signed() because it will call into crypto/rsa.c and, among
> other things, run yet another ASN.1 decoder...  But that would have required
> porting a lot of the crypto API to userspace.
>
>>
>> Could it be possible to decouple the parsing algorithms from the rest
>> of kernel-specific crypto code? Maybe someone has other ideas about
>> how this code can be ran in the userspace?
>>
>
> One idea is that the ASN.1 parsers could be run with no "actions", which would
> minimize dependencies on other kernel code.  But unfortunately that would omit a
> lot of the interesting code.  Many (or most?) bugs will be in how the parsed
> data is used, rather than in the parsing itself.

Good point.

>  I don't think there is an easy
> solution.
>
> Note that separate from asymmetric_keys (which you can think of as being
> in-between the keyrings subsystem and the crypto subsystem) there is also the
> userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
> to port a lot of the crypto API to userspace, but it would require a lot of work
> to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
> more thoroughly test AF_ALG.  For example it could be made aware of algorithm
> templates so that it could try combining them in unusual ways.  (Example:
> https://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2 was a NULL pointer
> dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
> i.e. the mcryptd template wrapping md5.)  Also,
> CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
> self-tests are enabled.


Can you please outline all uncovered by the current syzkaller
descriptions parts? We should add least TODO's for them. Or maybe we
could just resolve them right away.

Thanks

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-21  8:00     ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-21  8:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Mon, Nov 20, 2017 at 10:42 PM, Eric Biggers <ebiggers@google.com> wrote:
> +Cc keyrings@vger.kernel.org (for asymmetric_keys)
>
> First of all, thanks for working on this!  A lot of this code really needs to be
> better tested.
>
> On Mon, Nov 20, 2017 at 03:10:55PM +0100, Alexander Potapenko wrote:
>> Hi all,
>>
>> TL;DR userspace fuzzing may be very effective for finding bugs in
>> crypto code, but might require some kernel-side changes.
>>
>> When the attached binary file,
>> crash-bbeda07d4ce60cf3b7fc20c3af92297e19f6dbae, is passed as payload
>> to add_key("asymmetric", "desc", payload, len), it triggers a
>> slab-out-of-bounds KASAN report (see below). This can be reproduced
>> with the attached program, add_key.c.
>
> The bug is in asn1_ber_decoder(): the special length value of 0x80, which
> indicates an "indefinite length", is being passed to one of the "action"
> functions as the real length, causing the input buffer to be overread.  Off-hand
> I don't know the best fix, but probably it would involve calling
> asn1_find_indefinite_length() before the "action" rather than after...
>
>>
>> We found this bug by fuzzing asn1_ber_decoder() in the userspace using
>> libFuzzer (http://llvm.org/docs/LibFuzzer.html).
>> The trick is to compile and link together several translation units
>> that provide a transitive closure of asn1_ber_decoder() (some of the
>> kernel functions need to be re-implemented to run in the userspace).
>> I've attached the resulting fuzzer as well (asn1_fuzzer.tar.gz). It
>> can operate on its own, although better results are achieved when
>> using the fuzzing corpus from
>> https://github.com/google/boringssl/tree/master/fuzz and
>> https://github.com/openssl/openssl/tree/master/fuzz.
>
> I had actually tried almost the same thing recently, but with afl-fuzz instead
> of libFuzzer.  But I don't think it was any better than what you did, and I
> didn't find the above bug, perhaps because the AFL_HARDEN compiler options don't
> include AddressSanitizer.
>
> I did use the original unpreprocessed .c files instead of preprocessed ones, but
> it required stubbing out quite a few headers.  Probably using preprocessed files
> is a bit easier.


We've also experimented with building x86_64 user-mode linux and then
linking vmlinux.o to a normal C program with main. Then the program
can directly call any kernel functions.
This required approximately the following:
 - weaken main in vmlinux.o (there is some binutil for this)
 - weaken all actual slab function and define them in the main program
forwarding to malloc/free
 - do the same for other called function that use global kernel
state/can't work without kernel initialization
 - add symbols denoting sections start/end (added by linker script in
linux build) to the main program (we've added them as just int's
because we just need symbols)

We did not get to the point where it actually works, but it looks like
this can work.
The advantage of this approach is that we reuse kernel Makefile to get
working vmlinux.o, and that almost all of this can be reused for other
fuzzers.


> For the fuzzing function I had it call x509_cert_parse() rather than
> asn1_ber_decode() directly, but with x509_get_sig_params() and
> x509_check_for_self_signed() stubbed out.  It would be nice to get coverage of
> x509_check_for_self_signed() because it will call into crypto/rsa.c and, among
> other things, run yet another ASN.1 decoder...  But that would have required
> porting a lot of the crypto API to userspace.
>
>>
>> Could it be possible to decouple the parsing algorithms from the rest
>> of kernel-specific crypto code? Maybe someone has other ideas about
>> how this code can be ran in the userspace?
>>
>
> One idea is that the ASN.1 parsers could be run with no "actions", which would
> minimize dependencies on other kernel code.  But unfortunately that would omit a
> lot of the interesting code.  Many (or most?) bugs will be in how the parsed
> data is used, rather than in the parsing itself.

Good point.

>  I don't think there is an easy
> solution.
>
> Note that separate from asymmetric_keys (which you can think of as being
> in-between the keyrings subsystem and the crypto subsystem) there is also the
> userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
> to port a lot of the crypto API to userspace, but it would require a lot of work
> to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
> more thoroughly test AF_ALG.  For example it could be made aware of algorithm
> templates so that it could try combining them in unusual ways.  (Example:
> https://marc.info/?l=linux-crypto-vger&m\x148063683310477&w=2 was a NULL pointer
> dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
> i.e. the mcryptd template wrapping md5.)  Also,
> CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
> self-tests are enabled.


Can you please outline all uncovered by the current syzkaller
descriptions parts? We should add least TODO's for them. Or maybe we
could just resolve them right away.

Thanks

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-21  8:00     ` Dmitry Vyukov
@ 2017-11-21 20:46       ` Eric Biggers
  -1 siblings, 0 replies; 61+ messages in thread
From: Eric Biggers @ 2017-11-21 20:46 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Tue, Nov 21, 2017 at 09:00:26AM +0100, Dmitry Vyukov wrote:
> >
> > Note that separate from asymmetric_keys (which you can think of as being
> > in-between the keyrings subsystem and the crypto subsystem) there is also the
> > userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
> > to port a lot of the crypto API to userspace, but it would require a lot of work
> > to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
> > more thoroughly test AF_ALG.  For example it could be made aware of algorithm
> > templates so that it could try combining them in unusual ways.  (Example:
> > https://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2 was a NULL pointer
> > dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
> > i.e. the mcryptd template wrapping md5.)  Also,
> > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
> > self-tests are enabled.
> 
> 
> Can you please outline all uncovered by the current syzkaller
> descriptions parts? We should add least TODO's for them. Or maybe we
> could just resolve them right away.
> 

Just focusing on the algorithm names, the syzkaller descriptions currently use a
fixed set of algorithm names:

	salg_name = "cmac(aes)", "ecb(aes)", "cbc(aes)", "hmac(sha1)", [...]

But algorithm names are not just fixed strings; you can create "new" algorithms
by composing templates.  For example "cmac(aes)" indicates the "cmac" template
instantiated using "aes" as the underlying block cipher.  But it could also be
"cmac(des)", "cmac(blowfish)", etc.  Templates can even take multiple arguments,
e.g. "gcm_base(ctr(aes),ghash)".

So ideally the descriptions would contain the list of all templates which might
be available in addition to all "primitive" algorithm names, then express that
an algorithm name has a syntax like:

	alg_name -> primitive_alg_name | template_name(alg_name[,alg_name]*)

To get the list of all "primitive" algorithm names which might be available you
can run:

	git grep -E '\.cra_(driver_)?name' | grep -o '".*"' | sort | uniq

It's a long list, though it doesn't distinguish between the different types of
algorithm (hash, symmetric cipher, AEAD, etc.), and not all are actually
accessible through AF_ALG.  Note that it still includes names with parentheses
because a module may directly implement an algorithm like "xts(aes)", which then
may be used instead of the template version.

And to get the list of templates which might be available you can run:

	git grep -A5 'struct crypto_template.*{' | grep '\.name' | grep -o '".*"' | sort

(There is probably more to improve for AF_ALG besides the algorithm names; this
is just what I happened to notice for now.)

Eric

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-21 20:46       ` Eric Biggers
  0 siblings, 0 replies; 61+ messages in thread
From: Eric Biggers @ 2017-11-21 20:46 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Tue, Nov 21, 2017 at 09:00:26AM +0100, Dmitry Vyukov wrote:
> >
> > Note that separate from asymmetric_keys (which you can think of as being
> > in-between the keyrings subsystem and the crypto subsystem) there is also the
> > userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
> > to port a lot of the crypto API to userspace, but it would require a lot of work
> > to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
> > more thoroughly test AF_ALG.  For example it could be made aware of algorithm
> > templates so that it could try combining them in unusual ways.  (Example:
> > https://marc.info/?l=linux-crypto-vger&m\x148063683310477&w=2 was a NULL pointer
> > dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
> > i.e. the mcryptd template wrapping md5.)  Also,
> > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
> > self-tests are enabled.
> 
> 
> Can you please outline all uncovered by the current syzkaller
> descriptions parts? We should add least TODO's for them. Or maybe we
> could just resolve them right away.
> 

Just focusing on the algorithm names, the syzkaller descriptions currently use a
fixed set of algorithm names:

	salg_name = "cmac(aes)", "ecb(aes)", "cbc(aes)", "hmac(sha1)", [...]

But algorithm names are not just fixed strings; you can create "new" algorithms
by composing templates.  For example "cmac(aes)" indicates the "cmac" template
instantiated using "aes" as the underlying block cipher.  But it could also be
"cmac(des)", "cmac(blowfish)", etc.  Templates can even take multiple arguments,
e.g. "gcm_base(ctr(aes),ghash)".

So ideally the descriptions would contain the list of all templates which might
be available in addition to all "primitive" algorithm names, then express that
an algorithm name has a syntax like:

	alg_name -> primitive_alg_name | template_name(alg_name[,alg_name]*)

To get the list of all "primitive" algorithm names which might be available you
can run:

	git grep -E '\.cra_(driver_)?name' | grep -o '".*"' | sort | uniq

It's a long list, though it doesn't distinguish between the different types of
algorithm (hash, symmetric cipher, AEAD, etc.), and not all are actually
accessible through AF_ALG.  Note that it still includes names with parentheses
because a module may directly implement an algorithm like "xts(aes)", which then
may be used instead of the template version.

And to get the list of templates which might be available you can run:

	git grep -A5 'struct crypto_template.*{' | grep '\.name' | grep -o '".*"' | sort

(There is probably more to improve for AF_ALG besides the algorithm names; this
is just what I happened to notice for now.)

Eric

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-21 20:46       ` Eric Biggers
@ 2017-11-22 10:44         ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-22 10:44 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Tue, Nov 21, 2017 at 9:46 PM, Eric Biggers <ebiggers@google.com> wrote:
> On Tue, Nov 21, 2017 at 09:00:26AM +0100, Dmitry Vyukov wrote:
>> >
>> > Note that separate from asymmetric_keys (which you can think of as being
>> > in-between the keyrings subsystem and the crypto subsystem) there is also the
>> > userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
>> > to port a lot of the crypto API to userspace, but it would require a lot of work
>> > to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
>> > more thoroughly test AF_ALG.  For example it could be made aware of algorithm
>> > templates so that it could try combining them in unusual ways.  (Example:
>> > https://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2 was a NULL pointer
>> > dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
>> > i.e. the mcryptd template wrapping md5.)  Also,
>> > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
>> > self-tests are enabled.
>>
>>
>> Can you please outline all uncovered by the current syzkaller
>> descriptions parts? We should add least TODO's for them. Or maybe we
>> could just resolve them right away.
>>
>
> Just focusing on the algorithm names, the syzkaller descriptions currently use a
> fixed set of algorithm names:
>
>         salg_name = "cmac(aes)", "ecb(aes)", "cbc(aes)", "hmac(sha1)", [...]
>
> But algorithm names are not just fixed strings; you can create "new" algorithms
> by composing templates.  For example "cmac(aes)" indicates the "cmac" template
> instantiated using "aes" as the underlying block cipher.  But it could also be
> "cmac(des)", "cmac(blowfish)", etc.  Templates can even take multiple arguments,
> e.g. "gcm_base(ctr(aes),ghash)".
>
> So ideally the descriptions would contain the list of all templates which might
> be available in addition to all "primitive" algorithm names, then express that
> an algorithm name has a syntax like:
>
>         alg_name -> primitive_alg_name | template_name(alg_name[,alg_name]*)
>
> To get the list of all "primitive" algorithm names which might be available you
> can run:
>
>         git grep -E '\.cra_(driver_)?name' | grep -o '".*"' | sort | uniq
>
> It's a long list, though it doesn't distinguish between the different types of
> algorithm (hash, symmetric cipher, AEAD, etc.), and not all are actually
> accessible through AF_ALG.  Note that it still includes names with parentheses
> because a module may directly implement an algorithm like "xts(aes)", which then
> may be used instead of the template version.
>
> And to get the list of templates which might be available you can run:
>
>         git grep -A5 'struct crypto_template.*{' | grep '\.name' | grep -o '".*"' | sort
>
> (There is probably more to improve for AF_ALG besides the algorithm names; this
> is just what I happened to notice for now.)


Thanks! I think we can incorporate this into syzkaller.

One question: what's the relation between alg names and type ("aead",
"hash", "rng", "skcipher")? I remember I already looked at it before
and could not figure it out. Are all algorithms and templates
partitioned between types? Or they are orthogonal?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-22 10:44         ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-22 10:44 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Alexander Potapenko, linux-crypto, Kostya Serebryany, keyrings,
	Andrey Konovalov

On Tue, Nov 21, 2017 at 9:46 PM, Eric Biggers <ebiggers@google.com> wrote:
> On Tue, Nov 21, 2017 at 09:00:26AM +0100, Dmitry Vyukov wrote:
>> >
>> > Note that separate from asymmetric_keys (which you can think of as being
>> > in-between the keyrings subsystem and the crypto subsystem) there is also the
>> > userspace interface to cryptographic algorithms, AF_ALG.  It might be possible
>> > to port a lot of the crypto API to userspace, but it would require a lot of work
>> > to stub things out.  Maybe a simpler improvement would be to teach syzkaller to
>> > more thoroughly test AF_ALG.  For example it could be made aware of algorithm
>> > templates so that it could try combining them in unusual ways.  (Example:
>> > https://marc.info/?l=linux-crypto-vger&m\x148063683310477&w=2 was a NULL pointer
>> > dereference bug that occurred if you asked to use the algorithm "mcryptd(md5)",
>> > i.e. the mcryptd template wrapping md5.)  Also,
>> > CONFIG_CRYPTO_MANAGER_DISABLE_TESTS should be unset, so that the crypto
>> > self-tests are enabled.
>>
>>
>> Can you please outline all uncovered by the current syzkaller
>> descriptions parts? We should add least TODO's for them. Or maybe we
>> could just resolve them right away.
>>
>
> Just focusing on the algorithm names, the syzkaller descriptions currently use a
> fixed set of algorithm names:
>
>         salg_name = "cmac(aes)", "ecb(aes)", "cbc(aes)", "hmac(sha1)", [...]
>
> But algorithm names are not just fixed strings; you can create "new" algorithms
> by composing templates.  For example "cmac(aes)" indicates the "cmac" template
> instantiated using "aes" as the underlying block cipher.  But it could also be
> "cmac(des)", "cmac(blowfish)", etc.  Templates can even take multiple arguments,
> e.g. "gcm_base(ctr(aes),ghash)".
>
> So ideally the descriptions would contain the list of all templates which might
> be available in addition to all "primitive" algorithm names, then express that
> an algorithm name has a syntax like:
>
>         alg_name -> primitive_alg_name | template_name(alg_name[,alg_name]*)
>
> To get the list of all "primitive" algorithm names which might be available you
> can run:
>
>         git grep -E '\.cra_(driver_)?name' | grep -o '".*"' | sort | uniq
>
> It's a long list, though it doesn't distinguish between the different types of
> algorithm (hash, symmetric cipher, AEAD, etc.), and not all are actually
> accessible through AF_ALG.  Note that it still includes names with parentheses
> because a module may directly implement an algorithm like "xts(aes)", which then
> may be used instead of the template version.
>
> And to get the list of templates which might be available you can run:
>
>         git grep -A5 'struct crypto_template.*{' | grep '\.name' | grep -o '".*"' | sort
>
> (There is probably more to improve for AF_ALG besides the algorithm names; this
> is just what I happened to notice for now.)


Thanks! I think we can incorporate this into syzkaller.

One question: what's the relation between alg names and type ("aead",
"hash", "rng", "skcipher")? I remember I already looked at it before
and could not figure it out. Are all algorithms and templates
partitioned between types? Or they are orthogonal?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-21 20:46       ` Eric Biggers
@ 2017-11-22 16:54         ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 16:54 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Dmitry Vyukov, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

[-- Attachment #1: Type: text/plain, Size: 446 bytes --]

Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:

Hi Eric,

> 
> (There is probably more to improve for AF_ALG besides the algorithm names;
> this is just what I happened to notice for now.)

Just grepping may not cover all possibilities. Attached is a script that I use 
to invoke an array different tests for different cipher implementations. For 
now, it only covers C, ASM and CPU-based cipher implementations.

Ciao
Stephan

[-- Attachment #2: cavs_exec_kcapi.sh --]
[-- Type: application/x-shellscript, Size: 14797 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-22 16:54         ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 16:54 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Dmitry Vyukov, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

[-- Attachment #1: Type: text/plain, Size: 446 bytes --]

Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:

Hi Eric,

> 
> (There is probably more to improve for AF_ALG besides the algorithm names;
> this is just what I happened to notice for now.)

Just grepping may not cover all possibilities. Attached is a script that I use 
to invoke an array different tests for different cipher implementations. For 
now, it only covers C, ASM and CPU-based cipher implementations.

Ciao
Stephan

[-- Attachment #2: cavs_exec_kcapi.sh --]
[-- Type: application/x-shellscript, Size: 14797 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-22 16:54         ` Stephan Mueller
@ 2017-11-22 17:03           ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-22 17:03 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Wed, Nov 22, 2017 at 5:54 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:
>
> Hi Eric,
>
>>
>> (There is probably more to improve for AF_ALG besides the algorithm names;
>> this is just what I happened to notice for now.)
>
> Just grepping may not cover all possibilities. Attached is a script that I use
> to invoke an array different tests for different cipher implementations. For
> now, it only covers C, ASM and CPU-based cipher implementations.

Hi Stephan,

I see it has lots of names hardcoded. Is it possible to extract
up-to-date list from kernel? Maybe at runtime from running kernel?

What's the max number of arguments for a template? I see there is at least 2:
  rfc4106(gcm_base(ctr(aes-aesni),ghash-clmulni))
can there be more?

Do you know answer to this question by any chance?
what's the relation between alg names and type ("aead", "hash", "rng",
"skcipher")? I remember I already looked at it before and could not
figure it out. Are all algorithms and templates partitioned between
types? Or they are orthogonal?

Thanks

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-22 17:03           ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-22 17:03 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Wed, Nov 22, 2017 at 5:54 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:
>
> Hi Eric,
>
>>
>> (There is probably more to improve for AF_ALG besides the algorithm names;
>> this is just what I happened to notice for now.)
>
> Just grepping may not cover all possibilities. Attached is a script that I use
> to invoke an array different tests for different cipher implementations. For
> now, it only covers C, ASM and CPU-based cipher implementations.

Hi Stephan,

I see it has lots of names hardcoded. Is it possible to extract
up-to-date list from kernel? Maybe at runtime from running kernel?

What's the max number of arguments for a template? I see there is at least 2:
  rfc4106(gcm_base(ctr(aes-aesni),ghash-clmulni))
can there be more?

Do you know answer to this question by any chance?
what's the relation between alg names and type ("aead", "hash", "rng",
"skcipher")? I remember I already looked at it before and could not
figure it out. Are all algorithms and templates partitioned between
types? Or they are orthogonal?

Thanks

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-22 10:44         ` Dmitry Vyukov
@ 2017-11-22 17:08           ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 17:08 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> 
> Thanks! I think we can incorporate this into syzkaller.
> 
> One question: what's the relation between alg names and type ("aead",
> "hash", "rng", "skcipher")?

If you refer to AF_ALG, then the following applies:

AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*

AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*, 
skcipher_request_*

AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*

AF_ALG type of rng -> kernel crypto API: crypto_rng_*


> I remember I already looked at it before
> and could not figure it out. Are all algorithms and templates
> partitioned between types? Or they are orthogonal?

If you refer to the cipher names, there are two types: templates and cipher 
implementations. See [1] for details. [2] gives you some ideas of the 
interrelationships between the templates and the ciphers.

The relationship between the names and the AF_ALG names mentioned above is 
defined with the .cra_flags in the cipher specification of each cipher 
implementation. See [3] for details.

Note, I started some fuzzing work in my libkcapi test application. See [4] for 
the implementation.

[1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority

[2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api

[3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks

[4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line 
522

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-22 17:08           ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 17:08 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> 
> Thanks! I think we can incorporate this into syzkaller.
> 
> One question: what's the relation between alg names and type ("aead",
> "hash", "rng", "skcipher")?

If you refer to AF_ALG, then the following applies:

AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*

AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*, 
skcipher_request_*

AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*

AF_ALG type of rng -> kernel crypto API: crypto_rng_*


> I remember I already looked at it before
> and could not figure it out. Are all algorithms and templates
> partitioned between types? Or they are orthogonal?

If you refer to the cipher names, there are two types: templates and cipher 
implementations. See [1] for details. [2] gives you some ideas of the 
interrelationships between the templates and the ciphers.

The relationship between the names and the AF_ALG names mentioned above is 
defined with the .cra_flags in the cipher specification of each cipher 
implementation. See [3] for details.

Note, I started some fuzzing work in my libkcapi test application. See [4] for 
the implementation.

[1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority

[2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api

[3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks

[4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line 
522

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-22 17:03           ` Dmitry Vyukov
@ 2017-11-22 17:15             ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 17:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Mittwoch, 22. November 2017, 18:03:14 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Wed, Nov 22, 2017 at 5:54 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:
> > 
> > Hi Eric,
> > 
> >> (There is probably more to improve for AF_ALG besides the algorithm
> >> names;
> >> this is just what I happened to notice for now.)
> > 
> > Just grepping may not cover all possibilities. Attached is a script that I
> > use to invoke an array different tests for different cipher
> > implementations. For now, it only covers C, ASM and CPU-based cipher
> > implementations.
> 
> Hi Stephan,
> 
> I see it has lots of names hardcoded. Is it possible to extract
> up-to-date list from kernel? Maybe at runtime from running kernel?

Nope, this is currently not possible because the names where templates are 
used are "created" on the fly. I.e. the kernel parses the name up to a 
paranthesis and tries to allocate that name.

Thus, the content of /proc/crypto is NOT complete per definition as it only 
contains registered ciphers and allocated templates/cipher combos.
> 
> What's the max number of arguments for a template? I see there is at least
> 2: rfc4106(gcm_base(ctr(aes-aesni),ghash-clmulni))
> can there be more?

This is always defined by an implementation. For gcm_base, you see that as 
follows: crypto/gcm.c: see all invocations of crypto_gcm_create_common where 
the last but one argument is the CTR implementation and the last argument is 
the GHASH implementation.
> 
> Do you know answer to this question by any chance?
> what's the relation between alg names and type ("aead", "hash", "rng",
> "skcipher")? I remember I already looked at it before and could not
> figure it out. Are all algorithms and templates partitioned between
> types? Or they are orthogonal?

See other email.

> 
> Thanks
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-22 17:15             ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-22 17:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Mittwoch, 22. November 2017, 18:03:14 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Wed, Nov 22, 2017 at 5:54 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Dienstag, 21. November 2017, 21:46:28 CET schrieb Eric Biggers:
> > 
> > Hi Eric,
> > 
> >> (There is probably more to improve for AF_ALG besides the algorithm
> >> names;
> >> this is just what I happened to notice for now.)
> > 
> > Just grepping may not cover all possibilities. Attached is a script that I
> > use to invoke an array different tests for different cipher
> > implementations. For now, it only covers C, ASM and CPU-based cipher
> > implementations.
> 
> Hi Stephan,
> 
> I see it has lots of names hardcoded. Is it possible to extract
> up-to-date list from kernel? Maybe at runtime from running kernel?

Nope, this is currently not possible because the names where templates are 
used are "created" on the fly. I.e. the kernel parses the name up to a 
paranthesis and tries to allocate that name.

Thus, the content of /proc/crypto is NOT complete per definition as it only 
contains registered ciphers and allocated templates/cipher combos.
> 
> What's the max number of arguments for a template? I see there is at least
> 2: rfc4106(gcm_base(ctr(aes-aesni),ghash-clmulni))
> can there be more?

This is always defined by an implementation. For gcm_base, you see that as 
follows: crypto/gcm.c: see all invocations of crypto_gcm_create_common where 
the last but one argument is the CTR implementation and the last argument is 
the GHASH implementation.
> 
> Do you know answer to this question by any chance?
> what's the relation between alg names and type ("aead", "hash", "rng",
> "skcipher")? I remember I already looked at it before and could not
> figure it out. Are all algorithms and templates partitioned between
> types? Or they are orthogonal?

See other email.

> 
> Thanks
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-22 17:08           ` Stephan Mueller
@ 2017-11-23  9:32             ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:32 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>>
>> Thanks! I think we can incorporate this into syzkaller.
>>
>> One question: what's the relation between alg names and type ("aead",
>> "hash", "rng", "skcipher")?
>
> If you refer to AF_ALG, then the following applies:
>
> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>
> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
> skcipher_request_*
>
> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>
> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>
>
>> I remember I already looked at it before
>> and could not figure it out. Are all algorithms and templates
>> partitioned between types? Or they are orthogonal?
>
> If you refer to the cipher names, there are two types: templates and cipher
> implementations. See [1] for details. [2] gives you some ideas of the
> interrelationships between the templates and the ciphers.
>
> The relationship between the names and the AF_ALG names mentioned above is
> defined with the .cra_flags in the cipher specification of each cipher
> implementation. See [3] for details.
>
> Note, I started some fuzzing work in my libkcapi test application. See [4] for
> the implementation.
>
> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>
> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>
> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>
> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
> 522


I've read the links and starring at the code, but still can't get it.
The question is about textual type names in sockaddr.
.cra_flags does not specify textual names.
[3] again talks about int flags rather than textual names.

I see they are used here:
http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api

but it merely says:

    .salg_type = "aead", /* this selects the symmetric cipher */
    .salg_name = "gcm(aes)" /* this is the cipher name */

and does not explain why it is must be "aead" for  "gcm(aes)", nor why
would "skcipher" fail for  "gcm(aes)" (would it?).

These integer flags in sockaddr_alg.feat/mask seem to be better always
be 0 (because they can only fail an otherwise passing bind, right?).
But the textual type seems to matter, because bind first looks at type
and then everything else happens as callbacks on type.

I've found these guys:

tatic const struct crypto_type crypto_skcipher_type2 = {
.extsize = crypto_skcipher_extsize,
.init_tfm = crypto_skcipher_init_tfm,
.free = crypto_skcipher_free_instance,
#ifdef CONFIG_PROC_FS
.show = crypto_skcipher_show,
#endif
.report = crypto_skcipher_report,
.maskclear = ~CRYPTO_ALG_TYPE_MASK,
.maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
.type = CRYPTO_ALG_TYPE_SKCIPHER,
.tfmsize = offsetof(struct crypto_skcipher, base),
};

But it still does not make sense to me.
 CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
constants....

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23  9:32             ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:32 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>>
>> Thanks! I think we can incorporate this into syzkaller.
>>
>> One question: what's the relation between alg names and type ("aead",
>> "hash", "rng", "skcipher")?
>
> If you refer to AF_ALG, then the following applies:
>
> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>
> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
> skcipher_request_*
>
> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>
> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>
>
>> I remember I already looked at it before
>> and could not figure it out. Are all algorithms and templates
>> partitioned between types? Or they are orthogonal?
>
> If you refer to the cipher names, there are two types: templates and cipher
> implementations. See [1] for details. [2] gives you some ideas of the
> interrelationships between the templates and the ciphers.
>
> The relationship between the names and the AF_ALG names mentioned above is
> defined with the .cra_flags in the cipher specification of each cipher
> implementation. See [3] for details.
>
> Note, I started some fuzzing work in my libkcapi test application. See [4] for
> the implementation.
>
> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>
> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>
> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>
> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
> 522


I've read the links and starring at the code, but still can't get it.
The question is about textual type names in sockaddr.
.cra_flags does not specify textual names.
[3] again talks about int flags rather than textual names.

I see they are used here:
http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api

but it merely says:

    .salg_type = "aead", /* this selects the symmetric cipher */
    .salg_name = "gcm(aes)" /* this is the cipher name */

and does not explain why it is must be "aead" for  "gcm(aes)", nor why
would "skcipher" fail for  "gcm(aes)" (would it?).

These integer flags in sockaddr_alg.feat/mask seem to be better always
be 0 (because they can only fail an otherwise passing bind, right?).
But the textual type seems to matter, because bind first looks at type
and then everything else happens as callbacks on type.

I've found these guys:

tatic const struct crypto_type crypto_skcipher_type2 = {
.extsize = crypto_skcipher_extsize,
.init_tfm = crypto_skcipher_init_tfm,
.free = crypto_skcipher_free_instance,
#ifdef CONFIG_PROC_FS
.show = crypto_skcipher_show,
#endif
.report = crypto_skcipher_report,
.maskclear = ~CRYPTO_ALG_TYPE_MASK,
.maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
.type = CRYPTO_ALG_TYPE_SKCIPHER,
.tfmsize = offsetof(struct crypto_skcipher, base),
};

But it still does not make sense to me.
 CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
constants....

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23  9:32             ` Dmitry Vyukov
@ 2017-11-23  9:35               ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:35 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 10:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
>> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>>
>> Hi Dmitry,
>>
>>>
>>> Thanks! I think we can incorporate this into syzkaller.
>>>
>>> One question: what's the relation between alg names and type ("aead",
>>> "hash", "rng", "skcipher")?
>>
>> If you refer to AF_ALG, then the following applies:
>>
>> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>>
>> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
>> skcipher_request_*
>>
>> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>>
>> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>>
>>
>>> I remember I already looked at it before
>>> and could not figure it out. Are all algorithms and templates
>>> partitioned between types? Or they are orthogonal?
>>
>> If you refer to the cipher names, there are two types: templates and cipher
>> implementations. See [1] for details. [2] gives you some ideas of the
>> interrelationships between the templates and the ciphers.
>>
>> The relationship between the names and the AF_ALG names mentioned above is
>> defined with the .cra_flags in the cipher specification of each cipher
>> implementation. See [3] for details.
>>
>> Note, I started some fuzzing work in my libkcapi test application. See [4] for
>> the implementation.
>>
>> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>>
>> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>>
>> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>>
>> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
>> 522
>
>
> I've read the links and starring at the code, but still can't get it.
> The question is about textual type names in sockaddr.
> .cra_flags does not specify textual names.
> [3] again talks about int flags rather than textual names.
>
> I see they are used here:
> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>
> but it merely says:
>
>     .salg_type = "aead", /* this selects the symmetric cipher */
>     .salg_name = "gcm(aes)" /* this is the cipher name */
>
> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
> would "skcipher" fail for  "gcm(aes)" (would it?).
>
> These integer flags in sockaddr_alg.feat/mask seem to be better always
> be 0 (because they can only fail an otherwise passing bind, right?).
> But the textual type seems to matter, because bind first looks at type
> and then everything else happens as callbacks on type.
>
> I've found these guys:
>
> tatic const struct crypto_type crypto_skcipher_type2 = {
> .extsize = crypto_skcipher_extsize,
> .init_tfm = crypto_skcipher_init_tfm,
> .free = crypto_skcipher_free_instance,
> #ifdef CONFIG_PROC_FS
> .show = crypto_skcipher_show,
> #endif
> .report = crypto_skcipher_report,
> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
> .type = CRYPTO_ALG_TYPE_SKCIPHER,
> .tfmsize = offsetof(struct crypto_skcipher, base),
> };
>
> But it still does not make sense to me.
>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
> constants....


Also, there seems to be only 4 types ("aead", "hash", "rng",
"skcipher"), but more algorithm types. For example, how do I get
access to ACOMPRESS/SCOMPRESS?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23  9:35               ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:35 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 10:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
>> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>>
>> Hi Dmitry,
>>
>>>
>>> Thanks! I think we can incorporate this into syzkaller.
>>>
>>> One question: what's the relation between alg names and type ("aead",
>>> "hash", "rng", "skcipher")?
>>
>> If you refer to AF_ALG, then the following applies:
>>
>> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>>
>> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
>> skcipher_request_*
>>
>> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>>
>> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>>
>>
>>> I remember I already looked at it before
>>> and could not figure it out. Are all algorithms and templates
>>> partitioned between types? Or they are orthogonal?
>>
>> If you refer to the cipher names, there are two types: templates and cipher
>> implementations. See [1] for details. [2] gives you some ideas of the
>> interrelationships between the templates and the ciphers.
>>
>> The relationship between the names and the AF_ALG names mentioned above is
>> defined with the .cra_flags in the cipher specification of each cipher
>> implementation. See [3] for details.
>>
>> Note, I started some fuzzing work in my libkcapi test application. See [4] for
>> the implementation.
>>
>> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>>
>> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>>
>> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>>
>> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
>> 522
>
>
> I've read the links and starring at the code, but still can't get it.
> The question is about textual type names in sockaddr.
> .cra_flags does not specify textual names.
> [3] again talks about int flags rather than textual names.
>
> I see they are used here:
> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>
> but it merely says:
>
>     .salg_type = "aead", /* this selects the symmetric cipher */
>     .salg_name = "gcm(aes)" /* this is the cipher name */
>
> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
> would "skcipher" fail for  "gcm(aes)" (would it?).
>
> These integer flags in sockaddr_alg.feat/mask seem to be better always
> be 0 (because they can only fail an otherwise passing bind, right?).
> But the textual type seems to matter, because bind first looks at type
> and then everything else happens as callbacks on type.
>
> I've found these guys:
>
> tatic const struct crypto_type crypto_skcipher_type2 = {
> .extsize = crypto_skcipher_extsize,
> .init_tfm = crypto_skcipher_init_tfm,
> .free = crypto_skcipher_free_instance,
> #ifdef CONFIG_PROC_FS
> .show = crypto_skcipher_show,
> #endif
> .report = crypto_skcipher_report,
> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
> .type = CRYPTO_ALG_TYPE_SKCIPHER,
> .tfmsize = offsetof(struct crypto_skcipher, base),
> };
>
> But it still does not make sense to me.
>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
> constants....


Also, there seems to be only 4 types ("aead", "hash", "rng",
"skcipher"), but more algorithm types. For example, how do I get
access to ACOMPRESS/SCOMPRESS?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23  9:35               ` Dmitry Vyukov
@ 2017-11-23  9:37                 ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:37 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 10:35 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Nov 23, 2017 at 10:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
>>> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>>>
>>> Hi Dmitry,
>>>
>>>>
>>>> Thanks! I think we can incorporate this into syzkaller.
>>>>
>>>> One question: what's the relation between alg names and type ("aead",
>>>> "hash", "rng", "skcipher")?
>>>
>>> If you refer to AF_ALG, then the following applies:
>>>
>>> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>>>
>>> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
>>> skcipher_request_*
>>>
>>> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>>>
>>> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>>>
>>>
>>>> I remember I already looked at it before
>>>> and could not figure it out. Are all algorithms and templates
>>>> partitioned between types? Or they are orthogonal?
>>>
>>> If you refer to the cipher names, there are two types: templates and cipher
>>> implementations. See [1] for details. [2] gives you some ideas of the
>>> interrelationships between the templates and the ciphers.
>>>
>>> The relationship between the names and the AF_ALG names mentioned above is
>>> defined with the .cra_flags in the cipher specification of each cipher
>>> implementation. See [3] for details.
>>>
>>> Note, I started some fuzzing work in my libkcapi test application. See [4] for
>>> the implementation.
>>>
>>> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>>>
>>> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>>>
>>> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>>>
>>> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
>>> 522
>>
>>
>> I've read the links and starring at the code, but still can't get it.
>> The question is about textual type names in sockaddr.
>> .cra_flags does not specify textual names.
>> [3] again talks about int flags rather than textual names.
>>
>> I see they are used here:
>> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>>
>> but it merely says:
>>
>>     .salg_type = "aead", /* this selects the symmetric cipher */
>>     .salg_name = "gcm(aes)" /* this is the cipher name */
>>
>> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>> would "skcipher" fail for  "gcm(aes)" (would it?).
>>
>> These integer flags in sockaddr_alg.feat/mask seem to be better always
>> be 0 (because they can only fail an otherwise passing bind, right?).
>> But the textual type seems to matter, because bind first looks at type
>> and then everything else happens as callbacks on type.
>>
>> I've found these guys:
>>
>> tatic const struct crypto_type crypto_skcipher_type2 = {
>> .extsize = crypto_skcipher_extsize,
>> .init_tfm = crypto_skcipher_init_tfm,
>> .free = crypto_skcipher_free_instance,
>> #ifdef CONFIG_PROC_FS
>> .show = crypto_skcipher_show,
>> #endif
>> .report = crypto_skcipher_report,
>> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> .tfmsize = offsetof(struct crypto_skcipher, base),
>> };
>>
>> But it still does not make sense to me.
>>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
>> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>> constants....
>
>
> Also, there seems to be only 4 types ("aead", "hash", "rng",
> "skcipher"), but more algorithm types. For example, how do I get
> access to ACOMPRESS/SCOMPRESS?

Looking at /proc/crypto confuses me even more:

$ cat /proc/crypto  | grep type | sort | uniq
type         : ablkcipher
type         : aead
type         : ahash
type         : akcipher
type         : blkcipher
type         : cipher
type         : compression
type         : givcipher
type         : rng
type         : shash

Now, there are 10 types. They partially intersect with the other
textual types (e.g. "aead"). But, say "skcipher" is not present in
/proc/crypto at all.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23  9:37                 ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23  9:37 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 10:35 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Nov 23, 2017 at 10:32 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Wed, Nov 22, 2017 at 6:08 PM, Stephan Mueller <smueller@chronox.de> wrote:
>>> Am Mittwoch, 22. November 2017, 11:44:51 CET schrieb Dmitry Vyukov:
>>>
>>> Hi Dmitry,
>>>
>>>>
>>>> Thanks! I think we can incorporate this into syzkaller.
>>>>
>>>> One question: what's the relation between alg names and type ("aead",
>>>> "hash", "rng", "skcipher")?
>>>
>>> If you refer to AF_ALG, then the following applies:
>>>
>>> AF_ALG type of aead -> kernel crypto API: crypto_aead_*, aead_request_*
>>>
>>> AF_ALG type of skcipher -> kernel crypto API: crypto_skcipher_*,
>>> skcipher_request_*
>>>
>>> AF_ALG type of hash -> kernel crypto API: crypto_ahash_*, ahash_request_*
>>>
>>> AF_ALG type of rng -> kernel crypto API: crypto_rng_*
>>>
>>>
>>>> I remember I already looked at it before
>>>> and could not figure it out. Are all algorithms and templates
>>>> partitioned between types? Or they are orthogonal?
>>>
>>> If you refer to the cipher names, there are two types: templates and cipher
>>> implementations. See [1] for details. [2] gives you some ideas of the
>>> interrelationships between the templates and the ciphers.
>>>
>>> The relationship between the names and the AF_ALG names mentioned above is
>>> defined with the .cra_flags in the cipher specification of each cipher
>>> implementation. See [3] for details.
>>>
>>> Note, I started some fuzzing work in my libkcapi test application. See [4] for
>>> the implementation.
>>>
>>> [1] http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority
>>>
>>> [2] http://www.chronox.de/crypto-API/crypto/architecture.html#internal-structure-of-kernel-crypto-api
>>>
>>> [3] http://www.chronox.de/crypto-API/crypto/architecture.html#cipher-allocation-type-and-masks
>>>
>>> [4] https://github.com/smuellerDD/libkcapi/blob/master/test/kcapi-main.c line
>>> 522
>>
>>
>> I've read the links and starring at the code, but still can't get it.
>> The question is about textual type names in sockaddr.
>> .cra_flags does not specify textual names.
>> [3] again talks about int flags rather than textual names.
>>
>> I see they are used here:
>> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>>
>> but it merely says:
>>
>>     .salg_type = "aead", /* this selects the symmetric cipher */
>>     .salg_name = "gcm(aes)" /* this is the cipher name */
>>
>> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>> would "skcipher" fail for  "gcm(aes)" (would it?).
>>
>> These integer flags in sockaddr_alg.feat/mask seem to be better always
>> be 0 (because they can only fail an otherwise passing bind, right?).
>> But the textual type seems to matter, because bind first looks at type
>> and then everything else happens as callbacks on type.
>>
>> I've found these guys:
>>
>> tatic const struct crypto_type crypto_skcipher_type2 = {
>> .extsize = crypto_skcipher_extsize,
>> .init_tfm = crypto_skcipher_init_tfm,
>> .free = crypto_skcipher_free_instance,
>> #ifdef CONFIG_PROC_FS
>> .show = crypto_skcipher_show,
>> #endif
>> .report = crypto_skcipher_report,
>> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> .tfmsize = offsetof(struct crypto_skcipher, base),
>> };
>>
>> But it still does not make sense to me.
>>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual algorithms.
>> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>> constants....
>
>
> Also, there seems to be only 4 types ("aead", "hash", "rng",
> "skcipher"), but more algorithm types. For example, how do I get
> access to ACOMPRESS/SCOMPRESS?

Looking at /proc/crypto confuses me even more:

$ cat /proc/crypto  | grep type | sort | uniq
type         : ablkcipher
type         : aead
type         : ahash
type         : akcipher
type         : blkcipher
type         : cipher
type         : compression
type         : givcipher
type         : rng
type         : shash

Now, there are 10 types. They partially intersect with the other
textual types (e.g. "aead"). But, say "skcipher" is not present in
/proc/crypto at all.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23  9:37                 ` Dmitry Vyukov
@ 2017-11-23 11:10                   ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 11:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 10:37:35 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> >> I've read the links and starring at the code, but still can't get it.
> >> The question is about textual type names in sockaddr.
> >> .cra_flags does not specify textual names.
> >> [3] again talks about int flags rather than textual names.
> >> 
> >> I see they are used here:
> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
> >> 
> >> but it merely says:
> >>     .salg_type = "aead", /* this selects the symmetric cipher */
> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
> >> 
> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
> >> would "skcipher" fail for  "gcm(aes)" (would it?).
> >> 
> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
> >> be 0 (because they can only fail an otherwise passing bind, right?).
> >> But the textual type seems to matter, because bind first looks at type
> >> and then everything else happens as callbacks on type.
> >> 
> >> I've found these guys:
> >> 
> >> tatic const struct crypto_type crypto_skcipher_type2 = {
> >> .extsize = crypto_skcipher_extsize,
> >> .init_tfm = crypto_skcipher_init_tfm,
> >> .free = crypto_skcipher_free_instance,
> >> #ifdef CONFIG_PROC_FS
> >> .show = crypto_skcipher_show,
> >> #endif
> >> .report = crypto_skcipher_report,
> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
> >> .tfmsize = offsetof(struct crypto_skcipher, base),
> >> };
> >> 
> >> But it still does not make sense to me.
> >> 
> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
> >>  algorithms.
> >> 
> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
> >> constants....
> > 
> > Also, there seems to be only 4 types ("aead", "hash", "rng",
> > "skcipher"), but more algorithm types. For example, how do I get
> > access to ACOMPRESS/SCOMPRESS?
> 
> Looking at /proc/crypto confuses me even more:
> 
> $ cat /proc/crypto  | grep type | sort | uniq
> type         : ablkcipher
> type         : aead
> type         : ahash
> type         : akcipher
> type         : blkcipher
> type         : cipher
> type         : compression
> type         : givcipher
> type         : rng
> type         : shash
> 
> Now, there are 10 types. They partially intersect with the other
> textual types (e.g. "aead"). But, say "skcipher" is not present in
> /proc/crypto at all.

The types that a cipher can implement is given in include/linux/crypto.h:

/*
 * Algorithm masks and types.
 */
#define CRYPTO_ALG_TYPE_MASK            0x0000000f
#define CRYPTO_ALG_TYPE_CIPHER          0x00000001
...

These types are resolved when the various crypto_alloc_* functions are 
invoked. For example

static const struct crypto_type crypto_skcipher_type2 = {
...
        .type = CRYPTO_ALG_TYPE_SKCIPHER,
...
};

struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
                                              u32 type, u32 mask)
{
        return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
}

Thus, when you use crypto_alloc_skcipher, it will only "find" cipher 
implementations that are of type SKCIPHER (or ABLKCIPHER as both values are 
identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"), 
the lookup code will not find it as the sha256 implementation is of type AHASH 
(or SHASH) and not SKCIPHER.

The name you see in /proc/crypto are given with the respective report function 
call (e.g. crypto_skcipher_report for the aforementioned example). These names 
are just informative and not relevant at all for anything.


When you come to the AF_ALG interface, the used names of "skcipher" or "aead" 
are *not* related to the names you see in /proc/crypto. They are simply 
identifiers referring to the different AF_ALG handler callbacks. For example, 
crypto/algif_skcipher.c:

static const struct af_alg_type algif_type_skcipher = {
...
        .name           =       "skcipher",
...
};

This name is used to find the right AF_ALG handler in alg_bind:

        type = alg_get_type(sa->salg_type);
        if (IS_ERR(type) && PTR_ERR(type) == -ENOENT) {
                request_module("algif-%s", sa->salg_type);
                type = alg_get_type(sa->salg_type);
        }

Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If 
you use some unknown name, alg_bind will error out.

Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can 
only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
 
These names are to be pointed to by the sockaddr type value:

Here my libkcapi code in _kcapi_allocated_handle_init:

        memset(&sa, 0, sizeof(sa));
        sa.salg_family = AF_ALG;
        snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
        snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);

===> type contains the name to resolve the right AF_ALG handler.

===> ciphername contains the actual cipher name (like "gcm(aes)") to be used 
in the crypto_alloc_* functions implemented by AF_ALG.

Now, assume user space is nasty. When you use the type "aead" resolving to 
algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an 
crypto_alloc_aead("sha256") which will cause an error because the allocation 
function will never match a cipher name of "sha256" and the type of AEAD.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23 11:10                   ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 11:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 10:37:35 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> >> I've read the links and starring at the code, but still can't get it.
> >> The question is about textual type names in sockaddr.
> >> .cra_flags does not specify textual names.
> >> [3] again talks about int flags rather than textual names.
> >> 
> >> I see they are used here:
> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
> >> 
> >> but it merely says:
> >>     .salg_type = "aead", /* this selects the symmetric cipher */
> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
> >> 
> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
> >> would "skcipher" fail for  "gcm(aes)" (would it?).
> >> 
> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
> >> be 0 (because they can only fail an otherwise passing bind, right?).
> >> But the textual type seems to matter, because bind first looks at type
> >> and then everything else happens as callbacks on type.
> >> 
> >> I've found these guys:
> >> 
> >> tatic const struct crypto_type crypto_skcipher_type2 = {
> >> .extsize = crypto_skcipher_extsize,
> >> .init_tfm = crypto_skcipher_init_tfm,
> >> .free = crypto_skcipher_free_instance,
> >> #ifdef CONFIG_PROC_FS
> >> .show = crypto_skcipher_show,
> >> #endif
> >> .report = crypto_skcipher_report,
> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
> >> .tfmsize = offsetof(struct crypto_skcipher, base),
> >> };
> >> 
> >> But it still does not make sense to me.
> >> 
> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
> >>  algorithms.
> >> 
> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
> >> constants....
> > 
> > Also, there seems to be only 4 types ("aead", "hash", "rng",
> > "skcipher"), but more algorithm types. For example, how do I get
> > access to ACOMPRESS/SCOMPRESS?
> 
> Looking at /proc/crypto confuses me even more:
> 
> $ cat /proc/crypto  | grep type | sort | uniq
> type         : ablkcipher
> type         : aead
> type         : ahash
> type         : akcipher
> type         : blkcipher
> type         : cipher
> type         : compression
> type         : givcipher
> type         : rng
> type         : shash
> 
> Now, there are 10 types. They partially intersect with the other
> textual types (e.g. "aead"). But, say "skcipher" is not present in
> /proc/crypto at all.

The types that a cipher can implement is given in include/linux/crypto.h:

/*
 * Algorithm masks and types.
 */
#define CRYPTO_ALG_TYPE_MASK            0x0000000f
#define CRYPTO_ALG_TYPE_CIPHER          0x00000001
...

These types are resolved when the various crypto_alloc_* functions are 
invoked. For example

static const struct crypto_type crypto_skcipher_type2 = {
...
        .type = CRYPTO_ALG_TYPE_SKCIPHER,
...
};

struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
                                              u32 type, u32 mask)
{
        return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
}

Thus, when you use crypto_alloc_skcipher, it will only "find" cipher 
implementations that are of type SKCIPHER (or ABLKCIPHER as both values are 
identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"), 
the lookup code will not find it as the sha256 implementation is of type AHASH 
(or SHASH) and not SKCIPHER.

The name you see in /proc/crypto are given with the respective report function 
call (e.g. crypto_skcipher_report for the aforementioned example). These names 
are just informative and not relevant at all for anything.


When you come to the AF_ALG interface, the used names of "skcipher" or "aead" 
are *not* related to the names you see in /proc/crypto. They are simply 
identifiers referring to the different AF_ALG handler callbacks. For example, 
crypto/algif_skcipher.c:

static const struct af_alg_type algif_type_skcipher = {
...
        .name           =       "skcipher",
...
};

This name is used to find the right AF_ALG handler in alg_bind:

        type = alg_get_type(sa->salg_type);
        if (IS_ERR(type) && PTR_ERR(type) = -ENOENT) {
                request_module("algif-%s", sa->salg_type);
                type = alg_get_type(sa->salg_type);
        }

Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If 
you use some unknown name, alg_bind will error out.

Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can 
only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
 
These names are to be pointed to by the sockaddr type value:

Here my libkcapi code in _kcapi_allocated_handle_init:

        memset(&sa, 0, sizeof(sa));
        sa.salg_family = AF_ALG;
        snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
        snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);

==> type contains the name to resolve the right AF_ALG handler.

==> ciphername contains the actual cipher name (like "gcm(aes)") to be used 
in the crypto_alloc_* functions implemented by AF_ALG.

Now, assume user space is nasty. When you use the type "aead" resolving to 
algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an 
crypto_alloc_aead("sha256") which will cause an error because the allocation 
function will never match a cipher name of "sha256" and the type of AEAD.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23 11:10                   ` Stephan Mueller
@ 2017-11-23 11:27                     ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23 11:27 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 12:10 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 23. November 2017, 10:37:35 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> >> I've read the links and starring at the code, but still can't get it.
>> >> The question is about textual type names in sockaddr.
>> >> .cra_flags does not specify textual names.
>> >> [3] again talks about int flags rather than textual names.
>> >>
>> >> I see they are used here:
>> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>> >>
>> >> but it merely says:
>> >>     .salg_type = "aead", /* this selects the symmetric cipher */
>> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
>> >>
>> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>> >> would "skcipher" fail for  "gcm(aes)" (would it?).
>> >>
>> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
>> >> be 0 (because they can only fail an otherwise passing bind, right?).
>> >> But the textual type seems to matter, because bind first looks at type
>> >> and then everything else happens as callbacks on type.
>> >>
>> >> I've found these guys:
>> >>
>> >> tatic const struct crypto_type crypto_skcipher_type2 = {
>> >> .extsize = crypto_skcipher_extsize,
>> >> .init_tfm = crypto_skcipher_init_tfm,
>> >> .free = crypto_skcipher_free_instance,
>> >> #ifdef CONFIG_PROC_FS
>> >> .show = crypto_skcipher_show,
>> >> #endif
>> >> .report = crypto_skcipher_report,
>> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> >> .tfmsize = offsetof(struct crypto_skcipher, base),
>> >> };
>> >>
>> >> But it still does not make sense to me.
>> >>
>> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
>> >>  algorithms.
>> >>
>> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>> >> constants....
>> >
>> > Also, there seems to be only 4 types ("aead", "hash", "rng",
>> > "skcipher"), but more algorithm types. For example, how do I get
>> > access to ACOMPRESS/SCOMPRESS?
>>
>> Looking at /proc/crypto confuses me even more:
>>
>> $ cat /proc/crypto  | grep type | sort | uniq
>> type         : ablkcipher
>> type         : aead
>> type         : ahash
>> type         : akcipher
>> type         : blkcipher
>> type         : cipher
>> type         : compression
>> type         : givcipher
>> type         : rng
>> type         : shash
>>
>> Now, there are 10 types. They partially intersect with the other
>> textual types (e.g. "aead"). But, say "skcipher" is not present in
>> /proc/crypto at all.
>
> The types that a cipher can implement is given in include/linux/crypto.h:
>
> /*
>  * Algorithm masks and types.
>  */
> #define CRYPTO_ALG_TYPE_MASK            0x0000000f
> #define CRYPTO_ALG_TYPE_CIPHER          0x00000001
> ...
>
> These types are resolved when the various crypto_alloc_* functions are
> invoked. For example
>
> static const struct crypto_type crypto_skcipher_type2 = {
> ...
>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
> ...
> };
>
> struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
>                                               u32 type, u32 mask)
> {
>         return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
> }
>
> Thus, when you use crypto_alloc_skcipher, it will only "find" cipher
> implementations that are of type SKCIPHER (or ABLKCIPHER as both values are
> identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"),
> the lookup code will not find it as the sha256 implementation is of type AHASH
> (or SHASH) and not SKCIPHER.
>
> The name you see in /proc/crypto are given with the respective report function
> call (e.g. crypto_skcipher_report for the aforementioned example). These names
> are just informative and not relevant at all for anything.
>
>
> When you come to the AF_ALG interface, the used names of "skcipher" or "aead"
> are *not* related to the names you see in /proc/crypto. They are simply
> identifiers referring to the different AF_ALG handler callbacks. For example,
> crypto/algif_skcipher.c:
>
> static const struct af_alg_type algif_type_skcipher = {
> ...
>         .name           =       "skcipher",
> ...
> };
>
> This name is used to find the right AF_ALG handler in alg_bind:
>
>         type = alg_get_type(sa->salg_type);
>         if (IS_ERR(type) && PTR_ERR(type) == -ENOENT) {
>                 request_module("algif-%s", sa->salg_type);
>                 type = alg_get_type(sa->salg_type);
>         }
>
> Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If
> you use some unknown name, alg_bind will error out.
>
> Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can
> only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
>
> These names are to be pointed to by the sockaddr type value:
>
> Here my libkcapi code in _kcapi_allocated_handle_init:
>
>         memset(&sa, 0, sizeof(sa));
>         sa.salg_family = AF_ALG;
>         snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
>         snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);
>
> ===> type contains the name to resolve the right AF_ALG handler.
>
> ===> ciphername contains the actual cipher name (like "gcm(aes)") to be used
> in the crypto_alloc_* functions implemented by AF_ALG.
>
> Now, assume user space is nasty. When you use the type "aead" resolving to
> algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an
> crypto_alloc_aead("sha256") which will cause an error because the allocation
> function will never match a cipher name of "sha256" and the type of AEAD.

Hi Stephan,

Thanks for the explanation! I am starting to digesting it.

You say that:

> static const struct crypto_type crypto_skcipher_type2 = {
>         .type = CRYPTO_ALG_TYPE_SKCIPHER,

will select implementations of types SKCIPHER or ABLKCIPHER.
But there are no such implementations. I see only implementation of
types CIPHER and BLKCIPHER:

.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
crypto/seed.c

.cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
crypto/salsa20_generic.c

SKCIPHER=0x5. Does it mean that it selects implementations that has
((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
BLKCIPHER would match that.

But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
again. Does not make sense to me...


And to confirm, we can't reach compress from userspace because only
these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
correct?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23 11:27                     ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23 11:27 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 12:10 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 23. November 2017, 10:37:35 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> >> I've read the links and starring at the code, but still can't get it.
>> >> The question is about textual type names in sockaddr.
>> >> .cra_flags does not specify textual names.
>> >> [3] again talks about int flags rather than textual names.
>> >>
>> >> I see they are used here:
>> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>> >>
>> >> but it merely says:
>> >>     .salg_type = "aead", /* this selects the symmetric cipher */
>> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
>> >>
>> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>> >> would "skcipher" fail for  "gcm(aes)" (would it?).
>> >>
>> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
>> >> be 0 (because they can only fail an otherwise passing bind, right?).
>> >> But the textual type seems to matter, because bind first looks at type
>> >> and then everything else happens as callbacks on type.
>> >>
>> >> I've found these guys:
>> >>
>> >> tatic const struct crypto_type crypto_skcipher_type2 = {
>> >> .extsize = crypto_skcipher_extsize,
>> >> .init_tfm = crypto_skcipher_init_tfm,
>> >> .free = crypto_skcipher_free_instance,
>> >> #ifdef CONFIG_PROC_FS
>> >> .show = crypto_skcipher_show,
>> >> #endif
>> >> .report = crypto_skcipher_report,
>> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> >> .tfmsize = offsetof(struct crypto_skcipher, base),
>> >> };
>> >>
>> >> But it still does not make sense to me.
>> >>
>> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
>> >>  algorithms.
>> >>
>> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>> >> constants....
>> >
>> > Also, there seems to be only 4 types ("aead", "hash", "rng",
>> > "skcipher"), but more algorithm types. For example, how do I get
>> > access to ACOMPRESS/SCOMPRESS?
>>
>> Looking at /proc/crypto confuses me even more:
>>
>> $ cat /proc/crypto  | grep type | sort | uniq
>> type         : ablkcipher
>> type         : aead
>> type         : ahash
>> type         : akcipher
>> type         : blkcipher
>> type         : cipher
>> type         : compression
>> type         : givcipher
>> type         : rng
>> type         : shash
>>
>> Now, there are 10 types. They partially intersect with the other
>> textual types (e.g. "aead"). But, say "skcipher" is not present in
>> /proc/crypto at all.
>
> The types that a cipher can implement is given in include/linux/crypto.h:
>
> /*
>  * Algorithm masks and types.
>  */
> #define CRYPTO_ALG_TYPE_MASK            0x0000000f
> #define CRYPTO_ALG_TYPE_CIPHER          0x00000001
> ...
>
> These types are resolved when the various crypto_alloc_* functions are
> invoked. For example
>
> static const struct crypto_type crypto_skcipher_type2 = {
> ...
>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
> ...
> };
>
> struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
>                                               u32 type, u32 mask)
> {
>         return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
> }
>
> Thus, when you use crypto_alloc_skcipher, it will only "find" cipher
> implementations that are of type SKCIPHER (or ABLKCIPHER as both values are
> identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"),
> the lookup code will not find it as the sha256 implementation is of type AHASH
> (or SHASH) and not SKCIPHER.
>
> The name you see in /proc/crypto are given with the respective report function
> call (e.g. crypto_skcipher_report for the aforementioned example). These names
> are just informative and not relevant at all for anything.
>
>
> When you come to the AF_ALG interface, the used names of "skcipher" or "aead"
> are *not* related to the names you see in /proc/crypto. They are simply
> identifiers referring to the different AF_ALG handler callbacks. For example,
> crypto/algif_skcipher.c:
>
> static const struct af_alg_type algif_type_skcipher = {
> ...
>         .name           =       "skcipher",
> ...
> };
>
> This name is used to find the right AF_ALG handler in alg_bind:
>
>         type = alg_get_type(sa->salg_type);
>         if (IS_ERR(type) && PTR_ERR(type) = -ENOENT) {
>                 request_module("algif-%s", sa->salg_type);
>                 type = alg_get_type(sa->salg_type);
>         }
>
> Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If
> you use some unknown name, alg_bind will error out.
>
> Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can
> only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
>
> These names are to be pointed to by the sockaddr type value:
>
> Here my libkcapi code in _kcapi_allocated_handle_init:
>
>         memset(&sa, 0, sizeof(sa));
>         sa.salg_family = AF_ALG;
>         snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
>         snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);
>
> ==> type contains the name to resolve the right AF_ALG handler.
>
> ==> ciphername contains the actual cipher name (like "gcm(aes)") to be used
> in the crypto_alloc_* functions implemented by AF_ALG.
>
> Now, assume user space is nasty. When you use the type "aead" resolving to
> algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an
> crypto_alloc_aead("sha256") which will cause an error because the allocation
> function will never match a cipher name of "sha256" and the type of AEAD.

Hi Stephan,

Thanks for the explanation! I am starting to digesting it.

You say that:

> static const struct crypto_type crypto_skcipher_type2 = {
>         .type = CRYPTO_ALG_TYPE_SKCIPHER,

will select implementations of types SKCIPHER or ABLKCIPHER.
But there are no such implementations. I see only implementation of
types CIPHER and BLKCIPHER:

.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
crypto/seed.c

.cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
crypto/salsa20_generic.c

SKCIPHER=0x5. Does it mean that it selects implementations that has
((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
BLKCIPHER would match that.

But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
again. Does not make sense to me...


And to confirm, we can't reach compress from userspace because only
these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
correct?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23 11:27                     ` Dmitry Vyukov
@ 2017-11-23 11:34                       ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23 11:34 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 12:27 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> Hi Dmitry,
>>
>>> >> I've read the links and starring at the code, but still can't get it.
>>> >> The question is about textual type names in sockaddr.
>>> >> .cra_flags does not specify textual names.
>>> >> [3] again talks about int flags rather than textual names.
>>> >>
>>> >> I see they are used here:
>>> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>>> >>
>>> >> but it merely says:
>>> >>     .salg_type = "aead", /* this selects the symmetric cipher */
>>> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
>>> >>
>>> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>>> >> would "skcipher" fail for  "gcm(aes)" (would it?).
>>> >>
>>> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
>>> >> be 0 (because they can only fail an otherwise passing bind, right?).
>>> >> But the textual type seems to matter, because bind first looks at type
>>> >> and then everything else happens as callbacks on type.
>>> >>
>>> >> I've found these guys:
>>> >>
>>> >> tatic const struct crypto_type crypto_skcipher_type2 = {
>>> >> .extsize = crypto_skcipher_extsize,
>>> >> .init_tfm = crypto_skcipher_init_tfm,
>>> >> .free = crypto_skcipher_free_instance,
>>> >> #ifdef CONFIG_PROC_FS
>>> >> .show = crypto_skcipher_show,
>>> >> #endif
>>> >> .report = crypto_skcipher_report,
>>> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>>> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>>> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>>> >> .tfmsize = offsetof(struct crypto_skcipher, base),
>>> >> };
>>> >>
>>> >> But it still does not make sense to me.
>>> >>
>>> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
>>> >>  algorithms.
>>> >>
>>> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>>> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>>> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>>> >> constants....
>>> >
>>> > Also, there seems to be only 4 types ("aead", "hash", "rng",
>>> > "skcipher"), but more algorithm types. For example, how do I get
>>> > access to ACOMPRESS/SCOMPRESS?
>>>
>>> Looking at /proc/crypto confuses me even more:
>>>
>>> $ cat /proc/crypto  | grep type | sort | uniq
>>> type         : ablkcipher
>>> type         : aead
>>> type         : ahash
>>> type         : akcipher
>>> type         : blkcipher
>>> type         : cipher
>>> type         : compression
>>> type         : givcipher
>>> type         : rng
>>> type         : shash
>>>
>>> Now, there are 10 types. They partially intersect with the other
>>> textual types (e.g. "aead"). But, say "skcipher" is not present in
>>> /proc/crypto at all.
>>
>> The types that a cipher can implement is given in include/linux/crypto.h:
>>
>> /*
>>  * Algorithm masks and types.
>>  */
>> #define CRYPTO_ALG_TYPE_MASK            0x0000000f
>> #define CRYPTO_ALG_TYPE_CIPHER          0x00000001
>> ...
>>
>> These types are resolved when the various crypto_alloc_* functions are
>> invoked. For example
>>
>> static const struct crypto_type crypto_skcipher_type2 = {
>> ...
>>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> ...
>> };
>>
>> struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
>>                                               u32 type, u32 mask)
>> {
>>         return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
>> }
>>
>> Thus, when you use crypto_alloc_skcipher, it will only "find" cipher
>> implementations that are of type SKCIPHER (or ABLKCIPHER as both values are
>> identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"),
>> the lookup code will not find it as the sha256 implementation is of type AHASH
>> (or SHASH) and not SKCIPHER.
>>
>> The name you see in /proc/crypto are given with the respective report function
>> call (e.g. crypto_skcipher_report for the aforementioned example). These names
>> are just informative and not relevant at all for anything.
>>
>>
>> When you come to the AF_ALG interface, the used names of "skcipher" or "aead"
>> are *not* related to the names you see in /proc/crypto. They are simply
>> identifiers referring to the different AF_ALG handler callbacks. For example,
>> crypto/algif_skcipher.c:
>>
>> static const struct af_alg_type algif_type_skcipher = {
>> ...
>>         .name           =       "skcipher",
>> ...
>> };
>>
>> This name is used to find the right AF_ALG handler in alg_bind:
>>
>>         type = alg_get_type(sa->salg_type);
>>         if (IS_ERR(type) && PTR_ERR(type) == -ENOENT) {
>>                 request_module("algif-%s", sa->salg_type);
>>                 type = alg_get_type(sa->salg_type);
>>         }
>>
>> Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If
>> you use some unknown name, alg_bind will error out.
>>
>> Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can
>> only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
>>
>> These names are to be pointed to by the sockaddr type value:
>>
>> Here my libkcapi code in _kcapi_allocated_handle_init:
>>
>>         memset(&sa, 0, sizeof(sa));
>>         sa.salg_family = AF_ALG;
>>         snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
>>         snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);
>>
>> ===> type contains the name to resolve the right AF_ALG handler.
>>
>> ===> ciphername contains the actual cipher name (like "gcm(aes)") to be used
>> in the crypto_alloc_* functions implemented by AF_ALG.
>>
>> Now, assume user space is nasty. When you use the type "aead" resolving to
>> algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an
>> crypto_alloc_aead("sha256") which will cause an error because the allocation
>> function will never match a cipher name of "sha256" and the type of AEAD.
>
> Hi Stephan,
>
> Thanks for the explanation! I am starting to digesting it.
>
> You say that:
>
>> static const struct crypto_type crypto_skcipher_type2 = {
>>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
>
> will select implementations of types SKCIPHER or ABLKCIPHER.
> But there are no such implementations. I see only implementation of
> types CIPHER and BLKCIPHER:
>
> .cra_flags = CRYPTO_ALG_TYPE_CIPHER,
> crypto/seed.c
>
> .cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
> crypto/salsa20_generic.c
>
> SKCIPHER=0x5. Does it mean that it selects implementations that has
> ((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
> BLKCIPHER would match that.
>
> But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
> it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
> again. Does not make sense to me...
>
>
> And to confirm, we can't reach compress from userspace because only
> these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
> correct?



Btw, I've started doing some minimal improvements, did not yet sorted
out alg types/names, and fuzzer started scratching surface:

WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30 CET
general protection fault in crypto_chacha20_crypt 100 Nov 23 2017 12:29:48 CET
suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88 Nov 23
2017 12:29:15 CET

This strongly suggests that we need to dig deeper.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23 11:34                       ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-23 11:34 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 12:27 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> Hi Dmitry,
>>
>>> >> I've read the links and starring at the code, but still can't get it.
>>> >> The question is about textual type names in sockaddr.
>>> >> .cra_flags does not specify textual names.
>>> >> [3] again talks about int flags rather than textual names.
>>> >>
>>> >> I see they are used here:
>>> >> http://www.chronox.de/crypto-API/crypto/userspace-if.html#aead-cipher-api
>>> >>
>>> >> but it merely says:
>>> >>     .salg_type = "aead", /* this selects the symmetric cipher */
>>> >>     .salg_name = "gcm(aes)" /* this is the cipher name */
>>> >>
>>> >> and does not explain why it is must be "aead" for  "gcm(aes)", nor why
>>> >> would "skcipher" fail for  "gcm(aes)" (would it?).
>>> >>
>>> >> These integer flags in sockaddr_alg.feat/mask seem to be better always
>>> >> be 0 (because they can only fail an otherwise passing bind, right?).
>>> >> But the textual type seems to matter, because bind first looks at type
>>> >> and then everything else happens as callbacks on type.
>>> >>
>>> >> I've found these guys:
>>> >>
>>> >> tatic const struct crypto_type crypto_skcipher_type2 = {
>>> >> .extsize = crypto_skcipher_extsize,
>>> >> .init_tfm = crypto_skcipher_init_tfm,
>>> >> .free = crypto_skcipher_free_instance,
>>> >> #ifdef CONFIG_PROC_FS
>>> >> .show = crypto_skcipher_show,
>>> >> #endif
>>> >> .report = crypto_skcipher_report,
>>> >> .maskclear = ~CRYPTO_ALG_TYPE_MASK,
>>> >> .maskset = CRYPTO_ALG_TYPE_BLKCIPHER_MASK,
>>> >> .type = CRYPTO_ALG_TYPE_SKCIPHER,
>>> >> .tfmsize = offsetof(struct crypto_skcipher, base),
>>> >> };
>>> >>
>>> >> But it still does not make sense to me.
>>> >>
>>> >>  CRYPTO_ALG_TYPE_SKCIPHER const is not mentioned in any actual
>>> >>  algorithms.
>>> >>
>>> >> and CRYPTO_ALG_TYPE_BLKCIPHER_MASK is 0xc, which selects
>>> >> CRYPTO_ALG_TYPE_BLKCIPHER, CRYPTO_ALG_TYPE_KPP and
>>> >> CRYPTO_ALG_TYPE_RNG. And it looks like a random subset of
>>> >> constants....
>>> >
>>> > Also, there seems to be only 4 types ("aead", "hash", "rng",
>>> > "skcipher"), but more algorithm types. For example, how do I get
>>> > access to ACOMPRESS/SCOMPRESS?
>>>
>>> Looking at /proc/crypto confuses me even more:
>>>
>>> $ cat /proc/crypto  | grep type | sort | uniq
>>> type         : ablkcipher
>>> type         : aead
>>> type         : ahash
>>> type         : akcipher
>>> type         : blkcipher
>>> type         : cipher
>>> type         : compression
>>> type         : givcipher
>>> type         : rng
>>> type         : shash
>>>
>>> Now, there are 10 types. They partially intersect with the other
>>> textual types (e.g. "aead"). But, say "skcipher" is not present in
>>> /proc/crypto at all.
>>
>> The types that a cipher can implement is given in include/linux/crypto.h:
>>
>> /*
>>  * Algorithm masks and types.
>>  */
>> #define CRYPTO_ALG_TYPE_MASK            0x0000000f
>> #define CRYPTO_ALG_TYPE_CIPHER          0x00000001
>> ...
>>
>> These types are resolved when the various crypto_alloc_* functions are
>> invoked. For example
>>
>> static const struct crypto_type crypto_skcipher_type2 = {
>> ...
>>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
>> ...
>> };
>>
>> struct crypto_skcipher *crypto_alloc_skcipher(const char *alg_name,
>>                                               u32 type, u32 mask)
>> {
>>         return crypto_alloc_tfm(alg_name, &crypto_skcipher_type2, type, mask);
>> }
>>
>> Thus, when you use crypto_alloc_skcipher, it will only "find" cipher
>> implementations that are of type SKCIPHER (or ABLKCIPHER as both values are
>> identical). I.e. even when you try to call crypto_alloc_skcipher("sha256"),
>> the lookup code will not find it as the sha256 implementation is of type AHASH
>> (or SHASH) and not SKCIPHER.
>>
>> The name you see in /proc/crypto are given with the respective report function
>> call (e.g. crypto_skcipher_report for the aforementioned example). These names
>> are just informative and not relevant at all for anything.
>>
>>
>> When you come to the AF_ALG interface, the used names of "skcipher" or "aead"
>> are *not* related to the names you see in /proc/crypto. They are simply
>> identifiers referring to the different AF_ALG handler callbacks. For example,
>> crypto/algif_skcipher.c:
>>
>> static const struct af_alg_type algif_type_skcipher = {
>> ...
>>         .name           =       "skcipher",
>> ...
>> };
>>
>> This name is used to find the right AF_ALG handler in alg_bind:
>>
>>         type = alg_get_type(sa->salg_type);
>>         if (IS_ERR(type) && PTR_ERR(type) = -ENOENT) {
>>                 request_module("algif-%s", sa->salg_type);
>>                 type = alg_get_type(sa->salg_type);
>>         }
>>
>> Thus, if you use "skcipher" during bind it is resolved to algif_skcipher.c. If
>> you use some unknown name, alg_bind will error out.
>>
>> Now, algif_skcipher only uses crypto_alloc_skcipher() which as shown above can
>> only allocate ciphers marked as SKCIPHER or ABLKCIPHER.
>>
>> These names are to be pointed to by the sockaddr type value:
>>
>> Here my libkcapi code in _kcapi_allocated_handle_init:
>>
>>         memset(&sa, 0, sizeof(sa));
>>         sa.salg_family = AF_ALG;
>>         snprintf((char *)sa.salg_type, sizeof(sa.salg_type),"%s", type);
>>         snprintf((char *)sa.salg_name, sizeof(sa.salg_name),"%s", ciphername);
>>
>> ==> type contains the name to resolve the right AF_ALG handler.
>>
>> ==> ciphername contains the actual cipher name (like "gcm(aes)") to be used
>> in the crypto_alloc_* functions implemented by AF_ALG.
>>
>> Now, assume user space is nasty. When you use the type "aead" resolving to
>> algif_aead.c and the cipher of, say, "sha256", the algif_aead.c will do an
>> crypto_alloc_aead("sha256") which will cause an error because the allocation
>> function will never match a cipher name of "sha256" and the type of AEAD.
>
> Hi Stephan,
>
> Thanks for the explanation! I am starting to digesting it.
>
> You say that:
>
>> static const struct crypto_type crypto_skcipher_type2 = {
>>         .type = CRYPTO_ALG_TYPE_SKCIPHER,
>
> will select implementations of types SKCIPHER or ABLKCIPHER.
> But there are no such implementations. I see only implementation of
> types CIPHER and BLKCIPHER:
>
> .cra_flags = CRYPTO_ALG_TYPE_CIPHER,
> crypto/seed.c
>
> .cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
> crypto/salsa20_generic.c
>
> SKCIPHER=0x5. Does it mean that it selects implementations that has
> ((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
> BLKCIPHER would match that.
>
> But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
> it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
> again. Does not make sense to me...
>
>
> And to confirm, we can't reach compress from userspace because only
> these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
> correct?



Btw, I've started doing some minimal improvements, did not yet sorted
out alg types/names, and fuzzer started scratching surface:

WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30 CET
general protection fault in crypto_chacha20_crypt 100 Nov 23 2017 12:29:48 CET
suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88 Nov 23
2017 12:29:15 CET

This strongly suggests that we need to dig deeper.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23 11:27                     ` Dmitry Vyukov
@ 2017-11-23 12:32                       ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 12:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 12:27:30 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Hi Stephan,
> 
> Thanks for the explanation! I am starting to digesting it.
> 
> You say that:
> > static const struct crypto_type crypto_skcipher_type2 = {
> > 
> >         .type = CRYPTO_ALG_TYPE_SKCIPHER,
> 
> will select implementations of types SKCIPHER or ABLKCIPHER.
> But there are no such implementations. I see only implementation of
> types CIPHER and BLKCIPHER:
> 
> .cra_flags = CRYPTO_ALG_TYPE_CIPHER,
> crypto/seed.c
> 
> .cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
> crypto/salsa20_generic.c

The blkcipher (symmetric synchronous cipher) is made asynchronous (ABLKCIPHER) 
via cryptd() implemented in crypto/cryptd.c. When using the skcipher API 
without any specific flags, it will allocate cryptd(salsa20_generic) in your 
case above transparently and mark it as an ABLKCIPHER.

CIPHER is a block cipher without block chaining which is only capable of 
encrypting one block. You can only use it with the crypto_cipher* API.

As this API is not exported via AF_ALG, you cannot access those ciphers from 
user space directly (albeit they are accessible indirectly via templates 
implementing block chaining algos)
> 
> SKCIPHER=0x5. Does it mean that it selects implementations that has
> ((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
> BLKCIPHER would match that.
> 
> But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
> it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
> again. Does not make sense to me...

COMPRESS / PCOMPRESS implementations are accessible via the crypto_compress_* 
APIs. There is (currently) no AF_ALG interface for this API.
> 
> 
> And to confirm, we can't reach compress from userspace because only
> these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
> correct?

Precisely. Look into algif_*.c which kernel crypto API calls are exported to 
user space.
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23 12:32                       ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 12:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 12:27:30 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Hi Stephan,
> 
> Thanks for the explanation! I am starting to digesting it.
> 
> You say that:
> > static const struct crypto_type crypto_skcipher_type2 = {
> > 
> >         .type = CRYPTO_ALG_TYPE_SKCIPHER,
> 
> will select implementations of types SKCIPHER or ABLKCIPHER.
> But there are no such implementations. I see only implementation of
> types CIPHER and BLKCIPHER:
> 
> .cra_flags = CRYPTO_ALG_TYPE_CIPHER,
> crypto/seed.c
> 
> .cra_flags          =   CRYPTO_ALG_TYPE_BLKCIPHER,
> crypto/salsa20_generic.c

The blkcipher (symmetric synchronous cipher) is made asynchronous (ABLKCIPHER) 
via cryptd() implemented in crypto/cryptd.c. When using the skcipher API 
without any specific flags, it will allocate cryptd(salsa20_generic) in your 
case above transparently and mark it as an ABLKCIPHER.

CIPHER is a block cipher without block chaining which is only capable of 
encrypting one block. You can only use it with the crypto_cipher* API.

As this API is not exported via AF_ALG, you cannot access those ciphers from 
user space directly (albeit they are accessible indirectly via templates 
implementing block chaining algos)
> 
> SKCIPHER=0x5. Does it mean that it selects implementations that has
> ((cra_flags&CRYPTO_ALG_TYPE_MASK)&SKCIPHER) != 0? I.e. CIPHER and
> BLKCIPHER would match that.
> 
> But then I am again confused, because CRYPTO_ALG_TYPE_AEAD is 0x3 so
> it would match CRYPTO_ALG_TYPE_COMPRESS and CRYPTO_ALG_TYPE_CIPHER
> again. Does not make sense to me...

COMPRESS / PCOMPRESS implementations are accessible via the crypto_compress_* 
APIs. There is (currently) no AF_ALG interface for this API.
> 
> 
> And to confirm, we can't reach compress from userspace because only
> these 4 types are exposed "aead", "hash", "rng", "skcipher". Is it
> correct?

Precisely. Look into algif_*.c which kernel crypto API calls are exported to 
user space.
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23 11:34                       ` Dmitry Vyukov
@ 2017-11-23 12:35                         ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 12:35 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Btw, I've started doing some minimal improvements, did not yet sorted
> out alg types/names, and fuzzer started scratching surface:
> 
> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88
> Nov 23 2017 12:29:15 CET

This all looks strange. Where would RCU come into play with 
af_alg_free_areq_sgls?

Do you have a reproducer?
> 
> This strongly suggests that we need to dig deeper.

Absolutely. That is why I started my fuzzer that turned up already quite some 
issues.
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-23 12:35                         ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-23 12:35 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Btw, I've started doing some minimal improvements, did not yet sorted
> out alg types/names, and fuzzer started scratching surface:
> 
> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88
> Nov 23 2017 12:29:15 CET

This all looks strange. Where would RCU come into play with 
af_alg_free_areq_sgls?

Do you have a reproducer?
> 
> This strongly suggests that we need to dig deeper.

Absolutely. That is why I started my fuzzer that turned up already quite some 
issues.
> --
> To unsubscribe from this list: send the line "unsubscribe keyrings" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-23 12:35                         ` Stephan Mueller
@ 2017-11-24 13:49                           ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 13:49 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> Btw, I've started doing some minimal improvements, did not yet sorted
>> out alg types/names, and fuzzer started scratching surface:
>>
>> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
>> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
>> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88
>> Nov 23 2017 12:29:15 CET
>
> This all looks strange. Where would RCU come into play with
> af_alg_free_areq_sgls?
>
> Do you have a reproducer?
>>
>> This strongly suggests that we need to dig deeper.
>
> Absolutely. That is why I started my fuzzer that turned up already quite some
> issues.

I've cooked syzkaller change that teaches it to generate more
algorithm names. Probably not idea, but much better than was before:
https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2cc7db/sys/linux/alg.go
(if you see any obvious issues there, feedback is welcome, I still did
not figure out completely difference between e.g. HASH/AHASH,
BLKCIPHER/ABLKCIPHER as most of them seem to be interchangable; this
was mostly based on try and trial approach).

All bugs with details will soon be reported by syzbot
(https://goo.gl/tpsmEJ) to kernel mailing lists with all details.

Stephan, thanks for your help!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 13:49                           ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 13:49 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> Btw, I've started doing some minimal improvements, did not yet sorted
>> out alg types/names, and fuzzer started scratching surface:
>>
>> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
>> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
>> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE 88
>> Nov 23 2017 12:29:15 CET
>
> This all looks strange. Where would RCU come into play with
> af_alg_free_areq_sgls?
>
> Do you have a reproducer?
>>
>> This strongly suggests that we need to dig deeper.
>
> Absolutely. That is why I started my fuzzer that turned up already quite some
> issues.

I've cooked syzkaller change that teaches it to generate more
algorithm names. Probably not idea, but much better than was before:
https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2cc7db/sys/linux/alg.go
(if you see any obvious issues there, feedback is welcome, I still did
not figure out completely difference between e.g. HASH/AHASH,
BLKCIPHER/ABLKCIPHER as most of them seem to be interchangable; this
was mostly based on try and trial approach).

All bugs with details will soon be reported by syzbot
(https://goo.gl/tpsmEJ) to kernel mailing lists with all details.

Stephan, thanks for your help!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 13:49                           ` Dmitry Vyukov
@ 2017-11-24 14:36                             ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 14:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
> > 
> > Hi Dmitry,
> > 
> >> Btw, I've started doing some minimal improvements, did not yet sorted
> >> out alg types/names, and fuzzer started scratching surface:
> >> 
> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> >> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE
> >> 88
> >> Nov 23 2017 12:29:15 CET
> > 
> > This all looks strange. Where would RCU come into play with
> > af_alg_free_areq_sgls?
> > 
> > Do you have a reproducer?
> > 
> >> This strongly suggests that we need to dig deeper.
> > 
> > Absolutely. That is why I started my fuzzer that turned up already quite
> > some issues.
> 
> I've cooked syzkaller change that teaches it to generate more
> algorithm names. Probably not idea, but much better than was before:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback is
> welcome,

I will peek into that code shortly.

> I still did not figure out completely difference between e.g.
> HASH/AHASH,

AHASH is the asynchronous hash. I.e. the implementation can sleep.

HASH == SHASH and is the synchronous hash. I.e. that implementation will never 
sleep.

An SHASH can be turned into an AHASH by using cryptd().

An AHASH can never be turned into an SHASH.

To use SHASH implementations, you use the *_shash API calls. This API does not 
require a request structure.

To use AHASH implementations, you use the *_ahash API calls. This API requires 
the use of ahash_request_* calls. By transparently employing cryptd(), the 
kernel allows the use of SHASH implementations with the AHASH API.

Currently there is only one real AHASH implementation outside specific 
hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This 
implementation can only be used with the AHASH API. All (other) SHASH 
implementations can be used with either the shash or the ahash API, because 
when using it as AHASH, the kernel automatically uses the cryptd() under the 
hood.


> BLKCIPHER/ABLKCIPHER as most of them seem to be interchangable; this
> was mostly based on try and trial approach).

Dto. BLKCIPHER is the synchronous and ABLKCIPHER is the asynchronous 
implementation. The same logic applies here as discussed for the SHASH/AHASH.

There used to be a blkcipher and ablkcipher API just like the shash/ahash 
which follows exactly the same logic.

About a year ago, the skcipher API was introduced which tries to get rid of 
the blkcipher API and supersedes the ablkcipher API. I.e. per default, the 
skcipher is asynchronous in nature. Yet, you can make it synchronous by 
supplying a specific flag/mask combo during crypto_skcipher_alloc.

Again, BLKCIPHERS are turned into async implementations using the cryptd 
transparently by the kernel.

PS: AF_ALG does *not* allow choosing the API discussed above. It always uses 
the async API (which implies that *all* cipher implementations are 
accessible). Though, it provides two types of interfaces:

1. the sync API

2. the async API

The sync API simply does a sendmsg/recvmsg.

The async API uses the io_submit and friends.

You can see implementations for both in libkcapi.
> 
> All bugs with details will soon be reported by syzbot
> (https://goo.gl/tpsmEJ) to kernel mailing lists with all details.
> 
> Stephan, thanks for your help!

You are welcome.


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 14:36                             ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 14:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
> > 
> > Hi Dmitry,
> > 
> >> Btw, I've started doing some minimal improvements, did not yet sorted
> >> out alg types/names, and fuzzer started scratching surface:
> >> 
> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> >> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE
> >> 88
> >> Nov 23 2017 12:29:15 CET
> > 
> > This all looks strange. Where would RCU come into play with
> > af_alg_free_areq_sgls?
> > 
> > Do you have a reproducer?
> > 
> >> This strongly suggests that we need to dig deeper.
> > 
> > Absolutely. That is why I started my fuzzer that turned up already quite
> > some issues.
> 
> I've cooked syzkaller change that teaches it to generate more
> algorithm names. Probably not idea, but much better than was before:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback is
> welcome,

I will peek into that code shortly.

> I still did not figure out completely difference between e.g.
> HASH/AHASH,

AHASH is the asynchronous hash. I.e. the implementation can sleep.

HASH = SHASH and is the synchronous hash. I.e. that implementation will never 
sleep.

An SHASH can be turned into an AHASH by using cryptd().

An AHASH can never be turned into an SHASH.

To use SHASH implementations, you use the *_shash API calls. This API does not 
require a request structure.

To use AHASH implementations, you use the *_ahash API calls. This API requires 
the use of ahash_request_* calls. By transparently employing cryptd(), the 
kernel allows the use of SHASH implementations with the AHASH API.

Currently there is only one real AHASH implementation outside specific 
hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This 
implementation can only be used with the AHASH API. All (other) SHASH 
implementations can be used with either the shash or the ahash API, because 
when using it as AHASH, the kernel automatically uses the cryptd() under the 
hood.


> BLKCIPHER/ABLKCIPHER as most of them seem to be interchangable; this
> was mostly based on try and trial approach).

Dto. BLKCIPHER is the synchronous and ABLKCIPHER is the asynchronous 
implementation. The same logic applies here as discussed for the SHASH/AHASH.

There used to be a blkcipher and ablkcipher API just like the shash/ahash 
which follows exactly the same logic.

About a year ago, the skcipher API was introduced which tries to get rid of 
the blkcipher API and supersedes the ablkcipher API. I.e. per default, the 
skcipher is asynchronous in nature. Yet, you can make it synchronous by 
supplying a specific flag/mask combo during crypto_skcipher_alloc.

Again, BLKCIPHERS are turned into async implementations using the cryptd 
transparently by the kernel.

PS: AF_ALG does *not* allow choosing the API discussed above. It always uses 
the async API (which implies that *all* cipher implementations are 
accessible). Though, it provides two types of interfaces:

1. the sync API

2. the async API

The sync API simply does a sendmsg/recvmsg.

The async API uses the io_submit and friends.

You can see implementations for both in libkcapi.
> 
> All bugs with details will soon be reported by syzbot
> (https://goo.gl/tpsmEJ) to kernel mailing lists with all details.
> 
> Stephan, thanks for your help!

You are welcome.


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 14:36                             ` Stephan Mueller
@ 2017-11-24 14:55                               ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 14:55 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
> wrote:
>> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>> >
>> > Hi Dmitry,
>> >
>> >> Btw, I've started doing some minimal improvements, did not yet sorted
>> >> out alg types/names, and fuzzer started scratching surface:
>> >>
>> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
>> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
>> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> >> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE
>> >> 88
>> >> Nov 23 2017 12:29:15 CET
>> >
>> > This all looks strange. Where would RCU come into play with
>> > af_alg_free_areq_sgls?
>> >
>> > Do you have a reproducer?
>> >
>> >> This strongly suggests that we need to dig deeper.
>> >
>> > Absolutely. That is why I started my fuzzer that turned up already quite
>> > some issues.
>>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback is
>> welcome,
>
> I will peek into that code shortly.
>
>> I still did not figure out completely difference between e.g.
>> HASH/AHASH,
>
> AHASH is the asynchronous hash. I.e. the implementation can sleep.
>
> HASH == SHASH and is the synchronous hash. I.e. that implementation will never
> sleep.
>
> An SHASH can be turned into an AHASH by using cryptd().
>
> An AHASH can never be turned into an SHASH.
>
> To use SHASH implementations, you use the *_shash API calls. This API does not
> require a request structure.
>
> To use AHASH implementations, you use the *_ahash API calls. This API requires
> the use of ahash_request_* calls. By transparently employing cryptd(), the
> kernel allows the use of SHASH implementations with the AHASH API.
>
> Currently there is only one real AHASH implementation outside specific
> hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
> implementation can only be used with the AHASH API. All (other) SHASH
> implementations can be used with either the shash or the ahash API, because
> when using it as AHASH, the kernel automatically uses the cryptd() under the
> hood.


I am interested solely in user-space API because that's what fuzzer
uses. *_shash, ahash_request_* are of little help.
Your last sentence above means that there is _no_ difference between
HASH and AHASH from user-space?
I thrown all HASH/AHASH algs into a single bucket here:
https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2cc7db/sys/linux/alg.go
And similarly for BLKCIPHER/ABLKCIPHER.


Few additional questions:

1. just to double check: compress algs are not accessible from
user-space, right?

2. There is some setup for algorithms (ALG_SET_KEY,
ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
find). Now if I chain something complex like
gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
there require setup or not, but let's assume they do).  How do I setup
inner algorithms parameters (e.g. aes-aesni in this case)? Is there a
way to call setsockopt effectively on a particular inner alg? Or pass
control messages to an inner alg? Maybe I am asking non-sense, but
that's what comes to my mind looking at the api.




>
> Dto. BLKCIPHER is the synchronous and ABLKCIPHER is the asynchronous
> implementation. The same logic applies here as discussed for the SHASH/AHASH.
>
> There used to be a blkcipher and ablkcipher API just like the shash/ahash
> which follows exactly the same logic.
>
> About a year ago, the skcipher API was introduced which tries to get rid of
> the blkcipher API and supersedes the ablkcipher API. I.e. per default, the
> skcipher is asynchronous in nature. Yet, you can make it synchronous by
> supplying a specific flag/mask combo during crypto_skcipher_alloc.
>
> Again, BLKCIPHERS are turned into async implementations using the cryptd
> transparently by the kernel.
>
> PS: AF_ALG does *not* allow choosing the API discussed above. It always uses
> the async API (which implies that *all* cipher implementations are
> accessible). Though, it provides two types of interfaces:
>
> 1. the sync API
>
> 2. the async API
>
> The sync API simply does a sendmsg/recvmsg.
>
> The async API uses the io_submit and friends.
>
> You can see implementations for both in libkcapi.
>>
>> All bugs with details will soon be reported by syzbot
>> (https://goo.gl/tpsmEJ) to kernel mailing lists with all details.
>>
>> Stephan, thanks for your help!
>
> You are welcome.
>
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 14:55                               ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 14:55 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
> wrote:
>> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>> >
>> > Hi Dmitry,
>> >
>> >> Btw, I've started doing some minimal improvements, did not yet sorted
>> >> out alg types/names, and fuzzer started scratching surface:
>> >>
>> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36 CET
>> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017 12:23:30
>> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> >> 12:29:48 CET suspicious RCU usage at ./include/trace/events/kmem.h:LINE
>> >> 88
>> >> Nov 23 2017 12:29:15 CET
>> >
>> > This all looks strange. Where would RCU come into play with
>> > af_alg_free_areq_sgls?
>> >
>> > Do you have a reproducer?
>> >
>> >> This strongly suggests that we need to dig deeper.
>> >
>> > Absolutely. That is why I started my fuzzer that turned up already quite
>> > some issues.
>>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback is
>> welcome,
>
> I will peek into that code shortly.
>
>> I still did not figure out completely difference between e.g.
>> HASH/AHASH,
>
> AHASH is the asynchronous hash. I.e. the implementation can sleep.
>
> HASH = SHASH and is the synchronous hash. I.e. that implementation will never
> sleep.
>
> An SHASH can be turned into an AHASH by using cryptd().
>
> An AHASH can never be turned into an SHASH.
>
> To use SHASH implementations, you use the *_shash API calls. This API does not
> require a request structure.
>
> To use AHASH implementations, you use the *_ahash API calls. This API requires
> the use of ahash_request_* calls. By transparently employing cryptd(), the
> kernel allows the use of SHASH implementations with the AHASH API.
>
> Currently there is only one real AHASH implementation outside specific
> hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
> implementation can only be used with the AHASH API. All (other) SHASH
> implementations can be used with either the shash or the ahash API, because
> when using it as AHASH, the kernel automatically uses the cryptd() under the
> hood.


I am interested solely in user-space API because that's what fuzzer
uses. *_shash, ahash_request_* are of little help.
Your last sentence above means that there is _no_ difference between
HASH and AHASH from user-space?
I thrown all HASH/AHASH algs into a single bucket here:
https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2cc7db/sys/linux/alg.go
And similarly for BLKCIPHER/ABLKCIPHER.


Few additional questions:

1. just to double check: compress algs are not accessible from
user-space, right?

2. There is some setup for algorithms (ALG_SET_KEY,
ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
find). Now if I chain something complex like
gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
there require setup or not, but let's assume they do).  How do I setup
inner algorithms parameters (e.g. aes-aesni in this case)? Is there a
way to call setsockopt effectively on a particular inner alg? Or pass
control messages to an inner alg? Maybe I am asking non-sense, but
that's what comes to my mind looking at the api.




>
> Dto. BLKCIPHER is the synchronous and ABLKCIPHER is the asynchronous
> implementation. The same logic applies here as discussed for the SHASH/AHASH.
>
> There used to be a blkcipher and ablkcipher API just like the shash/ahash
> which follows exactly the same logic.
>
> About a year ago, the skcipher API was introduced which tries to get rid of
> the blkcipher API and supersedes the ablkcipher API. I.e. per default, the
> skcipher is asynchronous in nature. Yet, you can make it synchronous by
> supplying a specific flag/mask combo during crypto_skcipher_alloc.
>
> Again, BLKCIPHERS are turned into async implementations using the cryptd
> transparently by the kernel.
>
> PS: AF_ALG does *not* allow choosing the API discussed above. It always uses
> the async API (which implies that *all* cipher implementations are
> accessible). Though, it provides two types of interfaces:
>
> 1. the sync API
>
> 2. the async API
>
> The sync API simply does a sendmsg/recvmsg.
>
> The async API uses the io_submit and friends.
>
> You can see implementations for both in libkcapi.
>>
>> All bugs with details will soon be reported by syzbot
>> (https://goo.gl/tpsmEJ) to kernel mailing lists with all details.
>>
>> Stephan, thanks for your help!
>
> You are welcome.
>
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 13:49                           ` Dmitry Vyukov
@ 2017-11-24 15:03                             ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 15:03 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> I've cooked syzkaller change that teaches it to generate more
> algorithm names. Probably not idea, but much better than was before:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go

I am not as fluent in GO, so I may miss a point here:

		{"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
		{"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},

These both are templates to form an AEAD cipher. They allow you to specify the 
block cipher and the hash type.


		{"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"rfc4543", []int{ALG_AEAD}},
		{"rfc4106", []int{ALG_AEAD}},

These are no ciphers per se, but simply formatting mechanisms. For example, to 
make use of rfc4106, you must split the IV: the first four bytes need to be 
appended to the key and the trailing 8 bytes are used as the IV. Any other 
formatting should cause an error. Besides, these implementations should only 
work with some AEAD ciphers like GCM.


		{"pcrypt", []int{ALG_AEAD}},
		{"rfc4309", []int{ALG_AEAD}},
		{"gcm", []int{ALG_CIPHER}},
		{"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"ccm", []int{ALG_CIPHER}},
		{"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},




		{"echainiv", []int{ALG_AEAD}},
{"seqiv", []int{ALG_AEAD}},

These are IV generators and should be used with an AEAD cipher to internally 
generate IVs. Note, the use of IV generators have changed with 4.2 (see my 
script I sent you as part of this thread).



		{"gcm(aes)", nil},

gcm() is also a template that can consume any block cipher, including aes-
generic, serpent, ...

		{"gcm_base(ctr(aes-aesni),ghash-generic)", nil},

Again, ctr() is a template that can consume other block ciphers.

		{"generic-gcm-aesni", nil},

Does this exist?

		{"rfc4106(gcm(aes))", nil},

rfc... is a template that can consume AEAD ciphers.

		{"rfc4106-gcm-aesni", nil},
		{"__gcm-aes-aesni", nil},
{"__driver-gcm-aes-aesni", nil},

Please specifically test that __driver_... names should NEVER be allocatable 
from user space.



	// algorithms:
		{"cbc(aes)", nil},

cbc() is a template


		{"cbc(aes-aesni)", nil},
		{"chacha20", nil},
		{"chacha20-simd", nil},
		{"pcbc(aes)", nil},
		{"pcbc-aes-aesni", nil},
		{"fpu(pcbc(__aes))", nil},
		{"fpu(pcbc(__aes-aesni))", nil},
		{"pcbc(__aes)", nil},
		{"pcbc(__aes-aesni)", nil},

They are all templates.

		{"xts(aes)", nil},

xts() is a template.

Note, starting with 4.9, you must use xts(ecb(aes)).

		{"xts-aes-aesni", nil},
{"ctr(aes)", nil},

ctr() is a template



In general, I would rather suggest:

1. identify all templates and mark them what they can consume (other 
templates, AEAD, skcipher, hash, ...)

2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their 
types (skcipher, hash)

3. mix-n-match templates with the ciphers according to what templates can 
consume


Start another test where you arbitrarily mix-n-match templates and ciphers or 
even bogus names, NULL, etc.


One word of warning: some cipher names may look like templates, but they are 
not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of 
templates. Thus, I would always use the driver names (gcm-aes-aesni for the 
given example) to ensure you test exactly the cipher you had in mind.

Word of warning II: There is a bug in the kernel crypto API: if you allocate a 
cipher using the driver name that was not registered before, this cipher will 
never be available using its "name". /proc/crypto shows you the reason: the 
"name" is then identical to the driver name.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 15:03                             ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 15:03 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> I've cooked syzkaller change that teaches it to generate more
> algorithm names. Probably not idea, but much better than was before:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go

I am not as fluent in GO, so I may miss a point here:

		{"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
		{"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},

These both are templates to form an AEAD cipher. They allow you to specify the 
block cipher and the hash type.


		{"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"rfc4543", []int{ALG_AEAD}},
		{"rfc4106", []int{ALG_AEAD}},

These are no ciphers per se, but simply formatting mechanisms. For example, to 
make use of rfc4106, you must split the IV: the first four bytes need to be 
appended to the key and the trailing 8 bytes are used as the IV. Any other 
formatting should cause an error. Besides, these implementations should only 
work with some AEAD ciphers like GCM.


		{"pcrypt", []int{ALG_AEAD}},
		{"rfc4309", []int{ALG_AEAD}},
		{"gcm", []int{ALG_CIPHER}},
		{"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
		{"ccm", []int{ALG_CIPHER}},
		{"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},




		{"echainiv", []int{ALG_AEAD}},
{"seqiv", []int{ALG_AEAD}},

These are IV generators and should be used with an AEAD cipher to internally 
generate IVs. Note, the use of IV generators have changed with 4.2 (see my 
script I sent you as part of this thread).



		{"gcm(aes)", nil},

gcm() is also a template that can consume any block cipher, including aes-
generic, serpent, ...

		{"gcm_base(ctr(aes-aesni),ghash-generic)", nil},

Again, ctr() is a template that can consume other block ciphers.

		{"generic-gcm-aesni", nil},

Does this exist?

		{"rfc4106(gcm(aes))", nil},

rfc... is a template that can consume AEAD ciphers.

		{"rfc4106-gcm-aesni", nil},
		{"__gcm-aes-aesni", nil},
{"__driver-gcm-aes-aesni", nil},

Please specifically test that __driver_... names should NEVER be allocatable 
from user space.



	// algorithms:
		{"cbc(aes)", nil},

cbc() is a template


		{"cbc(aes-aesni)", nil},
		{"chacha20", nil},
		{"chacha20-simd", nil},
		{"pcbc(aes)", nil},
		{"pcbc-aes-aesni", nil},
		{"fpu(pcbc(__aes))", nil},
		{"fpu(pcbc(__aes-aesni))", nil},
		{"pcbc(__aes)", nil},
		{"pcbc(__aes-aesni)", nil},

They are all templates.

		{"xts(aes)", nil},

xts() is a template.

Note, starting with 4.9, you must use xts(ecb(aes)).

		{"xts-aes-aesni", nil},
{"ctr(aes)", nil},

ctr() is a template



In general, I would rather suggest:

1. identify all templates and mark them what they can consume (other 
templates, AEAD, skcipher, hash, ...)

2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their 
types (skcipher, hash)

3. mix-n-match templates with the ciphers according to what templates can 
consume


Start another test where you arbitrarily mix-n-match templates and ciphers or 
even bogus names, NULL, etc.


One word of warning: some cipher names may look like templates, but they are 
not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of 
templates. Thus, I would always use the driver names (gcm-aes-aesni for the 
given example) to ensure you test exactly the cipher you had in mind.

Word of warning II: There is a bug in the kernel crypto API: if you allocate a 
cipher using the driver name that was not registered before, this cipher will 
never be available using its "name". /proc/crypto shows you the reason: the 
"name" is then identical to the driver name.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 14:55                               ` Dmitry Vyukov
@ 2017-11-24 15:13                                 ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 15:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 15:55:59 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
> > 
> > Hi Dmitry,
> > 
> >> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
> > 
> > wrote:
> >> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
> >> > 
> >> > Hi Dmitry,
> >> > 
> >> >> Btw, I've started doing some minimal improvements, did not yet sorted
> >> >> out alg types/names, and fuzzer started scratching surface:
> >> >> 
> >> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36
> >> >> CET
> >> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017
> >> >> 12:23:30
> >> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> >> >> 12:29:48 CET suspicious RCU usage at
> >> >> ./include/trace/events/kmem.h:LINE
> >> >> 88
> >> >> Nov 23 2017 12:29:15 CET
> >> > 
> >> > This all looks strange. Where would RCU come into play with
> >> > af_alg_free_areq_sgls?
> >> > 
> >> > Do you have a reproducer?
> >> > 
> >> >> This strongly suggests that we need to dig deeper.
> >> > 
> >> > Absolutely. That is why I started my fuzzer that turned up already
> >> > quite
> >> > some issues.
> >> 
> >> I've cooked syzkaller change that teaches it to generate more
> >> algorithm names. Probably not idea, but much better than was before:
> >> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c148
> >> 6d2 cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback
> >> is welcome,
> > 
> > I will peek into that code shortly.
> > 
> >> I still did not figure out completely difference between e.g.
> >> HASH/AHASH,
> > 
> > AHASH is the asynchronous hash. I.e. the implementation can sleep.
> > 
> > HASH == SHASH and is the synchronous hash. I.e. that implementation will
> > never sleep.
> > 
> > An SHASH can be turned into an AHASH by using cryptd().
> > 
> > An AHASH can never be turned into an SHASH.
> > 
> > To use SHASH implementations, you use the *_shash API calls. This API does
> > not require a request structure.
> > 
> > To use AHASH implementations, you use the *_ahash API calls. This API
> > requires the use of ahash_request_* calls. By transparently employing
> > cryptd(), the kernel allows the use of SHASH implementations with the
> > AHASH API.
> > 
> > Currently there is only one real AHASH implementation outside specific
> > hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
> > implementation can only be used with the AHASH API. All (other) SHASH
> > implementations can be used with either the shash or the ahash API,
> > because
> > when using it as AHASH, the kernel automatically uses the cryptd() under
> > the hood.
> 
> I am interested solely in user-space API because that's what fuzzer
> uses. *_shash, ahash_request_* are of little help.
> Your last sentence above means that there is _no_ difference between
> HASH and AHASH from user-space?

Correct.

> I thrown all HASH/AHASH algs into a single bucket here:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go And similarly for BLKCIPHER/ABLKCIPHER.

This approach is correct.
> 
> 
> Few additional questions:
> 
> 1. just to double check: compress algs are not accessible from
> user-space, right?

Right, because there is no algif_acomp (yet).
> 
> 2. There is some setup for algorithms (ALG_SET_KEY,
> ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
> ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
> find).

... and do not forget that you need to call the setup calls *before* the 
accept() call for an operation to work correctly.

> Now if I chain something complex like
> gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
> there require setup or not, but let's assume they do).

All ciphers always require setup:

- skciphers: IV (excluding ECB of course) and key

- AEAD: IV, key, authsize and assoclen

- hashes: key (only for the MACs)

Your example of gcm_base is an AEAD.


> How do I setup
> inner algorithms parameters (e.g. aes-aesni in this case)?

You cannot access the inner ciphers. For the interface, you only have *one* 
cipher, i.e. an AEAD. The name only tells the kernel how to construct the 
cipher. But once it is constructed, it takes the aforementioned parameters. 
Though, some ciphers may be more restrictive on some parameters than others 
(e.g. the authsize or assoclen may be restricted for some AEAD ciphers).

> Is there a
> way to call setsockopt effectively on a particular inner alg?

You cannot do that and you do not want to do that.

> Or pass
> control messages to an inner alg? Maybe I am asking non-sense, but
> that's what comes to my mind looking at the api.

You cannot talk to the inner ciphers. You only talk to one cipher that you 
referred to with the name. Remember, the name is ONLY used to tell the kernel 
which parts to put together during allocation. After the allocation, you have 
only one cipher and interact with only one cipher of the given type.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 15:13                                 ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 15:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 15:55:59 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de> 
wrote:
> > Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
> > 
> > Hi Dmitry,
> > 
> >> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
> > 
> > wrote:
> >> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
> >> > 
> >> > Hi Dmitry,
> >> > 
> >> >> Btw, I've started doing some minimal improvements, did not yet sorted
> >> >> out alg types/names, and fuzzer started scratching surface:
> >> >> 
> >> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36
> >> >> CET
> >> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017
> >> >> 12:23:30
> >> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
> >> >> 12:29:48 CET suspicious RCU usage at
> >> >> ./include/trace/events/kmem.h:LINE
> >> >> 88
> >> >> Nov 23 2017 12:29:15 CET
> >> > 
> >> > This all looks strange. Where would RCU come into play with
> >> > af_alg_free_areq_sgls?
> >> > 
> >> > Do you have a reproducer?
> >> > 
> >> >> This strongly suggests that we need to dig deeper.
> >> > 
> >> > Absolutely. That is why I started my fuzzer that turned up already
> >> > quite
> >> > some issues.
> >> 
> >> I've cooked syzkaller change that teaches it to generate more
> >> algorithm names. Probably not idea, but much better than was before:
> >> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c148
> >> 6d2 cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback
> >> is welcome,
> > 
> > I will peek into that code shortly.
> > 
> >> I still did not figure out completely difference between e.g.
> >> HASH/AHASH,
> > 
> > AHASH is the asynchronous hash. I.e. the implementation can sleep.
> > 
> > HASH = SHASH and is the synchronous hash. I.e. that implementation will
> > never sleep.
> > 
> > An SHASH can be turned into an AHASH by using cryptd().
> > 
> > An AHASH can never be turned into an SHASH.
> > 
> > To use SHASH implementations, you use the *_shash API calls. This API does
> > not require a request structure.
> > 
> > To use AHASH implementations, you use the *_ahash API calls. This API
> > requires the use of ahash_request_* calls. By transparently employing
> > cryptd(), the kernel allows the use of SHASH implementations with the
> > AHASH API.
> > 
> > Currently there is only one real AHASH implementation outside specific
> > hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
> > implementation can only be used with the AHASH API. All (other) SHASH
> > implementations can be used with either the shash or the ahash API,
> > because
> > when using it as AHASH, the kernel automatically uses the cryptd() under
> > the hood.
> 
> I am interested solely in user-space API because that's what fuzzer
> uses. *_shash, ahash_request_* are of little help.
> Your last sentence above means that there is _no_ difference between
> HASH and AHASH from user-space?

Correct.

> I thrown all HASH/AHASH algs into a single bucket here:
> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
> cc7db/sys/linux/alg.go And similarly for BLKCIPHER/ABLKCIPHER.

This approach is correct.
> 
> 
> Few additional questions:
> 
> 1. just to double check: compress algs are not accessible from
> user-space, right?

Right, because there is no algif_acomp (yet).
> 
> 2. There is some setup for algorithms (ALG_SET_KEY,
> ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
> ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
> find).

... and do not forget that you need to call the setup calls *before* the 
accept() call for an operation to work correctly.

> Now if I chain something complex like
> gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
> there require setup or not, but let's assume they do).

All ciphers always require setup:

- skciphers: IV (excluding ECB of course) and key

- AEAD: IV, key, authsize and assoclen

- hashes: key (only for the MACs)

Your example of gcm_base is an AEAD.


> How do I setup
> inner algorithms parameters (e.g. aes-aesni in this case)?

You cannot access the inner ciphers. For the interface, you only have *one* 
cipher, i.e. an AEAD. The name only tells the kernel how to construct the 
cipher. But once it is constructed, it takes the aforementioned parameters. 
Though, some ciphers may be more restrictive on some parameters than others 
(e.g. the authsize or assoclen may be restricted for some AEAD ciphers).

> Is there a
> way to call setsockopt effectively on a particular inner alg?

You cannot do that and you do not want to do that.

> Or pass
> control messages to an inner alg? Maybe I am asking non-sense, but
> that's what comes to my mind looking at the api.

You cannot talk to the inner ciphers. You only talk to one cipher that you 
referred to with the name. Remember, the name is ONLY used to tell the kernel 
which parts to put together during allocation. After the allocation, you have 
only one cipher and interact with only one cipher of the given type.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 15:13                                 ` Stephan Mueller
@ 2017-11-24 15:53                                   ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 15:53 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:13 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 15:55:59 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de>
> wrote:
>> > Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>> >
>> > Hi Dmitry,
>> >
>> >> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
>> >
>> > wrote:
>> >> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>> >> >
>> >> > Hi Dmitry,
>> >> >
>> >> >> Btw, I've started doing some minimal improvements, did not yet sorted
>> >> >> out alg types/names, and fuzzer started scratching surface:
>> >> >>
>> >> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36
>> >> >> CET
>> >> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017
>> >> >> 12:23:30
>> >> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> >> >> 12:29:48 CET suspicious RCU usage at
>> >> >> ./include/trace/events/kmem.h:LINE
>> >> >> 88
>> >> >> Nov 23 2017 12:29:15 CET
>> >> >
>> >> > This all looks strange. Where would RCU come into play with
>> >> > af_alg_free_areq_sgls?
>> >> >
>> >> > Do you have a reproducer?
>> >> >
>> >> >> This strongly suggests that we need to dig deeper.
>> >> >
>> >> > Absolutely. That is why I started my fuzzer that turned up already
>> >> > quite
>> >> > some issues.
>> >>
>> >> I've cooked syzkaller change that teaches it to generate more
>> >> algorithm names. Probably not idea, but much better than was before:
>> >> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c148
>> >> 6d2 cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback
>> >> is welcome,
>> >
>> > I will peek into that code shortly.
>> >
>> >> I still did not figure out completely difference between e.g.
>> >> HASH/AHASH,
>> >
>> > AHASH is the asynchronous hash. I.e. the implementation can sleep.
>> >
>> > HASH == SHASH and is the synchronous hash. I.e. that implementation will
>> > never sleep.
>> >
>> > An SHASH can be turned into an AHASH by using cryptd().
>> >
>> > An AHASH can never be turned into an SHASH.
>> >
>> > To use SHASH implementations, you use the *_shash API calls. This API does
>> > not require a request structure.
>> >
>> > To use AHASH implementations, you use the *_ahash API calls. This API
>> > requires the use of ahash_request_* calls. By transparently employing
>> > cryptd(), the kernel allows the use of SHASH implementations with the
>> > AHASH API.
>> >
>> > Currently there is only one real AHASH implementation outside specific
>> > hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
>> > implementation can only be used with the AHASH API. All (other) SHASH
>> > implementations can be used with either the shash or the ahash API,
>> > because
>> > when using it as AHASH, the kernel automatically uses the cryptd() under
>> > the hood.
>>
>> I am interested solely in user-space API because that's what fuzzer
>> uses. *_shash, ahash_request_* are of little help.
>> Your last sentence above means that there is _no_ difference between
>> HASH and AHASH from user-space?
>
> Correct.
>
>> I thrown all HASH/AHASH algs into a single bucket here:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go And similarly for BLKCIPHER/ABLKCIPHER.
>
> This approach is correct.
>>
>>
>> Few additional questions:
>>
>> 1. just to double check: compress algs are not accessible from
>> user-space, right?
>
> Right, because there is no algif_acomp (yet).
>>
>> 2. There is some setup for algorithms (ALG_SET_KEY,
>> ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
>> ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
>> find).
>
> ... and do not forget that you need to call the setup calls *before* the
> accept() call for an operation to work correctly.


I will hand it off to fuzzer, it should be able to do it both ways (+
calling both before and after, calling concurrently, etc).


>> Now if I chain something complex like
>> gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
>> there require setup or not, but let's assume they do).
>
> All ciphers always require setup:
>
> - skciphers: IV (excluding ECB of course) and key
>
> - AEAD: IV, key, authsize and assoclen
>
> - hashes: key (only for the MACs)
>
> Your example of gcm_base is an AEAD.
>
>
>> How do I setup
>> inner algorithms parameters (e.g. aes-aesni in this case)?
>
> You cannot access the inner ciphers. For the interface, you only have *one*
> cipher, i.e. an AEAD. The name only tells the kernel how to construct the
> cipher. But once it is constructed, it takes the aforementioned parameters.
> Though, some ciphers may be more restrictive on some parameters than others
> (e.g. the authsize or assoclen may be restricted for some AEAD ciphers).
>
>> Is there a
>> way to call setsockopt effectively on a particular inner alg?
>
> You cannot do that and you do not want to do that.
>
>> Or pass
>> control messages to an inner alg? Maybe I am asking non-sense, but
>> that's what comes to my mind looking at the api.
>
> You cannot talk to the inner ciphers. You only talk to one cipher that you
> referred to with the name. Remember, the name is ONLY used to tell the kernel
> which parts to put together during allocation. After the allocation, you have
> only one cipher and interact with only one cipher of the given type.

I see. Makes sense. I guess an outer template can transitively setup
inner algorithms if necessary.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 15:53                                   ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 15:53 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:13 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 15:55:59 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> On Fri, Nov 24, 2017 at 3:36 PM, Stephan Mueller <smueller@chronox.de>
> wrote:
>> > Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>> >
>> > Hi Dmitry,
>> >
>> >> On Thu, Nov 23, 2017 at 1:35 PM, Stephan Mueller <smueller@chronox.de>
>> >
>> > wrote:
>> >> > Am Donnerstag, 23. November 2017, 12:34:54 CET schrieb Dmitry Vyukov:
>> >> >
>> >> > Hi Dmitry,
>> >> >
>> >> >> Btw, I've started doing some minimal improvements, did not yet sorted
>> >> >> out alg types/names, and fuzzer started scratching surface:
>> >> >>
>> >> >> WARNING: kernel stack regs has bad 'bp' value 77 Nov 23 2017 12:29:36
>> >> >> CET
>> >> >> general protection fault in af_alg_free_areq_sgls 54 Nov 23 2017
>> >> >> 12:23:30
>> >> >> CET general protection fault in crypto_chacha20_crypt 100 Nov 23 2017
>> >> >> 12:29:48 CET suspicious RCU usage at
>> >> >> ./include/trace/events/kmem.h:LINE
>> >> >> 88
>> >> >> Nov 23 2017 12:29:15 CET
>> >> >
>> >> > This all looks strange. Where would RCU come into play with
>> >> > af_alg_free_areq_sgls?
>> >> >
>> >> > Do you have a reproducer?
>> >> >
>> >> >> This strongly suggests that we need to dig deeper.
>> >> >
>> >> > Absolutely. That is why I started my fuzzer that turned up already
>> >> > quite
>> >> > some issues.
>> >>
>> >> I've cooked syzkaller change that teaches it to generate more
>> >> algorithm names. Probably not idea, but much better than was before:
>> >> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c148
>> >> 6d2 cc7db/sys/linux/alg.go (if you see any obvious issues there, feedback
>> >> is welcome,
>> >
>> > I will peek into that code shortly.
>> >
>> >> I still did not figure out completely difference between e.g.
>> >> HASH/AHASH,
>> >
>> > AHASH is the asynchronous hash. I.e. the implementation can sleep.
>> >
>> > HASH = SHASH and is the synchronous hash. I.e. that implementation will
>> > never sleep.
>> >
>> > An SHASH can be turned into an AHASH by using cryptd().
>> >
>> > An AHASH can never be turned into an SHASH.
>> >
>> > To use SHASH implementations, you use the *_shash API calls. This API does
>> > not require a request structure.
>> >
>> > To use AHASH implementations, you use the *_ahash API calls. This API
>> > requires the use of ahash_request_* calls. By transparently employing
>> > cryptd(), the kernel allows the use of SHASH implementations with the
>> > AHASH API.
>> >
>> > Currently there is only one real AHASH implementation outside specific
>> > hardware drivers: sha1_mb and sha2*_mb found in arch/x86/crypto/. This
>> > implementation can only be used with the AHASH API. All (other) SHASH
>> > implementations can be used with either the shash or the ahash API,
>> > because
>> > when using it as AHASH, the kernel automatically uses the cryptd() under
>> > the hood.
>>
>> I am interested solely in user-space API because that's what fuzzer
>> uses. *_shash, ahash_request_* are of little help.
>> Your last sentence above means that there is _no_ difference between
>> HASH and AHASH from user-space?
>
> Correct.
>
>> I thrown all HASH/AHASH algs into a single bucket here:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go And similarly for BLKCIPHER/ABLKCIPHER.
>
> This approach is correct.
>>
>>
>> Few additional questions:
>>
>> 1. just to double check: compress algs are not accessible from
>> user-space, right?
>
> Right, because there is no algif_acomp (yet).
>>
>> 2. There is some setup for algorithms (ALG_SET_KEY,
>> ALG_SET_AEAD_AUTHSIZE setsockopts and ALG_SET_IV, ALG_SET_OP,
>> ALG_SET_AEAD_ASSOCLEN control messages are the ones that was able to
>> find).
>
> ... and do not forget that you need to call the setup calls *before* the
> accept() call for an operation to work correctly.


I will hand it off to fuzzer, it should be able to do it both ways (+
calling both before and after, calling concurrently, etc).


>> Now if I chain something complex like
>> gcm_base(ctr(aes-aesni),ghash-generic) (I don't know if algorithms
>> there require setup or not, but let's assume they do).
>
> All ciphers always require setup:
>
> - skciphers: IV (excluding ECB of course) and key
>
> - AEAD: IV, key, authsize and assoclen
>
> - hashes: key (only for the MACs)
>
> Your example of gcm_base is an AEAD.
>
>
>> How do I setup
>> inner algorithms parameters (e.g. aes-aesni in this case)?
>
> You cannot access the inner ciphers. For the interface, you only have *one*
> cipher, i.e. an AEAD. The name only tells the kernel how to construct the
> cipher. But once it is constructed, it takes the aforementioned parameters.
> Though, some ciphers may be more restrictive on some parameters than others
> (e.g. the authsize or assoclen may be restricted for some AEAD ciphers).
>
>> Is there a
>> way to call setsockopt effectively on a particular inner alg?
>
> You cannot do that and you do not want to do that.
>
>> Or pass
>> control messages to an inner alg? Maybe I am asking non-sense, but
>> that's what comes to my mind looking at the api.
>
> You cannot talk to the inner ciphers. You only talk to one cipher that you
> referred to with the name. Remember, the name is ONLY used to tell the kernel
> which parts to put together during allocation. After the allocation, you have
> only one cipher and interact with only one cipher of the given type.

I see. Makes sense. I guess an outer template can transitively setup
inner algorithms if necessary.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 15:53                                   ` Dmitry Vyukov
@ 2017-11-24 16:07                                     ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:07 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 16:53:26 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> > You cannot talk to the inner ciphers. You only talk to one cipher that you
> > referred to with the name. Remember, the name is ONLY used to tell the
> > kernel which parts to put together during allocation. After the
> > allocation, you have only one cipher and interact with only one cipher of
> > the given type.
> I see. Makes sense. I guess an outer template can transitively setup
> inner algorithms if necessary.

Exactly. See crypto/gcm.c for example. This is a template to invoke a CTR and 
GHASH implementation. Thus it has no need for a key itself. Hence, a setkey 
is:

static int crypto_gcm_setkey(struct crypto_aead *aead, const u8 *key,
                             unsigned int keylen)
{
...
        err = crypto_skcipher_setkey(ctr, key, keylen);
...

Where ctr is the reference to the CTR block cipher that was allocated with 
gcm_base(...) or implicitly when using gcm(...) which internally turns into a 
gcm_base(...).


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:07                                     ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:07 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 16:53:26 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> > You cannot talk to the inner ciphers. You only talk to one cipher that you
> > referred to with the name. Remember, the name is ONLY used to tell the
> > kernel which parts to put together during allocation. After the
> > allocation, you have only one cipher and interact with only one cipher of
> > the given type.
> I see. Makes sense. I guess an outer template can transitively setup
> inner algorithms if necessary.

Exactly. See crypto/gcm.c for example. This is a template to invoke a CTR and 
GHASH implementation. Thus it has no need for a key itself. Hence, a setkey 
is:

static int crypto_gcm_setkey(struct crypto_aead *aead, const u8 *key,
                             unsigned int keylen)
{
...
        err = crypto_skcipher_setkey(ctr, key, keylen);
...

Where ctr is the reference to the CTR block cipher that was allocated with 
gcm_base(...) or implicitly when using gcm(...) which internally turns into a 
gcm_base(...).


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 15:03                             ` Stephan Mueller
@ 2017-11-24 16:10                               ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:10 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:03 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go
>
> I am not as fluent in GO, so I may miss a point here:
>
>                 {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>                 {"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},
>
> These both are templates to form an AEAD cipher. They allow you to specify the
> block cipher and the hash type.
>
>
>                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc4543", []int{ALG_AEAD}},
>                 {"rfc4106", []int{ALG_AEAD}},
>
> These are no ciphers per se, but simply formatting mechanisms. For example, to
> make use of rfc4106, you must split the IV: the first four bytes need to be
> appended to the key and the trailing 8 bytes are used as the IV. Any other
> formatting should cause an error. Besides, these implementations should only
> work with some AEAD ciphers like GCM.
>
>
>                 {"pcrypt", []int{ALG_AEAD}},
>                 {"rfc4309", []int{ALG_AEAD}},
>                 {"gcm", []int{ALG_CIPHER}},
>                 {"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"ccm", []int{ALG_CIPHER}},
>                 {"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>
>
>
>
>                 {"echainiv", []int{ALG_AEAD}},
> {"seqiv", []int{ALG_AEAD}},
>
> These are IV generators and should be used with an AEAD cipher to internally
> generate IVs. Note, the use of IV generators have changed with 4.2 (see my
> script I sent you as part of this thread).
>
>
>
>                 {"gcm(aes)", nil},
>
> gcm() is also a template that can consume any block cipher, including aes-
> generic, serpent, ...
>
>                 {"gcm_base(ctr(aes-aesni),ghash-generic)", nil},
>
> Again, ctr() is a template that can consume other block ciphers.
>
>                 {"generic-gcm-aesni", nil},
>
> Does this exist?
>
>                 {"rfc4106(gcm(aes))", nil},
>
> rfc... is a template that can consume AEAD ciphers.
>
>                 {"rfc4106-gcm-aesni", nil},
>                 {"__gcm-aes-aesni", nil},
> {"__driver-gcm-aes-aesni", nil},
>
> Please specifically test that __driver_... names should NEVER be allocatable
> from user space.
>
>
>
>         // algorithms:
>                 {"cbc(aes)", nil},
>
> cbc() is a template
>
>
>                 {"cbc(aes-aesni)", nil},
>                 {"chacha20", nil},
>                 {"chacha20-simd", nil},
>                 {"pcbc(aes)", nil},
>                 {"pcbc-aes-aesni", nil},
>                 {"fpu(pcbc(__aes))", nil},
>                 {"fpu(pcbc(__aes-aesni))", nil},
>                 {"pcbc(__aes)", nil},
>                 {"pcbc(__aes-aesni)", nil},
>
> They are all templates.
>
>                 {"xts(aes)", nil},
>
> xts() is a template.
>
> Note, starting with 4.9, you must use xts(ecb(aes)).
>
>                 {"xts-aes-aesni", nil},
> {"ctr(aes)", nil},
>
> ctr() is a template
>
>
>
> In general, I would rather suggest:
>
> 1. identify all templates and mark them what they can consume (other
> templates, AEAD, skcipher, hash, ...)
>
> 2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their
> types (skcipher, hash)
>
> 3. mix-n-match templates with the ciphers according to what templates can
> consume


That's more-or-less what I did. Here:

var allAlgs = map[int][]algDesc{
  ALG_AEAD: []algDesc{
    // templates:
    {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
    {"gcm", []int{ALG_CIPHER}},

ALG_AEAD means that all of these names produce ALG_AEAD as the result.
Names that has arguments after them (authencesn, gcm) are templates,
and the list of arguments denote number and types of arguments for the
template.

So here authencesn is a template that produces AEAD and has 2
arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
Similarly, gsm is template that produces AEAD and has 1 argument of type CIPHER.

...
    // algorithms:
    {"gcm(aes)", nil},
    {"__gcm-aes-aesni", nil},

If second value is nil, that's a complete algorithm (also of type
AEAD). I pulled in all registered implementations, so the contain the
specialized implementations like "gcm(aes)", but note that gsm is also
described as template above so fuzzer can use other inner ALG_CIPHER
algorithms with gsm.


> Start another test where you arbitrarily mix-n-match templates and ciphers or
> even bogus names, NULL, etc.
>
>
> One word of warning: some cipher names may look like templates, but they are
> not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of
> templates. Thus, I would always use the driver names (gcm-aes-aesni for the
> given example) to ensure you test exactly the cipher you had in mind.

I've pulled all registered algs from kernel with the following code:

void crypto_dumb(void)
{
    struct crypto_alg *alg;
    const char *type;

    down_write(&crypto_alg_sem);
    list_for_each_entry(alg, &crypto_alg_list, cra_list) {
        pr_err("name=%s driver=%s async=%d type=%d\n",
            alg->cra_name, alg->cra_driver_name,
            !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
            alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
    }
    up_write(&crypto_alg_sem);
}

so I've got these "gcm(aes)" as well. Which is why you see
{"gcm(aes)", nil} in algorithms section.

However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
But kernel does not let me allocate neither of them (EINVAL in both
cases).




> Word of warning II: There is a bug in the kernel crypto API: if you allocate a
> cipher using the driver name that was not registered before, this cipher will
> never be available using its "name". /proc/crypto shows you the reason: the
> "name" is then identical to the driver name.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:10                               ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:10 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:03 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go
>
> I am not as fluent in GO, so I may miss a point here:
>
>                 {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>                 {"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},
>
> These both are templates to form an AEAD cipher. They allow you to specify the
> block cipher and the hash type.
>
>
>                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc4543", []int{ALG_AEAD}},
>                 {"rfc4106", []int{ALG_AEAD}},
>
> These are no ciphers per se, but simply formatting mechanisms. For example, to
> make use of rfc4106, you must split the IV: the first four bytes need to be
> appended to the key and the trailing 8 bytes are used as the IV. Any other
> formatting should cause an error. Besides, these implementations should only
> work with some AEAD ciphers like GCM.
>
>
>                 {"pcrypt", []int{ALG_AEAD}},
>                 {"rfc4309", []int{ALG_AEAD}},
>                 {"gcm", []int{ALG_CIPHER}},
>                 {"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"ccm", []int{ALG_CIPHER}},
>                 {"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>
>
>
>
>                 {"echainiv", []int{ALG_AEAD}},
> {"seqiv", []int{ALG_AEAD}},
>
> These are IV generators and should be used with an AEAD cipher to internally
> generate IVs. Note, the use of IV generators have changed with 4.2 (see my
> script I sent you as part of this thread).
>
>
>
>                 {"gcm(aes)", nil},
>
> gcm() is also a template that can consume any block cipher, including aes-
> generic, serpent, ...
>
>                 {"gcm_base(ctr(aes-aesni),ghash-generic)", nil},
>
> Again, ctr() is a template that can consume other block ciphers.
>
>                 {"generic-gcm-aesni", nil},
>
> Does this exist?
>
>                 {"rfc4106(gcm(aes))", nil},
>
> rfc... is a template that can consume AEAD ciphers.
>
>                 {"rfc4106-gcm-aesni", nil},
>                 {"__gcm-aes-aesni", nil},
> {"__driver-gcm-aes-aesni", nil},
>
> Please specifically test that __driver_... names should NEVER be allocatable
> from user space.
>
>
>
>         // algorithms:
>                 {"cbc(aes)", nil},
>
> cbc() is a template
>
>
>                 {"cbc(aes-aesni)", nil},
>                 {"chacha20", nil},
>                 {"chacha20-simd", nil},
>                 {"pcbc(aes)", nil},
>                 {"pcbc-aes-aesni", nil},
>                 {"fpu(pcbc(__aes))", nil},
>                 {"fpu(pcbc(__aes-aesni))", nil},
>                 {"pcbc(__aes)", nil},
>                 {"pcbc(__aes-aesni)", nil},
>
> They are all templates.
>
>                 {"xts(aes)", nil},
>
> xts() is a template.
>
> Note, starting with 4.9, you must use xts(ecb(aes)).
>
>                 {"xts-aes-aesni", nil},
> {"ctr(aes)", nil},
>
> ctr() is a template
>
>
>
> In general, I would rather suggest:
>
> 1. identify all templates and mark them what they can consume (other
> templates, AEAD, skcipher, hash, ...)
>
> 2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their
> types (skcipher, hash)
>
> 3. mix-n-match templates with the ciphers according to what templates can
> consume


That's more-or-less what I did. Here:

var allAlgs = map[int][]algDesc{
  ALG_AEAD: []algDesc{
    // templates:
    {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
    {"gcm", []int{ALG_CIPHER}},

ALG_AEAD means that all of these names produce ALG_AEAD as the result.
Names that has arguments after them (authencesn, gcm) are templates,
and the list of arguments denote number and types of arguments for the
template.

So here authencesn is a template that produces AEAD and has 2
arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
Similarly, gsm is template that produces AEAD and has 1 argument of type CIPHER.

...
    // algorithms:
    {"gcm(aes)", nil},
    {"__gcm-aes-aesni", nil},

If second value is nil, that's a complete algorithm (also of type
AEAD). I pulled in all registered implementations, so the contain the
specialized implementations like "gcm(aes)", but note that gsm is also
described as template above so fuzzer can use other inner ALG_CIPHER
algorithms with gsm.


> Start another test where you arbitrarily mix-n-match templates and ciphers or
> even bogus names, NULL, etc.
>
>
> One word of warning: some cipher names may look like templates, but they are
> not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of
> templates. Thus, I would always use the driver names (gcm-aes-aesni for the
> given example) to ensure you test exactly the cipher you had in mind.

I've pulled all registered algs from kernel with the following code:

void crypto_dumb(void)
{
    struct crypto_alg *alg;
    const char *type;

    down_write(&crypto_alg_sem);
    list_for_each_entry(alg, &crypto_alg_list, cra_list) {
        pr_err("name=%s driver=%s async=%d type=%d\n",
            alg->cra_name, alg->cra_driver_name,
            !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
            alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
    }
    up_write(&crypto_alg_sem);
}

so I've got these "gcm(aes)" as well. Which is why you see
{"gcm(aes)", nil} in algorithms section.

However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
But kernel does not let me allocate neither of them (EINVAL in both
cases).




> Word of warning II: There is a bug in the kernel crypto API: if you allocate a
> cipher using the driver name that was not registered before, this cipher will
> never be available using its "name". /proc/crypto shows you the reason: the
> "name" is then identical to the driver name.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 15:03                             ` Stephan Mueller
@ 2017-11-24 16:18                               ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:18 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:03 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go
>
> I am not as fluent in GO, so I may miss a point here:
>
>                 {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>                 {"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},
>
> These both are templates to form an AEAD cipher. They allow you to specify the
> block cipher and the hash type.


Most of this works the way you describe. This is effectively production grammar.
authencesn is a template produces AEAD and consumes 2 args: ALG_HASH
and ALG_BLKCIPHER.



>                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc4543", []int{ALG_AEAD}},
>                 {"rfc4106", []int{ALG_AEAD}},
>
> These are no ciphers per se, but simply formatting mechanisms. For example, to
> make use of rfc4106, you must split the IV: the first four bytes need to be
> appended to the key and the trailing 8 bytes are used as the IV. Any other
> formatting should cause an error. Besides, these implementations should only
> work with some AEAD ciphers like GCM.



So rfc4543 consumes AEAD and itself is a AEAD (can be passed whenever
AEAD is requried), right? If yes, then it works the way you described
(minus the part that is works only with _some_ AEAD ciphers, fuzzer
will try to blindly combine it with all of them).

rfc7539 consumes 2 args, not 1, right? I figured out that it consumes
BLKCIPHER and HASH.



>
>
>                 {"pcrypt", []int{ALG_AEAD}},
>                 {"rfc4309", []int{ALG_AEAD}},
>                 {"gcm", []int{ALG_CIPHER}},
>                 {"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"ccm", []int{ALG_CIPHER}},
>                 {"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>
>
>
>
>                 {"echainiv", []int{ALG_AEAD}},
> {"seqiv", []int{ALG_AEAD}},
>
> These are IV generators and should be used with an AEAD cipher to internally
> generate IVs. Note, the use of IV generators have changed with 4.2 (see my
> script I sent you as part of this thread).
>
>
>
>                 {"gcm(aes)", nil},
>
> gcm() is also a template that can consume any block cipher, including aes-
> generic, serpent, ...
>
>                 {"gcm_base(ctr(aes-aesni),ghash-generic)", nil},
>
> Again, ctr() is a template that can consume other block ciphers.
>
>                 {"generic-gcm-aesni", nil},
>
> Does this exist?

I can create it:

strcpy(addr.salg_type, "aead");
strcpy(addr.salg_name, "generic-gcm-aesni");

bind(3, {sa_family=0x26 /* AF_??? */,
sa_data="aead\0\0\0\0\0\0\0\0\0\0"}, 88) = 0



>
>                 {"rfc4106(gcm(aes))", nil},
>
> rfc... is a template that can consume AEAD ciphers.
>
>                 {"rfc4106-gcm-aesni", nil},
>                 {"__gcm-aes-aesni", nil},
> {"__driver-gcm-aes-aesni", nil},
>
> Please specifically test that __driver_... names should NEVER be allocatable
> from user space.
>
>
>
>         // algorithms:
>                 {"cbc(aes)", nil},
>
> cbc() is a template
>
>
>                 {"cbc(aes-aesni)", nil},
>                 {"chacha20", nil},
>                 {"chacha20-simd", nil},
>                 {"pcbc(aes)", nil},
>                 {"pcbc-aes-aesni", nil},
>                 {"fpu(pcbc(__aes))", nil},
>                 {"fpu(pcbc(__aes-aesni))", nil},
>                 {"pcbc(__aes)", nil},
>                 {"pcbc(__aes-aesni)", nil},
>
> They are all templates.
>
>                 {"xts(aes)", nil},
>
> xts() is a template.
>
> Note, starting with 4.9, you must use xts(ecb(aes)).

"xts(aes)" also works on upstream (4.15):

strcpy(addr.salg_type, "skcipher");
strcpy(addr.salg_name, "xts(aes)");

bind(3, {sa_family=0x26 /* AF_??? */, sa_data="skcipher\0\0\0\0\0\0"}, 88) = 0



>                 {"xts-aes-aesni", nil},
> {"ctr(aes)", nil},
>
> ctr() is a template
>
>
>
> In general, I would rather suggest:
>
> 1. identify all templates and mark them what they can consume (other
> templates, AEAD, skcipher, hash, ...)
>
> 2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their
> types (skcipher, hash)
>
> 3. mix-n-match templates with the ciphers according to what templates can
> consume
>
>
> Start another test where you arbitrarily mix-n-match templates and ciphers or
> even bogus names, NULL, etc.
>
>
> One word of warning: some cipher names may look like templates, but they are
> not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of
> templates. Thus, I would always use the driver names (gcm-aes-aesni for the
> given example) to ensure you test exactly the cipher you had in mind.
>
> Word of warning II: There is a bug in the kernel crypto API: if you allocate a
> cipher using the driver name that was not registered before, this cipher will
> never be available using its "name". /proc/crypto shows you the reason: the
> "name" is then identical to the driver name.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:18                               ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:18 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 4:03 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 14:49:49 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> I've cooked syzkaller change that teaches it to generate more
>> algorithm names. Probably not idea, but much better than was before:
>> https://github.com/google/syzkaller/blob/ddf7b3e0655cf6dfeacfe509e477c1486d2
>> cc7db/sys/linux/alg.go
>
> I am not as fluent in GO, so I may miss a point here:
>
>                 {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>                 {"authenc", []int{ALG_HASH, ALG_BLKCIPHER}},
>
> These both are templates to form an AEAD cipher. They allow you to specify the
> block cipher and the hash type.


Most of this works the way you describe. This is effectively production grammar.
authencesn is a template produces AEAD and consumes 2 args: ALG_HASH
and ALG_BLKCIPHER.



>                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"rfc4543", []int{ALG_AEAD}},
>                 {"rfc4106", []int{ALG_AEAD}},
>
> These are no ciphers per se, but simply formatting mechanisms. For example, to
> make use of rfc4106, you must split the IV: the first four bytes need to be
> appended to the key and the trailing 8 bytes are used as the IV. Any other
> formatting should cause an error. Besides, these implementations should only
> work with some AEAD ciphers like GCM.



So rfc4543 consumes AEAD and itself is a AEAD (can be passed whenever
AEAD is requried), right? If yes, then it works the way you described
(minus the part that is works only with _some_ AEAD ciphers, fuzzer
will try to blindly combine it with all of them).

rfc7539 consumes 2 args, not 1, right? I figured out that it consumes
BLKCIPHER and HASH.



>
>
>                 {"pcrypt", []int{ALG_AEAD}},
>                 {"rfc4309", []int{ALG_AEAD}},
>                 {"gcm", []int{ALG_CIPHER}},
>                 {"gcm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>                 {"ccm", []int{ALG_CIPHER}},
>                 {"ccm_base", []int{ALG_BLKCIPHER, ALG_HASH}},
>
>
>
>
>                 {"echainiv", []int{ALG_AEAD}},
> {"seqiv", []int{ALG_AEAD}},
>
> These are IV generators and should be used with an AEAD cipher to internally
> generate IVs. Note, the use of IV generators have changed with 4.2 (see my
> script I sent you as part of this thread).
>
>
>
>                 {"gcm(aes)", nil},
>
> gcm() is also a template that can consume any block cipher, including aes-
> generic, serpent, ...
>
>                 {"gcm_base(ctr(aes-aesni),ghash-generic)", nil},
>
> Again, ctr() is a template that can consume other block ciphers.
>
>                 {"generic-gcm-aesni", nil},
>
> Does this exist?

I can create it:

strcpy(addr.salg_type, "aead");
strcpy(addr.salg_name, "generic-gcm-aesni");

bind(3, {sa_family=0x26 /* AF_??? */,
sa_data="aead\0\0\0\0\0\0\0\0\0\0"}, 88) = 0



>
>                 {"rfc4106(gcm(aes))", nil},
>
> rfc... is a template that can consume AEAD ciphers.
>
>                 {"rfc4106-gcm-aesni", nil},
>                 {"__gcm-aes-aesni", nil},
> {"__driver-gcm-aes-aesni", nil},
>
> Please specifically test that __driver_... names should NEVER be allocatable
> from user space.
>
>
>
>         // algorithms:
>                 {"cbc(aes)", nil},
>
> cbc() is a template
>
>
>                 {"cbc(aes-aesni)", nil},
>                 {"chacha20", nil},
>                 {"chacha20-simd", nil},
>                 {"pcbc(aes)", nil},
>                 {"pcbc-aes-aesni", nil},
>                 {"fpu(pcbc(__aes))", nil},
>                 {"fpu(pcbc(__aes-aesni))", nil},
>                 {"pcbc(__aes)", nil},
>                 {"pcbc(__aes-aesni)", nil},
>
> They are all templates.
>
>                 {"xts(aes)", nil},
>
> xts() is a template.
>
> Note, starting with 4.9, you must use xts(ecb(aes)).

"xts(aes)" also works on upstream (4.15):

strcpy(addr.salg_type, "skcipher");
strcpy(addr.salg_name, "xts(aes)");

bind(3, {sa_family=0x26 /* AF_??? */, sa_data="skcipher\0\0\0\0\0\0"}, 88) = 0



>                 {"xts-aes-aesni", nil},
> {"ctr(aes)", nil},
>
> ctr() is a template
>
>
>
> In general, I would rather suggest:
>
> 1. identify all templates and mark them what they can consume (other
> templates, AEAD, skcipher, hash, ...)
>
> 2. identify all ciphers (aes, aes-generic, aes-aesni, ...) and identify their
> types (skcipher, hash)
>
> 3. mix-n-match templates with the ciphers according to what templates can
> consume
>
>
> Start another test where you arbitrarily mix-n-match templates and ciphers or
> even bogus names, NULL, etc.
>
>
> One word of warning: some cipher names may look like templates, but they are
> not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not consisting of
> templates. Thus, I would always use the driver names (gcm-aes-aesni for the
> given example) to ensure you test exactly the cipher you had in mind.
>
> Word of warning II: There is a bug in the kernel crypto API: if you allocate a
> cipher using the driver name that was not registered before, this cipher will
> never be available using its "name". /proc/crypto shows you the reason: the
> "name" is then identical to the driver name.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 16:10                               ` Dmitry Vyukov
@ 2017-11-24 16:19                                 ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:19 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:10:59 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> That's more-or-less what I did. Here:
> 
> var allAlgs = map[int][]algDesc{
>   ALG_AEAD: []algDesc{
>     // templates:
>     {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>     {"gcm", []int{ALG_CIPHER}},
> 
> ALG_AEAD means that all of these names produce ALG_AEAD as the result.
> Names that has arguments after them (authencesn, gcm) are templates,
> and the list of arguments denote number and types of arguments for the
> template.
> 
> So here authencesn is a template that produces AEAD and has 2
> arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
> Similarly, gsm is template that produces AEAD and has 1 argument of type
> CIPHER.
> 
> ...
>     // algorithms:
>     {"gcm(aes)", nil},
>     {"__gcm-aes-aesni", nil},
> 
> If second value is nil, that's a complete algorithm (also of type
> AEAD). I pulled in all registered implementations, so the contain the
> specialized implementations like "gcm(aes)", but note that gsm is also
> described as template above so fuzzer can use other inner ALG_CIPHER
> algorithms with gsm.

Thanks for the clarification -- as said, I am not very fluent in GO yet. :-)
> 
> > Start another test where you arbitrarily mix-n-match templates and ciphers
> > or even bogus names, NULL, etc.
> > 
> > 
> > One word of warning: some cipher names may look like templates, but they
> > are not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not
> > consisting of templates. Thus, I would always use the driver names
> > (gcm-aes-aesni for the given example) to ensure you test exactly the
> > cipher you had in mind.
> I've pulled all registered algs from kernel with the following code:
> 
> void crypto_dumb(void)
> {
>     struct crypto_alg *alg;
>     const char *type;
> 
>     down_write(&crypto_alg_sem);
>     list_for_each_entry(alg, &crypto_alg_list, cra_list) {
>         pr_err("name=%s driver=%s async=%d type=%d\n",
>             alg->cra_name, alg->cra_driver_name,
>             !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
>             alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
>     }
>     up_write(&crypto_alg_sem);
> }

I would not obtain the list from a running kernel. Many ciphers are 
implemented in modules that are loaded on demand (or sometimes even manually). 
This list therefore is not complete per definition. Thus, I would rather grep 
through the code and search for cra_driver_name.

And once you get to templates, it is even more imperative to grep the code: 
Only full ciphers are listed in the given list. A template itself is never a 
cipher and will not show up there. Only once a template is combined into a 
full cipher, it will be registered in the list at runtime.

Note, the traversed list is what forms /proc/crypto.

Thus, after a boot, you will not see, say, rfc4106(gcm(aes)) in /proc/crypto 
(or that list). But after one allocation of that cipher, it suddenly pops up 
in /proc/crypto or that list.
> 
> so I've got these "gcm(aes)" as well. Which is why you see
> {"gcm(aes)", nil} in algorithms section.
> 
> However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
> But kernel does not let me allocate neither of them (EINVAL in both
> cases).

Very good. Please leave such test, because they must not be allocatable.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:19                                 ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:19 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:10:59 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> That's more-or-less what I did. Here:
> 
> var allAlgs = map[int][]algDesc{
>   ALG_AEAD: []algDesc{
>     // templates:
>     {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>     {"gcm", []int{ALG_CIPHER}},
> 
> ALG_AEAD means that all of these names produce ALG_AEAD as the result.
> Names that has arguments after them (authencesn, gcm) are templates,
> and the list of arguments denote number and types of arguments for the
> template.
> 
> So here authencesn is a template that produces AEAD and has 2
> arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
> Similarly, gsm is template that produces AEAD and has 1 argument of type
> CIPHER.
> 
> ...
>     // algorithms:
>     {"gcm(aes)", nil},
>     {"__gcm-aes-aesni", nil},
> 
> If second value is nil, that's a complete algorithm (also of type
> AEAD). I pulled in all registered implementations, so the contain the
> specialized implementations like "gcm(aes)", but note that gsm is also
> described as template above so fuzzer can use other inner ALG_CIPHER
> algorithms with gsm.

Thanks for the clarification -- as said, I am not very fluent in GO yet. :-)
> 
> > Start another test where you arbitrarily mix-n-match templates and ciphers
> > or even bogus names, NULL, etc.
> > 
> > 
> > One word of warning: some cipher names may look like templates, but they
> > are not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not
> > consisting of templates. Thus, I would always use the driver names
> > (gcm-aes-aesni for the given example) to ensure you test exactly the
> > cipher you had in mind.
> I've pulled all registered algs from kernel with the following code:
> 
> void crypto_dumb(void)
> {
>     struct crypto_alg *alg;
>     const char *type;
> 
>     down_write(&crypto_alg_sem);
>     list_for_each_entry(alg, &crypto_alg_list, cra_list) {
>         pr_err("name=%s driver=%s async=%d type=%d\n",
>             alg->cra_name, alg->cra_driver_name,
>             !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
>             alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
>     }
>     up_write(&crypto_alg_sem);
> }

I would not obtain the list from a running kernel. Many ciphers are 
implemented in modules that are loaded on demand (or sometimes even manually). 
This list therefore is not complete per definition. Thus, I would rather grep 
through the code and search for cra_driver_name.

And once you get to templates, it is even more imperative to grep the code: 
Only full ciphers are listed in the given list. A template itself is never a 
cipher and will not show up there. Only once a template is combined into a 
full cipher, it will be registered in the list at runtime.

Note, the traversed list is what forms /proc/crypto.

Thus, after a boot, you will not see, say, rfc4106(gcm(aes)) in /proc/crypto 
(or that list). But after one allocation of that cipher, it suddenly pops up 
in /proc/crypto or that list.
> 
> so I've got these "gcm(aes)" as well. Which is why you see
> {"gcm(aes)", nil} in algorithms section.
> 
> However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
> But kernel does not let me allocate neither of them (EINVAL in both
> cases).

Very good. Please leave such test, because they must not be allocatable.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 16:18                               ` Dmitry Vyukov
@ 2017-11-24 16:23                                 ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:18:08 CET schrieb Dmitry Vyukov:

Hi Dmitry,
> 
> >                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
> >                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
> >                 {"rfc4543", []int{ALG_AEAD}},
> >                 {"rfc4106", []int{ALG_AEAD}},
> > 
> > These are no ciphers per se, but simply formatting mechanisms. For
> > example, to make use of rfc4106, you must split the IV: the first four
> > bytes need to be appended to the key and the trailing 8 bytes are used as
> > the IV. Any other formatting should cause an error. Besides, these
> > implementations should only work with some AEAD ciphers like GCM.
> 
> So rfc4543 consumes AEAD and itself is a AEAD (can be passed whenever
> AEAD is requried), right?

Yes. Again, it is purely formatting of input data.

> If yes, then it works the way you described
> (minus the part that is works only with _some_ AEAD ciphers, fuzzer
> will try to blindly combine it with all of them).
> 
> rfc7539 consumes 2 args, not 1, right? I figured out that it consumes
> BLKCIPHER and HASH.

Right, it is intended for the combo of chacha20 and poly1305.

> >                 {"generic-gcm-aesni", nil},
> > 
> > Does this exist?
> 
> I can create it:
> 
> strcpy(addr.salg_type, "aead");
> strcpy(addr.salg_name, "generic-gcm-aesni");
> 
> bind(3, {sa_family=0x26 /* AF_??? */,
> sa_data="aead\0\0\0\0\0\0\0\0\0\0"}, 88) = 0

Ok, I have not seen that one before.

> > xts() is a template.
> > 
> > Note, starting with 4.9, you must use xts(ecb(aes)).
> 
> "xts(aes)" also works on upstream (4.15):
> 
> strcpy(addr.salg_type, "skcipher");
> strcpy(addr.salg_name, "xts(aes)");
> 
> bind(3, {sa_family=0x26 /* AF_??? */, sa_data="skcipher\0\0\0\0\0\0"}, 88) =
> 0

Ok, I stand corrected. At one point, this did not work :-)


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:23                                 ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:18:08 CET schrieb Dmitry Vyukov:

Hi Dmitry,
> 
> >                 {"rfc7539esp", []int{ALG_BLKCIPHER, ALG_HASH}},
> >                 {"rfc7539", []int{ALG_BLKCIPHER, ALG_HASH}},
> >                 {"rfc4543", []int{ALG_AEAD}},
> >                 {"rfc4106", []int{ALG_AEAD}},
> > 
> > These are no ciphers per se, but simply formatting mechanisms. For
> > example, to make use of rfc4106, you must split the IV: the first four
> > bytes need to be appended to the key and the trailing 8 bytes are used as
> > the IV. Any other formatting should cause an error. Besides, these
> > implementations should only work with some AEAD ciphers like GCM.
> 
> So rfc4543 consumes AEAD and itself is a AEAD (can be passed whenever
> AEAD is requried), right?

Yes. Again, it is purely formatting of input data.

> If yes, then it works the way you described
> (minus the part that is works only with _some_ AEAD ciphers, fuzzer
> will try to blindly combine it with all of them).
> 
> rfc7539 consumes 2 args, not 1, right? I figured out that it consumes
> BLKCIPHER and HASH.

Right, it is intended for the combo of chacha20 and poly1305.

> >                 {"generic-gcm-aesni", nil},
> > 
> > Does this exist?
> 
> I can create it:
> 
> strcpy(addr.salg_type, "aead");
> strcpy(addr.salg_name, "generic-gcm-aesni");
> 
> bind(3, {sa_family=0x26 /* AF_??? */,
> sa_data="aead\0\0\0\0\0\0\0\0\0\0"}, 88) = 0

Ok, I have not seen that one before.

> > xts() is a template.
> > 
> > Note, starting with 4.9, you must use xts(ecb(aes)).
> 
> "xts(aes)" also works on upstream (4.15):
> 
> strcpy(addr.salg_type, "skcipher");
> strcpy(addr.salg_name, "xts(aes)");
> 
> bind(3, {sa_family=0x26 /* AF_??? */, sa_data="skcipher\0\0\0\0\0\0"}, 88) > 0

Ok, I stand corrected. At one point, this did not work :-)


Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 16:19                                 ` Stephan Mueller
@ 2017-11-24 16:25                                   ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:25 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 5:19 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 17:10:59 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> That's more-or-less what I did. Here:
>>
>> var allAlgs = map[int][]algDesc{
>>   ALG_AEAD: []algDesc{
>>     // templates:
>>     {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>>     {"gcm", []int{ALG_CIPHER}},
>>
>> ALG_AEAD means that all of these names produce ALG_AEAD as the result.
>> Names that has arguments after them (authencesn, gcm) are templates,
>> and the list of arguments denote number and types of arguments for the
>> template.
>>
>> So here authencesn is a template that produces AEAD and has 2
>> arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
>> Similarly, gsm is template that produces AEAD and has 1 argument of type
>> CIPHER.
>>
>> ...
>>     // algorithms:
>>     {"gcm(aes)", nil},
>>     {"__gcm-aes-aesni", nil},
>>
>> If second value is nil, that's a complete algorithm (also of type
>> AEAD). I pulled in all registered implementations, so the contain the
>> specialized implementations like "gcm(aes)", but note that gsm is also
>> described as template above so fuzzer can use other inner ALG_CIPHER
>> algorithms with gsm.
>
> Thanks for the clarification -- as said, I am not very fluent in GO yet. :-)
>>
>> > Start another test where you arbitrarily mix-n-match templates and ciphers
>> > or even bogus names, NULL, etc.
>> >
>> >
>> > One word of warning: some cipher names may look like templates, but they
>> > are not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not
>> > consisting of templates. Thus, I would always use the driver names
>> > (gcm-aes-aesni for the given example) to ensure you test exactly the
>> > cipher you had in mind.
>> I've pulled all registered algs from kernel with the following code:
>>
>> void crypto_dumb(void)
>> {
>>     struct crypto_alg *alg;
>>     const char *type;
>>
>>     down_write(&crypto_alg_sem);
>>     list_for_each_entry(alg, &crypto_alg_list, cra_list) {
>>         pr_err("name=%s driver=%s async=%d type=%d\n",
>>             alg->cra_name, alg->cra_driver_name,
>>             !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
>>             alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
>>     }
>>     up_write(&crypto_alg_sem);
>> }
>
> I would not obtain the list from a running kernel. Many ciphers are
> implemented in modules that are loaded on demand (or sometimes even manually).
> This list therefore is not complete per definition. Thus, I would rather grep
> through the code and search for cra_driver_name.
>
> And once you get to templates, it is even more imperative to grep the code:
> Only full ciphers are listed in the given list. A template itself is never a
> cipher and will not show up there. Only once a template is combined into a
> full cipher, it will be registered in the list at runtime.
>
> Note, the traversed list is what forms /proc/crypto.
>
> Thus, after a boot, you will not see, say, rfc4106(gcm(aes)) in /proc/crypto
> (or that list). But after one allocation of that cipher, it suddenly pops up
> in /proc/crypto or that list.


I've enabled ALL crypto configs as not-modules. So I should have all
of them there (at least all x86 ones).
I've also dumped the templates the same way:

+void crypto_template_dumb(void)
+{
+       struct crypto_template *tmpl;
+
+       down_read(&crypto_alg_sem);
+       list_for_each_entry(tmpl, &crypto_template_list, list) {
+               pr_err("name=%s\n", tmpl->name);
+       }
+       up_read(&crypto_alg_sem);
+}


Eric also pointed me to grep. But I can't say the code is intuitive.
I've spent way more time than I expected to just get a list of all
algorithms with their types. Say, in some cases algorithm descriptions
do not specify their type, it's somehow plumbed somewhere later.
Getting the list at runtime allowed me to get relevant ones with
proper types.






>> so I've got these "gcm(aes)" as well. Which is why you see
>> {"gcm(aes)", nil} in algorithms section.
>>
>> However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
>> But kernel does not let me allocate neither of them (EINVAL in both
>> cases).
>
> Very good. Please leave such test, because they must not be allocatable.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:25                                   ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-24 16:25 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 5:19 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 17:10:59 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> That's more-or-less what I did. Here:
>>
>> var allAlgs = map[int][]algDesc{
>>   ALG_AEAD: []algDesc{
>>     // templates:
>>     {"authencesn", []int{ALG_HASH, ALG_BLKCIPHER}},
>>     {"gcm", []int{ALG_CIPHER}},
>>
>> ALG_AEAD means that all of these names produce ALG_AEAD as the result.
>> Names that has arguments after them (authencesn, gcm) are templates,
>> and the list of arguments denote number and types of arguments for the
>> template.
>>
>> So here authencesn is a template that produces AEAD and has 2
>> arguments: first is ALG_HASH, second is ALG_BLKCIPHER.
>> Similarly, gsm is template that produces AEAD and has 1 argument of type
>> CIPHER.
>>
>> ...
>>     // algorithms:
>>     {"gcm(aes)", nil},
>>     {"__gcm-aes-aesni", nil},
>>
>> If second value is nil, that's a complete algorithm (also of type
>> AEAD). I pulled in all registered implementations, so the contain the
>> specialized implementations like "gcm(aes)", but note that gsm is also
>> described as template above so fuzzer can use other inner ALG_CIPHER
>> algorithms with gsm.
>
> Thanks for the clarification -- as said, I am not very fluent in GO yet. :-)
>>
>> > Start another test where you arbitrarily mix-n-match templates and ciphers
>> > or even bogus names, NULL, etc.
>> >
>> >
>> > One word of warning: some cipher names may look like templates, but they
>> > are not: gcm(aes) in arch/x86/crypto/ is a complete cipher and not
>> > consisting of templates. Thus, I would always use the driver names
>> > (gcm-aes-aesni for the given example) to ensure you test exactly the
>> > cipher you had in mind.
>> I've pulled all registered algs from kernel with the following code:
>>
>> void crypto_dumb(void)
>> {
>>     struct crypto_alg *alg;
>>     const char *type;
>>
>>     down_write(&crypto_alg_sem);
>>     list_for_each_entry(alg, &crypto_alg_list, cra_list) {
>>         pr_err("name=%s driver=%s async=%d type=%d\n",
>>             alg->cra_name, alg->cra_driver_name,
>>             !!(alg->cra_flags & CRYPTO_ALG_ASYNC),
>>             alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
>>     }
>>     up_write(&crypto_alg_sem);
>> }
>
> I would not obtain the list from a running kernel. Many ciphers are
> implemented in modules that are loaded on demand (or sometimes even manually).
> This list therefore is not complete per definition. Thus, I would rather grep
> through the code and search for cra_driver_name.
>
> And once you get to templates, it is even more imperative to grep the code:
> Only full ciphers are listed in the given list. A template itself is never a
> cipher and will not show up there. Only once a template is combined into a
> full cipher, it will be registered in the list at runtime.
>
> Note, the traversed list is what forms /proc/crypto.
>
> Thus, after a boot, you will not see, say, rfc4106(gcm(aes)) in /proc/crypto
> (or that list). But after one allocation of that cipher, it suddenly pops up
> in /proc/crypto or that list.


I've enabled ALL crypto configs as not-modules. So I should have all
of them there (at least all x86 ones).
I've also dumped the templates the same way:

+void crypto_template_dumb(void)
+{
+       struct crypto_template *tmpl;
+
+       down_read(&crypto_alg_sem);
+       list_for_each_entry(tmpl, &crypto_template_list, list) {
+               pr_err("name=%s\n", tmpl->name);
+       }
+       up_read(&crypto_alg_sem);
+}


Eric also pointed me to grep. But I can't say the code is intuitive.
I've spent way more time than I expected to just get a list of all
algorithms with their types. Say, in some cases algorithm descriptions
do not specify their type, it's somehow plumbed somewhere later.
Getting the list at runtime allowed me to get relevant ones with
proper types.






>> so I've got these "gcm(aes)" as well. Which is why you see
>> {"gcm(aes)", nil} in algorithms section.
>>
>> However, the name was actually "__gcm-aes-aesni", not "gcm-aes-aesni".
>> But kernel does not let me allocate neither of them (EINVAL in both
>> cases).
>
> Very good. Please leave such test, because they must not be allocatable.
>
> Ciao
> Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 16:25                                   ` Dmitry Vyukov
@ 2017-11-24 16:31                                     ` Stephan Mueller
  -1 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:25:55 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Eric also pointed me to grep. But I can't say the code is intuitive.
> I've spent way more time than I expected to just get a list of all
> algorithms with their types. Say, in some cases algorithm descriptions
> do not specify their type, it's somehow plumbed somewhere later.
> Getting the list at runtime allowed me to get relevant ones with
> proper types.

Hehe, getting the list of ciphers in the script I sent the other day took me a 
long time.

Maybe you leave it for now and come back later for a good grepping code to 
search the sources.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-24 16:31                                     ` Stephan Mueller
  0 siblings, 0 replies; 61+ messages in thread
From: Stephan Mueller @ 2017-11-24 16:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

Am Freitag, 24. November 2017, 17:25:55 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> Eric also pointed me to grep. But I can't say the code is intuitive.
> I've spent way more time than I expected to just get a list of all
> algorithms with their types. Say, in some cases algorithm descriptions
> do not specify their type, it's somehow plumbed somewhere later.
> Getting the list at runtime allowed me to get relevant ones with
> proper types.

Hehe, getting the list of ciphers in the script I sent the other day took me a 
long time.

Maybe you leave it for now and come back later for a good grepping code to 
search the sources.

Ciao
Stephan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
  2017-11-24 16:31                                     ` Stephan Mueller
@ 2017-11-28  9:59                                       ` Dmitry Vyukov
  -1 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-28  9:59 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 5:31 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 17:25:55 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> Eric also pointed me to grep. But I can't say the code is intuitive.
>> I've spent way more time than I expected to just get a list of all
>> algorithms with their types. Say, in some cases algorithm descriptions
>> do not specify their type, it's somehow plumbed somewhere later.
>> Getting the list at runtime allowed me to get relevant ones with
>> proper types.
>
> Hehe, getting the list of ciphers in the script I sent the other day took me a
> long time.
>
> Maybe you leave it for now and come back later for a good grepping code to
> search the sources.


First bugs uncovered by the change started popping up:
https://groups.google.com/forum/#!forum/syzkaller-bugs
More coming!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: x509 parsing bug + fuzzing crypto in the userspace
@ 2017-11-28  9:59                                       ` Dmitry Vyukov
  0 siblings, 0 replies; 61+ messages in thread
From: Dmitry Vyukov @ 2017-11-28  9:59 UTC (permalink / raw)
  To: Stephan Mueller
  Cc: Eric Biggers, Alexander Potapenko, linux-crypto,
	Kostya Serebryany, keyrings, Andrey Konovalov

On Fri, Nov 24, 2017 at 5:31 PM, Stephan Mueller <smueller@chronox.de> wrote:
> Am Freitag, 24. November 2017, 17:25:55 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> Eric also pointed me to grep. But I can't say the code is intuitive.
>> I've spent way more time than I expected to just get a list of all
>> algorithms with their types. Say, in some cases algorithm descriptions
>> do not specify their type, it's somehow plumbed somewhere later.
>> Getting the list at runtime allowed me to get relevant ones with
>> proper types.
>
> Hehe, getting the list of ciphers in the script I sent the other day took me a
> long time.
>
> Maybe you leave it for now and come back later for a good grepping code to
> search the sources.


First bugs uncovered by the change started popping up:
https://groups.google.com/forum/#!forum/syzkaller-bugs
More coming!

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2017-11-28  9:59 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-20 14:10 x509 parsing bug + fuzzing crypto in the userspace Alexander Potapenko
2017-11-20 21:42 ` Eric Biggers
2017-11-20 21:42   ` Eric Biggers
2017-11-21  8:00   ` Dmitry Vyukov
2017-11-21  8:00     ` Dmitry Vyukov
2017-11-21 20:46     ` Eric Biggers
2017-11-21 20:46       ` Eric Biggers
2017-11-22 10:44       ` Dmitry Vyukov
2017-11-22 10:44         ` Dmitry Vyukov
2017-11-22 17:08         ` Stephan Mueller
2017-11-22 17:08           ` Stephan Mueller
2017-11-23  9:32           ` Dmitry Vyukov
2017-11-23  9:32             ` Dmitry Vyukov
2017-11-23  9:35             ` Dmitry Vyukov
2017-11-23  9:35               ` Dmitry Vyukov
2017-11-23  9:37               ` Dmitry Vyukov
2017-11-23  9:37                 ` Dmitry Vyukov
2017-11-23 11:10                 ` Stephan Mueller
2017-11-23 11:10                   ` Stephan Mueller
2017-11-23 11:27                   ` Dmitry Vyukov
2017-11-23 11:27                     ` Dmitry Vyukov
2017-11-23 11:34                     ` Dmitry Vyukov
2017-11-23 11:34                       ` Dmitry Vyukov
2017-11-23 12:35                       ` Stephan Mueller
2017-11-23 12:35                         ` Stephan Mueller
2017-11-24 13:49                         ` Dmitry Vyukov
2017-11-24 13:49                           ` Dmitry Vyukov
2017-11-24 14:36                           ` Stephan Mueller
2017-11-24 14:36                             ` Stephan Mueller
2017-11-24 14:55                             ` Dmitry Vyukov
2017-11-24 14:55                               ` Dmitry Vyukov
2017-11-24 15:13                               ` Stephan Mueller
2017-11-24 15:13                                 ` Stephan Mueller
2017-11-24 15:53                                 ` Dmitry Vyukov
2017-11-24 15:53                                   ` Dmitry Vyukov
2017-11-24 16:07                                   ` Stephan Mueller
2017-11-24 16:07                                     ` Stephan Mueller
2017-11-24 15:03                           ` Stephan Mueller
2017-11-24 15:03                             ` Stephan Mueller
2017-11-24 16:10                             ` Dmitry Vyukov
2017-11-24 16:10                               ` Dmitry Vyukov
2017-11-24 16:19                               ` Stephan Mueller
2017-11-24 16:19                                 ` Stephan Mueller
2017-11-24 16:25                                 ` Dmitry Vyukov
2017-11-24 16:25                                   ` Dmitry Vyukov
2017-11-24 16:31                                   ` Stephan Mueller
2017-11-24 16:31                                     ` Stephan Mueller
2017-11-28  9:59                                     ` Dmitry Vyukov
2017-11-28  9:59                                       ` Dmitry Vyukov
2017-11-24 16:18                             ` Dmitry Vyukov
2017-11-24 16:18                               ` Dmitry Vyukov
2017-11-24 16:23                               ` Stephan Mueller
2017-11-24 16:23                                 ` Stephan Mueller
2017-11-23 12:32                     ` Stephan Mueller
2017-11-23 12:32                       ` Stephan Mueller
2017-11-22 16:54       ` Stephan Mueller
2017-11-22 16:54         ` Stephan Mueller
2017-11-22 17:03         ` Dmitry Vyukov
2017-11-22 17:03           ` Dmitry Vyukov
2017-11-22 17:15           ` Stephan Mueller
2017-11-22 17:15             ` Stephan Mueller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.