linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Remaining crypto API regressions with CONFIG_VMAP_STACK
@ 2016-12-09 23:08 Eric Biggers
  2016-12-10  5:25 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Eric Biggers @ 2016-12-09 23:08 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-kernel, linux-mm, kernel-hardening, Herbert Xu,
	Andrew Lutomirski, Stephan Mueller

In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
default on x86_64.  This has been exposing a number of problems in which
on-stack buffers are being passed into the crypto API, which to support crypto
accelerators operates on 'struct page' rather than on virtual memory.

Some of these problems have already been fixed, but I was wondering how many
problems remain, so I briefly looked through all the callers of sg_set_buf() and
sg_init_one().  Overall I found quite a few remaining problems, detailed below.

The following crypto drivers initialize a scatterlist to point into an
ahash_request, which may have been allocated on the stack with
AHASH_REQUEST_ON_STACK():

	drivers/crypto/bfin_crc.c:351
	drivers/crypto/qce/sha.c:299
	drivers/crypto/sahara.c:973,988
	drivers/crypto/talitos.c:1910
	drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
	drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
	drivers/crypto/qce/sha.c:325

The following crypto drivers initialize a scatterlist to point into an
ablkcipher_request, which may have been allocated on the stack with
SKCIPHER_REQUEST_ON_STACK():

	drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
	drivers/crypto/ccp/ccp-crypto-aes.c:94

And these other places do crypto operations on buffers clearly on the stack:

	drivers/net/wireless/intersil/orinoco/mic.c:72
	drivers/usb/wusbcore/crypto.c:264
	net/ceph/crypto.c:182
	net/rxrpc/rxkad.c:737,1000
	security/keys/encrypted-keys/encrypted.c:500
	fs/cifs/smbencrypt.c:96

Note: I almost certainly missed some, since I excluded places where the use of a
stack buffer was not obvious to me.  I also excluded AEAD algorithms since there
isn't an AEAD_REQUEST_ON_STACK() macro (yet).

The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
on a vmalloc address and get back the same address, even though you aren't
*supposed* to be able to do this.  This will make things still work for most
people.  The bad news is that if you happen to have consumed just about 1 page
(or N pages) of your stack at the time you call the crypto API, your stack
buffer may actually span physically non-contiguous pages, so the crypto
algorithm will scribble over some unrelated page.  Also, hardware crypto drivers
which actually do operate on physical memory will break too.

So I am wondering: is the best solution really to make all these crypto API
algorithms and users use heap buffers, as opposed to something like maintaining
a lowmem alias for the stack, or introducing a more general function to convert
buffers (possibly in the vmalloc space) into scatterlists?  And if the current
solution is desired, who is going to fix all of these bugs and when?

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-09 23:08 Remaining crypto API regressions with CONFIG_VMAP_STACK Eric Biggers
@ 2016-12-10  5:25 ` Andy Lutomirski
  2016-12-10  5:32   ` Herbert Xu
                     ` (2 more replies)
  2016-12-11 19:13 ` Andy Lutomirski
  2016-12-12 18:34 ` Andy Lutomirski
  2 siblings, 3 replies; 19+ messages in thread
From: Andy Lutomirski @ 2016-12-10  5:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> default on x86_64.  This has been exposing a number of problems in which
> on-stack buffers are being passed into the crypto API, which to support crypto
> accelerators operates on 'struct page' rather than on virtual memory.
>
> Some of these problems have already been fixed, but I was wondering how many
> problems remain, so I briefly looked through all the callers of sg_set_buf() and
> sg_init_one().  Overall I found quite a few remaining problems, detailed below.
>
> The following crypto drivers initialize a scatterlist to point into an
> ahash_request, which may have been allocated on the stack with
> AHASH_REQUEST_ON_STACK():
>
>         drivers/crypto/bfin_crc.c:351
>         drivers/crypto/qce/sha.c:299
>         drivers/crypto/sahara.c:973,988
>         drivers/crypto/talitos.c:1910

This are impossible or highly unlikely on x86.

>         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
>         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124

These

>         drivers/crypto/qce/sha.c:325

This is impossible on x86.

>
> The following crypto drivers initialize a scatterlist to point into an
> ablkcipher_request, which may have been allocated on the stack with
> SKCIPHER_REQUEST_ON_STACK():
>
>         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
>         drivers/crypto/ccp/ccp-crypto-aes.c:94

These are real, and I wish I'd known about them sooner.

>
> And these other places do crypto operations on buffers clearly on the stack:
>
>         drivers/net/wireless/intersil/orinoco/mic.c:72

Ick.

>         drivers/usb/wusbcore/crypto.c:264

Well, crud.  I thought I had fixed this driver but I missed one case.
Will send a fix tomorrow.  But I'm still unconvinced that this
hardware ever shipped.

>         net/ceph/crypto.c:182

Ick.

>         net/rxrpc/rxkad.c:737,1000

Well, crud.  This was supposed to have been fixed in:

commit a263629da519b2064588377416e067727e2cbdf9
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sun Jun 26 14:55:24 2016 -0700

    rxrpc: Avoid using stack memory in SG lists in rxkad


>         security/keys/encrypted-keys/encrypted.c:500

That's a trivial one-liner.  Patch coming tomorrow.

>         fs/cifs/smbencrypt.c:96

Ick.

>
> Note: I almost certainly missed some, since I excluded places where the use of a
> stack buffer was not obvious to me.  I also excluded AEAD algorithms since there
> isn't an AEAD_REQUEST_ON_STACK() macro (yet).
>
> The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
> CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
> on a vmalloc address and get back the same address, even though you aren't
> *supposed* to be able to do this.  This will make things still work for most
> people.  The bad news is that if you happen to have consumed just about 1 page
> (or N pages) of your stack at the time you call the crypto API, your stack
> buffer may actually span physically non-contiguous pages, so the crypto
> algorithm will scribble over some unrelated page.

Are you sure?  If it round-trips to the same virtual address, it
doesn't matter if the buffer is contiguous.

>  Also, hardware crypto drivers
> which actually do operate on physical memory will break too.

Those were already broken.  DMA has been illegal on the stack for
years and DMA debugging would have caught it.

>
> So I am wondering: is the best solution really to make all these crypto API
> algorithms and users use heap buffers, as opposed to something like maintaining
> a lowmem alias for the stack, or introducing a more general function to convert
> buffers (possibly in the vmalloc space) into scatterlists?  And if the current
> solution is desired, who is going to fix all of these bugs and when?

The *right* solution IMO is to fix crypto to stop using scatterlists.
Scatterlists are for DMA using physical addresses, and they're
inappropriate almost every user of them that's using them for crypto.
kiov would be much better -- it would make sense and it would be
faster.

I have a hack to make scatterlists pointing to the stack work (as long
as they're only one element), but that's seriously gross.

Herbert, how hard would it be to teach the crypto code to use a more
sensible data structure than scatterlist and to use coccinelle fix
this stuff for real?

In the mean time, we should patch the handful of drivers that matter.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:25 ` Andy Lutomirski
@ 2016-12-10  5:32   ` Herbert Xu
  2016-12-10  6:03     ` [kernel-hardening] " Eric Biggers
  2016-12-10  5:37   ` Herbert Xu
  2016-12-10  5:55   ` Eric Biggers
  2 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2016-12-10  5:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Biggers, linux-crypto, linux-kernel, linux-mm,
	kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
>
> > The following crypto drivers initialize a scatterlist to point into an
> > ablkcipher_request, which may have been allocated on the stack with
> > SKCIPHER_REQUEST_ON_STACK():
> >
> >         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
> >         drivers/crypto/ccp/ccp-crypto-aes.c:94
> 
> These are real, and I wish I'd known about them sooner.

Are you sure? Any instance of *_ON_STACK must only be used with
sync algorithms and most drivers under drivers/crypto declare
themselves as async.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:25 ` Andy Lutomirski
  2016-12-10  5:32   ` Herbert Xu
@ 2016-12-10  5:37   ` Herbert Xu
  2016-12-10  6:30     ` [kernel-hardening] " Eric Biggers
  2016-12-10 14:45     ` Jason A. Donenfeld
  2016-12-10  5:55   ` Eric Biggers
  2 siblings, 2 replies; 19+ messages in thread
From: Herbert Xu @ 2016-12-10  5:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Biggers, linux-crypto, linux-kernel, linux-mm,
	kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
>
> Herbert, how hard would it be to teach the crypto code to use a more
> sensible data structure than scatterlist and to use coccinelle fix
> this stuff for real?

First of all we already have a sync non-SG hash interface, it's
called shash.

If we had enough sync-only users of skcipher then I'll consider
adding an interface for it.  However, at this point in time it
appears to more sense to convert such users over to the async
interface rather than the other way around.

As for AEAD we never had a sync interface to begin with and I
don't think I'm going to add one.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:25 ` Andy Lutomirski
  2016-12-10  5:32   ` Herbert Xu
  2016-12-10  5:37   ` Herbert Xu
@ 2016-12-10  5:55   ` Eric Biggers
  2 siblings, 0 replies; 19+ messages in thread
From: Eric Biggers @ 2016-12-10  5:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
> > The following crypto drivers initialize a scatterlist to point into an
> > ahash_request, which may have been allocated on the stack with
> > AHASH_REQUEST_ON_STACK():
> >
> >         drivers/crypto/bfin_crc.c:351
> >         drivers/crypto/qce/sha.c:299
> >         drivers/crypto/sahara.c:973,988
> >         drivers/crypto/talitos.c:1910
> 
> This are impossible or highly unlikely on x86.
> 
> >         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
> >         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
> 
> These
> 
> >         drivers/crypto/qce/sha.c:325
> 
> This is impossible on x86.
> 

Thanks for looking into these.  I didn't investigate who/what is likely to be
using each driver.

Of course I would not be surprised to see people want to start supporting
virtually mapped stacks on other architectures too.

> >
> > The "good" news with these bugs is that on x86_64 without CONFIG_DEBUG_SG=y or
> > CONFIG_DEBUG_VIRTUAL=y, you can still do virt_to_page() and then page_address()
> > on a vmalloc address and get back the same address, even though you aren't
> > *supposed* to be able to do this.  This will make things still work for most
> > people.  The bad news is that if you happen to have consumed just about 1 page
> > (or N pages) of your stack at the time you call the crypto API, your stack
> > buffer may actually span physically non-contiguous pages, so the crypto
> > algorithm will scribble over some unrelated page.
> 
> Are you sure?  If it round-trips to the same virtual address, it
> doesn't matter if the buffer is contiguous.

You may be right, I didn't test this.  The hash_walk and blkcipher_walk code do
go page by page, but I suppose on x86_64 it would just step from one bogus
"struct page" to the adjacent one and still map it to the original virtual
address.

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kernel-hardening] Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:32   ` Herbert Xu
@ 2016-12-10  6:03     ` Eric Biggers
  2016-12-10  8:16       ` Herbert Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Biggers @ 2016-12-10  6:03 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Andy Lutomirski, linux-crypto, linux-kernel, linux-mm,
	Andrew Lutomirski, Stephan Mueller

On Sat, Dec 10, 2016 at 01:32:08PM +0800, Herbert Xu wrote:
> On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
> >
> > > The following crypto drivers initialize a scatterlist to point into an
> > > ablkcipher_request, which may have been allocated on the stack with
> > > SKCIPHER_REQUEST_ON_STACK():
> > >
> > >         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
> > >         drivers/crypto/ccp/ccp-crypto-aes.c:94
> > 
> > These are real, and I wish I'd known about them sooner.
> 
> Are you sure? Any instance of *_ON_STACK must only be used with
> sync algorithms and most drivers under drivers/crypto declare
> themselves as async.
> 

Why exactly is that?  Obviously, it wouldn't work if you returned from the stack
frame before the request completed, but does anything stop someone from using an
*_ON_STACK() request and then waiting for the request to complete before
returning from the stack frame?

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kernel-hardening] Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:37   ` Herbert Xu
@ 2016-12-10  6:30     ` Eric Biggers
  2016-12-10 14:45     ` Jason A. Donenfeld
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Biggers @ 2016-12-10  6:30 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Andy Lutomirski, linux-crypto, linux-kernel, linux-mm,
	Andrew Lutomirski, Stephan Mueller

On Sat, Dec 10, 2016 at 01:37:12PM +0800, Herbert Xu wrote:
> On Fri, Dec 09, 2016 at 09:25:38PM -0800, Andy Lutomirski wrote:
> >
> > Herbert, how hard would it be to teach the crypto code to use a more
> > sensible data structure than scatterlist and to use coccinelle fix
> > this stuff for real?
> 
> First of all we already have a sync non-SG hash interface, it's
> called shash.
> 
> If we had enough sync-only users of skcipher then I'll consider
> adding an interface for it.  However, at this point in time it
> appears to more sense to convert such users over to the async
> interface rather than the other way around.
> 
> As for AEAD we never had a sync interface to begin with and I
> don't think I'm going to add one.
> 

Isn't the question of "should the API use physical or virtual addresses"
independent of the question of "should the API support asynchronous requests"?
You can already choose, via the flags and mask arguments when allocating a
crypto transform, whether you want it to be synchronous or asynchronous or
whether you don't care.  I don't see what that says about whether the API should
take in physical memory (e.g. scatterlists or struct pages) or virtual memory
(e.g. iov_iters or just regular pointers).

And while it's true that asynchronous algorithms are often provided by hardware
drivers that operate on physical memory, it's not always the case.  For example
some of the AES-NI algorithms are asynchronous only because they use the SSE
registers which can't always available to kernel code, so the request may need
to be processed by another thread.

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kernel-hardening] Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  6:03     ` [kernel-hardening] " Eric Biggers
@ 2016-12-10  8:16       ` Herbert Xu
  2016-12-10  8:39         ` Eric Biggers
  0 siblings, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2016-12-10  8:16 UTC (permalink / raw)
  To: Eric Biggers
  Cc: kernel-hardening, luto, linux-crypto, linux-kernel, linux-mm,
	luto, smueller

Why did you drop me from the CC list when you were replying to
my email?

Eric Biggers <ebiggers3@gmail.com> wrote:
> On Sat, Dec 10, 2016 at 01:32:08PM +0800, Herbert Xu wrote:
>
>> Are you sure? Any instance of *_ON_STACK must only be used with
>> sync algorithms and most drivers under drivers/crypto declare
>> themselves as async.
> 
> Why exactly is that?  Obviously, it wouldn't work if you returned from the stack
> frame before the request completed, but does anything stop someone from using an
> *_ON_STACK() request and then waiting for the request to complete before
> returning from the stack frame?

The *_ON_STACK variants (except SHASH of course) were simply hacks
to help legacy crypto API users to cope with the new async interface.
In general we should avoid using the sync interface when possible.

It's a bad idea for the obvious reason that most of our async
algorithms want to DMA and that doesn't work very well when you're
using memory from the stack.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  8:16       ` Herbert Xu
@ 2016-12-10  8:39         ` Eric Biggers
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Biggers @ 2016-12-10  8:39 UTC (permalink / raw)
  To: Herbert Xu
  Cc: kernel-hardening, luto, linux-crypto, linux-kernel, linux-mm,
	luto, smueller

On Sat, Dec 10, 2016 at 04:16:43PM +0800, Herbert Xu wrote:
> Why did you drop me from the CC list when you were replying to
> my email?
> 

Sorry --- this thread is Cc'ed to the kernel-hardening mailing list (which was
somewhat recently revived), and I replied to the email that reached me from
there.  It looks like it currently behaves a little differently from the vger
mailing lists, in that it replaces "Reply-To" with the address of the mailing
list itself rather than the sender.  So that's how you got dropped.  It also
seems to add a prefix to the subject...

I
> >> Are you sure? Any instance of *_ON_STACK must only be used with
> >> sync algorithms and most drivers under drivers/crypto declare
> >> themselves as async.
> > 
> > Why exactly is that?  Obviously, it wouldn't work if you returned from the stack
> > frame before the request completed, but does anything stop someone from using an
> > *_ON_STACK() request and then waiting for the request to complete before
> > returning from the stack frame?
> 
> The *_ON_STACK variants (except SHASH of course) were simply hacks
> to help legacy crypto API users to cope with the new async interface.
> In general we should avoid using the sync interface when possible.
> 
> It's a bad idea for the obvious reason that most of our async
> algorithms want to DMA and that doesn't work very well when you're
> using memory from the stack.

Sure, I just feel that the idea of "is this algorithm asynchronous?" is being
conflated with the idea of "does this algorithm operate on physical memory?".
Also, if *_ON_STACK are really not allowed with asynchronous algorithms can
there at least be a comment or a WARN_ON() to express this?

Thanks,

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kernel-hardening] Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10  5:37   ` Herbert Xu
  2016-12-10  6:30     ` [kernel-hardening] " Eric Biggers
@ 2016-12-10 14:45     ` Jason A. Donenfeld
  2016-12-10 17:48       ` Andy Lutomirski
  1 sibling, 1 reply; 19+ messages in thread
From: Jason A. Donenfeld @ 2016-12-10 14:45 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Andy Lutomirski, Eric Biggers, linux-crypto, linux-kernel,
	linux-mm, Andrew Lutomirski, Stephan Mueller

Hi Herbert,

On Sat, Dec 10, 2016 at 6:37 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> As for AEAD we never had a sync interface to begin with and I
> don't think I'm going to add one.

That's too bad to hear. I hope you'll reconsider. Modern cryptographic
design is heading more and more in the direction of using AEADs for
interesting things, and having a sync interface would be a lot easier
for implementing these protocols. In the same way many protocols need
a hash of some data, now protocols often want some particular data
encrypted with an AEAD using a particular key and nonce and AD. One
protocol that comes to mind is Noise [1].

I know that in my own [currently external to the tree] kernel code, I
just forego the use of the crypto API all together, and one of the
primary reasons for that is lack of a sync interface for AEADs. When I
eventually send this upstream, presumably everyone will want me to use
the crypto API, and having a sync AEAD interface would be personally
helpful for that. I guess I could always write the sync interface
myself, but I imagine you'd prefer having the design control etc.

Jason


[1] http://noiseprotocol.org/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [kernel-hardening] Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-10 14:45     ` Jason A. Donenfeld
@ 2016-12-10 17:48       ` Andy Lutomirski
  0 siblings, 0 replies; 19+ messages in thread
From: Andy Lutomirski @ 2016-12-10 17:48 UTC (permalink / raw)
  To: Jason A. Donenfeld, Al Viro
  Cc: kernel-hardening, Eric Biggers, linux-crypto, linux-kernel,
	linux-mm, Andrew Lutomirski, Stephan Mueller

cc: Viro because I'm talking about iov_iter.

On Sat, Dec 10, 2016 at 6:45 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hi Herbert,
>
> On Sat, Dec 10, 2016 at 6:37 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> As for AEAD we never had a sync interface to begin with and I
>> don't think I'm going to add one.
>
> That's too bad to hear. I hope you'll reconsider. Modern cryptographic
> design is heading more and more in the direction of using AEADs for
> interesting things, and having a sync interface would be a lot easier
> for implementing these protocols. In the same way many protocols need
> a hash of some data, now protocols often want some particular data
> encrypted with an AEAD using a particular key and nonce and AD. One
> protocol that comes to mind is Noise [1].
>

I think that sync vs async has gotten conflated with
vectored-vs-nonvectored and the results are unfortunate.

There are a lot of users in the tree that are trying to do crypto on
very small pieces of data and want to have that data consist of the
concatenation of two small buffers and/or want to use primitives that
don't have "sync" interfaces.  These users are stuck using async
interfaces even though using async implementations makes no sense for
them.

I'd love to see the API restructured a bit to decouple all of these
considerations.  One approach might be to teach iov_iter about
scatterlists.  Then, for each primitive, there could be two entry
points:

1. A simplified and lower-overhead entry.  You pass it an iov_iter
(and, depending on what the operation is, an output iov_iter), it does
the crypto synchronously, and returns.  Operating in-place might be
permitted for some primitives.

2. A full-featured async entry.  You pass it iov_iters and it requires
that the iov_iters be backed by scatterlists in order to operate
asynchronously.

I see no reason that the decisions to use virtual vs physical
addressing or to do vectored vs non-vectored IO should be tied up with
asynchronicity.

--Andy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-09 23:08 Remaining crypto API regressions with CONFIG_VMAP_STACK Eric Biggers
  2016-12-10  5:25 ` Andy Lutomirski
@ 2016-12-11 19:13 ` Andy Lutomirski
  2016-12-11 23:31   ` Eric Biggers
  2016-12-12 18:34 ` Andy Lutomirski
  2 siblings, 1 reply; 19+ messages in thread
From: Andy Lutomirski @ 2016-12-11 19:13 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> default on x86_64.  This has been exposing a number of problems in which
> on-stack buffers are being passed into the crypto API, which to support crypto
> accelerators operates on 'struct page' rather than on virtual memory.
>

>         fs/cifs/smbencrypt.c:96

This should use crypto_cipher_encrypt_one(), I think.

--Andy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-11 19:13 ` Andy Lutomirski
@ 2016-12-11 23:31   ` Eric Biggers
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Biggers @ 2016-12-11 23:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On Sun, Dec 11, 2016 at 11:13:55AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> > In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> > default on x86_64.  This has been exposing a number of problems in which
> > on-stack buffers are being passed into the crypto API, which to support crypto
> > accelerators operates on 'struct page' rather than on virtual memory.
> >
> 
> >         fs/cifs/smbencrypt.c:96
> 
> This should use crypto_cipher_encrypt_one(), I think.
> 
> --Andy

Yes, I believe that's correct.  It encrypts 8 bytes with ecb(des) which is
equivalent to simply encrypting one block with DES.  Maybe try the following
(untested):

static int
smbhash(unsigned char *out, const unsigned char *in, unsigned char *key)
{
	unsigned char key2[8];
	struct crypto_cipher *cipher;

	str_to_key(key, key2);

	cipher = crypto_alloc_cipher("des", 0, 0);
	if (IS_ERR(cipher)) {
		cifs_dbg(VFS, "could not allocate des cipher\n");
		return PTR_ERR(cipher);
	}

	crypto_cipher_setkey(cipher, key2, 8);

	crypto_cipher_encrypt_one(cipher, out, in);

	crypto_free_cipher(cipher);
	return 0;
}

- Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-09 23:08 Remaining crypto API regressions with CONFIG_VMAP_STACK Eric Biggers
  2016-12-10  5:25 ` Andy Lutomirski
  2016-12-11 19:13 ` Andy Lutomirski
@ 2016-12-12 18:34 ` Andy Lutomirski
  2016-12-12 18:45   ` Gary R Hook
  2016-12-13  3:39   ` Herbert Xu
  2 siblings, 2 replies; 19+ messages in thread
From: Andy Lutomirski @ 2016-12-12 18:34 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> default on x86_64.  This has been exposing a number of problems in which
> on-stack buffers are being passed into the crypto API, which to support crypto
> accelerators operates on 'struct page' rather than on virtual memory.

Here's my status.

>         drivers/crypto/bfin_crc.c:351
>         drivers/crypto/qce/sha.c:299
>         drivers/crypto/sahara.c:973,988
>         drivers/crypto/talitos.c:1910
>         drivers/crypto/qce/sha.c:325

I have a patch to make these depend on !VMAP_STACK.

>         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
>         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
>         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
>         drivers/crypto/ccp/ccp-crypto-aes.c:94

According to Herbert, these are fine.  I'm personally less convinced
since I'm very confused as to what "async" means in the crypto code,
but I'm going to leave these alone.

>
> And these other places do crypto operations on buffers clearly on the stack:
>
>         drivers/usb/wusbcore/crypto.c:264
>         security/keys/encrypted-keys/encrypted.c:500

I have a patch.

>         drivers/net/wireless/intersil/orinoco/mic.c:72

I have a patch to convert this to, drumroll please:

    priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
                          CRYPTO_ALG_ASYNC);

Herbert, I'm at a loss as what a "shash" that's "ASYNC" even means.

>         net/ceph/crypto.c:182

This:

size_t zero_padding = (0x10 - (src_len & 0x0f));

is an amazing line of code...

But this driver uses cbc and wants to do synchronous crypto, and I
don't think that the crypto API supports real synchronous crypto using
CBC, so I'm going to let someone else fix this.

>         net/rxrpc/rxkad.c:737,1000

Herbert, can you fix this?

>         fs/cifs/smbencrypt.c:96

I have a patch.


My pile is here:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=crypto

I'll send out the patches soon.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-12 18:34 ` Andy Lutomirski
@ 2016-12-12 18:45   ` Gary R Hook
  2016-12-13  3:39     ` Herbert Xu
  2016-12-13  3:39   ` Herbert Xu
  1 sibling, 1 reply; 19+ messages in thread
From: Gary R Hook @ 2016-12-12 18:45 UTC (permalink / raw)
  To: Andy Lutomirski, Eric Biggers
  Cc: linux-crypto, linux-kernel, linux-mm, kernel-hardening,
	Herbert Xu, Andrew Lutomirski, Stephan Mueller

On 12/12/2016 12:34 PM, Andy Lutomirski wrote:

<...snip...>

>
> I have a patch to make these depend on !VMAP_STACK.
>
>>         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
>>         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
>>         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
>>         drivers/crypto/ccp/ccp-crypto-aes.c:94
>
> According to Herbert, these are fine.  I'm personally less convinced
> since I'm very confused as to what "async" means in the crypto code,
> but I'm going to leave these alone.

I went back through the code, and AFAICT every argument to sg_init_one() in
the above-cited files is a buffer that is part of the request context. Which
is allocated by the crypto framework, and therefore will never be on the 
stack.
Right?

I don't (as yet) see a need for any patch to these. Someone correct me 
if I'm
missing something.

<...snip...>

-- 
This is my day job. Follow me at:
IG/Twitter/Facebook: @grhookphoto
IG/Twitter/Facebook: @grhphotographer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-12 18:34 ` Andy Lutomirski
  2016-12-12 18:45   ` Gary R Hook
@ 2016-12-13  3:39   ` Herbert Xu
  2016-12-13 17:06     ` Andy Lutomirski
  1 sibling, 1 reply; 19+ messages in thread
From: Herbert Xu @ 2016-12-13  3:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Biggers, linux-crypto, linux-kernel, linux-mm,
	kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Mon, Dec 12, 2016 at 10:34:10AM -0800, Andy Lutomirski wrote:
>
> Here's my status.
> 
> >         drivers/crypto/bfin_crc.c:351
> >         drivers/crypto/qce/sha.c:299
> >         drivers/crypto/sahara.c:973,988
> >         drivers/crypto/talitos.c:1910
> >         drivers/crypto/qce/sha.c:325
> 
> I have a patch to make these depend on !VMAP_STACK.

Why? They're all marked as ASYNC AFAIK.

> I have a patch to convert this to, drumroll please:
> 
>     priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
>                           CRYPTO_ALG_ASYNC);
> 
> Herbert, I'm at a loss as what a "shash" that's "ASYNC" even means.

Having 0 as type and CRYPTO_ALG_ASYNC as mask in general means
that we're requesting a sync algorithm (i.e., ASYNC bit off).

However, it is completely unnecessary for shash as they can never
be async.  So this could be changed to just ("michael_mic", 0, 0).

> >         net/ceph/crypto.c:182
> 
> This:
> 
> size_t zero_padding = (0x10 - (src_len & 0x0f));
> 
> is an amazing line of code...
> 
> But this driver uses cbc and wants to do synchronous crypto, and I
> don't think that the crypto API supports real synchronous crypto using
> CBC, so I'm going to let someone else fix this.

It does through skcipher if you allocate with (0, CRYPTO_ALG_ASYNC).

I'll try to fix this.

> >         net/rxrpc/rxkad.c:737,1000
> 
> Herbert, can you fix this?

Sure I'll take a look.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-12 18:45   ` Gary R Hook
@ 2016-12-13  3:39     ` Herbert Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Herbert Xu @ 2016-12-13  3:39 UTC (permalink / raw)
  To: Gary R Hook
  Cc: Andy Lutomirski, Eric Biggers, linux-crypto, linux-kernel,
	linux-mm, kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Mon, Dec 12, 2016 at 12:45:18PM -0600, Gary R Hook wrote:
> On 12/12/2016 12:34 PM, Andy Lutomirski wrote:
> 
> <...snip...>
> 
> >
> >I have a patch to make these depend on !VMAP_STACK.
> >
> >>        drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
> >>        drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
> >>        drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
> >>        drivers/crypto/ccp/ccp-crypto-aes.c:94
> >
> >According to Herbert, these are fine.  I'm personally less convinced
> >since I'm very confused as to what "async" means in the crypto code,
> >but I'm going to leave these alone.
> 
> I went back through the code, and AFAICT every argument to sg_init_one() in
> the above-cited files is a buffer that is part of the request context. Which
> is allocated by the crypto framework, and therefore will never be on the
> stack.
> Right?

Right.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-13  3:39   ` Herbert Xu
@ 2016-12-13 17:06     ` Andy Lutomirski
  2016-12-14  4:56       ` Herbert Xu
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Lutomirski @ 2016-12-13 17:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, linux-crypto, linux-kernel, linux-mm,
	kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Mon, Dec 12, 2016 at 7:39 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Dec 12, 2016 at 10:34:10AM -0800, Andy Lutomirski wrote:
>>
>> Here's my status.
>>
>> >         drivers/crypto/bfin_crc.c:351
>> >         drivers/crypto/qce/sha.c:299
>> >         drivers/crypto/sahara.c:973,988
>> >         drivers/crypto/talitos.c:1910
>> >         drivers/crypto/qce/sha.c:325
>>
>> I have a patch to make these depend on !VMAP_STACK.
>
> Why? They're all marked as ASYNC AFAIK.
>
>> I have a patch to convert this to, drumroll please:
>>
>>     priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
>>                           CRYPTO_ALG_ASYNC);
>>
>> Herbert, I'm at a loss as what a "shash" that's "ASYNC" even means.
>
> Having 0 as type and CRYPTO_ALG_ASYNC as mask in general means
> that we're requesting a sync algorithm (i.e., ASYNC bit off).
>
> However, it is completely unnecessary for shash as they can never
> be async.  So this could be changed to just ("michael_mic", 0, 0).

I'm confused by a bunch of this.

1. Is it really the case that crypto_alloc_xyz(..., CRYPTO_ALG_ASYNC)
means to allocate a *synchronous* transform?  That's not what I
expected.

2. What guarantees that an async request is never allocated on the
stack?  If it's just convention, could an assertion be added
somewhere?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
  2016-12-13 17:06     ` Andy Lutomirski
@ 2016-12-14  4:56       ` Herbert Xu
  0 siblings, 0 replies; 19+ messages in thread
From: Herbert Xu @ 2016-12-14  4:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Biggers, linux-crypto, linux-kernel, linux-mm,
	kernel-hardening, Andrew Lutomirski, Stephan Mueller

On Tue, Dec 13, 2016 at 09:06:31AM -0800, Andy Lutomirski wrote:
>
> > Having 0 as type and CRYPTO_ALG_ASYNC as mask in general means
> > that we're requesting a sync algorithm (i.e., ASYNC bit off).
> >
> > However, it is completely unnecessary for shash as they can never
> > be async.  So this could be changed to just ("michael_mic", 0, 0).
> 
> I'm confused by a bunch of this.
> 
> 1. Is it really the case that crypto_alloc_xyz(..., CRYPTO_ALG_ASYNC)
> means to allocate a *synchronous* transform?  That's not what I
> expected.

crypto_alloc_xyz(name, 0, CRYPTO_ALG_ASYNC) allocates a sync tfm
and crypto_alloc_xyz(name, CRYPTO_ALG_ASYNC, CRYPTO_ALG_ASYNC)
allocates an async tfm while crypto_alloc_xyz(name, 0, 0) does
not care whether the allocated tfm is sync or asnc.

> 2. What guarantees that an async request is never allocated on the
> stack?  If it's just convention, could an assertion be added
> somewhere?

Sure we can add an assertion.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2016-12-14  4:56 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-09 23:08 Remaining crypto API regressions with CONFIG_VMAP_STACK Eric Biggers
2016-12-10  5:25 ` Andy Lutomirski
2016-12-10  5:32   ` Herbert Xu
2016-12-10  6:03     ` [kernel-hardening] " Eric Biggers
2016-12-10  8:16       ` Herbert Xu
2016-12-10  8:39         ` Eric Biggers
2016-12-10  5:37   ` Herbert Xu
2016-12-10  6:30     ` [kernel-hardening] " Eric Biggers
2016-12-10 14:45     ` Jason A. Donenfeld
2016-12-10 17:48       ` Andy Lutomirski
2016-12-10  5:55   ` Eric Biggers
2016-12-11 19:13 ` Andy Lutomirski
2016-12-11 23:31   ` Eric Biggers
2016-12-12 18:34 ` Andy Lutomirski
2016-12-12 18:45   ` Gary R Hook
2016-12-13  3:39     ` Herbert Xu
2016-12-13  3:39   ` Herbert Xu
2016-12-13 17:06     ` Andy Lutomirski
2016-12-14  4:56       ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).