From: Ard Biesheuvel <ardb@kernel.org>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>,
	Mark Rutland <mark.rutland@arm.com>,
	Arnd Bergmann <arnd@kernel.org>, "Theodore Ts'o" <tytso@mit.edu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	linux-toolchains@vger.kernel.org,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Will Deacon <will@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: Aarch64 EXT4FS inode checksum failures - seems to be weak memory ordering issues
Date: Fri, 8 Jan 2021 09:21:55 +0100
Message-ID: <CAMj1kXH8nwyf=2hDBMrEi3uFzNS8Dvx-+AYy8DSWq3mO8UibAA@mail.gmail.com>
In-Reply-To: <X/eOC/tj3966kBRH@gmail.com>

On Thu, 7 Jan 2021 at 23:42, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Thu, Jan 07, 2021 at 10:14:46PM +0000, Russell King - ARM Linux admin wrote:
> > On Thu, Jan 07, 2021 at 10:48:05PM +0100, Arnd Bergmann wrote:
> > > On Thu, Jan 7, 2021 at 5:27 PM Theodore Ts'o <tytso@mit.edu> wrote:
> > > >
> > > > On Thu, Jan 07, 2021 at 01:37:47PM +0000, Russell King - ARM Linux admin wrote:
> > > > > > The gcc bugzilla mentions backports into gcc-linaro, but I do not see
> > > > > > them in my git history.
> > > > >
> > > > > So, do we raise the minimum gcc version for the kernel as a whole to 5.1
> > > > > or just for aarch64?
> > > >
> > > > Russell, Arnd, thanks so much for tracking down the root cause of the
> > > > bug!
> > >
> > > There is one more thing that I wondered about when looking through
> > > the ext4 code: Should it just call the crc32c_le() function directly
> > > instead of going through the crypto layer? It seems that with Ard's
> > > rework from 2018, that can just call the underlying architecture specific
> > > implementation anyway.
> >
> > Yes, I've been wondering about that too. To me, it looks like the
> > ext4 code performs a layering violation by going "under the covers"
> > - there are accessor functions to set the CRC and retrieve it. ext4
> > instead just makes the assumption that the CRC value is stored after
> > struct shash_desc. Especially as the crypto/crc32c code references
> > the value using:
> >
> >       struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
> >
> > Not even crypto drivers are allowed to assume that desc+1 is where
> > the CRC is stored.
>
> It violates how the shash API is meant to be used in general, but there is a
> test that enforces that the shash_desc_ctx for crc32c must be just the single
> u32 crc value.  See alg_test_crc32c() in crypto/testmgr.c.  So it's apparently
> intended to work.
>
> >
> > However, struct shash_desc is already 128 bytes in size on aarch64,
>
> Ard Biesheuvel recently sent a patch to reduce the alignment of struct
> shash_desc to ARCH_SLAB_MINALIGN
> (https://lkml.kernel.org/linux-crypto/20210107124128.19791-1-ardb@kernel.org/),
> since apparently most of the bloat is from alignment for DMA, which isn't
> necessary.  I think that reduces the size by a lot on arm64.
>
> > and the proper way of doing it via SHASH_DESC_ON_STACK() is overkill,
> > being strangely 2 * sizeof(struct shash_desc) + 360 (which looks like
> > another bug to me!)
>
> Are you referring to the '2 * sizeof(struct shash_desc)' rather than just
> 'sizeof(struct shash_desc)'?  As mentioned in the comment above
> HASH_MAX_DESCSIZE, there can be a nested shash_desc due to HMAC.
> So I believe the value is correct.
>
> > So, I agree with you wrt crc32c_le(), especially as it would be more
> > efficient, and as the use of crc32c is already hard coded in the ext4
> > code - not only with crypto_alloc_shash("crc32c", 0, 0) but also with
> > the fixed-size structure in ext4_chksum().
> >
> > However, it's ultimately up to the ext4 maintainers to decide.
>
> As I mentioned in my other response, crc32c_le() isn't a proper library API
> (like some of the newer lib/crypto/ stuff) but rather just a wrapper for the
> shash API, and it doesn't handle modules being dynamically loaded/unloaded.
> So switching to it may cause a performance regression.
>
> What I'd recommend is making crc32c_le() able to call architecture-specific
> implementations directly, similar to blake2s() and chacha20() in lib/crypto/.
> Then there would be no concern about when modules get loaded, etc...
>

I have looked into this in the past, both for crc32(c) and crc-t10dif,
both of which use a horrid method of wrapping a shash into a library
API. This was before we had static calls, though, and this work kind
of stalled on that. It should be straightforward to redefine the
crc32c() library function as a static call, and have an optimized
module (or builtin) perform the [conditional] static call update at
module_init() time. The only missing piece is autoloading such
modules, which is tricky with softdeps if the dependency is from the
core kernel.
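
To make the dispatch idea concrete, here is a minimal userspace sketch. It
is not kernel code: a plain function pointer stands in for a static call,
and crc32c_arch(), crc32c_arch_init() and the cpu_has_crc32 flag are
hypothetical names used only for illustration. The "optimized"
implementation simply forwards to the generic one and counts its calls, so
the switch-over is observable.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc32c_fn)(uint32_t crc, const void *data, size_t len);

/* Portable bitwise CRC32C update (reflected Castagnoli polynomial).
 * Like the kernel helpers, it does no pre/post inversion itself. */
static uint32_t crc32c_generic(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Hypothetical stand-in for an architecture-specific routine. It only
 * forwards to the generic code and counts how often it was selected. */
static unsigned long crc32c_arch_calls;

static uint32_t crc32c_arch(uint32_t crc, const void *data, size_t len)
{
	crc32c_arch_calls++;
	return crc32c_generic(crc, data, len);
}

/* Dispatch target; a kernel version would use a static call instead of
 * an indirect call through a pointer (assumption of this sketch). */
static crc32c_fn crc32c_impl = crc32c_generic;

/* Library entry point: callers never touch the crypto layer. */
uint32_t crc32c(uint32_t crc, const void *data, size_t len)
{
	return crc32c_impl(crc, data, len);
}

/* Hypothetical analogue of an optimized module's module_init():
 * conditionally retarget the library function. */
void crc32c_arch_init(int cpu_has_crc32)
{
	if (cpu_has_crc32)
		crc32c_impl = crc32c_arch;
}
```

A caller would compute a conventional CRC32C as crc32c(~0u, buf, len) ^ ~0u,
both before and after crc32c_arch_init() has run. The sketch does not
address the autoloading problem described above.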

Currently, we have many users of crc32(c) in the kernel that go via
the shash (or synchronous ahash) layer to perform crc32c, all of which
would be better served by a library API, given that the hash type is a
compile time constant, and only synchronous calls are made.




drivers/infiniband/hw/i40iw/i40iw_utils.c: tfm = crypto_alloc_shash("crc32c", 0, 0);
drivers/infiniband/sw/rxe/rxe_verbs.c: tfm = crypto_alloc_shash("crc32", 0, 0);
drivers/infiniband/sw/siw/siw_main.c: siw_crypto_shash = crypto_alloc_shash("crc32c", 0, 0);
drivers/md/dm-crypt.c: tcw->crc32_tfm = crypto_alloc_shash("crc32", 0,
drivers/nvme/host/tcp.c: tfm = crypto_alloc_ahash("crc32c", 0, CRYPTO_ALG_ASYNC);
drivers/nvme/target/tcp.c: tfm = crypto_alloc_ahash("crc32c", 0, CRYPTO_ALG_ASYNC);
drivers/scsi/iscsi_tcp.c: tfm = crypto_alloc_ahash("crc32c", 0, CRYPTO_ALG_ASYNC);
drivers/target/iscsi/iscsi_target_login.c: tfm = crypto_alloc_ahash("crc32c", 0, CRYPTO_ALG_ASYNC);
fs/ext4/super.c: sbi->s_chksum_driver = crypto_alloc_shash("crc32c", 0, 0);
fs/f2fs/super.c: sbi->s_chksum_driver = crypto_alloc_shash("crc32", 0, 0);
fs/jbd2/journal.c: journal->j_chksum_driver = crypto_alloc_shash("crc32c", 0, 0);
fs/jbd2/journal.c: journal->j_chksum_driver = crypto_alloc_shash("crc32c", 0, 0);
lib/libcrc32c.c: tfm = crypto_alloc_shash("crc32c", 0, 0);
