All of lore.kernel.org
 help / color / mirror / Atom feed
* Kernel lock during cryptsetup luksClose
@ 2023-02-23 15:03 Frédéric Martinsons
  2023-02-27  9:52 ` Frédéric Martinsons
  0 siblings, 1 reply; 9+ messages in thread
From: Frédéric Martinsons @ 2023-02-23 15:03 UTC (permalink / raw)
  To: linux-rt-users

Subject: Kernel lock during cryptsetup luksClose

Hello,

I recently upgraded my system (Marvell Armada-3720) from linux 4.19.255-rt113 to
linux 4.19.271-rt120 and experienced a blocking issue with cryptsetup.

My sequence is roughly the following:
  - Boot an initramfs from PXE
  - Get back a ciphered image from a local HTTP server
  - Get back the ciphering key of this image (which is itself ciphered)
  - Decipher the key it via a TPM chip
  - Decipher the image via cryptsetup luksOpen
  - Partition the storage and install the image on it
  - Cleanup operation

The blocking I see is when I close the LUKS encrypted volume via
cryptsetup luksClose.
I manage to have some more traces of cryptsetup luksClose by adding
--debug to it.
Below is the correct sequence  in 4.19.255-rt113:

# cryptsetup 2.3.7 processing "cryptsetup --verbose --debug luksClose image"
# Running command close.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# Allocating crypt device context by device image.
# Initialising device-mapper backend library.
# dm version   [ opencount flush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# Detected dm-ioctl version 4.39.0.
# Detected dm-crypt version 1.18.1.
# Udev is not running. Not using udev synchronisation code.
# Device-mapper backend running with UDEV support disabled.
# dm status image  [ opencount noflush ]   [16384] (*1)
# Releasing device-mapper backend.
# Trying to open and read device /dev/loop0 with direct-io.
# Allocating context for crypt device /dev/loop0.
# Trying to open and read device /dev/loop0 with direct-io.
# Initialising device-mapper backend library.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm table image  [ opencount flush securedata ]   [16384] (*1)
# Trying to open and read device /dev/loop0 with direct-io.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm deps image  [ opencount flush ]   [16384] (*1)
# Crypto backend (OpenSSL 1.1.1s  1 Nov 2022) initialized in
cryptsetup library version 2.3.7.
# Detected kernel Linux 4.19.255-rt113-sigfox aarch64.
# PBKDF pbkdf2-sha256, time_ms 2000 (iterations 0).
# Reading LUKS header of size 1024 from device /dev/loop0
# Key length 32, device size 179761 sectors, header size 2050 sectors.
# Deactivating volume image.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm status image  [ opencount noflush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# dm table image  [ opencount flush securedata ]   [16384] (*1)
# Trying to open and read device /dev/loop0 with direct-io.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm remove image  [ opencount flush retryremove ]   [16384] (*1)
# image: Stacking NODE_DEL
# image: Processing NODE_DEL
# Removed /dev/mapper/image
# Releasing crypt device /dev/loop0 context.
# Releasing device-mapper backend.
# Closing read only fd for /dev/loop0.
# Unlocking memory.
Command successful.

And with 4.19.271-rt120:
# cryptsetup 2.3.7 processing "cryptsetup --verbose --debug luksClose image"
# Running command close.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# Allocating crypt device context by device image.
# Initialising device-mapper backend library.
# dm version   [ opencount flush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# Detected dm-ioctl version 4.39.0.
# Detected dm-crypt version 1.18.1.
# Udev is not running. Not using udev synchronisation code.
# Device-mapper backend running with UDEV support disabled.
# dm status image  [ opencount noflush ]   [16384] (*1)
# Releasing device-mapper backend.
# Trying to open and read device /dev/loop0 with direct-io.
# Allocating context for crypt device /dev/loop0.
# Trying to open and read device /dev/loop0 with direct-io.
# Initialising device-mapper backend library.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm table image  [ opencount flush securedata ]   [16384] (*1)
# Trying to open and read device /dev/loop0 with direct-io.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm deps image  [ opencount flush ]   [16384] (*1)
# Crypto backend (OpenSSL 1.1.1s  1 Nov 2022) initialized in
cryptsetup library version 2.3.7.
# Detected kernel Linux 4.19.271-rt120-sigfox aarch64.
# PBKDF pbkdf2-sha256, time_ms 2000 (iterations 0).
# Reading LUKS header of size 1024 from device /dev/loop0
# Key length 32, device size 179761 sectors, header size 2050 sectors.
# Deactivating volume image.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm status image  [ opencount noflush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# dm table image  [ opencount flush securedata ]   [16384] (*1)
# Trying to open and read device /dev/loop0 with direct-io.
# dm versions   [ opencount flush ]   [16384] (*1)
# dm remove image  [ opencount flush retryremove ]   [16384] (*1)


After that line the system is completely blocked.

I perform some dichotomy and found that the problematic commit is:
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.19-rt&id=7fcfa8b616d360ab9ea1da94c2200f0708e0188b
introduced in 4.19.255-rt114. Revert this commit on 4.19.271-rt220 makes
the blocking disappear.

I put the kernel max verbosity (logleve=8 ignore_loglevel for command
line parameters)
but I didn't see any kernel traces during this sequence (I have a lot of log
before loading the initramfs but as soon as it is loaded, it seems the kernel
became quiet).
I'll post a follow up if I manage to have more traces.

Does anyone know what could happen ?

I have a reproducible setup and can test several things if someone
tells me what to look at.

Thanks in advance for any insights you could give.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-02-23 15:03 Kernel lock during cryptsetup luksClose Frédéric Martinsons
@ 2023-02-27  9:52 ` Frédéric Martinsons
  2023-02-27 13:07   ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Frédéric Martinsons @ 2023-02-27  9:52 UTC (permalink / raw)
  To: linux-rt-users

Hello, I manage to have more information about the issue by the use of
strace, the blocking call is an ioctl on /dev/mapper/control:

write(1, "# dm versions   [ opencount flus"..., 51# dm versions   [
opencount flush ]   [16384] (*1)) = 51
ioctl(3, DM_LIST_VERSIONS, {version=4.1.0, data_size=16384,
data_start=312, flags=DM_EXISTS_FLAG} => {version=4.39.0,
data_size=431, data_start=312, flags=DM_EXISTS_FLAG, ...}) = 0
write(1, "# dm remove luksDeviceTest  [ op"..., 75# dm remove
luksDeviceTest  [ opencount flush retryremove ]   [16384] (*1)) = 75
ioctl(3, DM_DEV_REMOVE, {version=4.0.0, data_size=16384,
name="luksDeviceTest", event_nr=4194304, flags=DM_EXISTS_FLAGS}

to be compare to the same call without
7fcfa8b616d360ab9ea1da94c2200f0708e0188b commit:

ioctl(3, DM_LIST_VERSIONS, {version=4.1.0, data_size=16384,
data_start=312, flags=DM_EXISTS_FLAG} => {version=4.39.0,
data_size=431, data_start=312, flags=DM_EXISTS_FLAG, ...}) = 0
write(1, "# dm remove luksDeviceTest  [ op"..., 75# dm remove
luksDeviceTest  [ opencount flush retryremove ]   [16384] (*1)) = 75
ioctl(3, DM_DEV_REMOVE, {version=4.0.0, data_size=16384,
name="luksDeviceTest", event_nr=4194304, flags=DM_EXISTS_FLAG} =>
{version=4.39.0, data_size=305, name="luksDeviceTest",
uuid="CRYPT-LUKS1-25edf7523ab4454590a7e2ac83050582-luksDeviceTest",
flags=DM_EXISTS_FLAG|DM_UEVENT_GENERATED_FLAG}) = 0
write(1, "# luksDeviceTest: Stacking NODE_"..., 36# luksDeviceTest:
Stacking NODE_DEL) = 36
write(1, "# luksDeviceTest: Processing NOD"..., 38# luksDeviceTest:
Processing NODE_DEL) = 38

The matching code line which block is
https://github.com/lvmteam/lvm2/blob/v2_03_06/libdm/ioctl/libdm-iface.c#L1890

I don't know how to go deeper to understand why preempt-rt is blocking
of this, any ideas ?
Another question is about the impact of reverting commit
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.19-rt&id=7fcfa8b616d360ab9ea1da94c2200f0708e0188b
, what are the risk of doing that ?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-02-27  9:52 ` Frédéric Martinsons
@ 2023-02-27 13:07   ` Sebastian Andrzej Siewior
  2023-02-28 12:51     ` Frédéric Martinsons
  2023-02-28 23:00     ` Brennan Lamoreaux (VMware)
  0 siblings, 2 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-02-27 13:07 UTC (permalink / raw)
  To: Frédéric Martinsons; +Cc: linux-rt-users

On 2023-02-27 10:52:59 [+0100], Frédéric Martinsons wrote:
> I don't know how to go deeper to understand why preempt-rt is blocking
> of this, any ideas ?

The blocking / wake mechanism is different. There should be only one
waiter so the rcuwait_wait_*() mechanism has to work. The previously
used swait* may add multiple waiters to the queue but also wakes only
one waiter at a time.

> Another question is about the impact of reverting commit
> https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v4.19-rt&id=7fcfa8b616d360ab9ea1da94c2200f0708e0188b
> , what are the risk of doing that ?

That has been used before and the rcu-wait mechanism is what we use
upstream. That revert should case no harm.
I would need to reproduce this to figure out if this is a generic
problem or if the backport is bad.
Basically speaking (and you could add printks to figure that out)
manage_workers() should wake on task and that should be the one waiting
in put_unbound_pool(). But this is somehow not happening based on your
description.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-02-27 13:07   ` Sebastian Andrzej Siewior
@ 2023-02-28 12:51     ` Frédéric Martinsons
  2023-02-28 23:00     ` Brennan Lamoreaux (VMware)
  1 sibling, 0 replies; 9+ messages in thread
From: Frédéric Martinsons @ 2023-02-28 12:51 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users

>
> That has been used before and the rcu-wait mechanism is what we use
> upstream. That revert should case no harm.
> I would need to reproduce this to figure out if this is a generic
> problem or if the backport is bad.
> Basically speaking (and you could add printks to figure that out)
> manage_workers() should wake on task and that should be the one waiting
> in put_unbound_pool(). But this is somehow not happening based on your
> description.
>
> Sebastian

Thank you very much for the answer. I added printk in put_unbound_pool
with the following patch:

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a9f3cc02bdc1..5aacb02b8ec5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2025,7 +2025,9 @@ static bool manage_workers(struct worker *worker)

        pool->manager = NULL;
        pool->flags &= ~POOL_MANAGER_ACTIVE;
+       printk(KERN_INFO "manage_worker call rcuwait_wake_up , created
pool id %d, node id %d, refcnt %d", pool->id, pool->node,
pool->refcnt);
        rcuwait_wake_up(&manager_wait);
+       printk(KERN_INFO "manage_worker rcuwait_wake_up end, pool id
%d, node id %d, refcnt %d", pool->id, pool->node, pool->refcnt);
        return true;
 }

@@ -3371,20 +3373,21 @@ static void put_unbound_pool(struct worker_pool *pool)
 {
        DECLARE_COMPLETION_ONSTACK(detach_completion);
        struct worker *worker;
-
+       printk(KERN_INFO "put_unbound_pool starts pool id %d, node id
%d, refcnt %d", pool->id, pool->node, pool->refcnt);
        lockdep_assert_held(&wq_pool_mutex);

        if (--pool->refcnt)
                return;
-
+       printk(KERN_INFO "put_unbound_pool starts 2, pool id %d, node
id %d, refcnt %d", pool->id, pool->node, pool->refcnt);
        /* sanity checks */
        if (WARN_ON(!(pool->cpu < 0)) ||
            WARN_ON(!list_empty(&pool->worklist)))
                return;
-
+       printk(KERN_INFO "put_unbound_pool starts 3, pool id %d, node
id %d, refcnt %d", pool->id, pool->node, pool->refcnt);
        /* release id and unhash */
        if (pool->id >= 0)
                idr_remove(&worker_pool_idr, pool->id);
+       printk(KERN_INFO "put_unbound_pool starts 4, pool id %d, node
id %d, refcnt %d", pool->id, pool->node, pool->refcnt);
        hash_del(&pool->hash_node);

        /*
@@ -3394,6 +3397,7 @@ static void put_unbound_pool(struct worker_pool *pool)
         * Because of how wq_manager_inactive() works, we will hold the
         * spinlock after a successful wait.
         */
+       printk(KERN_INFO "put_unbound_pool lock pool id %d, node id
%d, refcnt %d", pool->id, pool->node, pool->refcnt);
        raw_spin_lock_irq(&pool->lock);
        rcuwait_wait_event(&manager_wait, wq_manager_inactive(pool),
                           TASK_UNINTERRUPTIBLE);
@@ -3403,6 +3407,7 @@ static void put_unbound_pool(struct worker_pool *pool)
                destroy_worker(worker);
        WARN_ON(pool->nr_workers || pool->nr_idle);
        raw_spin_unlock_irq(&pool->lock);
+       printk(KERN_INFO "put_unbound_pool unlock pool id %d, node id
%d, refcnt %d", pool->id, pool->node, pool->refcnt);

        mutex_lock(&wq_pool_attach_mutex);
        if (!list_empty(&pool->workers))


And I have the following associated traces before the lockup:

[   45.017508] manage_worker call rcuwait_wake_up , created pool id 5,
node id -1, refcnt 1
[   45.017514] manage_worker rcuwait_wake_up end, pool id 5, node id
-1, refcnt 1
[   45.076736] put_unbound_pool starts pool id 5, node id -1, refcnt 1
[   45.092553] put_unbound_pool starts 2, pool id 5, node id -1, refcnt 0
[   45.099022] put_unbound_pool starts 3, pool id 5, node id -1, refcnt 0
[   45.105762] put_unbound_pool starts 4, pool id 5, node id -1, refcnt 0


So it seems that it is the line hash_del(&pool->hash_node) that never returned.

Sorry but I have 0 experiences in kernel hacking but if you tell me
what info should be printed to help you
pinpoint the issue, I'll do it.

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-02-27 13:07   ` Sebastian Andrzej Siewior
  2023-02-28 12:51     ` Frédéric Martinsons
@ 2023-02-28 23:00     ` Brennan Lamoreaux (VMware)
  2023-03-01  6:24       ` Frédéric Martinsons
  1 sibling, 1 reply; 9+ messages in thread
From: Brennan Lamoreaux (VMware) @ 2023-02-28 23:00 UTC (permalink / raw)
  To: bigeasy; +Cc: frederic.martinsons, linux-rt-users

I just posted a fix for a workqueue related deadlock in put_unbound_pool in the 4.19-rt kernel here:
https://lore.kernel.org/linux-rt-users/20230228224938.88035-1-brennanlamoreaux@gmail.com

Can you give this a shot and see if it fixes your problem?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-02-28 23:00     ` Brennan Lamoreaux (VMware)
@ 2023-03-01  6:24       ` Frédéric Martinsons
  2023-03-02  8:06         ` Frédéric Martinsons
  0 siblings, 1 reply; 9+ messages in thread
From: Frédéric Martinsons @ 2023-03-01  6:24 UTC (permalink / raw)
  To: Brennan Lamoreaux (VMware); +Cc: bigeasy, linux-rt-users

On Wed, 1 Mar 2023 at 00:00, Brennan Lamoreaux (VMware)
<brennanlamoreaux@gmail.com> wrote:
>
> I just posted a fix for a workqueue related deadlock in put_unbound_pool in the 4.19-rt kernel here:
> https://lore.kernel.org/linux-rt-users/20230228224938.88035-1-brennanlamoreaux@gmail.com
>
> Can you give this a shot and see if it fixes your problem?

Hey, thank you for this patch. I currently have no access to my test board.
But I'll happily test this patch tomorrow for sure.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-03-01  6:24       ` Frédéric Martinsons
@ 2023-03-02  8:06         ` Frédéric Martinsons
  2023-03-13  9:41           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Frédéric Martinsons @ 2023-03-02  8:06 UTC (permalink / raw)
  To: Brennan Lamoreaux (VMware); +Cc: bigeasy, linux-rt-users

On Wed, 1 Mar 2023 at 07:24, Frédéric Martinsons
<frederic.martinsons@gmail.com> wrote:
>
> On Wed, 1 Mar 2023 at 00:00, Brennan Lamoreaux (VMware)
> <brennanlamoreaux@gmail.com> wrote:
> >
> > I just posted a fix for a workqueue related deadlock in put_unbound_pool in the 4.19-rt kernel here:
> > https://lore.kernel.org/linux-rt-users/20230228224938.88035-1-brennanlamoreaux@gmail.com
> >
> > Can you give this a shot and see if it fixes your problem?
>
> Hey, thank you for this patch. I currently have no access to my test board.
> But I'll happily test this patch tomorrow for sure.

The patch you provided works perfectly in my case, no more blocking.
Thanks.

For my curiosity, do you know why I fell in this case ? The bug
requires specific timing or it only
appears during usage of device mapper in kernel ?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-03-02  8:06         ` Frédéric Martinsons
@ 2023-03-13  9:41           ` Sebastian Andrzej Siewior
  2023-03-13 18:34             ` Frédéric Martinsons
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-03-13  9:41 UTC (permalink / raw)
  To: Frédéric Martinsons; +Cc: Brennan Lamoreaux (VMware), linux-rt-users

On 2023-03-02 09:06:41 [+0100], Frédéric Martinsons wrote:
> 
> For my curiosity, do you know why I fell in this case ? The bug
> requires specific timing or it only
> appears during usage of device mapper in kernel ?

It always happens if a specific worker pool is removed and this happens
on luksClose. The problem was introduced by a backport, which touched
this corner.

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Kernel lock during cryptsetup luksClose
  2023-03-13  9:41           ` Sebastian Andrzej Siewior
@ 2023-03-13 18:34             ` Frédéric Martinsons
  0 siblings, 0 replies; 9+ messages in thread
From: Frédéric Martinsons @ 2023-03-13 18:34 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Brennan Lamoreaux (VMware), linux-rt-users

Thank you very much for your help on this.

On Mon, 13 Mar 2023 at 10:41, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-03-02 09:06:41 [+0100], Frédéric Martinsons wrote:
> >
> > For my curiosity, do you know why I fell in this case ? The bug
> > requires specific timing or it only
> > appears during usage of device mapper in kernel ?
>
> It always happens if a specific worker pool is removed and this happens
> on luksClose. The problem was introduced by a backport, which touched
> this corner.
>
> Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-03-13 18:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-23 15:03 Kernel lock during cryptsetup luksClose Frédéric Martinsons
2023-02-27  9:52 ` Frédéric Martinsons
2023-02-27 13:07   ` Sebastian Andrzej Siewior
2023-02-28 12:51     ` Frédéric Martinsons
2023-02-28 23:00     ` Brennan Lamoreaux (VMware)
2023-03-01  6:24       ` Frédéric Martinsons
2023-03-02  8:06         ` Frédéric Martinsons
2023-03-13  9:41           ` Sebastian Andrzej Siewior
2023-03-13 18:34             ` Frédéric Martinsons

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.