linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rabin Vincent <rabin@rab.in>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Ben Collins <ben.c@servergy.com>,
	Mike Snitzer <snitzer@redhat.com>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	dm-devel@redhat.com, linux-crypto@vger.kernel.org,
	herbert@gondor.apana.org.au
Subject: Re: [PATCH 3.14 73/92] dm crypt: fix deadlock when async crypto algorithm returns -EBUSY
Date: Mon, 4 May 2015 23:32:54 +0200	[thread overview]
Message-ID: <20150504213254.GA27445@debian> (raw)
In-Reply-To: <20150502190117.069180534@linuxfoundation.org>

On Sat, May 02, 2015 at 09:03:28PM +0200, Greg Kroah-Hartman wrote:
> 3.14-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Ben Collins <ben.c@servergy.com>
> 
> commit 0618764cb25f6fa9fb31152995de42a8a0496475 upstream.

The commit in question was recently merged to mainline and is now being
sent to various stable kernels.  But I think the patch is wrong and the
real problem lies in the Freescale CAAM driver which is described as
having being tested with.

 commit 0618764cb25f6fa9fb31152995de42a8a0496475
 Author: Ben Collins <ben.c@servergy.com>
 Date:   Fri Apr 3 16:09:46 2015 +0000
 
     dm crypt: fix deadlock when async crypto algorithm returns -EBUSY
     
     I suspect this doesn't show up for most anyone because software
     algorithms typically don't have a sense of being too busy.  However,
     when working with the Freescale CAAM driver it will return -EBUSY on
     occasion under heavy -- which resulted in dm-crypt deadlock.
     
     After checking the logic in some other drivers, the scheme for
     crypt_convert() and it's callback, kcryptd_async_done(), were not
     correctly laid out to properly handle -EBUSY or -EINPROGRESS.
     
     Fix this by using the completion for both -EBUSY and -EINPROGRESS.  Now
     crypt_convert()'s use of completion is comparable to
     af_alg_wait_for_completion().  Similarly, kcryptd_async_done() follows
     the pattern used in af_alg_complete().
     
     Before this fix dm-crypt would lockup within 1-2 minutes running with
     the CAAM driver.  Fix was regression tested against software algorithms
     on PPC32 and x86_64, and things seem perfectly happy there as well.
     
     Signed-off-by: Ben Collins <ben.c@servergy.com>
     Signed-off-by: Mike Snitzer <snitzer@redhat.com>
     Cc: stable@vger.kernel.org

dm-crypt uses CRYPTO_TFM_REQ_MAY_BACKLOG.  This means the the crypto
driver should internally backlog requests which arrive when the queue is
full and process them later.  Until the crypto hw's queue becomes full,
the driver returns -EINPROGRESS.  When the crypto hw's queue if full,
the driver returns -EBUSY, and if CRYPTO_TFM_REQ_MAY_BACKLOG is set, is
expected to backlog the request and process it when the hardware has
queue space.  At the point when the driver takes the request from the
backlog and starts processing it, it calls the completion function with
a status of -EINPROGRESS.  The completion function is called (for a
second time, in the case of backlogged requests) with a status/err of 0
when a request is done.

Crypto drivers for hardware without hardware queueing use the helpers,
crypto_init_queue(), crypto_enqueue_request(), crypto_dequeue_request()
and crypto_get_backlog() helpers to implement this behaviour correctly,
while others implement this behaviour without these helpers (ccp, for
example).

dm-crypt (before this patch) uses this API correctly.  It queues up as
many requests as the hw queues will allow (i.e. as long as it gets back
-EINPROGRESS from the request function).  Then, when it sees at least
one backlogged request (gets -EBUSY), it waits till that backlogged
request is handled (completion gets called with -EINPROGRESS), and then
continues.  The references to af_alg_wait_for_completion() and
af_alg_complete() in the commit message are irrelevant because those
functions only handle one request at a time, unlink dm-crypt.

The problem is that the Freescale CAAM driver, which this problem is
described as being seen on, fails to implement the backlogging behaviour
correctly.  In cam_jr_enqueue(), if the hardware queue is full, it
simply returns -EBUSY without backlogging the request.  What the
observed deadlock was is not described in the commit message but it is
obviously the wait_for_completion() in crypto_convert() where dm-crypto
would wait for the completion being called with -EINPROGRESS in the case
of backlogged requests.  This completion will never be completed due to
the bug in the CAAM driver.

What this patch does is that it makes dm-crypt wait for every request,
even when the driver/hardware queues are not full, which means that
dm-crypt will never see -EBUSY.  This means that this patch will cause a
performance regression on all crypto drivers which implement the API
correctly.

So, I think this patch should be reverted in mainline and the stable
kernels where it has been merged, and correct backlog handling should be
implemented in the CAAM driver instead.

  reply	other threads:[~2015-05-04 21:33 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-02 19:02 [PATCH 3.14 00/92] 3.14.41-stable review Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 01/92] ip_forward: Drop frames with attached skb->sk Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 02/92] tcp: fix possible deadlock in tcp_send_fin() Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 03/92] tcp: avoid looping " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 04/92] net: do not deplete pfmemalloc reserve Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 05/92] net: fix crash in build_skb() Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 06/92] Btrfs: fix log tree corruption when fs mounted with -o discard Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 07/92] btrfs: dont accept bare namespace as a valid xattr Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 08/92] Btrfs: fix inode eviction infinite loop after cloning into it Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 09/92] Btrfs: fix inode eviction infinite loop after extent_same ioctl Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 10/92] sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 11/92] usb: gadget: composite: enable BESL support Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 12/92] KVM: s390: Zero out current VMDB of STSI before including level3 data Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 13/92] s390/hibernate: fix save and restore of kernel text section Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 15/92] MIPS: Hibernate: flush TLB entries earlier Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 16/92] md/raid0: fix bug with chunksize not a power of 2 Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 17/92] cdc-wdm: fix endianness bug in debug statements Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 18/92] spi: spidev: fix possible arithmetic overflow for multi-transfer message Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 19/92] compal-laptop: Check return value of power_supply_register Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 20/92] ring-buffer: Replace this_cpu_*() with __this_cpu_*() Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 21/92] power_supply: twl4030_madc: Check return value of power_supply_register Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 22/92] power_supply: lp8788-charger: Fix leaked power supply on probe fail Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 23/92] NFS: fix BUG() crash in notify_change() with patch to chown_common() Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 24/92] ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 25/92] ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 26/92] ARM: at91/dt: sama5d3 xplained: add phy address for macb1 Greg Kroah-Hartman
2015-05-04 18:09   ` Luis Henriques
2015-05-04 21:44     ` Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 27/92] ARM: dts: dove: Fix uart[23] reg property Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 28/92] usb: phy: Find the right match in devm_usb_phy_match Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 29/92] usb: define a generic USB_RESUME_TIMEOUT macro Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 30/92] usb: host: fusbh200: use new USB_RESUME_TIMEOUT Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 31/92] usb: host: uhci: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 32/92] usb: host: fotg210: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 33/92] usb: host: r8a66597: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 34/92] usb: host: isp116x: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 35/92] usb: host: xhci: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 36/92] usb: host: sl811: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 37/92] usb: dwc2: hcd: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 38/92] usb: core: hub: " Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 39/92] ALSA: emu10k1: dont deadlock in proc-functions Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 40/92] Input: elantech - fix absolute mode setting on some ASUS laptops Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 41/92] fs/binfmt_elf.c: fix bug in loading of PIE binaries Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 42/92] ptrace: fix race between ptrace_resume() and wait_task_stopped() Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 43/92] rtlwifi: rtl8192cu: Add new USB ID Greg Kroah-Hartman
2015-05-02 19:02 ` [PATCH 3.14 44/92] rtlwifi: rtl8192cu: Add new device ID Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 45/92] arm64: vdso: fix build error when switching from LE to BE Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 46/92] [SCSI] bfa: Replace large udelay() with mdelay() Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 47/92] drm/msm: use componentised device support Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 48/92] ext4: make fsync to sync parent dir in no-journal for real this time Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 49/92] powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 50/92] tools lib traceevent kbuffer: Remove extra update to data pointer in PADDING Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 51/92] tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 52/92] UBI: account for bitflips in both the VID header and data Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 53/92] UBI: fix out of bounds write Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 54/92] UBI: initialize LEB number variable Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 55/92] UBI: fix check for "too many bytes" Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 56/92] scsi: storvsc: Fix a bug in copy_from_bounce_buffer() Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 57/92] target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 58/92] target/file: Fix BUG() when CONFIG_DEBUG_SG=y and DIF protection enabled Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 59/92] target/file: Fix SG table for prot_buf initialization Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 60/92] Bluetooth: ath3k: Add support Atheros AR5B195 combo Mini PCIe card Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 61/92] powerpc: Fix missing L2 cache size in /sys/devices/system/cpu Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 62/92] powerpc/cell: Fix cell iommu after it_page_shift changes Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 63/92] ASoC: davinci-evm: drop un-necessary remove function Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 64/92] ACPICA: Utilities: split IO address types from data type models Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 65/92] ACPI / scan: Annotate physical_node_lock in acpi_scan_is_offline() Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 66/92] xtensa: xtfpga: fix hardware lockup caused by LCD driver Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 67/92] xtensa: provide __NR_sync_file_range2 instead of __NR_sync_file_range Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 68/92] xtensa: ISS: fix locking in TAP network adapter Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 69/92] gpio: mvebu: Fix mask/unmask managment per irq chip type Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 70/92] gpio: clamp returned values to the boolean range Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 71/92] clk: tegra: Register the proper number of resets Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 72/92] clk: qcom: fix RCG M/N counter configuration Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 73/92] dm crypt: fix deadlock when async crypto algorithm returns -EBUSY Greg Kroah-Hartman
2015-05-04 21:32   ` Rabin Vincent [this message]
2015-05-04 21:40     ` Ben Collins
2015-05-05  1:53     ` Herbert Xu
2015-05-05  3:22     ` Mike Snitzer
2015-05-05  6:42       ` Milan Broz
2015-05-05 12:50         ` Mike Snitzer
2015-05-02 19:03 ` [PATCH 3.14 74/92] Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open() Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 75/92] mvsas: fix panic on expander attached SATA devices Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 76/92] [media] stk1160: Make sure current buffer is released Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 77/92] IB/core: disallow registering 0-sized memory region Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 78/92] IB/core: dont disallow registering region starting at 0x0 Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 79/92] IB/mlx4: Fix WQE LSO segment calculation Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 80/92] i2c: core: Export bus recovery functions Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 81/92] drm/radeon: fix doublescan modes (v2) Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 82/92] drm/i915: cope with large i2c transfers Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 83/92] RCU pathwalk breakage when running into a symlink overmounting something Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 84/92] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 85/92] e1000: add dummy allocator to fix race condition between mtu change and netpoll Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 86/92] lib: memzero_explicit: use barrier instead of OPTIMIZER_HIDE_VAR Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 87/92] wl18xx: show rx_frames_per_rates as an array as it really is Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 88/92] crypto: omap-aes - Fix support for unequal lengths Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 89/92] C6x: time: Ensure consistency in __init Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 90/92] memstick: mspro_block: add missing curly braces Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 91/92] driver core: bus: Goto appropriate labels on failure in bus_add_device Greg Kroah-Hartman
2015-05-02 19:03 ` [PATCH 3.14 92/92] fs: take i_mutex during prepare_binprm for set[ug]id executables Greg Kroah-Hartman
2015-05-03 19:54 ` [PATCH 3.14 00/92] 3.14.41-stable review Guenter Roeck
2015-05-04 21:42   ` Greg Kroah-Hartman
2015-05-05  4:45   ` Guenter Roeck
2015-05-05 22:12     ` Greg Kroah-Hartman
2015-05-04 16:10 ` Shuah Khan
2015-05-04 21:43   ` Greg Kroah-Hartman
2015-05-05 22:10 ` Greg Kroah-Hartman
2015-05-06  1:51   ` Guenter Roeck
2015-05-06 16:01   ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150504213254.GA27445@debian \
    --to=rabin@rab.in \
    --cc=ben.c@servergy.com \
    --cc=dm-devel@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).