From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Xianting Tian <xianting_tian@126.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	yubin@h3c.com, Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.4 40/64] mm/filemap.c: clear page error before actual read
Date: Thu, 17 Sep 2020 22:16:19 -0400
Message-ID: <20200918021643.2067895-40-sashal@kernel.org>
In-Reply-To: <20200918021643.2067895-1-sashal@kernel.org>

From: Xianting Tian <xianting_tian@126.com>

[ Upstream commit faffdfa04fa11ccf048cebdde73db41ede0679e0 ]

A mount failure occurs in the following scenario: an application forks
dozens of threads, each mounting its own cramfs image in docker, and
several of the mounts fail with high probability.  The mount fails because
the page read from the superblock of the loop device is not uptodate
after wait_on_page_locked(page) returns in cramfs_read:

   wait_on_page_locked(page);
   if (!PageUptodate(page)) {
      ...
   }

The page is not uptodate for the following reason: systemd-udevd reads
the loopX device before the mount.  Because loopX is still in state
Lo_unbound at that time, loop_make_request directly invokes the io_end
handler end_buffer_async_read, which calls SetPageError(page).  As a
result, the page cannot be set uptodate in end_buffer_async_read:

   if (page_uptodate && !PageError(page)) {
      SetPageUptodate(page);
   }
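
The effect of a stale error flag can be modeled in user space.  The
sketch below is not kernel code: struct model_page, model_end_read() and
model_second_read_uptodate() are hypothetical stand-ins that only mirror
the completion check quoted above, under the assumption that a failed
I/O sets the error flag and nothing clears it in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the two page flags involved. */
struct model_page {
	bool error;     /* models PG_error */
	bool uptodate;  /* models PG_uptodate */
};

/*
 * Models the completion check: a failed I/O sets the error flag, and
 * the page becomes uptodate only if the I/O succeeded AND no error
 * flag is set, including one left over from an earlier read.
 */
static void model_end_read(struct model_page *page, bool io_ok)
{
	assert(page != NULL);
	if (!io_ok)
		page->error = true;
	if (io_ok && !page->error)
		page->uptodate = true;
}

/* A failed read followed by a successful one, with no clear in between. */
static bool model_second_read_uptodate(void)
{
	struct model_page page = { false, false };

	model_end_read(&page, false);  /* read while the device is unbound */
	model_end_read(&page, true);   /* later read succeeds at the I/O level */
	return page.uptodate;
}
```

In this model the second read completes without an I/O error, yet the
page stays not uptodate because the flag from the first read is still
set.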

The mount operation then uses the same page that systemd-udevd has just
accessed.  Because the page is not uptodate, the mount starts an actual
read via submit_bh and waits on the page by calling
wait_on_page_locked(page).  When the I/O on the page completes, the
io_end handler end_buffer_async_read is called.  Nothing in the mount
read path has cleared the page error left by the systemd-udevd read, so
the page is still in the "PageError" state and cannot be set uptodate in
end_buffer_async_read, which causes the mount failure.

Sometimes, however, the mount succeeds even though systemd-udevd read the
loopX device just before.  The reason is that systemd-udevd issues
another loopX read between steps 3.1 and 3.2, as shown below:

1, the default state of the loopX device is Lo_unbound;
2, systemd-udevd reads the loopX device (page is set to PageError);
3, mount operation
   1) set the loopX state to Lo_bound;
   ==>systemd-udevd reads the loopX device<==
   2) read the loopX device (page has no error)
   3) mount succeeds

Because the loopX state is set to Lo_bound in step 3.1, this second read
by systemd-udevd goes through the whole I/O stack.  Part of the call
trace is shown below:

   SYS_read
      vfs_read
          do_sync_read
              blkdev_aio_read
                 generic_file_aio_read
                     do_generic_file_read:
                        ClearPageError(page);
                        mapping->a_ops->readpage(filp, page);

Here, mapping->a_ops->readpage() is blkdev_readpage.  In the latest
kernel some function names have changed, and the call trace is as follows:

   blkdev_read_iter
      generic_file_read_iter
         generic_file_buffered_read:
            /*
             * A previous I/O error may have been due to temporary
             * failures, eg. multipath errors.
             * PG_error will be set again if readpage fails.
             */
            ClearPageError(page);
            /* Start the actual read. The read will unlock the page. */
            error = mapping->a_ops->readpage(filp, page);

We can see that ClearPageError(page) is called before the actual read,
so the read in step 3.2 succeeds.
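
The dependence on ordering can be replayed with a small user-space
model.  This is only a sketch: enum model_event and model_replay() are
hypothetical names, and each event is reduced to its effect on the two
page flags, assuming every read on a bound device succeeds at the I/O
level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event names for the steps listed above. */
enum model_event {
	UDEVD_READ_UNBOUND,  /* step 2: read fails at once, error flag set */
	UDEVD_READ_BOUND,    /* generic read path: clear error, then read */
	MOUNT_READ           /* step 3.2: unpatched mount read, no clear */
};

/*
 * Replays the events in order against a single modeled page and
 * returns whether the page is uptodate at the end, i.e. whether the
 * mount would succeed in this model.
 */
static bool model_replay(const enum model_event *events, size_t n)
{
	bool error = false, uptodate = false;
	size_t i;

	assert(events != NULL || n == 0);
	for (i = 0; i < n; i++) {
		switch (events[i]) {
		case UDEVD_READ_UNBOUND:
			error = true;      /* models SetPageError() */
			break;
		case UDEVD_READ_BOUND:
			error = false;     /* models ClearPageError() */
			uptodate = true;   /* I/O succeeds with no error set */
			break;
		case MOUNT_READ:
			if (!error)
				uptodate = true;  /* no stale error in the way */
			break;
		}
	}
	return uptodate;
}
```

Replaying { UDEVD_READ_UNBOUND, UDEVD_READ_BOUND, MOUNT_READ } leaves
the page uptodate, while { UDEVD_READ_UNBOUND, MOUNT_READ } does not,
which matches the intermittent behavior described above.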

This patch adds a call to ClearPageError just before the actual read in
the read path used by the cramfs mount.  Without the patch, the call
trace when performing a cramfs mount is:

   do_mount
      cramfs_read
         cramfs_blkdev_read
            read_cache_page
               do_read_cache_page:
                  filler(data, page);
                  or
                  mapping->a_ops->readpage(data, page);

With the patch, the call trace when performing the mount is:

   do_mount
      cramfs_read
         cramfs_blkdev_read
            read_cache_page
               do_read_cache_page:
                  ClearPageError(page); <== newly added
                  filler(data, page);
                  or
                  mapping->a_ops->readpage(data, page);

With the patch, the mount operation calls ClearPageError(page) before the
actual read, so the page carries no error unless a new page error occurs
when the I/O completes.
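
The difference between the unpatched and the patched read step can be
sketched in user space as follows.  The names struct model_page and
model_read_page() and the clear_first parameter are hypothetical; the
code only models the flag handling and is not the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the two page flags involved. */
struct model_page {
	bool error;     /* models PG_error */
	bool uptodate;  /* models PG_uptodate */
};

/*
 * Models one pass through the read step.  clear_first selects the
 * patched behavior: the error flag is cleared before the read starts.
 * Returns whether the page is uptodate afterwards.
 */
static bool model_read_page(struct model_page *page, bool clear_first,
			    bool io_ok)
{
	assert(page != NULL);
	if (clear_first)
		page->error = false;   /* models the added ClearPageError() */
	if (!io_ok)
		page->error = true;    /* a real failure sets the flag again */
	else if (!page->error)
		page->uptodate = true;
	return page->uptodate;
}
```

Starting from a page with a stale error, a successful read leaves the
page not uptodate when clear_first is false and uptodate when it is
true.  A read that really fails still sets the error flag again in
both cases.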

Signed-off-by: Xianting Tian <xianting_tian@126.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: <yubin@h3c.com>
Link: http://lkml.kernel.org/r/1583318844-22971-1-git-send-email-xianting_tian@126.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/filemap.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index f217120973ebe..3d0a0e409cbf5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2313,6 +2313,14 @@ filler:
 		unlock_page(page);
 		goto out;
 	}
+
+	/*
+	 * A previous I/O error may have been due to temporary
+	 * failures.
+	 * Clear page error before actual read, PG_error will be
+	 * set again if read page fails.
+	 */
+	ClearPageError(page);
 	goto filler;
 
 out:
-- 
2.25.1

