Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: CAI Qian <caiqian@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>, Nick Piggin <npiggin@gmail.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC][CFT] splice_read reworked
Date: Tue, 4 Oct 2016 08:12:29 +1100
Message-ID: <20161003211229.GN27872@dastard> (raw)
In-Reply-To: <59188294.61623.1475508050989.JavaMail.zimbra@redhat.com>

On Mon, Oct 03, 2016 at 11:20:50AM -0400, CAI Qian wrote:
> > container backed by overlayfs/xfs.
> There is another warning happened once so far. Not sure if related.
> 
> [  447.961826] ------------[ cut here ]------------
> [  447.967020] WARNING: CPU: 39 PID: 27352 at fs/xfs/xfs_file.c:626 xfs_file_dio_aio_write+0x3dc/0x4b0 [xfs]
> [  447.977736] Modules linked in: ieee802154_socket ieee802154 af_key vmw_vsock_vmci_transport vsock vmw_vmci bluetooth rfkill can pptp gre l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc nfnetlink scsi_transport_iscsi atm sctp veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc overlay intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support pcspkr i2c_i801 i2c_smbus ipmi_ssif mei_me sg mei shpchp lpc_ich wmi ipmi_si ipmi_msghandler acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crc32c_intel ttm ixgbe drm mdio ahci ptp libahci pps_core libata i2c_core dca fjes dm_mirror dm_region_hash dm_log dm_mod
> [  448.086775] CPU: 39 PID: 27352 Comm: trinity-c39 Not tainted 4.8.0-rc8-splice+ #1
> [  448.095126] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0044.R00.1501191641 01/19/2015
> [  448.106483]  0000000000000286 00000000389140f2 ffff880404833c48 ffffffff813d2eac
> [  448.114776]  0000000000000000 0000000000000000 ffff880404833c88 ffffffff8109cf11
> [  448.123067]  00000272389140f2 ffff880404833d80 ffff880404833dd8 ffff8803bfba88e8
> [  448.131356] Call Trace:
> [  448.134088]  [<ffffffff813d2eac>] dump_stack+0x85/0xc9
> [  448.139821]  [<ffffffff8109cf11>] __warn+0xd1/0xf0
> [  448.145167]  [<ffffffff8109d04d>] warn_slowpath_null+0x1d/0x20
> [  448.151705]  [<ffffffffa044165c>] xfs_file_dio_aio_write+0x3dc/0x4b0 [xfs]
> [  448.159394]  [<ffffffffa0441b10>] xfs_file_write_iter+0x90/0x130 [xfs]
> [  448.166679]  [<ffffffff81280eee>] do_iter_readv_writev+0xae/0x130
> [  448.173479]  [<ffffffff81281992>] do_readv_writev+0x1a2/0x230
> [  448.179906]  [<ffffffffa0441a80>] ? xfs_file_buffered_aio_write+0x350/0x350 [xfs]
> [  448.188256]  [<ffffffff8117729f>] ? __audit_syscall_entry+0xaf/0x100
> [  448.195347]  [<ffffffff810fce1d>] ? trace_hardirqs_on+0xd/0x10
> [  448.201855]  [<ffffffff8117729f>] ? __audit_syscall_entry+0xaf/0x100
> [  448.208944]  [<ffffffff81281c6c>] vfs_writev+0x3c/0x50
> [  448.214675]  [<ffffffff81281e22>] do_pwritev+0xa2/0xc0
> [  448.220407]  [<ffffffff81282f11>] SyS_pwritev+0x11/0x20
> [  448.226237]  [<ffffffff81003c9c>] do_syscall_64+0x6c/0x1e0
> [  448.232358]  [<ffffffff817d4a3f>] entry_SYSCALL64_slow_path+0x25/0x25
> [  448.239560] ---[ end trace 1c54e743f1fa4f5e ]---

This usually happens when an application mixes mmap access and
direct IO to the same file. The warning fires when the direct IO
cannot invalidate the cached range after writeback (e.g. writeback
raced with mmap app faulting and dirtying the page again), and hence
results in the page cache containing stale data.  This warning fires
when that happens, indicating to developers who get a bug report
about data corruption that it's the userspace application that is
the problem, not the filesystem. i.e the application is doing
something we explicitly document they should not do:

$ man 2 open
....
  O_DIRECT
....
       Applications should avoid mixing O_DIRECT and normal I/O to
       the same file, and especially to overlapping byte regions in
       the  same  file.   Even  when  the filesystem  correctly
       handles the coherency issues in this situation, overall I/O
       throughput is likely to be slower than using either mode
       alone.  Likewise, applications should avoid mixing mmap(2) of
       files with direct I/O to the same files.

Splice should not have this problem if the IO path locking is
correct, as both direct IO and splice IO use the same inode lock for
exclusion. i.e. splice write should not be running at the same time
as a direct IO read or write....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply index

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20160908235521.GL2356@ZenIV.linux.org.uk>
     [not found] ` <20160909015324.GD30056@dastard>
     [not found]   ` <CA+55aFzohsUXj_3BeFNr2t50Wm=G+7toRDEz=Tk7VJqP3n1hXQ@mail.gmail.com>
     [not found]     ` <CA+55aFxrqCng2Qxasc9pyMrKUGFjo==fEaFT1vkH9Lncte3RgQ@mail.gmail.com>
     [not found]       ` <20160909023452.GO2356@ZenIV.linux.org.uk>
     [not found]         ` <CA+55aFwHQMjO4-vtfB9-ytc=o+DRo-HXVGckvXLboUxgpwb7_g@mail.gmail.com>
     [not found]           ` <20160909221945.GQ2356@ZenIV.linux.org.uk>
     [not found]             ` <CA+55aFzTOOB6oEVaaGD0N7Uznk-W9+ULPwzsxS_L_oZqGVSeLA@mail.gmail.com>
     [not found]               ` <20160914031648.GB2356@ZenIV.linux.org.uk>
     [not found]                 ` <20160914133925.2fba4629@roar.ozlabs.ibm.com>
2016-09-18  5:33                   ` xfs_file_splice_read: possible circular locking dependency detected Al Viro
2016-09-19  3:08                     ` Nicholas Piggin
2016-09-19  6:11                       ` Al Viro
2016-09-19  7:26                         ` Nicholas Piggin
     [not found]                 ` <CA+55aFznQaOWoSMNphgGJJWZ=8-odrc0DAUMzfGPQe+_N4UgNA@mail.gmail.com>
     [not found]                   ` <20160914042559.GC2356@ZenIV.linux.org.uk>
     [not found]                     ` <20160917082007.GA6489@ZenIV.linux.org.uk>
     [not found]                       ` <20160917190023.GA8039@ZenIV.linux.org.uk>
2016-09-18 19:31                         ` skb_splice_bits() and large chunks in pipe (was " Al Viro
2016-09-18 20:12                           ` Linus Torvalds
2016-09-18 22:31                             ` Al Viro
2016-09-19  0:18                               ` Linus Torvalds
2016-09-23 19:00                         ` [RFC][CFT] splice_read reworked Al Viro
2016-09-23 19:01                           ` [PATCH 01/11] fix memory leaks in tracing_buffers_splice_read() Al Viro
2016-09-23 19:02                           ` [PATCH 02/11] splice_to_pipe(): don't open-code wakeup_pipe_readers() Al Viro
2016-09-23 19:02                           ` [PATCH 03/11] splice: switch get_iovec_page_array() to iov_iter Al Viro
2016-09-23 19:03                           ` [PATCH 04/11] splice: lift pipe_lock out of splice_to_pipe() Al Viro
2016-09-23 19:45                             ` Linus Torvalds
2016-09-23 20:10                               ` Al Viro
2016-09-23 20:36                                 ` Linus Torvalds
2016-09-24  3:59                                   ` Al Viro
2016-09-24 17:29                                     ` Al Viro
2016-09-27 15:38                                       ` Nicholas Piggin
2016-09-27 15:53                                       ` Chuck Lever
2016-09-24  3:59                                   ` [PATCH 04/12] " Al Viro
2016-09-26 13:35                                     ` Miklos Szeredi
2016-09-27  4:14                                       ` Al Viro
2016-12-17 19:54                                     ` Andreas Schwab
2016-12-18 19:28                                       ` Linus Torvalds
2016-12-18 19:57                                         ` Andreas Schwab
2016-12-18 20:12                                         ` Al Viro
2016-12-18 20:30                                           ` Al Viro
2016-12-18 22:10                                             ` Linus Torvalds
2016-12-18 22:18                                               ` Al Viro
2016-12-18 22:22                                                 ` Linus Torvalds
2016-12-18 22:49                                               ` Andreas Schwab
2016-12-21 18:56                                               ` Andreas Schwab
2016-12-21 19:12                                                 ` Linus Torvalds
2016-09-24  4:00                                   ` [PATCH 06/12] new helper: add_to_pipe() Al Viro
2016-09-26 13:49                                     ` Miklos Szeredi
2016-09-24  4:01                                   ` [PATCH 10/12] new iov_iter flavour: pipe-backed Al Viro
2016-09-29 20:53                                     ` Miklos Szeredi
2016-09-29 22:50                                       ` Al Viro
2016-09-30  7:30                                         ` Miklos Szeredi
2016-10-03  3:34                                           ` [RFC] O_DIRECT vs EFAULT (was Re: [PATCH 10/12] new iov_iter flavour: pipe-backed) Al Viro
2016-10-03 17:07                                             ` Linus Torvalds
2016-10-03 18:54                                               ` Al Viro
2016-09-24  4:01                                   ` [PATCH 11/12] switch generic_file_splice_read() to use of ->read_iter() Al Viro
2016-09-24  4:02                                   ` [PATCH 12/12] switch default_file_splice_read() to use of pipe-backed iov_iter Al Viro
2016-09-23 19:03                           ` [PATCH 05/11] skb_splice_bits(): get rid of callback Al Viro
2016-09-23 19:04                           ` [PATCH 06/11] new helper: add_to_pipe() Al Viro
2016-09-23 19:04                           ` [PATCH 07/11] fuse_dev_splice_read(): switch to add_to_pipe() Al Viro
2016-09-23 19:06                           ` [PATCH 08/11] cifs: don't use memcpy() to copy struct iov_iter Al Viro
2016-09-23 19:08                           ` [PATCH 09/11] fuse_ioctl_copy_user(): don't open-code copy_page_{to,from}_iter() Al Viro
2016-09-26  9:31                             ` Miklos Szeredi
2016-09-23 19:09                           ` [PATCH 10/11] new iov_iter flavour: pipe-backed Al Viro
2016-09-23 19:10                           ` [PATCH 11/11] switch generic_file_splice_read() to use of ->read_iter() Al Viro
2016-09-30 13:32                           ` [RFC][CFT] splice_read reworked CAI Qian
2016-09-30 17:42                             ` CAI Qian
2016-09-30 18:33                               ` CAI Qian
2016-10-03  1:37                                 ` Al Viro
2016-10-03 17:49                                   ` CAI Qian
2016-10-04 17:39                                     ` local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked) CAI Qian
2016-10-04 21:42                                       ` tj
2016-10-05 14:09                                         ` CAI Qian
2016-10-05 15:30                                           ` tj
2016-10-05 15:54                                             ` CAI Qian
2016-10-05 18:57                                               ` CAI Qian
2016-10-05 20:05                                                 ` Al Viro
2016-10-06 12:20                                                   ` CAI Qian
2016-10-06 12:25                                                     ` CAI Qian
2016-10-06 16:11                                                       ` CAI Qian
2016-10-06 17:00                                                         ` Linus Torvalds
2016-10-06 18:12                                                           ` CAI Qian
2016-10-07  9:57                                                           ` Dave Chinner
2016-10-07 15:25                                                             ` Linus Torvalds
2016-10-07  7:08                                                       ` Jan Kara
2016-10-07 14:43                                                         ` CAI Qian
2016-10-07 15:27                                                           ` CAI Qian
2016-10-07 18:56                                                             ` CAI Qian
2016-10-09 21:54                                                               ` Dave Chinner
2016-10-10 14:10                                                                 ` CAI Qian
2016-10-10 20:14                                                                   ` CAI Qian
2016-10-10 21:57                                                                   ` Dave Chinner
2016-10-12 19:50                                                                     ` [bisected] " CAI Qian
2016-10-12 20:59                                                                       ` Dave Chinner
2016-10-13 16:25                                                                         ` CAI Qian
2016-10-13 20:49                                                                           ` Dave Chinner
2016-10-13 20:56                                                                             ` CAI Qian
2016-10-09 21:51                                                           ` Dave Chinner
2016-10-21 15:38                                                         ` [4.9-rc1+] overlayfs lockdep CAI Qian
2016-10-24 12:57                                                           ` Miklos Szeredi
2016-10-07  9:27                                                     ` local DoS - systemd hang or timeout (WAS: Re: [RFC][CFT] splice_read reworked) Dave Chinner
2016-10-03  1:42                               ` [RFC][CFT] splice_read reworked Al Viro
2016-10-03 14:06                                 ` CAI Qian
2016-10-03 15:20                                   ` CAI Qian
2016-10-03 21:12                                     ` Dave Chinner [this message]
2016-10-04 13:57                                       ` CAI Qian
2016-10-03 20:32                                   ` CAI Qian
2016-10-03 20:35                                     ` Al Viro
2016-10-04 13:29                                       ` CAI Qian
2016-10-04 14:28                                         ` Al Viro
2016-10-04 16:21                                           ` CAI Qian
2016-10-04 20:12                                             ` Al Viro
2016-10-05 14:30                                               ` CAI Qian
2016-10-05 16:07                                                 ` Al Viro
2016-09-19  0:22 skb_splice_bits() and large chunks in pipe (was Re: xfs_file_splice_read: possible circular locking dependency detected Al Viro
2016-09-20  9:51 ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161003211229.GN27872@dastard \
    --to=david@fromorbit.com \
    --cc=axboe@kernel.dk \
    --cc=caiqian@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=npiggin@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git