linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Sergey Jerusalimov <wintchester@gmail.com>,
	Ilya Dryomov <idryomov@gmail.com>,
	Jeff Layton <jlayton@redhat.com>
Subject: Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
Date: Tue, 28 Mar 2017 14:43:12 +0200
Message-ID: <20170328124312.GE18241@dhcp22.suse.cz>
In-Reply-To: <20170328122601.905696872@linuxfoundation.org>

On Tue 28-03-17 14:30:45, Greg KH wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.

I haven't seen the original patch, but the changelog worries me.
How exactly is this a problem? Where do we lock up? Does rbd/libceph take
any xfs locks?

> ------------------
> 
> From: Ilya Dryomov <idryomov@gmail.com>
> 
> commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 upstream.
> 
> sock_alloc_inode() allocates socket+inode and socket_wq with
> GFP_KERNEL, which is not allowed on the writeback path:
> 
>     Workqueue: ceph-msgr con_work [libceph]
>     ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
>     0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
>     ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
>     Call Trace:
>     [<ffffffff816dd629>] schedule+0x29/0x70
>     [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
>     [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
>     [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
>     [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
>     [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
>     [<ffffffff81086335>] flush_work+0x165/0x250
>     [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
>     [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
>     [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
>     [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
>     [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
>     [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
>     [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
>     [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
>     [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
>     [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
>     [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
>     [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
>     [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
>     [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
>     [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
>     [<ffffffff8115af70>] shrink_slab+0x100/0x140
>     [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
>     [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
>     [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
>     [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
>     [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
>     [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
>     [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
>     [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
>     [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
>     [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
>     [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
>     [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
>     [<ffffffff811d8566>] alloc_inode+0x26/0xa0
>     [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
>     [<ffffffff815b933e>] sock_alloc+0x1e/0x80
>     [<ffffffff815ba855>] __sock_create+0x95/0x220
>     [<ffffffff815baa04>] sock_create_kern+0x24/0x30
>     [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
>     [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
>     [<ffffffff81084c19>] process_one_work+0x159/0x4f0
>     [<ffffffff8108561b>] worker_thread+0x11b/0x530
>     [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
>     [<ffffffff8108b6f9>] kthread+0xc9/0xe0
>     [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
>     [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
>     [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
> 
> Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
> 
> Link: http://tracker.ceph.com/issues/19309
> Reported-by: Sergey Jerusalimov <wintchester@gmail.com>
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> Reviewed-by: Jeff Layton <jlayton@redhat.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  net/ceph/messenger.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -7,6 +7,7 @@
>  #include <linux/kthread.h>
>  #include <linux/net.h>
>  #include <linux/nsproxy.h>
> +#include <linux/sched.h>
>  #include <linux/slab.h>
>  #include <linux/socket.h>
>  #include <linux/string.h>
> @@ -478,11 +479,16 @@ static int ceph_tcp_connect(struct ceph_
>  {
>  	struct sockaddr_storage *paddr = &con->peer_addr.in_addr;
>  	struct socket *sock;
> +	unsigned int noio_flag;
>  	int ret;
>  
>  	BUG_ON(con->sock);
> +
> +	/* sock_create_kern() allocates with GFP_KERNEL */
> +	noio_flag = memalloc_noio_save();
>  	ret = sock_create_kern(read_pnet(&con->msgr->net), paddr->ss_family,
>  			       SOCK_STREAM, IPPROTO_TCP, &sock);
> +	memalloc_noio_restore(noio_flag);
>  	if (ret)
>  		return ret;
>  	sock->sk->sk_allocation = GFP_NOFS;
> 
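
For readers following along, the scoped save/restore pattern used by the
quoted patch can be modelled in userspace roughly as below. This is only a
sketch under stated assumptions, not kernel code: every model_* function and
MODEL_* constant is a hypothetical stand-in with made-up bit values, and the
masking rule (clearing both the IO and FS bits while the task flag is set)
reflects my understanding of how the 4.4-era allocator treats
PF_MEMALLOC_NOIO. It shows why a callee that hard-codes GFP_KERNEL, as
sock_create_kern() does, would no longer be able to recurse into filesystem
shrinkers such as the xfs inode reclaim seen in the trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel GFP bits; values are illustrative only. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical stand-in for the PF_MEMALLOC_NOIO bit in current->flags. */
#define MODEL_PF_NOIO     0x4u

/* Models the per-task flags word (current->flags in the kernel). */
static unsigned int model_task_flags;

/* Set the NOIO bit and return its previous state so that scopes nest. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOIO;

	model_task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Put the NOIO bit back to whatever model_noio_save() returned. */
static void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOIO) | old;
}

/* What the allocator does with the caller's mask while the bit is set. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Whether reclaim may call into filesystem shrinkers for this mask. */
static bool model_may_enter_fs(unsigned int gfp)
{
	return (model_effective_gfp(gfp) & MODEL_GFP_FS) != 0;
}

/*
 * Mirrors the shape of the patched ceph_tcp_connect(): the callee asks for
 * GFP_KERNEL, but the scope set up by the caller strips IO and FS.
 */
static bool model_connect_may_enter_fs(void)
{
	unsigned int noio_flag = model_noio_save();
	bool entered = model_may_enter_fs(MODEL_GFP_KERNEL);

	model_noio_restore(noio_flag);
	/* The bit must be back in the state it had before the scope. */
	assert(!(model_task_flags & MODEL_PF_NOIO) == !noio_flag);
	return entered;
}
```

Returning the previous bit from the save helper, rather than clearing
unconditionally on restore, is what lets an inner scope sit inside an outer
one without ending the outer restriction early.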

-- 
Michal Hocko
SUSE Labs
