From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B899C433DF for ; Tue, 18 Aug 2020 11:58:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 44DC6206B5 for ; Tue, 18 Aug 2020 11:58:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 44DC6206B5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=mentor.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CC10F6B0022; Tue, 18 Aug 2020 07:58:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C52D16B0024; Tue, 18 Aug 2020 07:58:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B13956B0026; Tue, 18 Aug 2020 07:58:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0052.hostedemail.com [216.40.44.52]) by kanga.kvack.org (Postfix) with ESMTP id 98C936B0022 for ; Tue, 18 Aug 2020 07:58:21 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 5A6D02489 for ; Tue, 18 Aug 2020 11:58:21 +0000 (UTC) X-FDA: 77163541602.16.car81_1a1523d2701f Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin16.hostedemail.com (Postfix) with ESMTP id 23B58100E690C for ; Tue, 18 Aug 2020 11:58:21 +0000 (UTC) X-HE-Tag: car81_1a1523d2701f X-Filterd-Recvd-Size: 6148 Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by imf37.hostedemail.com (Postfix) with ESMTP for ; Tue, 18 Aug 2020 11:58:20 +0000 (UTC) IronPort-SDR: CYi9SrS/cnaVhkFNV9GstPyOyibzH1orezPvc1zMhy3RKJHyOytzYGrF9BackoPOpTNh60WGk6 g7J6a9AqYaKM6+DAUmq0lz2TPsaoUZMKgkPFK5yCxC6iGeKSXggmj9Jcs9Z96tCS3ZGY2z5zQI WqPJKdbN5CUAExeGPEmz60bT7zLKEsoQLNHBFoNcrOgWS1M86DHw3fMDW1gA7dG3qe/3zycfhx 1yK1dC2Yp9UeLnYEoqPLBaTZiQW8qkvSTYPAGBZNEhTesn+4n6JL4xHfZNwxlNfHEJanPW7Ymk e80= X-IronPort-AV: E=Sophos;i="5.76,327,1592899200"; d="scan'208";a="51997026" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 18 Aug 2020 03:58:18 -0800 IronPort-SDR: UacQbdCiZh6aLafhHVEmRcUXoNib6UbiNdqpwhet+hEKxZkvk+OYUUR+f9yf6A+NJRyEd8YZHA I4iIXGc6vZRUSEqQxRk3ein5Pt6wgk1kNTVMe+2Zj8nXV2Y/t5KNXwNAOSwQwRmfA5zl4DitRr euRNdMn1Iar92Sab2H0DoOIt3xyxkzC5Q5DX8pJiPj2N8kc0a+x4xoDNvFk42n6TMgKQAF0xxx LzGLV6f4KdWoTGPkXM3kl2/kW4ne5G/5LRLr3lfG7Io+G3AkfSRW8dhU3UPVR8g5yQMAitQsSp PO4= To: , , From: Jim Baxter Subject: PROBLEM: Long Workqueue delays V2 CC: "Resch Carsten (CM/ESO6)" , "Rosca, Eugeniu (ADITG/ESB)" Message-ID: <625615f2-3a6b-3136-35f9-2f2fb3c110cf@mentor.com> Date: Tue, 18 Aug 2020 12:58:13 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Language: en-GB Content-Transfer-Encoding: 7bit X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-06.mgc.mentorg.com (139.181.222.6) To SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) X-Rspamd-Queue-Id: 23B58100E690C X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000002, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: I am asking this question again to include the fs-devel list. We have issues with the workqueue of the kernel overloading the CPU 0 when we we disconnect a USB stick. This results in other items on the shared workqueue being delayed by around 6.5 seconds with a default kernel configuration and 2.3 seconds on a config tailored for our RCar embedded platform. We first noticed this issue on custom hardware and we have recreated it on an RCar Starter Kit using a test module [1] to replicate the behaviour, the test module outputs any delays of greater then 9ms. To run the test we have a 4GB random file on a USB stick and perform the following test. The stick is mounted as R/O and we are copying data from the stick: - Mount the stick. mount -o ro,remount /dev/sda1 - Load the Module: # taskset -c 0 modprobe latency-mon - Copy large amount of data from the stick: # dd if=/run/media/sda1/sample.txt of=/dev/zero [ 1437.517603] DELAY: 10 8388607+1 records in 8388607+1 records out - Disconnect the USB stick: [ 1551.796792] usb 2-1: USB disconnect, device number 2 [ 1558.625517] DELAY: 6782 The Delay output 6782 is in milliseconds. Using umount stops the issue occurring but is unfortunately not guaranteed in our particular system. >From my analysis the hub_event workqueue kworker/0:1+usb thread uses around 98% of the CPU. I have traced the workqueue:workqueue_queue_work function while unplugging the USB and there is no particular workqueue function being executed a lot more then the others for the kworker/0:1+usb thread. Using perf I identified the hub_events workqueue was spending a lot of time in invalidate_partition(), I have included a cut down the captured data from perf in [2] which shows the additional functions where the kworker spends most of its time. I am aware there will be delays on the shared workqueue, are the delays we are seeing considered normal? Is there any way to mitigate or identify where the delay is? I am unsure if this is a memory or filesystem subsystem issue. Thank you for you help. Thanks, Jim Baxter [1] Test Module: // SPDX-License-Identifier: GPL-2.0 /* * Simple WQ latency monitoring * * Copyright (C) 2020 Advanced Driver Information Technology. */ #include #include #include #define PERIOD_MS 100 static struct delayed_work wq; static u64 us_save; static void wq_cb(struct work_struct *work) { u64 us = ktime_to_us(ktime_get()); u64 us_diff = us - us_save; u64 us_print = 0; if (!us_save) goto skip_print; us_print = us_diff / 1000 - PERIOD_MS; if (us_print > 9) pr_crit("DELAY: %lld\n", us_print); skip_print: us_save = us; schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS)); } static int latency_mon_init(void) { us_save = 0; INIT_DELAYED_WORK(&wq, wq_cb); schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS)); return 0; } static void latency_mon_exit(void) { cancel_delayed_work_sync(&wq); pr_info("%s\n", __func__); } module_init(latency_mon_init); module_exit(latency_mon_exit); MODULE_AUTHOR("Eugeniu Rosca "); MODULE_LICENSE("GPL"); [2] perf trace: 95.22% 0.00% kworker/0:2-eve [kernel.kallsyms] | ---ret_from_fork kthread worker_thread | --95.15%--process_one_work | --94.99%--hub_event | --94.99%--usb_disconnect | --94.90%--invalidate_partition __invalidate_device | |--64.55%--invalidate_bdev | | | --64.13%--invalidate_mapping_pages | | | |--24.09%--invalidate_inode_page | | | | | --23.44%--remove_mapping | | | | | --23.20%--__remove_mapping | | | | | --21.90%--arch_local_irq_restore | | | |--22.44%--arch_local_irq_enable | --30.35%--shrink_dcache_sb | --30.17%--truncate_inode_pages_range