From: Arnd Bergmann
Date: Tue, 24 May 2022 16:36:54 +0200
Subject: Re: am335x: 5.18.x: system stalling
To: Yegor Yefremov
Cc: Arnd Bergmann, Tony Lindgren, Ard Biesheuvel, Linux-OMAP, linux-clk, Stephen Boyd, Linux ARM
X-Mailing-List: linux-clk@vger.kernel.org

On Tue, May 24, 2022 at 3:38 PM Yegor Yefremov wrote:
> On Sat, May 21, 2022 at 9:41 PM Arnd Bergmann wrote:
> > On Thu, May 19, 2022 at 5:52 PM Yegor Yefremov wrote:
> >
> > Ok, so this is just a serial port based driver, which means the
> > follow-up question is what you use for your uart. Is this one of
> > the USB-serial ones or an on-chip uart? Which driver?
>
> This is the following chain: am335x -> musb -> ftdi_sio (FT-X flavor).
>
> I have also tried another system with two FT4232 chips (RS232 devices)
> and performed transmission tests. This had no effect, the system
> didn't stall.

Ok, I see. I looked at ftdi_sio, and found a couple of slightly suspicious
code paths in the FT-X specific bits, but after looking more closely I found
nothing actually wrong with them.
It might still be worth trying more combinations of those, e.g. whether the
FT-X uart fails without the CAN adapter, or whether it fails on the other
machine.

> > > > > CONFIG_DMA_API_DEBUG is still likely to pinpoint the bug, but I might also
> > > > > just see it by looking at the right source file.
> > > >
> > > > I'll try to get more debug info with CONFIG_DMA_API_DEBUG.
> > >
> > > DMA_API_DEBUG showed nothing new. But disabling the CPUfreq driver
> > > "solved" the problem. I have tried different governors and got these
> > > two groups:
> > >
> > > ondemand, schedutil - cause the problem
> > > conservative, powersave, performance and userspace - don't cause the problem
> > >
> > > So far, I have only seen the same debug output that I've initially
> > > sent and in most cases, the system stalls without the output.
> >
> > Ok, so that sounds like it happens when you change the frequency.
> > I assume this means you are using drivers/cpufreq/omap-cpufreq.c?
>
> Yes.
>
> > When using the userspace governor, do you see problems when you
> > manually change the frequency from sysfs?
>
> No, I can switch between 300MHz and 600MHz and perform CAN tests.
> Everything goes well.

One more idea: maybe this is a case where we actually run out of stack
space? Without VMAP stacks, that may easily go unnoticed, but with VMAP
stacks it is supposed to produce an obvious error message with a backtrace.
If we have a callchain that involves

  can_xmit -> tty -> tty_usb -> usb -> musb -> schedule
    -> cpufreq_update_util -> omap_cpufreq

we might run out of the 8KB stack area.
It's probably not this, but if you want to rule it out, try using

#define THREAD_SIZE_ORDER 2

       Arnd