From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.3 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, NICE_REPLY_A,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEFF6C433E0 for ; Mon, 10 Aug 2020 12:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 997C4206C3 for ; Mon, 10 Aug 2020 12:32:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=mg.codeaurora.org header.i=@mg.codeaurora.org header.b="h3TScBre" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726644AbgHJMc1 (ORCPT ); Mon, 10 Aug 2020 08:32:27 -0400 Received: from mail29.static.mailgun.info ([104.130.122.29]:13907 "EHLO mail29.static.mailgun.info" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726614AbgHJMc0 (ORCPT ); Mon, 10 Aug 2020 08:32:26 -0400 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1597062745; h=Content-Transfer-Encoding: Content-Type: In-Reply-To: MIME-Version: Date: Message-ID: From: References: Cc: To: Subject: Sender; bh=YATn2IQenXC1AZhlnO1R/hOYwlccU9AexyRTM+qvl5Y=; b=h3TScBrewZ0psJ2ZTdiLpFKY9iEOhXr4O+WGo4pKmNMxO53G425dlJmFtVjEJWmUf7wpcNzy UJgAIWJmo/2NzPuboubCAB0kkw/Y2WCId6WOJm7Ire+4j4j+7571rx8yVaaoTGLg8UEDe/B3 U4V6FPOx0LmdZF6U4P4TrU/7XPA= X-Mailgun-Sending-Ip: 104.130.122.29 X-Mailgun-Sid: WyIzZmY0MiIsICJsaW51eC1zZXJpYWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by smtp-out-n09.prod.us-west-2.postgun.com with SMTP id 5f313e49d96d28d61e297def (version=TLS1.2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); Mon, 10 Aug 2020 12:32:09 GMT Received: by smtp.codeaurora.org (Postfix, from userid 1001) id EEE58C4339C; Mon, 10 Aug 2020 12:32:08 +0000 (UTC) Received: from [192.168.1.100] (unknown [47.8.236.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: akashast) by smtp.codeaurora.org (Postfix) with ESMTPSA id 04822C433C6; Mon, 10 Aug 2020 12:32:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 04822C433C6 Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; spf=none smtp.mailfrom=akashast@codeaurora.org Subject: Re: [PATCH] serial: qcom_geni_serial: Fix recent kdb hang To: Douglas Anderson , Greg Kroah-Hartman Cc: kgdb-bugreport@lists.sourceforge.net, Mukesh Savaliya , Andy Gross , Bjorn Andersson , Evan Green , Jiri Slaby , linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-serial@vger.kernel.org References: <20200806221904.1.I4455ff86f0ef5281c2a0cd0a4712db614548a5ca@changeid> From: Akash Asthana Message-ID: Date: Mon, 10 Aug 2020 18:01:54 +0530 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: <20200806221904.1.I4455ff86f0ef5281c2a0cd0a4712db614548a5ca@changeid> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: linux-serial-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-serial@vger.kernel.org Hi Doug, On 8/7/2020 10:49 AM, Douglas Anderson wrote: > The commit e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work > even if UART isn't console") worked pretty well and I've been doing a > lot of debugging with it. However, recently I typed "dmesg" in kdb > and then held the space key down to scroll through the pagination. My > device hung. This was repeatable and I found that it was introduced > with the aforementioned commit. > > It turns out that there are some strange boundary cases in geni where > in some weird situations it will signal RX_LAST but then will put 0 in > RX_LAST_BYTE. This means that the entire last FIFO entry is valid. IMO that means we received a word in RX_FIFO and it is the last word hence RX_LAST bit is set. RX_LAST_BYTE is 0 means none of the bytes are valid in the last word. In such scenario we should just read RX_FIFO buffer (to empty it), discard the word and return NO_POLL_CHAR. Something like below. ---------------------------------------------------------------------------------------------------------------------------------------------------------                 else                         private_data->poll_cached_bytes_cnt = 4;                 private_data->poll_cached_bytes =                         readl(uport->membase + SE_GENI_RX_FIFOn);         } +        if (!private_data->poll_cached_bytes_cnt) +              return NO_POLL_CHAR;         private_data->poll_cached_bytes_cnt--;         ret = private_data->poll_cached_bytes & 0xff; ------------------------------------------------------------------------------------------------------------------------------------------------------------- Please let me know whether above code helps. I am not sure about what all scenario can leads to this behavior from hardware, I will try to get an answer from hardware team. Any error bit was set for SE_GENI_S_IRQ_STATUS & SE_GENI_M_IRQ_STATUS registers? I guess the hang was seen because *poll_cached_bytes_cnt* is unsigned int and it's value was 0, when it's decremented by 1 it's value become '4294967295' (very large) and dummy RX (0x00) would happen that many times before reading any actual RX transfers/bytes. Regards, Akash > This weird corner case is handled in qcom_geni_serial_handle_rx() > where you can see that we only honor RX_LAST_BYTE if RX_LAST is set > _and_ RX_LAST_BYTE is non-zero. If either of these is not true we use > BYTES_PER_FIFO_WORD (4) for the size of the last FIFO word. > > Let's fix kgdb. While at it, also use the proper #define for 4. > > Fixes: e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") > Signed-off-by: Douglas Anderson > --- > > drivers/tty/serial/qcom_geni_serial.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > index 07b7b6b05b8b..e27077656939 100644 > --- a/drivers/tty/serial/qcom_geni_serial.c > +++ b/drivers/tty/serial/qcom_geni_serial.c > @@ -361,11 +361,16 @@ static int qcom_geni_serial_get_char(struct uart_port *uport) > return NO_POLL_CHAR; > > if (word_cnt == 1 && (status & RX_LAST)) > + /* > + * NOTE: If RX_LAST_BYTE_VALID is 0 it needs to be > + * treated as if it was BYTES_PER_FIFO_WORD. > + */ > private_data->poll_cached_bytes_cnt = > (status & RX_LAST_BYTE_VALID_MSK) >> > RX_LAST_BYTE_VALID_SHFT; > - else > - private_data->poll_cached_bytes_cnt = 4; > + > + if (private_data->poll_cached_bytes_cnt == 0) > + private_data->poll_cached_bytes_cnt = BYTES_PER_FIFO_WORD; > > private_data->poll_cached_bytes = > readl(uport->membase + SE_GENI_RX_FIFOn); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,\na Linux Foundation Collaborative Project