From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 015CBC3F6B0 for ; Fri, 5 Aug 2022 08:57:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240316AbiHEI4g (ORCPT ); Fri, 5 Aug 2022 04:56:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234856AbiHEI41 (ORCPT ); Fri, 5 Aug 2022 04:56:27 -0400 Received: from mail.ispras.ru (mail.ispras.ru [83.149.199.84]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ABA896AA3C; Fri, 5 Aug 2022 01:56:25 -0700 (PDT) Received: from [192.168.31.174] (unknown [95.31.173.239]) by mail.ispras.ru (Postfix) with ESMTPSA id BBDCD40737C5; Fri, 5 Aug 2022 08:56:19 +0000 (UTC) Message-ID: <18aa0617-0afe-2543-89cf-2f04c682ea88@ispras.ru> Date: Fri, 5 Aug 2022 11:56:18 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.12.0 Subject: Re: [PATCH] can: j1939: fix memory leak of skbs Content-Language: en-US To: Oleksij Rempel Cc: Robin van der Gracht , Oleksij Rempel , kernel@pengutronix.de, Oliver Hartkopp , Marc Kleine-Budde , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , linux-can@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, ldv-project@linuxtesting.org, Alexey Khoroshilov References: <20220708175949.539064-1-pchelkin@ispras.ru> <20220729042244.GC30201@pengutronix.de> From: Fedor Pchelkin In-Reply-To: <20220729042244.GC30201@pengutronix.de> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Oleksij, On 29.07.2022 07:22, Oleksij Rempel wrote: > Initial issue can be reproduced by using real (slow) CAN with j1939cat[1] > tool. Both parts should be started to make sure the j1939_session_tx_dat() will > actually start using the queue. After pushing about 100K of data, application > will try to close the socket and exit. After socket is closed, all skb related > to this socket will be freed and j1939_session_tx_dat() will use freed skbs. Ok, the patch I suggested was a kind of a guess, now I understand that it breaks important logic. On 29.07.2022 07:22, Oleksij Rempel wrote: > This skb_get() is counter part of skb_unref() > j1939_session_skb_drop_old(). However, we have a case [1] where j1939_session_skb_queue() is called but the corresponding j1939_session_skb_drop_old() is not called and it causes a memory leak. I tried to investigate it a little bit: the thing is that j1939_session_skb_drop_old() can be called only when processing J1939_ETP_CMD_CTS. On the contrary, as I can see, j1939_session_skb_queue() can be called independently from J1939_ETP_CMD_CTS so the two functions obviously do not correspond to each other. In reproducer case there is a J1939_ETP_CMD_RTS processing, then we send some messages (where j1939_session_skb_queue() is called) and after that J1939_ETP_CMD_ABORT is processed and we lose those skbs. [1] https://forge.ispras.ru/issues/11743