From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 29 Sep 2020 16:54:03 +0200
From: Michal Hocko
To: Daniel Vetter
Cc: "Paul E. McKenney", Juri Lelli, Peter Zijlstra,
	Sebastian Andrzej Siewior, Lai Jiangshan, dri-devel, Ben Segall,
	Linux-MM, "open list:KERNEL SELFTEST FRAMEWORK",
	linux-hexagon@vger.kernel.org, Will Deacon, Ingo Molnar,
	Anton Ivanov, linux-arch, Vincent Guittot, Herbert Xu, Brian Cain,
	Richard Weinberger, Russell King, Ard Biesheuvel, David Airlie,
	Ingo Molnar, Geert Uytterhoeven, Mel Gorman, intel-gfx, Matt Turner,
	Valentin Schneider, linux-xtensa@linux-xtensa.org, Shuah Khan,
	Jeff Dike, linux-um, Josh Triplett, Steven Rostedt,
	rcu@vger.kernel.org, linux-m68k, Ivan Kokshaysky, Rodrigo Vivi,
	Thomas Gleixner, Dietmar Eggemann, Linux ARM, Richard Henderson,
	Chris Zankel, Max Filippov, Daniel Bristot de Oliveira, LKML, alpha,
	Mathieu Desnoyers, Andrew Morton, Linus Torvalds
Subject: Re: [patch 00/13] preempt: Make preempt count unconditional
Message-ID: <20200929145403.GE2277@dhcp22.suse.cz>
References: <87bli75t7v.fsf@nanos.tec.linutronix.de>
	<20200916152956.GV29330@paulmck-ThinkPad-P72>
	<20200916205840.GD29330@paulmck-ThinkPad-P72>
	<20200929081938.GC22035@dhcp22.suse.cz>
	<20200929090003.GG438822@phenom.ffwll.local>
In-Reply-To: <20200929090003.GG438822@phenom.ffwll.local>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 29-09-20 11:00:03, Daniel Vetter wrote:
> On Tue, Sep 29, 2020 at 10:19:38AM +0200, Michal Hocko wrote:
> > On Wed 16-09-20 23:43:02, Daniel Vetter wrote:
> > > I can
> > > then figure out whether it's better to risk not spotting issues with
> > > call_rcu vs slapping a memalloc_noio_save/restore around all these
> > > critical sections which force-degrades any allocation to GFP_ATOMIC at
> >
> > did you mean memalloc_noreclaim_* here?
>
> Yeah I picked the wrong one of that family of functions.
>
> > > most, but has the risk that we run into code that assumes "GFP_KERNEL
> > > never fails for small stuff" and has a decidedly less tested fallback
> > > path than rcu code.
> >
> > Even then, please note that memalloc_noreclaim_* or
> > PF_MEMALLOC should be used with extreme care. Essentially only for
> > internal memory reclaimers. It grants access to _all_ the available
> > memory so any abuse can be detrimental to the overall system operation.
> > Allocation failure in this mode means that we are out of memory and any
> > code relying on such an allocation has to carefully consider failure.
> > This is not a random allocation mode.
>
> Agreed, that's why I don't like having this kind of automagic critical
> section. It's a bit of a shotgun approach. Paul said that the code would
> handle failures, but the problem is that it applies everywhere.

Ohh, in the ideal world we wouldn't need anything like that. But then
reality strikes:
* PF_MEMALLOC (resp. memalloc_noreclaim_* for that matter) is primarily
  used to make sure that allocations from inside the memory reclaim -
  yeah that happens - will not recurse.
* PF_MEMALLOC_NO{FS,IO} (resp. memalloc_no{fs,io}*) are used to mark
  critical sections in which reclaim must not recurse into fs/io.
  Controlling that for each allocation inside an fs transaction (or
  other sensitive) or IO context turned out to be unmaintainable, and
  people simply fell into using NOFS/NOIO unconditionally, which is
  causing reclaim imbalance problems.
* PF_MEMALLOC_NOCMA (resp. memalloc_nocma*) is used for long-term
  pinning, when CMA pages cannot be pinned because that would break the
  CMA guarantees. Communicating this to all potential allocations during
  pinning is simply infeasible.

So you are absolutely right that these critical sections with side
effects on all allocations are far from ideal from the API point of
view, but they mostly mirror a demand for functionality which is
_practically_ impossible to achieve with our current code base. Not that
we couldn't go back to the drawing board, come up with a saner thing and
rework the world...
-- 
Michal Hocko
SUSE Labs