Re: [PATCH] delayacct: track delays from ksm cow

From: David Hildenbrand <david@redhat.com>
To: CGEL <cgel.zte@gmail.com>
Cc: bsingharora@gmail.com, akpm@linux-foundation.org,
	yang.yang29@zte.com.cn, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] delayacct: track delays from ksm cow
Date: Fri, 18 Mar 2022 09:24:44 +0100
Message-ID: <2bb1c357-5335-9d96-d862-bd51c1014193@redhat.com>
In-Reply-To: <6233e342.1c69fb81.692f.6286@mx.google.com>

On 18.03.22 02:41, CGEL wrote:
> On Thu, Mar 17, 2022 at 11:05:22AM +0100, David Hildenbrand wrote:
>> On 17.03.22 10:48, CGEL wrote:
>>> On Thu, Mar 17, 2022 at 09:17:13AM +0100, David Hildenbrand wrote:
>>>> On 17.03.22 03:03, CGEL wrote:
>>>>> On Wed, Mar 16, 2022 at 03:56:23PM +0100, David Hildenbrand wrote:
>>>>>> On 16.03.22 14:34, cgel.zte@gmail.com wrote:
>>>>>>> From: Yang Yang <yang.yang29@zte.com.cn>
>>>>>>>
>>>>>>> Delay accounting does not track the delay of ksm cow.  When tasks
>>>>>>> have many ksm pages, they may spend a significant amount of time
>>>>>>> waiting for ksm cow.
>>>>>>>
>>>>>>> To quantify the impact of ksm cow on tasks, measure the delay when
>>>>>>> ksm cow happens. This could help users to decide whether to use ksm
>>>>>>> or not.
>>>>>>>
>>>>>>> Also update tools/accounting/getdelays.c:
>>>>>>>
>>>>>>>     / # ./getdelays -dl -p 231
>>>>>>>     print delayacct stats ON
>>>>>>>     listen forever
>>>>>>>     PID     231
>>>>>>>
>>>>>>>     CPU             count     real total  virtual total    delay total  delay average
>>>>>>>                      6247     1859000000     2154070021     1674255063          0.268ms
>>>>>>>     IO              count    delay total  delay average
>>>>>>>                         0              0              0ms
>>>>>>>     SWAP            count    delay total  delay average
>>>>>>>                         0              0              0ms
>>>>>>>     RECLAIM         count    delay total  delay average
>>>>>>>                         0              0              0ms
>>>>>>>     THRASHING       count    delay total  delay average
>>>>>>>                         0              0              0ms
>>>>>>>     KSM             count    delay total  delay average
>>>>>>>                      3635      271567604              0ms
>>>>>>>
>>>>>>
>>>>>> TBH I'm not sure how particularly helpful this is and if we want this.
>>>>>>
>>>>> Thanks for replying.
>>>>>
>>>>> Users may use ksm by calling madvise(, , MADV_MERGEABLE) when they want
>>>>> to save memory; the tradeoff is the delay suffered on ksm cow. Users can
>>>>> find out how much memory ksm saved by reading
>>>>> /sys/kernel/mm/ksm/pages_sharing, but they don't know the cost of the
>>>>> ksm cow delay, and this is important for some delay-sensitive tasks. If
>>>>> users know both the saved memory and the ksm cow delay, they can make
>>>>> better use of madvise(, , MADV_MERGEABLE).
>>>>
>>>> But that happens after the effects, no?
>>>>
>>>> IOW a user already called madvise(, , MADV_MERGEABLE) and then gets the
>>>> results.
>>>>
>>> Imagine users are developing or porting their applications on an
>>> experimental machine; they could take those benchmarks as feedback to
>>> decide whether to use madvise(, , MADV_MERGEABLE), or to adjust its range.
>>
>> And why can't they run it with and without and observe performance using
>> existing metrics (or even application-specific metrics?)?
>>
>>
> I think the reason why we need this patch is just like why we need
> swap, reclaim, and thrashing getdelay information. When a system is complex,
> it's hard to tell precisely which kernel activity impacts the observed
> performance or application-specific metrics: preemption? cgroup throttling?
> swap? reclaim? IO?
> 
> So if we could get the factor's precise impact data, tuning the factor
> (for this patch it's ksm) becomes more efficient.
> 

I'm not convinced that we want to make our write-fault handler more
complicated for such a corner case with an unclear, eventual use case.
IIRC, whenever using KSM you're already agreeing to eventually pay a
performance price, and the price heavily depends on other factors in the
system. Simply looking at the number of write-faults might already give
an indication of what changed with KSM being enabled.
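
As a rough illustration of the write-fault comparison mentioned above, here
is a minimal user-space sketch that counts the minor page faults a workload
triggers, using getrusage(). It is not part of the patch or of getdelays.c.
The helper names (minor_faults, faults_during, touch_pages) are hypothetical,
and the counter covers all minor faults, not only KSM COW faults, so it would
only be meaningful when comparing runs with and without MADV_MERGEABLE.

```python
import resource

PAGE = resource.getpagesize()


def minor_faults():
    """Minor (no-I/O) page faults taken by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def faults_during(workload):
    """Run workload() and return how many minor faults it caused."""
    before = minor_faults()
    workload()
    return minor_faults() - before


def touch_pages(n_pages=256):
    """Hypothetical workload: write one byte to each page of a fresh buffer."""
    buf = bytearray(n_pages * PAGE)
    for off in range(0, len(buf), PAGE):
        buf[off] = 1


if __name__ == "__main__":
    # Run once with KSM enabled for the region and once without, then compare.
    print("minor faults:", faults_during(touch_pages))
```

The same counters are available per process as minflt in /proc/<pid>/stat,
so an unmodified application could be observed from the outside as well.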

That said, I'd like to hear other opinions.

-- 
Thanks,

David / dhildenb