From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0DCAC43381 for ; Thu, 7 Mar 2019 15:57:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B955A20840 for ; Thu, 7 Mar 2019 15:57:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726341AbfCGP5B (ORCPT ); Thu, 7 Mar 2019 10:57:01 -0500 Received: from mga03.intel.com ([134.134.136.65]:36762 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726185AbfCGP5B (ORCPT ); Thu, 7 Mar 2019 10:57:01 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Mar 2019 07:57:00 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.58,451,1544515200"; d="scan'208";a="326324259" Received: from linux.intel.com ([10.54.29.200]) by fmsmga005.fm.intel.com with ESMTP; 07 Mar 2019 07:56:59 -0800 Received: from [10.251.89.100] (abudanko-mobl.ccr.corp.intel.com [10.251.89.100]) by linux.intel.com (Postfix) with ESMTP id 2BE1658073B; Thu, 7 Mar 2019 07:56:56 -0800 (PST) Subject: Re: [PATCH v5 07/10] perf record: implement -z,--compression_level=n option and compression From: Alexey Budankov To: Jiri Olsa Cc: Arnaldo Carvalho de Melo , Namhyung Kim , Alexander Shishkin , Peter Zijlstra , Ingo Molnar , Andi Kleen , linux-kernel References: <4d1b11a4-77ed-d9af-ed22-875fc17b6050@linux.intel.com> <87fa1906-2d6a-a00a-7ba5-b570d0cbf9cc@linux.intel.com> <20190305122534.GB16615@krava> <20190307121429.GB29474@krava> <002e7e10-b0ef-df2a-261c-88fd9c00364d@linux.intel.com> Organization: Intel Corp. Message-ID: <39de5b2e-8f38-8fd6-2ed6-2dbb34d00790@linux.intel.com> Date: Thu, 7 Mar 2019 18:56:55 +0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1 MIME-Version: 1.0 In-Reply-To: <002e7e10-b0ef-df2a-261c-88fd9c00364d@linux.intel.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 07.03.2019 18:26, Alexey Budankov wrote: > > On 07.03.2019 15:14, Jiri Olsa wrote: >> On Thu, Mar 07, 2019 at 11:39:46AM +0300, Alexey Budankov wrote: >>> >>> On 05.03.2019 15:25, Jiri Olsa wrote: >>>> On Fri, Mar 01, 2019 at 06:58:32PM +0300, Alexey Budankov wrote: >>>> >>>> SNIP >>>> >>>>> >>>>> /* >>>>> * Increment md->refcount to guard md->data[idx] buffer >>>>> @@ -350,7 +357,7 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx, >>>>> md->prev = head; >>>>> perf_mmap__consume(md); >>>>> >>>>> - rc = push(to, &md->aio.cblocks[idx], md->aio.data[idx], size0 + size, *off); >>>>> + rc = push(to, md->aio.data[idx], size0 + size, *off, &md->aio.cblocks[idx]); >>>>> if (!rc) { >>>>> *off += size0 + size; >>>>> } else { >>>>> @@ -556,13 +563,15 @@ int perf_mmap__read_init(struct perf_mmap *map) >>>>> } >>>>> >>>>> int perf_mmap__push(struct perf_mmap *md, void *to, >>>>> - int push(struct perf_mmap *map, void *to, void *buf, size_t size)) >>>>> + int push(struct perf_mmap *map, void *to, void *buf, size_t size), >>>>> + perf_mmap__compress_fn_t compress, void *comp_data) >>>>> { >>>>> u64 head = perf_mmap__read_head(md); >>>>> unsigned char *data = md->base + page_size; >>>>> unsigned long size; >>>>> void *buf; >>>>> int rc = 0; >>>>> + size_t mmap_len = perf_mmap__mmap_len(md); >>>>> >>>>> rc = perf_mmap__read_init(md); >>>>> if (rc < 0) >>>>> @@ -574,7 +583,10 @@ int perf_mmap__push(struct perf_mmap *md, void *to, >>>>> buf = &data[md->start & md->mask]; >>>>> size = md->mask + 1 - (md->start & md->mask); >>>>> md->start += size; >>>>> - >>>>> + if (compress) { >>>>> + size = compress(comp_data, md->data, mmap_len, buf, size); >>>>> + buf = md->data; >>>>> + } >>>>> if (push(md, to, buf, size) < 0) { >>>>> rc = -1; >>>>> goto out; >>>> >>>> when we discussed the compress callback should be another layer >>>> in perf_mmap__push I was thinking more of the layered/fifo design, >>>> like: >>>> >>>> normaly we call: >>>> >>>> perf_mmap__push(... push = record__pushfn ...) >>>> -> reads mmap data and calls push(data), which translates as: >>>> >>>> record__pushfn(data); >>>> - which stores the data >>>> >>>> >>>> for compressed it'd be: >>>> >>>> perf_mmap__push(... push = compressed_push ...) >>>> >>>> -> reads mmap data and calls push(data), which translates as: >>>> >>>> compressed_push(data) >>>> -> reads data, compresses them and calls, next push callback in line: >>>> >>>> record__pushfn(data) >>>> - which stores the data >>>> >>>> >>>> there'd need to be the logic for compressed_push to >>>> remember the 'next push' function >>> >>> That is suboptimal for AIO. Also compression is an independent operation that >>> could be applied on any of push stages you mean. >> >> not sure what you mean by suboptimal, but I think >> that it can still happen in subsequent push callback >> >>> >>>> >>>> but I think this was the original idea behind the >>>> perf_mmap__push -> it gets the data and pushes them for >>>> the next processing.. it should stay as simple as that >>> >>> Agree on keeping simplicity and, at the moment, there is no any push to the next >>> processing in the code so provided implementation fits as for serial as for AIO >>> at the same time sticking to simplicity as much as possibly. If you see something >>> that would fit better please speak up and share. >> >> I have to insist that perf_mmap__push stays untouched >> and we do other processing in the push callbacks > > What is about perf_mmap__aio_push()? > > Without compression it does > memcpy(), memcpy(), aio_push() > > With compression its does > memcpy_with_compression(), memcpy_with_compression(), aio_push() > > and deviation that increases amount of copy operations i.e. implementing three or more > is suboptimal in terms of runtime overhead and data loss decrease > > Compression for serial streaming can be implemented in push() callback. > AIO case would go with compression over a parameter in aio_push(). I meant compress_fn and comp_data parameters _for_ aio_push(). > So the both trace writing schemas could be optimally extended. > > ~Alexey > >> >> jirka >> >