From: "Verma, Shally"
Subject: Re: [PATCH 2/3] app/compress-perf: add performance measurement
Date: Tue, 6 Nov 2018 15:39:58 +0000
References: <1538400427-20164-1-git-send-email-tomaszx.jozwiak@intel.com> <1538400427-20164-3-git-send-email-tomaszx.jozwiak@intel.com>
To: "Jozwiak, TomaszX", "dev@dpdk.org", "Trahe, Fiona", "akhil.goyal@nxp.com"

>-----Original Message-----
>From: Jozwiak, TomaszX
>Sent: 06 November 2018 14:36
>To: Verma, Shally; dev@dpdk.org; Trahe, Fiona; akhil.goyal@nxp.com
>Subject: RE: [dpdk-dev] [PATCH 2/3] app/compress-perf: add performance measurement
>
>External Email
>
>> -----Original Message-----
>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>> Sent: Tuesday, November 6, 2018 9:16 AM
>> To: Jozwiak, TomaszX; dev@dpdk.org; Trahe,
>> Fiona; akhil.goyal@nxp.com
>> Subject: RE: [dpdk-dev] [PATCH 2/3] app/compress-perf: add performance
>> measurement
...
>> >> >> >+
>> >> >> >+	/* Window size */
>> >> >> >+	if (test_data->window_sz != -1) {
>> >> >> >+		if (param_range_check(test_data->window_sz,
>> >> >> >+				&cap->window_size)
>> >> >> What if cap->window_size is 0 i.e. implementation default?
>> >> >
>> >> >TJ: You probably mean cap->window_size.increment = 0 (because
>> >> >cap->window_size is a structure).
>> >> >In that case we check if
>> >> >test_data->window_sz >= min and test_data->window_sz <= max only,
>> >> >because increment = 0 means (base on compression API) we have only
>> >> >one value of windows_size (no range is supported).
>> >> But PMD can set min and max too 0 for such case.
>> >
>> >TJ: I can't see any issue in that case too. Maybe I don't understand what you
>> >mean but the logic is as follow:
>> >1) if you pass '--window-sz ...' param. into command line your
>> >intention is to force that value of window size during test. We check is this
>> >value is allow (by param_range_check() function).
>> >2) if you plan to use default value - just don't pass '--window-sz'
>> >param. in command line at all. In that case we get windows size from
>> >window_size.max field, so if window_size.min = window_size.max = 0
>> >test_data->window_sz will be zero, as well.
>> >If you mean that behavior is not good - I will be grateful for other
>> >suggestions.
>>
>> This is fine. but I am thinking of 3rd case here:
>> c) user pass window sz but PMD window_sz.min = max = 0, then user
>> requested windowsz is not applicable right?!
>
>In that case - true. There'll be fail :
>"Compress device does not support this window size\n");
>So what is your proposal for that case?
>
We can set the window size to the implementation default and add a diagnostic reporting the window size used for the test run. No need to fail here, I believe.

Thanks
Shally
>
>> >> >
>> >> ....
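The fallback proposed above could look roughly like the sketch below. It is only an illustration under stated assumptions: `struct log2_range` is a stand-in with the min/max/increment fields discussed in this thread (not the DPDK definition), `range_accepts()` and `select_window_sz()` are hypothetical helpers, and -1 means "no window size given on the command line", as in the quoted patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the capability range (assumption, not the DPDK type). */
struct log2_range {
	uint8_t min;
	uint8_t max;
	uint8_t increment;
};

/* Return 1 if value lies in [min, max] and, when increment is non-zero,
 * sits on an increment step counted from min; return 0 otherwise. */
static int
range_accepts(const struct log2_range *r, int value)
{
	if (value < r->min || value > r->max)
		return 0;
	if (r->increment == 0)
		return 1;
	return (value - r->min) % r->increment == 0;
}

/* Choose the window size for the run and report it.
 * requested == -1 means the user did not force a value.
 * Returns 0 and stores the choice in *used, or -1 if unsupported. */
static int
select_window_sz(const struct log2_range *cap, int requested, int *used)
{
	if (requested == -1) {
		/* Case 2 in the thread: take the default from the capability. */
		*used = cap->max;
	} else if (range_accepts(cap, requested)) {
		/* Case 1: the forced value is supported. */
		*used = requested;
	} else if (cap->min == 0 && cap->max == 0) {
		/* Case c: device exposes only its implementation default,
		 * so fall back instead of failing the run. */
		fprintf(stderr,
			"window size %d not applicable, using default\n",
			requested);
		*used = 0;
	} else {
		return -1;
	}

	printf("window size used for this run: %d\n", *used);
	return 0;
}
```

With this shape the run still fails when the device advertises a real range and the requested value falls outside it, which keeps the existing error for genuinely unsupported sizes.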
>> >>
>> >> >> >+
>> >> >> >+		if (fread(data, data_to_read, 1, f) != 1) {
>> >> >> >+			RTE_LOG(ERR, USER1, "Input file could not be read\n");
>> >> >> >+			goto err;
>> >> >> >+		}
>> >> >> >+		if (fseek(f, 0, SEEK_SET) != 0) {
>> >> >> >+			RTE_LOG(ERR, USER1,
>> >> >> >+				"Size of input could not be calculated\n");
>> >> >> >+			goto err;
>> >> >> >+		}
>> >> >> >+		remaining_data -= data_to_read;
>> >> >> >+		data += data_to_read;
>> >> >> It looks like it will run 2nd time only if input file size < input
>> >> >> data size in which case it will just keep filling input buffer
>> >> >> with repeated data.
>> >> >> Is that the intention here?
>> >> >
>> >> >TJ: Yes exactly. If test_data->input_data_sz is bigger than
>> >> >actual_file_sz then we fill the buffer with repeated data from file
>> >> >to fill whole buffer.
>> >> I mentioned in one of the earlier reply, wont that then influence the
>> >> compression behaviour and o/p? my suggestion was to work on actual
>> >> user provided input to take perf to get actual perf for given content.
>> >
>> >TJ: You right, but this solution is flexible. You can pass '
>> >--extended-input-sz" or not, so you can use original input data or extend it
>> >if you want.
>> Ok. but still not sure if it's really needed. Might be practically most of the time
>> it wont be exercised. No hard opinion on this though.
>>
>> Thanks
>> Shally
>> >
>> >> >
>> >> >> ...
>> >>
>> >> >> >+			if (data_addr == NULL) {
>> >> >> >+				RTE_LOG(ERR, USER1, "Could not append data\n");
>> >> >> Since a new buffer per segment is allocated, so is it possible for
>> >> >> append to fail? think, this check is redundant here.
>> >> >
>> >> >TJ: Yes, you're right, it should never fail. But I think it's good
>> >> >coding practice to add the check just in case.
>> >> >
>> >> Unless it is called in data path which might cost perf a bit.
>> >
>> >TJ: prepare_bufs() is out of perf measurement, so shouldn't impact to
>> >measurements. The performance measurement is inside
>> >main_loop() only.
>> >
>> >
>> >Br, Tomek
>> >
>> >>
>> >> Thanks
>> >> Shally
>> >>
>> >> >> >+				return -1;
>> >> >> >+			}
>> >> >> >+
>> >> >> >+			rte_memcpy(data_addr, input_data_ptr, data_sz);
>> >> >> >+			input_data_ptr += data_sz;
>> >> >> >+			remaining_data -= data_sz;
>> >> >> >+
>> >> >> >+			if (rte_pktmbuf_chain(test_data->decomp_bufs[i],
>> >> >> >+					next_seg) < 0) {
>> >> >> >+				RTE_LOG(ERR, USER1, "Could not chain mbufs\n");
>> >> >> >+				return -1;
>> >> >> >+			}
>> >> >> >+			segs_per_mbuf++;
>> >> >> >+		}
>> >> >> >+
>> >> >> >+		/* Allocate data in output mbuf */
>> >> >> >+		test_data->comp_bufs[i] =
>> >> >> >+			rte_pktmbuf_alloc(test_data->comp_buf_pool);
>> >> >> >+		if (test_data->comp_bufs[i] == NULL) {
>> >> >> >+			RTE_LOG(ERR, USER1, "Could not allocate mbuf\n");
>> >> >> >+			return -1;
>> >> >> >+		}
>> >> >> >+		data_addr = (uint8_t *) rte_pktmbuf_append(
>> >> >> >+					test_data->comp_bufs[i],
>> >> >> >+					test_data->seg_sz);
>> >> >> >+		if (data_addr == NULL) {
>> >> >> >+			RTE_LOG(ERR, USER1, "Could not append data\n");
>> >> >> >+			return -1;
>> >> >> >+		}
>> >> >> >+
>> >> >> >+		/* Chain mbufs if needed for output mbufs */
>> >> >> >+		for (j = 1; j < segs_per_mbuf; j++) {
>> >> >> >+			struct rte_mbuf *next_seg =
>> >> >> >+				rte_pktmbuf_alloc(test_data->comp_buf_pool);
>> >> >> >+
>> >> >> >+			if (next_seg == NULL) {
>> >> >> >+				RTE_LOG(ERR, USER1,
>> >> >> >+					"Could not allocate mbuf\n");
>> >> >> >+				return -1;
>> >> >> >+			}
>> >> >> >+
>> >> >> >+			data_addr = (uint8_t *)rte_pktmbuf_append(next_seg,
>> >> >> >+				test_data->seg_sz);
>> >> >> >+
>> >> >> >+			if (data_addr == NULL) {
>> >> >> >+				RTE_LOG(ERR, USER1, "Could not append data\n");
>> >> >> >+				return -1;
>> >> >> >+			}
>> >> >> >+
>> >> >> >+			if (rte_pktmbuf_chain(test_data->comp_bufs[i],
>> >> >> >+					next_seg)
>> >> >> >+					< 0) {
>> >> >> >+				RTE_LOG(ERR, USER1, "Could not chain mbufs\n");
>> >> >> >+				return -1;
>> >> >> >+			}
>> >> >> >+		}
>> >> >> >+	}
>> >> >> >+
>> >> >> >+	return 0;
>> >> >> >+}
>> >> >> >+
>> >> >> >+static void
>> >> >> >+free_bufs(struct comp_test_data *test_data) {
>> >> >> >+	uint32_t i;
>> >> >> >+
>> >> >> >+	for (i = 0; i < test_data->total_bufs; i++) {
>> >> >> >+		rte_pktmbuf_free(test_data->comp_bufs[i]);
>> >> >> >+		rte_pktmbuf_free(test_data->decomp_bufs[i]);
>> >> >> >+	}
>> >> >> >+	rte_free(test_data->comp_bufs);
>> >> >> >+	rte_free(test_data->decomp_bufs); }
>> >> >> >+
>> >> >> >+static int
>> >> >> >+main_loop(struct comp_test_data *test_data, uint8_t level,
>> >> >> >+			enum rte_comp_xform_type type,
>> >> >> >+			uint8_t *output_data_ptr,
>> >> >> >+			size_t *output_data_sz,
>> >> >> >+			unsigned int benchmarking) {
>> >> >> >+	uint8_t dev_id = test_data->cdev_id;
>> >> >> >+	uint32_t i, iter, num_iter;
>> >> >> >+	struct rte_comp_op **ops, **deq_ops;
>> >> >> >+	void *priv_xform = NULL;
>> >> >> >+	struct rte_comp_xform xform;
>> >> >> >+	size_t output_size = 0;
>> >> >> >+	struct rte_mbuf **input_bufs, **output_bufs;
>> >> >> >+	int res = 0;
>> >> >> >+	int allocated = 0;
>> >> >> >+
>> >> >> >+	if (test_data == NULL || !test_data->burst_sz) {
>> >> >> >+		RTE_LOG(ERR, USER1,
>> >> >> >+			"Unknow burst size\n");
>> >> >> >+		return -1;
>> >> >> >+	}
>> >> >> >+
>> >> >> >+	ops = rte_zmalloc_socket(NULL,
>> >> >> >+		2 * test_data->total_bufs * sizeof(struct rte_comp_op *),
>> >> >> >+		0, rte_socket_id());
>> >> >> >+
>> >> >> >+	if (ops == NULL) {
>> >> >> >+		RTE_LOG(ERR, USER1,
>> >> >> >+			"Can't allocate memory for ops strucures\n");
>> >> >> >+		return -1;
>> >> >> >+	}
>> >> >> >+
>> >> >> >+	deq_ops = &ops[test_data->total_bufs];
>> >> >> >+
>> >> >> >+	if (type == RTE_COMP_COMPRESS) {
>> >> >> >+		xform = (struct rte_comp_xform) {
>> >> >> >+			.type = RTE_COMP_COMPRESS,
>> >> >> >+			.compress = {
>> >> >> >+				.algo = RTE_COMP_ALGO_DEFLATE,
>> >> >> >+				.deflate.huffman = test_data->huffman_enc,
>> >> >> >+				.level = level,
>> >> >> >+				.window_size = test_data->window_sz,
>> >> >> >+				.chksum = RTE_COMP_CHECKSUM_NONE,
>> >> >> >+				.hash_algo = RTE_COMP_HASH_ALGO_NONE
>> >> >> >+			}
>> >> >> >+		};
>> >> >> >+		input_bufs = test_data->decomp_bufs;
>> >> >> >+		output_bufs = test_data->comp_bufs;
>> >> >> >+	} else {
>> >> >> >+		xform = (struct rte_comp_xform) {
>> >> >> >+			.type = RTE_COMP_DECOMPRESS,
>> >> >> >+			.decompress = {
>> >> >> >+				.algo = RTE_COMP_ALGO_DEFLATE,
>> >> >> >+				.chksum = RTE_COMP_CHECKSUM_NONE,
>> >> >> >+				.window_size = test_data->window_sz,
>> >> >> >+				.hash_algo = RTE_COMP_HASH_ALGO_NONE
>> >> >> >+			}
>> >> >> >+		};
>> >> >> >+		input_bufs = test_data->comp_bufs;
>> >> >> >+		output_bufs = test_data->decomp_bufs;
>> >> >> >+	}
>> >> >> >+
>> >> >> >+	/* Create private xform */
>> >> >> >+	if (rte_compressdev_private_xform_create(dev_id, &xform,
>> >> >> >+			&priv_xform) < 0) {
>> >> >> >+		RTE_LOG(ERR, USER1, "Private xform could not be created\n");
>> >> >> >+		res = -1;
>> >> >> >+		goto end;
>> >> >> >+	}
>> >> >> >+
>> >> >> >+	uint64_t tsc_start, tsc_end, tsc_duration;
>> >> >> >+
>> >> >> >+	tsc_start = tsc_end = tsc_duration = 0;
>> >> >> >+	if (benchmarking) {
>> >> >> >+		tsc_start = rte_rdtsc();
>> >> >> >+		num_iter = test_data->num_iter;
>> >> >> >+	} else
>> >> >> >+		num_iter = 1;
>> >> >> Looks like in same code we're doing benchmarking and functional validation.
>> >> >> It can be reorganised to keep validation test separately like done
>> >> >> in crypto_perf.
>> >> >
>> >> >TJ: Ok, makes sense. However in the interests of getting this into the
>> >> >18.11 release I'd like to defer this refactoring and the remainder
>> >> >of your comments below to the next release.
>> >> >
>> >> >
>> >> >Next comments - WIP
>> >> >
>> >> >
>> >> >Br, Tomek
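On the repeated-fill behaviour discussed earlier in this thread (input buffer larger than the input file), the idea can be shown outside DPDK with plain stdio. This is a minimal sketch, not the patch's code: `fill_repeating()` is a hypothetical helper, it rewinds before each read rather than after, and it rejects an empty file, which would otherwise never make progress.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fill buf[0..buf_sz) with the contents of f, starting over from the
 * beginning of the file whenever it is shorter than the space left.
 * file_sz is the file length in bytes.
 * Returns 0 on success, -1 on error. */
static int
fill_repeating(FILE *f, size_t file_sz, uint8_t *buf, size_t buf_sz)
{
	size_t remaining = buf_sz;

	if (file_sz == 0)
		return -1;	/* nothing to repeat */

	while (remaining > 0) {
		size_t chunk = remaining < file_sz ? remaining : file_sz;

		/* Rewind so every pass copies from the start of the file. */
		if (fseek(f, 0, SEEK_SET) != 0)
			return -1;
		if (fread(buf, 1, chunk, f) != chunk)
			return -1;

		buf += chunk;
		remaining -= chunk;
	}
	return 0;
}
```

For a 3-byte file containing "abc" and an 8-byte buffer, the loop runs three times and the last pass copies only the 2 bytes still needed. As noted in the thread, data extended this way is highly repetitive, so it can compress differently from the original input.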