Subject: Re: [PATCH v6 10/13] convert: generate large test files only once
From: Lars Schneider
Date: Tue, 30 Aug 2016 13:41:59 +0200
To: Junio C Hamano
Cc: Git Mailing List, Jeff King, sbeller@google.com, Johannes.Schindelin@gmx.de, jnareb@gmail.com, mlbright@gmail.com
References: <20160825110752.31581-1-larsxschneider@gmail.com> <20160825110752.31581-11-larsxschneider@gmail.com>
X-Mailing-List: git@vger.kernel.org

> On 29 Aug 2016, at 19:46, Junio C Hamano wrote:
>
> larsxschneider@gmail.com writes:
>
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
>> index 7b45136..34c8eb9 100755
>> --- a/t/t0021-conversion.sh
>> +++ b/t/t0021-conversion.sh
>> @@ -4,6 +4,15 @@ test_description='blob conversion via gitattributes'
>>
>>  . ./test-lib.sh
>>
>> +if test_have_prereq EXPENSIVE
>> +then
>> +	T0021_LARGE_FILE_SIZE=2048
>> +	T0021_LARGISH_FILE_SIZE=100
>> +else
>> +	T0021_LARGE_FILE_SIZE=30
>> +	T0021_LARGISH_FILE_SIZE=2
>> +fi
>
> Minor: do we need T0021_ prefix?
> What are you trying to avoid
> collisions with?

Not necessary. I'll remove the prefix.

>> +	git checkout -- test test.t test.i &&
>> +
>> +	mkdir generated-test-data &&
>> +	for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
>> +	do
>> +		RANDOM_STRING="$(test-genrandom end $i | tr -dc "A-Za-z0-9" )"
>> +		ROT_RANDOM_STRING="$(echo $RANDOM_STRING | ./rot13.sh )"
>
> In earlier iteration of loop with lower $i, what guarantees that
> some bytes survive "tr -dc"?

Nothing really, good catch! The seed "end" always produces an "S" as its
first character, which would survive "tr -dc". However, that is clunky.
I will always set "1" as the first character of $RANDOM_STRING to
mitigate the problem.

>> +		# Generate 1MB of empty data and 100 bytes of random characters
>
> 100 bytes?  It seems to me that you are giving 1MB and then $i-byte
> or less (which sometimes can be zero) of random string.

Outdated comment. Will fix!

>> +		# printf "$(test-genrandom start $i)"
>> +		printf "%1048576d" 1 >>generated-test-data/large.file &&
>> +		printf "$RANDOM_STRING" >>generated-test-data/large.file &&
>> +		printf "%1048576d" 1 >>generated-test-data/large.file.rot13 &&
>> +		printf "$ROT_RANDOM_STRING" >>generated-test-data/large.file.rot13 &&
>> +
>> +		if test $i = $T0021_LARGISH_FILE_SIZE
>> +		then
>> +			cat generated-test-data/large.file >generated-test-data/largish.file &&
>> +			cat generated-test-data/large.file.rot13 >generated-test-data/largish.file.rot13
>> +		fi
>> +	done
>
> This "now we are done with the loop, so copy them to the second
> pair" needs to be in the loop?  Shouldn't it come after 'done'?

No, it does not need to be in the loop. I think I could do this
after the loop instead:

head -c $((1048576*$T0021_LARGISH_FILE_SIZE)) generated-test-data/large.file >generated-test-data/largish.file

> I do not quite get the point of this complexity.
> You are using exactly the same seed "end" every time, so in the
> first round you have 1M of SP, letter '1', letter 'S' (from the
> genrandom), then in the second round you have 1M of SP, letter '1',
> letter 'S' and letter 'p' (the last two from the genrandom), and go
> on.  Is it significant for the purpose of your test that the cruft
> inserted between the repetition of 1M of SP gets longer by one byte
> but they all share the same prefix (e.g. "1S", "1Sp", "1SpZ",
> "1SpZT", ... are what you insert between a large run of spaces)?

The pktline packets have a constant size. If the cruft between each 1M
of SP had a constant size as well, then the generated packets for the
test data would repeat themselves. That's why I increased the length
after every 1M of SP.

However, I realized that this test complexity is not necessary. I'll
simplify it in the next round.

Thanks,
Lars
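
For illustration, the "copy after the loop" variant discussed above
could look roughly like the sketch below. It is not the code from the
patch: the unprefixed variable names, the small sizes, and the growing
"1x..." string that stands in for the test-genrandom output are all
assumptions, chosen so that the script runs outside the Git test suite.

```shell
#!/bin/sh
# Sketch only. Sizes, variable names, and the stand-in cruft string are
# assumptions for illustration; the patch uses test-genrandom and test_seq.
set -e

LARGE_FILE_SIZE=3      # hypothetical: number of 1MB blocks in large.file
LARGISH_FILE_SIZE=2    # hypothetical: number of 1MB blocks kept in largish.file
BLOCK=1048576          # 1MB, as in the quoted patch

dir=generated-test-data
rm -rf "$dir"
mkdir "$dir"

i=1
cruft=1                # always starts with "1", so it is never empty
while test "$i" -le "$LARGE_FILE_SIZE"
do
	# 1MB of padding (spaces followed by the digit 1) ...
	printf "%${BLOCK}d" 1 >>"$dir/large.file"
	# ... then cruft that grows by one byte per round, so that
	# fixed-size packets cut from the file do not repeat.
	printf "%s" "$cruft" >>"$dir/large.file"
	cruft="${cruft}x"
	i=$((i + 1))
done

# Derive the smaller file once, after the loop, as a prefix of the large one.
head -c $((BLOCK * LARGISH_FILE_SIZE)) "$dir/large.file" >"$dir/largish.file"
```

Note that a prefix cut at a multiple of 1MB is not byte-for-byte the
same as the file the in-loop copy would have produced, because the
cruft bytes shift the block boundaries. Also, "head -c" is widely
available but not specified by POSIX, so a portable test would
probably need a different way to copy a byte range.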