From mboxrd@z Thu Jan 1 00:00:00 1970
From: Binoy Jayan
Subject: Re: [RFC PATCH v4] IV Generation algorithms for dm-crypt
Date: Mon, 20 Mar 2017 20:08:19 +0530
Message-ID:
References: <1486463731-6224-1-git-send-email-binoy.jayan@linaro.org>
 <68f70534-a309-46ba-a84d-8acc1e6620e5@gmail.com>
 <2aef6e54-805f-e09b-ae66-c198f8c05335@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Milan Broz, Oded, Ofir, Herbert Xu, "David S. Miller",
 linux-crypto@vger.kernel.org, Mark Brown, Arnd Bergmann,
 Linux kernel mailing list, Alasdair Kergon, Mike Snitzer,
 dm-devel@redhat.com, Shaohua Li, linux-raid@vger.kernel.org,
 Rajendra, Ondrej Mosnacek
To: Gilad Ben-Yossef
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On 6 March 2017 at 20:08, Gilad Ben-Yossef wrote:
>
> I gave it a spin on an x86_64 with 8 CPUs with AES-NI using cryptd and
> on Arm using a CryptoCell hardware accelerator.
>
> There was no difference in performance between 512 and 4096 byte
> cluster sizes on the x86_64 (800 MB loop file system).
>
> There was an improvement in latency of 3.2% between 512 and 4096 byte
> cluster sizes on the Arm. I expect the performance benefits of this
> test for Binoy's patch to be the same.
>
> In both cases the very naive test was a simple dd with a block size of
> 4096 bytes on the raw block device.
>
> I do not know what effect having a bigger cluster size would have on
> other more complex file system operations.
> Is there any specific benchmark worth testing with?

The multiple instances issue in /proc/crypto is fixed. It was caused by
the IV code itself inadvertently modifying the algorithm name in the
global crypto algorithm lookup table when splitting "plain(cbc(aes))"
into "plain" and "cbc(aes)" in order to invoke the child algorithm
(a rough sketch of the local-copy parsing I mean is at the end of this
mail).

I ran a few tests with dd, bonnie and FIO under QEMU (x86), using the
automated script [1] that I wrote to make the testing easy. The tests
were done on software implementations of the algorithms, as the real
hardware was not available to me.

According to the tests, sequential reads and writes show a good
improvement (5.7%) in data rate with the proposed solution, while
random reads show very little improvement. When tested with FIO, random
writes also show a small improvement (2.2%), but random reads show a
slight deterioration in performance (4%). When tested on Arm hardware,
only the sequential writes with bonnie show an improvement (5.6%); all
other tests show degraded performance in the absence of crypto
hardware.

[1] https://github.com/binoyjayan/utilities/blob/master/utils/dmtest
    Dependencies: dd [full version], bonnie, fio

Thanks,
Binoy
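P.S. For reference, a minimal sketch of the local-copy parsing I mean,
not the actual patch code; the helper name, its calling convention and
the caller-provided buffer are made up for illustration only:

/*
 * Sketch only: copy the inner algorithm name (e.g. "cbc(aes)") out of a
 * geniv template name such as "plain(cbc(aes))" into a caller-provided
 * buffer, instead of writing into the string that lives in the global
 * crypto algorithm lookup table.
 */
#include <linux/errno.h>
#include <linux/string.h>
#include <linux/types.h>

static int geniv_get_inner_name(const char *geniv_name,
                                char *inner, size_t inner_len)
{
        const char *start = strchr(geniv_name, '(');
        const char *end = strrchr(geniv_name, ')');
        size_t len;

        if (!start || !end || end <= start + 1)
                return -EINVAL;

        len = end - start - 1;          /* length of "cbc(aes)" */
        if (len >= inner_len)
                return -ENAMETOOLONG;

        memcpy(inner, start + 1, len);  /* original name stays untouched */
        inner[len] = '\0';
        return 0;
}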