On Thu, Apr 21, 2022 at 8:06 AM Borislav Petkov wrote:
>
> on AMD zen3
>
> original:  20.11 Gb/s
> rep_good:  34.662 Gb/s
> erms:      36.378 Gb/s
> fsrm:      36.398 Gb/s

Looks good.

Of course, the interesting cases are the "took a page fault in the
middle" ones.

A very simple basic test is something like the attached.

It does no error checking or anything else, but doing a 'strace
./a.out' should give you something like

    ...
    openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
    mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f10ddfd0000
    munmap(0x7f10ddfe0000, 65536) = 0
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 16
    exit_group(16) = ?

where that "read(..) = 16" is the important part. It correctly figured
out that it can only do 16 bytes (ok, 17, but we've always allowed the
user accessor functions to block).

With erms/fsrm, presumably you get that optimal "read(..) = 17".

I'm sure we have a test-case for this somewhere, but it was easier for
me to write a few lines of (bad) code than try to find it.

               Linus