Hi,

I ran it on my benchmark (https://github.com/foxhlchen/sysfs_benchmark).

machine: aws c5 (Intel Xeon with 96 logical cores)
kernel: v5.12
benchmark: create 96 threads, bind each one to its own core, then have
all of them run open+read+close on a sysfs file simultaneously, 1000 times.
result: Without the patchset, an open+read+close operation takes
550-570 us, and perf shows significant time (>40%) spent in mutex_lock.
After applying it, the same operation takes 410-440 us, and perf shows
only ~4% of the time in mutex_lock.

Strangely, I don't see a huge performance boost compared to v2, even
though the perf report no longer shows a mutex problem.

I've put the console outputs and perf reports in the attachment for your
reference.

thanks,
fox
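
For readers without the repository at hand, the benchmark loop described
above can be approximated by the sketch below. This is not the code from
sysfs_benchmark; it is a minimal Python illustration of the same idea
(one pinned thread per core, each timing repeated open+read+close on one
file). The names run_benchmark and _worker, the 4096-byte read size, and
the start barrier are assumptions made for the example.

```python
import os
import threading
import time


def _worker(index, cpu, path, iterations, barrier, results):
    # Pin the calling thread to one CPU where the platform allows it
    # (os.sched_setaffinity is only available on some Unix systems).
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {cpu})
        except OSError:
            pass  # e.g. restricted cpuset; run unpinned instead
    barrier.wait()  # release all threads at roughly the same moment
    start = time.perf_counter()
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)
        try:
            os.read(fd, 4096)
        finally:
            os.close(fd)
    # Mean latency of one open+read+close, in microseconds.
    results[index] = (time.perf_counter() - start) / iterations * 1e6


def run_benchmark(path, nthreads=None, iterations=1000):
    """Return the per-thread mean open+read+close latency in microseconds."""
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = list(range(os.cpu_count() or 1))
    if nthreads is None:
        nthreads = len(cpus)  # one thread per usable CPU
    barrier = threading.Barrier(nthreads)
    results = [0.0] * nthreads
    threads = [
        threading.Thread(
            target=_worker,
            args=(i, cpus[i % len(cpus)], path, iterations, barrier, results),
        )
        for i in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

It could be called as run_benchmark("/sys/devices/system/cpu/online"),
where that path is just an arbitrary sysfs file picked for illustration.
Interpreter overhead means the absolute numbers will not be comparable
to the 550-570 us and 410-440 us figures above; the sketch only shows
the shape of the workload.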