* [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-09 14:33 kernel test robot
From: kernel test robot @ 2016-08-09 14:33 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Dave Chinner, Bob Peterson, LKML, Linus Torvalds, lkp
[-- Attachment #1: Type: text/plain, Size: 8636 bytes --]
FYI, we noticed a -13.6% regression of aim7.jobs-per-min due to commit:
commit 68a9f5e7007c1afa2cf6830b690a90d0187c0684 ("xfs: implement iomap based buffered write path")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: aim7
on test machine: 48 threads Ivytown Ivy Bridge-EP with 64G memory
with following parameters:
disk: 1BRD_48G
fs: xfs
test: disk_wrt
load: 3000
cpufreq_governor: performance
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74 ("xfs: reorder zeroing and flushing sequence in truncate")
68a9f5e700 ("xfs: implement iomap based buffered write path")
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69
---------------- --------------------------
%stddev %change %stddev
\ | \
486586 ± 0% -13.6% 420342 ± 0% aim7.jobs-per-min
37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time
37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time.max
6424 ± 1% +31.3% 8432 ± 1% aim7.time.involuntary_context_switches
151288 ± 0% +2.8% 155579 ± 0% aim7.time.minor_page_faults
376.31 ± 0% +28.5% 483.48 ± 0% aim7.time.system_time
429058 ± 0% -20.0% 343371 ± 0% aim7.time.voluntary_context_switches
16014 ± 0% +28.8% 20628 ± 1% meminfo.Active(file)
127154 ± 9% -14.4% 108893 ± 11% softirqs.SCHED
14084 ± 18% -33.1% 9421 ± 17% numa-numastat.node1.numa_foreign
15461 ± 17% -31.4% 10598 ± 13% numa-numastat.node1.numa_miss
24561 ± 0% -27.2% 17873 ± 1% vmstat.system.cs
47289 ± 0% +1.2% 47866 ± 0% vmstat.system.in
7868 ± 1% +27.3% 10013 ± 6% numa-meminfo.node0.Active(file)
8148 ± 1% +29.5% 10554 ± 7% numa-meminfo.node1.Active(file)
81041 ± 3% +30.0% 105374 ± 24% numa-meminfo.node1.Slab
1966 ± 1% +30.1% 2558 ± 4% numa-vmstat.node0.nr_active_file
4204 ± 3% +17.1% 4921 ± 8% numa-vmstat.node0.nr_alloc_batch
2037 ± 1% +26.6% 2579 ± 5% numa-vmstat.node1.nr_active_file
4003 ± 0% +28.1% 5129 ± 1% proc-vmstat.nr_active_file
979.25 ± 0% +63.7% 1602 ± 1% proc-vmstat.pgactivate
4699 ± 3% +162.6% 12340 ± 73% proc-vmstat.pgpgout
50.23 ± 19% -27.3% 36.50 ± 17% sched_debug.cpu.cpu_load[1].avg
466.50 ± 29% -51.8% 225.00 ± 73% sched_debug.cpu.cpu_load[1].max
77.78 ± 33% -50.6% 38.40 ± 57% sched_debug.cpu.cpu_load[1].stddev
300.50 ± 33% -52.9% 141.50 ± 48% sched_debug.cpu.cpu_load[2].max
1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.active_objs
1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.num_objs
431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.active_objs
431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.num_objs
24.26 ± 0% +8.7% 26.36 ± 0% turbostat.%Busy
686.75 ± 0% +9.1% 749.25 ± 0% turbostat.Avg_MHz
0.29 ± 1% -24.3% 0.22 ± 1% turbostat.CPU%c3
91.39 ± 2% +3.6% 94.71 ± 0% turbostat.CorWatt
121.88 ± 1% +2.8% 125.23 ± 0% turbostat.PkgWatt
53643508 ± 0% -19.6% 43119128 ± 2% cpuidle.C1-IVT.time
318952 ± 0% -25.7% 237018 ± 0% cpuidle.C1-IVT.usage
3471235 ± 2% -16.9% 2886121 ± 2% cpuidle.C1E-IVT.time
46642 ± 1% -22.4% 36214 ± 0% cpuidle.C1E-IVT.usage
12601665 ± 1% -21.8% 9854467 ± 1% cpuidle.C3-IVT.time
79872 ± 1% -19.6% 64244 ± 1% cpuidle.C3-IVT.usage
1.292e+09 ± 0% +13.7% 1.47e+09 ± 0% cpuidle.C6-IVT.time
5131 ±121% -100.0% 0.00 ± -1% latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
5131 ±121% -100.0% 0.00 ± -1% latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
9739 ± 99% -99.0% 95.50 ± 10% latency_stats.max.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
7739 ± 81% -72.1% 2162 ± 52% latency_stats.max.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
5131 ±121% -100.0% 0.00 ± -1% latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
10459 ± 97% -97.5% 262.75 ± 5% latency_stats.sum.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
9097 ± 81% -72.5% 2505 ± 45% latency_stats.sum.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
2.59e+11 ± 6% +24.1% 3.213e+11 ± 4% perf-stat.branch-instructions
0.41 ± 2% -9.5% 0.38 ± 1% perf-stat.branch-miss-rate
1.072e+09 ± 4% +12.5% 1.206e+09 ± 3% perf-stat.branch-misses
972882 ± 0% -17.4% 803990 ± 0% perf-stat.context-switches
1.472e+12 ± 6% +22.4% 1.801e+12 ± 5% perf-stat.cpu-cycles
100350 ± 1% -5.1% 95219 ± 1% perf-stat.cpu-migrations
7.315e+08 ± 24% +60.4% 1.174e+09 ± 37% perf-stat.dTLB-load-misses
3.225e+11 ± 5% +36.4% 4.398e+11 ± 2% perf-stat.dTLB-loads
2.176e+11 ± 9% +44.6% 3.147e+11 ± 6% perf-stat.dTLB-stores
1.452e+12 ± 6% +29.5% 1.879e+12 ± 4% perf-stat.instructions
42168 ± 16% +27.5% 53751 ± 6% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.7% 1.04 ± 0% perf-stat.ipc
252401 ± 0% +6.6% 269148 ± 0% perf-stat.minor-faults
10.16 ± 3% +13.0% 11.48 ± 3% perf-stat.node-store-miss-rate
24842185 ± 2% +11.9% 27804764 ± 1% perf-stat.node-store-misses
252321 ± 0% +6.6% 268999 ± 0% perf-stat.page-faults
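As a sanity check, the headline percentage changes in the table above can be recomputed directly from the two value columns (parent commit f0c6bcba74 vs. 68a9f5e700). A minimal sketch, using the aim7.jobs-per-min and elapsed_time rows as examples:

```python
def pct_change(parent, child):
    """Percent change from parent to child, as reported by lkp."""
    return (child - parent) / parent * 100.0

# Values taken from the comparison table in this report.
jobs_per_min = pct_change(486586, 420342)   # aim7.jobs-per-min
elapsed_time = pct_change(37.23, 43.04)     # aim7.time.elapsed_time

print(f"aim7.jobs-per-min:        {jobs_per_min:+.1f}%")  # -13.6%
print(f"aim7.time.elapsed_time:   {elapsed_time:+.1f}%")  # +15.6%
```

This matches the -13.6% regression in throughput and the corresponding +15.6% increase in elapsed time reported above.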
aim7.jobs-per-min
540000 ++-----------------------------------------------------------------+
520000 **.* *.**. .**.* |
| *.**.**.* ** *.**.**.**.**.* |
500000 ++ : |
480000 ++ *.**.**.**.**.**.**.**.**.*|
| |
460000 ++ |
440000 ++ |
420000 ++ O OO OO OO OO OO OO
|O O O OO O O O O O |
400000 O+ OO O OO O O O OO OO OO O O OO |
380000 ++ |
| |
360000 ++ O OO O |
340000 ++-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong
[-- Attachment #2: config-4.7.0-rc1-00007-g68a9f5e --]
[-- Type: text/plain, Size: 151225 bytes --]
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.7.0-rc1 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y
#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
# CONFIG_CONTEXT_TRACKING_FORCE is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_NONE is not set
# CONFIG_RCU_NOCB_CPU_ZERO is not set
CONFIG_RCU_NOCB_CPU_ALL=y
# CONFIG_RCU_EXPEDITE_BOOT is not set
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=19
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_NMI_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_SWAP_ENABLED=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
CONFIG_RT_GROUP_SCHED=y
# CONFIG_CGROUP_PIDS is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
# CONFIG_CGROUP_CPUACCT is not set
CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_PRINTK=y
CONFIG_PRINTK_NMI=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
# CONFIG_USERFAULTFD is not set
CONFIG_PCI_QUIRKS=y
CONFIG_MEMBARRIER=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_SYSTEM_DATA_VERIFICATION is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_KEXEC_CORE=y
CONFIG_OPROFILE=m
CONFIG_OPROFILE_EVENT_MULTIPLEX=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_HAVE_STACK_VALIDATION=y
# CONFIG_HAVE_ARCH_HASH is not set
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
# CONFIG_CPU_NO_EFFICIENT_FFS is not set
#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_CMDLINE_PARSER is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
CONFIG_BLOCK_COMPAT=y
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_DEFAULT_DEADLINE=y
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_FREEZER=y
#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_FAST_FEATURE_TESTS=y
CONFIG_X86_X2APIC=y
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_NUMACHIP is not set
# CONFIG_X86_VSMP is not set
CONFIG_X86_UV=y
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_MID is not set
CONFIG_X86_INTEL_LPSS=y
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=y
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_QUEUED_LOCK_STAT is not set
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_512GB=y
CONFIG_XEN_SAVE_RESTORE=y
# CONFIG_XEN_DEBUG_FS is not set
# CONFIG_XEN_PVH is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
CONFIG_PARAVIRT_TIME_ACCOUNTING=y
CONFIG_PARAVIRT_CLOCK=y
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_MAXSMP=y
CONFIG_NR_CPUS=8192
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=m
CONFIG_X86_THERMAL_VECTOR=y
#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
# CONFIG_PERF_EVENTS_AMD_POWER is not set
# CONFIG_VM86 is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_I8K=m
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_AMD=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=10
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_MEMORY_PROBE=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_MOVABLE_NODE=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
CONFIG_BALLOON_COMPACTION=y
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=m
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_CLEANCACHE=y
CONFIG_FRONTSWAP=y
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_ZSWAP=y
CONFIG_ZPOOL=y
CONFIG_ZBUD=y
# CONFIG_Z3FOLD is not set
CONFIG_ZSMALLOC=y
# CONFIG_PGTABLE_MAPPING is not set
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_FRAME_VECTOR=y
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
CONFIG_X86_PMEM_LEGACY_DEVICE=y
CONFIG_X86_PMEM_LEGACY=y
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_MIXED is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
# CONFIG_KEXEC_FILE is not set
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
CONFIG_PM_TEST_SUSPEND=y
CONFIG_PM_SLEEP_DEBUG=y
# CONFIG_DPM_WATCHDOG is not set
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=m
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_IPMI=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=m
CONFIG_ACPI_HED=y
CONFIG_ACPI_CUSTOM_METHOD=m
CONFIG_ACPI_BGRT=y
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ACPI_APEI_EINJ=m
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_PMIC_OPREGION is not set
CONFIG_SFI=y
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_PCC_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_AMD_FREQ_SENSITIVITY=m
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_P4_CLOCKMOD=m
#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m
#
# CPU Idle
#
CONFIG_CPU_IDLE=y
# CONFIG_CPU_IDLE_GOV_LADDER is not set
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
CONFIG_INTEL_IDLE=y
#
# Memory power savings
#
CONFIG_I7300_IDLE_IOAT_CHANNEL=y
CONFIG_I7300_IDLE=m
#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_XEN=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_PCIE_ECRC=y
CONFIG_PCIEAER_INJECT=m
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
# CONFIG_PCIE_DPC is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
CONFIG_PCI_STUB=y
# CONFIG_XEN_PCIDEV_FRONTEND is not set
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_LABEL=y
# CONFIG_PCI_HYPERV is not set
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_SHPC=m
#
# PCI host controller drivers
#
# CONFIG_PCIE_DW_PLAT is not set
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
CONFIG_PCCARD=y
# CONFIG_PCMCIA is not set
CONFIG_CARDBUS=y
#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set
#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=m
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
# CONFIG_X86_X32 is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
# CONFIG_VMD is not set
CONFIG_NET=y
CONFIG_COMPAT_NETLINK_MESSAGES=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=m
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_ROUTE_CLASSID=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_IP_PNP_BOOTP is not set
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE_DEMUX=m
CONFIG_NET_IP_TUNNEL=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=m
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
# CONFIG_INET_DIAG_DESTROY is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
# CONFIG_TCP_CONG_DCTCP is not set
# CONFIG_TCP_CONG_CDG is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=m
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=m
CONFIG_INET6_XFRM_MODE_TUNNEL=m
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=m
CONFIG_IPV6_SIT_6RD=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
# CONFIG_IPV6_GRE is not set
CONFIG_IPV6_MULTIPLE_TABLES=y
# CONFIG_IPV6_SUBTREES is not set
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_PIMSM_V2=y
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
CONFIG_NETWORK_PHY_TIMESTAMPING=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m
#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_ACCT=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_LOG_COMMON=m
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CONNTRACK_TIMEOUT is not set
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=m
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=m
CONFIG_NF_CT_PROTO_UDPLITE=m
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_BROADCAST=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_SNMP=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
# CONFIG_NETFILTER_NETLINK_GLUE_CT is not set
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_NF_NAT_PROTO_DCCP=m
CONFIG_NF_NAT_PROTO_UDPLITE=m
CONFIG_NF_NAT_PROTO_SCTP=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_REDIRECT=m
CONFIG_NETFILTER_SYNPROXY=m
CONFIG_NF_TABLES=m
# CONFIG_NF_TABLES_INET is not set
# CONFIG_NF_TABLES_NETDEV is not set
CONFIG_NFT_EXTHDR=m
CONFIG_NFT_META=m
CONFIG_NFT_CT=m
CONFIG_NFT_RBTREE=m
CONFIG_NFT_HASH=m
CONFIG_NFT_COUNTER=m
CONFIG_NFT_LOG=m
CONFIG_NFT_LIMIT=m
# CONFIG_NFT_MASQ is not set
# CONFIG_NFT_REDIR is not set
CONFIG_NFT_NAT=m
# CONFIG_NFT_QUEUE is not set
# CONFIG_NFT_REJECT is not set
CONFIG_NFT_COMPAT=m
CONFIG_NETFILTER_XTABLES=y
#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m
CONFIG_NETFILTER_XT_SET=m
#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_AUDIT=m
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
CONFIG_NETFILTER_XT_TARGET_LED=m
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_NAT=m
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
CONFIG_NETFILTER_XT_TARGET_TEE=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_CPU=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m
CONFIG_NETFILTER_XT_MATCH_L2TP=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_NFACCT=m
CONFIG_NETFILTER_XT_MATCH_OSF=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
# CONFIG_IP_SET_HASH_IPMARK is not set
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
# CONFIG_IP_SET_HASH_MAC is not set
# CONFIG_IP_SET_HASH_NETPORTNET is not set
CONFIG_IP_SET_HASH_NET=m
# CONFIG_IP_SET_HASH_NETNET is not set
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_IP_VS=m
CONFIG_IP_VS_IPV6=y
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12
#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y
#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
# CONFIG_IP_VS_FO is not set
# CONFIG_IP_VS_OVF is not set
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m
#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8
#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_PE_SIP=m
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_CONNTRACK_PROC_COMPAT is not set
CONFIG_NF_TABLES_IPV4=m
CONFIG_NFT_CHAIN_ROUTE_IPV4=m
# CONFIG_NFT_REJECT_IPV4 is not set
# CONFIG_NFT_DUP_IPV4 is not set
# CONFIG_NF_TABLES_ARP is not set
CONFIG_NF_DUP_IPV4=m
# CONFIG_NF_LOG_ARP is not set
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_NAT_IPV4=m
CONFIG_NFT_CHAIN_NAT_IPV4=m
CONFIG_NF_NAT_MASQUERADE_IPV4=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_SYNPROXY=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
#
# IPv6: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV6=m
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_NF_TABLES_IPV6=m
CONFIG_NFT_CHAIN_ROUTE_IPV6=m
# CONFIG_NFT_REJECT_IPV6 is not set
# CONFIG_NFT_DUP_IPV6 is not set
CONFIG_NF_DUP_IPV6=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_NF_LOG_IPV6=m
CONFIG_NF_NAT_IPV6=m
CONFIG_NFT_CHAIN_NAT_IPV6=m
# CONFIG_NF_NAT_MASQUERADE_IPV6 is not set
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RPFILTER=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_SYNPROXY=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_SECURITY=m
# CONFIG_IP6_NF_NAT is not set
CONFIG_NF_TABLES_BRIDGE=m
# CONFIG_NFT_BRIDGE_META is not set
# CONFIG_NF_LOG_BRIDGE is not set
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_IP6=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_NFLOG=m
CONFIG_IP_DCCP=m
CONFIG_INET_DCCP_DIAG=m
#
# DCCP CCIDs Configuration
#
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
CONFIG_IP_DCCP_CCID3=y
# CONFIG_IP_DCCP_CCID3_DEBUG is not set
CONFIG_IP_DCCP_TFRC_LIB=y
#
# DCCP Kernel Hacking
#
# CONFIG_IP_DCCP_DEBUG is not set
# CONFIG_NET_DCCPPROBE is not set
CONFIG_IP_SCTP=m
CONFIG_NET_SCTPPROBE=m
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
CONFIG_SCTP_COOKIE_HMAC_MD5=y
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
CONFIG_INET_SCTP_DIAG=m
# CONFIG_RDS is not set
CONFIG_TIPC=m
CONFIG_TIPC_MEDIA_UDP=y
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
# CONFIG_ATM_MPOA is not set
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_L2TP=m
CONFIG_L2TP_DEBUGFS=m
CONFIG_L2TP_V3=y
CONFIG_L2TP_IP=m
CONFIG_L2TP_ETH=m
CONFIG_STP=m
CONFIG_GARP=m
CONFIG_MRP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_BRIDGE_VLAN_FILTERING=y
CONFIG_HAVE_NET_DSA=y
CONFIG_VLAN_8021Q=m
CONFIG_VLAN_8021Q_GVRP=y
CONFIG_VLAN_8021Q_MVRP=y
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
CONFIG_IEEE802154=m
# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set
CONFIG_IEEE802154_SOCKET=m
CONFIG_MAC802154=m
CONFIG_NET_SCHED=y
#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_MULTIQ=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFB=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_DRR=m
CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
CONFIG_NET_SCH_CODEL=m
CONFIG_NET_SCH_FQ_CODEL=m
# CONFIG_NET_SCH_FQ is not set
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_SCH_PLUG=m
#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_CLS_FLOWER is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_EMATCH_IPSET=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=m
CONFIG_NET_ACT_IPT=m
CONFIG_NET_ACT_NAT=m
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
# CONFIG_NET_ACT_VLAN is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_NET_ACT_CONNMARK is not set
# CONFIG_NET_ACT_IFE is not set
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
CONFIG_DNS_RESOLVER=m
# CONFIG_BATMAN_ADV is not set
CONFIG_OPENVSWITCH=m
CONFIG_OPENVSWITCH_GRE=m
CONFIG_OPENVSWITCH_VXLAN=m
CONFIG_VSOCKETS=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
CONFIG_NETLINK_DIAG=m
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
# CONFIG_MPLS_ROUTING is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
CONFIG_SOCK_CGROUP_DATA=y
# CONFIG_CGROUP_NET_PRIO is not set
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_JIT=y
CONFIG_NET_FLOW_LIMIT=y
#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_NET_TCPPROBE is not set
CONFIG_NET_DROP_MONITOR=y
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
CONFIG_WIRELESS_EXT=y
CONFIG_WEXT_CORE=y
CONFIG_WEXT_PROC=y
CONFIG_WEXT_PRIV=y
CONFIG_CFG80211=m
# CONFIG_NL80211_TESTMODE is not set
# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set
CONFIG_CFG80211_DEFAULT_PS=y
# CONFIG_CFG80211_DEBUGFS is not set
# CONFIG_CFG80211_INTERNAL_REGDB is not set
CONFIG_CFG80211_CRDA_SUPPORT=y
# CONFIG_CFG80211_WEXT is not set
CONFIG_LIB80211=m
# CONFIG_LIB80211_DEBUG is not set
CONFIG_MAC80211=m
CONFIG_MAC80211_HAS_RC=y
CONFIG_MAC80211_RC_MINSTREL=y
CONFIG_MAC80211_RC_MINSTREL_HT=y
# CONFIG_MAC80211_RC_MINSTREL_VHT is not set
CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT="minstrel_ht"
# CONFIG_MAC80211_MESH is not set
# CONFIG_MAC80211_LEDS is not set
# CONFIG_MAC80211_DEBUGFS is not set
# CONFIG_MAC80211_MESSAGE_TRACING is not set
# CONFIG_MAC80211_DEBUG_MENU is not set
CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
# CONFIG_WIMAX is not set
CONFIG_RFKILL=m
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
# CONFIG_RFKILL_GPIO is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
# CONFIG_NET_DEVLINK is not set
CONFIG_MAY_USE_DEVLINK=y
CONFIG_HAVE_EBPF_JIT=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
CONFIG_SYS_HYPERVISOR=y
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=y
CONFIG_REGMAP_SPI=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_FENCE_TRACE is not set
CONFIG_DMA_CMA=y
#
# Default contiguous memory area size:
#
CONFIG_CMA_SIZE_MBYTES=200
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8
#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_MTD=m
# CONFIG_MTD_TESTS is not set
# CONFIG_MTD_REDBOOT_PARTS is not set
# CONFIG_MTD_CMDLINE_PARTS is not set
# CONFIG_MTD_AR7_PARTS is not set
#
# User Modules And Translation Layers
#
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
# CONFIG_MTD_BLOCK_RO is not set
# CONFIG_FTL is not set
# CONFIG_NFTL is not set
# CONFIG_INFTL is not set
# CONFIG_RFD_FTL is not set
# CONFIG_SSFDC is not set
# CONFIG_SM_FTL is not set
# CONFIG_MTD_OOPS is not set
# CONFIG_MTD_SWAP is not set
# CONFIG_MTD_PARTITIONED_MASTER is not set
#
# RAM/ROM/Flash chip drivers
#
# CONFIG_MTD_CFI is not set
# CONFIG_MTD_JEDECPROBE is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
# CONFIG_MTD_RAM is not set
# CONFIG_MTD_ROM is not set
# CONFIG_MTD_ABSENT is not set
#
# Mapping drivers for chip access
#
# CONFIG_MTD_COMPLEX_MAPPINGS is not set
# CONFIG_MTD_INTEL_VR_NOR is not set
# CONFIG_MTD_PLATRAM is not set
#
# Self-contained MTD device drivers
#
# CONFIG_MTD_PMC551 is not set
# CONFIG_MTD_DATAFLASH is not set
# CONFIG_MTD_SST25L is not set
# CONFIG_MTD_SLRAM is not set
# CONFIG_MTD_PHRAM is not set
# CONFIG_MTD_MTDRAM is not set
# CONFIG_MTD_BLOCK2MTD is not set
#
# Disk-On-Chip Device Drivers
#
# CONFIG_MTD_DOCG3 is not set
# CONFIG_MTD_NAND is not set
# CONFIG_MTD_ONENAND is not set
#
# LPDDR & LPDDR2 PCM memory drivers
#
# CONFIG_MTD_LPDDR is not set
# CONFIG_MTD_SPI_NOR is not set
CONFIG_MTD_UBI=m
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_LIMIT=20
# CONFIG_MTD_UBI_FASTMAP is not set
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_BLOCK is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_GSC is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set
#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_NULL_BLK=m
CONFIG_BLK_DEV_FD=m
# CONFIG_PARIDE is not set
CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m
# CONFIG_ZRAM is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
CONFIG_BLK_DEV_OSD=m
CONFIG_BLK_DEV_SX8=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_XEN_BLKDEV_FRONTEND=m
# CONFIG_XEN_BLKDEV_BACKEND is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_BLK_DEV_HD is not set
# CONFIG_BLK_DEV_RBD is not set
CONFIG_BLK_DEV_RSXX=m
CONFIG_NVME_CORE=m
CONFIG_BLK_DEV_NVME=m
# CONFIG_BLK_DEV_NVME_SCSI is not set
#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=m
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
CONFIG_SGI_IOC4=m
CONFIG_TIFM_CORE=m
CONFIG_TIFM_7XX1=m
# CONFIG_ICS932S401 is not set
CONFIG_ENCLOSURE_SERVICES=m
CONFIG_SGI_XP=m
CONFIG_HP_ILO=m
CONFIG_SGI_GRU=m
# CONFIG_SGI_GRU_DEBUG is not set
CONFIG_APDS9802ALS=m
CONFIG_ISL29003=m
CONFIG_ISL29020=m
CONFIG_SENSORS_TSL2550=m
# CONFIG_SENSORS_BH1780 is not set
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=m
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_TI_DAC7512 is not set
CONFIG_VMWARE_BALLOON=m
# CONFIG_BMP085_I2C is not set
# CONFIG_BMP085_SPI is not set
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_LATTICE_ECP3_CONFIG is not set
# CONFIG_SRAM is not set
# CONFIG_PANEL is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
CONFIG_EEPROM_AT24=m
# CONFIG_EEPROM_AT25 is not set
CONFIG_EEPROM_LEGACY=m
CONFIG_EEPROM_MAX6875=m
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_93XX46 is not set
CONFIG_CB710_CORE=m
# CONFIG_CB710_DEBUG is not set
CONFIG_CB710_DEBUG_ASSUMPTIONS=y
#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
CONFIG_SENSORS_LIS3_I2C=m
#
# Altera FPGA firmware download module
#
CONFIG_ALTERA_STAPL=m
CONFIG_INTEL_MEI=y
CONFIG_INTEL_MEI_ME=y
# CONFIG_INTEL_MEI_TXE is not set
CONFIG_VMWARE_VMCI=m
#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set
#
# SCIF Bus Driver
#
# CONFIG_SCIF_BUS is not set
#
# VOP Bus Driver
#
# CONFIG_VOP_BUS is not set
#
# Intel MIC Host Driver
#
#
# Intel MIC Card Driver
#
#
# SCIF Driver
#
#
# Intel MIC Coprocessor State Management (COSM) Drivers
#
#
# VOP Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_CXL_BASE is not set
# CONFIG_CXL_KERNEL_API is not set
# CONFIG_CXL_EEH is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m
CONFIG_SCSI_ENCLOSURE=m
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SRP_ATTRS=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
CONFIG_ISCSI_BOOT_SYSFS=m
CONFIG_SCSI_CXGB3_ISCSI=m
CONFIG_SCSI_CXGB4_ISCSI=m
CONFIG_SCSI_BNX2_ISCSI=m
CONFIG_SCSI_BNX2X_FCOE=m
CONFIG_BE2ISCSI=m
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
CONFIG_SCSI_HPSA=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_3W_SAS=m
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AACRAID=m
# CONFIG_SCSI_AIC7XXX is not set
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=4
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC94XX is not set
CONFIG_SCSI_MVSAS=m
# CONFIG_SCSI_MVSAS_DEBUG is not set
CONFIG_SCSI_MVSAS_TASKLET=y
CONFIG_SCSI_MVUMI=m
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
CONFIG_SCSI_ARCMSR=m
# CONFIG_SCSI_ESAS2R is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_MPT3SAS=m
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_UFSHCD=m
CONFIG_SCSI_UFSHCD_PCI=m
# CONFIG_SCSI_UFSHCD_PLATFORM is not set
CONFIG_SCSI_HPTIOP=m
# CONFIG_SCSI_BUSLOGIC is not set
CONFIG_VMWARE_PVSCSI=m
# CONFIG_XEN_SCSI_FRONTEND is not set
CONFIG_HYPERV_STORAGE=m
CONFIG_LIBFC=m
CONFIG_LIBFCOE=m
CONFIG_FCOE=m
CONFIG_FCOE_FNIC=m
# CONFIG_SCSI_SNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
CONFIG_SCSI_ISCI=m
# CONFIG_SCSI_IPS is not set
CONFIG_SCSI_INITIO=m
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
CONFIG_SCSI_STEX=m
# CONFIG_SCSI_SYM53C8XX_2 is not set
CONFIG_SCSI_IPR=m
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA_FC=m
# CONFIG_TCM_QLA2XXX is not set
CONFIG_SCSI_QLA_ISCSI=m
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
CONFIG_SCSI_DEBUG=m
CONFIG_SCSI_PMCRAID=m
CONFIG_SCSI_PM8001=m
# CONFIG_SCSI_BFA_FC is not set
CONFIG_SCSI_VIRTIO=m
CONFIG_SCSI_CHELSIO_FCOE=m
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y
CONFIG_SCSI_DH_ALUA=y
CONFIG_SCSI_OSD_INITIATOR=m
CONFIG_SCSI_OSD_ULD=m
CONFIG_SCSI_OSD_DPRINT_SENSE=1
# CONFIG_SCSI_OSD_DEBUG is not set
CONFIG_ATA=m
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y
#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=m
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_SATA_INIC162X is not set
CONFIG_SATA_ACARD_AHCI=m
CONFIG_SATA_SIL24=m
CONFIG_ATA_SFF=y
#
# SFF controllers with custom DMA interface
#
CONFIG_PDC_ADMA=m
CONFIG_SATA_QSTOR=m
CONFIG_SATA_SX4=m
CONFIG_ATA_BMDMA=y
#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=m
# CONFIG_SATA_DWC is not set
CONFIG_SATA_MV=m
CONFIG_SATA_NV=m
CONFIG_SATA_PROMISE=m
CONFIG_SATA_SIL=m
CONFIG_SATA_SIS=m
CONFIG_SATA_SVW=m
CONFIG_SATA_ULI=m
CONFIG_SATA_VIA=m
CONFIG_SATA_VITESSE=m
#
# PATA SFF controllers with BMDMA
#
CONFIG_PATA_ALI=m
CONFIG_PATA_AMD=m
CONFIG_PATA_ARTOP=m
CONFIG_PATA_ATIIXP=m
CONFIG_PATA_ATP867X=m
CONFIG_PATA_CMD64X=m
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_PATA_HPT366=m
CONFIG_PATA_HPT37X=m
CONFIG_PATA_HPT3X2N=m
CONFIG_PATA_HPT3X3=m
# CONFIG_PATA_HPT3X3_DMA is not set
CONFIG_PATA_IT8213=m
CONFIG_PATA_IT821X=m
CONFIG_PATA_JMICRON=m
CONFIG_PATA_MARVELL=m
CONFIG_PATA_NETCELL=m
CONFIG_PATA_NINJA32=m
# CONFIG_PATA_NS87415 is not set
CONFIG_PATA_OLDPIIX=m
# CONFIG_PATA_OPTIDMA is not set
CONFIG_PATA_PDC2027X=m
CONFIG_PATA_PDC_OLD=m
# CONFIG_PATA_RADISYS is not set
CONFIG_PATA_RDC=m
CONFIG_PATA_SCH=m
CONFIG_PATA_SERVERWORKS=m
CONFIG_PATA_SIL680=m
CONFIG_PATA_SIS=m
CONFIG_PATA_TOSHIBA=m
# CONFIG_PATA_TRIFLEX is not set
CONFIG_PATA_VIA=m
# CONFIG_PATA_WINBOND is not set
#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set
#
# Generic fallback / legacy drivers
#
CONFIG_PATA_ACPI=m
CONFIG_ATA_GENERIC=m
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
# CONFIG_MD_CLUSTER is not set
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_MQ_DEFAULT is not set
CONFIG_DM_DEBUG=y
CONFIG_DM_BUFIO=m
# CONFIG_DM_DEBUG_BLOCK_STACK_TRACING is not set
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_THIN_PROVISIONING=m
CONFIG_DM_CACHE=m
CONFIG_DM_CACHE_SMQ=m
CONFIG_DM_CACHE_CLEANER=m
# CONFIG_DM_ERA is not set
CONFIG_DM_MIRROR=m
CONFIG_DM_LOG_USERSPACE=m
CONFIG_DM_RAID=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_QL=m
CONFIG_DM_MULTIPATH_ST=m
CONFIG_DM_DELAY=m
CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
# CONFIG_DM_VERITY_FEC is not set
CONFIG_DM_SWITCH=m
# CONFIG_DM_LOG_WRITES is not set
CONFIG_TARGET_CORE=m
CONFIG_TCM_IBLOCK=m
CONFIG_TCM_FILEIO=m
CONFIG_TCM_PSCSI=m
# CONFIG_TCM_USER2 is not set
CONFIG_LOOPBACK_TARGET=m
CONFIG_TCM_FC=m
CONFIG_ISCSI_TARGET=m
# CONFIG_ISCSI_TARGET_CXGB4 is not set
# CONFIG_SBP_TARGET is not set
CONFIG_FUSION=y
CONFIG_FUSION_SPI=m
# CONFIG_FUSION_FC is not set
CONFIG_FUSION_SAS=m
CONFIG_FUSION_MAX_SGE=128
CONFIG_FUSION_CTL=m
CONFIG_FUSION_LOGGING=y
#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_SBP2=m
CONFIG_FIREWIRE_NET=m
# CONFIG_FIREWIRE_NOSY is not set
CONFIG_MACINTOSH_DRIVERS=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
CONFIG_BONDING=m
CONFIG_DUMMY=m
# CONFIG_EQUALIZER is not set
CONFIG_NET_FC=y
CONFIG_IFB=m
CONFIG_NET_TEAM=m
CONFIG_NET_TEAM_MODE_BROADCAST=m
CONFIG_NET_TEAM_MODE_ROUNDROBIN=m
CONFIG_NET_TEAM_MODE_RANDOM=m
CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m
CONFIG_NET_TEAM_MODE_LOADBALANCE=m
CONFIG_MACVLAN=m
CONFIG_MACVTAP=m
# CONFIG_IPVLAN is not set
CONFIG_VXLAN=m
# CONFIG_GENEVE is not set
# CONFIG_GTP is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=m
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=m
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=m
CONFIG_VIRTIO_NET=y
CONFIG_NLMON=m
# CONFIG_ARCNET is not set
# CONFIG_ATM_DRIVERS is not set
#
# CAIF transport drivers
#
CONFIG_VHOST_NET=m
# CONFIG_VHOST_SCSI is not set
CONFIG_VHOST_RING=m
CONFIG_VHOST=m
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
#
# Distributed Switch Architecture drivers
#
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
# CONFIG_NET_VENDOR_ALTEON is not set
# CONFIG_ALTERA_TSE is not set
# CONFIG_NET_VENDOR_AMD is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
CONFIG_ATL2=m
CONFIG_ATL1=m
CONFIG_ATL1E=m
CONFIG_ATL1C=m
CONFIG_ALX=m
# CONFIG_NET_VENDOR_AURORA is not set
CONFIG_NET_CADENCE=y
# CONFIG_MACB is not set
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_B44=m
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
# CONFIG_BCMGENET is not set
CONFIG_BNX2=m
CONFIG_CNIC=m
CONFIG_TIGON3=y
# CONFIG_BNX2X is not set
# CONFIG_BNXT is not set
CONFIG_NET_VENDOR_BROCADE=y
CONFIG_BNA=m
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_LIQUIDIO is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
CONFIG_CHELSIO_T3=m
CONFIG_CHELSIO_T4=m
# CONFIG_CHELSIO_T4_DCB is not set
# CONFIG_CHELSIO_T4_UWIRE is not set
CONFIG_CHELSIO_T4VF=m
CONFIG_NET_VENDOR_CISCO=y
CONFIG_ENIC=m
# CONFIG_CX_ECAT is not set
CONFIG_DNET=m
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_TULIP=y
CONFIG_DE2104X=m
CONFIG_DE2104X_DSL=0
CONFIG_TULIP=y
# CONFIG_TULIP_MWI is not set
CONFIG_TULIP_MMIO=y
# CONFIG_TULIP_NAPI is not set
CONFIG_DE4X5=m
CONFIG_WINBOND_840=m
CONFIG_DM9102=m
CONFIG_ULI526X=m
CONFIG_PCMCIA_XIRCOM=m
# CONFIG_NET_VENDOR_DLINK is not set
CONFIG_NET_VENDOR_EMULEX=y
CONFIG_BE2NET=m
CONFIG_BE2NET_HWMON=y
CONFIG_BE2NET_VXLAN=y
CONFIG_NET_VENDOR_EZCHIP=y
# CONFIG_NET_VENDOR_EXAR is not set
# CONFIG_NET_VENDOR_HP is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
CONFIG_IGB=y
CONFIG_IGB_HWMON=y
CONFIG_IGBVF=m
CONFIG_IXGB=m
CONFIG_IXGBE=y
CONFIG_IXGBE_HWMON=y
CONFIG_IXGBE_DCB=y
CONFIG_IXGBEVF=m
CONFIG_I40E=m
# CONFIG_I40E_VXLAN is not set
# CONFIG_I40E_DCB is not set
# CONFIG_I40E_FCOE is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
# CONFIG_NET_VENDOR_I825XX is not set
CONFIG_JME=m
CONFIG_NET_VENDOR_MARVELL=y
CONFIG_MVMDIO=m
# CONFIG_MVNETA_BM is not set
CONFIG_SKGE=m
CONFIG_SKGE_DEBUG=y
CONFIG_SKGE_GENESIS=y
CONFIG_SKY2=m
CONFIG_SKY2_DEBUG=y
CONFIG_NET_VENDOR_MELLANOX=y
CONFIG_MLX4_EN=m
CONFIG_MLX4_EN_DCB=y
CONFIG_MLX4_EN_VXLAN=y
CONFIG_MLX4_CORE=m
CONFIG_MLX4_DEBUG=y
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_NET_VENDOR_MICREL is not set
CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_ENC28J60 is not set
# CONFIG_ENCX24J600 is not set
CONFIG_NET_VENDOR_MYRI=y
CONFIG_MYRI10GE=m
# CONFIG_FEALNX is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP_NETVF is not set
# CONFIG_NET_VENDOR_NVIDIA is not set
CONFIG_NET_VENDOR_OKI=y
CONFIG_ETHOC=m
CONFIG_NET_PACKET_ENGINE=y
# CONFIG_HAMACHI is not set
CONFIG_YELLOWFIN=m
CONFIG_NET_VENDOR_QLOGIC=y
CONFIG_QLA3XXX=m
CONFIG_QLCNIC=m
CONFIG_QLCNIC_SRIOV=y
CONFIG_QLCNIC_DCB=y
# CONFIG_QLCNIC_VXLAN is not set
CONFIG_QLCNIC_HWMON=y
CONFIG_QLGE=m
CONFIG_NETXEN_NIC=m
# CONFIG_QED is not set
CONFIG_NET_VENDOR_QUALCOMM=y
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_ATP is not set
CONFIG_8139CP=y
CONFIG_8139TOO=y
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_R8169=y
CONFIG_NET_VENDOR_RENESAS=y
# CONFIG_NET_VENDOR_RDC is not set
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SILAN is not set
# CONFIG_NET_VENDOR_SIS is not set
CONFIG_SFC=m
CONFIG_SFC_MTD=y
CONFIG_SFC_MCDI_MON=y
CONFIG_SFC_SRIOV=y
CONFIG_SFC_MCDI_LOGGING=y
CONFIG_NET_VENDOR_SMSC=y
CONFIG_EPIC100=m
# CONFIG_SMSC911X is not set
CONFIG_SMSC9420=m
# CONFIG_NET_VENDOR_STMICRO is not set
# CONFIG_NET_VENDOR_SUN is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_NET_VENDOR_TEHUTI is not set
# CONFIG_NET_VENDOR_TI is not set
# CONFIG_NET_VENDOR_VIA is not set
# CONFIG_NET_VENDOR_WIZNET is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLIB=y
#
# MII PHY device drivers
#
# CONFIG_AQUANTIA_PHY is not set
CONFIG_AT803X_PHY=m
CONFIG_AMD_PHY=m
CONFIG_MARVELL_PHY=m
CONFIG_DAVICOM_PHY=m
CONFIG_QSEMI_PHY=m
CONFIG_LXT_PHY=m
CONFIG_CICADA_PHY=m
CONFIG_VITESSE_PHY=m
# CONFIG_TERANETICS_PHY is not set
CONFIG_SMSC_PHY=m
CONFIG_BCM_NET_PHYLIB=m
CONFIG_BROADCOM_PHY=m
# CONFIG_BCM7XXX_PHY is not set
CONFIG_BCM87XX_PHY=m
CONFIG_ICPLUS_PHY=m
CONFIG_REALTEK_PHY=m
CONFIG_NATIONAL_PHY=m
CONFIG_STE10XP=m
CONFIG_LSI_ET1011C_PHY=m
CONFIG_MICREL_PHY=m
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
CONFIG_FIXED_PHY=y
CONFIG_MDIO_BITBANG=m
# CONFIG_MDIO_GPIO is not set
# CONFIG_MDIO_OCTEON is not set
# CONFIG_MDIO_THUNDER is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_MICREL_KS8995MA is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_FILTER=y
CONFIG_PPP_MPPE=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPPOATM=m
CONFIG_PPPOE=m
CONFIG_PPTP=m
CONFIG_PPPOL2TP=m
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_SLIP=m
CONFIG_SLHC=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_USB_NET_DRIVERS=y
CONFIG_USB_CATC=y
CONFIG_USB_KAWETH=y
CONFIG_USB_PEGASUS=y
CONFIG_USB_RTL8150=y
CONFIG_USB_RTL8152=m
# CONFIG_USB_LAN78XX is not set
CONFIG_USB_USBNET=y
CONFIG_USB_NET_AX8817X=y
CONFIG_USB_NET_AX88179_178A=m
CONFIG_USB_NET_CDCETHER=y
CONFIG_USB_NET_CDC_EEM=y
CONFIG_USB_NET_CDC_NCM=m
# CONFIG_USB_NET_HUAWEI_CDC_NCM is not set
CONFIG_USB_NET_CDC_MBIM=m
CONFIG_USB_NET_DM9601=y
# CONFIG_USB_NET_SR9700 is not set
# CONFIG_USB_NET_SR9800 is not set
CONFIG_USB_NET_SMSC75XX=y
CONFIG_USB_NET_SMSC95XX=y
CONFIG_USB_NET_GL620A=y
CONFIG_USB_NET_NET1080=y
CONFIG_USB_NET_PLUSB=y
CONFIG_USB_NET_MCS7830=y
CONFIG_USB_NET_RNDIS_HOST=y
CONFIG_USB_NET_CDC_SUBSET_ENABLE=y
CONFIG_USB_NET_CDC_SUBSET=y
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_KC2190=y
CONFIG_USB_NET_ZAURUS=y
CONFIG_USB_NET_CX82310_ETH=m
CONFIG_USB_NET_KALMIA=m
CONFIG_USB_NET_QMI_WWAN=m
CONFIG_USB_HSO=m
CONFIG_USB_NET_INT51X1=y
CONFIG_USB_IPHETH=y
CONFIG_USB_SIERRA_NET=y
CONFIG_USB_VL600=m
# CONFIG_USB_NET_CH9200 is not set
CONFIG_WLAN=y
CONFIG_WLAN_VENDOR_ADMTEK=y
# CONFIG_ADM8211 is not set
CONFIG_WLAN_VENDOR_ATH=y
# CONFIG_ATH_DEBUG is not set
# CONFIG_ATH5K is not set
# CONFIG_ATH5K_PCI is not set
# CONFIG_ATH9K is not set
# CONFIG_ATH9K_HTC is not set
# CONFIG_CARL9170 is not set
# CONFIG_ATH6KL is not set
# CONFIG_AR5523 is not set
# CONFIG_WIL6210 is not set
# CONFIG_ATH10K is not set
# CONFIG_WCN36XX is not set
CONFIG_WLAN_VENDOR_ATMEL=y
# CONFIG_ATMEL is not set
# CONFIG_AT76C50X_USB is not set
CONFIG_WLAN_VENDOR_BROADCOM=y
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_BRCMSMAC is not set
# CONFIG_BRCMFMAC is not set
CONFIG_WLAN_VENDOR_CISCO=y
# CONFIG_AIRO is not set
CONFIG_WLAN_VENDOR_INTEL=y
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_IWL4965 is not set
# CONFIG_IWL3945 is not set
# CONFIG_IWLWIFI is not set
CONFIG_WLAN_VENDOR_INTERSIL=y
# CONFIG_HOSTAP is not set
# CONFIG_HERMES is not set
# CONFIG_P54_COMMON is not set
# CONFIG_PRISM54 is not set
CONFIG_WLAN_VENDOR_MARVELL=y
# CONFIG_LIBERTAS is not set
# CONFIG_LIBERTAS_THINFIRM is not set
# CONFIG_MWIFIEX is not set
# CONFIG_MWL8K is not set
CONFIG_WLAN_VENDOR_MEDIATEK=y
# CONFIG_MT7601U is not set
CONFIG_WLAN_VENDOR_RALINK=y
# CONFIG_RT2X00 is not set
CONFIG_WLAN_VENDOR_REALTEK=y
# CONFIG_RTL8180 is not set
# CONFIG_RTL8187 is not set
CONFIG_RTL_CARDS=m
# CONFIG_RTL8192CE is not set
# CONFIG_RTL8192SE is not set
# CONFIG_RTL8192DE is not set
# CONFIG_RTL8723AE is not set
# CONFIG_RTL8723BE is not set
# CONFIG_RTL8188EE is not set
# CONFIG_RTL8192EE is not set
# CONFIG_RTL8821AE is not set
# CONFIG_RTL8192CU is not set
# CONFIG_RTL8XXXU is not set
CONFIG_WLAN_VENDOR_RSI=y
# CONFIG_RSI_91X is not set
CONFIG_WLAN_VENDOR_ST=y
# CONFIG_CW1200 is not set
CONFIG_WLAN_VENDOR_TI=y
# CONFIG_WL1251 is not set
# CONFIG_WL12XX is not set
# CONFIG_WL18XX is not set
# CONFIG_WLCORE is not set
CONFIG_WLAN_VENDOR_ZYDAS=y
# CONFIG_USB_ZD1201 is not set
# CONFIG_ZD1211RW is not set
CONFIG_MAC80211_HWSIM=m
# CONFIG_USB_NET_RNDIS_WLAN is not set
#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
CONFIG_WAN=y
# CONFIG_LANMEDIA is not set
CONFIG_HDLC=m
CONFIG_HDLC_RAW=m
# CONFIG_HDLC_RAW_ETH is not set
CONFIG_HDLC_CISCO=m
CONFIG_HDLC_FR=m
CONFIG_HDLC_PPP=m
#
# X.25/LAPB support is disabled
#
# CONFIG_PCI200SYN is not set
# CONFIG_WANXL is not set
# CONFIG_PC300TOO is not set
# CONFIG_FARSYNC is not set
# CONFIG_DSCC4 is not set
CONFIG_DLCI=m
CONFIG_DLCI_MAX=8
# CONFIG_SBNI is not set
CONFIG_IEEE802154_DRIVERS=m
CONFIG_IEEE802154_FAKELB=m
# CONFIG_IEEE802154_AT86RF230 is not set
# CONFIG_IEEE802154_MRF24J40 is not set
# CONFIG_IEEE802154_CC2520 is not set
# CONFIG_IEEE802154_ATUSB is not set
# CONFIG_IEEE802154_ADF7242 is not set
CONFIG_XEN_NETDEV_FRONTEND=m
# CONFIG_XEN_NETDEV_BACKEND is not set
CONFIG_VMXNET3=m
# CONFIG_FUJITSU_ES is not set
CONFIG_HYPERV_NET=m
CONFIG_ISDN=y
CONFIG_ISDN_I4L=m
CONFIG_ISDN_PPP=y
CONFIG_ISDN_PPP_VJ=y
CONFIG_ISDN_MPP=y
CONFIG_IPPP_FILTER=y
# CONFIG_ISDN_PPP_BSDCOMP is not set
CONFIG_ISDN_AUDIO=y
CONFIG_ISDN_TTY_FAX=y
#
# ISDN feature submodules
#
CONFIG_ISDN_DIVERSION=m
#
# ISDN4Linux hardware drivers
#
#
# Passive cards
#
# CONFIG_ISDN_DRV_HISAX is not set
CONFIG_ISDN_CAPI=m
# CONFIG_CAPI_TRACE is not set
CONFIG_ISDN_CAPI_CAPI20=m
CONFIG_ISDN_CAPI_MIDDLEWARE=y
CONFIG_ISDN_CAPI_CAPIDRV=m
# CONFIG_ISDN_CAPI_CAPIDRV_VERBOSE is not set
#
# CAPI hardware drivers
#
CONFIG_CAPI_AVM=y
CONFIG_ISDN_DRV_AVMB1_B1PCI=m
CONFIG_ISDN_DRV_AVMB1_B1PCIV4=y
CONFIG_ISDN_DRV_AVMB1_T1PCI=m
CONFIG_ISDN_DRV_AVMB1_C4=m
# CONFIG_CAPI_EICON is not set
CONFIG_ISDN_DRV_GIGASET=m
CONFIG_GIGASET_CAPI=y
# CONFIG_GIGASET_I4L is not set
# CONFIG_GIGASET_DUMMYLL is not set
CONFIG_GIGASET_BASE=m
CONFIG_GIGASET_M105=m
CONFIG_GIGASET_M101=m
# CONFIG_GIGASET_DEBUG is not set
CONFIG_HYSDN=m
CONFIG_HYSDN_CAPI=y
CONFIG_MISDN=m
CONFIG_MISDN_DSP=m
CONFIG_MISDN_L1OIP=m
#
# mISDN hardware drivers
#
CONFIG_MISDN_HFCPCI=m
CONFIG_MISDN_HFCMULTI=m
CONFIG_MISDN_HFCUSB=m
CONFIG_MISDN_AVMFRITZ=m
CONFIG_MISDN_SPEEDFAX=m
CONFIG_MISDN_INFINEON=m
CONFIG_MISDN_W6692=m
CONFIG_MISDN_NETJET=m
CONFIG_MISDN_IPAC=m
CONFIG_MISDN_ISAR=m
CONFIG_ISDN_HDLC=m
# CONFIG_NVM is not set
#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
CONFIG_INPUT_FF_MEMLESS=m
CONFIG_INPUT_POLLDEV=m
CONFIG_INPUT_SPARSEKMAP=m
# CONFIG_INPUT_MATRIXKMAP is not set
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_MATRIX is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
CONFIG_MOUSE_SERIAL=m
CONFIG_MOUSE_APPLETOUCH=m
CONFIG_MOUSE_BCM5974=m
CONFIG_MOUSE_CYAPA=m
# CONFIG_MOUSE_ELAN_I2C is not set
CONFIG_MOUSE_VSXXXAA=m
# CONFIG_MOUSE_GPIO is not set
CONFIG_MOUSE_SYNAPTICS_I2C=m
CONFIG_MOUSE_SYNAPTICS_USB=m
# CONFIG_INPUT_JOYSTICK is not set
CONFIG_INPUT_TABLET=y
CONFIG_TABLET_USB_ACECAD=m
CONFIG_TABLET_USB_AIPTEK=m
CONFIG_TABLET_USB_GTCO=m
# CONFIG_TABLET_USB_HANWANG is not set
CONFIG_TABLET_USB_KBTAB=m
# CONFIG_TABLET_SERIAL_WACOM4 is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_PROPERTIES=y
# CONFIG_TOUCHSCREEN_ADS7846 is not set
# CONFIG_TOUCHSCREEN_AD7877 is not set
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set
# CONFIG_TOUCHSCREEN_BU21013 is not set
# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set
# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set
# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set
# CONFIG_TOUCHSCREEN_DYNAPRO is not set
# CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
# CONFIG_TOUCHSCREEN_EETI is not set
# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set
# CONFIG_TOUCHSCREEN_FT6236 is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_GOODIX is not set
# CONFIG_TOUCHSCREEN_ILI210X is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_ELAN is not set
# CONFIG_TOUCHSCREEN_ELO is not set
CONFIG_TOUCHSCREEN_WACOM_W8001=m
CONFIG_TOUCHSCREEN_WACOM_I2C=m
# CONFIG_TOUCHSCREEN_MAX11801 is not set
# CONFIG_TOUCHSCREEN_MCS5000 is not set
# CONFIG_TOUCHSCREEN_MMS114 is not set
# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_PIXCIR is not set
# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set
# CONFIG_TOUCHSCREEN_WM97XX is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC_SERIO is not set
# CONFIG_TOUCHSCREEN_TSC2004 is not set
# CONFIG_TOUCHSCREEN_TSC2005 is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
# CONFIG_TOUCHSCREEN_ST1232 is not set
# CONFIG_TOUCHSCREEN_SUR40 is not set
# CONFIG_TOUCHSCREEN_SX8654 is not set
# CONFIG_TOUCHSCREEN_TPS6507X is not set
# CONFIG_TOUCHSCREEN_ZFORCE is not set
# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_MPU3050 is not set
CONFIG_INPUT_APANEL=m
# CONFIG_INPUT_GP2A is not set
# CONFIG_INPUT_GPIO_BEEPER is not set
# CONFIG_INPUT_GPIO_TILT_POLLED is not set
CONFIG_INPUT_ATLAS_BTNS=m
CONFIG_INPUT_ATI_REMOTE2=m
CONFIG_INPUT_KEYSPAN_REMOTE=m
# CONFIG_INPUT_KXTJ9 is not set
CONFIG_INPUT_POWERMATE=m
CONFIG_INPUT_YEALINK=m
CONFIG_INPUT_CM109=m
CONFIG_INPUT_UINPUT=m
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_PWM_BEEPER is not set
# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV260X_HAPTICS is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
# CONFIG_RMI4_CORE is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_SERIO_ALTERA_PS2=m
# CONFIG_SERIO_PS2MULT is not set
CONFIG_SERIO_ARC_PS2=m
CONFIG_HYPERV_KEYBOARD=m
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set
#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_ROCKETPORT is not set
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
CONFIG_MOXA_INTELLIO=m
CONFIG_MOXA_SMARTIO=m
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_NOZOMI=m
# CONFIG_ISI is not set
CONFIG_N_HDLC=m
CONFIG_N_GSM=m
# CONFIG_TRACE_SINK is not set
CONFIG_DEVMEM=y
# CONFIG_DEVKMEM is not set
#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_FSL is not set
CONFIG_SERIAL_8250_DW=y
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_MID=y
# CONFIG_SERIAL_8250_MOXA is not set
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MAX3100 is not set
# CONFIG_SERIAL_MAX310X is not set
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_IFX6X60 is not set
CONFIG_SERIAL_ARC=m
CONFIG_SERIAL_ARC_NR_PORTS=1
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
CONFIG_HVC_XEN_FRONTEND=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
# CONFIG_IPMI_SI_PROBE_DEFAULTS is not set
# CONFIG_IPMI_SSIF is not set
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=m
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_HW_RANDOM_TPM=m
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=y
CONFIG_MAX_RAW_DEVS=8192
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
# CONFIG_HPET_MMAP_DEFAULT is not set
CONFIG_HANGCHECK_TIMER=m
CONFIG_UV_MMTIMER=m
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS=y
# CONFIG_TCG_TIS_I2C_ATMEL is not set
# CONFIG_TCG_TIS_I2C_INFINEON is not set
# CONFIG_TCG_TIS_I2C_NUVOTON is not set
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
# CONFIG_TCG_XEN is not set
# CONFIG_TCG_CRB is not set
# CONFIG_TCG_TIS_ST33ZP24 is not set
CONFIG_TELCLOCK=m
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set
#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_MUX=m
#
# Multiplexer I2C Chip support
#
# CONFIG_I2C_MUX_GPIO is not set
# CONFIG_I2C_MUX_PCA9541 is not set
# CONFIG_I2C_MUX_PCA954x is not set
# CONFIG_I2C_MUX_PINCTRL is not set
# CONFIG_I2C_MUX_REG is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=m
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCA=m
#
# I2C Hardware Bus support
#
#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=y
CONFIG_I2C_ISCH=m
CONFIG_I2C_ISMT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_NFORCE2_S4985=m
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
CONFIG_I2C_SIS96X=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
#
# ACPI drivers
#
CONFIG_I2C_SCMI=m
#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
CONFIG_I2C_DESIGNWARE_CORE=m
CONFIG_I2C_DESIGNWARE_PLATFORM=m
CONFIG_I2C_DESIGNWARE_PCI=m
# CONFIG_I2C_DESIGNWARE_BAYTRAIL is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
CONFIG_I2C_PCA_PLATFORM=m
# CONFIG_I2C_PXA_PCI is not set
CONFIG_I2C_SIMTEC=m
# CONFIG_I2C_XILINX is not set
#
# External I2C/SMBus adapter drivers
#
CONFIG_I2C_DIOLAN_U2C=m
CONFIG_I2C_PARPORT=m
CONFIG_I2C_PARPORT_LIGHT=m
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
CONFIG_I2C_TINY_USB=m
CONFIG_I2C_VIPERBOARD=m
#
# Other I2C/SMBus bus drivers
#
CONFIG_I2C_STUB=m
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
CONFIG_SPI=y
# CONFIG_SPI_DEBUG is not set
CONFIG_SPI_MASTER=y
#
# SPI Master Controller Drivers
#
# CONFIG_SPI_ALTERA is not set
# CONFIG_SPI_AXI_SPI_ENGINE is not set
# CONFIG_SPI_BITBANG is not set
# CONFIG_SPI_BUTTERFLY is not set
# CONFIG_SPI_CADENCE is not set
CONFIG_SPI_DESIGNWARE=m
# CONFIG_SPI_DW_PCI is not set
# CONFIG_SPI_DW_MMIO is not set
# CONFIG_SPI_GPIO is not set
# CONFIG_SPI_LM70_LLP is not set
# CONFIG_SPI_OC_TINY is not set
CONFIG_SPI_PXA2XX=m
CONFIG_SPI_PXA2XX_PCI=m
# CONFIG_SPI_ROCKCHIP is not set
# CONFIG_SPI_SC18IS602 is not set
# CONFIG_SPI_XCOMM is not set
# CONFIG_SPI_XILINX is not set
# CONFIG_SPI_ZYNQMP_GQSPI is not set
#
# SPI Protocol Masters
#
# CONFIG_SPI_SPIDEV is not set
# CONFIG_SPI_LOOPBACK_TEST is not set
# CONFIG_SPI_TLE62X0 is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
#
# PPS support
#
CONFIG_PPS=y
# CONFIG_PPS_DEBUG is not set
#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
CONFIG_PPS_CLIENT_LDISC=m
CONFIG_PPS_CLIENT_PARPORT=m
CONFIG_PPS_CLIENT_GPIO=m
#
# PPS generators support
#
#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y
CONFIG_DP83640_PHY=m
CONFIG_PINCTRL=y
#
# Pin controllers
#
CONFIG_PINMUX=y
CONFIG_PINCONF=y
CONFIG_GENERIC_PINCONF=y
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_AMD is not set
CONFIG_PINCTRL_BAYTRAIL=y
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_DEVRES=y
CONFIG_GPIO_ACPI=y
CONFIG_GPIOLIB_IRQCHIP=y
# CONFIG_DEBUG_GPIO is not set
CONFIG_GPIO_SYSFS=y
#
# Memory mapped GPIO drivers
#
# CONFIG_GPIO_AMDPT is not set
# CONFIG_GPIO_DWAPB is not set
# CONFIG_GPIO_GENERIC_PLATFORM is not set
# CONFIG_GPIO_ICH is not set
CONFIG_GPIO_LYNXPOINT=y
# CONFIG_GPIO_VX855 is not set
# CONFIG_GPIO_ZX is not set
#
# Port-mapped I/O GPIO drivers
#
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_IT87 is not set
# CONFIG_GPIO_SCH is not set
# CONFIG_GPIO_SCH311X is not set
#
# I2C GPIO expanders
#
# CONFIG_GPIO_ADP5588 is not set
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_SX150X is not set
# CONFIG_GPIO_TPIC2810 is not set
#
# MFD GPIO expanders
#
#
# PCI GPIO expanders
#
# CONFIG_GPIO_AMD8111 is not set
# CONFIG_GPIO_INTEL_MID is not set
# CONFIG_GPIO_ML_IOH is not set
# CONFIG_GPIO_RDC321X is not set
#
# SPI GPIO expanders
#
# CONFIG_GPIO_MAX7301 is not set
# CONFIG_GPIO_MC33880 is not set
# CONFIG_GPIO_PISOSR is not set
#
# SPI or I2C GPIO expanders
#
# CONFIG_GPIO_MCP23S08 is not set
#
# USB GPIO expanders
#
# CONFIG_GPIO_VIPERBOARD is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_ISP1704 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_BQ24190 is not set
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_BQ25890 is not set
CONFIG_CHARGER_SMB347=m
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
# CONFIG_CHARGER_RT9455 is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
# CONFIG_POWER_AVS is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
# CONFIG_HWMON_DEBUG_CHIP is not set
#
# Native drivers
#
CONFIG_SENSORS_ABITUGURU=m
CONFIG_SENSORS_ABITUGURU3=m
# CONFIG_SENSORS_AD7314 is not set
CONFIG_SENSORS_AD7414=m
CONFIG_SENSORS_AD7418=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1029=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ADT7X10=m
# CONFIG_SENSORS_ADT7310 is not set
CONFIG_SENSORS_ADT7410=m
CONFIG_SENSORS_ADT7411=m
CONFIG_SENSORS_ADT7462=m
CONFIG_SENSORS_ADT7470=m
CONFIG_SENSORS_ADT7475=m
CONFIG_SENSORS_ASC7621=m
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_APPLESMC=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS620=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_DELL_SMM=m
CONFIG_SENSORS_I5K_AMB=m
CONFIG_SENSORS_F71805F=m
CONFIG_SENSORS_F71882FG=m
CONFIG_SENSORS_F75375S=m
CONFIG_SENSORS_FSCHMD=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
CONFIG_SENSORS_G760A=m
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_GPIO_FAN is not set
# CONFIG_SENSORS_HIH6130 is not set
CONFIG_SENSORS_IBMAEM=m
CONFIG_SENSORS_IBMPEX=m
# CONFIG_SENSORS_I5500 is not set
CONFIG_SENSORS_CORETEMP=m
CONFIG_SENSORS_IT87=m
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
CONFIG_SENSORS_LINEAGE=m
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2990 is not set
CONFIG_SENSORS_LTC4151=m
CONFIG_SENSORS_LTC4215=m
# CONFIG_SENSORS_LTC4222 is not set
CONFIG_SENSORS_LTC4245=m
# CONFIG_SENSORS_LTC4260 is not set
CONFIG_SENSORS_LTC4261=m
# CONFIG_SENSORS_MAX1111 is not set
CONFIG_SENSORS_MAX16065=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_MAX1668=m
CONFIG_SENSORS_MAX197=m
# CONFIG_SENSORS_MAX31722 is not set
CONFIG_SENSORS_MAX6639=m
CONFIG_SENSORS_MAX6642=m
CONFIG_SENSORS_MAX6650=m
CONFIG_SENSORS_MAX6697=m
# CONFIG_SENSORS_MAX31790 is not set
CONFIG_SENSORS_MCP3021=m
# CONFIG_SENSORS_ADCXX is not set
CONFIG_SENSORS_LM63=m
# CONFIG_SENSORS_LM70 is not set
CONFIG_SENSORS_LM73=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
CONFIG_SENSORS_LM93=m
CONFIG_SENSORS_LM95234=m
CONFIG_SENSORS_LM95241=m
CONFIG_SENSORS_LM95245=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_PC87427=m
CONFIG_SENSORS_NTC_THERMISTOR=m
# CONFIG_SENSORS_NCT6683 is not set
CONFIG_SENSORS_NCT6775=m
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
CONFIG_SENSORS_PCF8591=m
CONFIG_PMBUS=m
CONFIG_SENSORS_PMBUS=m
CONFIG_SENSORS_ADM1275=m
CONFIG_SENSORS_LM25066=m
CONFIG_SENSORS_LTC2978=m
# CONFIG_SENSORS_LTC3815 is not set
CONFIG_SENSORS_MAX16064=m
# CONFIG_SENSORS_MAX20751 is not set
CONFIG_SENSORS_MAX34440=m
CONFIG_SENSORS_MAX8688=m
# CONFIG_SENSORS_TPS40422 is not set
CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
CONFIG_SENSORS_ZL6100=m
# CONFIG_SENSORS_SHT15 is not set
CONFIG_SENSORS_SHT21=m
# CONFIG_SENSORS_SHTC1 is not set
CONFIG_SENSORS_SIS5595=m
CONFIG_SENSORS_DME1737=m
CONFIG_SENSORS_EMC1403=m
# CONFIG_SENSORS_EMC2103 is not set
CONFIG_SENSORS_EMC6W201=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_SCH56XX_COMMON=m
CONFIG_SENSORS_SCH5627=m
CONFIG_SENSORS_SCH5636=m
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS1015=m
CONFIG_SENSORS_ADS7828=m
# CONFIG_SENSORS_ADS7871 is not set
CONFIG_SENSORS_AMC6821=m
CONFIG_SENSORS_INA209=m
CONFIG_SENSORS_INA2XX=m
# CONFIG_SENSORS_TC74 is not set
CONFIG_SENSORS_THMC50=m
CONFIG_SENSORS_TMP102=m
# CONFIG_SENSORS_TMP103 is not set
CONFIG_SENSORS_TMP401=m
CONFIG_SENSORS_TMP421=m
CONFIG_SENSORS_VIA_CPUTEMP=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83795=m
# CONFIG_SENSORS_W83795_FANCTRL is not set
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83L786NG=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
#
# ACPI drivers
#
CONFIG_SENSORS_ACPI_POWER=m
CONFIG_SENSORS_ATK0110=m
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_BANG_BANG=y
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_EMULATION is not set
CONFIG_INTEL_POWERCLAMP=m
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set
#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
CONFIG_INTEL_PCH_THERMAL=m
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
# CONFIG_WATCHDOG_SYSFS is not set
#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_F71808E_WDT=m
CONFIG_SP5100_TCO=m
CONFIG_SBC_FITPC2_WATCHDOG=m
# CONFIG_EUROTECH_WDT is not set
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=y
CONFIG_IE6XX_WDT=m
CONFIG_ITCO_WDT=y
CONFIG_ITCO_VENDOR_SUPPORT=y
CONFIG_IT8712F_WDT=m
CONFIG_IT87_WDT=m
CONFIG_HP_WATCHDOG=m
CONFIG_HPWDT_NMI_DECODING=y
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
CONFIG_NV_TCO=m
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
CONFIG_SMSC_SCH311X_WDT=m
# CONFIG_SMSC37B787_WDT is not set
CONFIG_VIA_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_INTEL_MEI_WDT is not set
# CONFIG_NI903X_WDT is not set
# CONFIG_MEN_A21_WDT is not set
CONFIG_XEN_WDT=m
#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m
CONFIG_SSB_POSSIBLE=y
#
# Sonics Silicon Backplane
#
CONFIG_SSB=m
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_B43_PCI_BRIDGE is not set
CONFIG_SSB_SDIOHOST_POSSIBLE=y
CONFIG_SSB_SDIOHOST=y
# CONFIG_SSB_DEBUG is not set
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
# CONFIG_SSB_DRIVER_GPIO is not set
CONFIG_BCMA_POSSIBLE=y
#
# Broadcom specific AMBA
#
CONFIG_BCMA=m
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
CONFIG_BCMA_HOST_PCI=y
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
CONFIG_BCMA_DRIVER_GMAC_CMN=y
# CONFIG_BCMA_DRIVER_GPIO is not set
# CONFIG_BCMA_DEBUG is not set
#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_AAT2870_CORE is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_SPI is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_SPI is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_HTC_I2CPLD is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=y
CONFIG_LPC_SCH=m
# CONFIG_INTEL_SOC_PMIC is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_EZX_PCAP is not set
CONFIG_MFD_VIPERBOARD=m
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_UCB1400_CORE is not set
# CONFIG_MFD_RDC321X is not set
CONFIG_MFD_RTSX_PCI=m
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RTSX_USB is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_RN5T618 is not set
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
CONFIG_MFD_SM501=m
# CONFIG_MFD_SM501_GPIO is not set
# CONFIG_MFD_SKY81452 is not set
# CONFIG_MFD_SMSC is not set
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TPS65218 is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65910 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS65912_SPI is not set
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TMIO is not set
CONFIG_MFD_VX855=m
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_ARIZONA_SPI is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM831X_SPI is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
CONFIG_MEDIA_SUPPORT=m
#
# Multimedia core support
#
CONFIG_MEDIA_CAMERA_SUPPORT=y
CONFIG_MEDIA_ANALOG_TV_SUPPORT=y
CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y
CONFIG_MEDIA_RADIO_SUPPORT=y
# CONFIG_MEDIA_SDR_SUPPORT is not set
CONFIG_MEDIA_RC_SUPPORT=y
# CONFIG_MEDIA_CONTROLLER is not set
CONFIG_VIDEO_DEV=m
CONFIG_VIDEO_V4L2=m
# CONFIG_VIDEO_ADV_DEBUG is not set
# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
CONFIG_VIDEO_TUNER=m
CONFIG_VIDEOBUF_GEN=m
CONFIG_VIDEOBUF_DMA_SG=m
CONFIG_VIDEOBUF_VMALLOC=m
CONFIG_VIDEOBUF_DVB=m
CONFIG_VIDEOBUF2_CORE=m
CONFIG_VIDEOBUF2_MEMOPS=m
CONFIG_VIDEOBUF2_VMALLOC=m
CONFIG_VIDEOBUF2_DMA_SG=m
CONFIG_VIDEOBUF2_DVB=m
CONFIG_DVB_CORE=m
CONFIG_DVB_NET=y
CONFIG_TTPCI_EEPROM=m
CONFIG_DVB_MAX_ADAPTERS=8
CONFIG_DVB_DYNAMIC_MINORS=y
#
# Media drivers
#
CONFIG_RC_CORE=m
CONFIG_RC_MAP=m
CONFIG_RC_DECODERS=y
CONFIG_LIRC=m
CONFIG_IR_LIRC_CODEC=m
CONFIG_IR_NEC_DECODER=m
CONFIG_IR_RC5_DECODER=m
CONFIG_IR_RC6_DECODER=m
CONFIG_IR_JVC_DECODER=m
CONFIG_IR_SONY_DECODER=m
CONFIG_IR_SANYO_DECODER=m
CONFIG_IR_SHARP_DECODER=m
CONFIG_IR_MCE_KBD_DECODER=m
CONFIG_IR_XMP_DECODER=m
CONFIG_RC_DEVICES=y
CONFIG_RC_ATI_REMOTE=m
CONFIG_IR_ENE=m
# CONFIG_IR_HIX5HD2 is not set
CONFIG_IR_IMON=m
CONFIG_IR_MCEUSB=m
CONFIG_IR_ITE_CIR=m
CONFIG_IR_FINTEK=m
CONFIG_IR_NUVOTON=m
CONFIG_IR_REDRAT3=m
CONFIG_IR_STREAMZAP=m
CONFIG_IR_WINBOND_CIR=m
# CONFIG_IR_IGORPLUGUSB is not set
CONFIG_IR_IGUANA=m
CONFIG_IR_TTUSBIR=m
# CONFIG_RC_LOOPBACK is not set
CONFIG_IR_GPIO_CIR=m
CONFIG_MEDIA_USB_SUPPORT=y
#
# Webcam devices
#
CONFIG_USB_VIDEO_CLASS=m
CONFIG_USB_VIDEO_CLASS_INPUT_EVDEV=y
CONFIG_USB_GSPCA=m
CONFIG_USB_M5602=m
CONFIG_USB_STV06XX=m
CONFIG_USB_GL860=m
CONFIG_USB_GSPCA_BENQ=m
CONFIG_USB_GSPCA_CONEX=m
CONFIG_USB_GSPCA_CPIA1=m
# CONFIG_USB_GSPCA_DTCS033 is not set
CONFIG_USB_GSPCA_ETOMS=m
CONFIG_USB_GSPCA_FINEPIX=m
CONFIG_USB_GSPCA_JEILINJ=m
CONFIG_USB_GSPCA_JL2005BCD=m
# CONFIG_USB_GSPCA_KINECT is not set
CONFIG_USB_GSPCA_KONICA=m
CONFIG_USB_GSPCA_MARS=m
CONFIG_USB_GSPCA_MR97310A=m
CONFIG_USB_GSPCA_NW80X=m
CONFIG_USB_GSPCA_OV519=m
CONFIG_USB_GSPCA_OV534=m
CONFIG_USB_GSPCA_OV534_9=m
CONFIG_USB_GSPCA_PAC207=m
CONFIG_USB_GSPCA_PAC7302=m
CONFIG_USB_GSPCA_PAC7311=m
CONFIG_USB_GSPCA_SE401=m
CONFIG_USB_GSPCA_SN9C2028=m
CONFIG_USB_GSPCA_SN9C20X=m
CONFIG_USB_GSPCA_SONIXB=m
CONFIG_USB_GSPCA_SONIXJ=m
CONFIG_USB_GSPCA_SPCA500=m
CONFIG_USB_GSPCA_SPCA501=m
CONFIG_USB_GSPCA_SPCA505=m
CONFIG_USB_GSPCA_SPCA506=m
CONFIG_USB_GSPCA_SPCA508=m
CONFIG_USB_GSPCA_SPCA561=m
CONFIG_USB_GSPCA_SPCA1528=m
CONFIG_USB_GSPCA_SQ905=m
CONFIG_USB_GSPCA_SQ905C=m
CONFIG_USB_GSPCA_SQ930X=m
CONFIG_USB_GSPCA_STK014=m
# CONFIG_USB_GSPCA_STK1135 is not set
CONFIG_USB_GSPCA_STV0680=m
CONFIG_USB_GSPCA_SUNPLUS=m
CONFIG_USB_GSPCA_T613=m
CONFIG_USB_GSPCA_TOPRO=m
# CONFIG_USB_GSPCA_TOUPTEK is not set
CONFIG_USB_GSPCA_TV8532=m
CONFIG_USB_GSPCA_VC032X=m
CONFIG_USB_GSPCA_VICAM=m
CONFIG_USB_GSPCA_XIRLINK_CIT=m
CONFIG_USB_GSPCA_ZC3XX=m
CONFIG_USB_PWC=m
# CONFIG_USB_PWC_DEBUG is not set
CONFIG_USB_PWC_INPUT_EVDEV=y
# CONFIG_VIDEO_CPIA2 is not set
CONFIG_USB_ZR364XX=m
CONFIG_USB_STKWEBCAM=m
CONFIG_USB_S2255=m
# CONFIG_VIDEO_USBTV is not set
#
# Analog TV USB devices
#
CONFIG_VIDEO_PVRUSB2=m
CONFIG_VIDEO_PVRUSB2_SYSFS=y
CONFIG_VIDEO_PVRUSB2_DVB=y
# CONFIG_VIDEO_PVRUSB2_DEBUGIFC is not set
CONFIG_VIDEO_HDPVR=m
CONFIG_VIDEO_USBVISION=m
# CONFIG_VIDEO_STK1160_COMMON is not set
# CONFIG_VIDEO_GO7007 is not set
#
# Analog/digital TV USB devices
#
CONFIG_VIDEO_AU0828=m
CONFIG_VIDEO_AU0828_V4L2=y
# CONFIG_VIDEO_AU0828_RC is not set
CONFIG_VIDEO_CX231XX=m
CONFIG_VIDEO_CX231XX_RC=y
CONFIG_VIDEO_CX231XX_ALSA=m
CONFIG_VIDEO_CX231XX_DVB=m
CONFIG_VIDEO_TM6000=m
CONFIG_VIDEO_TM6000_ALSA=m
CONFIG_VIDEO_TM6000_DVB=m
#
# Digital TV USB devices
#
CONFIG_DVB_USB=m
# CONFIG_DVB_USB_DEBUG is not set
CONFIG_DVB_USB_A800=m
CONFIG_DVB_USB_DIBUSB_MB=m
# CONFIG_DVB_USB_DIBUSB_MB_FAULTY is not set
CONFIG_DVB_USB_DIBUSB_MC=m
CONFIG_DVB_USB_DIB0700=m
CONFIG_DVB_USB_UMT_010=m
CONFIG_DVB_USB_CXUSB=m
CONFIG_DVB_USB_M920X=m
CONFIG_DVB_USB_DIGITV=m
CONFIG_DVB_USB_VP7045=m
CONFIG_DVB_USB_VP702X=m
CONFIG_DVB_USB_GP8PSK=m
CONFIG_DVB_USB_NOVA_T_USB2=m
CONFIG_DVB_USB_TTUSB2=m
CONFIG_DVB_USB_DTT200U=m
CONFIG_DVB_USB_OPERA1=m
CONFIG_DVB_USB_AF9005=m
CONFIG_DVB_USB_AF9005_REMOTE=m
CONFIG_DVB_USB_PCTV452E=m
CONFIG_DVB_USB_DW2102=m
CONFIG_DVB_USB_CINERGY_T2=m
CONFIG_DVB_USB_DTV5100=m
CONFIG_DVB_USB_FRIIO=m
CONFIG_DVB_USB_AZ6027=m
CONFIG_DVB_USB_TECHNISAT_USB2=m
CONFIG_DVB_USB_V2=m
CONFIG_DVB_USB_AF9015=m
CONFIG_DVB_USB_AF9035=m
CONFIG_DVB_USB_ANYSEE=m
CONFIG_DVB_USB_AU6610=m
CONFIG_DVB_USB_AZ6007=m
CONFIG_DVB_USB_CE6230=m
CONFIG_DVB_USB_EC168=m
CONFIG_DVB_USB_GL861=m
CONFIG_DVB_USB_LME2510=m
CONFIG_DVB_USB_MXL111SF=m
CONFIG_DVB_USB_RTL28XXU=m
# CONFIG_DVB_USB_DVBSKY is not set
CONFIG_DVB_TTUSB_BUDGET=m
CONFIG_DVB_TTUSB_DEC=m
CONFIG_SMS_USB_DRV=m
CONFIG_DVB_B2C2_FLEXCOP_USB=m
# CONFIG_DVB_B2C2_FLEXCOP_USB_DEBUG is not set
# CONFIG_DVB_AS102 is not set
#
# Webcam, TV (analog/digital) USB devices
#
CONFIG_VIDEO_EM28XX=m
# CONFIG_VIDEO_EM28XX_V4L2 is not set
CONFIG_VIDEO_EM28XX_ALSA=m
CONFIG_VIDEO_EM28XX_DVB=m
CONFIG_VIDEO_EM28XX_RC=m
CONFIG_MEDIA_PCI_SUPPORT=y
#
# Media capture support
#
# CONFIG_VIDEO_MEYE is not set
# CONFIG_VIDEO_SOLO6X10 is not set
# CONFIG_VIDEO_TW68 is not set
# CONFIG_VIDEO_TW686X is not set
# CONFIG_VIDEO_ZORAN is not set
#
# Media capture/analog TV support
#
CONFIG_VIDEO_IVTV=m
# CONFIG_VIDEO_IVTV_ALSA is not set
CONFIG_VIDEO_FB_IVTV=m
# CONFIG_VIDEO_HEXIUM_GEMINI is not set
# CONFIG_VIDEO_HEXIUM_ORION is not set
# CONFIG_VIDEO_MXB is not set
# CONFIG_VIDEO_DT3155 is not set
#
# Media capture/analog/hybrid TV support
#
CONFIG_VIDEO_CX18=m
CONFIG_VIDEO_CX18_ALSA=m
CONFIG_VIDEO_CX23885=m
CONFIG_MEDIA_ALTERA_CI=m
# CONFIG_VIDEO_CX25821 is not set
CONFIG_VIDEO_CX88=m
CONFIG_VIDEO_CX88_ALSA=m
CONFIG_VIDEO_CX88_BLACKBIRD=m
CONFIG_VIDEO_CX88_DVB=m
CONFIG_VIDEO_CX88_ENABLE_VP3054=y
CONFIG_VIDEO_CX88_VP3054=m
CONFIG_VIDEO_CX88_MPEG=m
CONFIG_VIDEO_BT848=m
CONFIG_DVB_BT8XX=m
CONFIG_VIDEO_SAA7134=m
CONFIG_VIDEO_SAA7134_ALSA=m
CONFIG_VIDEO_SAA7134_RC=y
CONFIG_VIDEO_SAA7134_DVB=m
CONFIG_VIDEO_SAA7164=m
#
# Media digital TV PCI Adapters
#
CONFIG_DVB_AV7110_IR=y
CONFIG_DVB_AV7110=m
CONFIG_DVB_AV7110_OSD=y
CONFIG_DVB_BUDGET_CORE=m
CONFIG_DVB_BUDGET=m
CONFIG_DVB_BUDGET_CI=m
CONFIG_DVB_BUDGET_AV=m
CONFIG_DVB_BUDGET_PATCH=m
CONFIG_DVB_B2C2_FLEXCOP_PCI=m
# CONFIG_DVB_B2C2_FLEXCOP_PCI_DEBUG is not set
CONFIG_DVB_PLUTO2=m
CONFIG_DVB_DM1105=m
CONFIG_DVB_PT1=m
# CONFIG_DVB_PT3 is not set
CONFIG_MANTIS_CORE=m
CONFIG_DVB_MANTIS=m
CONFIG_DVB_HOPPER=m
CONFIG_DVB_NGENE=m
CONFIG_DVB_DDBRIDGE=m
# CONFIG_DVB_SMIPCIE is not set
# CONFIG_DVB_NETUP_UNIDVB is not set
# CONFIG_V4L_PLATFORM_DRIVERS is not set
# CONFIG_V4L_MEM2MEM_DRIVERS is not set
# CONFIG_V4L_TEST_DRIVERS is not set
# CONFIG_DVB_PLATFORM_DRIVERS is not set
#
# Supported MMC/SDIO adapters
#
CONFIG_SMS_SDIO_DRV=m
CONFIG_RADIO_ADAPTERS=y
CONFIG_RADIO_TEA575X=m
# CONFIG_RADIO_SI470X is not set
# CONFIG_RADIO_SI4713 is not set
# CONFIG_USB_MR800 is not set
# CONFIG_USB_DSBR is not set
# CONFIG_RADIO_MAXIRADIO is not set
# CONFIG_RADIO_SHARK is not set
# CONFIG_RADIO_SHARK2 is not set
# CONFIG_USB_KEENE is not set
# CONFIG_USB_RAREMONO is not set
# CONFIG_USB_MA901 is not set
# CONFIG_RADIO_TEA5764 is not set
# CONFIG_RADIO_SAA7706H is not set
# CONFIG_RADIO_TEF6862 is not set
# CONFIG_RADIO_WL1273 is not set
#
# Texas Instruments WL128x FM driver (ST based)
#
#
# Supported FireWire (IEEE 1394) Adapters
#
CONFIG_DVB_FIREDTV=m
CONFIG_DVB_FIREDTV_INPUT=y
CONFIG_MEDIA_COMMON_OPTIONS=y
#
# common driver options
#
CONFIG_VIDEO_CX2341X=m
CONFIG_VIDEO_TVEEPROM=m
CONFIG_CYPRESS_FIRMWARE=m
CONFIG_DVB_B2C2_FLEXCOP=m
CONFIG_VIDEO_SAA7146=m
CONFIG_VIDEO_SAA7146_VV=m
CONFIG_SMS_SIANO_MDTV=m
CONFIG_SMS_SIANO_RC=y
# CONFIG_SMS_SIANO_DEBUGFS is not set
#
# Media ancillary drivers (tuners, sensors, i2c, frontends)
#
CONFIG_MEDIA_SUBDRV_AUTOSELECT=y
CONFIG_MEDIA_ATTACH=y
CONFIG_VIDEO_IR_I2C=m
#
# Audio decoders, processors and mixers
#
CONFIG_VIDEO_TVAUDIO=m
CONFIG_VIDEO_TDA7432=m
CONFIG_VIDEO_MSP3400=m
CONFIG_VIDEO_CS3308=m
CONFIG_VIDEO_CS5345=m
CONFIG_VIDEO_CS53L32A=m
CONFIG_VIDEO_WM8775=m
CONFIG_VIDEO_WM8739=m
CONFIG_VIDEO_VP27SMPX=m
#
# RDS decoders
#
CONFIG_VIDEO_SAA6588=m
#
# Video decoders
#
CONFIG_VIDEO_SAA711X=m
#
# Video and audio decoders
#
CONFIG_VIDEO_SAA717X=m
CONFIG_VIDEO_CX25840=m
#
# Video encoders
#
CONFIG_VIDEO_SAA7127=m
#
# Camera sensor devices
#
#
# Flash devices
#
#
# Video improvement chips
#
CONFIG_VIDEO_UPD64031A=m
CONFIG_VIDEO_UPD64083=m
#
# Audio/Video compression chips
#
CONFIG_VIDEO_SAA6752HS=m
#
# Miscellaneous helper chips
#
CONFIG_VIDEO_M52790=m
#
# Sensors used on soc_camera driver
#
CONFIG_MEDIA_TUNER=m
CONFIG_MEDIA_TUNER_SIMPLE=m
CONFIG_MEDIA_TUNER_TDA8290=m
CONFIG_MEDIA_TUNER_TDA827X=m
CONFIG_MEDIA_TUNER_TDA18271=m
CONFIG_MEDIA_TUNER_TDA9887=m
CONFIG_MEDIA_TUNER_TEA5761=m
CONFIG_MEDIA_TUNER_TEA5767=m
CONFIG_MEDIA_TUNER_MT20XX=m
CONFIG_MEDIA_TUNER_MT2060=m
CONFIG_MEDIA_TUNER_MT2063=m
CONFIG_MEDIA_TUNER_MT2266=m
CONFIG_MEDIA_TUNER_MT2131=m
CONFIG_MEDIA_TUNER_QT1010=m
CONFIG_MEDIA_TUNER_XC2028=m
CONFIG_MEDIA_TUNER_XC5000=m
CONFIG_MEDIA_TUNER_XC4000=m
CONFIG_MEDIA_TUNER_MXL5005S=m
CONFIG_MEDIA_TUNER_MXL5007T=m
CONFIG_MEDIA_TUNER_MC44S803=m
CONFIG_MEDIA_TUNER_MAX2165=m
CONFIG_MEDIA_TUNER_TDA18218=m
CONFIG_MEDIA_TUNER_FC0011=m
CONFIG_MEDIA_TUNER_FC0012=m
CONFIG_MEDIA_TUNER_FC0013=m
CONFIG_MEDIA_TUNER_TDA18212=m
CONFIG_MEDIA_TUNER_E4000=m
CONFIG_MEDIA_TUNER_FC2580=m
CONFIG_MEDIA_TUNER_M88RS6000T=m
CONFIG_MEDIA_TUNER_TUA9001=m
CONFIG_MEDIA_TUNER_SI2157=m
CONFIG_MEDIA_TUNER_IT913X=m
CONFIG_MEDIA_TUNER_R820T=m
CONFIG_MEDIA_TUNER_QM1D1C0042=m
#
# Multistandard (satellite) frontends
#
CONFIG_DVB_STB0899=m
CONFIG_DVB_STB6100=m
CONFIG_DVB_STV090x=m
CONFIG_DVB_STV6110x=m
CONFIG_DVB_M88DS3103=m
#
# Multistandard (cable + terrestrial) frontends
#
CONFIG_DVB_DRXK=m
CONFIG_DVB_TDA18271C2DD=m
CONFIG_DVB_SI2165=m
#
# DVB-S (satellite) frontends
#
CONFIG_DVB_CX24110=m
CONFIG_DVB_CX24123=m
CONFIG_DVB_MT312=m
CONFIG_DVB_ZL10036=m
CONFIG_DVB_ZL10039=m
CONFIG_DVB_S5H1420=m
CONFIG_DVB_STV0288=m
CONFIG_DVB_STB6000=m
CONFIG_DVB_STV0299=m
CONFIG_DVB_STV6110=m
CONFIG_DVB_STV0900=m
CONFIG_DVB_TDA8083=m
CONFIG_DVB_TDA10086=m
CONFIG_DVB_TDA8261=m
CONFIG_DVB_VES1X93=m
CONFIG_DVB_TUNER_ITD1000=m
CONFIG_DVB_TUNER_CX24113=m
CONFIG_DVB_TDA826X=m
CONFIG_DVB_TUA6100=m
CONFIG_DVB_CX24116=m
CONFIG_DVB_CX24117=m
CONFIG_DVB_CX24120=m
CONFIG_DVB_SI21XX=m
CONFIG_DVB_TS2020=m
CONFIG_DVB_DS3000=m
CONFIG_DVB_MB86A16=m
CONFIG_DVB_TDA10071=m
#
# DVB-T (terrestrial) frontends
#
CONFIG_DVB_SP8870=m
CONFIG_DVB_SP887X=m
CONFIG_DVB_CX22700=m
CONFIG_DVB_CX22702=m
CONFIG_DVB_DRXD=m
CONFIG_DVB_L64781=m
CONFIG_DVB_TDA1004X=m
CONFIG_DVB_NXT6000=m
CONFIG_DVB_MT352=m
CONFIG_DVB_ZL10353=m
CONFIG_DVB_DIB3000MB=m
CONFIG_DVB_DIB3000MC=m
CONFIG_DVB_DIB7000M=m
CONFIG_DVB_DIB7000P=m
CONFIG_DVB_TDA10048=m
CONFIG_DVB_AF9013=m
CONFIG_DVB_EC100=m
CONFIG_DVB_STV0367=m
CONFIG_DVB_CXD2820R=m
CONFIG_DVB_RTL2830=m
CONFIG_DVB_RTL2832=m
CONFIG_DVB_SI2168=m
# CONFIG_DVB_AS102_FE is not set
#
# DVB-C (cable) frontends
#
CONFIG_DVB_VES1820=m
CONFIG_DVB_TDA10021=m
CONFIG_DVB_TDA10023=m
CONFIG_DVB_STV0297=m
#
# ATSC (North American/Korean Terrestrial/Cable DTV) frontends
#
CONFIG_DVB_NXT200X=m
CONFIG_DVB_OR51211=m
CONFIG_DVB_OR51132=m
CONFIG_DVB_BCM3510=m
CONFIG_DVB_LGDT330X=m
CONFIG_DVB_LGDT3305=m
CONFIG_DVB_LGDT3306A=m
CONFIG_DVB_LG2160=m
CONFIG_DVB_S5H1409=m
CONFIG_DVB_AU8522=m
CONFIG_DVB_AU8522_DTV=m
CONFIG_DVB_AU8522_V4L=m
CONFIG_DVB_S5H1411=m
#
# ISDB-T (terrestrial) frontends
#
CONFIG_DVB_S921=m
CONFIG_DVB_DIB8000=m
CONFIG_DVB_MB86A20S=m
#
# ISDB-S (satellite) & ISDB-T (terrestrial) frontends
#
CONFIG_DVB_TC90522=m
#
# Digital terrestrial only tuners/PLL
#
CONFIG_DVB_PLL=m
CONFIG_DVB_TUNER_DIB0070=m
CONFIG_DVB_TUNER_DIB0090=m
#
# SEC control devices for DVB-S
#
CONFIG_DVB_DRX39XYJ=m
CONFIG_DVB_LNBP21=m
CONFIG_DVB_LNBP22=m
CONFIG_DVB_ISL6405=m
CONFIG_DVB_ISL6421=m
CONFIG_DVB_ISL6423=m
CONFIG_DVB_A8293=m
CONFIG_DVB_LGS8GXX=m
CONFIG_DVB_ATBM8830=m
CONFIG_DVB_TDA665x=m
CONFIG_DVB_IX2505V=m
CONFIG_DVB_M88RS2000=m
CONFIG_DVB_AF9033=m
#
# Tools to develop new frontends
#
# CONFIG_DVB_DUMMY_FE is not set
#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=64
CONFIG_VGA_SWITCHEROO=y
CONFIG_DRM=m
CONFIG_DRM_MIPI_DSI=y
# CONFIG_DRM_DP_AUX_CHARDEV is not set
CONFIG_DRM_KMS_HELPER=m
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_TTM=m
#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_ADV7511 is not set
CONFIG_DRM_I2C_CH7006=m
CONFIG_DRM_I2C_SIL164=m
CONFIG_DRM_I2C_NXP_TDA998X=m
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set
#
# ACP (Audio CoProcessor) Configuration
#
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I810 is not set
CONFIG_DRM_I915=m
# CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT is not set
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
# CONFIG_DRM_VGEM is not set
CONFIG_DRM_VMWGFX=m
CONFIG_DRM_VMWGFX_FBCON=y
CONFIG_DRM_GMA500=m
CONFIG_DRM_GMA600=y
CONFIG_DRM_GMA3600=y
CONFIG_DRM_UDL=m
CONFIG_DRM_AST=m
CONFIG_DRM_MGAG200=m
CONFIG_DRM_CIRRUS_QEMU=m
CONFIG_DRM_QXL=m
# CONFIG_DRM_BOCHS is not set
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_PANEL=y
#
# Display Panels
#
CONFIG_DRM_BRIDGE=y
#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
# CONFIG_FB_DDC is not set
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
# CONFIG_FB_MODE_HELPERS is not set
CONFIG_FB_TILEBLITTING=y
#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SM501 is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_XEN_FBDEV_FRONTEND is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
CONFIG_FB_HYPERV=m
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SM712 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
# CONFIG_LCD_L4F00242T03 is not set
# CONFIG_LCD_LMS283GF05 is not set
# CONFIG_LCD_LTV350QV is not set
# CONFIG_LCD_ILI922X is not set
# CONFIG_LCD_ILI9320 is not set
# CONFIG_LCD_TDO24M is not set
# CONFIG_LCD_VGG2432A4 is not set
CONFIG_LCD_PLATFORM=m
# CONFIG_LCD_S6E63M0 is not set
# CONFIG_LCD_LD9040 is not set
# CONFIG_LCD_AMS369FG06 is not set
# CONFIG_LCD_LMS501KF03 is not set
# CONFIG_LCD_HX8357 is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_GENERIC is not set
# CONFIG_BACKLIGHT_PWM is not set
CONFIG_BACKLIGHT_APPLE=m
# CONFIG_BACKLIGHT_PM8941_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3630A is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LP855X is not set
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_VGASTATE is not set
CONFIG_HDMI=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
CONFIG_SOUND=m
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_JACK=y
CONFIG_SND_JACK_INPUT_DEV=y
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
# CONFIG_SND_MIXER_OSS is not set
# CONFIG_SND_PCM_OSS is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_HRTIMER=m
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_RAWMIDI_SEQ=m
CONFIG_SND_OPL3_LIB_SEQ=m
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
CONFIG_SND_EMU10K1_SEQ=m
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_VX_LIB=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DRIVERS=y
CONFIG_SND_PCSP=m
CONFIG_SND_DUMMY=m
CONFIG_SND_ALOOP=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
# CONFIG_SND_MTS64 is not set
# CONFIG_SND_SERIAL_U16550 is not set
CONFIG_SND_MPU401=m
# CONFIG_SND_PORTMAN2X4 is not set
CONFIG_SND_AC97_POWER_SAVE=y
CONFIG_SND_AC97_POWER_SAVE_DEFAULT=5
CONFIG_SND_PCI=y
CONFIG_SND_AD1889=m
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
CONFIG_SND_ALI5451=m
CONFIG_SND_ASIHPI=m
CONFIG_SND_ATIIXP=m
CONFIG_SND_ATIIXP_MODEM=m
CONFIG_SND_AU8810=m
CONFIG_SND_AU8820=m
CONFIG_SND_AU8830=m
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
CONFIG_SND_BT87X=m
# CONFIG_SND_BT87X_OVERCLOCK is not set
CONFIG_SND_CA0106=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_OXYGEN_LIB=m
CONFIG_SND_OXYGEN=m
# CONFIG_SND_CS4281 is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
CONFIG_SND_CTXFI=m
CONFIG_SND_DARLA20=m
CONFIG_SND_GINA20=m
CONFIG_SND_LAYLA20=m
CONFIG_SND_DARLA24=m
CONFIG_SND_GINA24=m
CONFIG_SND_LAYLA24=m
CONFIG_SND_MONA=m
CONFIG_SND_MIA=m
CONFIG_SND_ECHO3G=m
CONFIG_SND_INDIGO=m
CONFIG_SND_INDIGOIO=m
CONFIG_SND_INDIGODJ=m
CONFIG_SND_INDIGOIOX=m
CONFIG_SND_INDIGODJX=m
CONFIG_SND_EMU10K1=m
CONFIG_SND_EMU10K1X=m
CONFIG_SND_ENS1370=m
CONFIG_SND_ENS1371=m
# CONFIG_SND_ES1938 is not set
CONFIG_SND_ES1968=m
CONFIG_SND_ES1968_INPUT=y
CONFIG_SND_ES1968_RADIO=y
# CONFIG_SND_FM801 is not set
CONFIG_SND_HDSP=m
CONFIG_SND_HDSPM=m
CONFIG_SND_ICE1712=m
CONFIG_SND_ICE1724=m
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
CONFIG_SND_KORG1212=m
CONFIG_SND_LOLA=m
CONFIG_SND_LX6464ES=m
CONFIG_SND_MAESTRO3=m
CONFIG_SND_MAESTRO3_INPUT=y
CONFIG_SND_MIXART=m
# CONFIG_SND_NM256 is not set
CONFIG_SND_PCXHR=m
# CONFIG_SND_RIPTIDE is not set
CONFIG_SND_RME32=m
CONFIG_SND_RME96=m
CONFIG_SND_RME9652=m
# CONFIG_SND_SONICVIBES is not set
CONFIG_SND_TRIDENT=m
CONFIG_SND_VIA82XX=m
CONFIG_SND_VIA82XX_MODEM=m
CONFIG_SND_VIRTUOSO=m
CONFIG_SND_VX222=m
# CONFIG_SND_YMFPCI is not set
#
# HD-Audio
#
CONFIG_SND_HDA=m
CONFIG_SND_HDA_INTEL=m
CONFIG_SND_HDA_HWDEP=y
# CONFIG_SND_HDA_RECONFIG is not set
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_INPUT_BEEP_MODE=0
# CONFIG_SND_HDA_PATCH_LOADER is not set
CONFIG_SND_HDA_CODEC_REALTEK=m
CONFIG_SND_HDA_CODEC_ANALOG=m
CONFIG_SND_HDA_CODEC_SIGMATEL=m
CONFIG_SND_HDA_CODEC_VIA=m
CONFIG_SND_HDA_CODEC_HDMI=m
CONFIG_SND_HDA_CODEC_CIRRUS=m
CONFIG_SND_HDA_CODEC_CONEXANT=m
CONFIG_SND_HDA_CODEC_CA0110=m
CONFIG_SND_HDA_CODEC_CA0132=m
CONFIG_SND_HDA_CODEC_CA0132_DSP=y
CONFIG_SND_HDA_CODEC_CMEDIA=m
CONFIG_SND_HDA_CODEC_SI3054=m
CONFIG_SND_HDA_GENERIC=m
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_SND_HDA_CORE=m
CONFIG_SND_HDA_DSP_LOADER=y
CONFIG_SND_HDA_I915=y
CONFIG_SND_HDA_PREALLOC_SIZE=512
CONFIG_SND_SPI=y
CONFIG_SND_USB=y
CONFIG_SND_USB_AUDIO=m
CONFIG_SND_USB_UA101=m
CONFIG_SND_USB_USX2Y=m
CONFIG_SND_USB_CAIAQ=m
CONFIG_SND_USB_CAIAQ_INPUT=y
CONFIG_SND_USB_US122L=m
CONFIG_SND_USB_6FIRE=m
# CONFIG_SND_USB_HIFACE is not set
# CONFIG_SND_BCD2000 is not set
# CONFIG_SND_USB_POD is not set
# CONFIG_SND_USB_PODHD is not set
# CONFIG_SND_USB_TONEPORT is not set
# CONFIG_SND_USB_VARIAX is not set
CONFIG_SND_FIREWIRE=y
CONFIG_SND_FIREWIRE_LIB=m
# CONFIG_SND_DICE is not set
# CONFIG_SND_OXFW is not set
CONFIG_SND_ISIGHT=m
# CONFIG_SND_FIREWORKS is not set
# CONFIG_SND_BEBOB is not set
# CONFIG_SND_FIREWIRE_DIGI00X is not set
# CONFIG_SND_FIREWIRE_TASCAM is not set
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m
#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
CONFIG_UHID=m
CONFIG_HID_GENERIC=y
#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
CONFIG_HID_ACRUX=m
# CONFIG_HID_ACRUX_FF is not set
CONFIG_HID_APPLE=y
CONFIG_HID_APPLEIR=m
# CONFIG_HID_ASUS is not set
CONFIG_HID_AUREAL=m
CONFIG_HID_BELKIN=y
# CONFIG_HID_BETOP_FF is not set
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
# CONFIG_HID_CORSAIR is not set
CONFIG_HID_PRODIKEYS=m
# CONFIG_HID_CMEDIA is not set
# CONFIG_HID_CP2112 is not set
CONFIG_HID_CYPRESS=y
CONFIG_HID_DRAGONRISE=m
# CONFIG_DRAGONRISE_FF is not set
# CONFIG_HID_EMS_FF is not set
CONFIG_HID_ELECOM=m
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
CONFIG_HID_HOLTEK=m
# CONFIG_HOLTEK_FF is not set
# CONFIG_HID_GT683R is not set
CONFIG_HID_KEYTOUCH=m
CONFIG_HID_KYE=m
CONFIG_HID_UCLOGIC=m
CONFIG_HID_WALTOP=m
CONFIG_HID_GYRATION=m
CONFIG_HID_ICADE=m
CONFIG_HID_TWINHAN=m
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LCPOWER=m
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
CONFIG_HID_LOGITECH_DJ=m
CONFIG_HID_LOGITECH_HIDPP=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
# CONFIG_LOGIWHEELS_FF is not set
CONFIG_HID_MAGICMOUSE=y
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_MULTITOUCH=m
CONFIG_HID_NTRIG=y
CONFIG_HID_ORTEK=m
CONFIG_HID_PANTHERLORD=m
# CONFIG_PANTHERLORD_FF is not set
# CONFIG_HID_PENMOUNT is not set
CONFIG_HID_PETALYNX=m
CONFIG_HID_PICOLCD=m
CONFIG_HID_PICOLCD_FB=y
CONFIG_HID_PICOLCD_BACKLIGHT=y
CONFIG_HID_PICOLCD_LCD=y
CONFIG_HID_PICOLCD_LEDS=y
CONFIG_HID_PICOLCD_CIR=y
CONFIG_HID_PLANTRONICS=y
CONFIG_HID_PRIMAX=m
CONFIG_HID_ROCCAT=m
CONFIG_HID_SAITEK=m
CONFIG_HID_SAMSUNG=m
CONFIG_HID_SONY=m
# CONFIG_SONY_FF is not set
CONFIG_HID_SPEEDLINK=m
CONFIG_HID_STEELSERIES=m
CONFIG_HID_SUNPLUS=m
# CONFIG_HID_RMI is not set
CONFIG_HID_GREENASIA=m
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_HYPERV_MOUSE=m
CONFIG_HID_SMARTJOYPLUS=m
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TIVO=m
CONFIG_HID_TOPSEED=m
CONFIG_HID_THINGM=m
CONFIG_HID_THRUSTMASTER=m
# CONFIG_THRUSTMASTER_FF is not set
CONFIG_HID_WACOM=m
CONFIG_HID_WIIMOTE=m
# CONFIG_HID_XINMO is not set
CONFIG_HID_ZEROPLUS=m
# CONFIG_ZEROPLUS_FF is not set
CONFIG_HID_ZYDACRON=m
# CONFIG_HID_SENSOR_HUB is not set
#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y
#
# I2C HID support
#
CONFIG_I2C_HID=m
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_ULPI_BUS is not set
CONFIG_USB_MON=y
CONFIG_USB_WUSB=m
CONFIG_USB_WUSB_CBAF=m
# CONFIG_USB_WUSB_CBAF_DEBUG is not set
#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PCI=y
CONFIG_USB_XHCI_PLATFORM=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
# CONFIG_USB_MAX3421_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_U132_HCD is not set
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
CONFIG_USB_HWA_HCD=m
# CONFIG_USB_HCD_BCMA is not set
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_HCD_TEST_MODE is not set
#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m
CONFIG_USB_WDM=m
CONFIG_USB_TMC=m
#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#
#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_REALTEK=m
CONFIG_REALTEK_AUTOPM=y
CONFIG_USB_STORAGE_DATAFAB=m
CONFIG_USB_STORAGE_FREECOM=m
CONFIG_USB_STORAGE_ISD200=m
CONFIG_USB_STORAGE_USBAT=m
CONFIG_USB_STORAGE_SDDR09=m
CONFIG_USB_STORAGE_SDDR55=m
CONFIG_USB_STORAGE_JUMPSHOT=m
CONFIG_USB_STORAGE_ALAUDA=m
CONFIG_USB_STORAGE_ONETOUCH=m
CONFIG_USB_STORAGE_KARMA=m
CONFIG_USB_STORAGE_CYPRESS_ATACB=m
CONFIG_USB_STORAGE_ENE_UB6250=m
# CONFIG_USB_UAS is not set
#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
CONFIG_USB_DWC3=y
# CONFIG_USB_DWC3_HOST is not set
CONFIG_USB_DWC3_GADGET=y
# CONFIG_USB_DWC3_DUAL_ROLE is not set
#
# Platform Glue Driver Support
#
CONFIG_USB_DWC3_PCI=y
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set
#
# USB port drivers
#
CONFIG_USB_USS720=m
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
CONFIG_USB_SERIAL_AIRCABLE=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
CONFIG_USB_SERIAL_CH341=m
CONFIG_USB_SERIAL_WHITEHEAT=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP210X=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
# CONFIG_USB_SERIAL_F81232 is not set
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_IUU=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
# CONFIG_USB_SERIAL_METRO is not set
CONFIG_USB_SERIAL_MOS7720=m
CONFIG_USB_SERIAL_MOS7715_PARPORT=y
CONFIG_USB_SERIAL_MOS7840=m
# CONFIG_USB_SERIAL_MXUPORT is not set
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
CONFIG_USB_SERIAL_OTI6858=m
CONFIG_USB_SERIAL_QCAUX=m
CONFIG_USB_SERIAL_QUALCOMM=m
CONFIG_USB_SERIAL_SPCP8X5=m
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_SYMBOL=m
# CONFIG_USB_SERIAL_TI is not set
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_WWAN=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_SERIAL_OPTICON=m
CONFIG_USB_SERIAL_XSENS_MT=m
# CONFIG_USB_SERIAL_WISHBONE is not set
CONFIG_USB_SERIAL_SSU100=m
CONFIG_USB_SERIAL_QT2=m
CONFIG_USB_SERIAL_DEBUG=m
#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
CONFIG_USB_EMI26=m
CONFIG_USB_ADUTUX=m
CONFIG_USB_SEVSEG=m
# CONFIG_USB_RIO500 is not set
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
CONFIG_USB_LED=m
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
CONFIG_USB_IDMOUSE=m
CONFIG_USB_FTDI_ELAN=m
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
# CONFIG_USB_TRANCEVIBRATOR is not set
CONFIG_USB_IOWARRIOR=m
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
CONFIG_USB_ISIGHTFW=m
# CONFIG_USB_YUREX is not set
CONFIG_USB_EZUSB_FX2=m
CONFIG_USB_HSIC_USB3503=m
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set
# CONFIG_UCSI is not set
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
CONFIG_USB_CXACRU=m
CONFIG_USB_UEAGLEATM=m
CONFIG_USB_XUSBATM=m
#
# USB Physical Layer drivers
#
CONFIG_USB_PHY=y
CONFIG_NOP_USB_XCEIV=y
# CONFIG_USB_GPIO_VBUS is not set
# CONFIG_USB_ISP1301 is not set
CONFIG_USB_GADGET=y
# CONFIG_USB_GADGET_DEBUG is not set
# CONFIG_USB_GADGET_DEBUG_FILES is not set
# CONFIG_USB_GADGET_DEBUG_FS is not set
CONFIG_USB_GADGET_VBUS_DRAW=2
CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2
#
# USB Peripheral Controller
#
# CONFIG_USB_FOTG210_UDC is not set
# CONFIG_USB_GR_UDC is not set
# CONFIG_USB_R8A66597 is not set
# CONFIG_USB_PXA27X is not set
# CONFIG_USB_MV_UDC is not set
# CONFIG_USB_MV_U3D is not set
# CONFIG_USB_M66592 is not set
# CONFIG_USB_BDC_UDC is not set
# CONFIG_USB_AMD5536UDC is not set
# CONFIG_USB_NET2272 is not set
# CONFIG_USB_NET2280 is not set
# CONFIG_USB_GOKU is not set
# CONFIG_USB_EG20T is not set
# CONFIG_USB_DUMMY_HCD is not set
CONFIG_USB_LIBCOMPOSITE=m
CONFIG_USB_F_MASS_STORAGE=m
# CONFIG_USB_CONFIGFS is not set
# CONFIG_USB_ZERO is not set
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_ETH is not set
# CONFIG_USB_G_NCM is not set
# CONFIG_USB_GADGETFS is not set
# CONFIG_USB_FUNCTIONFS is not set
CONFIG_USB_MASS_STORAGE=m
# CONFIG_USB_GADGET_TARGET is not set
# CONFIG_USB_G_SERIAL is not set
# CONFIG_USB_MIDI_GADGET is not set
# CONFIG_USB_G_PRINTER is not set
# CONFIG_USB_CDC_COMPOSITE is not set
# CONFIG_USB_G_ACM_MS is not set
# CONFIG_USB_G_MULTI is not set
# CONFIG_USB_G_HID is not set
# CONFIG_USB_G_DBGP is not set
# CONFIG_USB_G_WEBCAM is not set
# CONFIG_USB_LED_TRIG is not set
CONFIG_UWB=m
CONFIG_UWB_HWA=m
CONFIG_UWB_WHCI=m
CONFIG_UWB_I1480U=m
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
#
# MMC/SD/SDIO Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_BLOCK_BOUNCE=y
CONFIG_SDIO_UART=m
# CONFIG_MMC_TEST is not set
#
# MMC/SD/SDIO Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
CONFIG_MMC_SDHCI_PCI=m
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=m
CONFIG_MMC_SDHCI_PLTFM=m
# CONFIG_MMC_WBSD is not set
CONFIG_MMC_TIFM_SD=m
# CONFIG_MMC_SPI is not set
CONFIG_MMC_CB710=m
CONFIG_MMC_VIA_SDMMC=m
CONFIG_MMC_VUB300=m
CONFIG_MMC_USHC=m
# CONFIG_MMC_USDHI6ROL0 is not set
CONFIG_MMC_REALTEK_PCI=m
# CONFIG_MMC_TOSHIBA_PCI is not set
# CONFIG_MMC_MTK is not set
CONFIG_MEMSTICK=m
# CONFIG_MEMSTICK_DEBUG is not set
#
# MemoryStick drivers
#
# CONFIG_MEMSTICK_UNSAFE_RESUME is not set
CONFIG_MSPRO_BLOCK=m
# CONFIG_MS_BLOCK is not set
#
# MemoryStick Host Controller Drivers
#
CONFIG_MEMSTICK_TIFM_MS=m
CONFIG_MEMSTICK_JMICRON_38X=m
CONFIG_MEMSTICK_R592=m
CONFIG_MEMSTICK_REALTEK_PCI=m
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
#
# LED drivers
#
CONFIG_LEDS_LM3530=m
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_GPIO is not set
CONFIG_LEDS_LP3944=m
CONFIG_LEDS_LP55XX_COMMON=m
CONFIG_LEDS_LP5521=m
CONFIG_LEDS_LP5523=m
CONFIG_LEDS_LP5562=m
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_LP8860 is not set
CONFIG_LEDS_CLEVO_MAIL=m
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_DAC124S085 is not set
# CONFIG_LEDS_PWM is not set
# CONFIG_LEDS_BD2802 is not set
CONFIG_LEDS_INTEL_SS4200=m
# CONFIG_LEDS_LT3593 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set
#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=m
#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=m
CONFIG_LEDS_TRIGGER_ONESHOT=m
# CONFIG_LEDS_TRIGGER_MTD is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
CONFIG_LEDS_TRIGGER_BACKLIGHT=m
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_GPIO is not set
CONFIG_LEDS_TRIGGER_DEFAULT_ON=m
#
# iptables trigger is under Netfilter config (LED target)
#
CONFIG_LEDS_TRIGGER_TRANSIENT=m
CONFIG_LEDS_TRIGGER_CAMERA=m
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
CONFIG_EDAC_DECODE_MCE=m
CONFIG_EDAC_MM_EDAC=m
CONFIG_EDAC_AMD64=m
# CONFIG_EDAC_AMD64_ERROR_INJECTION is not set
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=m
# CONFIG_EDAC_IE31200 is not set
CONFIG_EDAC_X38=m
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I7CORE=m
CONFIG_EDAC_I5000=m
CONFIG_EDAC_I5100=m
CONFIG_EDAC_I7300=m
CONFIG_EDAC_SBRIDGE=m
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_SYSTOHC is not set
# CONFIG_RTC_DEBUG is not set
#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set
#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABX80X is not set
CONFIG_RTC_DRV_DS1307=m
CONFIG_RTC_DRV_DS1307_HWMON=y
CONFIG_RTC_DRV_DS1374=m
# CONFIG_RTC_DRV_DS1374_WDT is not set
CONFIG_RTC_DRV_DS1672=m
CONFIG_RTC_DRV_MAX6900=m
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_ISL12022=m
# CONFIG_RTC_DRV_ISL12057 is not set
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8523=m
# CONFIG_RTC_DRV_PCF85063 is not set
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
CONFIG_RTC_DRV_M41T80=m
CONFIG_RTC_DRV_M41T80_WDT=y
CONFIG_RTC_DRV_BQ32K=m
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=m
# CONFIG_RTC_DRV_RX8010 is not set
CONFIG_RTC_DRV_RX8581=m
CONFIG_RTC_DRV_RX8025=m
CONFIG_RTC_DRV_EM3027=m
# CONFIG_RTC_DRV_RV8803 is not set
#
# SPI RTC drivers
#
# CONFIG_RTC_DRV_M41T93 is not set
# CONFIG_RTC_DRV_M41T94 is not set
# CONFIG_RTC_DRV_DS1302 is not set
# CONFIG_RTC_DRV_DS1305 is not set
# CONFIG_RTC_DRV_DS1343 is not set
# CONFIG_RTC_DRV_DS1347 is not set
# CONFIG_RTC_DRV_DS1390 is not set
# CONFIG_RTC_DRV_R9701 is not set
# CONFIG_RTC_DRV_RX4581 is not set
# CONFIG_RTC_DRV_RX6110 is not set
# CONFIG_RTC_DRV_RS5C348 is not set
# CONFIG_RTC_DRV_MAX6902 is not set
# CONFIG_RTC_DRV_PCF2123 is not set
# CONFIG_RTC_DRV_MCP795 is not set
CONFIG_RTC_I2C_AND_SPI=y
#
# SPI and I2C RTC drivers
#
CONFIG_RTC_DRV_DS3232=m
# CONFIG_RTC_DRV_PCF2127 is not set
CONFIG_RTC_DRV_RV3029C2=m
CONFIG_RTC_DRV_RV3029_HWMON=y
#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
CONFIG_RTC_DRV_DS1286=m
CONFIG_RTC_DRV_DS1511=m
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
CONFIG_RTC_DRV_DS1742=m
CONFIG_RTC_DRV_DS2404=m
CONFIG_RTC_DRV_STK17TA8=m
# CONFIG_RTC_DRV_M48T86 is not set
CONFIG_RTC_DRV_M48T35=m
CONFIG_RTC_DRV_M48T59=m
CONFIG_RTC_DRV_MSM6242=m
CONFIG_RTC_DRV_BQ4802=m
CONFIG_RTC_DRV_RP5C01=m
CONFIG_RTC_DRV_V3020=m
#
# on-CPU RTC drivers
#
#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set
#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y
# CONFIG_INTEL_IDMA64 is not set
# CONFIG_INTEL_IOATDMA is not set
# CONFIG_QCOM_HIDMA_MGMT is not set
# CONFIG_QCOM_HIDMA is not set
CONFIG_DW_DMAC_CORE=m
CONFIG_DW_DMAC=m
CONFIG_DW_DMAC_PCI=m
CONFIG_HSU_DMA=y
#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
CONFIG_DMATEST=m
#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
CONFIG_AUXDISPLAY=y
CONFIG_KS0108=m
CONFIG_KS0108_PORT=0x378
CONFIG_KS0108_DELAY=2
CONFIG_CFAG12864B=m
CONFIG_CFAG12864B_RATE=20
CONFIG_UIO=m
CONFIG_UIO_CIF=m
CONFIG_UIO_PDRV_GENIRQ=m
# CONFIG_UIO_DMEM_GENIRQ is not set
CONFIG_UIO_AEC=m
CONFIG_UIO_SERCOS3=m
CONFIG_UIO_PCI_GENERIC=m
# CONFIG_UIO_NETX is not set
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
# CONFIG_VFIO_NOIOMMU is not set
CONFIG_VFIO_PCI=m
# CONFIG_VFIO_PCI_VGA is not set
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
CONFIG_IRQ_BYPASS_MANAGER=m
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO=y
#
# Virtio drivers
#
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_INPUT is not set
# CONFIG_VIRTIO_MMIO is not set
#
# Microsoft Hyper-V guest support
#
CONFIG_HYPERV=m
CONFIG_HYPERV_UTILS=m
CONFIG_HYPERV_BALLOON=m
#
# Xen driver support
#
CONFIG_XEN_BALLOON=y
# CONFIG_XEN_SELFBALLOONING is not set
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=m
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=m
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
# CONFIG_XEN_GNTDEV is not set
# CONFIG_XEN_GRANT_DEV_ALLOC is not set
CONFIG_SWIOTLB_XEN=y
CONFIG_XEN_TMEM=m
CONFIG_XEN_PCIDEV_BACKEND=m
# CONFIG_XEN_SCSI_BACKEND is not set
CONFIG_XEN_PRIVCMD=m
CONFIG_XEN_ACPI_PROCESSOR=m
# CONFIG_XEN_MCE_LOG is not set
CONFIG_XEN_HAVE_PVMMU=y
CONFIG_XEN_EFI=y
CONFIG_XEN_AUTO_XLATE=y
CONFIG_XEN_ACPI=y
CONFIG_XEN_SYMS=y
CONFIG_XEN_HAVE_VPMU=y
CONFIG_STAGING=y
# CONFIG_SLICOSS is not set
# CONFIG_PRISM2_USB is not set
# CONFIG_COMEDI is not set
# CONFIG_RTL8192U is not set
CONFIG_RTLLIB=m
CONFIG_RTLLIB_CRYPTO_CCMP=m
CONFIG_RTLLIB_CRYPTO_TKIP=m
CONFIG_RTLLIB_CRYPTO_WEP=m
CONFIG_RTL8192E=m
CONFIG_R8712U=m
# CONFIG_R8188EU is not set
# CONFIG_R8723AU is not set
# CONFIG_RTS5208 is not set
# CONFIG_VT6655 is not set
# CONFIG_VT6656 is not set
# CONFIG_FB_SM750 is not set
# CONFIG_FB_XGI is not set
#
# Speakup console speech
#
# CONFIG_SPEAKUP is not set
# CONFIG_STAGING_MEDIA is not set
#
# Android
#
# CONFIG_LTE_GDM724X is not set
CONFIG_FIREWIRE_SERIAL=m
CONFIG_FWTTY_MAX_TOTAL_PORTS=64
CONFIG_FWTTY_MAX_CARD_PORTS=32
# CONFIG_LNET is not set
# CONFIG_DGNC is not set
# CONFIG_GS_FPGABOOT is not set
# CONFIG_CRYPTO_SKEIN is not set
# CONFIG_UNISYSSPAR is not set
# CONFIG_FB_TFT is not set
# CONFIG_WILC1000_SDIO is not set
# CONFIG_WILC1000_SPI is not set
# CONFIG_MOST is not set
#
# Old ISDN4Linux (deprecated)
#
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=m
CONFIG_ACERHDF=m
# CONFIG_ALIENWARE_WMI is not set
CONFIG_ASUS_LAPTOP=m
# CONFIG_DELL_SMBIOS is not set
CONFIG_DELL_WMI_AIO=m
# CONFIG_DELL_SMO8800 is not set
# CONFIG_DELL_RBTN is not set
CONFIG_FUJITSU_LAPTOP=m
# CONFIG_FUJITSU_LAPTOP_DEBUG is not set
CONFIG_FUJITSU_TABLET=m
CONFIG_AMILO_RFKILL=m
CONFIG_HP_ACCEL=m
# CONFIG_HP_WIRELESS is not set
CONFIG_HP_WMI=m
CONFIG_MSI_LAPTOP=m
CONFIG_PANASONIC_LAPTOP=m
CONFIG_COMPAL_LAPTOP=m
CONFIG_SONY_LAPTOP=m
CONFIG_SONYPI_COMPAT=y
CONFIG_IDEAPAD_LAPTOP=m
CONFIG_THINKPAD_ACPI=m
CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
# CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set
CONFIG_THINKPAD_ACPI_VIDEO=y
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
CONFIG_SENSORS_HDAPS=m
# CONFIG_INTEL_MENLOW is not set
CONFIG_EEEPC_LAPTOP=m
CONFIG_ASUS_WMI=m
CONFIG_ASUS_NB_WMI=m
CONFIG_EEEPC_WMI=m
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ACPI_WMI=m
CONFIG_MSI_WMI=m
CONFIG_TOPSTAR_LAPTOP=m
CONFIG_ACPI_TOSHIBA=m
CONFIG_TOSHIBA_BT_RFKILL=m
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
CONFIG_ACPI_CMPC=m
# CONFIG_INTEL_HID_EVENT is not set
CONFIG_INTEL_IPS=m
# CONFIG_INTEL_PMC_CORE is not set
# CONFIG_IBM_RTL is not set
CONFIG_SAMSUNG_LAPTOP=m
CONFIG_MXM_WMI=m
CONFIG_INTEL_OAKTRAIL=m
CONFIG_SAMSUNG_Q10=m
CONFIG_APPLE_GMUX=m
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
CONFIG_PVPANIC=y
# CONFIG_INTEL_PMC_IPC is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_CHROME_PLATFORMS is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y
#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_COMMON_CLK_NXP is not set
# CONFIG_COMMON_CLK_PWM is not set
# CONFIG_COMMON_CLK_PXA is not set
# CONFIG_COMMON_CLK_PIC32 is not set
# CONFIG_COMMON_CLK_OXNAS is not set
#
# Hardware Spinlock drivers
#
#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
# CONFIG_MAILBOX is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
#
# Generic IOMMU Pagetable Support
#
CONFIG_IOMMU_IOVA=y
CONFIG_AMD_IOMMU=y
CONFIG_AMD_IOMMU_V2=m
CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_IRQ_REMAP=y
#
# Remoteproc drivers
#
# CONFIG_STE_MODEM_RPROC is not set
#
# Rpmsg drivers
#
#
# SOC (System On Chip) specific Drivers
#
# CONFIG_SUNXI_SRAM is not set
# CONFIG_SOC_TI is not set
CONFIG_PM_DEVFREQ=y
#
# DEVFREQ Governors
#
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m
# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
# CONFIG_DEVFREQ_GOV_POWERSAVE is not set
# CONFIG_DEVFREQ_GOV_USERSPACE is not set
# CONFIG_DEVFREQ_GOV_PASSIVE is not set
#
# DEVFREQ Drivers
#
# CONFIG_PM_DEVFREQ_EVENT is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
CONFIG_NTB=m
# CONFIG_NTB_AMD is not set
# CONFIG_NTB_INTEL is not set
# CONFIG_NTB_PINGPONG is not set
# CONFIG_NTB_TOOL is not set
# CONFIG_NTB_PERF is not set
# CONFIG_NTB_TRANSPORT is not set
# CONFIG_VME_BUS is not set
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
# CONFIG_PWM_LPSS_PCI is not set
# CONFIG_PWM_LPSS_PLATFORM is not set
# CONFIG_PWM_PCA9685 is not set
CONFIG_ARM_GIC_MAX_NR=1
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
# CONFIG_FMC is not set
#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
CONFIG_POWERCAP=y
CONFIG_INTEL_RAPL=y
# CONFIG_MCB is not set
#
# Performance monitor support
#
CONFIG_RAS=y
# CONFIG_MCE_AMD_INJ is not set
# CONFIG_THUNDERBOLT is not set
#
# Android
#
# CONFIG_ANDROID is not set
CONFIG_LIBNVDIMM=y
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BLK=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_BTT=y
# CONFIG_DEV_DAX is not set
CONFIG_NVMEM=m
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
#
# FPGA Configuration Support
#
# CONFIG_FPGA is not set
#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y
CONFIG_DMI_SYSFS=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_ISCSI_IBFT_FIND=y
CONFIG_ISCSI_IBFT=m
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_GOOGLE_FIRMWARE is not set
#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_VARS=y
CONFIG_EFI_ESRT=y
CONFIG_EFI_VARS_PSTORE=y
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_BOOTLOADER_CONTROL is not set
# CONFIG_EFI_CAPSULE_LOADER is not set
CONFIG_UEFI_CPER=y
#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_IOMAP=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_ENCRYPTION is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_WARN is not set
# CONFIG_XFS_DEBUG is not set
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=y
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_NILFS2_FS is not set
CONFIG_F2FS_FS=m
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
# CONFIG_F2FS_FS_SECURITY is not set
# CONFIG_F2FS_CHECK_FS is not set
# CONFIG_F2FS_FS_ENCRYPTION is not set
# CONFIG_F2FS_IO_TRACE is not set
# CONFIG_F2FS_FAULT_INJECTION is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_FILE_LOCKING=y
CONFIG_MANDATORY_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
CONFIG_AUTOFS4_FS=y
CONFIG_FUSE_FS=m
CONFIG_CUSE=m
CONFIG_OVERLAY_FS=m
#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
# CONFIG_FSCACHE_HISTOGRAM is not set
# CONFIG_FSCACHE_DEBUG is not set
# CONFIG_FSCACHE_OBJECT_LIST is not set
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_HISTOGRAM is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_UBIFS_FS is not set
# CONFIG_LOGFS is not set
CONFIG_CRAMFS=m
CONFIG_SQUASHFS=m
CONFIG_SQUASHFS_FILE_CACHE=y
# CONFIG_SQUASHFS_FILE_DIRECT is not set
CONFIG_SQUASHFS_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_DECOMP_MULTI is not set
# CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU is not set
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_ZLIB=y
# CONFIG_SQUASHFS_LZ4 is not set
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set
# CONFIG_SQUASHFS_EMBEDDED is not set
CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
# CONFIG_PSTORE_CONSOLE is not set
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=m
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EXOFS_FS is not set
CONFIG_ORE=m
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
# CONFIG_NFS_V2 is not set
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_OBJLAYOUT=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
# CONFIG_NFSD_SCSILAYOUT is not set
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_SUNRPC_DEBUG=y
# CONFIG_CEPH_FS is not set
CONFIG_CIFS=m
CONFIG_CIFS_STATS=y
# CONFIG_CIFS_STATS2 is not set
CONFIG_CIFS_WEAK_PW_HASH=y
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
CONFIG_CIFS_ACL=y
CONFIG_CIFS_DEBUG=y
# CONFIG_CIFS_DEBUG2 is not set
CONFIG_CIFS_DFS_UPCALL=y
CONFIG_CIFS_SMB2=y
# CONFIG_CIFS_SMB311 is not set
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
CONFIG_9P_FS_POSIX_ACL=y
# CONFIG_9P_FS_SECURITY is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_DLM_DEBUG=y
#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y
#
# Compile-time checks and compiler options
#
# CONFIG_DEBUG_INFO is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_STACK_VALIDATION is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y
#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
# CONFIG_KASAN is not set
CONFIG_ARCH_HAS_KCOV=y
# CONFIG_KCOV is not set
CONFIG_DEBUG_SHIRQ=y
#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_WQ_WATCHDOG is not set
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
CONFIG_SCHEDSTATS=y
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_TIMER_STATS=y
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_LOCK_TORTURE_TEST=m
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set
#
# RCU Debugging
#
# CONFIG_PROVE_RCU is not set
CONFIG_SPARSE_RCU_POINTER=y
CONFIG_TORTURE_TEST=m
# CONFIG_RCU_PERF_TEST is not set
CONFIG_RCU_TORTURE_TEST=m
# CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is not set
# CONFIG_RCU_TORTURE_TEST_SLOW_INIT is not set
# CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
CONFIG_NOTIFIER_ERROR_INJECTION=m
# CONFIG_CPU_NOTIFIER_ERROR_INJECT is not set
CONFIG_PM_NOTIFIER_ERROR_INJECT=m
# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
CONFIG_SCHED_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_STACK_TRACER=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENT=y
CONFIG_UPROBE_EVENT=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_HIST_TRIGGERS is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
CONFIG_RING_BUFFER_BENCHMARK=m
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_TRACE_ENUM_MAP_FILE is not set
CONFIG_TRACING_EVENTS_GPIO=y
#
# Runtime Testing
#
CONFIG_LKDTM=m
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
CONFIG_RBTREE_TEST=m
CONFIG_INTERVAL_TREE_TEST=m
CONFIG_PERCPU_TEST=m
CONFIG_ATOMIC64_SELFTEST=y
CONFIG_ASYNC_RAID6_TEST=m
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_KSTRTOX=m
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_HASH is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
CONFIG_BUILD_DOCSRC=y
# CONFIG_DMA_API_DEBUG is not set
CONFIG_TEST_LKM=m
CONFIG_TEST_USER_COPY=m
CONFIG_TEST_BPF=m
CONFIG_TEST_FIRMWARE=m
CONFIG_TEST_UDELAY=m
# CONFIG_MEMTEST is not set
# CONFIG_TEST_STATIC_KEYS is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_IO_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
# CONFIG_EARLY_PRINTK_EFI is not set
# CONFIG_X86_PTDUMP_CORE is not set
# CONFIG_X86_PTDUMP is not set
# CONFIG_EFI_PGT_DUMP is not set
CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_WX is not set
CONFIG_DEBUG_SET_MODULE_RONX=y
CONFIG_DEBUG_NX_TEST=m
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_X86_DECODER_SELFTEST=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
#
# Security options
#
CONFIG_KEYS=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
CONFIG_TRUSTED_KEYS=y
CONFIG_ENCRYPTED_KEYS=y
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
# CONFIG_SECURITY_PATH is not set
CONFIG_INTEL_TXT=y
CONFIG_LSM_MMAP_MIN_ADDR=65535
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_LOADPIN is not set
# CONFIG_SECURITY_YAMA is not set
CONFIG_INTEGRITY=y
CONFIG_INTEGRITY_SIGNATURE=y
CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
CONFIG_INTEGRITY_TRUSTED_KEYRING=y
CONFIG_INTEGRITY_AUDIT=y
CONFIG_IMA=y
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_IMA_LSM_RULES=y
# CONFIG_IMA_TEMPLATE is not set
CONFIG_IMA_NG_TEMPLATE=y
# CONFIG_IMA_SIG_TEMPLATE is not set
CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
CONFIG_IMA_DEFAULT_HASH_SHA1=y
# CONFIG_IMA_DEFAULT_HASH_SHA256 is not set
# CONFIG_IMA_DEFAULT_HASH_SHA512 is not set
# CONFIG_IMA_DEFAULT_HASH_WP512 is not set
CONFIG_IMA_DEFAULT_HASH="sha1"
# CONFIG_IMA_WRITE_POLICY is not set
# CONFIG_IMA_READ_POLICY is not set
CONFIG_IMA_APPRAISE=y
CONFIG_IMA_TRUSTED_KEYRING=y
# CONFIG_IMA_BLACKLIST_KEYRING is not set
# CONFIG_IMA_LOAD_X509 is not set
CONFIG_EVM=y
CONFIG_EVM_ATTR_FSUUID=y
# CONFIG_EVM_LOAD_X509 is not set
CONFIG_DEFAULT_SECURITY_SELINUX=y
# CONFIG_DEFAULT_SECURITY_DAC is not set
CONFIG_DEFAULT_SECURITY="selinux"
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_ASYNC_PQ=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_CRYPTO=y
#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=m
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_ABLK_HELPER=m
CONFIG_CRYPTO_GLUE_HELPER_X86=m
#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=m
#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
# CONFIG_CRYPTO_KEYWRAP is not set
#
# Hash modes
#
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_VMAC=m
#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
CONFIG_CRYPTO_GHASH=m
# CONFIG_CRYPTO_POLY1305 is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA1_SSSE3=m
CONFIG_CRYPTO_SHA256_SSSE3=m
CONFIG_CRYPTO_SHA512_SSSE3=m
# CONFIG_CRYPTO_SHA1_MB is not set
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m
#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_X86_64=y
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_BLOWFISH_X86_64=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST_COMMON=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST5_AVX_X86_64=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=m
CONFIG_CRYPTO_SALSA20_X86_64=m
# CONFIG_CRYPTO_CHACHA20 is not set
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
CONFIG_CRYPTO_TWOFISH_X86_64=m
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=m
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=m
#
# Compression
#
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set
#
# Random Number Generation
#
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_USER_API=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_USER_API_SKCIPHER=y
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HASH_INFO=y
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
# CONFIG_CRYPTO_DEV_CCP is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS7_MESSAGE_PARSER is not set
#
# Certificates for signature checking
#
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_MMU_AUDIT=y
# CONFIG_KVM_DEVICE_ASSIGNMENT is not set
CONFIG_BINARY_PRINTF=y
#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_BITREVERSE=y
# CONFIG_HAVE_ARCH_BITREVERSE is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
CONFIG_CRC8=m
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_ENC8=y
CONFIG_REED_SOLOMON_DEC8=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_INTERVAL_TREE=y
CONFIG_RADIX_TREE_MULTIORDER=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_CLZ_TAB=y
CONFIG_CORDIC=m
# CONFIG_DDR is not set
CONFIG_IRQ_POLL=y
CONFIG_MPILIB=y
CONFIG_SIGNATURE=y
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_SG_SPLIT is not set
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_MMIO_FLUSH=y
[-- Attachment #3: job.yaml --]
[-- Type: text/plain, Size: 3944 bytes --]
---
suite: aim7
testcase: aim7
category: benchmark
disk: 1BRD_48G
fs: xfs
aim7:
test: disk_wrt
load: 3000
job_origin: "/lkp/lkp/src/allot/cyclic:linux-devel:devel-hourly/ivb44/aim7-fs-1brd.yaml"
queue: bisect
testbox: ivb44
tbox_group: ivb44
rootfs: debian-x86_64-2015-02-07.cgz
job_file: "/lkp/scheduled/ivb44/aim7-1BRD_48G-xfs-disk_wrt-3000-performance-debian-x86_64-2015-02-07.cgz-68a9f5e7007c1afa2cf6830b690a90d0187c0684-20160808-100317-8vi4ke-0.yaml"
id: 72d1a5e8f77e4181b9db78845c9a2cc37165471d
model: Ivytown Ivy Bridge-EP
nr_cpu: 48
memory: 64G
nr_hdd_partitions: 3
hdd_partitions: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36*-part1"
swap_partitions: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36795753-part2"
rootfs_partition: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36795753-part3"
netconsole_port: 6644
kmsg:
iostat:
heartbeat:
vmstat:
numa-numastat:
numa-vmstat:
numa-meminfo:
proc-vmstat:
proc-stat:
meminfo:
slabinfo:
interrupts:
lock_stat:
latency_stats:
softirqs:
bdi_dev_mapping:
diskstats:
nfsstat:
cpuidle:
cpufreq-stats:
turbostat:
sched_debug:
perf-stat:
perf-profile:
cpufreq_governor: performance
commit: 68a9f5e7007c1afa2cf6830b690a90d0187c0684
need_kconfig:
- CONFIG_BLK_DEV_RAM
- CONFIG_BLK_DEV
- CONFIG_BLOCK
- CONFIG_XFS_FS
kconfig: x86_64-rhel
compiler: gcc-6
enqueue_time: 2016-08-08 19:50:12.582832286 +08:00
user: lkp
head_commit: 1f11daae97cb85d6472f4e21a39e8e95af20d74c
base_commit: 523d939ef98fd712632d93a5a2b588e477a7565e
branch: linux-devel/devel-hourly-2016080806
result_root: "/result/aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/0"
LKP_SERVER: inn
max_uptime: 785.16
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/ivb44/aim7-1BRD_48G-xfs-disk_wrt-3000-performance-debian-x86_64-2015-02-07.cgz-68a9f5e7007c1afa2cf6830b690a90d0187c0684-20160808-100317-8vi4ke-0.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linux-devel/devel-hourly-2016080806
- commit=68a9f5e7007c1afa2cf6830b690a90d0187c0684
- BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/vmlinuz-4.7.0-rc1-00007-g68a9f5e
- max_uptime=785
- RESULT_ROOT=/result/aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/0
- LKP_SERVER=inn
- debug
- apic=debug
- sysrq_always_enabled
- rcupdate.rcu_cpu_stall_timeout=100
- panic=-1
- softlockup_panic=1
- nmi_watchdog=panic
- oops=panic
- load_ramdisk=2
- prompt_ramdisk=0
- systemd.log_level=err
- ignore_loglevel
- earlyprintk=ttyS0,115200
- console=ttyS0,115200
- console=tty0
- vga=normal
- rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs.cgz,/lkp/benchmarks/aim7-x86_64.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/lkp/benchmarks/perf-stat-x86_64.cgz,/lkp/benchmarks/perf-profile-x86_64.cgz"
linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/linux-headers.cgz"
site: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
oom-killer:
watchdog:
nfs-hang:
repeat_to: 2
bad_samples:
- 486.19
- 481.48
- 483.32
#! queue options
#! user overrides
#! schedule options
kernel: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/vmlinuz-4.7.0-rc1-00007-g68a9f5e"
dequeue_time: 2016-08-08 20:30:06.296125189 +08:00
#! include/site/inn
#! runtime status
job_state: finished
loadavg: 1023.11 266.66 89.82 1/605 5699
start_time: '1470659457'
end_time: '1470659500'
version: "/lkp/lkp/.src-20160808-151458"
[-- Attachment #4: reproduce --]
[-- Type: text/plain, Size: 304 bytes --]
2016-08-08 20:30:55 dmsetup remove_all
2016-08-08 20:30:55 wipefs -a --force /dev/ram0
2016-08-08 20:30:55 mkfs -t xfs /dev/ram0
2016-08-08 20:30:55 mount -t xfs -o nobarrier,inode64 /dev/ram0 /fs/ram0
for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > $file
done
* [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-09 14:33 ` kernel test robot
0 siblings, 0 replies; 219+ messages in thread
From: kernel test robot @ 2016-08-09 14:33 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 8906 bytes --]
FYI, we noticed a -13.6% regression of aim7.jobs-per-min due to commit:
commit 68a9f5e7007c1afa2cf6830b690a90d0187c0684 ("xfs: implement iomap based buffered write path")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: aim7
on test machine: 48 threads Ivytown Ivy Bridge-EP with 64G memory
with following parameters:
disk: 1BRD_48G
fs: xfs
test: disk_wrt
load: 3000
cpufreq_governor: performance
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
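The four reproduction commands above can be collected into a small script. The `run()` wrapper and the `DRY_RUN` preview switch are additions here for safe previewing, not part of the original lkp instructions; `job.yaml` refers to the job file attached to the report.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report, wrapped so they can
# be previewed before anything is cloned or executed.
REPO=git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
JOB=job.yaml                 # the job file attached to this email

: "${DRY_RUN:=1}"            # default to preview; set DRY_RUN= to execute

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$@"            # print the command instead of running it
    else
        "$@"
    fi
}

run git clone "$REPO"
run cd lkp-tests
run bin/lkp install "$JOB"   # installs benchmark dependencies
run bin/lkp run "$JOB"
```

With `DRY_RUN` left at its default the script only echoes the commands, which is useful for checking paths before running on a throwaway test box.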
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74 ("xfs: reorder zeroing and flushing sequence in truncate")
68a9f5e700 ("xfs: implement iomap based buffered write path")
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69
---------------- --------------------------
%stddev %change %stddev
\ | \
486586 ± 0% -13.6% 420342 ± 0% aim7.jobs-per-min
37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time
37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time.max
6424 ± 1% +31.3% 8432 ± 1% aim7.time.involuntary_context_switches
151288 ± 0% +2.8% 155579 ± 0% aim7.time.minor_page_faults
376.31 ± 0% +28.5% 483.48 ± 0% aim7.time.system_time
429058 ± 0% -20.0% 343371 ± 0% aim7.time.voluntary_context_switches
16014 ± 0% +28.8% 20628 ± 1% meminfo.Active(file)
127154 ± 9% -14.4% 108893 ± 11% softirqs.SCHED
14084 ± 18% -33.1% 9421 ± 17% numa-numastat.node1.numa_foreign
15461 ± 17% -31.4% 10598 ± 13% numa-numastat.node1.numa_miss
24561 ± 0% -27.2% 17873 ± 1% vmstat.system.cs
47289 ± 0% +1.2% 47866 ± 0% vmstat.system.in
7868 ± 1% +27.3% 10013 ± 6% numa-meminfo.node0.Active(file)
8148 ± 1% +29.5% 10554 ± 7% numa-meminfo.node1.Active(file)
81041 ± 3% +30.0% 105374 ± 24% numa-meminfo.node1.Slab
1966 ± 1% +30.1% 2558 ± 4% numa-vmstat.node0.nr_active_file
4204 ± 3% +17.1% 4921 ± 8% numa-vmstat.node0.nr_alloc_batch
2037 ± 1% +26.6% 2579 ± 5% numa-vmstat.node1.nr_active_file
4003 ± 0% +28.1% 5129 ± 1% proc-vmstat.nr_active_file
979.25 ± 0% +63.7% 1602 ± 1% proc-vmstat.pgactivate
4699 ± 3% +162.6% 12340 ± 73% proc-vmstat.pgpgout
50.23 ± 19% -27.3% 36.50 ± 17% sched_debug.cpu.cpu_load[1].avg
466.50 ± 29% -51.8% 225.00 ± 73% sched_debug.cpu.cpu_load[1].max
77.78 ± 33% -50.6% 38.40 ± 57% sched_debug.cpu.cpu_load[1].stddev
300.50 ± 33% -52.9% 141.50 ± 48% sched_debug.cpu.cpu_load[2].max
1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.active_objs
1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.num_objs
431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.active_objs
431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.num_objs
24.26 ± 0% +8.7% 26.36 ± 0% turbostat.%Busy
686.75 ± 0% +9.1% 749.25 ± 0% turbostat.Avg_MHz
0.29 ± 1% -24.3% 0.22 ± 1% turbostat.CPU%c3
91.39 ± 2% +3.6% 94.71 ± 0% turbostat.CorWatt
121.88 ± 1% +2.8% 125.23 ± 0% turbostat.PkgWatt
53643508 ± 0% -19.6% 43119128 ± 2% cpuidle.C1-IVT.time
318952 ± 0% -25.7% 237018 ± 0% cpuidle.C1-IVT.usage
3471235 ± 2% -16.9% 2886121 ± 2% cpuidle.C1E-IVT.time
46642 ± 1% -22.4% 36214 ± 0% cpuidle.C1E-IVT.usage
12601665 ± 1% -21.8% 9854467 ± 1% cpuidle.C3-IVT.time
79872 ± 1% -19.6% 64244 ± 1% cpuidle.C3-IVT.usage
1.292e+09 ± 0% +13.7% 1.47e+09 ± 0% cpuidle.C6-IVT.time
5131 ±121% -100.0% 0.00 ± -1% latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
5131 ±121% -100.0% 0.00 ± -1% latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
9739 ± 99% -99.0% 95.50 ± 10% latency_stats.max.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
7739 ± 81% -72.1% 2162 ± 52% latency_stats.max.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
5131 ±121% -100.0% 0.00 ± -1% latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
10459 ± 97% -97.5% 262.75 ± 5% latency_stats.sum.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
9097 ± 81% -72.5% 2505 ± 45% latency_stats.sum.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
2.59e+11 ± 6% +24.1% 3.213e+11 ± 4% perf-stat.branch-instructions
0.41 ± 2% -9.5% 0.38 ± 1% perf-stat.branch-miss-rate
1.072e+09 ± 4% +12.5% 1.206e+09 ± 3% perf-stat.branch-misses
972882 ± 0% -17.4% 803990 ± 0% perf-stat.context-switches
1.472e+12 ± 6% +22.4% 1.801e+12 ± 5% perf-stat.cpu-cycles
100350 ± 1% -5.1% 95219 ± 1% perf-stat.cpu-migrations
7.315e+08 ± 24% +60.4% 1.174e+09 ± 37% perf-stat.dTLB-load-misses
3.225e+11 ± 5% +36.4% 4.398e+11 ± 2% perf-stat.dTLB-loads
2.176e+11 ± 9% +44.6% 3.147e+11 ± 6% perf-stat.dTLB-stores
1.452e+12 ± 6% +29.5% 1.879e+12 ± 4% perf-stat.instructions
42168 ± 16% +27.5% 53751 ± 6% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.7% 1.04 ± 0% perf-stat.ipc
252401 ± 0% +6.6% 269148 ± 0% perf-stat.minor-faults
10.16 ± 3% +13.0% 11.48 ± 3% perf-stat.node-store-miss-rate
24842185 ± 2% +11.9% 27804764 ± 1% perf-stat.node-store-misses
252321 ± 0% +6.6% 268999 ± 0% perf-stat.page-faults
aim7.jobs-per-min
540000 ++-----------------------------------------------------------------+
520000 **.* *.**. .**.* |
| *.**.**.* ** *.**.**.**.**.* |
500000 ++ : |
480000 ++ *.**.**.**.**.**.**.**.**.*|
| |
460000 ++ |
440000 ++ |
420000 ++ O OO OO OO OO OO OO
|O O O OO O O O O O |
400000 O+ OO O OO O O O OO OO OO O O OO |
380000 ++ |
| |
360000 ++ O OO O |
340000 ++-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong
[-- Attachment #2: config-4.7.0-rc1-00007-g68a9f5e --]
[-- Type: text/plain, Size: 151225 bytes --]
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.7.0-rc1 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y
#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y
#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
# CONFIG_CONTEXT_TRACKING_FORCE is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_NONE is not set
# CONFIG_RCU_NOCB_CPU_ZERO is not set
CONFIG_RCU_NOCB_CPU_ALL=y
# CONFIG_RCU_EXPEDITE_BOOT is not set
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=19
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_NMI_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_NUMA_BALANCING=y
CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
CONFIG_CGROUPS=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_SWAP_ENABLED=y
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
CONFIG_RT_GROUP_SCHED=y
# CONFIG_CGROUP_PIDS is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
# CONFIG_CGROUP_CPUACCT is not set
CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
# CONFIG_EXPERT is not set
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_PRINTK=y
CONFIG_PRINTK_NMI=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
# CONFIG_BPF_SYSCALL is not set
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_ADVISE_SYSCALLS=y
# CONFIG_USERFAULTFD is not set
CONFIG_PCI_QUIRKS=y
CONFIG_MEMBARRIER=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_SYSTEM_DATA_VERIFICATION is not set
CONFIG_PROFILING=y
CONFIG_TRACEPOINTS=y
CONFIG_KEXEC_CORE=y
CONFIG_OPROFILE=m
CONFIG_OPROFILE_EVENT_MULTIPLEX=y
CONFIG_HAVE_OPROFILE=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
CONFIG_OPTPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_UPROBES=y
# CONFIG_HAVE_64BIT_ALIGNED_ACCESS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_KRETPROBES=y
CONFIG_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_CONTIGUOUS=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_CLK=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_HAVE_CMPXCHG_LOCAL=y
CONFIG_HAVE_CMPXCHG_DOUBLE=y
CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_HAVE_EXIT_THREAD=y
CONFIG_ARCH_MMAP_RND_BITS=28
CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
CONFIG_HAVE_COPY_THREAD_TLS=y
CONFIG_HAVE_STACK_VALIDATION=y
# CONFIG_HAVE_ARCH_HASH is not set
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_COMPAT_OLD_SIGACTION=y
# CONFIG_CPU_NO_EFFICIENT_FFS is not set
#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLK_DEV_BSGLIB=y
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_THROTTLING=y
# CONFIG_BLK_CMDLINE_PARSER is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
CONFIG_BLOCK_COMPAT=y
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_CFQ_GROUP_IOSCHED=y
CONFIG_DEFAULT_DEADLINE=y
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_PADATA=y
CONFIG_ASN1=y
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_FREEZER=y
#
# Processor type and features
#
CONFIG_ZONE_DMA=y
CONFIG_SMP=y
CONFIG_X86_FEATURE_NAMES=y
CONFIG_X86_FAST_FEATURE_TESTS=y
CONFIG_X86_X2APIC=y
CONFIG_X86_MPPARSE=y
# CONFIG_GOLDFISH is not set
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_NUMACHIP is not set
# CONFIG_X86_VSMP is not set
CONFIG_X86_UV=y
# CONFIG_X86_GOLDFISH is not set
# CONFIG_X86_INTEL_MID is not set
CONFIG_X86_INTEL_LPSS=y
# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
CONFIG_IOSF_MBI=y
# CONFIG_IOSF_MBI_DEBUG is not set
CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
# CONFIG_SCHED_OMIT_FRAME_POINTER is not set
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_QUEUED_LOCK_STAT is not set
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_512GB=y
CONFIG_XEN_SAVE_RESTORE=y
# CONFIG_XEN_DEBUG_FS is not set
# CONFIG_XEN_PVH is not set
CONFIG_KVM_GUEST=y
# CONFIG_KVM_DEBUG_FS is not set
CONFIG_PARAVIRT_TIME_ACCOUNTING=y
CONFIG_PARAVIRT_CLOCK=y
CONFIG_NO_BOOTMEM=y
# CONFIG_MK8 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_MATOM is not set
CONFIG_GENERIC_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_MAXSMP=y
CONFIG_NR_CPUS=8192
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_MCE_INJECT=m
CONFIG_X86_THERMAL_VECTOR=y
#
# Performance monitoring
#
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_PERF_EVENTS_INTEL_RAPL=y
CONFIG_PERF_EVENTS_INTEL_CSTATE=y
# CONFIG_PERF_EVENTS_AMD_POWER is not set
# CONFIG_VM86 is not set
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_I8K=m
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_AMD=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_X86_DIRECT_GBPAGES=y
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NODES_SPAN_OTHER_NODES=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=10
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_MEMORY_PROBE=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_MEMBLOCK=y
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
CONFIG_ARCH_DISCARD_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_MOVABLE_NODE=y
CONFIG_HAVE_BOOTMEM_INFO_NODE=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_MEMORY_BALLOON=y
CONFIG_BALLOON_COMPACTION=y
CONFIG_COMPACTION=y
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MMU_NOTIFIER=y
CONFIG_KSM=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
CONFIG_MEMORY_FAILURE=y
CONFIG_HWPOISON_INJECT=m
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_CLEANCACHE=y
CONFIG_FRONTSWAP=y
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_ZSWAP=y
CONFIG_ZPOOL=y
CONFIG_ZBUD=y
# CONFIG_Z3FOLD is not set
CONFIG_ZSMALLOC=y
# CONFIG_PGTABLE_MAPPING is not set
# CONFIG_ZSMALLOC_STAT is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_FRAME_VECTOR=y
CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
CONFIG_ARCH_HAS_PKEYS=y
CONFIG_X86_PMEM_LEGACY_DEVICE=y
CONFIG_X86_PMEM_LEGACY=y
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK is not set
CONFIG_X86_RESERVE_LOW=64
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_ARCH_RANDOM=y
CONFIG_X86_SMAP=y
# CONFIG_X86_INTEL_MPX is not set
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_MIXED is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
# CONFIG_KEXEC_FILE is not set
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
# CONFIG_COMPAT_VDSO is not set
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
CONFIG_HAVE_LIVEPATCH=y
# CONFIG_LIVEPATCH is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
# CONFIG_PM_AUTOSLEEP is not set
# CONFIG_PM_WAKELOCKS is not set
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
CONFIG_PM_TEST_SUSPEND=y
CONFIG_PM_SLEEP_DEBUG=y
# CONFIG_DPM_WATCHDOG is not set
# CONFIG_PM_TRACE_RTC is not set
CONFIG_PM_CLK=y
# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
# CONFIG_ACPI_DEBUGGER is not set
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS_POWER is not set
CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
CONFIG_ACPI_EC_DEBUGFS=m
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_ACPI_PROCESSOR_IDLE=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_IPMI=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_PROCESSOR_AGGREGATOR=m
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_TABLE_UPGRADE=y
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_ACPI_HOTPLUG_IOAPIC=y
CONFIG_ACPI_SBS=m
CONFIG_ACPI_HED=y
CONFIG_ACPI_CUSTOM_METHOD=m
CONFIG_ACPI_BGRT=y
# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
# CONFIG_ACPI_NFIT is not set
CONFIG_HAVE_ACPI_APEI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_ACPI_APEI=y
CONFIG_ACPI_APEI_GHES=y
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ACPI_APEI_EINJ=m
# CONFIG_ACPI_APEI_ERST_DEBUG is not set
# CONFIG_ACPI_EXTLOG is not set
# CONFIG_PMIC_OPREGION is not set
CONFIG_SFI=y
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_PCC_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_AMD_FREQ_SENSITIVITY=m
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_P4_CLOCKMOD=m
#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m
#
# CPU Idle
#
CONFIG_CPU_IDLE=y
# CONFIG_CPU_IDLE_GOV_LADDER is not set
CONFIG_CPU_IDLE_GOV_MENU=y
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
CONFIG_INTEL_IDLE=y
#
# Memory power savings
#
CONFIG_I7300_IDLE_IOAT_CHANNEL=y
CONFIG_I7300_IDLE=m
#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_XEN=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
CONFIG_PCIE_ECRC=y
CONFIG_PCIEAER_INJECT=m
CONFIG_PCIEASPM=y
# CONFIG_PCIEASPM_DEBUG is not set
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
CONFIG_PCIE_PME=y
# CONFIG_PCIE_DPC is not set
CONFIG_PCI_BUS_ADDR_T_64BIT=y
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_REALLOC_ENABLE_AUTO is not set
CONFIG_PCI_STUB=y
# CONFIG_XEN_PCIDEV_FRONTEND is not set
CONFIG_HT_IRQ=y
CONFIG_PCI_ATS=y
CONFIG_PCI_IOV=y
CONFIG_PCI_PRI=y
CONFIG_PCI_PASID=y
CONFIG_PCI_LABEL=y
# CONFIG_PCI_HYPERV is not set
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_SHPC=m
#
# PCI host controller drivers
#
# CONFIG_PCIE_DW_PLAT is not set
CONFIG_ISA_DMA_API=y
CONFIG_AMD_NB=y
CONFIG_PCCARD=y
# CONFIG_PCMCIA is not set
CONFIG_CARDBUS=y
#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
# CONFIG_RAPIDIO is not set
# CONFIG_X86_SYSFB is not set
#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=m
CONFIG_COREDUMP=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
# CONFIG_X86_X32 is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_KEYS_COMPAT=y
CONFIG_X86_DEV_DMA_OPS=y
CONFIG_PMC_ATOM=y
# CONFIG_VMD is not set
CONFIG_NET=y
CONFIG_COMPAT_NETLINK_MESSAGES=y
CONFIG_NET_INGRESS=y
CONFIG_NET_EGRESS=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_DIAG=m
CONFIG_UNIX=y
CONFIG_UNIX_DIAG=m
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_MIGRATE=y
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_FIB_TRIE_STATS=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_ROUTE_CLASSID=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_IP_PNP_BOOTP is not set
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE_DEMUX=m
CONFIG_NET_IP_TUNNEL=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_NET_IPVTI=m
CONFIG_NET_UDP_TUNNEL=m
# CONFIG_NET_FOU is not set
# CONFIG_NET_FOU_IP_TUNNELS is not set
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_INET_UDP_DIAG=m
# CONFIG_INET_DIAG_DESTROY is not set
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
# CONFIG_TCP_CONG_DCTCP is not set
# CONFIG_TCP_CONG_CDG is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=m
# CONFIG_IPV6_ILA is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=m
CONFIG_INET6_XFRM_MODE_TUNNEL=m
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
# CONFIG_IPV6_VTI is not set
CONFIG_IPV6_SIT=m
CONFIG_IPV6_SIT_6RD=y
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
# CONFIG_IPV6_GRE is not set
CONFIG_IPV6_MULTIPLE_TABLES=y
# CONFIG_IPV6_SUBTREES is not set
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y
CONFIG_IPV6_PIMSM_V2=y
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
CONFIG_NETWORK_PHY_TIMESTAMPING=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=m
#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_ACCT=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_LOG_COMMON=m
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_ZONES=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CONNTRACK_TIMEOUT is not set
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CONNTRACK_LABELS=y
CONFIG_NF_CT_PROTO_DCCP=m
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=m
CONFIG_NF_CT_PROTO_UDPLITE=m
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_BROADCAST=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_SNMP=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
# CONFIG_NETFILTER_NETLINK_GLUE_CT is not set
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_NF_NAT_PROTO_DCCP=m
CONFIG_NF_NAT_PROTO_UDPLITE=m
CONFIG_NF_NAT_PROTO_SCTP=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_REDIRECT=m
CONFIG_NETFILTER_SYNPROXY=m
CONFIG_NF_TABLES=m
# CONFIG_NF_TABLES_INET is not set
# CONFIG_NF_TABLES_NETDEV is not set
CONFIG_NFT_EXTHDR=m
CONFIG_NFT_META=m
CONFIG_NFT_CT=m
CONFIG_NFT_RBTREE=m
CONFIG_NFT_HASH=m
CONFIG_NFT_COUNTER=m
CONFIG_NFT_LOG=m
CONFIG_NFT_LIMIT=m
# CONFIG_NFT_MASQ is not set
# CONFIG_NFT_REDIR is not set
CONFIG_NFT_NAT=m
# CONFIG_NFT_QUEUE is not set
# CONFIG_NFT_REJECT is not set
CONFIG_NFT_COMPAT=m
CONFIG_NETFILTER_XTABLES=y
#
# Xtables combined modules
#
CONFIG_NETFILTER_XT_MARK=m
CONFIG_NETFILTER_XT_CONNMARK=m
CONFIG_NETFILTER_XT_SET=m
#
# Xtables targets
#
CONFIG_NETFILTER_XT_TARGET_AUDIT=m
CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_HL=m
CONFIG_NETFILTER_XT_TARGET_HMARK=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
CONFIG_NETFILTER_XT_TARGET_LED=m
CONFIG_NETFILTER_XT_TARGET_LOG=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_NAT=m
CONFIG_NETFILTER_XT_TARGET_NETMAP=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
CONFIG_NETFILTER_XT_TARGET_TEE=m
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
#
# Xtables matches
#
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NETFILTER_XT_MATCH_BPF=m
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_CPU=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ECN=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_HL=m
# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_IPVS=m
CONFIG_NETFILTER_XT_MATCH_L2TP=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_NFACCT=m
CONFIG_NETFILTER_XT_MATCH_OSF=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_RECENT=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_SET=m
CONFIG_IP_SET_MAX=256
CONFIG_IP_SET_BITMAP_IP=m
CONFIG_IP_SET_BITMAP_IPMAC=m
CONFIG_IP_SET_BITMAP_PORT=m
CONFIG_IP_SET_HASH_IP=m
# CONFIG_IP_SET_HASH_IPMARK is not set
CONFIG_IP_SET_HASH_IPPORT=m
CONFIG_IP_SET_HASH_IPPORTIP=m
CONFIG_IP_SET_HASH_IPPORTNET=m
# CONFIG_IP_SET_HASH_MAC is not set
# CONFIG_IP_SET_HASH_NETPORTNET is not set
CONFIG_IP_SET_HASH_NET=m
# CONFIG_IP_SET_HASH_NETNET is not set
CONFIG_IP_SET_HASH_NETPORT=m
CONFIG_IP_SET_HASH_NETIFACE=m
CONFIG_IP_SET_LIST_SET=m
CONFIG_IP_VS=m
CONFIG_IP_VS_IPV6=y
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12
#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_AH_ESP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y
CONFIG_IP_VS_PROTO_SCTP=y
#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
# CONFIG_IP_VS_FO is not set
# CONFIG_IP_VS_OVF is not set
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m
#
# IPVS SH scheduler
#
CONFIG_IP_VS_SH_TAB_BITS=8
#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IP_VS_NFCT=y
CONFIG_IP_VS_PE_SIP=m
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
# CONFIG_NF_CONNTRACK_PROC_COMPAT is not set
CONFIG_NF_TABLES_IPV4=m
CONFIG_NFT_CHAIN_ROUTE_IPV4=m
# CONFIG_NFT_REJECT_IPV4 is not set
# CONFIG_NFT_DUP_IPV4 is not set
# CONFIG_NF_TABLES_ARP is not set
CONFIG_NF_DUP_IPV4=m
# CONFIG_NF_LOG_ARP is not set
CONFIG_NF_LOG_IPV4=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_NAT_IPV4=m
CONFIG_NFT_CHAIN_NAT_IPV4=m
CONFIG_NF_NAT_MASQUERADE_IPV4=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_RPFILTER=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_SYNPROXY=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_SECURITY=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
#
# IPv6: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV6=m
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_NF_TABLES_IPV6=m
CONFIG_NFT_CHAIN_ROUTE_IPV6=m
# CONFIG_NFT_REJECT_IPV6 is not set
# CONFIG_NFT_DUP_IPV6 is not set
CONFIG_NF_DUP_IPV6=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_NF_LOG_IPV6=m
CONFIG_NF_NAT_IPV6=m
CONFIG_NFT_CHAIN_NAT_IPV6=m
# CONFIG_NF_NAT_MASQUERADE_IPV6 is not set
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RPFILTER=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_SYNPROXY=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_RAW=m
CONFIG_IP6_NF_SECURITY=m
# CONFIG_IP6_NF_NAT is not set
CONFIG_NF_TABLES_BRIDGE=m
# CONFIG_NFT_BRIDGE_META is not set
# CONFIG_NF_LOG_BRIDGE is not set
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_IP6=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_NFLOG=m
CONFIG_IP_DCCP=m
CONFIG_INET_DCCP_DIAG=m
#
# DCCP CCIDs Configuration
#
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
CONFIG_IP_DCCP_CCID3=y
# CONFIG_IP_DCCP_CCID3_DEBUG is not set
CONFIG_IP_DCCP_TFRC_LIB=y
#
# DCCP Kernel Hacking
#
# CONFIG_IP_DCCP_DEBUG is not set
# CONFIG_NET_DCCPPROBE is not set
CONFIG_IP_SCTP=m
CONFIG_NET_SCTPPROBE=m
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1=y
# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set
CONFIG_SCTP_COOKIE_HMAC_MD5=y
CONFIG_SCTP_COOKIE_HMAC_SHA1=y
CONFIG_INET_SCTP_DIAG=m
# CONFIG_RDS is not set
CONFIG_TIPC=m
CONFIG_TIPC_MEDIA_UDP=y
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
# CONFIG_ATM_MPOA is not set
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_L2TP=m
CONFIG_L2TP_DEBUGFS=m
CONFIG_L2TP_V3=y
CONFIG_L2TP_IP=m
CONFIG_L2TP_ETH=m
CONFIG_STP=m
CONFIG_GARP=m
CONFIG_MRP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_BRIDGE_VLAN_FILTERING=y
CONFIG_HAVE_NET_DSA=y
CONFIG_VLAN_8021Q=m
CONFIG_VLAN_8021Q_GVRP=y
CONFIG_VLAN_8021Q_MVRP=y
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_PHONET is not set
# CONFIG_6LOWPAN is not set
CONFIG_IEEE802154=m
# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set
CONFIG_IEEE802154_SOCKET=m
CONFIG_MAC802154=m
CONFIG_NET_SCHED=y
#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_MULTIQ=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFB=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_DRR=m
CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
CONFIG_NET_SCH_CODEL=m
CONFIG_NET_SCH_FQ_CODEL=m
# CONFIG_NET_SCH_FQ is not set
# CONFIG_NET_SCH_HHF is not set
# CONFIG_NET_SCH_PIE is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_SCH_PLUG=m
#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_CLS_FLOW=m
CONFIG_NET_CLS_CGROUP=y
# CONFIG_NET_CLS_BPF is not set
# CONFIG_NET_CLS_FLOWER is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_EMATCH_IPSET=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=m
CONFIG_NET_ACT_IPT=m
CONFIG_NET_ACT_NAT=m
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_ACT_SKBEDIT=m
CONFIG_NET_ACT_CSUM=m
# CONFIG_NET_ACT_VLAN is not set
# CONFIG_NET_ACT_BPF is not set
# CONFIG_NET_ACT_CONNMARK is not set
# CONFIG_NET_ACT_IFE is not set
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y
CONFIG_DCB=y
CONFIG_DNS_RESOLVER=m
# CONFIG_BATMAN_ADV is not set
CONFIG_OPENVSWITCH=m
CONFIG_OPENVSWITCH_GRE=m
CONFIG_OPENVSWITCH_VXLAN=m
CONFIG_VSOCKETS=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
CONFIG_NETLINK_DIAG=m
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
# CONFIG_MPLS_ROUTING is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
CONFIG_RPS=y
CONFIG_RFS_ACCEL=y
CONFIG_XPS=y
CONFIG_SOCK_CGROUP_DATA=y
# CONFIG_CGROUP_NET_PRIO is not set
CONFIG_CGROUP_NET_CLASSID=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_BPF_JIT=y
CONFIG_NET_FLOW_LIMIT=y
#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_NET_TCPPROBE is not set
CONFIG_NET_DROP_MONITOR=y
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
# CONFIG_AF_KCM is not set
CONFIG_FIB_RULES=y
CONFIG_WIRELESS=y
CONFIG_WIRELESS_EXT=y
CONFIG_WEXT_CORE=y
CONFIG_WEXT_PROC=y
CONFIG_WEXT_PRIV=y
CONFIG_CFG80211=m
# CONFIG_NL80211_TESTMODE is not set
# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set
CONFIG_CFG80211_DEFAULT_PS=y
# CONFIG_CFG80211_DEBUGFS is not set
# CONFIG_CFG80211_INTERNAL_REGDB is not set
CONFIG_CFG80211_CRDA_SUPPORT=y
# CONFIG_CFG80211_WEXT is not set
CONFIG_LIB80211=m
# CONFIG_LIB80211_DEBUG is not set
CONFIG_MAC80211=m
CONFIG_MAC80211_HAS_RC=y
CONFIG_MAC80211_RC_MINSTREL=y
CONFIG_MAC80211_RC_MINSTREL_HT=y
# CONFIG_MAC80211_RC_MINSTREL_VHT is not set
CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT="minstrel_ht"
# CONFIG_MAC80211_MESH is not set
# CONFIG_MAC80211_LEDS is not set
# CONFIG_MAC80211_DEBUGFS is not set
# CONFIG_MAC80211_MESSAGE_TRACING is not set
# CONFIG_MAC80211_DEBUG_MENU is not set
CONFIG_MAC80211_STA_HASH_MAX_SIZE=0
# CONFIG_WIMAX is not set
CONFIG_RFKILL=m
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
# CONFIG_RFKILL_GPIO is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
# CONFIG_NET_9P_DEBUG is not set
# CONFIG_CAIF is not set
# CONFIG_CEPH_LIB is not set
# CONFIG_NFC is not set
# CONFIG_LWTUNNEL is not set
CONFIG_DST_CACHE=y
# CONFIG_NET_DEVLINK is not set
CONFIG_MAY_USE_DEVLINK=y
CONFIG_HAVE_EBPF_JIT=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
CONFIG_FW_LOADER_USER_HELPER=y
# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
CONFIG_SYS_HYPERVISOR=y
# CONFIG_GENERIC_CPU_DEVICES is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_REGMAP=y
CONFIG_REGMAP_I2C=y
CONFIG_REGMAP_SPI=y
CONFIG_DMA_SHARED_BUFFER=y
# CONFIG_FENCE_TRACE is not set
CONFIG_DMA_CMA=y
#
# Default contiguous memory area size:
#
CONFIG_CMA_SIZE_MBYTES=200
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8
#
# Bus devices
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_MTD=m
# CONFIG_MTD_TESTS is not set
# CONFIG_MTD_REDBOOT_PARTS is not set
# CONFIG_MTD_CMDLINE_PARTS is not set
# CONFIG_MTD_AR7_PARTS is not set
#
# User Modules And Translation Layers
#
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
# CONFIG_MTD_BLOCK_RO is not set
# CONFIG_FTL is not set
# CONFIG_NFTL is not set
# CONFIG_INFTL is not set
# CONFIG_RFD_FTL is not set
# CONFIG_SSFDC is not set
# CONFIG_SM_FTL is not set
# CONFIG_MTD_OOPS is not set
# CONFIG_MTD_SWAP is not set
# CONFIG_MTD_PARTITIONED_MASTER is not set
#
# RAM/ROM/Flash chip drivers
#
# CONFIG_MTD_CFI is not set
# CONFIG_MTD_JEDECPROBE is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
# CONFIG_MTD_RAM is not set
# CONFIG_MTD_ROM is not set
# CONFIG_MTD_ABSENT is not set
#
# Mapping drivers for chip access
#
# CONFIG_MTD_COMPLEX_MAPPINGS is not set
# CONFIG_MTD_INTEL_VR_NOR is not set
# CONFIG_MTD_PLATRAM is not set
#
# Self-contained MTD device drivers
#
# CONFIG_MTD_PMC551 is not set
# CONFIG_MTD_DATAFLASH is not set
# CONFIG_MTD_SST25L is not set
# CONFIG_MTD_SLRAM is not set
# CONFIG_MTD_PHRAM is not set
# CONFIG_MTD_MTDRAM is not set
# CONFIG_MTD_BLOCK2MTD is not set
#
# Disk-On-Chip Device Drivers
#
# CONFIG_MTD_DOCG3 is not set
# CONFIG_MTD_NAND is not set
# CONFIG_MTD_ONENAND is not set
#
# LPDDR & LPDDR2 PCM memory drivers
#
# CONFIG_MTD_LPDDR is not set
# CONFIG_MTD_SPI_NOR is not set
CONFIG_MTD_UBI=m
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_LIMIT=20
# CONFIG_MTD_UBI_FASTMAP is not set
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_BLOCK is not set
# CONFIG_OF is not set
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_GSC is not set
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y
CONFIG_PNP=y
# CONFIG_PNP_DEBUG_MESSAGES is not set
#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_NULL_BLK=m
CONFIG_BLK_DEV_FD=m
# CONFIG_PARIDE is not set
CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m
# CONFIG_ZRAM is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
CONFIG_BLK_DEV_OSD=m
CONFIG_BLK_DEV_SX8=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_XEN_BLKDEV_FRONTEND=m
# CONFIG_XEN_BLKDEV_BACKEND is not set
CONFIG_VIRTIO_BLK=y
# CONFIG_BLK_DEV_HD is not set
# CONFIG_BLK_DEV_RBD is not set
CONFIG_BLK_DEV_RSXX=m
CONFIG_NVME_CORE=m
CONFIG_BLK_DEV_NVME=m
# CONFIG_BLK_DEV_NVME_SCSI is not set
#
# Misc devices
#
CONFIG_SENSORS_LIS3LV02D=m
# CONFIG_AD525X_DPOT is not set
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
CONFIG_SGI_IOC4=m
CONFIG_TIFM_CORE=m
CONFIG_TIFM_7XX1=m
# CONFIG_ICS932S401 is not set
CONFIG_ENCLOSURE_SERVICES=m
CONFIG_SGI_XP=m
CONFIG_HP_ILO=m
CONFIG_SGI_GRU=m
# CONFIG_SGI_GRU_DEBUG is not set
CONFIG_APDS9802ALS=m
CONFIG_ISL29003=m
CONFIG_ISL29020=m
CONFIG_SENSORS_TSL2550=m
# CONFIG_SENSORS_BH1780 is not set
CONFIG_SENSORS_BH1770=m
CONFIG_SENSORS_APDS990X=m
# CONFIG_HMC6352 is not set
# CONFIG_DS1682 is not set
# CONFIG_TI_DAC7512 is not set
CONFIG_VMWARE_BALLOON=m
# CONFIG_BMP085_I2C is not set
# CONFIG_BMP085_SPI is not set
# CONFIG_USB_SWITCH_FSA9480 is not set
# CONFIG_LATTICE_ECP3_CONFIG is not set
# CONFIG_SRAM is not set
# CONFIG_PANEL is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
CONFIG_EEPROM_AT24=m
# CONFIG_EEPROM_AT25 is not set
CONFIG_EEPROM_LEGACY=m
CONFIG_EEPROM_MAX6875=m
CONFIG_EEPROM_93CX6=m
# CONFIG_EEPROM_93XX46 is not set
CONFIG_CB710_CORE=m
# CONFIG_CB710_DEBUG is not set
CONFIG_CB710_DEBUG_ASSUMPTIONS=y
#
# Texas Instruments shared transport line discipline
#
# CONFIG_TI_ST is not set
CONFIG_SENSORS_LIS3_I2C=m
#
# Altera FPGA firmware download module
#
CONFIG_ALTERA_STAPL=m
CONFIG_INTEL_MEI=y
CONFIG_INTEL_MEI_ME=y
# CONFIG_INTEL_MEI_TXE is not set
CONFIG_VMWARE_VMCI=m
#
# Intel MIC Bus Driver
#
# CONFIG_INTEL_MIC_BUS is not set
#
# SCIF Bus Driver
#
# CONFIG_SCIF_BUS is not set
#
# VOP Bus Driver
#
# CONFIG_VOP_BUS is not set
#
# Intel MIC Host Driver
#
#
# Intel MIC Card Driver
#
#
# SCIF Driver
#
#
# Intel MIC Coprocessor State Management (COSM) Drivers
#
#
# VOP Driver
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_CXL_BASE is not set
# CONFIG_CXL_KERNEL_API is not set
# CONFIG_CXL_EEH is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m
CONFIG_SCSI_ENCLOSURE=m
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SRP_ATTRS=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
CONFIG_ISCSI_BOOT_SYSFS=m
CONFIG_SCSI_CXGB3_ISCSI=m
CONFIG_SCSI_CXGB4_ISCSI=m
CONFIG_SCSI_BNX2_ISCSI=m
CONFIG_SCSI_BNX2X_FCOE=m
CONFIG_BE2ISCSI=m
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
CONFIG_SCSI_HPSA=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_3W_SAS=m
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AACRAID=m
# CONFIG_SCSI_AIC7XXX is not set
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=4
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC94XX is not set
CONFIG_SCSI_MVSAS=m
# CONFIG_SCSI_MVSAS_DEBUG is not set
CONFIG_SCSI_MVSAS_TASKLET=y
CONFIG_SCSI_MVUMI=m
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
CONFIG_SCSI_ARCMSR=m
# CONFIG_SCSI_ESAS2R is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_MPT3SAS=m
CONFIG_SCSI_MPT2SAS_MAX_SGE=128
CONFIG_SCSI_MPT3SAS_MAX_SGE=128
CONFIG_SCSI_MPT2SAS=m
CONFIG_SCSI_UFSHCD=m
CONFIG_SCSI_UFSHCD_PCI=m
# CONFIG_SCSI_UFSHCD_PLATFORM is not set
CONFIG_SCSI_HPTIOP=m
# CONFIG_SCSI_BUSLOGIC is not set
CONFIG_VMWARE_PVSCSI=m
# CONFIG_XEN_SCSI_FRONTEND is not set
CONFIG_HYPERV_STORAGE=m
CONFIG_LIBFC=m
CONFIG_LIBFCOE=m
CONFIG_FCOE=m
CONFIG_FCOE_FNIC=m
# CONFIG_SCSI_SNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
CONFIG_SCSI_ISCI=m
# CONFIG_SCSI_IPS is not set
CONFIG_SCSI_INITIO=m
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
CONFIG_SCSI_STEX=m
# CONFIG_SCSI_SYM53C8XX_2 is not set
CONFIG_SCSI_IPR=m
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA_FC=m
# CONFIG_TCM_QLA2XXX is not set
CONFIG_SCSI_QLA_ISCSI=m
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_WD719X is not set
CONFIG_SCSI_DEBUG=m
CONFIG_SCSI_PMCRAID=m
CONFIG_SCSI_PM8001=m
# CONFIG_SCSI_BFA_FC is not set
CONFIG_SCSI_VIRTIO=m
CONFIG_SCSI_CHELSIO_FCOE=m
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y
CONFIG_SCSI_DH_ALUA=y
CONFIG_SCSI_OSD_INITIATOR=m
CONFIG_SCSI_OSD_ULD=m
CONFIG_SCSI_OSD_DPRINT_SENSE=1
# CONFIG_SCSI_OSD_DEBUG is not set
CONFIG_ATA=m
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
# CONFIG_SATA_ZPODD is not set
CONFIG_SATA_PMP=y
#
# Controllers with non-SFF native interface
#
CONFIG_SATA_AHCI=m
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_SATA_INIC162X is not set
CONFIG_SATA_ACARD_AHCI=m
CONFIG_SATA_SIL24=m
CONFIG_ATA_SFF=y
#
# SFF controllers with custom DMA interface
#
CONFIG_PDC_ADMA=m
CONFIG_SATA_QSTOR=m
CONFIG_SATA_SX4=m
CONFIG_ATA_BMDMA=y
#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=m
# CONFIG_SATA_DWC is not set
CONFIG_SATA_MV=m
CONFIG_SATA_NV=m
CONFIG_SATA_PROMISE=m
CONFIG_SATA_SIL=m
CONFIG_SATA_SIS=m
CONFIG_SATA_SVW=m
CONFIG_SATA_ULI=m
CONFIG_SATA_VIA=m
CONFIG_SATA_VITESSE=m
#
# PATA SFF controllers with BMDMA
#
CONFIG_PATA_ALI=m
CONFIG_PATA_AMD=m
CONFIG_PATA_ARTOP=m
CONFIG_PATA_ATIIXP=m
CONFIG_PATA_ATP867X=m
CONFIG_PATA_CMD64X=m
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_PATA_HPT366=m
CONFIG_PATA_HPT37X=m
CONFIG_PATA_HPT3X2N=m
CONFIG_PATA_HPT3X3=m
# CONFIG_PATA_HPT3X3_DMA is not set
CONFIG_PATA_IT8213=m
CONFIG_PATA_IT821X=m
CONFIG_PATA_JMICRON=m
CONFIG_PATA_MARVELL=m
CONFIG_PATA_NETCELL=m
CONFIG_PATA_NINJA32=m
# CONFIG_PATA_NS87415 is not set
CONFIG_PATA_OLDPIIX=m
# CONFIG_PATA_OPTIDMA is not set
CONFIG_PATA_PDC2027X=m
CONFIG_PATA_PDC_OLD=m
# CONFIG_PATA_RADISYS is not set
CONFIG_PATA_RDC=m
CONFIG_PATA_SCH=m
CONFIG_PATA_SERVERWORKS=m
CONFIG_PATA_SIL680=m
CONFIG_PATA_SIS=m
CONFIG_PATA_TOSHIBA=m
# CONFIG_PATA_TRIFLEX is not set
CONFIG_PATA_VIA=m
# CONFIG_PATA_WINBOND is not set
#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set
#
# Generic fallback / legacy drivers
#
CONFIG_PATA_ACPI=m
CONFIG_ATA_GENERIC=m
# CONFIG_PATA_LEGACY is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
# CONFIG_MD_CLUSTER is not set
# CONFIG_BCACHE is not set
CONFIG_BLK_DEV_DM_BUILTIN=y
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_MQ_DEFAULT is not set
CONFIG_DM_DEBUG=y
CONFIG_DM_BUFIO=m
# CONFIG_DM_DEBUG_BLOCK_STACK_TRACING is not set
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_THIN_PROVISIONING=m
CONFIG_DM_CACHE=m
CONFIG_DM_CACHE_SMQ=m
CONFIG_DM_CACHE_CLEANER=m
# CONFIG_DM_ERA is not set
CONFIG_DM_MIRROR=m
CONFIG_DM_LOG_USERSPACE=m
CONFIG_DM_RAID=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_QL=m
CONFIG_DM_MULTIPATH_ST=m
CONFIG_DM_DELAY=m
CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
# CONFIG_DM_VERITY_FEC is not set
CONFIG_DM_SWITCH=m
# CONFIG_DM_LOG_WRITES is not set
CONFIG_TARGET_CORE=m
CONFIG_TCM_IBLOCK=m
CONFIG_TCM_FILEIO=m
CONFIG_TCM_PSCSI=m
# CONFIG_TCM_USER2 is not set
CONFIG_LOOPBACK_TARGET=m
CONFIG_TCM_FC=m
CONFIG_ISCSI_TARGET=m
# CONFIG_ISCSI_TARGET_CXGB4 is not set
# CONFIG_SBP_TARGET is not set
CONFIG_FUSION=y
CONFIG_FUSION_SPI=m
# CONFIG_FUSION_FC is not set
CONFIG_FUSION_SAS=m
CONFIG_FUSION_MAX_SGE=128
CONFIG_FUSION_CTL=m
CONFIG_FUSION_LOGGING=y
#
# IEEE 1394 (FireWire) support
#
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_SBP2=m
CONFIG_FIREWIRE_NET=m
# CONFIG_FIREWIRE_NOSY is not set
CONFIG_MACINTOSH_DRIVERS=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_NETDEVICES=y
CONFIG_MII=y
CONFIG_NET_CORE=y
CONFIG_BONDING=m
CONFIG_DUMMY=m
# CONFIG_EQUALIZER is not set
CONFIG_NET_FC=y
CONFIG_IFB=m
CONFIG_NET_TEAM=m
CONFIG_NET_TEAM_MODE_BROADCAST=m
CONFIG_NET_TEAM_MODE_ROUNDROBIN=m
CONFIG_NET_TEAM_MODE_RANDOM=m
CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m
CONFIG_NET_TEAM_MODE_LOADBALANCE=m
CONFIG_MACVLAN=m
CONFIG_MACVTAP=m
# CONFIG_IPVLAN is not set
CONFIG_VXLAN=m
# CONFIG_GENEVE is not set
# CONFIG_GTP is not set
# CONFIG_MACSEC is not set
CONFIG_NETCONSOLE=m
CONFIG_NETCONSOLE_DYNAMIC=y
CONFIG_NETPOLL=y
CONFIG_NET_POLL_CONTROLLER=y
CONFIG_TUN=m
# CONFIG_TUN_VNET_CROSS_LE is not set
CONFIG_VETH=m
CONFIG_VIRTIO_NET=y
CONFIG_NLMON=m
# CONFIG_ARCNET is not set
# CONFIG_ATM_DRIVERS is not set
#
# CAIF transport drivers
#
CONFIG_VHOST_NET=m
# CONFIG_VHOST_SCSI is not set
CONFIG_VHOST_RING=m
CONFIG_VHOST=m
# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
#
# Distributed Switch Architecture drivers
#
CONFIG_ETHERNET=y
CONFIG_MDIO=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_ADAPTEC is not set
CONFIG_NET_VENDOR_AGERE=y
# CONFIG_ET131X is not set
# CONFIG_NET_VENDOR_ALTEON is not set
# CONFIG_ALTERA_TSE is not set
# CONFIG_NET_VENDOR_AMD is not set
CONFIG_NET_VENDOR_ARC=y
CONFIG_NET_VENDOR_ATHEROS=y
CONFIG_ATL2=m
CONFIG_ATL1=m
CONFIG_ATL1E=m
CONFIG_ATL1C=m
CONFIG_ALX=m
# CONFIG_NET_VENDOR_AURORA is not set
CONFIG_NET_CADENCE=y
# CONFIG_MACB is not set
CONFIG_NET_VENDOR_BROADCOM=y
CONFIG_B44=m
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
# CONFIG_BCMGENET is not set
CONFIG_BNX2=m
CONFIG_CNIC=m
CONFIG_TIGON3=y
# CONFIG_BNX2X is not set
# CONFIG_BNXT is not set
CONFIG_NET_VENDOR_BROCADE=y
CONFIG_BNA=m
CONFIG_NET_VENDOR_CAVIUM=y
# CONFIG_THUNDER_NIC_PF is not set
# CONFIG_THUNDER_NIC_VF is not set
# CONFIG_THUNDER_NIC_BGX is not set
# CONFIG_LIQUIDIO is not set
CONFIG_NET_VENDOR_CHELSIO=y
# CONFIG_CHELSIO_T1 is not set
CONFIG_CHELSIO_T3=m
CONFIG_CHELSIO_T4=m
# CONFIG_CHELSIO_T4_DCB is not set
# CONFIG_CHELSIO_T4_UWIRE is not set
CONFIG_CHELSIO_T4VF=m
CONFIG_NET_VENDOR_CISCO=y
CONFIG_ENIC=m
# CONFIG_CX_ECAT is not set
CONFIG_DNET=m
CONFIG_NET_VENDOR_DEC=y
CONFIG_NET_TULIP=y
CONFIG_DE2104X=m
CONFIG_DE2104X_DSL=0
CONFIG_TULIP=y
# CONFIG_TULIP_MWI is not set
CONFIG_TULIP_MMIO=y
# CONFIG_TULIP_NAPI is not set
CONFIG_DE4X5=m
CONFIG_WINBOND_840=m
CONFIG_DM9102=m
CONFIG_ULI526X=m
CONFIG_PCMCIA_XIRCOM=m
# CONFIG_NET_VENDOR_DLINK is not set
CONFIG_NET_VENDOR_EMULEX=y
CONFIG_BE2NET=m
CONFIG_BE2NET_HWMON=y
CONFIG_BE2NET_VXLAN=y
CONFIG_NET_VENDOR_EZCHIP=y
# CONFIG_NET_VENDOR_EXAR is not set
# CONFIG_NET_VENDOR_HP is not set
CONFIG_NET_VENDOR_INTEL=y
# CONFIG_E100 is not set
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_E1000E_HWTS=y
CONFIG_IGB=y
CONFIG_IGB_HWMON=y
CONFIG_IGBVF=m
CONFIG_IXGB=m
CONFIG_IXGBE=y
CONFIG_IXGBE_HWMON=y
CONFIG_IXGBE_DCB=y
CONFIG_IXGBEVF=m
CONFIG_I40E=m
# CONFIG_I40E_VXLAN is not set
# CONFIG_I40E_DCB is not set
# CONFIG_I40E_FCOE is not set
# CONFIG_I40EVF is not set
# CONFIG_FM10K is not set
# CONFIG_NET_VENDOR_I825XX is not set
CONFIG_JME=m
CONFIG_NET_VENDOR_MARVELL=y
CONFIG_MVMDIO=m
# CONFIG_MVNETA_BM is not set
CONFIG_SKGE=m
CONFIG_SKGE_DEBUG=y
CONFIG_SKGE_GENESIS=y
CONFIG_SKY2=m
CONFIG_SKY2_DEBUG=y
CONFIG_NET_VENDOR_MELLANOX=y
CONFIG_MLX4_EN=m
CONFIG_MLX4_EN_DCB=y
CONFIG_MLX4_EN_VXLAN=y
CONFIG_MLX4_CORE=m
CONFIG_MLX4_DEBUG=y
# CONFIG_MLX5_CORE is not set
# CONFIG_MLXSW_CORE is not set
# CONFIG_NET_VENDOR_MICREL is not set
CONFIG_NET_VENDOR_MICROCHIP=y
# CONFIG_ENC28J60 is not set
# CONFIG_ENCX24J600 is not set
CONFIG_NET_VENDOR_MYRI=y
CONFIG_MYRI10GE=m
# CONFIG_FEALNX is not set
# CONFIG_NET_VENDOR_NATSEMI is not set
CONFIG_NET_VENDOR_NETRONOME=y
# CONFIG_NFP_NETVF is not set
# CONFIG_NET_VENDOR_NVIDIA is not set
CONFIG_NET_VENDOR_OKI=y
CONFIG_ETHOC=m
CONFIG_NET_PACKET_ENGINE=y
# CONFIG_HAMACHI is not set
CONFIG_YELLOWFIN=m
CONFIG_NET_VENDOR_QLOGIC=y
CONFIG_QLA3XXX=m
CONFIG_QLCNIC=m
CONFIG_QLCNIC_SRIOV=y
CONFIG_QLCNIC_DCB=y
# CONFIG_QLCNIC_VXLAN is not set
CONFIG_QLCNIC_HWMON=y
CONFIG_QLGE=m
CONFIG_NETXEN_NIC=m
# CONFIG_QED is not set
CONFIG_NET_VENDOR_QUALCOMM=y
CONFIG_NET_VENDOR_REALTEK=y
# CONFIG_ATP is not set
CONFIG_8139CP=y
CONFIG_8139TOO=y
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_R8169=y
CONFIG_NET_VENDOR_RENESAS=y
# CONFIG_NET_VENDOR_RDC is not set
CONFIG_NET_VENDOR_ROCKER=y
CONFIG_NET_VENDOR_SAMSUNG=y
# CONFIG_SXGBE_ETH is not set
# CONFIG_NET_VENDOR_SEEQ is not set
# CONFIG_NET_VENDOR_SILAN is not set
# CONFIG_NET_VENDOR_SIS is not set
CONFIG_SFC=m
CONFIG_SFC_MTD=y
CONFIG_SFC_MCDI_MON=y
CONFIG_SFC_SRIOV=y
CONFIG_SFC_MCDI_LOGGING=y
CONFIG_NET_VENDOR_SMSC=y
CONFIG_EPIC100=m
# CONFIG_SMSC911X is not set
CONFIG_SMSC9420=m
# CONFIG_NET_VENDOR_STMICRO is not set
# CONFIG_NET_VENDOR_SUN is not set
CONFIG_NET_VENDOR_SYNOPSYS=y
# CONFIG_NET_VENDOR_TEHUTI is not set
# CONFIG_NET_VENDOR_TI is not set
# CONFIG_NET_VENDOR_VIA is not set
# CONFIG_NET_VENDOR_WIZNET is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_NET_SB1000 is not set
CONFIG_PHYLIB=y
#
# MII PHY device drivers
#
# CONFIG_AQUANTIA_PHY is not set
CONFIG_AT803X_PHY=m
CONFIG_AMD_PHY=m
CONFIG_MARVELL_PHY=m
CONFIG_DAVICOM_PHY=m
CONFIG_QSEMI_PHY=m
CONFIG_LXT_PHY=m
CONFIG_CICADA_PHY=m
CONFIG_VITESSE_PHY=m
# CONFIG_TERANETICS_PHY is not set
CONFIG_SMSC_PHY=m
CONFIG_BCM_NET_PHYLIB=m
CONFIG_BROADCOM_PHY=m
# CONFIG_BCM7XXX_PHY is not set
CONFIG_BCM87XX_PHY=m
CONFIG_ICPLUS_PHY=m
CONFIG_REALTEK_PHY=m
CONFIG_NATIONAL_PHY=m
CONFIG_STE10XP=m
CONFIG_LSI_ET1011C_PHY=m
CONFIG_MICREL_PHY=m
# CONFIG_DP83848_PHY is not set
# CONFIG_DP83867_PHY is not set
# CONFIG_MICROCHIP_PHY is not set
CONFIG_FIXED_PHY=y
CONFIG_MDIO_BITBANG=m
# CONFIG_MDIO_GPIO is not set
# CONFIG_MDIO_OCTEON is not set
# CONFIG_MDIO_THUNDER is not set
# CONFIG_MDIO_BCM_UNIMAC is not set
# CONFIG_MICREL_KS8995MA is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_FILTER=y
CONFIG_PPP_MPPE=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPPOATM=m
CONFIG_PPPOE=m
CONFIG_PPTP=m
CONFIG_PPPOL2TP=m
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_SLIP=m
CONFIG_SLHC=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_USB_NET_DRIVERS=y
CONFIG_USB_CATC=y
CONFIG_USB_KAWETH=y
CONFIG_USB_PEGASUS=y
CONFIG_USB_RTL8150=y
CONFIG_USB_RTL8152=m
# CONFIG_USB_LAN78XX is not set
CONFIG_USB_USBNET=y
CONFIG_USB_NET_AX8817X=y
CONFIG_USB_NET_AX88179_178A=m
CONFIG_USB_NET_CDCETHER=y
CONFIG_USB_NET_CDC_EEM=y
CONFIG_USB_NET_CDC_NCM=m
# CONFIG_USB_NET_HUAWEI_CDC_NCM is not set
CONFIG_USB_NET_CDC_MBIM=m
CONFIG_USB_NET_DM9601=y
# CONFIG_USB_NET_SR9700 is not set
# CONFIG_USB_NET_SR9800 is not set
CONFIG_USB_NET_SMSC75XX=y
CONFIG_USB_NET_SMSC95XX=y
CONFIG_USB_NET_GL620A=y
CONFIG_USB_NET_NET1080=y
CONFIG_USB_NET_PLUSB=y
CONFIG_USB_NET_MCS7830=y
CONFIG_USB_NET_RNDIS_HOST=y
CONFIG_USB_NET_CDC_SUBSET_ENABLE=y
CONFIG_USB_NET_CDC_SUBSET=y
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_KC2190=y
CONFIG_USB_NET_ZAURUS=y
CONFIG_USB_NET_CX82310_ETH=m
CONFIG_USB_NET_KALMIA=m
CONFIG_USB_NET_QMI_WWAN=m
CONFIG_USB_HSO=m
CONFIG_USB_NET_INT51X1=y
CONFIG_USB_IPHETH=y
CONFIG_USB_SIERRA_NET=y
CONFIG_USB_VL600=m
# CONFIG_USB_NET_CH9200 is not set
CONFIG_WLAN=y
CONFIG_WLAN_VENDOR_ADMTEK=y
# CONFIG_ADM8211 is not set
CONFIG_WLAN_VENDOR_ATH=y
# CONFIG_ATH_DEBUG is not set
# CONFIG_ATH5K is not set
# CONFIG_ATH5K_PCI is not set
# CONFIG_ATH9K is not set
# CONFIG_ATH9K_HTC is not set
# CONFIG_CARL9170 is not set
# CONFIG_ATH6KL is not set
# CONFIG_AR5523 is not set
# CONFIG_WIL6210 is not set
# CONFIG_ATH10K is not set
# CONFIG_WCN36XX is not set
CONFIG_WLAN_VENDOR_ATMEL=y
# CONFIG_ATMEL is not set
# CONFIG_AT76C50X_USB is not set
CONFIG_WLAN_VENDOR_BROADCOM=y
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_BRCMSMAC is not set
# CONFIG_BRCMFMAC is not set
CONFIG_WLAN_VENDOR_CISCO=y
# CONFIG_AIRO is not set
CONFIG_WLAN_VENDOR_INTEL=y
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_IWL4965 is not set
# CONFIG_IWL3945 is not set
# CONFIG_IWLWIFI is not set
CONFIG_WLAN_VENDOR_INTERSIL=y
# CONFIG_HOSTAP is not set
# CONFIG_HERMES is not set
# CONFIG_P54_COMMON is not set
# CONFIG_PRISM54 is not set
CONFIG_WLAN_VENDOR_MARVELL=y
# CONFIG_LIBERTAS is not set
# CONFIG_LIBERTAS_THINFIRM is not set
# CONFIG_MWIFIEX is not set
# CONFIG_MWL8K is not set
CONFIG_WLAN_VENDOR_MEDIATEK=y
# CONFIG_MT7601U is not set
CONFIG_WLAN_VENDOR_RALINK=y
# CONFIG_RT2X00 is not set
CONFIG_WLAN_VENDOR_REALTEK=y
# CONFIG_RTL8180 is not set
# CONFIG_RTL8187 is not set
CONFIG_RTL_CARDS=m
# CONFIG_RTL8192CE is not set
# CONFIG_RTL8192SE is not set
# CONFIG_RTL8192DE is not set
# CONFIG_RTL8723AE is not set
# CONFIG_RTL8723BE is not set
# CONFIG_RTL8188EE is not set
# CONFIG_RTL8192EE is not set
# CONFIG_RTL8821AE is not set
# CONFIG_RTL8192CU is not set
# CONFIG_RTL8XXXU is not set
CONFIG_WLAN_VENDOR_RSI=y
# CONFIG_RSI_91X is not set
CONFIG_WLAN_VENDOR_ST=y
# CONFIG_CW1200 is not set
CONFIG_WLAN_VENDOR_TI=y
# CONFIG_WL1251 is not set
# CONFIG_WL12XX is not set
# CONFIG_WL18XX is not set
# CONFIG_WLCORE is not set
CONFIG_WLAN_VENDOR_ZYDAS=y
# CONFIG_USB_ZD1201 is not set
# CONFIG_ZD1211RW is not set
CONFIG_MAC80211_HWSIM=m
# CONFIG_USB_NET_RNDIS_WLAN is not set
#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
CONFIG_WAN=y
# CONFIG_LANMEDIA is not set
CONFIG_HDLC=m
CONFIG_HDLC_RAW=m
# CONFIG_HDLC_RAW_ETH is not set
CONFIG_HDLC_CISCO=m
CONFIG_HDLC_FR=m
CONFIG_HDLC_PPP=m
#
# X.25/LAPB support is disabled
#
# CONFIG_PCI200SYN is not set
# CONFIG_WANXL is not set
# CONFIG_PC300TOO is not set
# CONFIG_FARSYNC is not set
# CONFIG_DSCC4 is not set
CONFIG_DLCI=m
CONFIG_DLCI_MAX=8
# CONFIG_SBNI is not set
CONFIG_IEEE802154_DRIVERS=m
CONFIG_IEEE802154_FAKELB=m
# CONFIG_IEEE802154_AT86RF230 is not set
# CONFIG_IEEE802154_MRF24J40 is not set
# CONFIG_IEEE802154_CC2520 is not set
# CONFIG_IEEE802154_ATUSB is not set
# CONFIG_IEEE802154_ADF7242 is not set
CONFIG_XEN_NETDEV_FRONTEND=m
# CONFIG_XEN_NETDEV_BACKEND is not set
CONFIG_VMXNET3=m
# CONFIG_FUJITSU_ES is not set
CONFIG_HYPERV_NET=m
CONFIG_ISDN=y
CONFIG_ISDN_I4L=m
CONFIG_ISDN_PPP=y
CONFIG_ISDN_PPP_VJ=y
CONFIG_ISDN_MPP=y
CONFIG_IPPP_FILTER=y
# CONFIG_ISDN_PPP_BSDCOMP is not set
CONFIG_ISDN_AUDIO=y
CONFIG_ISDN_TTY_FAX=y
#
# ISDN feature submodules
#
CONFIG_ISDN_DIVERSION=m
#
# ISDN4Linux hardware drivers
#
#
# Passive cards
#
# CONFIG_ISDN_DRV_HISAX is not set
CONFIG_ISDN_CAPI=m
# CONFIG_CAPI_TRACE is not set
CONFIG_ISDN_CAPI_CAPI20=m
CONFIG_ISDN_CAPI_MIDDLEWARE=y
CONFIG_ISDN_CAPI_CAPIDRV=m
# CONFIG_ISDN_CAPI_CAPIDRV_VERBOSE is not set
#
# CAPI hardware drivers
#
CONFIG_CAPI_AVM=y
CONFIG_ISDN_DRV_AVMB1_B1PCI=m
CONFIG_ISDN_DRV_AVMB1_B1PCIV4=y
CONFIG_ISDN_DRV_AVMB1_T1PCI=m
CONFIG_ISDN_DRV_AVMB1_C4=m
# CONFIG_CAPI_EICON is not set
CONFIG_ISDN_DRV_GIGASET=m
CONFIG_GIGASET_CAPI=y
# CONFIG_GIGASET_I4L is not set
# CONFIG_GIGASET_DUMMYLL is not set
CONFIG_GIGASET_BASE=m
CONFIG_GIGASET_M105=m
CONFIG_GIGASET_M101=m
# CONFIG_GIGASET_DEBUG is not set
CONFIG_HYSDN=m
CONFIG_HYSDN_CAPI=y
CONFIG_MISDN=m
CONFIG_MISDN_DSP=m
CONFIG_MISDN_L1OIP=m
#
# mISDN hardware drivers
#
CONFIG_MISDN_HFCPCI=m
CONFIG_MISDN_HFCMULTI=m
CONFIG_MISDN_HFCUSB=m
CONFIG_MISDN_AVMFRITZ=m
CONFIG_MISDN_SPEEDFAX=m
CONFIG_MISDN_INFINEON=m
CONFIG_MISDN_W6692=m
CONFIG_MISDN_NETJET=m
CONFIG_MISDN_IPAC=m
CONFIG_MISDN_ISAR=m
CONFIG_ISDN_HDLC=m
# CONFIG_NVM is not set
#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_LEDS=y
CONFIG_INPUT_FF_MEMLESS=m
CONFIG_INPUT_POLLDEV=m
CONFIG_INPUT_SPARSEKMAP=m
# CONFIG_INPUT_MATRIXKMAP is not set
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
# CONFIG_KEYBOARD_ADP5588 is not set
# CONFIG_KEYBOARD_ADP5589 is not set
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_QT1070 is not set
# CONFIG_KEYBOARD_QT2160 is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_GPIO is not set
# CONFIG_KEYBOARD_GPIO_POLLED is not set
# CONFIG_KEYBOARD_TCA6416 is not set
# CONFIG_KEYBOARD_TCA8418 is not set
# CONFIG_KEYBOARD_MATRIX is not set
# CONFIG_KEYBOARD_LM8323 is not set
# CONFIG_KEYBOARD_LM8333 is not set
# CONFIG_KEYBOARD_MAX7359 is not set
# CONFIG_KEYBOARD_MCS is not set
# CONFIG_KEYBOARD_MPR121 is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_SAMSUNG is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_BYD=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_CYPRESS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_MOUSE_PS2_ELANTECH=y
CONFIG_MOUSE_PS2_SENTELIC=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_PS2_FOCALTECH=y
# CONFIG_MOUSE_PS2_VMMOUSE is not set
CONFIG_MOUSE_SERIAL=m
CONFIG_MOUSE_APPLETOUCH=m
CONFIG_MOUSE_BCM5974=m
CONFIG_MOUSE_CYAPA=m
# CONFIG_MOUSE_ELAN_I2C is not set
CONFIG_MOUSE_VSXXXAA=m
# CONFIG_MOUSE_GPIO is not set
CONFIG_MOUSE_SYNAPTICS_I2C=m
CONFIG_MOUSE_SYNAPTICS_USB=m
# CONFIG_INPUT_JOYSTICK is not set
CONFIG_INPUT_TABLET=y
CONFIG_TABLET_USB_ACECAD=m
CONFIG_TABLET_USB_AIPTEK=m
CONFIG_TABLET_USB_GTCO=m
# CONFIG_TABLET_USB_HANWANG is not set
CONFIG_TABLET_USB_KBTAB=m
# CONFIG_TABLET_SERIAL_WACOM4 is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_PROPERTIES=y
# CONFIG_TOUCHSCREEN_ADS7846 is not set
# CONFIG_TOUCHSCREEN_AD7877 is not set
# CONFIG_TOUCHSCREEN_AD7879 is not set
# CONFIG_TOUCHSCREEN_ATMEL_MXT is not set
# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set
# CONFIG_TOUCHSCREEN_BU21013 is not set
# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set
# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set
# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set
# CONFIG_TOUCHSCREEN_DYNAPRO is not set
# CONFIG_TOUCHSCREEN_HAMPSHIRE is not set
# CONFIG_TOUCHSCREEN_EETI is not set
# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set
# CONFIG_TOUCHSCREEN_FT6236 is not set
# CONFIG_TOUCHSCREEN_FUJITSU is not set
# CONFIG_TOUCHSCREEN_GOODIX is not set
# CONFIG_TOUCHSCREEN_ILI210X is not set
# CONFIG_TOUCHSCREEN_GUNZE is not set
# CONFIG_TOUCHSCREEN_ELAN is not set
# CONFIG_TOUCHSCREEN_ELO is not set
CONFIG_TOUCHSCREEN_WACOM_W8001=m
CONFIG_TOUCHSCREEN_WACOM_I2C=m
# CONFIG_TOUCHSCREEN_MAX11801 is not set
# CONFIG_TOUCHSCREEN_MCS5000 is not set
# CONFIG_TOUCHSCREEN_MMS114 is not set
# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_INEXIO is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
# CONFIG_TOUCHSCREEN_PENMOUNT is not set
# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set
# CONFIG_TOUCHSCREEN_TOUCHRIGHT is not set
# CONFIG_TOUCHSCREEN_TOUCHWIN is not set
# CONFIG_TOUCHSCREEN_PIXCIR is not set
# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set
# CONFIG_TOUCHSCREEN_WM97XX is not set
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
# CONFIG_TOUCHSCREEN_TOUCHIT213 is not set
# CONFIG_TOUCHSCREEN_TSC_SERIO is not set
# CONFIG_TOUCHSCREEN_TSC2004 is not set
# CONFIG_TOUCHSCREEN_TSC2005 is not set
# CONFIG_TOUCHSCREEN_TSC2007 is not set
# CONFIG_TOUCHSCREEN_ST1232 is not set
# CONFIG_TOUCHSCREEN_SUR40 is not set
# CONFIG_TOUCHSCREEN_SX8654 is not set
# CONFIG_TOUCHSCREEN_TPS6507X is not set
# CONFIG_TOUCHSCREEN_ZFORCE is not set
# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_AD714X is not set
# CONFIG_INPUT_BMA150 is not set
# CONFIG_INPUT_E3X0_BUTTON is not set
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_MMA8450 is not set
# CONFIG_INPUT_MPU3050 is not set
CONFIG_INPUT_APANEL=m
# CONFIG_INPUT_GP2A is not set
# CONFIG_INPUT_GPIO_BEEPER is not set
# CONFIG_INPUT_GPIO_TILT_POLLED is not set
CONFIG_INPUT_ATLAS_BTNS=m
CONFIG_INPUT_ATI_REMOTE2=m
CONFIG_INPUT_KEYSPAN_REMOTE=m
# CONFIG_INPUT_KXTJ9 is not set
CONFIG_INPUT_POWERMATE=m
CONFIG_INPUT_YEALINK=m
CONFIG_INPUT_CM109=m
CONFIG_INPUT_UINPUT=m
# CONFIG_INPUT_PCF8574 is not set
# CONFIG_INPUT_PWM_BEEPER is not set
# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set
# CONFIG_INPUT_ADXL34X is not set
# CONFIG_INPUT_IMS_PCU is not set
# CONFIG_INPUT_CMA3000 is not set
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
# CONFIG_INPUT_DRV260X_HAPTICS is not set
# CONFIG_INPUT_DRV2665_HAPTICS is not set
# CONFIG_INPUT_DRV2667_HAPTICS is not set
# CONFIG_RMI4_CORE is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_SERIO_ALTERA_PS2=m
# CONFIG_SERIO_PS2MULT is not set
CONFIG_SERIO_ARC_PS2=m
CONFIG_HYPERV_KEYBOARD=m
# CONFIG_USERIO is not set
# CONFIG_GAMEPORT is not set
#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_ROCKETPORT is not set
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
CONFIG_MOXA_INTELLIO=m
CONFIG_MOXA_SMARTIO=m
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_NOZOMI=m
# CONFIG_ISI is not set
CONFIG_N_HDLC=m
CONFIG_N_GSM=m
# CONFIG_TRACE_SINK is not set
CONFIG_DEVMEM=y
# CONFIG_DEVKMEM is not set
#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_DMA=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_FSL is not set
CONFIG_SERIAL_8250_DW=y
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_MID=y
# CONFIG_SERIAL_8250_MOXA is not set
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MAX3100 is not set
# CONFIG_SERIAL_MAX310X is not set
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_SC16IS7XX is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_IFX6X60 is not set
CONFIG_SERIAL_ARC=m
CONFIG_SERIAL_ARC_NR_PORTS=1
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
CONFIG_HVC_XEN_FRONTEND=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
# CONFIG_IPMI_SI_PROBE_DEFAULTS is not set
# CONFIG_IPMI_SSIF is not set
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=m
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_HW_RANDOM_TPM=m
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=y
CONFIG_MAX_RAW_DEVS=8192
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
# CONFIG_HPET_MMAP_DEFAULT is not set
CONFIG_HANGCHECK_TIMER=m
CONFIG_UV_MMTIMER=m
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS=y
# CONFIG_TCG_TIS_I2C_ATMEL is not set
# CONFIG_TCG_TIS_I2C_INFINEON is not set
# CONFIG_TCG_TIS_I2C_NUVOTON is not set
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
# CONFIG_TCG_XEN is not set
# CONFIG_TCG_CRB is not set
# CONFIG_TCG_TIS_ST33ZP24 is not set
CONFIG_TELCLOCK=m
CONFIG_DEVPORT=y
# CONFIG_XILLYBUS is not set
#
# I2C support
#
CONFIG_I2C=y
CONFIG_ACPI_I2C_OPREGION=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_COMPAT=y
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_MUX=m
#
# Multiplexer I2C Chip support
#
# CONFIG_I2C_MUX_GPIO is not set
# CONFIG_I2C_MUX_PCA9541 is not set
# CONFIG_I2C_MUX_PCA954x is not set
# CONFIG_I2C_MUX_PINCTRL is not set
# CONFIG_I2C_MUX_REG is not set
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_SMBUS=m
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCA=m
#
# I2C Hardware Bus support
#
#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=y
CONFIG_I2C_ISCH=m
CONFIG_I2C_ISMT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_NFORCE2_S4985=m
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
CONFIG_I2C_SIS96X=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
#
# ACPI drivers
#
CONFIG_I2C_SCMI=m
#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
# CONFIG_I2C_CBUS_GPIO is not set
CONFIG_I2C_DESIGNWARE_CORE=m
CONFIG_I2C_DESIGNWARE_PLATFORM=m
CONFIG_I2C_DESIGNWARE_PCI=m
# CONFIG_I2C_DESIGNWARE_BAYTRAIL is not set
# CONFIG_I2C_EMEV2 is not set
# CONFIG_I2C_GPIO is not set
# CONFIG_I2C_OCORES is not set
CONFIG_I2C_PCA_PLATFORM=m
# CONFIG_I2C_PXA_PCI is not set
CONFIG_I2C_SIMTEC=m
# CONFIG_I2C_XILINX is not set
#
# External I2C/SMBus adapter drivers
#
CONFIG_I2C_DIOLAN_U2C=m
CONFIG_I2C_PARPORT=m
CONFIG_I2C_PARPORT_LIGHT=m
# CONFIG_I2C_ROBOTFUZZ_OSIF is not set
# CONFIG_I2C_TAOS_EVM is not set
CONFIG_I2C_TINY_USB=m
CONFIG_I2C_VIPERBOARD=m
#
# Other I2C/SMBus bus drivers
#
CONFIG_I2C_STUB=m
# CONFIG_I2C_SLAVE is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
CONFIG_SPI=y
# CONFIG_SPI_DEBUG is not set
CONFIG_SPI_MASTER=y
#
# SPI Master Controller Drivers
#
# CONFIG_SPI_ALTERA is not set
# CONFIG_SPI_AXI_SPI_ENGINE is not set
# CONFIG_SPI_BITBANG is not set
# CONFIG_SPI_BUTTERFLY is not set
# CONFIG_SPI_CADENCE is not set
CONFIG_SPI_DESIGNWARE=m
# CONFIG_SPI_DW_PCI is not set
# CONFIG_SPI_DW_MMIO is not set
# CONFIG_SPI_GPIO is not set
# CONFIG_SPI_LM70_LLP is not set
# CONFIG_SPI_OC_TINY is not set
CONFIG_SPI_PXA2XX=m
CONFIG_SPI_PXA2XX_PCI=m
# CONFIG_SPI_ROCKCHIP is not set
# CONFIG_SPI_SC18IS602 is not set
# CONFIG_SPI_XCOMM is not set
# CONFIG_SPI_XILINX is not set
# CONFIG_SPI_ZYNQMP_GQSPI is not set
#
# SPI Protocol Masters
#
# CONFIG_SPI_SPIDEV is not set
# CONFIG_SPI_LOOPBACK_TEST is not set
# CONFIG_SPI_TLE62X0 is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
#
# PPS support
#
CONFIG_PPS=y
# CONFIG_PPS_DEBUG is not set
#
# PPS clients support
#
# CONFIG_PPS_CLIENT_KTIMER is not set
CONFIG_PPS_CLIENT_LDISC=m
CONFIG_PPS_CLIENT_PARPORT=m
CONFIG_PPS_CLIENT_GPIO=m
#
# PPS generators support
#
#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK=y
CONFIG_DP83640_PHY=m
CONFIG_PINCTRL=y
#
# Pin controllers
#
CONFIG_PINMUX=y
CONFIG_PINCONF=y
CONFIG_GENERIC_PINCONF=y
# CONFIG_DEBUG_PINCTRL is not set
# CONFIG_PINCTRL_AMD is not set
CONFIG_PINCTRL_BAYTRAIL=y
# CONFIG_PINCTRL_CHERRYVIEW is not set
# CONFIG_PINCTRL_BROXTON is not set
# CONFIG_PINCTRL_SUNRISEPOINT is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_DEVRES=y
CONFIG_GPIO_ACPI=y
CONFIG_GPIOLIB_IRQCHIP=y
# CONFIG_DEBUG_GPIO is not set
CONFIG_GPIO_SYSFS=y
#
# Memory mapped GPIO drivers
#
# CONFIG_GPIO_AMDPT is not set
# CONFIG_GPIO_DWAPB is not set
# CONFIG_GPIO_GENERIC_PLATFORM is not set
# CONFIG_GPIO_ICH is not set
CONFIG_GPIO_LYNXPOINT=y
# CONFIG_GPIO_VX855 is not set
# CONFIG_GPIO_ZX is not set
#
# Port-mapped I/O GPIO drivers
#
# CONFIG_GPIO_F7188X is not set
# CONFIG_GPIO_IT87 is not set
# CONFIG_GPIO_SCH is not set
# CONFIG_GPIO_SCH311X is not set
#
# I2C GPIO expanders
#
# CONFIG_GPIO_ADP5588 is not set
# CONFIG_GPIO_MAX7300 is not set
# CONFIG_GPIO_MAX732X is not set
# CONFIG_GPIO_PCA953X is not set
# CONFIG_GPIO_PCF857X is not set
# CONFIG_GPIO_SX150X is not set
# CONFIG_GPIO_TPIC2810 is not set
#
# MFD GPIO expanders
#
#
# PCI GPIO expanders
#
# CONFIG_GPIO_AMD8111 is not set
# CONFIG_GPIO_INTEL_MID is not set
# CONFIG_GPIO_ML_IOH is not set
# CONFIG_GPIO_RDC321X is not set
#
# SPI GPIO expanders
#
# CONFIG_GPIO_MAX7301 is not set
# CONFIG_GPIO_MC33880 is not set
# CONFIG_GPIO_PISOSR is not set
#
# SPI or I2C GPIO expanders
#
# CONFIG_GPIO_MCP23S08 is not set
#
# USB GPIO expanders
#
# CONFIG_GPIO_VIPERBOARD is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_TEST_POWER is not set
# CONFIG_BATTERY_DS2780 is not set
# CONFIG_BATTERY_DS2781 is not set
# CONFIG_BATTERY_DS2782 is not set
# CONFIG_BATTERY_SBS is not set
# CONFIG_BATTERY_BQ27XXX is not set
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_BATTERY_MAX17042 is not set
# CONFIG_CHARGER_ISP1704 is not set
# CONFIG_CHARGER_MAX8903 is not set
# CONFIG_CHARGER_LP8727 is not set
# CONFIG_CHARGER_GPIO is not set
# CONFIG_CHARGER_BQ2415X is not set
# CONFIG_CHARGER_BQ24190 is not set
# CONFIG_CHARGER_BQ24257 is not set
# CONFIG_CHARGER_BQ24735 is not set
# CONFIG_CHARGER_BQ25890 is not set
CONFIG_CHARGER_SMB347=m
# CONFIG_BATTERY_GAUGE_LTC2941 is not set
# CONFIG_CHARGER_RT9455 is not set
CONFIG_POWER_RESET=y
# CONFIG_POWER_RESET_RESTART is not set
# CONFIG_POWER_AVS is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
# CONFIG_HWMON_DEBUG_CHIP is not set
#
# Native drivers
#
CONFIG_SENSORS_ABITUGURU=m
CONFIG_SENSORS_ABITUGURU3=m
# CONFIG_SENSORS_AD7314 is not set
CONFIG_SENSORS_AD7414=m
CONFIG_SENSORS_AD7418=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1029=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ADT7X10=m
# CONFIG_SENSORS_ADT7310 is not set
CONFIG_SENSORS_ADT7410=m
CONFIG_SENSORS_ADT7411=m
CONFIG_SENSORS_ADT7462=m
CONFIG_SENSORS_ADT7470=m
CONFIG_SENSORS_ADT7475=m
CONFIG_SENSORS_ASC7621=m
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_K10TEMP=m
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_APPLESMC=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS620=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_DELL_SMM=m
CONFIG_SENSORS_I5K_AMB=m
CONFIG_SENSORS_F71805F=m
CONFIG_SENSORS_F71882FG=m
CONFIG_SENSORS_F75375S=m
CONFIG_SENSORS_FSCHMD=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
CONFIG_SENSORS_G760A=m
# CONFIG_SENSORS_G762 is not set
# CONFIG_SENSORS_GPIO_FAN is not set
# CONFIG_SENSORS_HIH6130 is not set
CONFIG_SENSORS_IBMAEM=m
CONFIG_SENSORS_IBMPEX=m
# CONFIG_SENSORS_I5500 is not set
CONFIG_SENSORS_CORETEMP=m
CONFIG_SENSORS_IT87=m
# CONFIG_SENSORS_JC42 is not set
# CONFIG_SENSORS_POWR1220 is not set
CONFIG_SENSORS_LINEAGE=m
# CONFIG_SENSORS_LTC2945 is not set
# CONFIG_SENSORS_LTC2990 is not set
CONFIG_SENSORS_LTC4151=m
CONFIG_SENSORS_LTC4215=m
# CONFIG_SENSORS_LTC4222 is not set
CONFIG_SENSORS_LTC4245=m
# CONFIG_SENSORS_LTC4260 is not set
CONFIG_SENSORS_LTC4261=m
# CONFIG_SENSORS_MAX1111 is not set
CONFIG_SENSORS_MAX16065=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_MAX1668=m
CONFIG_SENSORS_MAX197=m
# CONFIG_SENSORS_MAX31722 is not set
CONFIG_SENSORS_MAX6639=m
CONFIG_SENSORS_MAX6642=m
CONFIG_SENSORS_MAX6650=m
CONFIG_SENSORS_MAX6697=m
# CONFIG_SENSORS_MAX31790 is not set
CONFIG_SENSORS_MCP3021=m
# CONFIG_SENSORS_ADCXX is not set
CONFIG_SENSORS_LM63=m
# CONFIG_SENSORS_LM70 is not set
CONFIG_SENSORS_LM73=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
CONFIG_SENSORS_LM93=m
CONFIG_SENSORS_LM95234=m
CONFIG_SENSORS_LM95241=m
CONFIG_SENSORS_LM95245=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_PC87427=m
CONFIG_SENSORS_NTC_THERMISTOR=m
# CONFIG_SENSORS_NCT6683 is not set
CONFIG_SENSORS_NCT6775=m
# CONFIG_SENSORS_NCT7802 is not set
# CONFIG_SENSORS_NCT7904 is not set
CONFIG_SENSORS_PCF8591=m
CONFIG_PMBUS=m
CONFIG_SENSORS_PMBUS=m
CONFIG_SENSORS_ADM1275=m
CONFIG_SENSORS_LM25066=m
CONFIG_SENSORS_LTC2978=m
# CONFIG_SENSORS_LTC3815 is not set
CONFIG_SENSORS_MAX16064=m
# CONFIG_SENSORS_MAX20751 is not set
CONFIG_SENSORS_MAX34440=m
CONFIG_SENSORS_MAX8688=m
# CONFIG_SENSORS_TPS40422 is not set
CONFIG_SENSORS_UCD9000=m
CONFIG_SENSORS_UCD9200=m
CONFIG_SENSORS_ZL6100=m
# CONFIG_SENSORS_SHT15 is not set
CONFIG_SENSORS_SHT21=m
# CONFIG_SENSORS_SHTC1 is not set
CONFIG_SENSORS_SIS5595=m
CONFIG_SENSORS_DME1737=m
CONFIG_SENSORS_EMC1403=m
# CONFIG_SENSORS_EMC2103 is not set
CONFIG_SENSORS_EMC6W201=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_SCH56XX_COMMON=m
CONFIG_SENSORS_SCH5627=m
CONFIG_SENSORS_SCH5636=m
# CONFIG_SENSORS_SMM665 is not set
# CONFIG_SENSORS_ADC128D818 is not set
CONFIG_SENSORS_ADS1015=m
CONFIG_SENSORS_ADS7828=m
# CONFIG_SENSORS_ADS7871 is not set
CONFIG_SENSORS_AMC6821=m
CONFIG_SENSORS_INA209=m
CONFIG_SENSORS_INA2XX=m
# CONFIG_SENSORS_TC74 is not set
CONFIG_SENSORS_THMC50=m
CONFIG_SENSORS_TMP102=m
# CONFIG_SENSORS_TMP103 is not set
CONFIG_SENSORS_TMP401=m
CONFIG_SENSORS_TMP421=m
CONFIG_SENSORS_VIA_CPUTEMP=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83795=m
# CONFIG_SENSORS_W83795_FANCTRL is not set
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83L786NG=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
#
# ACPI drivers
#
CONFIG_SENSORS_ACPI_POWER=m
CONFIG_SENSORS_ATK0110=m
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_THERMAL_WRITABLE_TRIPS=y
CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
# CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR is not set
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_STEP_WISE=y
CONFIG_THERMAL_GOV_BANG_BANG=y
CONFIG_THERMAL_GOV_USER_SPACE=y
# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set
# CONFIG_THERMAL_EMULATION is not set
CONFIG_INTEL_POWERCLAMP=m
CONFIG_X86_PKG_TEMP_THERMAL=m
# CONFIG_INTEL_SOC_DTS_THERMAL is not set
#
# ACPI INT340X thermal drivers
#
# CONFIG_INT340X_THERMAL is not set
CONFIG_INTEL_PCH_THERMAL=m
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
# CONFIG_WATCHDOG_SYSFS is not set
#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_F71808E_WDT=m
CONFIG_SP5100_TCO=m
CONFIG_SBC_FITPC2_WATCHDOG=m
# CONFIG_EUROTECH_WDT is not set
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=y
CONFIG_IE6XX_WDT=m
CONFIG_ITCO_WDT=y
CONFIG_ITCO_VENDOR_SUPPORT=y
CONFIG_IT8712F_WDT=m
CONFIG_IT87_WDT=m
CONFIG_HP_WATCHDOG=m
CONFIG_HPWDT_NMI_DECODING=y
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
CONFIG_NV_TCO=m
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
CONFIG_SMSC_SCH311X_WDT=m
# CONFIG_SMSC37B787_WDT is not set
CONFIG_VIA_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
# CONFIG_SBC_EPX_C3_WATCHDOG is not set
# CONFIG_INTEL_MEI_WDT is not set
# CONFIG_NI903X_WDT is not set
# CONFIG_MEN_A21_WDT is not set
CONFIG_XEN_WDT=m
#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m
CONFIG_SSB_POSSIBLE=y
#
# Sonics Silicon Backplane
#
CONFIG_SSB=m
CONFIG_SSB_SPROM=y
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_B43_PCI_BRIDGE is not set
CONFIG_SSB_SDIOHOST_POSSIBLE=y
CONFIG_SSB_SDIOHOST=y
# CONFIG_SSB_DEBUG is not set
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y
# CONFIG_SSB_DRIVER_GPIO is not set
CONFIG_BCMA_POSSIBLE=y
#
# Broadcom specific AMBA
#
CONFIG_BCMA=m
CONFIG_BCMA_HOST_PCI_POSSIBLE=y
CONFIG_BCMA_HOST_PCI=y
# CONFIG_BCMA_HOST_SOC is not set
CONFIG_BCMA_DRIVER_PCI=y
CONFIG_BCMA_DRIVER_GMAC_CMN=y
# CONFIG_BCMA_DRIVER_GPIO is not set
# CONFIG_BCMA_DEBUG is not set
#
# Multifunction device drivers
#
CONFIG_MFD_CORE=y
# CONFIG_MFD_AS3711 is not set
# CONFIG_PMIC_ADP5520 is not set
# CONFIG_MFD_AAT2870_CORE is not set
# CONFIG_MFD_BCM590XX is not set
# CONFIG_MFD_AXP20X_I2C is not set
# CONFIG_MFD_CROS_EC is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_DA9052_SPI is not set
# CONFIG_MFD_DA9052_I2C is not set
# CONFIG_MFD_DA9055 is not set
# CONFIG_MFD_DA9062 is not set
# CONFIG_MFD_DA9063 is not set
# CONFIG_MFD_DA9150 is not set
# CONFIG_MFD_DLN2 is not set
# CONFIG_MFD_MC13XXX_SPI is not set
# CONFIG_MFD_MC13XXX_I2C is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_HTC_I2CPLD is not set
# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
CONFIG_LPC_ICH=y
CONFIG_LPC_SCH=m
# CONFIG_INTEL_SOC_PMIC is not set
# CONFIG_MFD_INTEL_LPSS_ACPI is not set
# CONFIG_MFD_INTEL_LPSS_PCI is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_88PM800 is not set
# CONFIG_MFD_88PM805 is not set
# CONFIG_MFD_88PM860X is not set
# CONFIG_MFD_MAX14577 is not set
# CONFIG_MFD_MAX77693 is not set
# CONFIG_MFD_MAX77843 is not set
# CONFIG_MFD_MAX8907 is not set
# CONFIG_MFD_MAX8925 is not set
# CONFIG_MFD_MAX8997 is not set
# CONFIG_MFD_MAX8998 is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_MENF21BMC is not set
# CONFIG_EZX_PCAP is not set
CONFIG_MFD_VIPERBOARD=m
# CONFIG_MFD_RETU is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_UCB1400_CORE is not set
# CONFIG_MFD_RDC321X is not set
CONFIG_MFD_RTSX_PCI=m
# CONFIG_MFD_RT5033 is not set
# CONFIG_MFD_RTSX_USB is not set
# CONFIG_MFD_RC5T583 is not set
# CONFIG_MFD_RN5T618 is not set
# CONFIG_MFD_SEC_CORE is not set
# CONFIG_MFD_SI476X_CORE is not set
CONFIG_MFD_SM501=m
# CONFIG_MFD_SM501_GPIO is not set
# CONFIG_MFD_SKY81452 is not set
# CONFIG_MFD_SMSC is not set
# CONFIG_ABX500_CORE is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_LP3943 is not set
# CONFIG_MFD_LP8788 is not set
# CONFIG_MFD_PALMAS is not set
# CONFIG_TPS6105X is not set
# CONFIG_TPS65010 is not set
# CONFIG_TPS6507X is not set
# CONFIG_MFD_TPS65086 is not set
# CONFIG_MFD_TPS65090 is not set
# CONFIG_MFD_TPS65217 is not set
# CONFIG_MFD_TPS65218 is not set
# CONFIG_MFD_TPS6586X is not set
# CONFIG_MFD_TPS65910 is not set
# CONFIG_MFD_TPS65912_I2C is not set
# CONFIG_MFD_TPS65912_SPI is not set
# CONFIG_MFD_TPS80031 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_TWL6040_CORE is not set
# CONFIG_MFD_WL1273_CORE is not set
# CONFIG_MFD_LM3533 is not set
# CONFIG_MFD_TMIO is not set
CONFIG_MFD_VX855=m
# CONFIG_MFD_ARIZONA_I2C is not set
# CONFIG_MFD_ARIZONA_SPI is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM831X_I2C is not set
# CONFIG_MFD_WM831X_SPI is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_WM8994 is not set
# CONFIG_REGULATOR is not set
CONFIG_MEDIA_SUPPORT=m
#
# Multimedia core support
#
CONFIG_MEDIA_CAMERA_SUPPORT=y
CONFIG_MEDIA_ANALOG_TV_SUPPORT=y
CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y
CONFIG_MEDIA_RADIO_SUPPORT=y
# CONFIG_MEDIA_SDR_SUPPORT is not set
CONFIG_MEDIA_RC_SUPPORT=y
# CONFIG_MEDIA_CONTROLLER is not set
CONFIG_VIDEO_DEV=m
CONFIG_VIDEO_V4L2=m
# CONFIG_VIDEO_ADV_DEBUG is not set
# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set
CONFIG_VIDEO_TUNER=m
CONFIG_VIDEOBUF_GEN=m
CONFIG_VIDEOBUF_DMA_SG=m
CONFIG_VIDEOBUF_VMALLOC=m
CONFIG_VIDEOBUF_DVB=m
CONFIG_VIDEOBUF2_CORE=m
CONFIG_VIDEOBUF2_MEMOPS=m
CONFIG_VIDEOBUF2_VMALLOC=m
CONFIG_VIDEOBUF2_DMA_SG=m
CONFIG_VIDEOBUF2_DVB=m
CONFIG_DVB_CORE=m
CONFIG_DVB_NET=y
CONFIG_TTPCI_EEPROM=m
CONFIG_DVB_MAX_ADAPTERS=8
CONFIG_DVB_DYNAMIC_MINORS=y
#
# Media drivers
#
CONFIG_RC_CORE=m
CONFIG_RC_MAP=m
CONFIG_RC_DECODERS=y
CONFIG_LIRC=m
CONFIG_IR_LIRC_CODEC=m
CONFIG_IR_NEC_DECODER=m
CONFIG_IR_RC5_DECODER=m
CONFIG_IR_RC6_DECODER=m
CONFIG_IR_JVC_DECODER=m
CONFIG_IR_SONY_DECODER=m
CONFIG_IR_SANYO_DECODER=m
CONFIG_IR_SHARP_DECODER=m
CONFIG_IR_MCE_KBD_DECODER=m
CONFIG_IR_XMP_DECODER=m
CONFIG_RC_DEVICES=y
CONFIG_RC_ATI_REMOTE=m
CONFIG_IR_ENE=m
# CONFIG_IR_HIX5HD2 is not set
CONFIG_IR_IMON=m
CONFIG_IR_MCEUSB=m
CONFIG_IR_ITE_CIR=m
CONFIG_IR_FINTEK=m
CONFIG_IR_NUVOTON=m
CONFIG_IR_REDRAT3=m
CONFIG_IR_STREAMZAP=m
CONFIG_IR_WINBOND_CIR=m
# CONFIG_IR_IGORPLUGUSB is not set
CONFIG_IR_IGUANA=m
CONFIG_IR_TTUSBIR=m
# CONFIG_RC_LOOPBACK is not set
CONFIG_IR_GPIO_CIR=m
CONFIG_MEDIA_USB_SUPPORT=y
#
# Webcam devices
#
CONFIG_USB_VIDEO_CLASS=m
CONFIG_USB_VIDEO_CLASS_INPUT_EVDEV=y
CONFIG_USB_GSPCA=m
CONFIG_USB_M5602=m
CONFIG_USB_STV06XX=m
CONFIG_USB_GL860=m
CONFIG_USB_GSPCA_BENQ=m
CONFIG_USB_GSPCA_CONEX=m
CONFIG_USB_GSPCA_CPIA1=m
# CONFIG_USB_GSPCA_DTCS033 is not set
CONFIG_USB_GSPCA_ETOMS=m
CONFIG_USB_GSPCA_FINEPIX=m
CONFIG_USB_GSPCA_JEILINJ=m
CONFIG_USB_GSPCA_JL2005BCD=m
# CONFIG_USB_GSPCA_KINECT is not set
CONFIG_USB_GSPCA_KONICA=m
CONFIG_USB_GSPCA_MARS=m
CONFIG_USB_GSPCA_MR97310A=m
CONFIG_USB_GSPCA_NW80X=m
CONFIG_USB_GSPCA_OV519=m
CONFIG_USB_GSPCA_OV534=m
CONFIG_USB_GSPCA_OV534_9=m
CONFIG_USB_GSPCA_PAC207=m
CONFIG_USB_GSPCA_PAC7302=m
CONFIG_USB_GSPCA_PAC7311=m
CONFIG_USB_GSPCA_SE401=m
CONFIG_USB_GSPCA_SN9C2028=m
CONFIG_USB_GSPCA_SN9C20X=m
CONFIG_USB_GSPCA_SONIXB=m
CONFIG_USB_GSPCA_SONIXJ=m
CONFIG_USB_GSPCA_SPCA500=m
CONFIG_USB_GSPCA_SPCA501=m
CONFIG_USB_GSPCA_SPCA505=m
CONFIG_USB_GSPCA_SPCA506=m
CONFIG_USB_GSPCA_SPCA508=m
CONFIG_USB_GSPCA_SPCA561=m
CONFIG_USB_GSPCA_SPCA1528=m
CONFIG_USB_GSPCA_SQ905=m
CONFIG_USB_GSPCA_SQ905C=m
CONFIG_USB_GSPCA_SQ930X=m
CONFIG_USB_GSPCA_STK014=m
# CONFIG_USB_GSPCA_STK1135 is not set
CONFIG_USB_GSPCA_STV0680=m
CONFIG_USB_GSPCA_SUNPLUS=m
CONFIG_USB_GSPCA_T613=m
CONFIG_USB_GSPCA_TOPRO=m
# CONFIG_USB_GSPCA_TOUPTEK is not set
CONFIG_USB_GSPCA_TV8532=m
CONFIG_USB_GSPCA_VC032X=m
CONFIG_USB_GSPCA_VICAM=m
CONFIG_USB_GSPCA_XIRLINK_CIT=m
CONFIG_USB_GSPCA_ZC3XX=m
CONFIG_USB_PWC=m
# CONFIG_USB_PWC_DEBUG is not set
CONFIG_USB_PWC_INPUT_EVDEV=y
# CONFIG_VIDEO_CPIA2 is not set
CONFIG_USB_ZR364XX=m
CONFIG_USB_STKWEBCAM=m
CONFIG_USB_S2255=m
# CONFIG_VIDEO_USBTV is not set
#
# Analog TV USB devices
#
CONFIG_VIDEO_PVRUSB2=m
CONFIG_VIDEO_PVRUSB2_SYSFS=y
CONFIG_VIDEO_PVRUSB2_DVB=y
# CONFIG_VIDEO_PVRUSB2_DEBUGIFC is not set
CONFIG_VIDEO_HDPVR=m
CONFIG_VIDEO_USBVISION=m
# CONFIG_VIDEO_STK1160_COMMON is not set
# CONFIG_VIDEO_GO7007 is not set
#
# Analog/digital TV USB devices
#
CONFIG_VIDEO_AU0828=m
CONFIG_VIDEO_AU0828_V4L2=y
# CONFIG_VIDEO_AU0828_RC is not set
CONFIG_VIDEO_CX231XX=m
CONFIG_VIDEO_CX231XX_RC=y
CONFIG_VIDEO_CX231XX_ALSA=m
CONFIG_VIDEO_CX231XX_DVB=m
CONFIG_VIDEO_TM6000=m
CONFIG_VIDEO_TM6000_ALSA=m
CONFIG_VIDEO_TM6000_DVB=m
#
# Digital TV USB devices
#
CONFIG_DVB_USB=m
# CONFIG_DVB_USB_DEBUG is not set
CONFIG_DVB_USB_A800=m
CONFIG_DVB_USB_DIBUSB_MB=m
# CONFIG_DVB_USB_DIBUSB_MB_FAULTY is not set
CONFIG_DVB_USB_DIBUSB_MC=m
CONFIG_DVB_USB_DIB0700=m
CONFIG_DVB_USB_UMT_010=m
CONFIG_DVB_USB_CXUSB=m
CONFIG_DVB_USB_M920X=m
CONFIG_DVB_USB_DIGITV=m
CONFIG_DVB_USB_VP7045=m
CONFIG_DVB_USB_VP702X=m
CONFIG_DVB_USB_GP8PSK=m
CONFIG_DVB_USB_NOVA_T_USB2=m
CONFIG_DVB_USB_TTUSB2=m
CONFIG_DVB_USB_DTT200U=m
CONFIG_DVB_USB_OPERA1=m
CONFIG_DVB_USB_AF9005=m
CONFIG_DVB_USB_AF9005_REMOTE=m
CONFIG_DVB_USB_PCTV452E=m
CONFIG_DVB_USB_DW2102=m
CONFIG_DVB_USB_CINERGY_T2=m
CONFIG_DVB_USB_DTV5100=m
CONFIG_DVB_USB_FRIIO=m
CONFIG_DVB_USB_AZ6027=m
CONFIG_DVB_USB_TECHNISAT_USB2=m
CONFIG_DVB_USB_V2=m
CONFIG_DVB_USB_AF9015=m
CONFIG_DVB_USB_AF9035=m
CONFIG_DVB_USB_ANYSEE=m
CONFIG_DVB_USB_AU6610=m
CONFIG_DVB_USB_AZ6007=m
CONFIG_DVB_USB_CE6230=m
CONFIG_DVB_USB_EC168=m
CONFIG_DVB_USB_GL861=m
CONFIG_DVB_USB_LME2510=m
CONFIG_DVB_USB_MXL111SF=m
CONFIG_DVB_USB_RTL28XXU=m
# CONFIG_DVB_USB_DVBSKY is not set
CONFIG_DVB_TTUSB_BUDGET=m
CONFIG_DVB_TTUSB_DEC=m
CONFIG_SMS_USB_DRV=m
CONFIG_DVB_B2C2_FLEXCOP_USB=m
# CONFIG_DVB_B2C2_FLEXCOP_USB_DEBUG is not set
# CONFIG_DVB_AS102 is not set
#
# Webcam, TV (analog/digital) USB devices
#
CONFIG_VIDEO_EM28XX=m
# CONFIG_VIDEO_EM28XX_V4L2 is not set
CONFIG_VIDEO_EM28XX_ALSA=m
CONFIG_VIDEO_EM28XX_DVB=m
CONFIG_VIDEO_EM28XX_RC=m
CONFIG_MEDIA_PCI_SUPPORT=y
#
# Media capture support
#
# CONFIG_VIDEO_MEYE is not set
# CONFIG_VIDEO_SOLO6X10 is not set
# CONFIG_VIDEO_TW68 is not set
# CONFIG_VIDEO_TW686X is not set
# CONFIG_VIDEO_ZORAN is not set
#
# Media capture/analog TV support
#
CONFIG_VIDEO_IVTV=m
# CONFIG_VIDEO_IVTV_ALSA is not set
CONFIG_VIDEO_FB_IVTV=m
# CONFIG_VIDEO_HEXIUM_GEMINI is not set
# CONFIG_VIDEO_HEXIUM_ORION is not set
# CONFIG_VIDEO_MXB is not set
# CONFIG_VIDEO_DT3155 is not set
#
# Media capture/analog/hybrid TV support
#
CONFIG_VIDEO_CX18=m
CONFIG_VIDEO_CX18_ALSA=m
CONFIG_VIDEO_CX23885=m
CONFIG_MEDIA_ALTERA_CI=m
# CONFIG_VIDEO_CX25821 is not set
CONFIG_VIDEO_CX88=m
CONFIG_VIDEO_CX88_ALSA=m
CONFIG_VIDEO_CX88_BLACKBIRD=m
CONFIG_VIDEO_CX88_DVB=m
CONFIG_VIDEO_CX88_ENABLE_VP3054=y
CONFIG_VIDEO_CX88_VP3054=m
CONFIG_VIDEO_CX88_MPEG=m
CONFIG_VIDEO_BT848=m
CONFIG_DVB_BT8XX=m
CONFIG_VIDEO_SAA7134=m
CONFIG_VIDEO_SAA7134_ALSA=m
CONFIG_VIDEO_SAA7134_RC=y
CONFIG_VIDEO_SAA7134_DVB=m
CONFIG_VIDEO_SAA7164=m
#
# Media digital TV PCI Adapters
#
CONFIG_DVB_AV7110_IR=y
CONFIG_DVB_AV7110=m
CONFIG_DVB_AV7110_OSD=y
CONFIG_DVB_BUDGET_CORE=m
CONFIG_DVB_BUDGET=m
CONFIG_DVB_BUDGET_CI=m
CONFIG_DVB_BUDGET_AV=m
CONFIG_DVB_BUDGET_PATCH=m
CONFIG_DVB_B2C2_FLEXCOP_PCI=m
# CONFIG_DVB_B2C2_FLEXCOP_PCI_DEBUG is not set
CONFIG_DVB_PLUTO2=m
CONFIG_DVB_DM1105=m
CONFIG_DVB_PT1=m
# CONFIG_DVB_PT3 is not set
CONFIG_MANTIS_CORE=m
CONFIG_DVB_MANTIS=m
CONFIG_DVB_HOPPER=m
CONFIG_DVB_NGENE=m
CONFIG_DVB_DDBRIDGE=m
# CONFIG_DVB_SMIPCIE is not set
# CONFIG_DVB_NETUP_UNIDVB is not set
# CONFIG_V4L_PLATFORM_DRIVERS is not set
# CONFIG_V4L_MEM2MEM_DRIVERS is not set
# CONFIG_V4L_TEST_DRIVERS is not set
# CONFIG_DVB_PLATFORM_DRIVERS is not set
#
# Supported MMC/SDIO adapters
#
CONFIG_SMS_SDIO_DRV=m
CONFIG_RADIO_ADAPTERS=y
CONFIG_RADIO_TEA575X=m
# CONFIG_RADIO_SI470X is not set
# CONFIG_RADIO_SI4713 is not set
# CONFIG_USB_MR800 is not set
# CONFIG_USB_DSBR is not set
# CONFIG_RADIO_MAXIRADIO is not set
# CONFIG_RADIO_SHARK is not set
# CONFIG_RADIO_SHARK2 is not set
# CONFIG_USB_KEENE is not set
# CONFIG_USB_RAREMONO is not set
# CONFIG_USB_MA901 is not set
# CONFIG_RADIO_TEA5764 is not set
# CONFIG_RADIO_SAA7706H is not set
# CONFIG_RADIO_TEF6862 is not set
# CONFIG_RADIO_WL1273 is not set
#
# Texas Instruments WL128x FM driver (ST based)
#
#
# Supported FireWire (IEEE 1394) Adapters
#
CONFIG_DVB_FIREDTV=m
CONFIG_DVB_FIREDTV_INPUT=y
CONFIG_MEDIA_COMMON_OPTIONS=y
#
# common driver options
#
CONFIG_VIDEO_CX2341X=m
CONFIG_VIDEO_TVEEPROM=m
CONFIG_CYPRESS_FIRMWARE=m
CONFIG_DVB_B2C2_FLEXCOP=m
CONFIG_VIDEO_SAA7146=m
CONFIG_VIDEO_SAA7146_VV=m
CONFIG_SMS_SIANO_MDTV=m
CONFIG_SMS_SIANO_RC=y
# CONFIG_SMS_SIANO_DEBUGFS is not set
#
# Media ancillary drivers (tuners, sensors, i2c, frontends)
#
CONFIG_MEDIA_SUBDRV_AUTOSELECT=y
CONFIG_MEDIA_ATTACH=y
CONFIG_VIDEO_IR_I2C=m
#
# Audio decoders, processors and mixers
#
CONFIG_VIDEO_TVAUDIO=m
CONFIG_VIDEO_TDA7432=m
CONFIG_VIDEO_MSP3400=m
CONFIG_VIDEO_CS3308=m
CONFIG_VIDEO_CS5345=m
CONFIG_VIDEO_CS53L32A=m
CONFIG_VIDEO_WM8775=m
CONFIG_VIDEO_WM8739=m
CONFIG_VIDEO_VP27SMPX=m
#
# RDS decoders
#
CONFIG_VIDEO_SAA6588=m
#
# Video decoders
#
CONFIG_VIDEO_SAA711X=m
#
# Video and audio decoders
#
CONFIG_VIDEO_SAA717X=m
CONFIG_VIDEO_CX25840=m
#
# Video encoders
#
CONFIG_VIDEO_SAA7127=m
#
# Camera sensor devices
#
#
# Flash devices
#
#
# Video improvement chips
#
CONFIG_VIDEO_UPD64031A=m
CONFIG_VIDEO_UPD64083=m
#
# Audio/Video compression chips
#
CONFIG_VIDEO_SAA6752HS=m
#
# Miscellaneous helper chips
#
CONFIG_VIDEO_M52790=m
#
# Sensors used on soc_camera driver
#
CONFIG_MEDIA_TUNER=m
CONFIG_MEDIA_TUNER_SIMPLE=m
CONFIG_MEDIA_TUNER_TDA8290=m
CONFIG_MEDIA_TUNER_TDA827X=m
CONFIG_MEDIA_TUNER_TDA18271=m
CONFIG_MEDIA_TUNER_TDA9887=m
CONFIG_MEDIA_TUNER_TEA5761=m
CONFIG_MEDIA_TUNER_TEA5767=m
CONFIG_MEDIA_TUNER_MT20XX=m
CONFIG_MEDIA_TUNER_MT2060=m
CONFIG_MEDIA_TUNER_MT2063=m
CONFIG_MEDIA_TUNER_MT2266=m
CONFIG_MEDIA_TUNER_MT2131=m
CONFIG_MEDIA_TUNER_QT1010=m
CONFIG_MEDIA_TUNER_XC2028=m
CONFIG_MEDIA_TUNER_XC5000=m
CONFIG_MEDIA_TUNER_XC4000=m
CONFIG_MEDIA_TUNER_MXL5005S=m
CONFIG_MEDIA_TUNER_MXL5007T=m
CONFIG_MEDIA_TUNER_MC44S803=m
CONFIG_MEDIA_TUNER_MAX2165=m
CONFIG_MEDIA_TUNER_TDA18218=m
CONFIG_MEDIA_TUNER_FC0011=m
CONFIG_MEDIA_TUNER_FC0012=m
CONFIG_MEDIA_TUNER_FC0013=m
CONFIG_MEDIA_TUNER_TDA18212=m
CONFIG_MEDIA_TUNER_E4000=m
CONFIG_MEDIA_TUNER_FC2580=m
CONFIG_MEDIA_TUNER_M88RS6000T=m
CONFIG_MEDIA_TUNER_TUA9001=m
CONFIG_MEDIA_TUNER_SI2157=m
CONFIG_MEDIA_TUNER_IT913X=m
CONFIG_MEDIA_TUNER_R820T=m
CONFIG_MEDIA_TUNER_QM1D1C0042=m
#
# Multistandard (satellite) frontends
#
CONFIG_DVB_STB0899=m
CONFIG_DVB_STB6100=m
CONFIG_DVB_STV090x=m
CONFIG_DVB_STV6110x=m
CONFIG_DVB_M88DS3103=m
#
# Multistandard (cable + terrestrial) frontends
#
CONFIG_DVB_DRXK=m
CONFIG_DVB_TDA18271C2DD=m
CONFIG_DVB_SI2165=m
#
# DVB-S (satellite) frontends
#
CONFIG_DVB_CX24110=m
CONFIG_DVB_CX24123=m
CONFIG_DVB_MT312=m
CONFIG_DVB_ZL10036=m
CONFIG_DVB_ZL10039=m
CONFIG_DVB_S5H1420=m
CONFIG_DVB_STV0288=m
CONFIG_DVB_STB6000=m
CONFIG_DVB_STV0299=m
CONFIG_DVB_STV6110=m
CONFIG_DVB_STV0900=m
CONFIG_DVB_TDA8083=m
CONFIG_DVB_TDA10086=m
CONFIG_DVB_TDA8261=m
CONFIG_DVB_VES1X93=m
CONFIG_DVB_TUNER_ITD1000=m
CONFIG_DVB_TUNER_CX24113=m
CONFIG_DVB_TDA826X=m
CONFIG_DVB_TUA6100=m
CONFIG_DVB_CX24116=m
CONFIG_DVB_CX24117=m
CONFIG_DVB_CX24120=m
CONFIG_DVB_SI21XX=m
CONFIG_DVB_TS2020=m
CONFIG_DVB_DS3000=m
CONFIG_DVB_MB86A16=m
CONFIG_DVB_TDA10071=m
#
# DVB-T (terrestrial) frontends
#
CONFIG_DVB_SP8870=m
CONFIG_DVB_SP887X=m
CONFIG_DVB_CX22700=m
CONFIG_DVB_CX22702=m
CONFIG_DVB_DRXD=m
CONFIG_DVB_L64781=m
CONFIG_DVB_TDA1004X=m
CONFIG_DVB_NXT6000=m
CONFIG_DVB_MT352=m
CONFIG_DVB_ZL10353=m
CONFIG_DVB_DIB3000MB=m
CONFIG_DVB_DIB3000MC=m
CONFIG_DVB_DIB7000M=m
CONFIG_DVB_DIB7000P=m
CONFIG_DVB_TDA10048=m
CONFIG_DVB_AF9013=m
CONFIG_DVB_EC100=m
CONFIG_DVB_STV0367=m
CONFIG_DVB_CXD2820R=m
CONFIG_DVB_RTL2830=m
CONFIG_DVB_RTL2832=m
CONFIG_DVB_SI2168=m
# CONFIG_DVB_AS102_FE is not set
#
# DVB-C (cable) frontends
#
CONFIG_DVB_VES1820=m
CONFIG_DVB_TDA10021=m
CONFIG_DVB_TDA10023=m
CONFIG_DVB_STV0297=m
#
# ATSC (North American/Korean Terrestrial/Cable DTV) frontends
#
CONFIG_DVB_NXT200X=m
CONFIG_DVB_OR51211=m
CONFIG_DVB_OR51132=m
CONFIG_DVB_BCM3510=m
CONFIG_DVB_LGDT330X=m
CONFIG_DVB_LGDT3305=m
CONFIG_DVB_LGDT3306A=m
CONFIG_DVB_LG2160=m
CONFIG_DVB_S5H1409=m
CONFIG_DVB_AU8522=m
CONFIG_DVB_AU8522_DTV=m
CONFIG_DVB_AU8522_V4L=m
CONFIG_DVB_S5H1411=m
#
# ISDB-T (terrestrial) frontends
#
CONFIG_DVB_S921=m
CONFIG_DVB_DIB8000=m
CONFIG_DVB_MB86A20S=m
#
# ISDB-S (satellite) & ISDB-T (terrestrial) frontends
#
CONFIG_DVB_TC90522=m
#
# Digital terrestrial only tuners/PLL
#
CONFIG_DVB_PLL=m
CONFIG_DVB_TUNER_DIB0070=m
CONFIG_DVB_TUNER_DIB0090=m
#
# SEC control devices for DVB-S
#
CONFIG_DVB_DRX39XYJ=m
CONFIG_DVB_LNBP21=m
CONFIG_DVB_LNBP22=m
CONFIG_DVB_ISL6405=m
CONFIG_DVB_ISL6421=m
CONFIG_DVB_ISL6423=m
CONFIG_DVB_A8293=m
CONFIG_DVB_LGS8GXX=m
CONFIG_DVB_ATBM8830=m
CONFIG_DVB_TDA665x=m
CONFIG_DVB_IX2505V=m
CONFIG_DVB_M88RS2000=m
CONFIG_DVB_AF9033=m
#
# Tools to develop new frontends
#
# CONFIG_DVB_DUMMY_FE is not set
#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_AGP_SIS=y
CONFIG_AGP_VIA=y
CONFIG_INTEL_GTT=y
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=64
CONFIG_VGA_SWITCHEROO=y
CONFIG_DRM=m
CONFIG_DRM_MIPI_DSI=y
# CONFIG_DRM_DP_AUX_CHARDEV is not set
CONFIG_DRM_KMS_HELPER=m
CONFIG_DRM_KMS_FB_HELPER=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_LOAD_EDID_FIRMWARE=y
CONFIG_DRM_TTM=m
#
# I2C encoder or helper chips
#
# CONFIG_DRM_I2C_ADV7511 is not set
CONFIG_DRM_I2C_CH7006=m
CONFIG_DRM_I2C_SIL164=m
CONFIG_DRM_I2C_NXP_TDA998X=m
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set
#
# ACP (Audio CoProcessor) Configuration
#
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I810 is not set
CONFIG_DRM_I915=m
# CONFIG_DRM_I915_PRELIMINARY_HW_SUPPORT is not set
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
# CONFIG_DRM_VGEM is not set
CONFIG_DRM_VMWGFX=m
CONFIG_DRM_VMWGFX_FBCON=y
CONFIG_DRM_GMA500=m
CONFIG_DRM_GMA600=y
CONFIG_DRM_GMA3600=y
CONFIG_DRM_UDL=m
CONFIG_DRM_AST=m
CONFIG_DRM_MGAG200=m
CONFIG_DRM_CIRRUS_QEMU=m
CONFIG_DRM_QXL=m
# CONFIG_DRM_BOCHS is not set
# CONFIG_DRM_VIRTIO_GPU is not set
CONFIG_DRM_PANEL=y
#
# Display Panels
#
CONFIG_DRM_BRIDGE=y
#
# Display Interface Bridges
#
# CONFIG_DRM_ANALOGIX_ANX78XX is not set
#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
# CONFIG_FB_DDC is not set
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
# CONFIG_FB_MODE_HELPERS is not set
CONFIG_FB_TILEBLITTING=y
#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SM501 is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_UDL is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_XEN_FBDEV_FRONTEND is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_FB_AUO_K190X is not set
CONFIG_FB_HYPERV=m
# CONFIG_FB_SIMPLE is not set
# CONFIG_FB_SM712 is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
# CONFIG_LCD_L4F00242T03 is not set
# CONFIG_LCD_LMS283GF05 is not set
# CONFIG_LCD_LTV350QV is not set
# CONFIG_LCD_ILI922X is not set
# CONFIG_LCD_ILI9320 is not set
# CONFIG_LCD_TDO24M is not set
# CONFIG_LCD_VGG2432A4 is not set
CONFIG_LCD_PLATFORM=m
# CONFIG_LCD_S6E63M0 is not set
# CONFIG_LCD_LD9040 is not set
# CONFIG_LCD_AMS369FG06 is not set
# CONFIG_LCD_LMS501KF03 is not set
# CONFIG_LCD_HX8357 is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_GENERIC is not set
# CONFIG_BACKLIGHT_PWM is not set
CONFIG_BACKLIGHT_APPLE=m
# CONFIG_BACKLIGHT_PM8941_WLED is not set
# CONFIG_BACKLIGHT_SAHARA is not set
# CONFIG_BACKLIGHT_ADP8860 is not set
# CONFIG_BACKLIGHT_ADP8870 is not set
# CONFIG_BACKLIGHT_LM3630A is not set
# CONFIG_BACKLIGHT_LM3639 is not set
# CONFIG_BACKLIGHT_LP855X is not set
# CONFIG_BACKLIGHT_GPIO is not set
# CONFIG_BACKLIGHT_LV5207LP is not set
# CONFIG_BACKLIGHT_BD6107 is not set
# CONFIG_VGASTATE is not set
CONFIG_HDMI=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
CONFIG_SOUND=m
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_JACK=y
CONFIG_SND_JACK_INPUT_DEV=y
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
# CONFIG_SND_MIXER_OSS is not set
# CONFIG_SND_PCM_OSS is not set
CONFIG_SND_PCM_TIMER=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_HRTIMER=m
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_MAX_CARDS=32
# CONFIG_SND_SUPPORT_OLD_API is not set
CONFIG_SND_PROC_FS=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
CONFIG_SND_DMA_SGBUF=y
CONFIG_SND_RAWMIDI_SEQ=m
CONFIG_SND_OPL3_LIB_SEQ=m
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
CONFIG_SND_EMU10K1_SEQ=m
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_VX_LIB=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DRIVERS=y
CONFIG_SND_PCSP=m
CONFIG_SND_DUMMY=m
CONFIG_SND_ALOOP=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
# CONFIG_SND_MTS64 is not set
# CONFIG_SND_SERIAL_U16550 is not set
CONFIG_SND_MPU401=m
# CONFIG_SND_PORTMAN2X4 is not set
CONFIG_SND_AC97_POWER_SAVE=y
CONFIG_SND_AC97_POWER_SAVE_DEFAULT=5
CONFIG_SND_PCI=y
CONFIG_SND_AD1889=m
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
CONFIG_SND_ALI5451=m
CONFIG_SND_ASIHPI=m
CONFIG_SND_ATIIXP=m
CONFIG_SND_ATIIXP_MODEM=m
CONFIG_SND_AU8810=m
CONFIG_SND_AU8820=m
CONFIG_SND_AU8830=m
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
CONFIG_SND_BT87X=m
# CONFIG_SND_BT87X_OVERCLOCK is not set
CONFIG_SND_CA0106=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_OXYGEN_LIB=m
CONFIG_SND_OXYGEN=m
# CONFIG_SND_CS4281 is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
CONFIG_SND_CTXFI=m
CONFIG_SND_DARLA20=m
CONFIG_SND_GINA20=m
CONFIG_SND_LAYLA20=m
CONFIG_SND_DARLA24=m
CONFIG_SND_GINA24=m
CONFIG_SND_LAYLA24=m
CONFIG_SND_MONA=m
CONFIG_SND_MIA=m
CONFIG_SND_ECHO3G=m
CONFIG_SND_INDIGO=m
CONFIG_SND_INDIGOIO=m
CONFIG_SND_INDIGODJ=m
CONFIG_SND_INDIGOIOX=m
CONFIG_SND_INDIGODJX=m
CONFIG_SND_EMU10K1=m
CONFIG_SND_EMU10K1X=m
CONFIG_SND_ENS1370=m
CONFIG_SND_ENS1371=m
# CONFIG_SND_ES1938 is not set
CONFIG_SND_ES1968=m
CONFIG_SND_ES1968_INPUT=y
CONFIG_SND_ES1968_RADIO=y
# CONFIG_SND_FM801 is not set
CONFIG_SND_HDSP=m
CONFIG_SND_HDSPM=m
CONFIG_SND_ICE1712=m
CONFIG_SND_ICE1724=m
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
CONFIG_SND_KORG1212=m
CONFIG_SND_LOLA=m
CONFIG_SND_LX6464ES=m
CONFIG_SND_MAESTRO3=m
CONFIG_SND_MAESTRO3_INPUT=y
CONFIG_SND_MIXART=m
# CONFIG_SND_NM256 is not set
CONFIG_SND_PCXHR=m
# CONFIG_SND_RIPTIDE is not set
CONFIG_SND_RME32=m
CONFIG_SND_RME96=m
CONFIG_SND_RME9652=m
# CONFIG_SND_SONICVIBES is not set
CONFIG_SND_TRIDENT=m
CONFIG_SND_VIA82XX=m
CONFIG_SND_VIA82XX_MODEM=m
CONFIG_SND_VIRTUOSO=m
CONFIG_SND_VX222=m
# CONFIG_SND_YMFPCI is not set
#
# HD-Audio
#
CONFIG_SND_HDA=m
CONFIG_SND_HDA_INTEL=m
CONFIG_SND_HDA_HWDEP=y
# CONFIG_SND_HDA_RECONFIG is not set
CONFIG_SND_HDA_INPUT_BEEP=y
CONFIG_SND_HDA_INPUT_BEEP_MODE=0
# CONFIG_SND_HDA_PATCH_LOADER is not set
CONFIG_SND_HDA_CODEC_REALTEK=m
CONFIG_SND_HDA_CODEC_ANALOG=m
CONFIG_SND_HDA_CODEC_SIGMATEL=m
CONFIG_SND_HDA_CODEC_VIA=m
CONFIG_SND_HDA_CODEC_HDMI=m
CONFIG_SND_HDA_CODEC_CIRRUS=m
CONFIG_SND_HDA_CODEC_CONEXANT=m
CONFIG_SND_HDA_CODEC_CA0110=m
CONFIG_SND_HDA_CODEC_CA0132=m
CONFIG_SND_HDA_CODEC_CA0132_DSP=y
CONFIG_SND_HDA_CODEC_CMEDIA=m
CONFIG_SND_HDA_CODEC_SI3054=m
CONFIG_SND_HDA_GENERIC=m
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_SND_HDA_CORE=m
CONFIG_SND_HDA_DSP_LOADER=y
CONFIG_SND_HDA_I915=y
CONFIG_SND_HDA_PREALLOC_SIZE=512
CONFIG_SND_SPI=y
CONFIG_SND_USB=y
CONFIG_SND_USB_AUDIO=m
CONFIG_SND_USB_UA101=m
CONFIG_SND_USB_USX2Y=m
CONFIG_SND_USB_CAIAQ=m
CONFIG_SND_USB_CAIAQ_INPUT=y
CONFIG_SND_USB_US122L=m
CONFIG_SND_USB_6FIRE=m
# CONFIG_SND_USB_HIFACE is not set
# CONFIG_SND_BCD2000 is not set
# CONFIG_SND_USB_POD is not set
# CONFIG_SND_USB_PODHD is not set
# CONFIG_SND_USB_TONEPORT is not set
# CONFIG_SND_USB_VARIAX is not set
CONFIG_SND_FIREWIRE=y
CONFIG_SND_FIREWIRE_LIB=m
# CONFIG_SND_DICE is not set
# CONFIG_SND_OXFW is not set
CONFIG_SND_ISIGHT=m
# CONFIG_SND_FIREWORKS is not set
# CONFIG_SND_BEBOB is not set
# CONFIG_SND_FIREWIRE_DIGI00X is not set
# CONFIG_SND_FIREWIRE_TASCAM is not set
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m
#
# HID support
#
CONFIG_HID=y
CONFIG_HID_BATTERY_STRENGTH=y
CONFIG_HIDRAW=y
CONFIG_UHID=m
CONFIG_HID_GENERIC=y
#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
CONFIG_HID_ACRUX=m
# CONFIG_HID_ACRUX_FF is not set
CONFIG_HID_APPLE=y
CONFIG_HID_APPLEIR=m
# CONFIG_HID_ASUS is not set
CONFIG_HID_AUREAL=m
CONFIG_HID_BELKIN=y
# CONFIG_HID_BETOP_FF is not set
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
# CONFIG_HID_CORSAIR is not set
CONFIG_HID_PRODIKEYS=m
# CONFIG_HID_CMEDIA is not set
# CONFIG_HID_CP2112 is not set
CONFIG_HID_CYPRESS=y
CONFIG_HID_DRAGONRISE=m
# CONFIG_DRAGONRISE_FF is not set
# CONFIG_HID_EMS_FF is not set
CONFIG_HID_ELECOM=m
# CONFIG_HID_ELO is not set
CONFIG_HID_EZKEY=y
# CONFIG_HID_GEMBIRD is not set
# CONFIG_HID_GFRM is not set
CONFIG_HID_HOLTEK=m
# CONFIG_HOLTEK_FF is not set
# CONFIG_HID_GT683R is not set
CONFIG_HID_KEYTOUCH=m
CONFIG_HID_KYE=m
CONFIG_HID_UCLOGIC=m
CONFIG_HID_WALTOP=m
CONFIG_HID_GYRATION=m
CONFIG_HID_ICADE=m
CONFIG_HID_TWINHAN=m
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LCPOWER=m
# CONFIG_HID_LENOVO is not set
CONFIG_HID_LOGITECH=y
CONFIG_HID_LOGITECH_DJ=m
CONFIG_HID_LOGITECH_HIDPP=m
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
# CONFIG_LOGIG940_FF is not set
# CONFIG_LOGIWHEELS_FF is not set
CONFIG_HID_MAGICMOUSE=y
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_MULTITOUCH=m
CONFIG_HID_NTRIG=y
CONFIG_HID_ORTEK=m
CONFIG_HID_PANTHERLORD=m
# CONFIG_PANTHERLORD_FF is not set
# CONFIG_HID_PENMOUNT is not set
CONFIG_HID_PETALYNX=m
CONFIG_HID_PICOLCD=m
CONFIG_HID_PICOLCD_FB=y
CONFIG_HID_PICOLCD_BACKLIGHT=y
CONFIG_HID_PICOLCD_LCD=y
CONFIG_HID_PICOLCD_LEDS=y
CONFIG_HID_PICOLCD_CIR=y
CONFIG_HID_PLANTRONICS=y
CONFIG_HID_PRIMAX=m
CONFIG_HID_ROCCAT=m
CONFIG_HID_SAITEK=m
CONFIG_HID_SAMSUNG=m
CONFIG_HID_SONY=m
# CONFIG_SONY_FF is not set
CONFIG_HID_SPEEDLINK=m
CONFIG_HID_STEELSERIES=m
CONFIG_HID_SUNPLUS=m
# CONFIG_HID_RMI is not set
CONFIG_HID_GREENASIA=m
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_HYPERV_MOUSE=m
CONFIG_HID_SMARTJOYPLUS=m
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TIVO=m
CONFIG_HID_TOPSEED=m
CONFIG_HID_THINGM=m
CONFIG_HID_THRUSTMASTER=m
# CONFIG_THRUSTMASTER_FF is not set
CONFIG_HID_WACOM=m
CONFIG_HID_WIIMOTE=m
# CONFIG_HID_XINMO is not set
CONFIG_HID_ZEROPLUS=m
# CONFIG_ZEROPLUS_FF is not set
CONFIG_HID_ZYDACRON=m
# CONFIG_HID_SENSOR_HUB is not set
#
# USB HID support
#
CONFIG_USB_HID=y
CONFIG_HID_PID=y
CONFIG_USB_HIDDEV=y
#
# I2C HID support
#
CONFIG_I2C_HID=m
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_COMMON=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
#
# Miscellaneous USB options
#
CONFIG_USB_DEFAULT_PERSIST=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
# CONFIG_USB_OTG_WHITELIST is not set
# CONFIG_USB_ULPI_BUS is not set
CONFIG_USB_MON=y
CONFIG_USB_WUSB=m
CONFIG_USB_WUSB_CBAF=m
# CONFIG_USB_WUSB_CBAF_DEBUG is not set
#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PCI=y
CONFIG_USB_XHCI_PLATFORM=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_EHCI_PCI=y
# CONFIG_USB_EHCI_HCD_PLATFORM is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
# CONFIG_USB_FOTG210_HCD is not set
# CONFIG_USB_MAX3421_HCD is not set
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_OHCI_HCD_PCI=y
# CONFIG_USB_OHCI_HCD_PLATFORM is not set
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_U132_HCD is not set
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
CONFIG_USB_HWA_HCD=m
# CONFIG_USB_HCD_BCMA is not set
# CONFIG_USB_HCD_SSB is not set
# CONFIG_USB_HCD_TEST_MODE is not set
#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m
CONFIG_USB_WDM=m
CONFIG_USB_TMC=m
#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#
#
# also be needed; see USB_STORAGE Help for more info
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_REALTEK=m
CONFIG_REALTEK_AUTOPM=y
CONFIG_USB_STORAGE_DATAFAB=m
CONFIG_USB_STORAGE_FREECOM=m
CONFIG_USB_STORAGE_ISD200=m
CONFIG_USB_STORAGE_USBAT=m
CONFIG_USB_STORAGE_SDDR09=m
CONFIG_USB_STORAGE_SDDR55=m
CONFIG_USB_STORAGE_JUMPSHOT=m
CONFIG_USB_STORAGE_ALAUDA=m
CONFIG_USB_STORAGE_ONETOUCH=m
CONFIG_USB_STORAGE_KARMA=m
CONFIG_USB_STORAGE_CYPRESS_ATACB=m
CONFIG_USB_STORAGE_ENE_UB6250=m
# CONFIG_USB_UAS is not set
#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
# CONFIG_USBIP_CORE is not set
# CONFIG_USB_MUSB_HDRC is not set
CONFIG_USB_DWC3=y
# CONFIG_USB_DWC3_HOST is not set
CONFIG_USB_DWC3_GADGET=y
# CONFIG_USB_DWC3_DUAL_ROLE is not set
#
# Platform Glue Driver Support
#
CONFIG_USB_DWC3_PCI=y
# CONFIG_USB_DWC2 is not set
# CONFIG_USB_CHIPIDEA is not set
# CONFIG_USB_ISP1760 is not set
#
# USB port drivers
#
CONFIG_USB_USS720=m
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_SIMPLE is not set
CONFIG_USB_SERIAL_AIRCABLE=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
CONFIG_USB_SERIAL_CH341=m
CONFIG_USB_SERIAL_WHITEHEAT=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP210X=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
# CONFIG_USB_SERIAL_F81232 is not set
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_IUU=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
# CONFIG_USB_SERIAL_METRO is not set
CONFIG_USB_SERIAL_MOS7720=m
CONFIG_USB_SERIAL_MOS7715_PARPORT=y
CONFIG_USB_SERIAL_MOS7840=m
# CONFIG_USB_SERIAL_MXUPORT is not set
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
CONFIG_USB_SERIAL_OTI6858=m
CONFIG_USB_SERIAL_QCAUX=m
CONFIG_USB_SERIAL_QUALCOMM=m
CONFIG_USB_SERIAL_SPCP8X5=m
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_SYMBOL=m
# CONFIG_USB_SERIAL_TI is not set
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_WWAN=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_SERIAL_OPTICON=m
CONFIG_USB_SERIAL_XSENS_MT=m
# CONFIG_USB_SERIAL_WISHBONE is not set
CONFIG_USB_SERIAL_SSU100=m
CONFIG_USB_SERIAL_QT2=m
CONFIG_USB_SERIAL_DEBUG=m
#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
CONFIG_USB_EMI26=m
CONFIG_USB_ADUTUX=m
CONFIG_USB_SEVSEG=m
# CONFIG_USB_RIO500 is not set
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
CONFIG_USB_LED=m
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
CONFIG_USB_IDMOUSE=m
CONFIG_USB_FTDI_ELAN=m
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
# CONFIG_USB_TRANCEVIBRATOR is not set
CONFIG_USB_IOWARRIOR=m
# CONFIG_USB_TEST is not set
# CONFIG_USB_EHSET_TEST_FIXTURE is not set
CONFIG_USB_ISIGHTFW=m
# CONFIG_USB_YUREX is not set
CONFIG_USB_EZUSB_FX2=m
CONFIG_USB_HSIC_USB3503=m
# CONFIG_USB_LINK_LAYER_TEST is not set
# CONFIG_USB_CHAOSKEY is not set
# CONFIG_UCSI is not set
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
CONFIG_USB_CXACRU=m
CONFIG_USB_UEAGLEATM=m
CONFIG_USB_XUSBATM=m
#
# USB Physical Layer drivers
#
CONFIG_USB_PHY=y
CONFIG_NOP_USB_XCEIV=y
# CONFIG_USB_GPIO_VBUS is not set
# CONFIG_USB_ISP1301 is not set
CONFIG_USB_GADGET=y
# CONFIG_USB_GADGET_DEBUG is not set
# CONFIG_USB_GADGET_DEBUG_FILES is not set
# CONFIG_USB_GADGET_DEBUG_FS is not set
CONFIG_USB_GADGET_VBUS_DRAW=2
CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2
#
# USB Peripheral Controller
#
# CONFIG_USB_FOTG210_UDC is not set
# CONFIG_USB_GR_UDC is not set
# CONFIG_USB_R8A66597 is not set
# CONFIG_USB_PXA27X is not set
# CONFIG_USB_MV_UDC is not set
# CONFIG_USB_MV_U3D is not set
# CONFIG_USB_M66592 is not set
# CONFIG_USB_BDC_UDC is not set
# CONFIG_USB_AMD5536UDC is not set
# CONFIG_USB_NET2272 is not set
# CONFIG_USB_NET2280 is not set
# CONFIG_USB_GOKU is not set
# CONFIG_USB_EG20T is not set
# CONFIG_USB_DUMMY_HCD is not set
CONFIG_USB_LIBCOMPOSITE=m
CONFIG_USB_F_MASS_STORAGE=m
# CONFIG_USB_CONFIGFS is not set
# CONFIG_USB_ZERO is not set
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_ETH is not set
# CONFIG_USB_G_NCM is not set
# CONFIG_USB_GADGETFS is not set
# CONFIG_USB_FUNCTIONFS is not set
CONFIG_USB_MASS_STORAGE=m
# CONFIG_USB_GADGET_TARGET is not set
# CONFIG_USB_G_SERIAL is not set
# CONFIG_USB_MIDI_GADGET is not set
# CONFIG_USB_G_PRINTER is not set
# CONFIG_USB_CDC_COMPOSITE is not set
# CONFIG_USB_G_ACM_MS is not set
# CONFIG_USB_G_MULTI is not set
# CONFIG_USB_G_HID is not set
# CONFIG_USB_G_DBGP is not set
# CONFIG_USB_G_WEBCAM is not set
# CONFIG_USB_LED_TRIG is not set
CONFIG_UWB=m
CONFIG_UWB_HWA=m
CONFIG_UWB_WHCI=m
CONFIG_UWB_I1480U=m
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
#
# MMC/SD/SDIO Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_BLOCK_BOUNCE=y
CONFIG_SDIO_UART=m
# CONFIG_MMC_TEST is not set
#
# MMC/SD/SDIO Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
CONFIG_MMC_SDHCI_PCI=m
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=m
CONFIG_MMC_SDHCI_PLTFM=m
# CONFIG_MMC_WBSD is not set
CONFIG_MMC_TIFM_SD=m
# CONFIG_MMC_SPI is not set
CONFIG_MMC_CB710=m
CONFIG_MMC_VIA_SDMMC=m
CONFIG_MMC_VUB300=m
CONFIG_MMC_USHC=m
# CONFIG_MMC_USDHI6ROL0 is not set
CONFIG_MMC_REALTEK_PCI=m
# CONFIG_MMC_TOSHIBA_PCI is not set
# CONFIG_MMC_MTK is not set
CONFIG_MEMSTICK=m
# CONFIG_MEMSTICK_DEBUG is not set
#
# MemoryStick drivers
#
# CONFIG_MEMSTICK_UNSAFE_RESUME is not set
CONFIG_MSPRO_BLOCK=m
# CONFIG_MS_BLOCK is not set
#
# MemoryStick Host Controller Drivers
#
CONFIG_MEMSTICK_TIFM_MS=m
CONFIG_MEMSTICK_JMICRON_38X=m
CONFIG_MEMSTICK_R592=m
CONFIG_MEMSTICK_REALTEK_PCI=m
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
# CONFIG_LEDS_CLASS_FLASH is not set
#
# LED drivers
#
CONFIG_LEDS_LM3530=m
# CONFIG_LEDS_LM3642 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_GPIO is not set
CONFIG_LEDS_LP3944=m
CONFIG_LEDS_LP55XX_COMMON=m
CONFIG_LEDS_LP5521=m
CONFIG_LEDS_LP5523=m
CONFIG_LEDS_LP5562=m
# CONFIG_LEDS_LP8501 is not set
# CONFIG_LEDS_LP8860 is not set
CONFIG_LEDS_CLEVO_MAIL=m
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_PCA963X is not set
# CONFIG_LEDS_DAC124S085 is not set
# CONFIG_LEDS_PWM is not set
# CONFIG_LEDS_BD2802 is not set
CONFIG_LEDS_INTEL_SS4200=m
# CONFIG_LEDS_LT3593 is not set
# CONFIG_LEDS_TCA6507 is not set
# CONFIG_LEDS_TLC591XX is not set
# CONFIG_LEDS_LM355x is not set
#
# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM)
#
CONFIG_LEDS_BLINKM=m
#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=m
CONFIG_LEDS_TRIGGER_ONESHOT=m
# CONFIG_LEDS_TRIGGER_MTD is not set
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
CONFIG_LEDS_TRIGGER_BACKLIGHT=m
# CONFIG_LEDS_TRIGGER_CPU is not set
# CONFIG_LEDS_TRIGGER_GPIO is not set
CONFIG_LEDS_TRIGGER_DEFAULT_ON=m
#
# iptables trigger is under Netfilter config (LED target)
#
CONFIG_LEDS_TRIGGER_TRANSIENT=m
CONFIG_LEDS_TRIGGER_CAMERA=m
# CONFIG_LEDS_TRIGGER_PANIC is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
# CONFIG_EDAC_DEBUG is not set
CONFIG_EDAC_DECODE_MCE=m
CONFIG_EDAC_MM_EDAC=m
CONFIG_EDAC_AMD64=m
# CONFIG_EDAC_AMD64_ERROR_INJECTION is not set
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I3000=m
CONFIG_EDAC_I3200=m
# CONFIG_EDAC_IE31200 is not set
CONFIG_EDAC_X38=m
CONFIG_EDAC_I5400=m
CONFIG_EDAC_I7CORE=m
CONFIG_EDAC_I5000=m
CONFIG_EDAC_I5100=m
CONFIG_EDAC_I7300=m
CONFIG_EDAC_SBRIDGE=m
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_SYSTOHC is not set
# CONFIG_RTC_DEBUG is not set
#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set
#
# I2C RTC drivers
#
# CONFIG_RTC_DRV_ABB5ZES3 is not set
# CONFIG_RTC_DRV_ABX80X is not set
CONFIG_RTC_DRV_DS1307=m
CONFIG_RTC_DRV_DS1307_HWMON=y
CONFIG_RTC_DRV_DS1374=m
# CONFIG_RTC_DRV_DS1374_WDT is not set
CONFIG_RTC_DRV_DS1672=m
CONFIG_RTC_DRV_MAX6900=m
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_ISL12022=m
# CONFIG_RTC_DRV_ISL12057 is not set
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8523=m
# CONFIG_RTC_DRV_PCF85063 is not set
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
CONFIG_RTC_DRV_M41T80=m
CONFIG_RTC_DRV_M41T80_WDT=y
CONFIG_RTC_DRV_BQ32K=m
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=m
# CONFIG_RTC_DRV_RX8010 is not set
CONFIG_RTC_DRV_RX8581=m
CONFIG_RTC_DRV_RX8025=m
CONFIG_RTC_DRV_EM3027=m
# CONFIG_RTC_DRV_RV8803 is not set
#
# SPI RTC drivers
#
# CONFIG_RTC_DRV_M41T93 is not set
# CONFIG_RTC_DRV_M41T94 is not set
# CONFIG_RTC_DRV_DS1302 is not set
# CONFIG_RTC_DRV_DS1305 is not set
# CONFIG_RTC_DRV_DS1343 is not set
# CONFIG_RTC_DRV_DS1347 is not set
# CONFIG_RTC_DRV_DS1390 is not set
# CONFIG_RTC_DRV_R9701 is not set
# CONFIG_RTC_DRV_RX4581 is not set
# CONFIG_RTC_DRV_RX6110 is not set
# CONFIG_RTC_DRV_RS5C348 is not set
# CONFIG_RTC_DRV_MAX6902 is not set
# CONFIG_RTC_DRV_PCF2123 is not set
# CONFIG_RTC_DRV_MCP795 is not set
CONFIG_RTC_I2C_AND_SPI=y
#
# SPI and I2C RTC drivers
#
CONFIG_RTC_DRV_DS3232=m
# CONFIG_RTC_DRV_PCF2127 is not set
CONFIG_RTC_DRV_RV3029C2=m
CONFIG_RTC_DRV_RV3029_HWMON=y
#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
CONFIG_RTC_DRV_DS1286=m
CONFIG_RTC_DRV_DS1511=m
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_DS1685_FAMILY is not set
CONFIG_RTC_DRV_DS1742=m
CONFIG_RTC_DRV_DS2404=m
CONFIG_RTC_DRV_STK17TA8=m
# CONFIG_RTC_DRV_M48T86 is not set
CONFIG_RTC_DRV_M48T35=m
CONFIG_RTC_DRV_M48T59=m
CONFIG_RTC_DRV_MSM6242=m
CONFIG_RTC_DRV_BQ4802=m
CONFIG_RTC_DRV_RP5C01=m
CONFIG_RTC_DRV_V3020=m
#
# on-CPU RTC drivers
#
#
# HID Sensor RTC drivers
#
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_DMADEVICES=y
# CONFIG_DMADEVICES_DEBUG is not set
#
# DMA Devices
#
CONFIG_DMA_ENGINE=y
CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y
# CONFIG_INTEL_IDMA64 is not set
# CONFIG_INTEL_IOATDMA is not set
# CONFIG_QCOM_HIDMA_MGMT is not set
# CONFIG_QCOM_HIDMA is not set
CONFIG_DW_DMAC_CORE=m
CONFIG_DW_DMAC=m
CONFIG_DW_DMAC_PCI=m
CONFIG_HSU_DMA=y
#
# DMA Clients
#
CONFIG_ASYNC_TX_DMA=y
CONFIG_DMATEST=m
#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
CONFIG_AUXDISPLAY=y
CONFIG_KS0108=m
CONFIG_KS0108_PORT=0x378
CONFIG_KS0108_DELAY=2
CONFIG_CFAG12864B=m
CONFIG_CFAG12864B_RATE=20
CONFIG_UIO=m
CONFIG_UIO_CIF=m
CONFIG_UIO_PDRV_GENIRQ=m
# CONFIG_UIO_DMEM_GENIRQ is not set
CONFIG_UIO_AEC=m
CONFIG_UIO_SERCOS3=m
CONFIG_UIO_PCI_GENERIC=m
# CONFIG_UIO_NETX is not set
# CONFIG_UIO_PRUSS is not set
# CONFIG_UIO_MF624 is not set
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
# CONFIG_VFIO_NOIOMMU is not set
CONFIG_VFIO_PCI=m
# CONFIG_VFIO_PCI_VGA is not set
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
CONFIG_IRQ_BYPASS_MANAGER=m
# CONFIG_VIRT_DRIVERS is not set
CONFIG_VIRTIO=y
#
# Virtio drivers
#
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_BALLOON=y
# CONFIG_VIRTIO_INPUT is not set
# CONFIG_VIRTIO_MMIO is not set
#
# Microsoft Hyper-V guest support
#
CONFIG_HYPERV=m
CONFIG_HYPERV_UTILS=m
CONFIG_HYPERV_BALLOON=m
#
# Xen driver support
#
CONFIG_XEN_BALLOON=y
# CONFIG_XEN_SELFBALLOONING is not set
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=m
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=m
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
# CONFIG_XEN_GNTDEV is not set
# CONFIG_XEN_GRANT_DEV_ALLOC is not set
CONFIG_SWIOTLB_XEN=y
CONFIG_XEN_TMEM=m
CONFIG_XEN_PCIDEV_BACKEND=m
# CONFIG_XEN_SCSI_BACKEND is not set
CONFIG_XEN_PRIVCMD=m
CONFIG_XEN_ACPI_PROCESSOR=m
# CONFIG_XEN_MCE_LOG is not set
CONFIG_XEN_HAVE_PVMMU=y
CONFIG_XEN_EFI=y
CONFIG_XEN_AUTO_XLATE=y
CONFIG_XEN_ACPI=y
CONFIG_XEN_SYMS=y
CONFIG_XEN_HAVE_VPMU=y
CONFIG_STAGING=y
# CONFIG_SLICOSS is not set
# CONFIG_PRISM2_USB is not set
# CONFIG_COMEDI is not set
# CONFIG_RTL8192U is not set
CONFIG_RTLLIB=m
CONFIG_RTLLIB_CRYPTO_CCMP=m
CONFIG_RTLLIB_CRYPTO_TKIP=m
CONFIG_RTLLIB_CRYPTO_WEP=m
CONFIG_RTL8192E=m
CONFIG_R8712U=m
# CONFIG_R8188EU is not set
# CONFIG_R8723AU is not set
# CONFIG_RTS5208 is not set
# CONFIG_VT6655 is not set
# CONFIG_VT6656 is not set
# CONFIG_FB_SM750 is not set
# CONFIG_FB_XGI is not set
#
# Speakup console speech
#
# CONFIG_SPEAKUP is not set
# CONFIG_STAGING_MEDIA is not set
#
# Android
#
# CONFIG_LTE_GDM724X is not set
CONFIG_FIREWIRE_SERIAL=m
CONFIG_FWTTY_MAX_TOTAL_PORTS=64
CONFIG_FWTTY_MAX_CARD_PORTS=32
# CONFIG_LNET is not set
# CONFIG_DGNC is not set
# CONFIG_GS_FPGABOOT is not set
# CONFIG_CRYPTO_SKEIN is not set
# CONFIG_UNISYSSPAR is not set
# CONFIG_FB_TFT is not set
# CONFIG_WILC1000_SDIO is not set
# CONFIG_WILC1000_SPI is not set
# CONFIG_MOST is not set
#
# Old ISDN4Linux (deprecated)
#
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=m
CONFIG_ACERHDF=m
# CONFIG_ALIENWARE_WMI is not set
CONFIG_ASUS_LAPTOP=m
# CONFIG_DELL_SMBIOS is not set
CONFIG_DELL_WMI_AIO=m
# CONFIG_DELL_SMO8800 is not set
# CONFIG_DELL_RBTN is not set
CONFIG_FUJITSU_LAPTOP=m
# CONFIG_FUJITSU_LAPTOP_DEBUG is not set
CONFIG_FUJITSU_TABLET=m
CONFIG_AMILO_RFKILL=m
CONFIG_HP_ACCEL=m
# CONFIG_HP_WIRELESS is not set
CONFIG_HP_WMI=m
CONFIG_MSI_LAPTOP=m
CONFIG_PANASONIC_LAPTOP=m
CONFIG_COMPAL_LAPTOP=m
CONFIG_SONY_LAPTOP=m
CONFIG_SONYPI_COMPAT=y
CONFIG_IDEAPAD_LAPTOP=m
CONFIG_THINKPAD_ACPI=m
CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y
# CONFIG_THINKPAD_ACPI_DEBUGFACILITIES is not set
# CONFIG_THINKPAD_ACPI_DEBUG is not set
# CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set
CONFIG_THINKPAD_ACPI_VIDEO=y
CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y
CONFIG_SENSORS_HDAPS=m
# CONFIG_INTEL_MENLOW is not set
CONFIG_EEEPC_LAPTOP=m
CONFIG_ASUS_WMI=m
CONFIG_ASUS_NB_WMI=m
CONFIG_EEEPC_WMI=m
# CONFIG_ASUS_WIRELESS is not set
CONFIG_ACPI_WMI=m
CONFIG_MSI_WMI=m
CONFIG_TOPSTAR_LAPTOP=m
CONFIG_ACPI_TOSHIBA=m
CONFIG_TOSHIBA_BT_RFKILL=m
# CONFIG_TOSHIBA_HAPS is not set
# CONFIG_TOSHIBA_WMI is not set
CONFIG_ACPI_CMPC=m
# CONFIG_INTEL_HID_EVENT is not set
CONFIG_INTEL_IPS=m
# CONFIG_INTEL_PMC_CORE is not set
# CONFIG_IBM_RTL is not set
CONFIG_SAMSUNG_LAPTOP=m
CONFIG_MXM_WMI=m
CONFIG_INTEL_OAKTRAIL=m
CONFIG_SAMSUNG_Q10=m
CONFIG_APPLE_GMUX=m
# CONFIG_INTEL_RST is not set
# CONFIG_INTEL_SMARTCONNECT is not set
CONFIG_PVPANIC=y
# CONFIG_INTEL_PMC_IPC is not set
# CONFIG_SURFACE_PRO3_BUTTON is not set
# CONFIG_INTEL_PUNIT_IPC is not set
# CONFIG_CHROME_PLATFORMS is not set
CONFIG_CLKDEV_LOOKUP=y
CONFIG_HAVE_CLK_PREPARE=y
CONFIG_COMMON_CLK=y
#
# Common Clock Framework
#
# CONFIG_COMMON_CLK_SI5351 is not set
# CONFIG_COMMON_CLK_CDCE706 is not set
# CONFIG_COMMON_CLK_CS2000_CP is not set
# CONFIG_COMMON_CLK_NXP is not set
# CONFIG_COMMON_CLK_PWM is not set
# CONFIG_COMMON_CLK_PXA is not set
# CONFIG_COMMON_CLK_PIC32 is not set
# CONFIG_COMMON_CLK_OXNAS is not set
#
# Hardware Spinlock drivers
#
#
# Clock Source drivers
#
CONFIG_CLKEVT_I8253=y
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_ATMEL_PIT is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
# CONFIG_MAILBOX is not set
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
#
# Generic IOMMU Pagetable Support
#
CONFIG_IOMMU_IOVA=y
CONFIG_AMD_IOMMU=y
CONFIG_AMD_IOMMU_V2=m
CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_IRQ_REMAP=y
#
# Remoteproc drivers
#
# CONFIG_STE_MODEM_RPROC is not set
#
# Rpmsg drivers
#
#
# SOC (System On Chip) specific Drivers
#
# CONFIG_SUNXI_SRAM is not set
# CONFIG_SOC_TI is not set
CONFIG_PM_DEVFREQ=y
#
# DEVFREQ Governors
#
CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m
# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set
# CONFIG_DEVFREQ_GOV_POWERSAVE is not set
# CONFIG_DEVFREQ_GOV_USERSPACE is not set
# CONFIG_DEVFREQ_GOV_PASSIVE is not set
#
# DEVFREQ Drivers
#
# CONFIG_PM_DEVFREQ_EVENT is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
CONFIG_NTB=m
# CONFIG_NTB_AMD is not set
# CONFIG_NTB_INTEL is not set
# CONFIG_NTB_PINGPONG is not set
# CONFIG_NTB_TOOL is not set
# CONFIG_NTB_PERF is not set
# CONFIG_NTB_TRANSPORT is not set
# CONFIG_VME_BUS is not set
CONFIG_PWM=y
CONFIG_PWM_SYSFS=y
# CONFIG_PWM_LPSS_PCI is not set
# CONFIG_PWM_LPSS_PLATFORM is not set
# CONFIG_PWM_PCA9685 is not set
CONFIG_ARM_GIC_MAX_NR=1
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
# CONFIG_FMC is not set
#
# PHY Subsystem
#
CONFIG_GENERIC_PHY=y
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# CONFIG_BCM_KONA_USB2_PHY is not set
CONFIG_POWERCAP=y
CONFIG_INTEL_RAPL=y
# CONFIG_MCB is not set
#
# Performance monitor support
#
CONFIG_RAS=y
# CONFIG_MCE_AMD_INJ is not set
# CONFIG_THUNDERBOLT is not set
#
# Android
#
# CONFIG_ANDROID is not set
CONFIG_LIBNVDIMM=y
CONFIG_BLK_DEV_PMEM=m
CONFIG_ND_BLK=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_BTT=y
# CONFIG_DEV_DAX is not set
CONFIG_NVMEM=m
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
#
# FPGA Configuration Support
#
# CONFIG_FPGA is not set
#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_EDD_OFF is not set
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y
CONFIG_DMI_SYSFS=y
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
CONFIG_ISCSI_IBFT_FIND=y
CONFIG_ISCSI_IBFT=m
# CONFIG_FW_CFG_SYSFS is not set
# CONFIG_GOOGLE_FIRMWARE is not set
#
# EFI (Extensible Firmware Interface) Support
#
CONFIG_EFI_VARS=y
CONFIG_EFI_ESRT=y
CONFIG_EFI_VARS_PSTORE=y
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
# CONFIG_EFI_BOOTLOADER_CONTROL is not set
# CONFIG_EFI_CAPSULE_LOADER is not set
CONFIG_UEFI_CPER=y
#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
CONFIG_FS_IOMAP=y
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_ENCRYPTION is not set
# CONFIG_EXT4_DEBUG is not set
CONFIG_JBD2=y
# CONFIG_JBD2_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_WARN is not set
# CONFIG_XFS_DEBUG is not set
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=y
# CONFIG_OCFS2_FS is not set
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
# CONFIG_NILFS2_FS is not set
CONFIG_F2FS_FS=m
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
# CONFIG_F2FS_FS_SECURITY is not set
# CONFIG_F2FS_CHECK_FS is not set
# CONFIG_F2FS_FS_ENCRYPTION is not set
# CONFIG_F2FS_IO_TRACE is not set
# CONFIG_F2FS_FAULT_INJECTION is not set
# CONFIG_FS_DAX is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_FILE_LOCKING=y
CONFIG_MANDATORY_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QUOTA_DEBUG is not set
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_QUOTACTL_COMPAT=y
CONFIG_AUTOFS4_FS=y
CONFIG_FUSE_FS=m
CONFIG_CUSE=m
CONFIG_OVERLAY_FS=m
#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
# CONFIG_FSCACHE_HISTOGRAM is not set
# CONFIG_FSCACHE_DEBUG is not set
# CONFIG_FSCACHE_OBJECT_LIST is not set
CONFIG_CACHEFILES=m
# CONFIG_CACHEFILES_DEBUG is not set
# CONFIG_CACHEFILES_HISTOGRAM is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
# CONFIG_FAT_DEFAULT_UTF8 is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ORANGEFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_JFFS2_FS is not set
# CONFIG_UBIFS_FS is not set
# CONFIG_LOGFS is not set
CONFIG_CRAMFS=m
CONFIG_SQUASHFS=m
CONFIG_SQUASHFS_FILE_CACHE=y
# CONFIG_SQUASHFS_FILE_DIRECT is not set
CONFIG_SQUASHFS_DECOMP_SINGLE=y
# CONFIG_SQUASHFS_DECOMP_MULTI is not set
# CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU is not set
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_ZLIB=y
# CONFIG_SQUASHFS_LZ4 is not set
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set
# CONFIG_SQUASHFS_EMBEDDED is not set
CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_QNX6FS_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_PSTORE=y
# CONFIG_PSTORE_CONSOLE is not set
# CONFIG_PSTORE_PMSG is not set
# CONFIG_PSTORE_FTRACE is not set
CONFIG_PSTORE_RAM=m
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_EXOFS_FS is not set
CONFIG_ORE=m
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
# CONFIG_NFS_V2 is not set
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_OBJLAYOUT=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
CONFIG_ROOT_NFS=y
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DEBUG=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
# CONFIG_NFSD_BLOCKLAYOUT is not set
# CONFIG_NFSD_SCSILAYOUT is not set
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_GRACE_PERIOD=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_SUNRPC_DEBUG=y
# CONFIG_CEPH_FS is not set
CONFIG_CIFS=m
CONFIG_CIFS_STATS=y
# CONFIG_CIFS_STATS2 is not set
CONFIG_CIFS_WEAK_PW_HASH=y
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
CONFIG_CIFS_ACL=y
CONFIG_CIFS_DEBUG=y
# CONFIG_CIFS_DEBUG2 is not set
CONFIG_CIFS_DFS_UPCALL=y
CONFIG_CIFS_SMB2=y
# CONFIG_CIFS_SMB311 is not set
# CONFIG_CIFS_FSCACHE is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=y
CONFIG_9P_FS_POSIX_ACL=y
# CONFIG_9P_FS_SECURITY is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_MAC_ROMAN=m
CONFIG_NLS_MAC_CELTIC=m
CONFIG_NLS_MAC_CENTEURO=m
CONFIG_NLS_MAC_CROATIAN=m
CONFIG_NLS_MAC_CYRILLIC=m
CONFIG_NLS_MAC_GAELIC=m
CONFIG_NLS_MAC_GREEK=m
CONFIG_NLS_MAC_ICELAND=m
CONFIG_NLS_MAC_INUIT=m
CONFIG_NLS_MAC_ROMANIAN=m
CONFIG_NLS_MAC_TURKISH=m
CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_DLM_DEBUG=y
#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_DYNAMIC_DEBUG=y
#
# Compile-time checks and compiler options
#
# CONFIG_DEBUG_INFO is not set
# CONFIG_ENABLE_WARN_DEPRECATED is not set
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_STRIP_ASM_SYMS=y
# CONFIG_READABLE_ASM is not set
# CONFIG_UNUSED_SYMBOLS is not set
# CONFIG_PAGE_OWNER is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_SECTION_MISMATCH=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_STACK_VALIDATION is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y
#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_PAGE_REF is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
# CONFIG_DEBUG_PER_CPU_MAPS is not set
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_HAVE_ARCH_KASAN=y
# CONFIG_KASAN is not set
CONFIG_ARCH_HAS_KCOV=y
# CONFIG_KCOV is not set
CONFIG_DEBUG_SHIRQ=y
#
# Debug Lockups and Hangs
#
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
# CONFIG_DETECT_HUNG_TASK is not set
# CONFIG_WQ_WATCHDOG is not set
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHED_INFO=y
CONFIG_SCHEDSTATS=y
# CONFIG_SCHED_STACK_END_CHECK is not set
# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_TIMER_STATS=y
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_ATOMIC_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_LOCK_TORTURE_TEST=m
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set
#
# RCU Debugging
#
# CONFIG_PROVE_RCU is not set
CONFIG_SPARSE_RCU_POINTER=y
CONFIG_TORTURE_TEST=m
# CONFIG_RCU_PERF_TEST is not set
CONFIG_RCU_TORTURE_TEST=m
# CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT is not set
# CONFIG_RCU_TORTURE_TEST_SLOW_INIT is not set
# CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
CONFIG_NOTIFIER_ERROR_INJECTION=m
# CONFIG_CPU_NOTIFIER_ERROR_INJECT is not set
CONFIG_PM_NOTIFIER_ERROR_INJECT=m
# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_FENTRY=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_TRACE_CLOCK=y
CONFIG_RING_BUFFER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_RING_BUFFER_ALLOW_SWAP=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
CONFIG_SCHED_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
CONFIG_STACK_TRACER=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENT=y
CONFIG_UPROBE_EVENT=y
CONFIG_PROBE_EVENTS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_HIST_TRIGGERS is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
CONFIG_RING_BUFFER_BENCHMARK=m
# CONFIG_RING_BUFFER_STARTUP_TEST is not set
# CONFIG_TRACE_ENUM_MAP_FILE is not set
CONFIG_TRACING_EVENTS_GPIO=y
#
# Runtime Testing
#
CONFIG_LKDTM=m
# CONFIG_TEST_LIST_SORT is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
CONFIG_RBTREE_TEST=m
CONFIG_INTERVAL_TREE_TEST=m
CONFIG_PERCPU_TEST=m
CONFIG_ATOMIC64_SELFTEST=y
CONFIG_ASYNC_RAID6_TEST=m
# CONFIG_TEST_HEXDUMP is not set
# CONFIG_TEST_STRING_HELPERS is not set
CONFIG_TEST_KSTRTOX=m
# CONFIG_TEST_PRINTF is not set
# CONFIG_TEST_BITMAP is not set
# CONFIG_TEST_RHASHTABLE is not set
# CONFIG_TEST_HASH is not set
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
CONFIG_BUILD_DOCSRC=y
# CONFIG_DMA_API_DEBUG is not set
CONFIG_TEST_LKM=m
CONFIG_TEST_USER_COPY=m
CONFIG_TEST_BPF=m
CONFIG_TEST_FIRMWARE=m
CONFIG_TEST_UDELAY=m
# CONFIG_MEMTEST is not set
# CONFIG_TEST_STATIC_KEYS is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
CONFIG_STRICT_DEVMEM=y
# CONFIG_IO_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
# CONFIG_EARLY_PRINTK_EFI is not set
# CONFIG_X86_PTDUMP_CORE is not set
# CONFIG_X86_PTDUMP is not set
# CONFIG_EFI_PGT_DUMP is not set
CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_WX is not set
CONFIG_DEBUG_SET_MODULE_RONX=y
CONFIG_DEBUG_NX_TEST=m
CONFIG_DOUBLEFAULT=y
# CONFIG_DEBUG_TLBFLUSH is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_X86_DECODER_SELFTEST=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_DEBUG_BOOT_PARAMS=y
# CONFIG_CPA_DEBUG is not set
CONFIG_OPTIMIZE_INLINING=y
# CONFIG_DEBUG_ENTRY is not set
# CONFIG_DEBUG_NMI_SELFTEST is not set
CONFIG_X86_DEBUG_FPU=y
# CONFIG_PUNIT_ATOM_DEBUG is not set
#
# Security options
#
CONFIG_KEYS=y
CONFIG_PERSISTENT_KEYRINGS=y
CONFIG_BIG_KEYS=y
CONFIG_TRUSTED_KEYS=y
CONFIG_ENCRYPTED_KEYS=y
# CONFIG_KEY_DH_OPERATIONS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
CONFIG_SECURITY=y
CONFIG_SECURITYFS=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
# CONFIG_SECURITY_PATH is not set
CONFIG_INTEL_TXT=y
CONFIG_LSM_MMAP_MIN_ADDR=65535
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_SECURITY_APPARMOR is not set
# CONFIG_SECURITY_LOADPIN is not set
# CONFIG_SECURITY_YAMA is not set
CONFIG_INTEGRITY=y
CONFIG_INTEGRITY_SIGNATURE=y
CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
CONFIG_INTEGRITY_TRUSTED_KEYRING=y
CONFIG_INTEGRITY_AUDIT=y
CONFIG_IMA=y
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_IMA_LSM_RULES=y
# CONFIG_IMA_TEMPLATE is not set
CONFIG_IMA_NG_TEMPLATE=y
# CONFIG_IMA_SIG_TEMPLATE is not set
CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
CONFIG_IMA_DEFAULT_HASH_SHA1=y
# CONFIG_IMA_DEFAULT_HASH_SHA256 is not set
# CONFIG_IMA_DEFAULT_HASH_SHA512 is not set
# CONFIG_IMA_DEFAULT_HASH_WP512 is not set
CONFIG_IMA_DEFAULT_HASH="sha1"
# CONFIG_IMA_WRITE_POLICY is not set
# CONFIG_IMA_READ_POLICY is not set
CONFIG_IMA_APPRAISE=y
CONFIG_IMA_TRUSTED_KEYRING=y
# CONFIG_IMA_BLACKLIST_KEYRING is not set
# CONFIG_IMA_LOAD_X509 is not set
CONFIG_EVM=y
CONFIG_EVM_ATTR_FSUUID=y
# CONFIG_EVM_LOAD_X509 is not set
CONFIG_DEFAULT_SECURITY_SELINUX=y
# CONFIG_DEFAULT_SECURITY_DAC is not set
CONFIG_DEFAULT_SECURITY="selinux"
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_ASYNC_PQ=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_CRYPTO=y
#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_RNG_DEFAULT=y
CONFIG_CRYPTO_AKCIPHER2=y
CONFIG_CRYPTO_AKCIPHER=y
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_USER=m
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_NULL2=y
CONFIG_CRYPTO_PCRYPT=m
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=m
# CONFIG_CRYPTO_MCRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_ABLK_HELPER=m
CONFIG_CRYPTO_GLUE_HELPER_X86=m
#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=m
#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=m
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
# CONFIG_CRYPTO_KEYWRAP is not set
#
# Hash modes
#
CONFIG_CRYPTO_CMAC=m
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_VMAC=m
#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
CONFIG_CRYPTO_GHASH=m
# CONFIG_CRYPTO_POLY1305 is not set
# CONFIG_CRYPTO_POLY1305_X86_64 is not set
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA1_SSSE3=m
CONFIG_CRYPTO_SHA256_SSSE3=m
CONFIG_CRYPTO_SHA512_SSSE3=m
# CONFIG_CRYPTO_SHA1_MB is not set
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL=m
#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_X86_64=y
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_BLOWFISH_COMMON=m
CONFIG_CRYPTO_BLOWFISH_X86_64=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAMELLIA_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64=m
CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64=m
CONFIG_CRYPTO_CAST_COMMON=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST5_AVX_X86_64=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_CAST6_AVX_X86_64=m
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=m
CONFIG_CRYPTO_SALSA20_X86_64=m
# CONFIG_CRYPTO_CHACHA20 is not set
# CONFIG_CRYPTO_CHACHA20_X86_64 is not set
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_SERPENT_SSE2_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX_X86_64=m
CONFIG_CRYPTO_SERPENT_AVX2_X86_64=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
CONFIG_CRYPTO_TWOFISH_X86_64=m
CONFIG_CRYPTO_TWOFISH_X86_64_3WAY=m
CONFIG_CRYPTO_TWOFISH_AVX_X86_64=m
#
# Compression
#
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_LZO=y
# CONFIG_CRYPTO_842 is not set
# CONFIG_CRYPTO_LZ4 is not set
# CONFIG_CRYPTO_LZ4HC is not set
#
# Random Number Generation
#
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DRBG_MENU=y
CONFIG_CRYPTO_DRBG_HMAC=y
# CONFIG_CRYPTO_DRBG_HASH is not set
# CONFIG_CRYPTO_DRBG_CTR is not set
CONFIG_CRYPTO_DRBG=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRYPTO_USER_API=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CRYPTO_USER_API_SKCIPHER=y
# CONFIG_CRYPTO_USER_API_RNG is not set
# CONFIG_CRYPTO_USER_API_AEAD is not set
CONFIG_CRYPTO_HASH_INFO=y
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
# CONFIG_CRYPTO_DEV_CCP is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
# CONFIG_CRYPTO_DEV_QAT_C62X is not set
# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
# CONFIG_PKCS7_MESSAGE_PARSER is not set
#
# Certificates for signature checking
#
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_SYSTEM_TRUSTED_KEYS=""
# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
# CONFIG_SECONDARY_TRUSTED_KEYRING is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_KVM_COMPAT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_MMU_AUDIT=y
# CONFIG_KVM_DEVICE_ASSIGNMENT is not set
CONFIG_BINARY_PRINTF=y
#
# Library routines
#
CONFIG_RAID6_PQ=m
CONFIG_BITREVERSE=y
# CONFIG_HAVE_ARCH_BITREVERSE is not set
CONFIG_RATIONAL=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_IO=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
CONFIG_CRC8=m
# CONFIG_AUDIT_ARCH_COMPAT_GENERIC is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_XZ_DEC=y
CONFIG_XZ_DEC_X86=y
CONFIG_XZ_DEC_POWERPC=y
CONFIG_XZ_DEC_IA64=y
CONFIG_XZ_DEC_ARM=y
CONFIG_XZ_DEC_ARMTHUMB=y
CONFIG_XZ_DEC_SPARC=y
CONFIG_XZ_DEC_BCJ=y
# CONFIG_XZ_DEC_TEST is not set
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_ENC8=y
CONFIG_REED_SOLOMON_DEC8=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_INTERVAL_TREE=y
CONFIG_RADIX_TREE_MULTIORDER=y
CONFIG_ASSOCIATIVE_ARRAY=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_CPU_RMAP=y
CONFIG_DQL=y
CONFIG_GLOB=y
# CONFIG_GLOB_SELFTEST is not set
CONFIG_NLATTR=y
CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE=y
CONFIG_CLZ_TAB=y
CONFIG_CORDIC=m
# CONFIG_DDR is not set
CONFIG_IRQ_POLL=y
CONFIG_MPILIB=y
CONFIG_SIGNATURE=y
CONFIG_OID_REGISTRY=y
CONFIG_UCS2_STRING=y
CONFIG_FONT_SUPPORT=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_SG_SPLIT is not set
CONFIG_SG_POOL=y
CONFIG_ARCH_HAS_SG_CHAIN=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_MMIO_FLUSH=y
[-- Attachment #3: job.yaml --]
[-- Type: text/plain, Size: 3944 bytes --]
---
suite: aim7
testcase: aim7
category: benchmark
disk: 1BRD_48G
fs: xfs
aim7:
test: disk_wrt
load: 3000
job_origin: "/lkp/lkp/src/allot/cyclic:linux-devel:devel-hourly/ivb44/aim7-fs-1brd.yaml"
queue: bisect
testbox: ivb44
tbox_group: ivb44
rootfs: debian-x86_64-2015-02-07.cgz
job_file: "/lkp/scheduled/ivb44/aim7-1BRD_48G-xfs-disk_wrt-3000-performance-debian-x86_64-2015-02-07.cgz-68a9f5e7007c1afa2cf6830b690a90d0187c0684-20160808-100317-8vi4ke-0.yaml"
id: 72d1a5e8f77e4181b9db78845c9a2cc37165471d
model: Ivytown Ivy Bridge-EP
nr_cpu: 48
memory: 64G
nr_hdd_partitions: 3
hdd_partitions: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36*-part1"
swap_partitions: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36795753-part2"
rootfs_partition: "/dev/disk/by-id/ata-WDC_WD1003FBYZ-010FB0_WD-WCAW36795753-part3"
netconsole_port: 6644
kmsg:
iostat:
heartbeat:
vmstat:
numa-numastat:
numa-vmstat:
numa-meminfo:
proc-vmstat:
proc-stat:
meminfo:
slabinfo:
interrupts:
lock_stat:
latency_stats:
softirqs:
bdi_dev_mapping:
diskstats:
nfsstat:
cpuidle:
cpufreq-stats:
turbostat:
sched_debug:
perf-stat:
perf-profile:
cpufreq_governor: performance
commit: 68a9f5e7007c1afa2cf6830b690a90d0187c0684
need_kconfig:
- CONFIG_BLK_DEV_RAM
- CONFIG_BLK_DEV
- CONFIG_BLOCK
- CONFIG_XFS_FS
kconfig: x86_64-rhel
compiler: gcc-6
enqueue_time: 2016-08-08 19:50:12.582832286 +08:00
user: lkp
head_commit: 1f11daae97cb85d6472f4e21a39e8e95af20d74c
base_commit: 523d939ef98fd712632d93a5a2b588e477a7565e
branch: linux-devel/devel-hourly-2016080806
result_root: "/result/aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/0"
LKP_SERVER: inn
max_uptime: 785.16
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/ivb44/aim7-1BRD_48G-xfs-disk_wrt-3000-performance-debian-x86_64-2015-02-07.cgz-68a9f5e7007c1afa2cf6830b690a90d0187c0684-20160808-100317-8vi4ke-0.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linux-devel/devel-hourly-2016080806
- commit=68a9f5e7007c1afa2cf6830b690a90d0187c0684
- BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/vmlinuz-4.7.0-rc1-00007-g68a9f5e
- max_uptime=785
- RESULT_ROOT=/result/aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/0
- LKP_SERVER=inn
- debug
- apic=debug
- sysrq_always_enabled
- rcupdate.rcu_cpu_stall_timeout=100
- panic=-1
- softlockup_panic=1
- nmi_watchdog=panic
- oops=panic
- load_ramdisk=2
- prompt_ramdisk=0
- systemd.log_level=err
- ignore_loglevel
- earlyprintk=ttyS0,115200
- console=ttyS0,115200
- console=tty0
- vga=normal
- rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs.cgz,/lkp/benchmarks/aim7-x86_64.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/lkp/benchmarks/perf-stat-x86_64.cgz,/lkp/benchmarks/perf-profile-x86_64.cgz"
linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/linux-headers.cgz"
site: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
oom-killer:
watchdog:
nfs-hang:
repeat_to: 2
bad_samples:
- 486.19
- 481.48
- 483.32
#! queue options
#! user overrides
#! schedule options
kernel: "/pkg/linux/x86_64-rhel/gcc-6/68a9f5e7007c1afa2cf6830b690a90d0187c0684/vmlinuz-4.7.0-rc1-00007-g68a9f5e"
dequeue_time: 2016-08-08 20:30:06.296125189 +08:00
#! include/site/inn
#! runtime status
job_state: finished
loadavg: 1023.11 266.66 89.82 1/605 5699
start_time: '1470659457'
end_time: '1470659500'
version: "/lkp/lkp/.src-20160808-151458"
[-- Attachment #4: reproduce.ksh --]
[-- Type: text/plain, Size: 304 bytes --]
2016-08-08 20:30:55 dmsetup remove_all
2016-08-08 20:30:55 wipefs -a --force /dev/ram0
2016-08-08 20:30:55 mkfs -t xfs /dev/ram0
2016-08-08 20:30:55 mount -t xfs -o nobarrier,inode64 /dev/ram0 /fs/ram0
for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > $file
done
* Re: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-09 14:33 ` kernel test robot
@ 2016-08-10 18:24 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-10 18:24 UTC (permalink / raw)
To: kernel test robot
Cc: Christoph Hellwig, Dave Chinner, Bob Peterson, LKML, LKP
On Tue, Aug 9, 2016 at 7:33 AM, kernel test robot <xiaolong.ye@intel.com> wrote:
>
> FYI, we noticed a -13.6% regression of aim7.jobs-per-min due to commit:
> 68a9f5e7007c ("xfs: implement iomap based buffered write path")
>
> in testcase: aim7
> on test machine: 48 threads Ivytown Ivy Bridge-EP with 64G memory
> with following parameters:
>
> disk: 1BRD_48G
> fs: xfs
> test: disk_wrt
> load: 3000
> cpufreq_governor: performance
Christoph, Dave, was this expected?
From looking at the numbers, it looks like much more IO going on (and
thus less CPU load)...
> 37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time
> 37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time.max
> 6424 ± 1% +31.3% 8432 ± 1% aim7.time.involuntary_context_switches
> 4003 ± 0% +28.1% 5129 ± 1% proc-vmstat.nr_active_file
> 979.25 ± 0% +63.7% 1602 ± 1% proc-vmstat.pgactivate
> 4699 ± 3% +162.6% 12340 ± 73% proc-vmstat.pgpgout
> 50.23 ± 19% -27.3% 36.50 ± 17% sched_debug.cpu.cpu_load[1].avg
> 466.50 ± 29% -51.8% 225.00 ± 73% sched_debug.cpu.cpu_load[1].max
> 77.78 ± 33% -50.6% 38.40 ± 57% sched_debug.cpu.cpu_load[1].stddev
> 300.50 ± 33% -52.9% 141.50 ± 48% sched_debug.cpu.cpu_load[2].max
> 1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.active_objs
> 1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.num_objs
> 431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.active_objs
> 431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.num_objs
but what do I know. Those profiles from the robot are pretty hard to
make sense of.
Linus
* Re: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-10 18:24 ` Linus Torvalds
@ 2016-08-10 23:08 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-10 23:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: kernel test robot, Christoph Hellwig, Bob Peterson, LKML, LKP
On Wed, Aug 10, 2016 at 11:24:16AM -0700, Linus Torvalds wrote:
> On Tue, Aug 9, 2016 at 7:33 AM, kernel test robot <xiaolong.ye@intel.com> wrote:
> >
> > FYI, we noticed a -13.6% regression of aim7.jobs-per-min due to commit:
> > 68a9f5e7007c ("xfs: implement iomap based buffered write path")
> >
> > in testcase: aim7
> > on test machine: 48 threads Ivytown Ivy Bridge-EP with 64G memory
> > with following parameters:
> >
> > disk: 1BRD_48G
> > fs: xfs
> > test: disk_wrt
> > load: 3000
> > cpufreq_governor: performance
>
> Christoph, Dave, was this expected?
No. I would have expected the performance to go the other way -
there is less overhead in the write() path now than there was
previously, and all my numbers go the other way (5-10%
improvements) in throughput.
> From looking at the numbers, it looks like much more IO going on (and
> thus less CPU load)...
I read the numbers the other way, but to me the numbers do not
indicate anything about IO load.
> > 37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time
> > 37.23 ± 0% +15.6% 43.04 ± 0% aim7.time.elapsed_time.max
> > 6424 ± 1% +31.3% 8432 ± 1% aim7.time.involuntary_context_switches
> > 4003 ± 0% +28.1% 5129 ± 1% proc-vmstat.nr_active_file
> > 979.25 ± 0% +63.7% 1602 ± 1% proc-vmstat.pgactivate
> > 4699 ± 3% +162.6% 12340 ± 73% proc-vmstat.pgpgout
> > 50.23 ± 19% -27.3% 36.50 ± 17% sched_debug.cpu.cpu_load[1].avg
> > 466.50 ± 29% -51.8% 225.00 ± 73% sched_debug.cpu.cpu_load[1].max
> > 77.78 ± 33% -50.6% 38.40 ± 57% sched_debug.cpu.cpu_load[1].stddev
> > 300.50 ± 33% -52.9% 141.50 ± 48% sched_debug.cpu.cpu_load[2].max
> > 1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.active_objs
> > 1836 ± 10% +65.5% 3039 ± 8% slabinfo.scsi_data_buffer.num_objs
> > 431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.active_objs
> > 431.75 ± 10% +65.6% 715.00 ± 8% slabinfo.xfs_efd_item.num_objs
>
> but what do I know. Those profiles from the robot are pretty hard to
> make sense of.
Yup, I can't infer anything from most of the stats present. The only
thing that stood out is that there's clearly a significant reduction
in context switches:
429058 ± 0% -20.0% 343371 ± 0% aim7.time.voluntary_context_switches
....
972882 ± 0% -17.4% 803990 ± 0% perf-stat.context-switches
and a significant increase in system CPU time:
376.31 ± 0% +28.5% 483.48 ± 0% aim7.time.system_time
....
1.452e+12 ± 6% +29.5% 1.879e+12 ± 4% perf-stat.instructions
42168 ± 16% +27.5% 53751 ± 6% perf-stat.instructions-per-iTLB-miss
It looks to me like the extra system time is running more loops
in the same code footprint, not because we are executing a bigger
or different footprint of code.
That, to me, says there's a change in lock contention behaviour in
the workload (which we know aim7 is good at exposing). i.e. the
iomap change shifted contention from a sleeping lock to a spinning
lock, or maybe we now trigger optimistic spinning behaviour on a
lock we previously didn't spin on at all.
We really need instruction level perf profiles to understand
this - I don't have a machine with this many cpu cores available
locally, so I'm not sure I'm going to be able to make any progress
tracking it down in the short term. Maybe the lkp team has more
in-depth cpu usage profiles they can share?
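As a sketch of the kind of collection being asked for here, stock perf can record a system-wide call-graph profile and then drill down to the instruction level. The sampling duration, output file name, and annotated symbol below are illustrative assumptions, not the lkp robot's actual setup:

```shell
#!/bin/sh
# Sketch only: sample all CPUs with call graphs while the benchmark runs,
# list the hottest symbols, then annotate one suspect function down to
# individual instructions. File name, duration, and symbol are assumptions.
if command -v perf >/dev/null 2>&1; then
    perf record -a -g -o perf.data.aim7 -- sleep 30   # system-wide, 30s window
    perf report -i perf.data.aim7 --sort symbol --stdio | head -n 30
    # Instruction-level view of one function (symbol name is hypothetical):
    perf annotate -i perf.data.aim7 --stdio xfs_file_buffered_aio_write
else
    echo "perf not installed; skipping profile collection"
fi
```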
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-10 23:08 ` Dave Chinner
@ 2016-08-10 23:51 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-10 23:51 UTC (permalink / raw)
To: Dave Chinner, Wu Fengguang
Cc: kernel test robot, Christoph Hellwig, Bob Peterson, LKML, LKP
On Wed, Aug 10, 2016 at 4:08 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> That, to me, says there's a change in lock contention behaviour in
> the workload (which we know aim7 is good at exposing). i.e. the
> iomap change shifted contention from a sleeping lock to a spinning
> lock, or maybe we now trigger optimistic spinning behaviour on a
> lock we previously didn't spin on at all.
Hmm. Possibly. I reacted to the lower cpu load number, but yeah, I
could easily imagine some locking primitive difference too.
> We really need instruction level perf profiles to understand
> this - I don't have a machine with this many cpu cores available
> locally, so I'm not sure I'm going to be able to make any progress
> tracking it down in the short term. Maybe the lkp team has more
> in-depth cpu usage profiles they can share?
Yeah, I've occasionally wanted to see some kind of "top-25 kernel
functions in the profile" thing. That said, when the load isn't all
that familiar, the profiles usually are not all that easy to make
sense of either. But comparing the before and after state might give
us clues.
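Comparing the before and after state symbol by symbol is what `perf diff` does. A sketch, assuming two hypothetical recordings taken at the parent commit (f0c6bcba74) and at the offending commit (68a9f5e700):

```shell
#!/bin/sh
# Sketch only: diff two profiles, one per kernel build, showing the change
# in each symbol's overhead. Both data file names are assumptions.
if command -v perf >/dev/null 2>&1 \
   && [ -e perf.data.f0c6bcba74 ] && [ -e perf.data.68a9f5e700 ]; then
    perf diff perf.data.f0c6bcba74 perf.data.68a9f5e700 \
        --sort symbol | head -n 25    # rough "top-25 kernel functions" view
else
    echo "perf or profile data not available; skipping comparison"
fi
```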
Fengguang?
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-10 23:51 ` Linus Torvalds
@ 2016-08-10 23:58 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-10 23:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Wu Fengguang, Bob Peterson, LKML, LKP, Christoph Hellwig
Hi, Linus,
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, Aug 10, 2016 at 4:08 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> That, to me, says there's a change in lock contention behaviour in
>> the workload (which we know aim7 is good at exposing). i.e. the
>> iomap change shifted contention from a sleeping lock to a spinning
>> lock, or maybe we now trigger optimistic spinning behaviour on a
>> lock we previously didn't spin on at all.
>
> Hmm. Possibly. I reacted to the lower cpu load number, but yeah, I
> could easily imagine some locking primitive difference too.
>
>> We really need instruction level perf profiles to understand
>> this - I don't have a machine with this many cpu cores available
>> locally, so I'm not sure I'm going to be able to make any progress
>> tracking it down in the short term. Maybe the lkp team has more
>> in-depth cpu usage profiles they can share?
>
> Yeah, I've occasionally wanted to see some kind of "top-25 kernel
> functions in the profile" thing. That said, when the load isn't all
> that familiar, the profiles usually are not all that easy to make
> sense of either. But comparing the before and after state might give
> us clues.
I have started perf-profile data collection, will send out the
comparison result soon.
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-10 23:58 ` Huang, Ying
@ 2016-08-11 0:11 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-11 0:11 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, Dave Chinner, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
"Huang, Ying" <ying.huang@intel.com> writes:
> Hi, Linus,
>
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> On Wed, Aug 10, 2016 at 4:08 PM, Dave Chinner <david@fromorbit.com> wrote:
>>>
>>> That, to me, says there's a change in lock contention behaviour in
>>> the workload (which we know aim7 is good at exposing). i.e. the
>>> iomap change shifted contention from a sleeping lock to a spinning
>>> lock, or maybe we now trigger optimistic spinning behaviour on a
>>> lock we previously didn't spin on at all.
>>
>> Hmm. Possibly. I reacted to the lower cpu load number, but yeah, I
>> could easily imagine some locking primitive difference too.
>>
>>> We really need instruction level perf profiles to understand
>>> this - I don't have a machine with this many cpu cores available
>>> locally, so I'm not sure I'm going to be able to make any progress
>>> tracking it down in the short term. Maybe the lkp team has more
>>> in-depth cpu usage profiles they can share?
>>
>> Yeah, I've occasionally wanted to see some kind of "top-25 kernel
>> functions in the profile" thing. That said, when the load isn't all
>> that familiar, the profiles usually are not all that easy to make
>> sense of either. But comparing the before and after state might give
>> us clues.
>
> I have started perf-profile data collection, will send out the
> comparison result soon.
Here is the comparison result with perf-profile data.
=========================================================================================
compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
68a9f5e7007c1afa2cf6830b690a90d0187c0684
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69
---------------- --------------------------
%stddev %change %stddev
\ | \
484435 ± 0% -13.3% 420004 ± 0% aim7.jobs-per-min
37.37 ± 0% +15.3% 43.09 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% aim7.time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% aim7.time.involuntary_context_switches
376.89 ± 0% +28.4% 484.11 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% aim7.time.voluntary_context_switches
26816 ± 8% +10.2% 29542 ± 1% interrupts.CAL:Function_call_interrupts
125122 ± 10% -10.7% 111758 ± 12% softirqs.SCHED
24772 ± 0% -28.6% 17675 ± 0% vmstat.system.cs
53477 ± 2% +5.6% 56453 ± 0% vmstat.system.in
15627 ± 0% +27.7% 19956 ± 1% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% meminfo.AnonHugePages
132898 ± 9% +15.4% 153380 ± 1% meminfo.DirectMap4k
13777 ± 5% +43.1% 19709 ± 0% meminfo.Shmem
3906 ± 0% +28.8% 5032 ± 2% proc-vmstat.nr_active_file
919.33 ± 5% +14.8% 1055 ± 8% proc-vmstat.nr_dirty
3444 ± 5% +41.8% 4884 ± 0% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% proc-vmstat.pgactivate
1975 ± 15% +63.2% 3224 ± 17% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% slabinfo.scsi_data_buffer.num_objs
464.33 ± 15% +63.3% 758.33 ± 17% slabinfo.xfs_efd_item.active_objs
464.33 ± 15% +63.3% 758.33 ± 17% slabinfo.xfs_efd_item.num_objs
1724300 ± 27% -40.5% 1025538 ± 1% sched_debug.cfs_rq:/.load.max
96.36 ± 3% +18.6% 114.32 ± 15% sched_debug.cfs_rq:/.util_avg.stddev
1724300 ± 27% -40.5% 1025538 ± 1% sched_debug.cpu.load.max
2887 ± 30% -28.2% 2073 ± 48% sched_debug.cpu.nr_load_updates.min
7.66 ± 20% -24.9% 5.75 ± 15% sched_debug.cpu.nr_uninterruptible.stddev
37.37 ± 0% +15.3% 43.09 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% time.percent_of_cpu_this_job_got
376.89 ± 0% +28.4% 484.11 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% time.voluntary_context_switches
24.18 ± 0% +9.0% 26.35 ± 0% turbostat.%Busy
686.00 ± 0% +9.5% 751.00 ± 0% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% turbostat.CPU%c3
93.33 ± 1% +3.0% 96.15 ± 0% turbostat.CorWatt
124.61 ± 0% +2.1% 127.17 ± 0% turbostat.PkgWatt
4.74 ± 0% -2.7% 4.61 ± 1% turbostat.RAMWatt
7723 ± 0% +32.6% 10238 ± 5% numa-meminfo.node0.Active(file)
1589 ± 17% +45.5% 2313 ± 24% numa-meminfo.node0.Dirty
56052 ± 3% +58.2% 88666 ± 17% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% numa-meminfo.node1.Active(anon)
7908 ± 1% +22.9% 9722 ± 3% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% numa-meminfo.node1.AnonPages
4789 ± 69% +102.3% 9687 ± 9% numa-meminfo.node1.Shmem
52991525 ± 1% -19.4% 42687208 ± 0% cpuidle.C1-IVT.time
319584 ± 1% -26.5% 234868 ± 1% cpuidle.C1-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% cpuidle.C1E-IVT.time
46760 ± 0% -22.4% 36298 ± 0% cpuidle.C1E-IVT.usage
12590471 ± 0% -22.3% 9788585 ± 1% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% cpuidle.C6-IVT.time
352.33 ± 8% -24.7% 265.33 ± 1% cpuidle.POLL.usage
1930 ± 0% +33.9% 2585 ± 3% numa-vmstat.node0.nr_active_file
4468 ± 7% -8.5% 4089 ± 5% numa-vmstat.node0.nr_alloc_batch
466.67 ± 4% +29.3% 603.33 ± 14% numa-vmstat.node0.nr_dirty
12026 ± 4% +64.1% 19734 ± 20% numa-vmstat.node1.nr_active_anon
1977 ± 1% +23.6% 2444 ± 1% numa-vmstat.node1.nr_active_file
3809 ± 6% +16.1% 4422 ± 4% numa-vmstat.node1.nr_alloc_batch
11671 ± 3% +55.9% 18197 ± 24% numa-vmstat.node1.nr_anon_pages
1197 ± 69% +102.3% 2422 ± 9% numa-vmstat.node1.nr_shmem
456.33 ± 57% -75.6% 111.33 ± 86% numa-vmstat.node1.nr_written
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% perf-stat.dTLB-load-miss-rate
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% perf-stat.dTLB-store-miss-rate
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% perf-stat.dTLB-stores
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% perf-stat.node-load-misses
9.96 ± 0% +13.4% 11.30 ± 0% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% perf-stat.node-store-misses
262780 ± 0% +4.4% 274227 ± 1% perf-stat.page-faults
11.31 ± 1% -18.1% 9.27 ± 0% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.68 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.80 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.55 ± 3% -14.2% 2.19 ± 2% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 4.45 ± 1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
5.93 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
13.71 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.64 ± 0% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.04 ± 2% -18.9% 0.84 ± 1% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
11.24 ± 2% -18.1% 9.21 ± 0% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.31 ± 2% -18.1% 9.26 ± 0% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
0.00 ± -1% +Inf% 1.09 ± 2% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.32 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.68 ± 2% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
3.04 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
1.00 ± 1% -18.0% 0.82 ± 1% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
1.12 ± 2% -17.6% 0.92 ± 4% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.38 ± 2% -13.3% 1.19 ± 1% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
54.10 ± 1% +13.1% 61.20 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
6.34 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 3.69 ± 1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
4.02 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
0.98 ± 5% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 2.56 ± 2% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
2.91 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
3.42 ± 0% -20.9% 2.71 ± 2% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
0.00 ± -1% +Inf% 4.69 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
6.24 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
19.18 ± 5% -9.3% 17.40 ± 0% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
0.94 ± 4% -19.8% 0.76 ± 0% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
3.95 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 3.22 ± 0% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
19.75 ± 5% -9.8% 17.81 ± 0% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
19.18 ± 5% -9.3% 17.40 ± 0% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
18.45 ± 5% -9.2% 16.75 ± 0% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.44 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 1.18 ± 1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
1.86 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
0.00 ± -1% +Inf% 1.53 ± 1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
1.74 ± 2% -19.9% 1.40 ± 3% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.27 ± 0% -22.5% 0.99 ± 4% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.66 ± 1% -24.3% 2.01 ± 1% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 2% -28.2% 1.28 ± 3% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1.01 ± 3% -17.9% 0.83 ± 2% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.26 ± 2% -18.1% 9.23 ± 0% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.21 ± 2% -18.1% 9.18 ± 0% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.34 ± 2% -18.1% 9.29 ± 0% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.55 ± 3% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
1.83 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
43.95 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 7.91 ± 1% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
10.68 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
1.91 ± 3% -16.4% 1.59 ± 1% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
0.00 ± -1% +Inf% 9.85 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
10.96 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 52.29 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 52.94 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 34.35 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 16.48 ± 0% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
11.22 ± 2% -18.1% 9.19 ± 0% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
0.00 ± -1% +Inf% 1.55 ± 1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.72 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 2.78 ± 0% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
3.39 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
0.00 ± -1% +Inf% 3.44 ± 1% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.03 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.43 ± 0% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 9.25 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
10.37 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
2.58 ± 1% -24.1% 1.96 ± 0% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.17 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.06 ± 3% -22.5% 1.60 ± 2% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.79 ± 3% -22.2% 1.39 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% perf-profile.cycles-pp.start_secondary
2.67 ± 1% -24.2% 2.02 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
60.98 ± 1% +9.5% 66.76 ± 0% perf-profile.cycles-pp.sys_write.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.32 ± 1% -18.0% 9.28 ± 0% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
5.96 ± 1% -20.0% 4.77 ± 0% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
9.89 ± 2% -17.4% 8.17 ± 0% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
9.87 ± 2% -17.5% 8.15 ± 0% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
2.07 ± 1% -20.4% 1.65 ± 2% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
2.40 ± 1% -21.0% 1.89 ± 2% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
0.00 ± -1% +Inf% 1.36 ± 1% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.72 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
59.63 ± 1% +10.2% 65.72 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.52 ± 2% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.73 ± 1% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.97 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.61 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.24 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.46 ± 1% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.21 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
1.25 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 3.06 ± 1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
1.04 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 3.04 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.05 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
1.32 ± 2% -21.5% 1.04 ± 1% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
51.83 ± 1% +14.3% 59.25 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
0.00 ± -1% +Inf% 16.05 ± 0% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
53.16 ± 1% +13.6% 60.40 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.42 ± 2% -21.2% 1.12 ± 1% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
6.46 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.29 ± 3% -18.9% 1.04 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 1.14 ± 3% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
1.21 ± 1% -23.4% 0.93 ± 4% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
1.23 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 4.14 ± 0% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
3.28 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 9.08 ± 0% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.54 ± 0% -20.8% 2.81 ± 1% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.35 ± 1% -21.0% 1.86 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
25.10 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.03 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
1.42 ± 2% -20.7% 1.13 ± 1% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
1.42 ± 2% -20.5% 1.13 ± 1% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
2.27 ± 1% -10.6% 2.03 ± 0% perf-profile.func.cycles-pp.___might_sleep
2.49 ± 0% -34.5% 1.63 ± 1% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.51 ± 2% +15.4% 1.75 ± 1% perf-profile.func.cycles-pp.__block_write_begin_int
1.79 ± 4% -16.8% 1.49 ± 1% perf-profile.func.cycles-pp.__mark_inode_dirty
1.32 ± 0% -16.4% 1.10 ± 1% perf-profile.func.cycles-pp.__radix_tree_lookup
1.08 ± 2% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.__xfs_get_blocks
1.16 ± 0% -18.1% 0.95 ± 1% perf-profile.func.cycles-pp._raw_spin_lock
3.96 ± 2% -18.4% 3.23 ± 0% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.41 ± 3% -20.6% 1.12 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.30 ± 2% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.generic_perform_write
1.31 ± 2% -46.7% 0.70 ± 0% perf-profile.func.cycles-pp.generic_write_end
18.43 ± 5% -9.1% 16.76 ± 0% perf-profile.func.cycles-pp.intel_idle
0.00 ± -1% +Inf% 1.12 ± 1% perf-profile.func.cycles-pp.iomap_write_actor
1.50 ± 1% -20.9% 1.19 ± 1% perf-profile.func.cycles-pp.mark_buffer_dirty
0.00 ± -1% +Inf% 1.91 ± 1% perf-profile.func.cycles-pp.mark_page_accessed
3.24 ± 0% -19.8% 2.60 ± 0% perf-profile.func.cycles-pp.memset_erms
1.75 ± 2% -18.9% 1.42 ± 1% perf-profile.func.cycles-pp.unlock_page
1.16 ± 1% -21.6% 0.91 ± 1% perf-profile.func.cycles-pp.vfs_write
0.37 ± 2% +243.6% 1.26 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.41 ± 1% +198.4% 1.22 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.70 ± 5% +219.5% 2.24 ± 0% perf-profile.func.cycles-pp.xfs_bmapi_read
1.05 ± 2% -15.6% 0.88 ± 3% perf-profile.func.cycles-pp.xfs_file_write_iter
0.64 ± 1% +182.8% 1.81 ± 4% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.00 ± -1% +Inf% 1.10 ± 3% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
0.46 ± 4% +161.6% 1.20 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
Best Regards,
Huang, Ying
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-11 0:11 ` Huang, Ying
From: Huang, Ying @ 2016-08-11 0:11 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 30493 bytes --]
"Huang, Ying" <ying.huang@intel.com> writes:
> Hi, Linus,
>
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> On Wed, Aug 10, 2016 at 4:08 PM, Dave Chinner <david@fromorbit.com> wrote:
>>>
>>> That, to me, says there's a change in lock contention behaviour in
>>> the workload (which we know aim7 is good at exposing). i.e. the
>>> iomap change shifted contention from a sleeping lock to a spinning
>>> lock, or maybe we now trigger optimistic spinning behaviour on a
>>> lock we previously didn't spin on at all.
>>
>> Hmm. Possibly. I reacted to the lower cpu load number, but yeah, I
>> could easily imagine some locking primitive difference too.
>>
>>> We really need instruction level perf profiles to understand
>>> this - I don't have a machine with this many cpu cores available
>>> locally, so I'm not sure I'm going to be able to make any progress
>>> tracking it down in the short term. Maybe the lkp team has more
>>> in-depth cpu usage profiles they can share?
>>
>> Yeah, I've occasionally wanted to see some kind of "top-25 kernel
>> functions in the profile" thing. That said, when the load isn't all
>> that familiar, the profiles usually are not all that easy to make
>> sense of either. But comparing the before and after state might give
>> us clues.
>
> I have started perf-profile data collection, will send out the
> comparison result soon.
Here is the comparison result with perf-profile data.
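(Editor's note: the %change column throughout these tables is the relative difference between the two commits' mean values. A minimal sketch of that computation, using the aim7.jobs-per-min means reported below — the variable names are illustrative, not part of the lkp tooling:)

```python
# Sketch: how the %change column in the lkp comparison tables is derived.
# The two values are the aim7.jobs-per-min means reported for each commit below.
base = 484435      # f0c6bcba74 ("xfs: reorder zeroing and flushing sequence in truncate")
patched = 420004   # 68a9f5e700 ("xfs: implement iomap based buffered write path")

pct_change = (patched - base) / base * 100
print(f"{pct_change:+.1f}%")  # -13.3%, matching the regression reported in the table
```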
=========================================================================================
compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
68a9f5e7007c1afa2cf6830b690a90d0187c0684
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69
---------------- --------------------------
%stddev %change %stddev
\ | \
484435 ± 0% -13.3% 420004 ± 0% aim7.jobs-per-min
37.37 ± 0% +15.3% 43.09 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% aim7.time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% aim7.time.involuntary_context_switches
376.89 ± 0% +28.4% 484.11 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% aim7.time.voluntary_context_switches
26816 ± 8% +10.2% 29542 ± 1% interrupts.CAL:Function_call_interrupts
125122 ± 10% -10.7% 111758 ± 12% softirqs.SCHED
24772 ± 0% -28.6% 17675 ± 0% vmstat.system.cs
53477 ± 2% +5.6% 56453 ± 0% vmstat.system.in
15627 ± 0% +27.7% 19956 ± 1% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% meminfo.AnonHugePages
132898 ± 9% +15.4% 153380 ± 1% meminfo.DirectMap4k
13777 ± 5% +43.1% 19709 ± 0% meminfo.Shmem
3906 ± 0% +28.8% 5032 ± 2% proc-vmstat.nr_active_file
919.33 ± 5% +14.8% 1055 ± 8% proc-vmstat.nr_dirty
3444 ± 5% +41.8% 4884 ± 0% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% proc-vmstat.pgactivate
1975 ± 15% +63.2% 3224 ± 17% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% slabinfo.scsi_data_buffer.num_objs
464.33 ± 15% +63.3% 758.33 ± 17% slabinfo.xfs_efd_item.active_objs
464.33 ± 15% +63.3% 758.33 ± 17% slabinfo.xfs_efd_item.num_objs
1724300 ± 27% -40.5% 1025538 ± 1% sched_debug.cfs_rq:/.load.max
96.36 ± 3% +18.6% 114.32 ± 15% sched_debug.cfs_rq:/.util_avg.stddev
1724300 ± 27% -40.5% 1025538 ± 1% sched_debug.cpu.load.max
2887 ± 30% -28.2% 2073 ± 48% sched_debug.cpu.nr_load_updates.min
7.66 ± 20% -24.9% 5.75 ± 15% sched_debug.cpu.nr_uninterruptible.stddev
37.37 ± 0% +15.3% 43.09 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% time.percent_of_cpu_this_job_got
376.89 ± 0% +28.4% 484.11 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% time.voluntary_context_switches
24.18 ± 0% +9.0% 26.35 ± 0% turbostat.%Busy
686.00 ± 0% +9.5% 751.00 ± 0% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% turbostat.CPU%c3
93.33 ± 1% +3.0% 96.15 ± 0% turbostat.CorWatt
124.61 ± 0% +2.1% 127.17 ± 0% turbostat.PkgWatt
4.74 ± 0% -2.7% 4.61 ± 1% turbostat.RAMWatt
7723 ± 0% +32.6% 10238 ± 5% numa-meminfo.node0.Active(file)
1589 ± 17% +45.5% 2313 ± 24% numa-meminfo.node0.Dirty
56052 ± 3% +58.2% 88666 ± 17% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% numa-meminfo.node1.Active(anon)
7908 ± 1% +22.9% 9722 ± 3% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% numa-meminfo.node1.AnonPages
4789 ± 69% +102.3% 9687 ± 9% numa-meminfo.node1.Shmem
52991525 ± 1% -19.4% 42687208 ± 0% cpuidle.C1-IVT.time
319584 ± 1% -26.5% 234868 ± 1% cpuidle.C1-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% cpuidle.C1E-IVT.time
46760 ± 0% -22.4% 36298 ± 0% cpuidle.C1E-IVT.usage
12590471 ± 0% -22.3% 9788585 ± 1% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% cpuidle.C6-IVT.time
352.33 ± 8% -24.7% 265.33 ± 1% cpuidle.POLL.usage
1930 ± 0% +33.9% 2585 ± 3% numa-vmstat.node0.nr_active_file
4468 ± 7% -8.5% 4089 ± 5% numa-vmstat.node0.nr_alloc_batch
466.67 ± 4% +29.3% 603.33 ± 14% numa-vmstat.node0.nr_dirty
12026 ± 4% +64.1% 19734 ± 20% numa-vmstat.node1.nr_active_anon
1977 ± 1% +23.6% 2444 ± 1% numa-vmstat.node1.nr_active_file
3809 ± 6% +16.1% 4422 ± 4% numa-vmstat.node1.nr_alloc_batch
11671 ± 3% +55.9% 18197 ± 24% numa-vmstat.node1.nr_anon_pages
1197 ± 69% +102.3% 2422 ± 9% numa-vmstat.node1.nr_shmem
456.33 ± 57% -75.6% 111.33 ± 86% numa-vmstat.node1.nr_written
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% perf-stat.dTLB-load-miss-rate
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% perf-stat.dTLB-store-miss-rate
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% perf-stat.dTLB-stores
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% perf-stat.node-load-misses
9.96 ± 0% +13.4% 11.30 ± 0% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% perf-stat.node-store-misses
262780 ± 0% +4.4% 274227 ± 1% perf-stat.page-faults
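(Editor's note: the perf-stat.ipc figures above can be cross-checked against the raw instruction and cycle counts in the same table. A quick sketch of that consistency check, with illustrative variable names:)

```python
# Sketch: perf-stat.ipc should equal perf-stat.instructions / perf-stat.cpu-cycles,
# using the per-commit means from the rows above.
insns_base,    cycles_base    = 1.49e12,  1.511e12   # f0c6bcba74
insns_patched, cycles_patched = 1.939e12, 1.864e12   # 68a9f5e700

print(round(insns_base / cycles_base, 2))        # 0.99, matching the table
print(round(insns_patched / cycles_patched, 2))  # 1.04, matching the table
```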
11.31 ± 1% -18.1% 9.27 ± 0% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.68 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.80 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.55 ± 3% -14.2% 2.19 ± 2% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 4.45 ± 1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
5.93 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
13.71 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.64 ± 0% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.04 ± 2% -18.9% 0.84 ± 1% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
11.24 ± 2% -18.1% 9.21 ± 0% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.31 ± 2% -18.1% 9.26 ± 0% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
0.00 ± -1% +Inf% 1.09 ± 2% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.32 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.68 ± 2% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
3.04 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
1.00 ± 1% -18.0% 0.82 ± 1% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
1.12 ± 2% -17.6% 0.92 ± 4% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.38 ± 2% -13.3% 1.19 ± 1% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
54.10 ± 1% +13.1% 61.20 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
6.34 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 3.69 ± 1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
4.02 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
0.98 ± 5% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 2.56 ± 2% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
2.91 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
3.42 ± 0% -20.9% 2.71 ± 2% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
0.00 ± -1% +Inf% 4.69 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
6.24 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
19.18 ± 5% -9.3% 17.40 ± 0% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
0.94 ± 4% -19.8% 0.76 ± 0% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
3.95 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 3.22 ± 0% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
19.75 ± 5% -9.8% 17.81 ± 0% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
19.18 ± 5% -9.3% 17.40 ± 0% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
18.45 ± 5% -9.2% 16.75 ± 0% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.44 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 1.18 ± 1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
1.86 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
0.00 ± -1% +Inf% 1.53 ± 1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
1.74 ± 2% -19.9% 1.40 ± 3% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.27 ± 0% -22.5% 0.99 ± 4% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.66 ± 1% -24.3% 2.01 ± 1% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 2% -28.2% 1.28 ± 3% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1.01 ± 3% -17.9% 0.83 ± 2% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.26 ± 2% -18.1% 9.23 ± 0% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.21 ± 2% -18.1% 9.18 ± 0% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.34 ± 2% -18.1% 9.29 ± 0% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.55 ± 3% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
1.83 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
43.95 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 7.91 ± 1% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
10.68 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
1.91 ± 3% -16.4% 1.59 ± 1% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
0.00 ± -1% +Inf% 9.85 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
10.96 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 52.29 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 52.94 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 34.35 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 16.48 ± 0% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
11.22 ± 2% -18.1% 9.19 ± 0% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
0.00 ± -1% +Inf% 1.55 ± 1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.72 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 2.78 ± 0% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
3.39 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
0.00 ± -1% +Inf% 3.44 ± 1% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.03 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.43 ± 0% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 9.25 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
10.37 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
2.58 ± 1% -24.1% 1.96 ± 0% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.17 ± 3% -100.0% 0.00 ± -1% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.06 ± 3% -22.5% 1.60 ± 2% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.79 ± 3% -22.2% 1.39 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% perf-profile.cycles-pp.start_secondary
2.67 ± 1% -24.2% 2.02 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
60.98 ± 1% +9.5% 66.76 ± 0% perf-profile.cycles-pp.sys_write.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.32 ± 1% -18.0% 9.28 ± 0% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
5.96 ± 1% -20.0% 4.77 ± 0% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
9.89 ± 2% -17.4% 8.17 ± 0% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
9.87 ± 2% -17.5% 8.15 ± 0% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
2.07 ± 1% -20.4% 1.65 ± 2% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
2.40 ± 1% -21.0% 1.89 ± 2% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
0.00 ± -1% +Inf% 1.36 ± 1% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.72 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
59.63 ± 1% +10.2% 65.72 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.52 ± 2% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.73 ± 1% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.97 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.61 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.24 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.46 ± 1% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.21 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
1.25 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 3.06 ± 1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
1.04 ± 0% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 3.04 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.05 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
1.32 ± 2% -21.5% 1.04 ± 1% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
51.83 ± 1% +14.3% 59.25 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
0.00 ± -1% +Inf% 16.05 ± 0% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
53.16 ± 1% +13.6% 60.40 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.42 ± 2% -21.2% 1.12 ± 1% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
6.46 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.29 ± 3% -18.9% 1.04 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 1.14 ± 3% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
1.21 ± 1% -23.4% 0.93 ± 4% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
1.23 ± 4% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 4.14 ± 0% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
3.28 ± 2% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 9.08 ± 0% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.54 ± 0% -20.8% 2.81 ± 1% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.35 ± 1% -21.0% 1.86 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
25.10 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.03 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
1.42 ± 2% -20.7% 1.13 ± 1% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
1.42 ± 2% -20.5% 1.13 ± 1% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
2.27 ± 1% -10.6% 2.03 ± 0% perf-profile.func.cycles-pp.___might_sleep
2.49 ± 0% -34.5% 1.63 ± 1% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.51 ± 2% +15.4% 1.75 ± 1% perf-profile.func.cycles-pp.__block_write_begin_int
1.79 ± 4% -16.8% 1.49 ± 1% perf-profile.func.cycles-pp.__mark_inode_dirty
1.32 ± 0% -16.4% 1.10 ± 1% perf-profile.func.cycles-pp.__radix_tree_lookup
1.08 ± 2% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.__xfs_get_blocks
1.16 ± 0% -18.1% 0.95 ± 1% perf-profile.func.cycles-pp._raw_spin_lock
3.96 ± 2% -18.4% 3.23 ± 0% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.41 ± 3% -20.6% 1.12 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.30 ± 2% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.generic_perform_write
1.31 ± 2% -46.7% 0.70 ± 0% perf-profile.func.cycles-pp.generic_write_end
18.43 ± 5% -9.1% 16.76 ± 0% perf-profile.func.cycles-pp.intel_idle
0.00 ± -1% +Inf% 1.12 ± 1% perf-profile.func.cycles-pp.iomap_write_actor
1.50 ± 1% -20.9% 1.19 ± 1% perf-profile.func.cycles-pp.mark_buffer_dirty
0.00 ± -1% +Inf% 1.91 ± 1% perf-profile.func.cycles-pp.mark_page_accessed
3.24 ± 0% -19.8% 2.60 ± 0% perf-profile.func.cycles-pp.memset_erms
1.75 ± 2% -18.9% 1.42 ± 1% perf-profile.func.cycles-pp.unlock_page
1.16 ± 1% -21.6% 0.91 ± 1% perf-profile.func.cycles-pp.vfs_write
0.37 ± 2% +243.6% 1.26 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.41 ± 1% +198.4% 1.22 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.70 ± 5% +219.5% 2.24 ± 0% perf-profile.func.cycles-pp.xfs_bmapi_read
1.05 ± 2% -15.6% 0.88 ± 3% perf-profile.func.cycles-pp.xfs_file_write_iter
0.64 ± 1% +182.8% 1.81 ± 4% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.00 ± -1% +Inf% 1.10 ± 3% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
0.46 ± 4% +161.6% 1.20 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 0:11 ` Huang, Ying
@ 2016-08-11 0:23 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 0:23 UTC (permalink / raw)
To: Huang, Ying
Cc: Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Wed, Aug 10, 2016 at 5:11 PM, Huang, Ying <ying.huang@intel.com> wrote:
>
> Here is the comparison result with perf-profile data.
Heh. The diff is actually harder to read than just showing A/B
state. The fact that the call chain shows up as part of the symbol
makes it even more so.
For example:
> 0.00 ± -1% +Inf% 1.68 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
> 1.80 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
Ok, so it went from 1.8% to 1.68%, and isn't actually that big of a
change, but it shows up as a big change because the caller changed
from xfs_vm_write_begin to iomap_write_begin.
There's a few other cases of that too.
So I think it would actually be easier to just see "what 20 functions
were the hottest" (or maybe 50) before and after separately (just
sorted by cycles), without the diff part. Because the diff is really
hard to read.
Linus
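The top-N listing Linus asks for is easy to script. A minimal sketch (a hypothetical helper, not part of the lkp tooling), assuming the input is the `"perf-profile.func.cycles-pp.<symbol>": <percent>,` lines the robot emits:

```python
import re

# Matches lines in the format emitted by the lkp robot, e.g.
#   "perf-profile.func.cycles-pp.intel_idle": 16.88,
# Symbol names may contain dots (e.g. __block_commit_write.isra.24).
LINE_RE = re.compile(r'"perf-profile\.func\.cycles-pp\.(?P<sym>[^"]+)":\s*(?P<pct>[\d.]+)')

def top_functions(profile_text, n=20):
    """Return the n hottest functions, sorted by cycles percentage."""
    hits = [(m.group("sym"), float(m.group("pct")))
            for m in LINE_RE.finditer(profile_text)]
    return sorted(hits, key=lambda kv: -kv[1])[:n]

sample = '''
"perf-profile.func.cycles-pp.intel_idle": 16.88,
"perf-profile.func.cycles-pp.memset_erms": 3.26,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.94,
'''

for sym, pct in top_functions(sample, n=3):
    print(f"{sym:50s} {pct:6.2f}")
```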
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 0:23 ` Linus Torvalds
@ 2016-08-11 0:33 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-11 0:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: Huang, Ying, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP,
Christoph Hellwig
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, Aug 10, 2016 at 5:11 PM, Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Here is the comparison result with perf-profile data.
>
> Heh. The diff is actually harder to read than just showing A/B
> state. The fact that the call chain shows up as part of the symbol
> makes it even more so.
>
> For example:
>
>> 0.00 ± -1% +Inf% 1.68 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
>> 1.80 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
>
> Ok, so it went from 1.8% to 1.68%, and isn't actually that big of a
> change, but it shows up as a big change because the caller changed
> from xfs_vm_write_begin to iomap_write_begin.
>
> There's a few other cases of that too.
>
> So I think it would actually be easier to just see "what 20 functions
> were the hottest" (or maybe 50) before and after separately (just
> sorted by cycles), without the diff part. Because the diff is really
> hard to read.
Here it is,
Before:
"perf-profile.func.cycles-pp.intel_idle": 16.88,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.94,
"perf-profile.func.cycles-pp.memset_erms": 3.26,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 2.47,
"perf-profile.func.cycles-pp.___might_sleep": 2.33,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.88,
"perf-profile.func.cycles-pp.unlock_page": 1.69,
"perf-profile.func.cycles-pp.up_write": 1.61,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.56,
"perf-profile.func.cycles-pp.down_write": 1.55,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.53,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.47,
"perf-profile.func.cycles-pp.generic_write_end": 1.36,
"perf-profile.func.cycles-pp.generic_perform_write": 1.33,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.32,
"perf-profile.func.cycles-pp.__might_sleep": 1.26,
"perf-profile.func.cycles-pp._raw_spin_lock": 1.17,
"perf-profile.func.cycles-pp.vfs_write": 1.14,
"perf-profile.func.cycles-pp.__xfs_get_blocks": 1.07,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 1.03,
"perf-profile.func.cycles-pp.pagecache_get_page": 1.03,
"perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 0.98,
"perf-profile.func.cycles-pp.get_page_from_freelist": 0.94,
"perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.94,
"perf-profile.func.cycles-pp.__vfs_write": 0.87,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.87,
"perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.84,
"perf-profile.func.cycles-pp.find_get_entry": 0.79,
"perf-profile.func.cycles-pp._raw_spin_lock_irqsave": 0.78,
After:
"perf-profile.func.cycles-pp.intel_idle": 16.82,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.27,
"perf-profile.func.cycles-pp.memset_erms": 2.6,
"perf-profile.func.cycles-pp.xfs_bmapi_read": 2.24,
"perf-profile.func.cycles-pp.___might_sleep": 2.04,
"perf-profile.func.cycles-pp.mark_page_accessed": 1.93,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.78,
"perf-profile.func.cycles-pp.up_write": 1.72,
"perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.7,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 1.65,
"perf-profile.func.cycles-pp.down_write": 1.51,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.51,
"perf-profile.func.cycles-pp.unlock_page": 1.43,
"perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.25,
"perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.23,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.21,
"perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.19,
"perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.15,
"perf-profile.func.cycles-pp.iomap_write_actor": 1.14,
"perf-profile.func.cycles-pp.__might_sleep": 1.12,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.08,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.07,
"perf-profile.func.cycles-pp.pagecache_get_page": 0.95,
"perf-profile.func.cycles-pp._raw_spin_lock": 0.95,
"perf-profile.func.cycles-pp.xfs_bmapi_delay": 0.93,
"perf-profile.func.cycles-pp.vfs_write": 0.92,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.86,
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 0:33 ` Huang, Ying
@ 2016-08-11 1:00 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 1:00 UTC (permalink / raw)
To: Huang, Ying
Cc: Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
>
> Here it is,
Thanks.
Appended is a munged "after" list, with the "before" values in
parenthesis. It actually looks fairly similar.
The biggest difference is that we have "mark_page_accessed()" show up
after, and not before. There was also a lot of LRU noise in the
non-profile data. I wonder if that is the reason here: the old model
of using generic_perform_write/block_page_mkwrite didn't mark the
pages accessed, and now with iomap_file_buffered_write() they get
marked as active and that screws up the LRU list, and makes us not
flush out the dirty pages well (because they are seen as active and
not good for writeback), and then you get bad memory use.
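A toy model of the effect Linus is describing (purely illustrative, not kernel code): if the write path marks pages accessed, freshly dirtied pages get promoted to the active list, and a reclaim pass that only scans the inactive list no longer finds them.

```python
# Toy two-list LRU: pages marked accessed are promoted to the active
# list; a naive reclaim pass only considers the inactive list.
class ToyLRU:
    def __init__(self):
        self.active = []
        self.inactive = []

    def add_page(self, page):
        self.inactive.append(page)      # new pages start on the inactive list

    def mark_accessed(self, page):
        if page in self.inactive:       # promotion on access
            self.inactive.remove(page)
            self.active.append(page)

    def reclaimable(self):
        return list(self.inactive)      # naive: never scans the active list

lru = ToyLRU()
for p in ("a", "b", "c"):
    lru.add_page(p)

# Old write path: written pages stay inactive -> all are reclaim candidates.
assert lru.reclaimable() == ["a", "b", "c"]

# New path marks pages accessed on write -> nothing left on the inactive list.
for p in ("a", "b", "c"):
    lru.mark_accessed(p)
assert lru.reclaimable() == []
```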
I'm not seeing anything that looks like locking-related.
And I may well have screwed up that list munging. I should have
automated it more than I did.
Dave, Christoph?
Linus
---
intel_idle 16.82 (16.88)
copy_user_enhanced_fast_string 3.27 (3.94)
memset_erms 2.6 (3.26)
xfs_bmapi_read 2.24
___might_sleep 2.04 (2.33)
mark_page_accessed 1.93
__block_write_begin_int 1.78 (1.56)
up_write 1.72 (1.61)
xfs_iext_bno_to_ext 1.7
__block_commit_write.isra.24 1.65 (2.47)
down_write 1.51 (1.55)
__mark_inode_dirty 1.51 (1.88)
unlock_page 1.43 (1.69)
xfs_bmap_search_multi_extents 1.25
xfs_bmap_search_extents 1.23
mark_buffer_dirty 1.21 (1.53)
xfs_iomap_write_delay 1.19
xfs_iomap_eof_want_preallocate.constprop.8 1.15
iomap_write_actor 1.14
__might_sleep 1.12 (1.26)
__radix_tree_lookup 1.08 (1.32)
entry_SYSCALL_64_fastpath 1.07 (1.47)
pagecache_get_page 0.95 (1.03)
_raw_spin_lock 0.95 (1.17)
xfs_bmapi_delay 0.93
vfs_write 0.92 (1.14)
xfs_file_write_iter 0.86
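The munging described above (which Linus suspects he may have gotten wrong by hand) is simple to automate. A hedged sketch, assuming the two profiles are available as symbol-to-percent dicts:

```python
def munge(after, before):
    """Render the 'after' profile sorted by cycles, appending the
    'before' value in parentheses where the symbol appears in both."""
    lines = []
    for sym, pct in sorted(after.items(), key=lambda kv: -kv[1]):
        if sym in before:
            lines.append(f"{sym} {pct:g} ({before[sym]:g})")
        else:
            lines.append(f"{sym} {pct:g}")
    return lines

# A few entries from the profiles quoted earlier in the thread.
before = {"intel_idle": 16.88, "memset_erms": 3.26}
after = {"intel_idle": 16.82, "memset_erms": 2.6, "xfs_bmapi_read": 2.24}

for line in munge(after, before):
    print(line)
```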
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 0:33 ` Huang, Ying
@ 2016-08-11 1:16 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-11 1:16 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Wed, Aug 10, 2016 at 05:33:20PM -0700, Huang, Ying wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> > On Wed, Aug 10, 2016 at 5:11 PM, Huang, Ying <ying.huang@intel.com> wrote:
> >>
> >> Here is the comparison result with perf-profile data.
> >
> > Heh. The diff is actually harder to read than just showing A/B
> > state. The fact that the call chain shows up as part of the symbol
> > makes it even more so.
> >
> > For example:
> >
> >> 0.00 ± -1% +Inf% 1.68 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
> >> 1.80 ± 1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
> >
> > Ok, so it went from 1.8% to 1.68%, and isn't actually that big of a
> > change, but it shows up as a big change because the caller changed
> > from xfs_vm_write_begin to iomap_write_begin.
> >
> > There's a few other cases of that too.
> >
> > So I think it would actually be easier to just see "what 20 functions
> > were the hottest" (or maybe 50) before and after separately (just
> > sorted by cycles), without the diff part. Because the diff is really
> > hard to read.
>
> Here it is,
>
> Before:
>
> "perf-profile.func.cycles-pp.intel_idle": 16.88,
> "perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.94,
> "perf-profile.func.cycles-pp.memset_erms": 3.26,
> "perf-profile.func.cycles-pp.__block_commit_write.isra.24": 2.47,
> "perf-profile.func.cycles-pp.___might_sleep": 2.33,
> "perf-profile.func.cycles-pp.__mark_inode_dirty": 1.88,
> "perf-profile.func.cycles-pp.unlock_page": 1.69,
> "perf-profile.func.cycles-pp.up_write": 1.61,
> "perf-profile.func.cycles-pp.__block_write_begin_int": 1.56,
> "perf-profile.func.cycles-pp.down_write": 1.55,
> "perf-profile.func.cycles-pp.mark_buffer_dirty": 1.53,
> "perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.47,
> "perf-profile.func.cycles-pp.generic_write_end": 1.36,
> "perf-profile.func.cycles-pp.generic_perform_write": 1.33,
> "perf-profile.func.cycles-pp.__radix_tree_lookup": 1.32,
> "perf-profile.func.cycles-pp.__might_sleep": 1.26,
> "perf-profile.func.cycles-pp._raw_spin_lock": 1.17,
> "perf-profile.func.cycles-pp.vfs_write": 1.14,
> "perf-profile.func.cycles-pp.__xfs_get_blocks": 1.07,
Ok, so that is the old block mapping call in the buffered IO path.
I don't see any of the functions it calls in the profile;
specifically xfs_bmapi_read(), and xfs_iomap_write_delay(), so it
appears the extent mapping and allocation overhead on the old code
totals somewhere under 2-3% of the entire CPU usage.
> "perf-profile.func.cycles-pp.xfs_file_write_iter": 1.03,
> "perf-profile.func.cycles-pp.pagecache_get_page": 1.03,
> "perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 0.98,
> "perf-profile.func.cycles-pp.get_page_from_freelist": 0.94,
> "perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.94,
> "perf-profile.func.cycles-pp.__vfs_write": 0.87,
> "perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.87,
> "perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.84,
> "perf-profile.func.cycles-pp.find_get_entry": 0.79,
> "perf-profile.func.cycles-pp._raw_spin_lock_irqsave": 0.78,
>
>
> After:
>
> "perf-profile.func.cycles-pp.intel_idle": 16.82,
> "perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.27,
> "perf-profile.func.cycles-pp.memset_erms": 2.6,
> "perf-profile.func.cycles-pp.xfs_bmapi_read": 2.24,
Straight away - that's at least 3x more overhead in block mapping lookups
with the iomap code.
> "perf-profile.func.cycles-pp.___might_sleep": 2.04,
> "perf-profile.func.cycles-pp.mark_page_accessed": 1.93,
> "perf-profile.func.cycles-pp.__block_write_begin_int": 1.78,
> "perf-profile.func.cycles-pp.up_write": 1.72,
> "perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.7,
Plus this child.
> "perf-profile.func.cycles-pp.__block_commit_write.isra.24": 1.65,
> "perf-profile.func.cycles-pp.down_write": 1.51,
> "perf-profile.func.cycles-pp.__mark_inode_dirty": 1.51,
> "perf-profile.func.cycles-pp.unlock_page": 1.43,
> "perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.25,
> "perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.23,
And these two.
> "perf-profile.func.cycles-pp.mark_buffer_dirty": 1.21,
> "perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.19,
> "perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.15,
And these two.
So, essentially the old code had maybe 2-3% cpu usage overhead in
the block mapping path on this workload, but the new code is, for
some reason, showing at least 8-9% CPU usage overhead. That, right
now, makes no sense at all to me as we should be doing - at worst -
exactly the same number of block mapping calls as the old code.
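Summing the block-mapping-path entries visible in the "after" profile supports that estimate (figures copied from the profile quoted above; grouping them as "mapping path" is my reading, not the robot's):

```python
# Block-mapping-path functions in the "after" profile (cycles %).
after_mapping = {
    "xfs_bmapi_read": 2.24,
    "xfs_iext_bno_to_ext": 1.70,
    "xfs_bmap_search_multi_extents": 1.25,
    "xfs_bmap_search_extents": 1.23,
    "xfs_iomap_write_delay": 1.19,
    "xfs_iomap_eof_want_preallocate.constprop.8": 1.15,
    "xfs_bmapi_delay": 0.93,
}
total = sum(after_mapping.values())
# The old profile shows only __xfs_get_blocks (1.07%) from this path,
# with its children below the reporting threshold, hence "2-3%" before.
print(f"block mapping overhead after: {total:.2f}%")
```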
We need to know what is happening that is different - there's a good
chance the mapping trace events will tell us. Huang, can you get
a raw event trace from the test?
I need to see these events:
xfs_file*
xfs_iomap*
xfs_get_block*
For both kernels. An example trace from 4.8-rc1 running the command
`xfs_io -f -c 'pwrite 0 512k -b 128k' /mnt/scratch/fooey` doing an
overwrite and extend of the existing file ends up looking like:
$ sudo trace-cmd start -e xfs_iomap\* -e xfs_file\* -e xfs_get_blocks\*
$ sudo cat /sys/kernel/tracing/trace_pipe
<...>-2946 [001] .... 253971.750304: xfs_file_ioctl: dev 253:32 ino 0x84
xfs_io-2946 [001] .... 253971.750938: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 0x20000
xfs_io-2946 [001] .... 253971.750961: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
xfs_io-2946 [001] .... 253971.751114: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 0x20000
xfs_io-2946 [001] .... 253971.751128: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
xfs_io-2946 [001] .... 253971.751234: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 0x20000
xfs_io-2946 [001] .... 253971.751236: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
xfs_io-2946 [001] .... 253971.751381: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 0x20000
xfs_io-2946 [001] .... 253971.751415: xfs_iomap_prealloc_size: dev 253:32 ino 0x84 prealloc blocks 128 shift 0 m_writeio_blocks 16
xfs_io-2946 [001] .... 253971.751425: xfs_iomap_alloc: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 131072 type invalid startoff 0x60 startblock -1 blockcount 0x90
That's the output I need for the complete test - you'll need to use
a better recording mechanism than this (e.g. trace-cmd record,
trace-cmd report) because it will generate a lot of events. Compress
the two report files (they'll be large) and send them to me offlist.
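Once the full reports exist, a first-order comparison between the two kernels is just a per-event tally; a minimal sketch (the regex and helper below are illustrative only, not part of trace-cmd or lkp-tests):

```python
import re
from collections import Counter

# The event name sits between the timestamp and the payload, e.g.
# "... 253971.750961: xfs_iomap_found: dev 253:32 ..."
EVENT_RE = re.compile(r"\d+\.\d+:\s+(xfs_\w+):")

def count_events(lines):
    """Count occurrences of each xfs_* trace event in trace-cmd report output."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = """\
xfs_io-2946 [001] .... 253971.750938: xfs_file_buffered_write: dev 253:32 ino 0x84
xfs_io-2946 [001] .... 253971.750961: xfs_iomap_found: dev 253:32 ino 0x84
xfs_io-2946 [001] .... 253971.751425: xfs_iomap_alloc: dev 253:32 ino 0x84
""".splitlines()

print(count_events(sample))
```

Running the same tally over both report files would show immediately whether the iomap kernel is issuing more mapping calls, or the same number at a higher per-call cost.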
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 1:16 ` Dave Chinner
@ 2016-08-11 1:32 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-11 1:32 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> I need to see these events:
>
> xfs_file*
> xfs_iomap*
> xfs_get_block*
>
> For both kernels. An example trace from 4.8-rc1 running the command
> `xfs_io -f -c 'pwrite 0 512k -b 128k' /mnt/scratch/fooey` doing an
> overwrite and extend of the existing file ends up looking like:
>
> $ sudo trace-cmd start -e xfs_iomap\* -e xfs_file\* -e xfs_get_blocks\*
> $ sudo cat /sys/kernel/tracing/trace_pipe
> <...>-2946 [001] .... 253971.750304: xfs_file_ioctl: dev 253:32 ino 0x84
> xfs_io-2946 [001] .... 253971.750938: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 0x20000
> xfs_io-2946 [001] .... 253971.750961: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> xfs_io-2946 [001] .... 253971.751114: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 0x20000
> xfs_io-2946 [001] .... 253971.751128: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> xfs_io-2946 [001] .... 253971.751234: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 0x20000
> xfs_io-2946 [001] .... 253971.751236: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> xfs_io-2946 [001] .... 253971.751381: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 0x20000
> xfs_io-2946 [001] .... 253971.751415: xfs_iomap_prealloc_size: dev 253:32 ino 0x84 prealloc blocks 128 shift 0 m_writeio_blocks 16
> xfs_io-2946 [001] .... 253971.751425: xfs_iomap_alloc: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 131072 type invalid startoff 0x60 startblock -1 blockcount 0x90
>
> That's the output I need for the complete test - you'll need to use
> a better recording mechanism than this (e.g. trace-cmd record,
> trace-cmd report) because it will generate a lot of events. Compress
> the two report files (they'll be large) and send them to me offlist.
Can you also send me the output of xfs_info on the filesystem you
are testing?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 1:32 ` Dave Chinner
@ 2016-08-11 2:36 ` Ye Xiaolong
-1 siblings, 0 replies; 219+ messages in thread
From: Ye Xiaolong @ 2016-08-11 2:36 UTC (permalink / raw)
To: Dave Chinner
Cc: Huang, Ying, Linus Torvalds, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
On 08/11, Dave Chinner wrote:
>On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
>> I need to see these events:
>>
>> xfs_file*
>> xfs_iomap*
>> xfs_get_block*
>>
>> For both kernels. An example trace from 4.8-rc1 running the command
>> `xfs_io -f -c 'pwrite 0 512k -b 128k' /mnt/scratch/fooey` doing an
>> overwrite and extend of the existing file ends up looking like:
>>
>> $ sudo trace-cmd start -e xfs_iomap\* -e xfs_file\* -e xfs_get_blocks\*
>> $ sudo cat /sys/kernel/tracing/trace_pipe
>> <...>-2946 [001] .... 253971.750304: xfs_file_ioctl: dev 253:32 ino 0x84
>> xfs_io-2946 [001] .... 253971.750938: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 0x20000
>> xfs_io-2946 [001] .... 253971.750961: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
>> xfs_io-2946 [001] .... 253971.751114: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 0x20000
>> xfs_io-2946 [001] .... 253971.751128: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
>> xfs_io-2946 [001] .... 253971.751234: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 0x20000
>> xfs_io-2946 [001] .... 253971.751236: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
>> xfs_io-2946 [001] .... 253971.751381: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 0x20000
>> xfs_io-2946 [001] .... 253971.751415: xfs_iomap_prealloc_size: dev 253:32 ino 0x84 prealloc blocks 128 shift 0 m_writeio_blocks 16
>> xfs_io-2946 [001] .... 253971.751425: xfs_iomap_alloc: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 131072 type invalid startoff 0x60 startblock -1 blockcount 0x90
>>
>> That's the output I need for the complete test - you'll need to use
>> a better recording mechanism than this (e.g. trace-cmd record,
>> trace-cmd report) because it will generate a lot of events. Compress
>> the two report files (they'll be large) and send them to me offlist.
>
>Can you also send me the output of xfs_info on the filesystem you
>are testing?
Hi, Dave
Here is the xfs_info output:
# xfs_info /fs/ram0/
meta-data=/dev/ram0              isize=256    agcount=4, agsize=3145728 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=12582912, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6144, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
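As a sanity check on the geometry, the data section above multiplies out to exactly the 48 GiB ramdisk the test uses (1BRD_48G):

```python
# From the xfs_info "data" line above: blocks=12582912, bsize=4096.
blocks, bsize = 12582912, 4096
size_gib = blocks * bsize / 2**30
print(size_gib, "GiB")  # 48.0 GiB
```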
Thanks,
Xiaolong
>
>Cheers,
>
>Dave.
>--
>Dave Chinner
>david@fromorbit.com
>_______________________________________________
>LKP mailing list
>LKP@lists.01.org
>https://lists.01.org/mailman/listinfo/lkp
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 2:36 ` Ye Xiaolong
@ 2016-08-11 3:05 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-11 3:05 UTC (permalink / raw)
To: Ye Xiaolong
Cc: Huang, Ying, Linus Torvalds, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
On Thu, Aug 11, 2016 at 10:36:59AM +0800, Ye Xiaolong wrote:
> On 08/11, Dave Chinner wrote:
> >On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> >> I need to see these events:
> >>
> >> xfs_file*
> >> xfs_iomap*
> >> xfs_get_block*
> >>
> >> For both kernels. An example trace from 4.8-rc1 running the command
> >> `xfs_io -f -c 'pwrite 0 512k -b 128k' /mnt/scratch/fooey` doing an
> >> overwrite and extend of the existing file ends up looking like:
> >>
> >> $ sudo trace-cmd start -e xfs_iomap\* -e xfs_file\* -e xfs_get_blocks\*
> >> $ sudo cat /sys/kernel/tracing/trace_pipe
> >> <...>-2946 [001] .... 253971.750304: xfs_file_ioctl: dev 253:32 ino 0x84
> >> xfs_io-2946 [001] .... 253971.750938: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 0x20000
> >> xfs_io-2946 [001] .... 253971.750961: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x0 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> >> xfs_io-2946 [001] .... 253971.751114: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 0x20000
> >> xfs_io-2946 [001] .... 253971.751128: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x20000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> >> xfs_io-2946 [001] .... 253971.751234: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 0x20000
> >> xfs_io-2946 [001] .... 253971.751236: xfs_iomap_found: dev 253:32 ino 0x84 size 0x40000 offset 0x40000 count 131072 type invalid startoff 0x0 startblock 24 blockcount 0x60
> >> xfs_io-2946 [001] .... 253971.751381: xfs_file_buffered_write: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 0x20000
> >> xfs_io-2946 [001] .... 253971.751415: xfs_iomap_prealloc_size: dev 253:32 ino 0x84 prealloc blocks 128 shift 0 m_writeio_blocks 16
> >> xfs_io-2946 [001] .... 253971.751425: xfs_iomap_alloc: dev 253:32 ino 0x84 size 0x40000 offset 0x60000 count 131072 type invalid startoff 0x60 startblock -1 blockcount 0x90
> >>
> >> That's the output I need for the complete test - you'll need to use
> >> a better recording mechanism than this (e.g. trace-cmd record,
> >> trace-cmd report) because it will generate a lot of events. Compress
> >> the two report files (they'll be large) and send them to me offlist.
> >
> >Can you also send me the output of xfs_info on the filesystem you
> >are testing?
>
> Hi, Dave
>
> Here is the xfs_info output:
>
> # xfs_info /fs/ram0/
> meta-data=/dev/ram0              isize=256    agcount=4, agsize=3145728 blks
>          =                       sectsz=4096  attr=2, projid32bit=1
>          =                       crc=0        finobt=0
> data     =                       bsize=4096   blocks=12582912, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal               bsize=4096   blocks=6144, version=2
>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
OK, nothing unusual there. One thing that I did just think of - how
close to ENOSPC does this test get? i.e. are we hitting the "we're
almost out of free space" slow paths on this test?
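One way to answer that from the harness side would be to sample free space on the test mount during the run; a hedged sketch (`free_space_fraction` is a hypothetical helper, and `/` below stands in for the real aim7 mount point):

```python
import os

def free_space_fraction(path):
    """Fraction of filesystem data blocks still available to ordinary writers."""
    st = os.statvfs(path)
    return st.f_bavail / st.f_blocks

# Sampling this between aim7 iterations and logging the minimum would show
# whether the workload drives the filesystem into the low-space slow paths.
print(f"{free_space_fraction('/'):.1%}")
```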
Cheers,
dave.
>
> Thanks,
> Xiaolong
> >
> >Cheers,
> >
> >Dave.
> >--
> >Dave Chinner
> >david@fromorbit.com
> >_______________________________________________
> >LKP mailing list
> >LKP@lists.01.org
> >https://lists.01.org/mailman/listinfo/lkp
>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 1:00 ` Linus Torvalds
@ 2016-08-11 4:46 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-11 4:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
> >
> > Here it is,
>
> Thanks.
>
> Appended is a munged "after" list, with the "before" values in
> parenthesis. It actually looks fairly similar.
>
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform_write/block_page_mkwrite didn't mark the
> pages accessed, and now with iomap_file_buffered_write() they get
> marked as active and that screws up the LRU list, and makes us not
> flush out the dirty pages well (because they are seen as active and
> not good for writeback), and then you get bad memory use.
>
> I'm not seeing anything that looks like locking-related.
Not in that profile. I've been doing some local testing inside a
4-node fake-numa 16p/16GB RAM VM to see what I can find.
I'm yet to work out how I can trigger a profile like the one that
was reported (I really need to see the event traces), but in the
mean time I found this....
Doing a large sequential single threaded buffered write using a 4k
buffer (so single page per syscall to make the XFS IO path allocator
behave the same way as in 4.7), I'm seeing a CPU profile that
indicates we have a potential mapping->tree_lock issue:
# xfs_io -f -c "truncate 0" -c "pwrite 0 47g" /mnt/scratch/fooey
wrote 50465865728/50465865728 bytes at offset 0
47.000 GiB, 12320768 ops; 0:01:36.00 (499.418 MiB/sec and 127850.9132 ops/sec)
....
24.15% [kernel] [k] _raw_spin_unlock_irqrestore
9.67% [kernel] [k] copy_user_generic_string
5.64% [kernel] [k] _raw_spin_unlock_irq
3.34% [kernel] [k] get_page_from_freelist
2.57% [kernel] [k] mark_page_accessed
2.45% [kernel] [k] do_raw_spin_lock
1.83% [kernel] [k] shrink_page_list
1.70% [kernel] [k] free_hot_cold_page
1.26% [kernel] [k] xfs_do_writepage
1.21% [kernel] [k] __radix_tree_lookup
1.20% [kernel] [k] __wake_up_bit
0.99% [kernel] [k] __block_write_begin_int
0.95% [kernel] [k] find_get_pages_tag
0.92% [kernel] [k] cancel_dirty_page
0.89% [kernel] [k] unlock_page
0.87% [kernel] [k] clear_page_dirty_for_io
0.85% [kernel] [k] xfs_bmap_worst_indlen
0.84% [kernel] [k] xfs_file_buffered_aio_write
0.81% [kernel] [k] delay_tsc
0.78% [kernel] [k] node_dirty_ok
0.77% [kernel] [k] up_write
0.74% [kernel] [k] ___might_sleep
0.73% [kernel] [k] xfs_bmap_add_extent_hole_delay
0.72% [kernel] [k] __fget_light
0.67% [kernel] [k] add_to_page_cache_lru
0.67% [kernel] [k] __slab_free
0.63% [kernel] [k] drop_buffers
0.59% [kernel] [k] down_write
0.59% [kernel] [k] kmem_cache_alloc
0.58% [kernel] [k] iomap_write_actor
0.53% [kernel] [k] page_mapping
0.52% [kernel] [k] entry_SYSCALL_64_fastpath
0.52% [kernel] [k] __mark_inode_dirty
0.51% [kernel] [k] __block_commit_write.isra.30
0.51% [kernel] [k] xfs_file_write_iter
0.49% [kernel] [k] mark_buffer_async_write
0.47% [kernel] [k] balance_dirty_pages_ratelimited
0.47% [kernel] [k] xfs_count_page_state
0.47% [kernel] [k] page_evictable
0.46% [kernel] [k] xfs_vm_releasepage
0.46% [kernel] [k] xfs_iomap_write_delay
0.46% [kernel] [k] do_raw_spin_unlock
0.44% [kernel] [k] xfs_file_iomap_begin
0.44% [kernel] [k] xfs_map_at_offset
0.42% [kernel] [k] xfs_iext_bno_to_ext
There's very little XFS showing up near the top of the profile;
it's all page cache, writeback and spinlock traffic. This is a
dead giveaway as to which lock is being contended:
- 33.30% 0.01% [kernel] [k] kswapd
- 4.67% kswapd
- 4.69% shrink_node
- 4.77% shrink_node_memcg.isra.75
- 7.38% shrink_inactive_list
- 6.70% shrink_page_list
- 20.02% __remove_mapping
19.90% _raw_spin_unlock_irqrestore
I don't think this is the same thing aim7 is triggering, as there
are no XFS write() path allocation functions near the top of the
profile to speak of. Still, I don't recall seeing this before...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 1:00 ` Linus Torvalds
@ 2016-08-11 15:57 ` Christoph Hellwig
From: Christoph Hellwig @ 2016-08-11 15:57 UTC (permalink / raw)
To: Linus Torvalds
Cc: Huang, Ying, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP,
Christoph Hellwig
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform_write/block_page_mkwrite didn't mark the
> pages accessed, and now with iomap_file_buffered_write() they get
> marked as active and that screws up the LRU list, and makes us not
> flush out the dirty pages well (because they are seen as active and
> not good for writeback), and then you get bad memory use.
And that's actually a "bug" in the new code - mostly because I failed
to pick up changes made to the core code after we 'forked' it,
in this case commit 2457ae ("mm: non-atomically mark page accessed during page
cache allocation where possible").
The one-liner below (not tested yet) that simply removes it should fix that
up. I also noticed we have a spurious pagefault_disable/enable; I
need to dig into the history of that first, though.
diff --git a/fs/iomap.c b/fs/iomap.c
index 48141b8..f39c318 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -199,7 +199,6 @@ again:
 		pagefault_enable();
 		flush_dcache_page(page);
-		mark_page_accessed(page);
 		status = iomap_write_end(inode, pos, bytes, copied, page);
 		if (unlikely(status < 0))
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 15:57 ` Christoph Hellwig
@ 2016-08-11 16:55 ` Linus Torvalds
From: Linus Torvalds @ 2016-08-11 16:55 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Huang, Ying, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> The one liner below (not tested yet) to simply remove it should fix that
> up. I also noticed we have a spurious pagefault_disable/enable, I
> need to dig into the history of that first, though.
Hopefully the pagefault_disable/enable doesn't matter for this case.
Can we get this one-liner tested with the kernel robot for comparison?
I really think a messed-up LRU list could cause bad IO patterns, and
end up keeping dirty pages around that should be streaming out to disk
and re-used, so causing memory pressure etc for no good reason.
I think the mapping->tree_lock issue that Dave sees is interesting
too, but the kswapd activity (and the extra locking it causes) could
also be a symptom of the same thing - memory pressure due to just
putting pages in the active file that simply shouldn't be there.
So the trivial oneliner _might_ just explain things. It would be
really nice if the regression turns out to be due to something so
easily fixed.
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 16:55 ` Linus Torvalds
@ 2016-08-11 17:51 ` Huang, Ying
From: Huang, Ying @ 2016-08-11 17:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig <hch@lst.de> wrote:
>>
>> The one liner below (not tested yet) to simply remove it should fix that
>> up. I also noticed we have a spurious pagefault_disable/enable, I
>> need to dig into the history of that first, though.
>
> Hopefully the pagefault_disable/enable doesn't matter for this case.
>
> Can we get this one-liner tested with the kernel robot for comparison?
> I really think a messed-up LRU list could cause bad IO patterns, and
> end up keeping dirty pages around that should be streaming out to disk
> and re-used, so causing memory pressure etc for no good reason.
>
> I think the mapping->tree_lock issue that Dave sees is interesting
> too, but the kswapd activity (and the extra locking it causes) could
> also be a symptom of the same thing - memory pressure due to just
> putting pages in the active file that simply shouldn't be there.
>
> So the trivial oneliner _might_ just explain things. It would be
> really nice if the regression turns out to be due to something so
> easily fixed.
>
Here is the test result for the debug patch. It appears that the aim7
score is a little better, but the regression is not fully recovered.
commit 5c70fdfdf82723e47ac180e36a7638ca06ea19d8
Author: Christoph Hellwig <hch@lst.de>
Date: Thu Aug 11 17:57:21 2016 +0200
dbg fix 68a9f5e700: aim7.jobs-per-min -13.6% regression
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform_write/block_page_mkwrite didn't mark the
> pages accessed, and now with iomap_file_buffered_write() they get
> marked as active and that screws up the LRU list, and makes us not
> flush out the dirty pages well (because they are seen as active and
> not good for writeback), and then you get bad memory use.
And that's actually a "bug" in the new code - mostly because I failed
to pick up changes to the core code happening after we 'forked' it,
in this case commit 2457ae ("mm: non-atomically mark page accessed during page
cache allocation where possible").
The one liner below (not tested yet) to simply remove it should fix that
up. I also noticed we have a spurious pagefault_disable/enable, I
need to dig into the history of that first, though.
diff --git a/fs/iomap.c b/fs/iomap.c
index 48141b8..f39c318 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -199,7 +199,6 @@ again:
 		pagefault_enable();
 		flush_dcache_page(page);
-		mark_page_accessed(page);
 		status = iomap_write_end(inode, pos, bytes, copied, page);
 		if (unlikely(status < 0))
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
68a9f5e7007c1afa2cf6830b690a90d0187c0684
5c70fdfdf82723e47ac180e36a7638ca06ea19d8
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 5c70fdfdf82723e47ac180e36a
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
486586 ± 0% -13.6% 420523 ± 0% -11.2% 432165 ± 0% aim7.jobs-per-min
37.23 ± 0% +15.5% 43.02 ± 0% +12.4% 41.87 ± 0% aim7.time.elapsed_time
37.23 ± 0% +15.5% 43.02 ± 0% +12.4% 41.87 ± 0% aim7.time.elapsed_time.max
6424 ± 1% +31.3% 8437 ± 1% +24.4% 7992 ± 3% aim7.time.involuntary_context_switches
2489 ± 3% -2.5% 2425 ± 1% -1.0% 2465 ± 3% aim7.time.maximum_resident_set_size
151288 ± 0% +3.0% 155764 ± 0% +2.0% 154252 ± 1% aim7.time.minor_page_faults
376.31 ± 0% +28.4% 483.23 ± 0% +22.5% 460.96 ± 1% aim7.time.system_time
429058 ± 0% -20.1% 343013 ± 0% -16.7% 357231 ± 0% aim7.time.voluntary_context_switches
16014 ± 0% +28.3% 20545 ± 1% -4.0% 15369 ± 0% meminfo.Active(file)
127154 ± 9% -15.5% 107424 ± 10% -4.4% 121505 ± 11% softirqs.SCHED
24561 ± 0% -27.1% 17895 ± 1% -22.5% 19033 ± 1% vmstat.system.cs
47289 ± 0% +1.4% 47938 ± 0% +1.1% 47807 ± 0% vmstat.system.in
4003 ± 0% +27.8% 5117 ± 1% -4.0% 3842 ± 0% proc-vmstat.nr_active_file
979.25 ± 0% +59.1% 1558 ± 5% -11.8% 864.00 ± 1% proc-vmstat.pgactivate
4699 ± 3% +132.6% 10932 ± 78% +77.1% 8321 ± 36% proc-vmstat.pgpgout
7868 ± 1% +28.9% 10140 ± 6% -1.8% 7724 ± 1% numa-meminfo.node0.Active(file)
161402 ± 1% -21.0% 127504 ± 25% -1.7% 158708 ± 5% numa-meminfo.node0.Slab
8148 ± 1% +27.8% 10416 ± 7% -6.1% 7648 ± 1% numa-meminfo.node1.Active(file)
81041 ± 3% +43.8% 116540 ± 27% +5.7% 85631 ± 8% numa-meminfo.node1.Slab
13903 ± 18% +22.8% 17076 ± 9% -27.8% 10039 ± 12% numa-numastat.node0.numa_foreign
12525 ± 20% +27.1% 15922 ± 9% -26.8% 9170 ± 17% numa-numastat.node0.numa_miss
14084 ± 18% -35.4% 9102 ± 17% +16.5% 16401 ± 8% numa-numastat.node1.numa_foreign
15461 ± 17% -33.7% 10257 ± 14% +11.7% 17270 ± 9% numa-numastat.node1.numa_miss
1966 ± 1% +30.4% 2565 ± 4% -1.8% 1930 ± 1% numa-vmstat.node0.nr_active_file
4204 ± 3% +17.7% 4947 ± 7% +6.1% 4461 ± 6% numa-vmstat.node0.nr_alloc_batch
521.75 ± 4% +5.4% 549.80 ± 4% -12.0% 459.00 ± 12% numa-vmstat.node0.nr_dirty
2037 ± 1% +24.9% 2543 ± 5% -6.1% 1912 ± 1% numa-vmstat.node1.nr_active_file
24.26 ± 0% +8.7% 26.36 ± 0% +7.1% 25.99 ± 0% turbostat.%Busy
686.75 ± 0% +9.2% 749.80 ± 0% +6.5% 731.67 ± 0% turbostat.Avg_MHz
0.29 ± 1% -24.9% 0.22 ± 2% -18.8% 0.23 ± 2% turbostat.CPU%c3
91.39 ± 2% +3.5% 94.60 ± 0% +4.4% 95.45 ± 1% turbostat.CorWatt
121.88 ± 1% +2.6% 124.99 ± 0% +3.9% 126.57 ± 1% turbostat.PkgWatt
37.23 ± 0% +15.5% 43.02 ± 0% +12.4% 41.87 ± 0% time.elapsed_time
37.23 ± 0% +15.5% 43.02 ± 0% +12.4% 41.87 ± 0% time.elapsed_time.max
6424 ± 1% +31.3% 8437 ± 1% +24.4% 7992 ± 3% time.involuntary_context_switches
1038 ± 0% +10.5% 1148 ± 0% +8.4% 1126 ± 0% time.percent_of_cpu_this_job_got
376.31 ± 0% +28.4% 483.23 ± 0% +22.5% 460.96 ± 1% time.system_time
429058 ± 0% -20.1% 343013 ± 0% -16.7% 357231 ± 0% time.voluntary_context_switches
53643508 ± 0% -19.5% 43181295 ± 2% -16.4% 44822518 ± 0% cpuidle.C1-IVT.time
318952 ± 0% -25.7% 236993 ± 0% -21.3% 251084 ± 0% cpuidle.C1-IVT.usage
3471235 ± 2% -18.1% 2843694 ± 3% -12.9% 3022770 ± 1% cpuidle.C1E-IVT.time
46642 ± 1% -22.5% 36144 ± 0% -17.4% 38545 ± 0% cpuidle.C1E-IVT.usage
12601665 ± 1% -21.9% 9837996 ± 1% -18.9% 10218444 ± 0% cpuidle.C3-IVT.time
79872 ± 1% -19.7% 64163 ± 0% -16.8% 66477 ± 0% cpuidle.C3-IVT.usage
1.292e+09 ± 0% +13.7% 1.469e+09 ± 0% +10.9% 1.434e+09 ± 0% cpuidle.C6-IVT.time
5131 ±121% -98.5% 75.60 ±200% -93.3% 344.33 ± 98% latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
2966 ± 80% -27.5% 2151 ± 16% +359.2% 13624 ± 77% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
5131 ±121% -98.5% 75.60 ±200% -93.3% 344.33 ± 98% latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
9739 ± 99% -99.0% 95.40 ± 9% -99.1% 91.33 ± 5% latency_stats.max.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
7739 ± 81% -71.5% 2202 ± 46% -77.4% 1752 ± 23% latency_stats.max.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
5131 ±121% -98.5% 75.60 ±200% -93.3% 344.33 ± 98% latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
10459 ± 97% -97.5% 263.80 ± 5% -97.6% 254.67 ± 1% latency_stats.sum.submit_bio_wait.blkdev_issue_flush.ext4_sync_fs.sync_fs_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
9097 ± 81% -70.2% 2708 ± 40% -74.3% 2335 ± 21% latency_stats.sum.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
229.50 ± 11% -6.7% 214.20 ± 46% -77.8% 51.00 ± 0% slabinfo.dio.active_objs
229.50 ± 11% -6.7% 214.20 ± 46% -77.8% 51.00 ± 0% slabinfo.dio.num_objs
369.75 ± 17% -6.2% 346.80 ± 11% -21.8% 289.00 ± 16% slabinfo.kmem_cache.active_objs
369.75 ± 17% -6.2% 346.80 ± 11% -21.8% 289.00 ± 16% slabinfo.kmem_cache.num_objs
1022 ± 5% -2.9% 992.80 ± 11% -11.9% 900.33 ± 3% slabinfo.nsproxy.active_objs
1022 ± 5% -2.9% 992.80 ± 11% -11.9% 900.33 ± 3% slabinfo.nsproxy.num_objs
1836 ± 10% +60.6% 2949 ± 9% +45.1% 2665 ± 6% slabinfo.scsi_data_buffer.active_objs
1836 ± 10% +60.6% 2949 ± 9% +45.1% 2665 ± 6% slabinfo.scsi_data_buffer.num_objs
431.75 ± 10% +60.7% 693.80 ± 9% +45.1% 626.67 ± 6% slabinfo.xfs_efd_item.active_objs
431.75 ± 10% +60.7% 693.80 ± 9% +45.1% 626.67 ± 6% slabinfo.xfs_efd_item.num_objs
2.59e+11 ± 6% +22.4% 3.169e+11 ± 5% +22.9% 3.182e+11 ± 7% perf-stat.branch-instructions
0.41 ± 2% -9.2% 0.38 ± 1% -25.5% 0.31 ± 3% perf-stat.branch-miss-rate
1.072e+09 ± 4% +11.3% 1.193e+09 ± 3% -8.5% 9.812e+08 ± 4% perf-stat.branch-misses
972882 ± 0% -17.4% 803235 ± 0% -14.3% 833753 ± 0% perf-stat.context-switches
1.472e+12 ± 6% +20.5% 1.774e+12 ± 5% +16.7% 1.717e+12 ± 7% perf-stat.cpu-cycles
100350 ± 1% -5.2% 95091 ± 1% -3.8% 96553 ± 0% perf-stat.cpu-migrations
7.315e+08 ± 24% +54.9% 1.133e+09 ± 35% +107.1% 1.515e+09 ± 64% perf-stat.dTLB-load-misses
3.225e+11 ± 5% +38.1% 4.454e+11 ± 3% +32.2% 4.263e+11 ± 6% perf-stat.dTLB-loads
2.176e+11 ± 9% +45.2% 3.16e+11 ± 5% +50.1% 3.267e+11 ± 7% perf-stat.dTLB-stores
1.452e+12 ± 6% +27.7% 1.853e+12 ± 5% +28.0% 1.858e+12 ± 7% perf-stat.instructions
42168 ± 16% +24.9% 52676 ± 7% +33.4% 56234 ± 6% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.9% 1.05 ± 0% +9.6% 1.08 ± 0% perf-stat.ipc
252401 ± 0% +6.7% 269219 ± 0% +5.2% 265524 ± 0% perf-stat.minor-faults
10.16 ± 3% +14.5% 11.63 ± 4% +12.3% 11.41 ± 3% perf-stat.node-store-miss-rate
24842185 ± 2% +11.0% 27573897 ± 2% +7.6% 26720271 ± 3% perf-stat.node-store-misses
2.198e+08 ± 2% -4.5% 2.1e+08 ± 5% -5.6% 2.076e+08 ± 2% perf-stat.node-stores
252321 ± 0% +6.6% 269092 ± 0% +5.2% 265415 ± 0% perf-stat.page-faults
4.08 ± 64% +63.1% 6.65 ± 77% +609.5% 28.93 ± 92% sched_debug.cfs_rq:/.MIN_vruntime.avg
157.17 ± 61% +30.0% 204.25 ± 47% +686.9% 1236 ±112% sched_debug.cfs_rq:/.MIN_vruntime.max
4.08 ± 64% +63.1% 6.65 ± 77% +609.5% 28.93 ± 92% sched_debug.cfs_rq:/.max_vruntime.avg
157.17 ± 61% +30.0% 204.25 ± 47% +686.9% 1236 ±112% sched_debug.cfs_rq:/.max_vruntime.max
191.00 ± 35% -23.4% 146.40 ± 15% -28.4% 136.67 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.max
44.43 ± 16% -20.2% 35.47 ± 9% -25.2% 33.25 ± 1% sched_debug.cfs_rq:/.runnable_load_avg.stddev
50.23 ± 19% -27.1% 36.61 ± 15% -30.3% 35.03 ± 15% sched_debug.cpu.cpu_load[1].avg
466.50 ± 29% -55.6% 207.20 ± 73% -47.5% 245.00 ± 67% sched_debug.cpu.cpu_load[1].max
77.78 ± 33% -52.4% 37.05 ± 53% -48.8% 39.86 ± 51% sched_debug.cpu.cpu_load[1].stddev
38.82 ± 19% -24.2% 29.42 ± 15% -29.6% 27.31 ± 10% sched_debug.cpu.cpu_load[2].avg
300.50 ± 33% -55.1% 135.00 ± 46% -50.2% 149.67 ± 46% sched_debug.cpu.cpu_load[2].max
51.71 ± 41% -49.3% 26.20 ± 46% -50.9% 25.39 ± 33% sched_debug.cpu.cpu_load[2].stddev
27.08 ± 19% -22.4% 21.02 ± 15% -27.8% 19.56 ± 11% sched_debug.cpu.cpu_load[3].avg
185.00 ± 43% -49.6% 93.20 ± 44% -48.3% 95.67 ± 30% sched_debug.cpu.cpu_load[3].max
32.91 ± 50% -46.1% 17.75 ± 45% -49.8% 16.51 ± 25% sched_debug.cpu.cpu_load[3].stddev
17.78 ± 20% -20.9% 14.06 ± 13% -25.3% 13.28 ± 13% sched_debug.cpu.cpu_load[4].avg
20.29 ± 55% -44.5% 11.26 ± 39% -47.7% 10.61 ± 27% sched_debug.cpu.cpu_load[4].stddev
4929 ± 18% -24.8% 3704 ± 23% -4.5% 4708 ± 21% sched_debug.cpu.nr_load_updates.avg
276.50 ± 10% -4.4% 264.20 ± 7% -14.3% 237.00 ± 19% sched_debug.cpu.nr_switches.min
Best Regards,
Huang, Ying
1.072e+09 ± 4% +11.3% 1.193e+09 ± 3% -8.5% 9.812e+08 ± 4% perf-stat.branch-misses
972882 ± 0% -17.4% 803235 ± 0% -14.3% 833753 ± 0% perf-stat.context-switches
1.472e+12 ± 6% +20.5% 1.774e+12 ± 5% +16.7% 1.717e+12 ± 7% perf-stat.cpu-cycles
100350 ± 1% -5.2% 95091 ± 1% -3.8% 96553 ± 0% perf-stat.cpu-migrations
7.315e+08 ± 24% +54.9% 1.133e+09 ± 35% +107.1% 1.515e+09 ± 64% perf-stat.dTLB-load-misses
3.225e+11 ± 5% +38.1% 4.454e+11 ± 3% +32.2% 4.263e+11 ± 6% perf-stat.dTLB-loads
2.176e+11 ± 9% +45.2% 3.16e+11 ± 5% +50.1% 3.267e+11 ± 7% perf-stat.dTLB-stores
1.452e+12 ± 6% +27.7% 1.853e+12 ± 5% +28.0% 1.858e+12 ± 7% perf-stat.instructions
42168 ± 16% +24.9% 52676 ± 7% +33.4% 56234 ± 6% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.9% 1.05 ± 0% +9.6% 1.08 ± 0% perf-stat.ipc
252401 ± 0% +6.7% 269219 ± 0% +5.2% 265524 ± 0% perf-stat.minor-faults
10.16 ± 3% +14.5% 11.63 ± 4% +12.3% 11.41 ± 3% perf-stat.node-store-miss-rate
24842185 ± 2% +11.0% 27573897 ± 2% +7.6% 26720271 ± 3% perf-stat.node-store-misses
2.198e+08 ± 2% -4.5% 2.1e+08 ± 5% -5.6% 2.076e+08 ± 2% perf-stat.node-stores
252321 ± 0% +6.6% 269092 ± 0% +5.2% 265415 ± 0% perf-stat.page-faults
4.08 ± 64% +63.1% 6.65 ± 77% +609.5% 28.93 ± 92% sched_debug.cfs_rq:/.MIN_vruntime.avg
157.17 ± 61% +30.0% 204.25 ± 47% +686.9% 1236 ±112% sched_debug.cfs_rq:/.MIN_vruntime.max
4.08 ± 64% +63.1% 6.65 ± 77% +609.5% 28.93 ± 92% sched_debug.cfs_rq:/.max_vruntime.avg
157.17 ± 61% +30.0% 204.25 ± 47% +686.9% 1236 ±112% sched_debug.cfs_rq:/.max_vruntime.max
191.00 ± 35% -23.4% 146.40 ± 15% -28.4% 136.67 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.max
44.43 ± 16% -20.2% 35.47 ± 9% -25.2% 33.25 ± 1% sched_debug.cfs_rq:/.runnable_load_avg.stddev
50.23 ± 19% -27.1% 36.61 ± 15% -30.3% 35.03 ± 15% sched_debug.cpu.cpu_load[1].avg
466.50 ± 29% -55.6% 207.20 ± 73% -47.5% 245.00 ± 67% sched_debug.cpu.cpu_load[1].max
77.78 ± 33% -52.4% 37.05 ± 53% -48.8% 39.86 ± 51% sched_debug.cpu.cpu_load[1].stddev
38.82 ± 19% -24.2% 29.42 ± 15% -29.6% 27.31 ± 10% sched_debug.cpu.cpu_load[2].avg
300.50 ± 33% -55.1% 135.00 ± 46% -50.2% 149.67 ± 46% sched_debug.cpu.cpu_load[2].max
51.71 ± 41% -49.3% 26.20 ± 46% -50.9% 25.39 ± 33% sched_debug.cpu.cpu_load[2].stddev
27.08 ± 19% -22.4% 21.02 ± 15% -27.8% 19.56 ± 11% sched_debug.cpu.cpu_load[3].avg
185.00 ± 43% -49.6% 93.20 ± 44% -48.3% 95.67 ± 30% sched_debug.cpu.cpu_load[3].max
32.91 ± 50% -46.1% 17.75 ± 45% -49.8% 16.51 ± 25% sched_debug.cpu.cpu_load[3].stddev
17.78 ± 20% -20.9% 14.06 ± 13% -25.3% 13.28 ± 13% sched_debug.cpu.cpu_load[4].avg
20.29 ± 55% -44.5% 11.26 ± 39% -47.7% 10.61 ± 27% sched_debug.cpu.cpu_load[4].stddev
4929 ± 18% -24.8% 3704 ± 23% -4.5% 4708 ± 21% sched_debug.cpu.nr_load_updates.avg
276.50 ± 10% -4.4% 264.20 ± 7% -14.3% 237.00 ± 19% sched_debug.cpu.nr_switches.min
Best Regards,
Huang, Ying
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 17:51 ` Huang, Ying
@ 2016-08-11 19:51 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 19:51 UTC (permalink / raw)
To: Huang, Ying
Cc: Christoph Hellwig, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 10:51 AM, Huang, Ying <ying.huang@intel.com> wrote:
>>
>
> Here is the test result for the debug patch. It appears that the aim7
> score is a little better, but the regression is not recovered.
Ok. It does seem to also reset the active file page counts back, so
that part did seem to be related, but yeah, from a performance
standpoint that was clearly not a major issue.
Let's hope Dave can figure out something based on his numbers, because
I'm out of ideas. Or maybe it's the pagefault-atomic thing that
Christoph was looking at.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 19:51 ` Linus Torvalds
@ 2016-08-11 20:00 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-11 20:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Huang, Ying, Christoph Hellwig, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 12:51:31PM -0700, Linus Torvalds wrote:
> Ok. It does seem to also reset the active file page counts back, so
> that part did seem to be related, but yeah, from a performance
> standpoint that was clearly not a major issue.
>
> Let's hope Dave can figure out something based on his numbers, because
> I'm out of ideas. Or maybe it's the pagefault-atomic thing that
> Christoph was looking at.
I can't really think of any reason why the pagefault_disable() would
significantly change performance. Anyway, the patch for that is below
(on top of the previous mark_page_accessed() one), so feel free to
re-run the test with it. It would also be nice to see the profiles
with the two patches applied.
commit 43106eea246074acc4bb7d12fdb91f58002f52ed
Author: Christoph Hellwig <hch@lst.de>
Date: Thu Aug 11 10:41:40 2016 -0700
fs: remove superfluous pagefault_disable from iomap_write_actor
No idea where this really came from; generic_perform_write only briefly
did a pagefault_disable() when trying a different prefault scheme.
Signed-off-by: Christoph Hellwig <hch@lst.de>
diff --git a/fs/iomap.c b/fs/iomap.c
index f39c318..74712e2 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -194,9 +194,7 @@ again:
if (mapping_writably_mapped(inode->i_mapping))
flush_dcache_page(page);
- pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
- pagefault_enable();
flush_dcache_page(page);
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 20:00 ` Christoph Hellwig
@ 2016-08-11 20:35 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 20:35 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Huang, Ying, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 1:00 PM, Christoph Hellwig <hch@lst.de> wrote:
>
> I can't really think of any reason why the pagefault_disable() would
> significantly change performance.
No, you're right, we prefault the page anyway.
And quite frankly, looking at it, I think the pagefault_disable/enable
is actually *correct*.
The thing is, iov_iter_copy_from_user_atomic() doesn't itself enforce
non-blocking user accesses, it depends on the caller blocking page
faults.
And the reason we want to block page faults is to make sure we don't
deadlock on the page we just locked for writing.
So looking at it, I think the pagefault_disable/enable is actually
needed, and it may in fact be a bug that mm/filemap.c doesn't do that.
However, see commit 00a3d660cbac ("Revert "fs: do not prefault
sys_write() user buffer pages"") about why mm/filemap.c doesn't do the
pagefault_disable().
But depending on the prefaulting actually guaranteeing that we don't
deadlock sounds like a nasty race in theory (ie somebody does mmap
tricks in another thread in between the pre-faulting and the final
copying).
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 20:00 ` Christoph Hellwig
@ 2016-08-11 21:16 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-11 21:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Linus Torvalds, Huang, Ying, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
Christoph Hellwig <hch@lst.de> writes:
> On Thu, Aug 11, 2016 at 12:51:31PM -0700, Linus Torvalds wrote:
>> Ok. It does seem to also reset the active file page counts back, so
>> that part did seem to be related, but yeah, from a performance
>> standpoint that was clearly not a major issue.
>>
>> Let's hope Dave can figure out something based on his numbers, because
>> I'm out of ideas. Or maybe it's the pagefault-atomic thing that
>> Christoph was looking at.
>
> I can't really think of any reason why the pagefault_disable() would
> significantly change performance. Anyway, the patch for that is below
> (on top of the previous mark_page_accessed() one), so feel free to
> re-run the test with it. It would also be nice to see the profiles
> with the two patches applied.
>
> commit 43106eea246074acc4bb7d12fdb91f58002f52ed
> Author: Christoph Hellwig <hch@lst.de>
> Date: Thu Aug 11 10:41:40 2016 -0700
>
> fs: remove superfluous pagefault_disable from iomap_write_actor
>
> No idea where this really came from; generic_perform_write only briefly
> did a pagefault_disable() when trying a different prefault scheme.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> diff --git a/fs/iomap.c b/fs/iomap.c
> index f39c318..74712e2 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -194,9 +194,7 @@ again:
> if (mapping_writably_mapped(inode->i_mapping))
> flush_dcache_page(page);
>
> - pagefault_disable();
> copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> - pagefault_enable();
>
> flush_dcache_page(page);
>
Test results are as follows,
commit e129b86bfacc8bb517b843fca41d0d179de7a4ca
Author: Christoph Hellwig <hch@lst.de>
Date: Thu Aug 11 10:41:40 2016 -0700
fs: remove superfluous pagefault_disable from iomap_write_actor
No idea where this really came from; generic_perform_write only briefly
did a pagefault_disable() when trying a different prefault scheme.
Signed-off-by: Christoph Hellwig <hch@lst.de>
diff --git a/fs/iomap.c b/fs/iomap.c
index f39c318..74712e2 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -194,9 +194,7 @@ again:
if (mapping_writably_mapped(inode->i_mapping))
flush_dcache_page(page);
- pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
- pagefault_enable();
flush_dcache_page(page);
=========================================================================================
compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
68a9f5e7007c1afa2cf6830b690a90d0187c0684
e129b86bfacc8bb517b843fca41d0d179de7a4ca
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 e129b86bfacc8bb517b843fca4
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
484435 ± 0% -13.3% 420004 ± 0% -11.6% 428323 ± 0% aim7.jobs-per-min
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% aim7.time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +19.9% 7781 ± 4% aim7.time.involuntary_context_switches
2378 ± 0% +1.1% 2404 ± 0% +9.2% 2598 ± 0% aim7.time.maximum_resident_set_size
376.89 ± 0% +28.4% 484.11 ± 0% +22.8% 462.92 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% -17.3% 356032 ± 0% aim7.time.voluntary_context_switches
26816 ± 8% +10.2% 29542 ± 1% +13.5% 30428 ± 0% interrupts.CAL:Function_call_interrupts
1016 ± 8% +527.7% 6381 ± 59% +483.5% 5932 ± 82% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_trans_get_buf_map.xfs_da_get_buf.xfs_dir3_data_init.xfs_dir2_sf_to_block.xfs_dir2_sf_addname.xfs_dir_createname.xfs_create.xfs_generic_create
125122 ± 10% -10.7% 111758 ± 12% +8.2% 135440 ± 3% softirqs.SCHED
410707 ± 12% +5.2% 432155 ± 11% +24.9% 512838 ± 4% softirqs.TIMER
24772 ± 0% -28.6% 17675 ± 0% -24.1% 18813 ± 1% vmstat.system.cs
53477 ± 2% +5.6% 56453 ± 0% +6.1% 56716 ± 0% vmstat.system.in
43469 ± 0% +5.3% 45792 ± 1% +25.0% 54343 ± 16% proc-vmstat.nr_active_anon
3906 ± 0% +28.8% 5032 ± 2% -48.8% 2000 ± 96% proc-vmstat.nr_active_file
919.33 ± 5% +14.8% 1055 ± 8% +17.8% 1083 ± 10% proc-vmstat.nr_dirty
2270 ± 0% +9.6% 2488 ± 0% +2255.3% 53482 ± 95% proc-vmstat.nr_inactive_anon
3444 ± 5% +41.8% 4884 ± 0% +1809.2% 65752 ± 92% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% +64.7% 6741 ± 9% proc-vmstat.pgactivate
1975 ± 15% +63.2% 3224 ± 17% +39.3% 2752 ± 15% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% +39.3% 2752 ± 15% slabinfo.scsi_data_buffer.num_objs
464.33 ± 15% +63.3% 758.33 ± 17% +39.3% 647.00 ± 15% slabinfo.xfs_efd_item.active_objs
464.33 ± 15% +63.3% 758.33 ± 17% +39.3% 647.00 ± 15% slabinfo.xfs_efd_item.num_objs
1859 ± 0% +9.3% 2032 ± 6% +12.4% 2089 ± 7% slabinfo.xfs_ili.active_objs
1859 ± 0% +9.3% 2032 ± 6% +12.4% 2089 ± 7% slabinfo.xfs_ili.num_objs
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +19.9% 7781 ± 4% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% +8.2% 1122 ± 0% time.percent_of_cpu_this_job_got
376.89 ± 0% +28.4% 484.11 ± 0% +22.8% 462.92 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% -17.3% 356032 ± 0% time.voluntary_context_switches
52991525 ± 1% -19.4% 42687208 ± 0% -15.8% 44610884 ± 0% cpuidle.C1-IVT.time
319584 ± 1% -26.5% 234868 ± 1% -20.1% 255455 ± 1% cpuidle.C1-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% -17.8% 2851560 ± 1% cpuidle.C1E-IVT.time
46760 ± 0% -22.4% 36298 ± 0% -19.3% 37738 ± 0% cpuidle.C1E-IVT.usage
12590471 ± 0% -22.3% 9788585 ± 1% -19.2% 10169486 ± 0% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% -17.0% 66337 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% +11.5% 1.449e+09 ± 0% cpuidle.C6-IVT.time
1645891 ± 1% +6.2% 1747525 ± 0% +10.3% 1814699 ± 1% cpuidle.C6-IVT.usage
352.33 ± 8% -24.7% 265.33 ± 1% -11.3% 312.50 ± 4% cpuidle.POLL.usage
189508 ± 0% +6.3% 201410 ± 0% +19.0% 225505 ± 12% meminfo.Active
173880 ± 0% +4.4% 181454 ± 1% +25.1% 217503 ± 16% meminfo.Active(anon)
15627 ± 0% +27.7% 19956 ± 1% -48.8% 8001 ± 96% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% +15.3% 18575 ± 1% meminfo.AnonHugePages
2260771 ± 0% -0.7% 2244069 ± 1% +10.6% 2501050 ± 9% meminfo.Committed_AS
4330854 ± 11% -8.5% 3960847 ± 0% +16.2% 5034030 ± 0% meminfo.DirectMap2M
132898 ± 9% +15.4% 153380 ± 1% -3.1% 128773 ± 4% meminfo.DirectMap4k
9085 ± 0% +9.4% 9940 ± 0% +2254.7% 213930 ± 95% meminfo.Inactive(anon)
13777 ± 5% +43.1% 19709 ± 0% +1809.0% 263006 ± 92% meminfo.Shmem
24.18 ± 0% +9.0% 26.35 ± 0% +7.1% 25.91 ± 0% turbostat.%Busy
686.00 ± 0% +9.5% 751.00 ± 0% +7.5% 737.50 ± 0% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% -17.9% 0.23 ± 0% turbostat.CPU%c3
93.33 ± 1% +3.0% 96.15 ± 0% +0.1% 93.44 ± 0% turbostat.CorWatt
79.00 ± 1% -0.4% 78.67 ± 3% -25.9% 58.50 ± 2% turbostat.CoreTmp
3.05 ± 25% +24.9% 3.81 ± 44% -54.3% 1.40 ± 3% turbostat.Pkg%pc6
78.67 ± 0% +0.4% 79.00 ± 3% -25.6% 58.50 ± 2% turbostat.PkgTmp
124.61 ± 0% +2.1% 127.17 ± 0% -1.0% 123.33 ± 0% turbostat.PkgWatt
4.74 ± 0% -2.7% 4.61 ± 1% -11.1% 4.21 ± 0% turbostat.RAMWatt
1724300 ± 27% -40.5% 1025538 ± 1% -39.2% 1048552 ± 0% sched_debug.cfs_rq:/.load.max
618.30 ± 4% +0.2% 619.34 ± 2% +12.0% 692.21 ± 3% sched_debug.cfs_rq:/.min_vruntime.avg
96.36 ± 3% +18.6% 114.32 ± 15% +19.1% 114.75 ± 17% sched_debug.cfs_rq:/.util_avg.stddev
15.54 ± 3% +1.4% 15.76 ± 22% -14.1% 13.35 ± 1% sched_debug.cpu.cpu_load[4].avg
1724300 ± 27% -40.5% 1025538 ± 1% -39.2% 1048552 ± 0% sched_debug.cpu.load.max
4751 ± 21% -14.6% 4056 ± 25% +25.1% 5944 ± 7% sched_debug.cpu.nr_load_updates.avg
7914 ± 1% -14.1% 6797 ± 15% +29.9% 10280 ± 18% sched_debug.cpu.nr_load_updates.max
2887 ± 30% -28.2% 2073 ± 48% +37.8% 3977 ± 9% sched_debug.cpu.nr_load_updates.min
1182 ± 2% +5.2% 1244 ± 2% +13.0% 1336 ± 11% sched_debug.cpu.nr_load_updates.stddev
1006 ± 3% +3.7% 1044 ± 3% +7.5% 1082 ± 5% sched_debug.cpu.nr_switches.avg
7.66 ± 20% -24.9% 5.75 ± 15% -20.7% 6.07 ± 4% sched_debug.cpu.nr_uninterruptible.stddev
7723 ± 0% +32.6% 10238 ± 5% -48.6% 3968 ± 92% numa-meminfo.node0.Active(file)
1589 ± 17% +45.5% 2313 ± 24% +17.4% 1866 ± 2% numa-meminfo.node0.Dirty
56052 ± 3% +58.2% 88666 ± 17% +99.1% 111572 ± 36% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% +123.4% 107536 ± 41% numa-meminfo.node1.Active(anon)
7908 ± 1% +22.9% 9722 ± 3% -49.0% 4035 ± 99% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% +76.9% 82652 ± 34% numa-meminfo.node1.AnonPages
3283 ±122% +4.7% 3436 ± 99% +3034.3% 102920 ± 98% numa-meminfo.node1.Inactive(anon)
6005 ± 4% -0.5% 5974 ± 1% +340.4% 26444 ± 77% numa-meminfo.node1.KernelStack
545018 ± 2% +9.2% 594908 ± 4% +33.1% 725280 ± 19% numa-meminfo.node1.MemUsed
10518 ± 11% +72.6% 18157 ± 33% +330.3% 45256 ± 74% numa-meminfo.node1.PageTables
51114 ± 1% +2.8% 52548 ± 8% +78.0% 90996 ± 39% numa-meminfo.node1.SUnreclaim
4789 ± 69% +102.3% 9687 ± 9% +2571.5% 127936 ± 91% numa-meminfo.node1.Shmem
83631 ± 2% -1.7% 82192 ± 9% +47.0% 122949 ± 22% numa-meminfo.node1.Slab
1930 ± 0% +33.9% 2585 ± 3% -48.6% 992.00 ± 92% numa-vmstat.node0.nr_active_file
4468 ± 7% -8.5% 4089 ± 5% +9.7% 4902 ± 5% numa-vmstat.node0.nr_alloc_batch
466.67 ± 4% +29.3% 603.33 ± 14% +4.0% 485.50 ± 22% numa-vmstat.node0.nr_dirty
12026 ± 4% +64.1% 19734 ± 20% +123.3% 26852 ± 41% numa-vmstat.node1.nr_active_anon
1977 ± 1% +23.6% 2444 ± 1% -49.0% 1008 ± 99% numa-vmstat.node1.nr_active_file
3809 ± 6% +16.1% 4422 ± 4% +17.1% 4459 ± 17% numa-vmstat.node1.nr_alloc_batch
11671 ± 3% +55.9% 18197 ± 24% +76.8% 20633 ± 34% numa-vmstat.node1.nr_anon_pages
13239858 ± 0% +2.7% 13602721 ± 4% +9.4% 14489999 ± 2% numa-vmstat.node1.nr_dirtied
480.67 ± 4% -5.2% 455.67 ± 24% +7.8% 518.00 ± 6% numa-vmstat.node1.nr_dirty
820.33 ±122% +4.7% 858.67 ± 99% +3036.2% 25727 ± 98% numa-vmstat.node1.nr_inactive_anon
373.67 ± 4% -0.5% 371.67 ± 1% +340.5% 1646 ± 76% numa-vmstat.node1.nr_kernel_stack
2626 ± 11% +72.6% 4533 ± 33% +329.6% 11283 ± 74% numa-vmstat.node1.nr_page_table_pages
1197 ± 69% +102.3% 2422 ± 9% +2572.1% 31984 ± 91% numa-vmstat.node1.nr_shmem
12777 ± 1% +2.8% 13134 ± 8% +77.9% 22731 ± 39% numa-vmstat.node1.nr_slab_unreclaimable
456.33 ± 57% -75.6% 111.33 ± 86% -71.2% 131.50 ± 96% numa-vmstat.node1.nr_written
13421081 ± 0% +2.9% 13803847 ± 4% +9.8% 14735371 ± 2% numa-vmstat.node1.numa_hit
13421080 ± 0% +2.9% 13803845 ± 4% +9.8% 14735369 ± 2% numa-vmstat.node1.numa_local
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% +20.5% 3.204e+11 ± 0% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% -22.9% 0.32 ± 0% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% -7.0% 1.014e+09 ± 0% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% -14.7% 837107 ± 0% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% +16.1% 1.754e+12 ± 0% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% -4.7% 97803 ± 0% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% +48.9% 0.39 ± 36% perf-stat.dTLB-load-miss-rate
8.332e+08 ± 13% -3.8% 8.015e+08 ± 6% +105.9% 1.716e+09 ± 34% perf-stat.dTLB-load-misses
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% +39.6% 4.417e+11 ± 2% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% +35.0% 0.04 ± 15% perf-stat.dTLB-store-miss-rate
60071678 ± 27% -25.6% 44690199 ± 15% +69.8% 1.02e+08 ± 17% perf-stat.dTLB-store-misses
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% +25.1% 2.812e+11 ± 2% perf-stat.dTLB-stores
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% +25.5% 1.87e+12 ± 0% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% +33.0% 57666 ± 8% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% +8.1% 1.07 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% +4.6% 274993 ± 0% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% +5.5% 35.99 ± 0% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% +5.5% 49038648 ± 0% perf-stat.node-load-misses
89728871 ± 1% +1.3% 90913257 ± 1% -2.8% 87233401 ± 0% perf-stat.node-loads
9.96 ± 0% +13.4% 11.30 ± 0% +18.3% 11.79 ± 3% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% +5.3% 25747546 ± 3% perf-stat.node-store-misses
2.211e+08 ± 1% -0.6% 2.197e+08 ± 1% -12.6% 1.931e+08 ± 6% perf-stat.node-stores
262780 ± 0% +4.4% 274227 ± 1% +4.6% 274953 ± 0% perf-stat.page-faults
11.31 ± 1% -18.1% 9.27 ± 0% -17.3% 9.36 ± 0% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.68 ± 1% +Inf% 1.74 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.80 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.55 ± 3% -14.2% 2.19 ± 2% -15.3% 2.16 ± 0% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 4.45 ± 1% +Inf% 4.96 ± 1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
5.93 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
13.71 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.64 ± 0% +Inf% 3.79 ± 2% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.04 ± 2% -18.9% 0.84 ± 1% -15.4% 0.88 ± 0% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
11.24 ± 2% -18.1% 9.21 ± 0% -17.3% 9.30 ± 0% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.31 ± 2% -18.1% 9.26 ± 0% -17.3% 9.36 ± 0% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
1.72 ± 2% -10.1% 1.54 ± 1% -17.6% 1.42 ± 0% perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 1.09 ± 2% +Inf% 1.12 ± 1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.32 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.68 ± 2% +Inf% 2.65 ± 0% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
3.04 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
2.50 ± 3% -11.5% 2.21 ± 0% -18.8% 2.03 ± 0% perf-profile.cycles-pp.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
1.00 ± 1% -18.0% 0.82 ± 1% -10.0% 0.90 ± 0% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
1.12 ± 2% -17.6% 0.92 ± 4% -13.8% 0.96 ± 0% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.38 ± 2% -13.3% 1.19 ± 1% -12.5% 1.21 ± 0% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
54.10 ± 1% +13.1% 61.20 ± 0% +10.6% 59.86 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
6.34 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 3.69 ± 1% +Inf% 3.62 ± 0% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
4.02 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
0.98 ± 5% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 2.56 ± 2% +Inf% 2.50 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
2.91 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
3.42 ± 0% -20.9% 2.71 ± 2% -15.7% 2.88 ± 0% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
0.00 ± -1% +Inf% 4.69 ± 0% +Inf% 5.54 ± 1% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
6.24 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
19.18 ± 5% -9.3% 17.40 ± 0% -5.8% 18.06 ± 1% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
0.94 ± 4% -19.8% 0.76 ± 0% -15.2% 0.80 ± 1% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
3.95 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 3.22 ± 0% +Inf% 3.29 ± 1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
19.75 ± 5% -9.8% 17.81 ± 0% -6.3% 18.50 ± 1% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
19.18 ± 5% -9.3% 17.40 ± 0% -5.8% 18.05 ± 1% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
18.45 ± 5% -9.2% 16.75 ± 0% -5.6% 17.42 ± 1% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.44 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 1.18 ± 1% +Inf% 1.25 ± 2% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
1.86 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
0.00 ± -1% +Inf% 1.53 ± 1% +Inf% 1.61 ± 3% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
1.74 ± 2% -19.9% 1.40 ± 3% -16.8% 1.45 ± 0% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.27 ± 0% -22.5% 0.99 ± 4% -22.3% 0.99 ± 0% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% -20.7% 2.07 ± 0% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.66 ± 1% -24.3% 2.01 ± 1% -20.5% 2.12 ± 0% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 2% -28.2% 1.28 ± 3% -23.3% 1.37 ± 2% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% -19.4% 0.86 ± 0% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1.01 ± 3% -17.9% 0.83 ± 2% -13.6% 0.87 ± 1% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.26 ± 2% -18.1% 9.23 ± 0% -17.2% 9.32 ± 0% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.21 ± 2% -18.1% 9.18 ± 0% -17.4% 9.26 ± 0% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.34 ± 2% -18.1% 9.29 ± 0% -17.3% 9.38 ± 0% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.55 ± 3% +Inf% 1.65 ± 0% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
1.83 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
43.95 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 7.91 ± 1% +Inf% 9.04 ± 0% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
10.68 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
1.91 ± 3% -16.4% 1.59 ± 1% -17.7% 1.57 ± 0% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
0.00 ± -1% +Inf% 9.85 ± 0% +Inf% 9.91 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
10.96 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 52.29 ± 0% +Inf% 50.82 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 52.94 ± 0% +Inf% 51.44 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 34.35 ± 0% +Inf% 32.27 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 16.48 ± 0% +Inf% 16.75 ± 1% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
11.22 ± 2% -18.1% 9.19 ± 0% -17.3% 9.27 ± 0% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
0.00 ± -1% +Inf% 1.55 ± 1% +Inf% 1.42 ± 0% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.72 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 2.78 ± 0% +Inf% 2.88 ± 1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
3.39 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
0.00 ± -1% +Inf% 3.44 ± 1% +NaN% 0.00 ± -1% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.03 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.43 ± 0% +Inf% 2.48 ± 3% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 9.25 ± 0% +Inf% 9.25 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
10.37 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
1.52 ± 2% -9.2% 1.38 ± 1% -17.0% 1.27 ± 0% perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.pagecache_get_page
2.58 ± 1% -24.1% 1.96 ± 0% -20.6% 2.05 ± 0% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 0.95 ± 0% +Inf% 1.04 ± 0% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.17 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.39 ± 3% -11.2% 2.12 ± 0% -18.3% 1.95 ± 1% perf-profile.cycles-pp.release_pages.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.06 ± 3% -22.5% 1.60 ± 2% -10.9% 1.83 ± 0% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.79 ± 3% -22.2% 1.39 ± 0% -9.8% 1.62 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% -7.4% 1.23 ± 1% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% -6.4% 18.54 ± 1% perf-profile.cycles-pp.start_secondary
2.67 ± 1% -24.2% 2.02 ± 1% -20.4% 2.12 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% -23.0% 1.38 ± 2% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
60.98 ± 1% +9.5% 66.76 ± 0% +7.8% 65.74 ± 0% perf-profile.cycles-pp.sys_write.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% -17.2% 9.39 ± 0% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.32 ± 1% -18.0% 9.28 ± 0% -17.3% 9.37 ± 0% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
5.96 ± 1% -20.0% 4.77 ± 0% -15.8% 5.02 ± 0% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
9.89 ± 2% -17.4% 8.17 ± 0% -16.7% 8.25 ± 0% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
9.87 ± 2% -17.5% 8.15 ± 0% -16.8% 8.21 ± 0% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
2.07 ± 1% -20.4% 1.65 ± 2% -14.9% 1.77 ± 1% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
2.40 ± 1% -21.0% 1.89 ± 2% -15.3% 2.03 ± 1% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
0.00 ± -1% +Inf% 1.36 ± 1% +Inf% 1.56 ± 3% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.72 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
59.63 ± 1% +10.2% 65.72 ± 0% +8.5% 64.68 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.52 ± 2% +NaN% 0.00 ± -1% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.73 ± 1% +Inf% 1.75 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.97 ± 2% +Inf% 2.04 ± 0% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.61 ± 2% +Inf% 1.65 ± 1% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.24 ± 2% +Inf% 1.21 ± 3% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.46 ± 1% +Inf% 1.47 ± 1% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.21 ± 2% +Inf% 1.25 ± 0% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
1.25 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 3.06 ± 1% +Inf% 3.08 ± 1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
1.04 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 3.04 ± 1% +Inf% 3.16 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.05 ± 1% +Inf% 3.09 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
1.32 ± 2% -21.5% 1.04 ± 1% -19.7% 1.06 ± 0% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
51.83 ± 1% +14.3% 59.25 ± 0% +11.8% 57.95 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
0.00 ± -1% +Inf% 16.05 ± 0% +Inf% 16.68 ± 0% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
53.16 ± 1% +13.6% 60.40 ± 0% +11.1% 59.09 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% -21.8% 0.97 ± 0% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.42 ± 2% -21.2% 1.12 ± 1% -20.6% 1.12 ± 0% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
6.46 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.29 ± 3% -18.9% 1.04 ± 1% -14.1% 1.10 ± 0% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 1.14 ± 3% +Inf% 1.17 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
1.21 ± 1% -23.4% 0.93 ± 4% -22.5% 0.94 ± 0% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
1.23 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 4.14 ± 0% +Inf% 4.15 ± 1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
3.28 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 9.08 ± 0% +Inf% 9.19 ± 1% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.54 ± 0% -20.8% 2.81 ± 1% -15.6% 2.99 ± 0% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.35 ± 1% -21.0% 1.86 ± 1% -15.1% 2.00 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
25.10 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.03 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
1.42 ± 2% -20.7% 1.13 ± 1% -20.4% 1.13 ± 0% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
1.42 ± 2% -20.5% 1.13 ± 1% -20.2% 1.13 ± 0% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
2.27 ± 1% -10.6% 2.03 ± 0% -6.7% 2.12 ± 1% perf-profile.func.cycles-pp.___might_sleep
2.49 ± 0% -34.5% 1.63 ± 1% -16.7% 2.08 ± 0% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.51 ± 2% +15.4% 1.75 ± 1% +18.0% 1.79 ± 2% perf-profile.func.cycles-pp.__block_write_begin_int
1.79 ± 4% -16.8% 1.49 ± 1% -14.5% 1.53 ± 0% perf-profile.func.cycles-pp.__mark_inode_dirty
1.32 ± 0% -16.4% 1.10 ± 1% -9.5% 1.19 ± 0% perf-profile.func.cycles-pp.__radix_tree_lookup
1.08 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.__xfs_get_blocks
1.16 ± 0% -18.1% 0.95 ± 1% -15.8% 0.98 ± 1% perf-profile.func.cycles-pp._raw_spin_lock
3.96 ± 2% -18.4% 3.23 ± 0% -16.9% 3.29 ± 1% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.41 ± 3% -20.6% 1.12 ± 3% -21.1% 1.11 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.30 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.generic_perform_write
1.31 ± 2% -46.7% 0.70 ± 0% -43.8% 0.73 ± 0% perf-profile.func.cycles-pp.generic_write_end
18.43 ± 5% -9.1% 16.76 ± 0% -5.4% 17.44 ± 1% perf-profile.func.cycles-pp.intel_idle
0.00 ± -1% +Inf% 1.12 ± 1% +Inf% 0.90 ± 0% perf-profile.func.cycles-pp.iomap_write_actor
1.50 ± 1% -20.9% 1.19 ± 1% -17.0% 1.25 ± 2% perf-profile.func.cycles-pp.mark_buffer_dirty
0.00 ± -1% +Inf% 1.91 ± 1% +NaN% 0.00 ± -1% perf-profile.func.cycles-pp.mark_page_accessed
3.24 ± 0% -19.8% 2.60 ± 0% -18.1% 2.66 ± 3% perf-profile.func.cycles-pp.memset_erms
1.75 ± 2% -18.9% 1.42 ± 1% -7.3% 1.62 ± 4% perf-profile.func.cycles-pp.unlock_page
1.56 ± 2% +6.0% 1.65 ± 3% +11.8% 1.74 ± 1% perf-profile.func.cycles-pp.up_write
1.16 ± 1% -21.6% 0.91 ± 1% -17.7% 0.95 ± 1% perf-profile.func.cycles-pp.vfs_write
0.37 ± 2% +243.6% 1.26 ± 2% +272.3% 1.36 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.41 ± 1% +198.4% 1.22 ± 2% +198.8% 1.23 ± 3% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.70 ± 5% +219.5% 2.24 ± 0% +227.9% 2.29 ± 0% perf-profile.func.cycles-pp.xfs_bmapi_read
1.05 ± 2% -15.6% 0.88 ± 3% -18.8% 0.85 ± 1% perf-profile.func.cycles-pp.xfs_file_write_iter
0.64 ± 1% +182.8% 1.81 ± 4% +182.0% 1.81 ± 0% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.00 ± -1% +Inf% 1.10 ± 3% +Inf% 1.21 ± 2% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
0.46 ± 4% +161.6% 1.20 ± 1% +171.7% 1.25 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
raw perf data:
"perf-profile.func.cycles-pp.intel_idle": 17.66,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.25,
"perf-profile.func.cycles-pp.memset_erms": 2.56,
"perf-profile.func.cycles-pp.xfs_bmapi_read": 2.28,
"perf-profile.func.cycles-pp.___might_sleep": 2.09,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 2.07,
"perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.79,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.74,
"perf-profile.func.cycles-pp.up_write": 1.72,
"perf-profile.func.cycles-pp.unlock_page": 1.69,
"perf-profile.func.cycles-pp.down_write": 1.59,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.54,
"perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.33,
"perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.23,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.21,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.2,
"perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.18,
"perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.17,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.15,
"perf-profile.func.cycles-pp.__might_sleep": 1.14,
"perf-profile.func.cycles-pp._raw_spin_lock": 0.97,
"perf-profile.func.cycles-pp.vfs_write": 0.94,
"perf-profile.func.cycles-pp.xfs_bmapi_delay": 0.93,
"perf-profile.func.cycles-pp.iomap_write_actor": 0.9,
"perf-profile.func.cycles-pp.pagecache_get_page": 0.89,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.86,
"perf-profile.func.cycles-pp.xfs_file_iomap_begin": 0.81,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.78,
"perf-profile.func.cycles-pp.iomap_apply": 0.77,
"perf-profile.func.cycles-pp.generic_write_end": 0.74,
"perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.72,
"perf-profile.func.cycles-pp.find_get_entry": 0.69,
"perf-profile.func.cycles-pp.__vfs_write": 0.67,
Best Regards,
Huang, Ying
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-11 21:16 ` Huang, Ying
From: Huang, Ying @ 2016-08-11 21:16 UTC (permalink / raw)
To: lkp
Christoph Hellwig <hch@lst.de> writes:
> On Thu, Aug 11, 2016 at 12:51:31PM -0700, Linus Torvalds wrote:
>> Ok. It does seem to also reset the active file page counts back, so
>> that part did seem to be related, but yeah, from a performance
>> standpoint that was clearly not a major issue.
>>
>> Let's hope Dave can figure out something based on his numbers, because
>> I'm out of ideas. Or maybe it's the pagefault-atomic thing that
>> Christoph was looking at.
>
> I can't really think of any reason why the pagefault_disable() would
> significantly change performance. Anyway, the patch for that is below
> (on top of the previous mark_page_accessed() one), so feel free to
> re-run the test with it. It would also be nice to see the profiles
> with the two patches applied.
>
> commit 43106eea246074acc4bb7d12fdb91f58002f52ed
> Author: Christoph Hellwig <hch@lst.de>
> Date: Thu Aug 11 10:41:40 2016 -0700
>
> fs: remove superflous pagefault_disable from iomap_write_actor
>
> No idea where this really came from, generic_perform_write only briefly
> did a pagefault_disable when trying a different prefault scheme.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> diff --git a/fs/iomap.c b/fs/iomap.c
> index f39c318..74712e2 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -194,9 +194,7 @@ again:
> if (mapping_writably_mapped(inode->i_mapping))
> flush_dcache_page(page);
>
> - pagefault_disable();
> copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> - pagefault_enable();
>
> flush_dcache_page(page);
>
Test results are as follows:
commit e129b86bfacc8bb517b843fca41d0d179de7a4ca
Author: Christoph Hellwig <hch@lst.de>
Date: Thu Aug 11 10:41:40 2016 -0700
fs: remove superflous pagefault_disable from iomap_write_actor
No idea where this really came from, generic_perform_write only briefly
did a pagefault_disable when trying a different prefault scheme.
Signed-off-by: Christoph Hellwig <hch@lst.de>
diff --git a/fs/iomap.c b/fs/iomap.c
index f39c318..74712e2 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -194,9 +194,7 @@ again:
if (mapping_writably_mapped(inode->i_mapping))
flush_dcache_page(page);
- pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
- pagefault_enable();
flush_dcache_page(page);
=========================================================================================
compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506
68a9f5e7007c1afa2cf6830b690a90d0187c0684
e129b86bfacc8bb517b843fca41d0d179de7a4ca
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 e129b86bfacc8bb517b843fca4
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
484435 ± 0% -13.3% 420004 ± 0% -11.6% 428323 ± 0% aim7.jobs-per-min
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% aim7.time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +19.9% 7781 ± 4% aim7.time.involuntary_context_switches
2378 ± 0% +1.1% 2404 ± 0% +9.2% 2598 ± 0% aim7.time.maximum_resident_set_size
376.89 ± 0% +28.4% 484.11 ± 0% +22.8% 462.92 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% -17.3% 356032 ± 0% aim7.time.voluntary_context_switches
26816 ± 8% +10.2% 29542 ± 1% +13.5% 30428 ± 0% interrupts.CAL:Function_call_interrupts
1016 ± 8% +527.7% 6381 ± 59% +483.5% 5932 ± 82% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_trans_get_buf_map.xfs_da_get_buf.xfs_dir3_data_init.xfs_dir2_sf_to_block.xfs_dir2_sf_addname.xfs_dir_createname.xfs_create.xfs_generic_create
125122 ± 10% -10.7% 111758 ± 12% +8.2% 135440 ± 3% softirqs.SCHED
410707 ± 12% +5.2% 432155 ± 11% +24.9% 512838 ± 4% softirqs.TIMER
24772 ± 0% -28.6% 17675 ± 0% -24.1% 18813 ± 1% vmstat.system.cs
53477 ± 2% +5.6% 56453 ± 0% +6.1% 56716 ± 0% vmstat.system.in
43469 ± 0% +5.3% 45792 ± 1% +25.0% 54343 ± 16% proc-vmstat.nr_active_anon
3906 ± 0% +28.8% 5032 ± 2% -48.8% 2000 ± 96% proc-vmstat.nr_active_file
919.33 ± 5% +14.8% 1055 ± 8% +17.8% 1083 ± 10% proc-vmstat.nr_dirty
2270 ± 0% +9.6% 2488 ± 0% +2255.3% 53482 ± 95% proc-vmstat.nr_inactive_anon
3444 ± 5% +41.8% 4884 ± 0% +1809.2% 65752 ± 92% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% +64.7% 6741 ± 9% proc-vmstat.pgactivate
1975 ± 15% +63.2% 3224 ± 17% +39.3% 2752 ± 15% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% +39.3% 2752 ± 15% slabinfo.scsi_data_buffer.num_objs
464.33 ± 15% +63.3% 758.33 ± 17% +39.3% 647.00 ± 15% slabinfo.xfs_efd_item.active_objs
464.33 ± 15% +63.3% 758.33 ± 17% +39.3% 647.00 ± 15% slabinfo.xfs_efd_item.num_objs
1859 ± 0% +9.3% 2032 ± 6% +12.4% 2089 ± 7% slabinfo.xfs_ili.active_objs
1859 ± 0% +9.3% 2032 ± 6% +12.4% 2089 ± 7% slabinfo.xfs_ili.num_objs
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +13.0% 42.24 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +19.9% 7781 ± 4% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% +8.2% 1122 ± 0% time.percent_of_cpu_this_job_got
376.89 ± 0% +28.4% 484.11 ± 0% +22.8% 462.92 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% -17.3% 356032 ± 0% time.voluntary_context_switches
52991525 ± 1% -19.4% 42687208 ± 0% -15.8% 44610884 ± 0% cpuidle.C1-IVT.time
319584 ± 1% -26.5% 234868 ± 1% -20.1% 255455 ± 1% cpuidle.C1-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% -17.8% 2851560 ± 1% cpuidle.C1E-IVT.time
46760 ± 0% -22.4% 36298 ± 0% -19.3% 37738 ± 0% cpuidle.C1E-IVT.usage
12590471 ± 0% -22.3% 9788585 ± 1% -19.2% 10169486 ± 0% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% -17.0% 66337 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% +11.5% 1.449e+09 ± 0% cpuidle.C6-IVT.time
1645891 ± 1% +6.2% 1747525 ± 0% +10.3% 1814699 ± 1% cpuidle.C6-IVT.usage
352.33 ± 8% -24.7% 265.33 ± 1% -11.3% 312.50 ± 4% cpuidle.POLL.usage
189508 ± 0% +6.3% 201410 ± 0% +19.0% 225505 ± 12% meminfo.Active
173880 ± 0% +4.4% 181454 ± 1% +25.1% 217503 ± 16% meminfo.Active(anon)
15627 ± 0% +27.7% 19956 ± 1% -48.8% 8001 ± 96% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% +15.3% 18575 ± 1% meminfo.AnonHugePages
2260771 ± 0% -0.7% 2244069 ± 1% +10.6% 2501050 ± 9% meminfo.Committed_AS
4330854 ± 11% -8.5% 3960847 ± 0% +16.2% 5034030 ± 0% meminfo.DirectMap2M
132898 ± 9% +15.4% 153380 ± 1% -3.1% 128773 ± 4% meminfo.DirectMap4k
9085 ± 0% +9.4% 9940 ± 0% +2254.7% 213930 ± 95% meminfo.Inactive(anon)
13777 ± 5% +43.1% 19709 ± 0% +1809.0% 263006 ± 92% meminfo.Shmem
24.18 ± 0% +9.0% 26.35 ± 0% +7.1% 25.91 ± 0% turbostat.%Busy
686.00 ± 0% +9.5% 751.00 ± 0% +7.5% 737.50 ± 0% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% -17.9% 0.23 ± 0% turbostat.CPU%c3
93.33 ± 1% +3.0% 96.15 ± 0% +0.1% 93.44 ± 0% turbostat.CorWatt
79.00 ± 1% -0.4% 78.67 ± 3% -25.9% 58.50 ± 2% turbostat.CoreTmp
3.05 ± 25% +24.9% 3.81 ± 44% -54.3% 1.40 ± 3% turbostat.Pkg%pc6
78.67 ± 0% +0.4% 79.00 ± 3% -25.6% 58.50 ± 2% turbostat.PkgTmp
124.61 ± 0% +2.1% 127.17 ± 0% -1.0% 123.33 ± 0% turbostat.PkgWatt
4.74 ± 0% -2.7% 4.61 ± 1% -11.1% 4.21 ± 0% turbostat.RAMWatt
1724300 ± 27% -40.5% 1025538 ± 1% -39.2% 1048552 ± 0% sched_debug.cfs_rq:/.load.max
618.30 ± 4% +0.2% 619.34 ± 2% +12.0% 692.21 ± 3% sched_debug.cfs_rq:/.min_vruntime.avg
96.36 ± 3% +18.6% 114.32 ± 15% +19.1% 114.75 ± 17% sched_debug.cfs_rq:/.util_avg.stddev
15.54 ± 3% +1.4% 15.76 ± 22% -14.1% 13.35 ± 1% sched_debug.cpu.cpu_load[4].avg
1724300 ± 27% -40.5% 1025538 ± 1% -39.2% 1048552 ± 0% sched_debug.cpu.load.max
4751 ± 21% -14.6% 4056 ± 25% +25.1% 5944 ± 7% sched_debug.cpu.nr_load_updates.avg
7914 ± 1% -14.1% 6797 ± 15% +29.9% 10280 ± 18% sched_debug.cpu.nr_load_updates.max
2887 ± 30% -28.2% 2073 ± 48% +37.8% 3977 ± 9% sched_debug.cpu.nr_load_updates.min
1182 ± 2% +5.2% 1244 ± 2% +13.0% 1336 ± 11% sched_debug.cpu.nr_load_updates.stddev
1006 ± 3% +3.7% 1044 ± 3% +7.5% 1082 ± 5% sched_debug.cpu.nr_switches.avg
7.66 ± 20% -24.9% 5.75 ± 15% -20.7% 6.07 ± 4% sched_debug.cpu.nr_uninterruptible.stddev
7723 ± 0% +32.6% 10238 ± 5% -48.6% 3968 ± 92% numa-meminfo.node0.Active(file)
1589 ± 17% +45.5% 2313 ± 24% +17.4% 1866 ± 2% numa-meminfo.node0.Dirty
56052 ± 3% +58.2% 88666 ± 17% +99.1% 111572 ± 36% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% +123.4% 107536 ± 41% numa-meminfo.node1.Active(anon)
7908 ± 1% +22.9% 9722 ± 3% -49.0% 4035 ± 99% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% +76.9% 82652 ± 34% numa-meminfo.node1.AnonPages
3283 ±122% +4.7% 3436 ± 99% +3034.3% 102920 ± 98% numa-meminfo.node1.Inactive(anon)
6005 ± 4% -0.5% 5974 ± 1% +340.4% 26444 ± 77% numa-meminfo.node1.KernelStack
545018 ± 2% +9.2% 594908 ± 4% +33.1% 725280 ± 19% numa-meminfo.node1.MemUsed
10518 ± 11% +72.6% 18157 ± 33% +330.3% 45256 ± 74% numa-meminfo.node1.PageTables
51114 ± 1% +2.8% 52548 ± 8% +78.0% 90996 ± 39% numa-meminfo.node1.SUnreclaim
4789 ± 69% +102.3% 9687 ± 9% +2571.5% 127936 ± 91% numa-meminfo.node1.Shmem
83631 ± 2% -1.7% 82192 ± 9% +47.0% 122949 ± 22% numa-meminfo.node1.Slab
1930 ± 0% +33.9% 2585 ± 3% -48.6% 992.00 ± 92% numa-vmstat.node0.nr_active_file
4468 ± 7% -8.5% 4089 ± 5% +9.7% 4902 ± 5% numa-vmstat.node0.nr_alloc_batch
466.67 ± 4% +29.3% 603.33 ± 14% +4.0% 485.50 ± 22% numa-vmstat.node0.nr_dirty
12026 ± 4% +64.1% 19734 ± 20% +123.3% 26852 ± 41% numa-vmstat.node1.nr_active_anon
1977 ± 1% +23.6% 2444 ± 1% -49.0% 1008 ± 99% numa-vmstat.node1.nr_active_file
3809 ± 6% +16.1% 4422 ± 4% +17.1% 4459 ± 17% numa-vmstat.node1.nr_alloc_batch
11671 ± 3% +55.9% 18197 ± 24% +76.8% 20633 ± 34% numa-vmstat.node1.nr_anon_pages
13239858 ± 0% +2.7% 13602721 ± 4% +9.4% 14489999 ± 2% numa-vmstat.node1.nr_dirtied
480.67 ± 4% -5.2% 455.67 ± 24% +7.8% 518.00 ± 6% numa-vmstat.node1.nr_dirty
820.33 ±122% +4.7% 858.67 ± 99% +3036.2% 25727 ± 98% numa-vmstat.node1.nr_inactive_anon
373.67 ± 4% -0.5% 371.67 ± 1% +340.5% 1646 ± 76% numa-vmstat.node1.nr_kernel_stack
2626 ± 11% +72.6% 4533 ± 33% +329.6% 11283 ± 74% numa-vmstat.node1.nr_page_table_pages
1197 ± 69% +102.3% 2422 ± 9% +2572.1% 31984 ± 91% numa-vmstat.node1.nr_shmem
12777 ± 1% +2.8% 13134 ± 8% +77.9% 22731 ± 39% numa-vmstat.node1.nr_slab_unreclaimable
456.33 ± 57% -75.6% 111.33 ± 86% -71.2% 131.50 ± 96% numa-vmstat.node1.nr_written
13421081 ± 0% +2.9% 13803847 ± 4% +9.8% 14735371 ± 2% numa-vmstat.node1.numa_hit
13421080 ± 0% +2.9% 13803845 ± 4% +9.8% 14735369 ± 2% numa-vmstat.node1.numa_local
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% +20.5% 3.204e+11 ± 0% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% -22.9% 0.32 ± 0% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% -7.0% 1.014e+09 ± 0% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% -14.7% 837107 ± 0% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% +16.1% 1.754e+12 ± 0% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% -4.7% 97803 ± 0% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% +48.9% 0.39 ± 36% perf-stat.dTLB-load-miss-rate
8.332e+08 ± 13% -3.8% 8.015e+08 ± 6% +105.9% 1.716e+09 ± 34% perf-stat.dTLB-load-misses
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% +39.6% 4.417e+11 ± 2% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% +35.0% 0.04 ± 15% perf-stat.dTLB-store-miss-rate
60071678 ± 27% -25.6% 44690199 ± 15% +69.8% 1.02e+08 ± 17% perf-stat.dTLB-store-misses
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% +25.1% 2.812e+11 ± 2% perf-stat.dTLB-stores
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% +25.5% 1.87e+12 ± 0% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% +33.0% 57666 ± 8% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% +8.1% 1.07 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% +4.6% 274993 ± 0% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% +5.5% 35.99 ± 0% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% +5.5% 49038648 ± 0% perf-stat.node-load-misses
89728871 ± 1% +1.3% 90913257 ± 1% -2.8% 87233401 ± 0% perf-stat.node-loads
9.96 ± 0% +13.4% 11.30 ± 0% +18.3% 11.79 ± 3% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% +5.3% 25747546 ± 3% perf-stat.node-store-misses
2.211e+08 ± 1% -0.6% 2.197e+08 ± 1% -12.6% 1.931e+08 ± 6% perf-stat.node-stores
262780 ± 0% +4.4% 274227 ± 1% +4.6% 274953 ± 0% perf-stat.page-faults
11.31 ± 1% -18.1% 9.27 ± 0% -17.3% 9.36 ± 0% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.68 ± 1% +Inf% 1.74 ± 1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.80 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.55 ± 3% -14.2% 2.19 ± 2% -15.3% 2.16 ± 0% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 4.45 ± 1% +Inf% 4.96 ± 1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
5.93 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
13.71 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.64 ± 0% +Inf% 3.79 ± 2% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.04 ± 2% -18.9% 0.84 ± 1% -15.4% 0.88 ± 0% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
11.24 ± 2% -18.1% 9.21 ± 0% -17.3% 9.30 ± 0% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.31 ± 2% -18.1% 9.26 ± 0% -17.3% 9.36 ± 0% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
1.72 ± 2% -10.1% 1.54 ± 1% -17.6% 1.42 ± 0% perf-profile.cycles-pp.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin
0.00 ± -1% +Inf% 1.09 ± 2% +Inf% 1.12 ± 1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.32 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.68 ± 2% +Inf% 2.65 ± 0% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
3.04 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
2.50 ± 3% -11.5% 2.21 ± 0% -18.8% 2.03 ± 0% perf-profile.cycles-pp.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
1.00 ± 1% -18.0% 0.82 ± 1% -10.0% 0.90 ± 0% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
1.12 ± 2% -17.6% 0.92 ± 4% -13.8% 0.96 ± 0% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.38 ± 2% -13.3% 1.19 ± 1% -12.5% 1.21 ± 0% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
54.10 ± 1% +13.1% 61.20 ± 0% +10.6% 59.86 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
6.34 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 3.69 ± 1% +Inf% 3.62 ± 0% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
4.02 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
0.98 ± 5% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 2.56 ± 2% +Inf% 2.50 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
2.91 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
3.42 ± 0% -20.9% 2.71 ± 2% -15.7% 2.88 ± 0% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
0.00 ± -1% +Inf% 4.69 ± 0% +Inf% 5.54 ± 1% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
6.24 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
19.18 ± 5% -9.3% 17.40 ± 0% -5.8% 18.06 ± 1% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
0.94 ± 4% -19.8% 0.76 ± 0% -15.2% 0.80 ± 1% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
3.95 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 3.22 ± 0% +Inf% 3.29 ± 1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
19.75 ± 5% -9.8% 17.81 ± 0% -6.3% 18.50 ± 1% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
19.18 ± 5% -9.3% 17.40 ± 0% -5.8% 18.05 ± 1% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
18.45 ± 5% -9.2% 16.75 ± 0% -5.6% 17.42 ± 1% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
1.44 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 1.18 ± 1% +Inf% 1.25 ± 2% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
1.86 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
0.00 ± -1% +Inf% 1.53 ± 1% +Inf% 1.61 ± 3% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
1.74 ± 2% -19.9% 1.40 ± 3% -16.8% 1.45 ± 0% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.27 ± 0% -22.5% 0.99 ± 4% -22.3% 0.99 ± 0% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% -20.7% 2.07 ± 0% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.66 ± 1% -24.3% 2.01 ± 1% -20.5% 2.12 ± 0% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 2% -28.2% 1.28 ± 3% -23.3% 1.37 ± 2% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% -19.4% 0.86 ± 0% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1.01 ± 3% -17.9% 0.83 ± 2% -13.6% 0.87 ± 1% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.26 ± 2% -18.1% 9.23 ± 0% -17.2% 9.32 ± 0% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.21 ± 2% -18.1% 9.18 ± 0% -17.4% 9.26 ± 0% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.34 ± 2% -18.1% 9.29 ± 0% -17.3% 9.38 ± 0% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.55 ± 3% +Inf% 1.65 ± 0% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
1.83 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
43.95 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 7.91 ± 1% +Inf% 9.04 ± 0% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
10.68 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
1.91 ± 3% -16.4% 1.59 ± 1% -17.7% 1.57 ± 0% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
0.00 ± -1% +Inf% 9.85 ± 0% +Inf% 9.91 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
10.96 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 52.29 ± 0% +Inf% 50.82 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± -1% +Inf% 52.94 ± 0% +Inf% 51.44 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 34.35 ± 0% +Inf% 32.27 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± -1% +Inf% 16.48 ± 0% +Inf% 16.75 ± 1% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
11.22 ± 2% -18.1% 9.19 ± 0% -17.3% 9.27 ± 0% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
0.00 ± -1% +Inf% 1.55 ± 1% +Inf% 1.42 ± 0% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.72 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
0.00 ± -1% +Inf% 2.78 ± 0% +Inf% 2.88 ± 1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
3.39 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
0.00 ± -1% +Inf% 3.44 ± 1% +NaN% 0.00 ± -1% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.03 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 2.43 ± 0% +Inf% 2.48 ± 3% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 9.25 ± 0% +Inf% 9.25 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
10.37 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
1.52 ± 2% -9.2% 1.38 ± 1% -17.0% 1.27 ± 0% perf-profile.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.lru_cache_add.add_to_page_cache_lru.pagecache_get_page
2.58 ± 1% -24.1% 1.96 ± 0% -20.6% 2.05 ± 0% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 0.95 ± 0% +Inf% 1.04 ± 0% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
1.17 ± 3% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
2.39 ± 3% -11.2% 2.12 ± 0% -18.3% 1.95 ± 1% perf-profile.cycles-pp.release_pages.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.06 ± 3% -22.5% 1.60 ± 2% -10.9% 1.83 ± 0% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.79 ± 3% -22.2% 1.39 ± 0% -9.8% 1.62 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% -7.4% 1.23 ± 1% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% -6.4% 18.54 ± 1% perf-profile.cycles-pp.start_secondary
2.67 ± 1% -24.2% 2.02 ± 1% -20.4% 2.12 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% -23.0% 1.38 ± 2% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
60.98 ± 1% +9.5% 66.76 ± 0% +7.8% 65.74 ± 0% perf-profile.cycles-pp.sys_write.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% -17.2% 9.39 ± 0% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.32 ± 1% -18.0% 9.28 ± 0% -17.3% 9.37 ± 0% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
5.96 ± 1% -20.0% 4.77 ± 0% -15.8% 5.02 ± 0% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
9.89 ± 2% -17.4% 8.17 ± 0% -16.7% 8.25 ± 0% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
9.87 ± 2% -17.5% 8.15 ± 0% -16.8% 8.21 ± 0% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
2.07 ± 1% -20.4% 1.65 ± 2% -14.9% 1.77 ± 1% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
2.40 ± 1% -21.0% 1.89 ± 2% -15.3% 2.03 ± 1% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
0.00 ± -1% +Inf% 1.36 ± 1% +Inf% 1.56 ± 3% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
1.72 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
59.63 ± 1% +10.2% 65.72 ± 0% +8.5% 64.68 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± -1% +Inf% 1.52 ± 2% +NaN% 0.00 ± -1% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.73 ± 1% +Inf% 1.75 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.97 ± 2% +Inf% 2.04 ± 0% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± -1% +Inf% 1.61 ± 2% +Inf% 1.65 ± 1% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.24 ± 2% +Inf% 1.21 ± 3% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± -1% +Inf% 1.46 ± 1% +Inf% 1.47 ± 1% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± -1% +Inf% 1.21 ± 2% +Inf% 1.25 ± 0% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
1.25 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 3.06 ± 1% +Inf% 3.08 ± 1% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
1.04 ± 0% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 3.04 ± 1% +Inf% 3.16 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± -1% +Inf% 3.05 ± 1% +Inf% 3.09 ± 1% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
1.32 ± 2% -21.5% 1.04 ± 1% -19.7% 1.06 ± 0% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
51.83 ± 1% +14.3% 59.25 ± 0% +11.8% 57.95 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
0.00 ± -1% +Inf% 16.05 ± 0% +Inf% 16.68 ± 0% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
53.16 ± 1% +13.6% 60.40 ± 0% +11.1% 59.09 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% -21.8% 0.97 ± 0% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.42 ± 2% -21.2% 1.12 ± 1% -20.6% 1.12 ± 0% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
6.46 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.29 ± 3% -18.9% 1.04 ± 1% -14.1% 1.10 ± 0% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± -1% +Inf% 1.14 ± 3% +Inf% 1.17 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
1.21 ± 1% -23.4% 0.93 ± 4% -22.5% 0.94 ± 0% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
1.23 ± 4% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
0.00 ± -1% +Inf% 4.14 ± 0% +Inf% 4.15 ± 1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
3.28 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.00 ± -1% +Inf% 9.08 ± 0% +Inf% 9.19 ± 1% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
3.54 ± 0% -20.8% 2.81 ± 1% -15.6% 2.99 ± 0% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.35 ± 1% -21.0% 1.86 ± 1% -15.1% 2.00 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
25.10 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.03 ± 1% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
1.42 ± 2% -20.7% 1.13 ± 1% -20.4% 1.13 ± 0% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
1.42 ± 2% -20.5% 1.13 ± 1% -20.2% 1.13 ± 0% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
2.27 ± 1% -10.6% 2.03 ± 0% -6.7% 2.12 ± 1% perf-profile.func.cycles-pp.___might_sleep
2.49 ± 0% -34.5% 1.63 ± 1% -16.7% 2.08 ± 0% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.51 ± 2% +15.4% 1.75 ± 1% +18.0% 1.79 ± 2% perf-profile.func.cycles-pp.__block_write_begin_int
1.79 ± 4% -16.8% 1.49 ± 1% -14.5% 1.53 ± 0% perf-profile.func.cycles-pp.__mark_inode_dirty
1.32 ± 0% -16.4% 1.10 ± 1% -9.5% 1.19 ± 0% perf-profile.func.cycles-pp.__radix_tree_lookup
1.08 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.__xfs_get_blocks
1.16 ± 0% -18.1% 0.95 ± 1% -15.8% 0.98 ± 1% perf-profile.func.cycles-pp._raw_spin_lock
3.96 ± 2% -18.4% 3.23 ± 0% -16.9% 3.29 ± 1% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.41 ± 3% -20.6% 1.12 ± 3% -21.1% 1.11 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.30 ± 2% -100.0% 0.00 ± -1% -100.0% 0.00 ± -1% perf-profile.func.cycles-pp.generic_perform_write
1.31 ± 2% -46.7% 0.70 ± 0% -43.8% 0.73 ± 0% perf-profile.func.cycles-pp.generic_write_end
18.43 ± 5% -9.1% 16.76 ± 0% -5.4% 17.44 ± 1% perf-profile.func.cycles-pp.intel_idle
0.00 ± -1% +Inf% 1.12 ± 1% +Inf% 0.90 ± 0% perf-profile.func.cycles-pp.iomap_write_actor
1.50 ± 1% -20.9% 1.19 ± 1% -17.0% 1.25 ± 2% perf-profile.func.cycles-pp.mark_buffer_dirty
0.00 ± -1% +Inf% 1.91 ± 1% +NaN% 0.00 ± -1% perf-profile.func.cycles-pp.mark_page_accessed
3.24 ± 0% -19.8% 2.60 ± 0% -18.1% 2.66 ± 3% perf-profile.func.cycles-pp.memset_erms
1.75 ± 2% -18.9% 1.42 ± 1% -7.3% 1.62 ± 4% perf-profile.func.cycles-pp.unlock_page
1.56 ± 2% +6.0% 1.65 ± 3% +11.8% 1.74 ± 1% perf-profile.func.cycles-pp.up_write
1.16 ± 1% -21.6% 0.91 ± 1% -17.7% 0.95 ± 1% perf-profile.func.cycles-pp.vfs_write
0.37 ± 2% +243.6% 1.26 ± 2% +272.3% 1.36 ± 2% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.41 ± 1% +198.4% 1.22 ± 2% +198.8% 1.23 ± 3% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.70 ± 5% +219.5% 2.24 ± 0% +227.9% 2.29 ± 0% perf-profile.func.cycles-pp.xfs_bmapi_read
1.05 ± 2% -15.6% 0.88 ± 3% -18.8% 0.85 ± 1% perf-profile.func.cycles-pp.xfs_file_write_iter
0.64 ± 1% +182.8% 1.81 ± 4% +182.0% 1.81 ± 0% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.00 ± -1% +Inf% 1.10 ± 3% +Inf% 1.21 ± 2% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
0.46 ± 4% +161.6% 1.20 ± 1% +171.7% 1.25 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
raw perf data:
"perf-profile.func.cycles-pp.intel_idle": 17.66,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.25,
"perf-profile.func.cycles-pp.memset_erms": 2.56,
"perf-profile.func.cycles-pp.xfs_bmapi_read": 2.28,
"perf-profile.func.cycles-pp.___might_sleep": 2.09,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 2.07,
"perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.79,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.74,
"perf-profile.func.cycles-pp.up_write": 1.72,
"perf-profile.func.cycles-pp.unlock_page": 1.69,
"perf-profile.func.cycles-pp.down_write": 1.59,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.54,
"perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.33,
"perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.23,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.21,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.2,
"perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.18,
"perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.17,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.15,
"perf-profile.func.cycles-pp.__might_sleep": 1.14,
"perf-profile.func.cycles-pp._raw_spin_lock": 0.97,
"perf-profile.func.cycles-pp.vfs_write": 0.94,
"perf-profile.func.cycles-pp.xfs_bmapi_delay": 0.93,
"perf-profile.func.cycles-pp.iomap_write_actor": 0.9,
"perf-profile.func.cycles-pp.pagecache_get_page": 0.89,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.86,
"perf-profile.func.cycles-pp.xfs_file_iomap_begin": 0.81,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.78,
"perf-profile.func.cycles-pp.iomap_apply": 0.77,
"perf-profile.func.cycles-pp.generic_write_end": 0.74,
"perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.72,
"perf-profile.func.cycles-pp.find_get_entry": 0.69,
"perf-profile.func.cycles-pp.__vfs_write": 0.67,
Best Regards,
Huang, Ying
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 21:16 ` Huang, Ying
@ 2016-08-11 21:40 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 21:40 UTC (permalink / raw)
To: Huang, Ying
Cc: Christoph Hellwig, Dave Chinner, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 2:16 PM, Huang, Ying <ying.huang@intel.com> wrote:
>
> Test result is as follow,
Thanks. No change.
> raw perf data:
I redid my munging, with the old (good) percentages in parenthesis:
intel_idle: 17.66 (16.88)
copy_user_enhanced_fast_string: 3.25 (3.94)
memset_erms: 2.56 (3.26)
xfs_bmapi_read: 2.28
___might_sleep: 2.09 (2.33)
__block_commit_write.isra.24: 2.07 (2.47)
xfs_iext_bno_to_ext: 1.79
__block_write_begin_int: 1.74 (1.56)
up_write: 1.72 (1.61)
unlock_page: 1.69 (1.69)
down_write: 1.59 (1.55)
__mark_inode_dirty: 1.54 (1.88)
xfs_bmap_search_extents: 1.33
xfs_iomap_write_delay: 1.23
mark_buffer_dirty: 1.21 (1.53)
__radix_tree_lookup: 1.2 (1.32)
xfs_bmap_search_multi_extents: 1.18
xfs_iomap_eof_want_preallocate.constprop.8: 1.17
entry_SYSCALL_64_fastpath: 1.15 (1.47)
__might_sleep: 1.14 (1.26)
_raw_spin_lock: 0.97 (1.17)
vfs_write: 0.94 (1.14)
xfs_bmapi_delay: 0.93
iomap_write_actor: 0.9
pagecache_get_page: 0.89 (1.03)
xfs_file_write_iter: 0.86 (1.03)
xfs_file_iomap_begin: 0.81
iov_iter_copy_from_user_atomic: 0.78 (0.87)
iomap_apply: 0.77
generic_write_end: 0.74 (1.36)
xfs_file_buffered_aio_write: 0.72 (0.84)
find_get_entry: 0.69 (0.79)
__vfs_write: 0.67 (0.87)
and it's worth noting a few things:
- most of the old percentages are bigger, but that's natural: the
load used to take longer, and the more efficient (old) case thus has
higher percent values. That doesn't mean it was slower, quite the
reverse.
- the main exception is intel_idle, so we do have more idle time.
But the *big* difference is all the functions that didn't use to show
up at all, and have no previous percent values:
xfs_bmapi_read: 2.28
xfs_iext_bno_to_ext: 1.79
xfs_bmap_search_extents: 1.33
xfs_iomap_write_delay: 1.23
xfs_bmap_search_multi_extents: 1.18
xfs_iomap_eof_want_preallocate.constprop.8: 1.17
xfs_bmapi_delay: 0.93
iomap_write_actor: 0.9
xfs_file_iomap_begin: 0.81
iomap_apply: 0.77
and I think this really can explain the regression. That all adds up
to 12% or so of "new overhead". Which is fairly close to the
regression.
(Ok, that is playing fast and loose with percentages, but I think it
might be "close enough" in practice).
So for some reason the new code doesn't do a lot more per-page
operations (the unlock_page() etc costs are fairly similar), but it
has a *much* more expensive footprint in the xfs_bmap/iomap
functions.
The old code had almost no XFS footprint at all, and didn't need to
look up block mappings etc, and worked almost entirely with the vfs
caches (so used the block numbers in the buffers etc).
And I know that DaveC often complains about vfs overhead, but the fact
is, the VFS layer is optimized to hell and back and does really really
well. Having to call down to filesystem routines (for block mappings
etc) is when performance goes down. I think this is an example of
that.
And hey, maybe I'm just misreading things, or reading too much into
those profiles. But it does look like that commit
68a9f5e7007c1afa2cf6830b690a90d0187c0684 ends up causing more xfs bmap
activity.
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 21:40 ` Linus Torvalds
@ 2016-08-11 22:08 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-11 22:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: Huang, Ying, Christoph Hellwig, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
I'll need to dig into what AIM7 actually does in this benchmark, which
isn't too easy as I'm on a business trip currently, but from the list
below it looks like it keeps overwriting and overwriting a file that's
already been allocated. This is a pretty stupid workload, but fortunately
it should also be able to be optimized by skipping the actual block
lookup, which is what the old buffer.c code happens to do.
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 20:35 ` Linus Torvalds
@ 2016-08-11 22:16 ` Al Viro
-1 siblings, 0 replies; 219+ messages in thread
From: Al Viro @ 2016-08-11 22:16 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 01:35:00PM -0700, Linus Torvalds wrote:
> The thing is, iov_iter_copy_from_user_atomic() doesn't itself enforce
> non-blocking user accesses, it depends on the caller blocking page
> faults.
Huh? The very first thing it does is
char *kaddr = kmap_atomic(page), *p = kaddr + offset;
If _that_ does not disable pagefaults, we are very deep in shit. AFAICS,
all instances of that sucker do disable those, including the non-highmem
default (it's
static inline void *kmap_atomic(struct page *page)
{
	preempt_disable();
	pagefault_disable();
	return page_address(page);
}
)
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 22:16 ` Al Viro
@ 2016-08-11 22:30 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-11 22:30 UTC (permalink / raw)
To: Al Viro
Cc: Christoph Hellwig, Huang, Ying, Dave Chinner, LKML, Bob Peterson,
Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 3:16 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Huh? The very first thing it does is
> char *kaddr = kmap_atomic(page), *p = kaddr + offset;
>
> If _that_ does not disable pagefaults, we are very deep in shit.
Right you are - it does, even with highmem disabled. Never mind, those
pagefault_disable/enable() calls are clearly bogus.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 16:55 ` Linus Torvalds
@ 2016-08-12 0:54 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 0:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 09:55:33AM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig <hch@lst.de> wrote:
> >
> > The one liner below (not tested yet) to simply remove it should fix that
> > up. I also noticed we have a spurious pagefault_disable/enable, I
> > need to dig into the history of that first, though.
>
> Hopefully the pagefault_disable/enable doesn't matter for this case.
>
> Can we get this one-liner tested with the kernel robot for comparison?
> I really think a messed-up LRU list could cause bad IO patterns, and
> end up keeping dirty pages around that should be streaming out to disk
> and re-used, so causing memory pressure etc for no good reason.
>
> I think the mapping->tree_lock issue that Dave sees is interesting
> too, but the kswapd activity (and the extra locking it causes) could
> also be a symptom of the same thing - memory pressure due to just
> putting pages in the active file that simply shouldn't be there.
So, removing mark_page_accessed() made the spinlock contention
*worse*.
36.51% [kernel] [k] _raw_spin_unlock_irqrestore
6.27% [kernel] [k] copy_user_generic_string
3.73% [kernel] [k] _raw_spin_unlock_irq
3.55% [kernel] [k] get_page_from_freelist
1.97% [kernel] [k] do_raw_spin_lock
1.72% [kernel] [k] __block_commit_write.isra.30
1.44% [kernel] [k] __wake_up_bit
1.41% [kernel] [k] shrink_page_list
1.24% [kernel] [k] __radix_tree_lookup
1.03% [kernel] [k] xfs_log_commit_cil
0.99% [kernel] [k] free_hot_cold_page
0.96% [kernel] [k] end_buffer_async_write
0.95% [kernel] [k] delay_tsc
0.94% [kernel] [k] ___might_sleep
0.93% [kernel] [k] kmem_cache_alloc
0.90% [kernel] [k] unlock_page
0.82% [kernel] [k] kmem_cache_free
0.74% [kernel] [k] up_write
0.72% [kernel] [k] node_dirty_ok
0.66% [kernel] [k] clear_page_dirty_for_io
0.65% [kernel] [k] __mark_inode_dirty
0.64% [kernel] [k] __block_write_begin_int
0.58% [kernel] [k] xfs_inode_item_format
0.57% [kernel] [k] __memset
0.57% [kernel] [k] cancel_dirty_page
0.56% [kernel] [k] down_write
0.54% [kernel] [k] page_evictable
0.53% [kernel] [k] page_mapping
0.52% [kernel] [k] __slab_free
0.49% [kernel] [k] xfs_do_writepage
0.49% [kernel] [k] drop_buffers
- 41.82% 41.82% [kernel] [k] _raw_spin_unlock_irqrestore
- 35.93% ret_from_fork
- kthread
- 29.76% kswapd
shrink_node
shrink_node_memcg.isra.75
shrink_inactive_list
shrink_page_list
__remove_mapping
_raw_spin_unlock_irqrestore
- 7.13% worker_thread
- process_one_work
- 4.40% wb_workfn
wb_writeback
__writeback_inodes_wb
writeback_sb_inodes
__writeback_single_inode
do_writepages
xfs_vm_writepages
write_cache_pages
xfs_do_writepage
- 2.71% xfs_end_io
xfs_destroy_ioend
end_buffer_async_write
end_page_writeback
test_clear_page_writeback
_raw_spin_unlock_irqrestore
+ 4.88% __libc_pwrite
The kswapd contention has jumped from 20% to 30% of the CPU time
in the profiles. I can't see how changing what LRU the page is on
will improve the contention problem - at its source it's an N:1
problem where the writing process and N kswapd worker threads are
all trying to access the same lock concurrently....
This is not the AIM7 problem we are looking for - what this test
demonstrates is a fundamental page cache scalability issue at the
design level - the mapping->tree_lock is a global serialisation
point....
I'm now going to test Christoph's theory that this is an "overwrite
doing lots of block mapping" issue. More on that to follow.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 1:16 ` Dave Chinner
@ 2016-08-12 1:26 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 1:26 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> On Wed, Aug 10, 2016 at 05:33:20PM -0700, Huang, Ying wrote:
> We need to know what is happening that is different - there's a good
> chance the mapping trace events will tell us. Huang, can you get
> a raw event trace from the test?
>
> I need to see these events:
>
> xfs_file*
> xfs_iomap*
> xfs_get_block*
>
lkp-folks, can I please get these traces run and sent to me? I don't
have the time or patience to try to get aim7 running on my machines
- the build is full of hard-coded paths and libraries that aren't
provided by modern distros (e.g. it requires a static libaio.a!) and
it fails at the configure stage complaining that:
configure: error: C compiler cannot create executables
Which is a complete load of BS.
Hence I can't make progress until I have some way of understanding
what the IO pattern is that is generating the profile being
measured. So far I'm unable to do that with any of the tools I've been
trying, hence I need the traces to work out what I'm missing...
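For anyone picking this up, the events Dave lists can be enabled through tracefs. A minimal sketch, assuming tracefs is mounted at the usual debugfs location (the function name is mine; the mount point is parameterized so it can be exercised against a scratch directory):

```shell
# Enable the xfs trace events requested above, then turn tracing on.
# Wildcards are accepted by set_event; writing several patterns in one
# redirection enables them all.
enable_xfs_tracing() {
	local tracefs=${1:-/sys/kernel/debug/tracing}

	{
		echo 'xfs:xfs_file*'
		echo 'xfs:xfs_iomap*'
		echo 'xfs:xfs_get_block*'
	} > "$tracefs/set_event"
	echo 1 > "$tracefs/tracing_on"
}

# After running the workload, collect the raw event trace with e.g.:
#   cat /sys/kernel/debug/tracing/trace > xfs-trace.txt
```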
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 0:54 ` Dave Chinner
@ 2016-08-12 2:23 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 2:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Fri, Aug 12, 2016 at 10:54:42AM +1000, Dave Chinner wrote:
> I'm now going to test Christoph's theory that this is an "overwrite
> doing lots of block mapping" issue. More on that to follow.
Ok, so going back to the profiles, I can say it's not an overwrite
issue, because there is delayed allocation showing up in the
profile. Lots of it. Which led me to think "maybe the benchmark is
just completely *dumb*".
And, as usual, that's the answer. Here's the reproducer:
# sudo mkfs.xfs -f -m crc=0 /dev/pmem1
# sudo mount -o noatime /dev/pmem1 /mnt/scratch
# sudo xfs_io -f -c "pwrite 0 512m -b 1" /mnt/scratch/fooey
And here's the profile:
4.50% [kernel] [k] xfs_bmapi_read
3.64% [kernel] [k] __block_commit_write.isra.30
3.55% [kernel] [k] __radix_tree_lookup
3.46% [kernel] [k] up_write
3.43% [kernel] [k] ___might_sleep
3.09% [kernel] [k] entry_SYSCALL_64_fastpath
3.01% [kernel] [k] xfs_iext_bno_to_ext
3.01% [kernel] [k] find_get_entry
2.98% [kernel] [k] down_write
2.71% [kernel] [k] mark_buffer_dirty
2.52% [kernel] [k] __mark_inode_dirty
2.38% [kernel] [k] unlock_page
2.14% [kernel] [k] xfs_break_layouts
2.07% [kernel] [k] xfs_bmapi_update_map
2.06% [kernel] [k] xfs_bmap_search_extents
2.04% [kernel] [k] xfs_iomap_write_delay
2.00% [kernel] [k] generic_write_checks
1.96% [kernel] [k] xfs_bmap_search_multi_extents
1.90% [kernel] [k] __xfs_bmbt_get_all
1.89% [kernel] [k] balance_dirty_pages_ratelimited
1.82% [kernel] [k] wait_for_stable_page
1.76% [kernel] [k] xfs_file_write_iter
1.68% [kernel] [k] xfs_iomap_eof_want_preallocate
1.68% [kernel] [k] xfs_bmapi_delay
1.67% [kernel] [k] iomap_write_actor
1.60% [kernel] [k] xfs_file_buffered_aio_write
1.56% [kernel] [k] __might_sleep
1.48% [kernel] [k] do_raw_spin_lock
1.44% [kernel] [k] generic_write_end
1.41% [kernel] [k] pagecache_get_page
1.38% [kernel] [k] xfs_bmapi_trim_map
1.21% [kernel] [k] __block_write_begin_int
1.17% [kernel] [k] vfs_write
1.17% [kernel] [k] xfs_file_iomap_begin
1.17% [kernel] [k] xfs_bmbt_get_startoff
1.14% [kernel] [k] iomap_apply
1.08% [kernel] [k] xfs_iunlock
1.08% [kernel] [k] iov_iter_copy_from_user_atomic
0.97% [kernel] [k] xfs_file_aio_write_checks
0.96% [kernel] [k] xfs_ilock
.....
Yeah, I'm doing a sequential write in *1 byte pwrite() calls*.
Ok, so the benchmark isn't /quite/ that abysmally stupid. It's
still, ah, extremely challenged:
	if (NBUFSIZE != 1024) {		/* enforce known block size */
		fprintf(stderr, "NBUFSIZE changed to %d\n", NBUFSIZE);
		exit(1);
	}
i.e. it's hard-coded to do all its "disk" IO in 1k block sizes.
Every read, every write, every file copy, etc are all done with a
1024 byte buffer. There are lots of loops that look like:
	while (--n) {
		write(fd, nbuf, sizeof nbuf);
	}
where n is the file size specified in the job file. Those loops are
what is generating the profile we see: repeated partial page writes
that extend the file.
IOWs, the benchmark is doing exactly what we document in the fstat()
man page *not to do*, as it will cause inefficient IO patterns:
The st_blksize field gives the "preferred" blocksize for
efficient filesystem I/O. (Writing to a file in smaller
chunks may cause an inefficient read-modify-rewrite.)
The smallest we ever set st_blksize to is PAGE_SIZE, so the
benchmark is running well known and documented (at least 10 years
ago) slow paths through the IO stack. I'm very tempted now simply
to say that the aim7 disk benchmark is showing its age and as such
the results are not actually reflective of what typical applications
will see.
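The st_blksize advice translates directly into code: size IO buffers from fstat() instead of hard-coding 1k. A minimal userspace sketch (the helper name and fallback value are mine, not from aim7):

```c
#include <assert.h>
#include <sys/stat.h>

/* Return the kernel's preferred IO size for 'path', falling back to
 * 4096 if stat() fails or reports something implausible. This is the
 * st_blksize value the fstat() man page says buffered IO should use. */
long preferred_io_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0 || st.st_blksize < 512)
		return 4096;
	return (long)st.st_blksize;
}
```

An aim7-style loop that sized its buffer with this helper would issue whole-page writes and stay off the partial-page slow path described above.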
Christoph, maybe there's something we can do to only trigger
speculative prealloc growth checks if the new file size crosses the end of
the currently allocated block at the EOF. That would chop out a fair
chunk of the xfs_bmapi_read calls being done in this workload. I'm
not sure how much effort we should spend optimising this slow path,
though....
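Dave's suggested gating condition is simple to state: only run the speculative prealloc check when the write's end crosses the block-aligned EOF. A hedged sketch of that predicate (names are mine, not proposed XFS code):

```c
#include <assert.h>
#include <stdbool.h>

/* Return true when a write ending at byte 'write_end' extends past the
 * last filesystem block already covering EOF - the only case where a
 * speculative preallocation check could change anything. */
static bool write_crosses_eof_block(long long isize, long long write_end,
				    long long blksize)
{
	long long eof_block_end = ((isize + blksize - 1) / blksize) * blksize;

	return write_end > eof_block_end;
}
```

With 1k appends to a file on 4k blocks, three out of every four writes would skip the check entirely, trimming the xfs_bmapi_read traffic seen in the profile.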
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 0:54 ` Dave Chinner
@ 2016-08-12 2:27 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-12 2:27 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> So, removing mark_page_accessed() made the spinlock contention
> *worse*.
>
> 36.51% [kernel] [k] _raw_spin_unlock_irqrestore
> 6.27% [kernel] [k] copy_user_generic_string
> 3.73% [kernel] [k] _raw_spin_unlock_irq
> 3.55% [kernel] [k] get_page_from_freelist
> 1.97% [kernel] [k] do_raw_spin_lock
> 1.72% [kernel] [k] __block_commit_write.isra.30
I don't recall having ever seen the mapping tree_lock as a contention
point before, but it's not like I've tried that load either. So it
might be a regression (going back long, I suspect), or just an unusual
load that nobody has traditionally tested much.
Single-threaded big file write one page at a time, was it?
The mapping tree lock has been around forever (it used to be a rw-lock
long long ago), but I wonder if we might have moved more stuff into it
(memory accounting comes to mind) causing much worse contention or
something.
Hmm. Just for fun, I googled "tree_lock contention". It's shown up
before - back in 2006, and it was you hitting it back then too.
There was an even older one (related to AIM7, interesting) which was
what caused the tree_lock to become a rw-lock back in 2005 (but then
Nick Piggin made it a spinlock again in 2008).
So it's not unheard of, but it certainly hasn't been a big issue.
That's the only obvious ones I found (apart from some btrfs issues,
but btrfs has a completely different notion of tree locking, so those
are not about the same thing).
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 2:23 ` Dave Chinner
@ 2016-08-12 2:32 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-12 2:32 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 7:23 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> And, as usual, that's the answer. Here's the reproducer:
>
> # sudo mkfs.xfs -f -m crc=0 /dev/pmem1
> # sudo mount -o noatime /dev/pmem1 /mnt/scratch
> # sudo xfs_io -f -c "pwrite 0 512m -b 1" /mnt/scratch/fooey
Heh. Ok, so 1 byte or 1kB at a time is pretty much the same thing, yeah.
And I guess that also explains why the system call entry showed up so
high in the profiles.
I'll take another look at tree_lock tomorrow, but it sounds like this
particular AIM regression is now effectively a solved (or at least
known) issue. Thanks,
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 2:23 ` Dave Chinner
@ 2016-08-12 2:52 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-12 2:52 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Christoph Hellwig, Huang, Ying, LKML,
Bob Peterson, Wu Fengguang, LKP
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote:
> Christoph, maybe there's something we can do to only trigger
> speculative prealloc growth checks if the new file size crosses the end of
> the currently allocated block at the EOF. That would chop out a fair
> chunk of the xfs_bmapi_read calls being done in this workload. I'm
> not sure how much effort we should spend optimising this slow path,
> though....
I can look at that, but indeed optimizing this path seems a bit
stupid. The other thing we could do is to optimize xfs_bmapi_read - even
if it shouldn't be called this often it seems like it should waste a whole
lot less CPU cycles.
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 2:52 ` Christoph Hellwig
@ 2016-08-12 3:20 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-12 3:20 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig <hch@lst.de> wrote:
>
> I can look at that, but indeed optimizing this patch seems a bit
> stupid.
The "write less than a full block to the end of the file" is actually
a reasonably common case.
It may not make for a great filesystem benchmark, but it also isn't
actually insane. People who do logging in user space do this all the
time, for example. And it is *not* stupid in that context. Not at all.
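The user-space logging case Linus mentions is easy to illustrate: a typical logger appends one short line per event, which is exactly the "small write extending EOF" pattern aim7 exercises. A minimal sketch (the helper and path are illustrative, not from any real logger):

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Append one log line; each call is a sub-block write that extends the
 * file - the pattern under discussion. Returns 0 on success. */
int log_line(const char *path, const char *line)
{
	size_t len = strlen(line);
	int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, line, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}
```

Every call lands in the "write less than a full block to the end of the file" path, so the per-append block-mapping cost matters well beyond synthetic benchmarks.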
It's never going to be the *main* thing you do (unless you're AIM),
but I do think it's worth fixing.
And AIM7 remains one of those odd benchmarks that people use. I'm not
quite sure why, but I really do think that the normal "append smaller
chunks to the end of the file" should absolutely not be dismissed as
stupid.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 2:27 ` Linus Torvalds
@ 2016-08-12 3:56 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 3:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > So, removing mark_page_accessed() made the spinlock contention
> > *worse*.
> >
> > 36.51% [kernel] [k] _raw_spin_unlock_irqrestore
> > 6.27% [kernel] [k] copy_user_generic_string
> > 3.73% [kernel] [k] _raw_spin_unlock_irq
> > 3.55% [kernel] [k] get_page_from_freelist
> > 1.97% [kernel] [k] do_raw_spin_lock
> > 1.72% [kernel] [k] __block_commit_write.isra.30
>
> I don't recall having ever seen the mapping tree_lock as a contention
> point before, but it's not like I've tried that load either. So it
> might be a regression (going back long, I suspect), or just an unusual
> load that nobody has traditionally tested much.
>
> Single-threaded big file write one page at a time, was it?
Yup. On a 4 node NUMA system.
So when memory reclaim kicks in, there's a write process, a
writeback kworker and 4 kswapd kthreads all banging on the
mapping->tree_lock. There's an awful lot of concurrency happening
behind the scenes of that single user process writing to a file...
> The mapping tree lock has been around forever (it used to be a rw-lock
> long long ago), but I wonder if we might have moved more stuff into it
> (memory accounting comes to mind) causing much worse contention or
> something.
Yeah, there is now a crapton of accounting updated in
account_page_dirtied under the tree lock - memcg, writeback, node,
zone, task, etc. And there's a *lot* of code that
__delete_from_page_cache() can execute under the tree lock.
> Hmm. Just for fun, I googled "tree_lock contention". It's shown up
> before - back in 2006, and it was you hitting it back then too.
Of course! That, however, would have been when I was playing with
real big SGI machines, not a tiddly little 16p VM.... :P
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 3:20 ` Linus Torvalds
@ 2016-08-12 4:16 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 4:16 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig <hch@lst.de> wrote:
> >
> > I can look at that, but indeed optimizing this patch seems a bit
> > stupid.
>
> The "write less than a full block to the end of the file" is actually
> a reasonably common case.
>
> It may not make for a great filesystem benchmark, but it also isn't
> actually insane. People who do logging in user space do this all the
> time, for example. And it is *not* stupid in that context. Not at all.
>
> It's never going to be the *main* thing you do (unless you're AIM),
> but I do think it's worth fixing.
>
> And AIM7 remains one of those odd benchmarks that people use. I'm not
> quite sure why, but I really do think that the normal "append smaller
> chunks to the end of the file" should absolutely not be dismissed as
> stupid.
Yes, I agree that there are reasons for making sub-block IO work
well (which is why I'm looking to try to fix it), but that doesn't
mean the benchmark is sane. aim7 is, technically, a "scalability
benchmark". As such, expecting tiny writes to scale to moving large
amounts of data is the "stupid" thing it does. If you scale up the
amount of data you need to move, then you need to scale up the
efficiency of moving that data. Case in point - writing 1GB of data
in 1kb chunks to XFS on a local /dev/pmem1 runs at ~600MB/s, whilst
moving it in 1MB chunks runs at 1.9GB/s. aim7 doesn't actually
stress the scalability of the hardware, because inefficiencies in
its implementation prevent it from getting to those limits.
That's what aim7 misses - as speeds and capabilities go up, the way
code needs to be written to make efficient use of the hardware also
changes. e.g. High throughput logging solutions don't write every
incoming log event immediately - they aggregate them into larger
buffers and then write those, knowing that they can support much
higher logging rates by doing this....
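The aggregation strategy described above can be sketched as a user-space buffer that batches events and issues one large write() instead of many tiny ones (the names and the 1MB flush size are illustrative assumptions, not anything from the thread):

```c
#include <string.h>
#include <unistd.h>

#define LOG_BUF_SIZE (1024 * 1024)	/* flush in ~1MB chunks */

struct log_buf {
	char	data[LOG_BUF_SIZE];
	size_t	used;
	int	fd;
};

/* Queue an event; only issue a write() once the buffer would
 * overflow.  Trades a little latency for far fewer, and much
 * larger, syscalls and block mapping calls. */
static int log_event(struct log_buf *lb, const char *ev, size_t len)
{
	if (lb->used + len > LOG_BUF_SIZE) {
		if (write(lb->fd, lb->data, lb->used) < 0)
			return -1;
		lb->used = 0;
	}
	memcpy(lb->data + lb->used, ev, len);
	lb->used += len;
	return 0;
}
```

With this shape the filesystem sees 1MB extending writes rather than per-event sub-block writes, which is the difference behind the ~600MB/s vs 1.9GB/s numbers quoted above.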
That's why running aim7 as your "does the filesystem scale"
benchmark is somewhat irrelevant to scaling applications on high
performance systems these days - users with fast storage will be
expecting to see that 1.9GB/s throughput from their app, not
600MB/s....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 4:16 ` Dave Chinner
@ 2016-08-12 5:02 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-12 5:02 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> That's why running aim7 as your "does the filesystem scale"
> benchmark is somewhat irrelevant to scaling applications on high
> performance systems these days
Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
good benchmark. It's just that I think what it happens to test is
still meaningful, even if it's not necessarily in any way some kind of
"high performance IO" thing.
There are probably lots of other more important loads, I just reacted
to Christoph seeming to argue that the AIM7 behavior was _so_ broken
that we shouldn't even care. It's not _that_ broken, it's just not
about high-performance IO streaming, it happens to test something else
entirely.
We've actually had AIM7 occasionally find other issues just because
some of the things it does is so odd.
Iirc it has a fork test that doesn't execve (very unusual - you'd
generally use threads if you care about performance), and that has
shown issues with our anon_vma scaling before anything else did.
I also seem to remember some odd pty open/close/ioctl subtest that
showed problems with some of the last remnants of the old BKL (the
test probably actually tested something else, but ended up choking on
the odd tty things).
So in general, I'm not a fan of AIM as a benchmark, but it actually
_has_ found lots of real issues because it tends to do things that
kernel developers think are insane.
And let's face it, user programs doing odd and not very efficient
things should be considered par for the course. We're never going to
get rid of insane user programs, so we might as well fix the
performance problems even when we say "that's just stupid".
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 5:02 ` Linus Torvalds
@ 2016-08-12 6:04 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 6:04 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, Wu Fengguang, LKP
On Thu, Aug 11, 2016 at 10:02:39PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > That's why running aim7 as your "does the filesystem scale"
> > benchmark is somewhat irrelevant to scaling applications on high
> > performance systems these days
>
> Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
> good benchmark. It's just that I think what it happens to test is
> still meaningful, even if it's not necessarily in any way some kind of
> "high performance IO" thing.
>
> There are probably lots of other more important loads, I just reacted
> to Christoph seeming to argue that the AIM7 behavior was _so_ broken
> that we shouldn't even care. It's not _that_ broken, it's just not
> about high-performance IO streaming, it happens to test something else
> entirely.
Right - I admit that my first reaction once I worked out what the
problem was is exactly what Christoph said. But after looking at it
further, regardless of how crappy the benchmark is, it is a
regression....
> We've actually had AIM7 occasionally find other issues just because
> some of the things it does is so odd.
*nod*
> And let's face it, user programs doing odd and not very efficient
> things should be considered par for the course. We're never going to
> get rid of insane user programs, so we might as well fix the
> performance problems even when we say "that's just stupid".
Yup, that's what I'm doing :/
It looks like the underlying cause is that the old block mapping
code only fed filesystem block size lengths into
xfs_iomap_write_delay(), whereas the iomap code is feeding the
(capped) write() length into it. Hence xfs_iomap_write_delay() is
not detecting the need for speculative preallocation correctly on
these sub-block writes. The profile looks better for the 1 byte
write - I've combined the old and new for comparison below:
4.22% __block_commit_write.isra.30
3.80% up_write
3.74% xfs_bmapi_read
3.65% ___might_sleep
3.55% down_write
3.20% entry_SYSCALL_64_fastpath
3.02% mark_buffer_dirty
2.78% __mark_inode_dirty
2.78% unlock_page
2.59% xfs_break_layouts
2.47% xfs_iext_bno_to_ext
2.38% __block_write_begin_int
2.22% find_get_entry
2.17% xfs_file_write_iter
2.16% __radix_tree_lookup
2.13% iomap_write_actor
2.04% xfs_bmap_search_extents
1.98% __might_sleep
1.84% xfs_file_buffered_aio_write
1.76% iomap_apply
1.71% generic_write_end
1.68% vfs_write
1.66% iov_iter_copy_from_user_atomic
1.56% xfs_bmap_search_multi_extents
1.55% __vfs_write
1.52% pagecache_get_page
1.46% xfs_bmapi_update_map
1.33% xfs_iunlock
1.32% xfs_iomap_write_delay
1.29% xfs_file_iomap_begin
1.29% do_raw_spin_lock
1.29% __xfs_bmbt_get_all
1.21% iov_iter_advance
1.20% xfs_file_aio_write_checks
1.14% xfs_ilock
1.11% balance_dirty_pages_ratelimited
1.10% xfs_bmapi_trim_map
1.06% xfs_iomap_eof_want_preallocate
1.00% xfs_bmapi_delay
Comparison of common functions:
Old New function
4.50% 3.74% xfs_bmapi_read
3.64% 4.22% __block_commit_write.isra.30
3.55% 2.16% __radix_tree_lookup
3.46% 3.80% up_write
3.43% 3.65% ___might_sleep
3.09% 3.20% entry_SYSCALL_64_fastpath
3.01% 2.47% xfs_iext_bno_to_ext
3.01% 2.22% find_get_entry
2.98% 3.55% down_write
2.71% 3.02% mark_buffer_dirty
2.52% 2.78% __mark_inode_dirty
2.38% 2.78% unlock_page
2.14% 2.59% xfs_break_layouts
2.07% 1.46% xfs_bmapi_update_map
2.06% 2.04% xfs_bmap_search_extents
2.04% 1.32% xfs_iomap_write_delay
2.00% 0.38% generic_write_checks
1.96% 1.56% xfs_bmap_search_multi_extents
1.90% 1.29% __xfs_bmbt_get_all
1.89% 1.11% balance_dirty_pages_ratelimited
1.82% 0.28% wait_for_stable_page
1.76% 2.17% xfs_file_write_iter
1.68% 1.06% xfs_iomap_eof_want_preallocate
1.68% 1.00% xfs_bmapi_delay
1.67% 2.13% iomap_write_actor
1.60% 1.84% xfs_file_buffered_aio_write
1.56% 1.98% __might_sleep
1.48% 1.29% do_raw_spin_lock
1.44% 1.71% generic_write_end
1.41% 1.52% pagecache_get_page
1.38% 1.10% xfs_bmapi_trim_map
1.21% 2.38% __block_write_begin_int
1.17% 1.68% vfs_write
1.17% 1.29% xfs_file_iomap_begin
This shows more time spent in functions above xfs_file_iomap_begin
(which does the block mapping and allocation) and less time spent
below it. i.e. the generic functions are showing higher CPU usage
and the xfs* functions are showing significantly reduced CPU usage.
This implies that we're doing a lot less block mapping work....
lkp-folk: the patch I've just tested is attached below - can you
feed that through your test and see if it fixes the regression?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
xfs: correct speculative prealloc on extending subpage writes
From: Dave Chinner <dchinner@redhat.com>
When a write occurs that extends the file, we check to see if we
need to preallocate more delalloc space. When we do sub-page
writes, the new iomap write path passes a sub-block write length to
the block mapping code. xfs_iomap_write_delay does not expect to be
passed byte counts smaller than one filesystem block, so it ends up
checking the BMBT for blocks beyond EOF on every write,
regardless of whether we need to or not. This causes a regression in
aim7 benchmarks as it is full of sub-page writes.
To fix this, clamp the minimum length of a mapping request coming
through xfs_file_iomap_begin() to one filesystem block. This ensures
we are passing the same length to xfs_iomap_write_delay() as we did
when calling through the get_blocks path. This substantially reduces
the amount of lookup load being placed on the BMBT during sub-block
write loads.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_iomap.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index cf697eb..486b75b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1036,10 +1036,15 @@ xfs_file_iomap_begin(
* number pulled out of thin air as a best guess for initial
* testing.
*
+ * xfs_iomap_write_delay() only works if the length passed in is
+ * >= one filesystem block. Hence we need to clamp the minimum
+ * length we map, too.
+ *
* Note that the values needs to be less than 32-bits wide until
* the lower level functions are updated.
*/
length = min_t(loff_t, length, 1024 * PAGE_SIZE);
+ length = max_t(loff_t, length, (1 << inode->i_blkbits));
if (xfs_get_extsz_hint(ip)) {
/*
* xfs_iomap_write_direct() expects the shared lock. It
^ permalink raw reply related [flat|nested] 219+ messages in thread
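Restated outside the kernel, the length handling in Dave's patch above is a simple clamp: cap the mapping request at 1024 pages, then floor it at one filesystem block. A self-contained sketch (assuming a 4KB PAGE_SIZE; `clamp_map_len` is a hypothetical helper, not a kernel function):

```c
/* Clamp a requested mapping length the way xfs_file_iomap_begin()
 * does after the fix: no more than 1024 pages, and never less than
 * one filesystem block (1 << blkbits). */
static long clamp_map_len(long length, int blkbits)
{
	long max_len = 1024L * 4096;	/* 1024 * PAGE_SIZE cap */
	long min_len = 1L << blkbits;	/* one fs block floor */

	if (length > max_len)
		length = max_len;
	if (length < min_len)
		length = min_len;
	return length;
}
```

The floor is what keeps a 1-byte extending write from feeding a sub-block length into xfs_iomap_write_delay() and defeating the speculative preallocation check.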
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 6:04 ` Dave Chinner
@ 2016-08-12 6:29 ` Ye Xiaolong
-1 siblings, 0 replies; 219+ messages in thread
From: Ye Xiaolong @ 2016-08-12 6:29 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On 08/12, Dave Chinner wrote:
>On Thu, Aug 11, 2016 at 10:02:39PM -0700, Linus Torvalds wrote:
>> On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner <david@fromorbit.com> wrote:
>> >
>> > That's why running aim7 as your "does the filesystem scale"
>> > benchmark is somewhat irrelevant to scaling applications on high
>> > performance systems these days
>>
>> Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
>> good benchmark. It's just that I think what it happens to test is
>> still meaningful, even if it's not necessarily in any way some kind of
>> "high performance IO" thing.
>>
>> There are probably lots of other more important loads, I just reacted
>> to Christoph seeming to argue that the AIM7 behavior was _so_ broken
>> that we shouldn't even care. It's not _that_ broken, it's just not
>> about high-performance IO streaming, it happens to test something else
>> entirely.
>
>Right - I admit that my first reaction once I worked out what the
>problem was is exactly what Christoph said. But after looking at it
>further, regardless of how crappy the benchmark is, it is a
>regression....
>
>> We've actually had AIM7 occasionally find other issues just because
>> some of the things it does is so odd.
>
>*nod*
>
>> And let's face it, user programs doing odd and not very efficient
>> things should be considered par for the course. We're never going to
>> get rid of insane user programs, so we might as well fix the
>> performance problems even when we say "that's just stupid".
>
>Yup, that's what I'm doing :/
>
>It looks like the underlying cause is that the old block mapping
>code only fed filesystem block size lengths into
>xfs_iomap_write_delay(), whereas the iomap code is feeding the
>(capped) write() length into it. Hence xfs_iomap_write_delay() is
>not detecting the need for speculative preallocation correctly on
>these sub-block writes. The profile looks better for the 1 byte
>write - I've combined the old and new for comparison below:
>
> 4.22% __block_commit_write.isra.30
> 3.80% up_write
> 3.74% xfs_bmapi_read
> 3.65% ___might_sleep
> 3.55% down_write
> 3.20% entry_SYSCALL_64_fastpath
> 3.02% mark_buffer_dirty
> 2.78% __mark_inode_dirty
> 2.78% unlock_page
> 2.59% xfs_break_layouts
> 2.47% xfs_iext_bno_to_ext
> 2.38% __block_write_begin_int
> 2.22% find_get_entry
> 2.17% xfs_file_write_iter
> 2.16% __radix_tree_lookup
> 2.13% iomap_write_actor
> 2.04% xfs_bmap_search_extents
> 1.98% __might_sleep
> 1.84% xfs_file_buffered_aio_write
> 1.76% iomap_apply
> 1.71% generic_write_end
> 1.68% vfs_write
> 1.66% iov_iter_copy_from_user_atomic
> 1.56% xfs_bmap_search_multi_extents
> 1.55% __vfs_write
> 1.52% pagecache_get_page
> 1.46% xfs_bmapi_update_map
> 1.33% xfs_iunlock
> 1.32% xfs_iomap_write_delay
> 1.29% xfs_file_iomap_begin
> 1.29% do_raw_spin_lock
> 1.29% __xfs_bmbt_get_all
> 1.21% iov_iter_advance
> 1.20% xfs_file_aio_write_checks
> 1.14% xfs_ilock
> 1.11% balance_dirty_pages_ratelimited
> 1.10% xfs_bmapi_trim_map
> 1.06% xfs_iomap_eof_want_preallocate
> 1.00% xfs_bmapi_delay
>
>Comparison of common functions:
>
>Old New function
>4.50% 3.74% xfs_bmapi_read
>3.64% 4.22% __block_commit_write.isra.30
>3.55% 2.16% __radix_tree_lookup
>3.46% 3.80% up_write
>3.43% 3.65% ___might_sleep
>3.09% 3.20% entry_SYSCALL_64_fastpath
>3.01% 2.47% xfs_iext_bno_to_ext
>3.01% 2.22% find_get_entry
>2.98% 3.55% down_write
>2.71% 3.02% mark_buffer_dirty
>2.52% 2.78% __mark_inode_dirty
>2.38% 2.78% unlock_page
>2.14% 2.59% xfs_break_layouts
>2.07% 1.46% xfs_bmapi_update_map
>2.06% 2.04% xfs_bmap_search_extents
>2.04% 1.32% xfs_iomap_write_delay
>2.00% 0.38% generic_write_checks
>1.96% 1.56% xfs_bmap_search_multi_extents
>1.90% 1.29% __xfs_bmbt_get_all
>1.89% 1.11% balance_dirty_pages_ratelimited
>1.82% 0.28% wait_for_stable_page
>1.76% 2.17% xfs_file_write_iter
>1.68% 1.06% xfs_iomap_eof_want_preallocate
>1.68% 1.00% xfs_bmapi_delay
>1.67% 2.13% iomap_write_actor
>1.60% 1.84% xfs_file_buffered_aio_write
>1.56% 1.98% __might_sleep
>1.48% 1.29% do_raw_spin_lock
>1.44% 1.71% generic_write_end
>1.41% 1.52% pagecache_get_page
>1.38% 1.10% xfs_bmapi_trim_map
>1.21% 2.38% __block_write_begin_int
>1.17% 1.68% vfs_write
>1.17% 1.29% xfs_file_iomap_begin
>
>This shows more time spent in functions above xfs_file_iomap_begin
>(which does the block mapping and allocation) and less time spent
>below it. i.e. the generic functions are showing higher CPU usage
>and the xfs* functions are showing significantly reduced CPU usage.
>This implies that we're doing a lot less block mapping work....
>
>lkp-folk: the patch I've just tested is attached below - can you
>feed that through your test and see if it fixes the regression?
>
Hi, Dave

I am verifying your fix patch in the lkp environment now, and will send
the result once I get it.

Thanks,
Xiaolong
>Cheers,
>
>Dave.
>--
>Dave Chinner
>david@fromorbit.com
>
>xfs: correct speculative prealloc on extending subpage writes
>
>From: Dave Chinner <dchinner@redhat.com>
>
>When a write occurs that extends the file, we check to see if we
>need to preallocate more delalloc space. When we do sub-page
>writes, the new iomap write path passes a sub-block write length to
>the block mapping code. xfs_iomap_write_delay does not expect to be
>passed byte counts smaller than one filesystem block, so it ends up
>checking the BMBT for blocks beyond EOF on every write,
>regardless of whether we need to or not. This causes a regression in
>aim7 benchmarks as it is full of sub-page writes.
>
>To fix this, clamp the minimum length of a mapping request coming
>through xfs_file_iomap_begin() to one filesystem block. This ensures
>we are passing the same length to xfs_iomap_write_delay() as we did
>when calling through the get_blocks path. This substantially reduces
>the amount of lookup load being placed on the BMBT during sub-block
>write loads.
>
>Signed-off-by: Dave Chinner <dchinner@redhat.com>
>---
> fs/xfs/xfs_iomap.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
>index cf697eb..486b75b 100644
>--- a/fs/xfs/xfs_iomap.c
>+++ b/fs/xfs/xfs_iomap.c
>@@ -1036,10 +1036,15 @@ xfs_file_iomap_begin(
> * number pulled out of thin air as a best guess for initial
> * testing.
> *
>+ * xfs_iomap_write_delay() only works if the length passed in is
>+ * >= one filesystem block. Hence we need to clamp the minimum
>+ * length we map, too.
>+ *
> * Note that the values needs to be less than 32-bits wide until
> * the lower level functions are updated.
> */
> length = min_t(loff_t, length, 1024 * PAGE_SIZE);
>+ length = max_t(loff_t, length, (1 << inode->i_blkbits));
> if (xfs_get_extsz_hint(ip)) {
> /*
> * xfs_iomap_write_direct() expects the shared lock. It
>_______________________________________________
>LKP mailing list
>LKP@lists.01.org
>https://lists.01.org/mailman/listinfo/lkp
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
From: Ye Xiaolong @ 2016-08-12 6:29 UTC
To: lkp
On 08/12, Dave Chinner wrote:
>On Thu, Aug 11, 2016 at 10:02:39PM -0700, Linus Torvalds wrote:
>> On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner <david@fromorbit.com> wrote:
>> >
>> > That's why running aim7 as your "does the filesystem scale"
>> > benchmark is somewhat irrelevant to scaling applications on high
>> > performance systems these days
>>
>> Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
>> good benchmark. It's just that I think what it happens to test is
>> still meaningful, even if it's not necessarily in any way some kind of
>> "high performance IO" thing.
>>
>> There are probably lots of other more important loads, I just reacted
>> to Christoph seeming to argue that the AIM7 behavior was _so_ broken
>> that we shouldn't even care. It's not _that_ broken, it's just not
>> about high-performance IO streaming, it happens to test something else
>> entirely.
>
>Right - I admit that my first reaction once I worked out what the
>problem was is exactly what Christoph said. But after looking at it
>further, regardless of how crappy the benchmark is, it is a
>regression....
>
>> We've actually had AIM7 occasionally find other issues just because
>> some of the things it does is so odd.
>
>*nod*
>
>> And let's face it, user programs doing odd and not very efficient
>> things should be considered par for the course. We're never going to
>> get rid of insane user programs, so we might as well fix the
>> performance problems even when we say "that's just stupid".
>
>Yup, that's what I'm doing :/
>
>It looks like the underlying cause is that the old block mapping
>code only fed filesystem block size lengths into
>xfs_iomap_write_delay(), whereas the iomap code is feeding the
>(capped) write() length into it. Hence xfs_iomap_write_delay() is
>not detecting the need for speculative preallocation correctly on
>these sub-block writes. The profile looks better for the 1 byte
>write - I've combined the old and new for comparison below:
>
> 4.22% __block_commit_write.isra.30
> 3.80% up_write
> 3.74% xfs_bmapi_read
> 3.65% ___might_sleep
> 3.55% down_write
> 3.20% entry_SYSCALL_64_fastpath
> 3.02% mark_buffer_dirty
> 2.78% __mark_inode_dirty
> 2.78% unlock_page
> 2.59% xfs_break_layouts
> 2.47% xfs_iext_bno_to_ext
> 2.38% __block_write_begin_int
> 2.22% find_get_entry
> 2.17% xfs_file_write_iter
> 2.16% __radix_tree_lookup
> 2.13% iomap_write_actor
> 2.04% xfs_bmap_search_extents
> 1.98% __might_sleep
> 1.84% xfs_file_buffered_aio_write
> 1.76% iomap_apply
> 1.71% generic_write_end
> 1.68% vfs_write
> 1.66% iov_iter_copy_from_user_atomic
> 1.56% xfs_bmap_search_multi_extents
> 1.55% __vfs_write
> 1.52% pagecache_get_page
> 1.46% xfs_bmapi_update_map
> 1.33% xfs_iunlock
> 1.32% xfs_iomap_write_delay
> 1.29% xfs_file_iomap_begin
> 1.29% do_raw_spin_lock
> 1.29% __xfs_bmbt_get_all
> 1.21% iov_iter_advance
> 1.20% xfs_file_aio_write_checks
> 1.14% xfs_ilock
> 1.11% balance_dirty_pages_ratelimited
> 1.10% xfs_bmapi_trim_map
> 1.06% xfs_iomap_eof_want_preallocate
> 1.00% xfs_bmapi_delay
>
>Comparison of common functions:
>
>Old New function
>4.50% 3.74% xfs_bmapi_read
>3.64% 4.22% __block_commit_write.isra.30
>3.55% 2.16% __radix_tree_lookup
>3.46% 3.80% up_write
>3.43% 3.65% ___might_sleep
>3.09% 3.20% entry_SYSCALL_64_fastpath
>3.01% 2.47% xfs_iext_bno_to_ext
>3.01% 2.22% find_get_entry
>2.98% 3.55% down_write
>2.71% 3.02% mark_buffer_dirty
>2.52% 2.78% __mark_inode_dirty
>2.38% 2.78% unlock_page
>2.14% 2.59% xfs_break_layouts
>2.07% 1.46% xfs_bmapi_update_map
>2.06% 2.04% xfs_bmap_search_extents
>2.04% 1.32% xfs_iomap_write_delay
>2.00% 0.38% generic_write_checks
>1.96% 1.56% xfs_bmap_search_multi_extents
>1.90% 1.29% __xfs_bmbt_get_all
>1.89% 1.11% balance_dirty_pages_ratelimited
>1.82% 0.28% wait_for_stable_page
>1.76% 2.17% xfs_file_write_iter
>1.68% 1.06% xfs_iomap_eof_want_preallocate
>1.68% 1.00% xfs_bmapi_delay
>1.67% 2.13% iomap_write_actor
>1.60% 1.84% xfs_file_buffered_aio_write
>1.56% 1.98% __might_sleep
>1.48% 1.29% do_raw_spin_lock
>1.44% 1.71% generic_write_end
>1.41% 1.52% pagecache_get_page
>1.38% 1.10% xfs_bmapi_trim_map
>1.21% 2.38% __block_write_begin_int
>1.17% 1.68% vfs_write
>1.17% 1.29% xfs_file_iomap_begin
>
>This shows more time spent in functions above xfs_file_iomap_begin
>(which does the block mapping and allocation) and less time spent
>below it. i.e. the generic functions are showing higher CPU usage
>and the xfs* functions are showing significantly reduced CPU usage.
>This implies that we're doing a lot less block mapping work....
>
>lkp-folk: the patch I've just tested is attached below - can you
>feed that through your test and see if it fixes the regression?
>
Hi, Dave

I am verifying your fix patch in the lkp environment now, and will send
the result once I get it.

Thanks,
Xiaolong
>Cheers,
>
>Dave.
>--
>Dave Chinner
>david@fromorbit.com
>
>xfs: correct speculative prealloc on extending subpage writes
>
>From: Dave Chinner <dchinner@redhat.com>
>
>When a write occurs that extends the file, we check to see if we
>need to preallocate more delalloc space. When we do sub-page
>writes, the new iomap write path passes a sub-block write length to
>the block mapping code. xfs_iomap_write_delay does not expect to be
>passed byte counts smaller than one filesystem block, so it ends up
>checking the BMBT for blocks beyond EOF on every write,
>regardless of whether we need to or not. This causes a regression in
>aim7 benchmarks as it is full of sub-page writes.
>
>To fix this, clamp the minimum length of a mapping request coming
>through xfs_file_iomap_begin() to one filesystem block. This ensures
>we are passing the same length to xfs_iomap_write_delay() as we did
>when calling through the get_blocks path. This substantially reduces
>the amount of lookup load being placed on the BMBT during sub-block
>write loads.
>
>Signed-off-by: Dave Chinner <dchinner@redhat.com>
>---
> fs/xfs/xfs_iomap.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
>index cf697eb..486b75b 100644
>--- a/fs/xfs/xfs_iomap.c
>+++ b/fs/xfs/xfs_iomap.c
>@@ -1036,10 +1036,15 @@ xfs_file_iomap_begin(
> * number pulled out of thin air as a best guess for initial
> * testing.
> *
>+ * xfs_iomap_write_delay() only works if the length passed in is
>+ * >= one filesystem block. Hence we need to clamp the minimum
>+ * length we map, too.
>+ *
> * Note that the values needs to be less than 32-bits wide until
> * the lower level functions are updated.
> */
> length = min_t(loff_t, length, 1024 * PAGE_SIZE);
>+ length = max_t(loff_t, length, (1 << inode->i_blkbits));
> if (xfs_get_extsz_hint(ip)) {
> /*
> * xfs_iomap_write_direct() expects the shared lock. It
>_______________________________________________
>LKP mailing list
>LKP@lists.01.org
>https://lists.01.org/mailman/listinfo/lkp
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
From: Ye Xiaolong @ 2016-08-12 8:51 UTC
To: Dave Chinner
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On 08/12, Ye Xiaolong wrote:
>On 08/12, Dave Chinner wrote:
[snip]
>>lkp-folk: the patch I've just tested it attached below - can you
>>feed that through your test and see if it fixes the regression?
>>
>
>Hi, Dave
>
>I am verifying your fix patch in lkp environment now, will send the
>result once I get it.
>
Here is the test result.
commit 636b594f38278080db93f2d67d11d31700924f5d
Author: Dave Chinner <dchinner@redhat.com>
AuthorDate: Fri Aug 12 14:23:44 2016 +0800
Commit: Xiaolong Ye <xiaolong.ye@intel.com>
CommitDate: Fri Aug 12 14:23:44 2016 +0800
When a write occurs that extends the file, we check to see if we
need to preallocate more delalloc space. When we do sub-page
writes, the new iomap write path passes a sub-block write length to
the block mapping code. xfs_iomap_write_delay does not expect to be
passed byte counts smaller than one filesystem block, so it ends up
checking the BMBT for blocks beyond EOF on every write,
regardless of whether we need to or not. This causes a regression in
aim7 benchmarks as it is full of sub-page writes.
To fix this, clamp the minimum length of a mapping request coming
through xfs_file_iomap_begin() to one filesystem block. This ensures
we are passing the same length to xfs_iomap_write_delay() as we did
when calling through the get_blocks path. This substantially reduces
the amount of lookup load being placed on the BMBT during sub-block
write loads.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_iomap.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 620fc91..5eaace0 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1015,10 +1015,15 @@ xfs_file_iomap_begin(
* number pulled out of thin air as a best guess for initial
* testing.
*
+ * xfs_iomap_write_delay() only works if the length passed in is
+ * >= one filesystem block. Hence we need to clamp the minimum
+ * length we map, too.
+ *
* Note that the values needs to be less than 32-bits wide until
* the lower level functions are updated.
*/
length = min_t(loff_t, length, 1024 * PAGE_SIZE);
+ length = max_t(loff_t, length, (1 << inode->i_blkbits));
if (xfs_get_extsz_hint(ip)) {
/*
* xfs_iomap_write_direct() expects the shared lock. It
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 636b594f38278080db93f2d67d
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
484435 ± 0% -13.3% 420004 ± 0% -14.0% 416777 ± 0% aim7.jobs-per-min
6491 ± 3% +30.8% 8491 ± 0% +35.7% 8806 ± 1% aim7.time.involuntary_context_switches
376 ± 0% +28.4% 484 ± 0% +29.6% 488 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% -19.7% 345708 ± 0% aim7.time.voluntary_context_switches
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% aim7.time.elapsed_time.max
155184 ± 1% -2.1% 151864 ± 1% -2.7% 150937 ± 1% aim7.time.minor_page_faults
0 ± 0% +Inf% 215412 ±141% +Inf% 334416 ± 75% latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
24772 ± 0% -28.6% 17675 ± 0% -26.7% 18149 ± 2% vmstat.system.cs
26816 ± 8% +10.2% 29542 ± 1% +13.3% 30370 ± 1% interrupts.CAL:Function_call_interrupts
125122 ± 10% -10.7% 111758 ± 12% -11.1% 111223 ± 11% softirqs.SCHED
3906 ± 0% +28.8% 5032 ± 2% +29.1% 5045 ± 1% proc-vmstat.nr_active_file
3444 ± 5% +41.8% 4884 ± 0% +25.0% 4304 ± 11% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% +40.0% 5728 ± 15% proc-vmstat.pgactivate
15627 ± 0% +27.7% 19956 ± 1% +27.4% 19902 ± 0% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% +11.2% 17900 ± 1% meminfo.AnonHugePages
13777 ± 5% +43.1% 19709 ± 0% +25.0% 17220 ± 11% meminfo.Shmem
1724300 ± 27% -40.5% 1025538 ± 1% -41.3% 1012868 ± 0% sched_debug.cfs_rq:/.load.max
1724300 ± 27% -40.5% 1025538 ± 1% -41.3% 1012868 ± 0% sched_debug.cpu.load.max
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +35.7% 8806 ± 1% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% +10.9% 1149 ± 0% time.percent_of_cpu_this_job_got
376 ± 0% +28.4% 484 ± 0% +29.6% 488 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% -19.7% 345708 ± 0% time.voluntary_context_switches
319584 ± 1% -26.5% 234868 ± 1% -23.9% 243331 ± 3% cpuidle.C1-IVT.usage
52991525 ± 1% -19.4% 42687208 ± 0% -20.0% 42368754 ± 0% cpuidle.C1-IVT.time
46760 ± 0% -22.4% 36298 ± 0% -21.6% 36681 ± 1% cpuidle.C1E-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% -16.9% 2881608 ± 5% cpuidle.C1E-IVT.time
12590471 ± 0% -22.3% 9788585 ± 1% -21.6% 9866515 ± 1% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% -19.1% 64654 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% +13.9% 1.481e+09 ± 0% cpuidle.C6-IVT.time
24.18 ± 0% +9.0% 26.35 ± 0% +9.6% 26.49 ± 0% turbostat.%Busy
686 ± 0% +9.5% 751 ± 0% +9.2% 749 ± 1% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% -23.8% 0.21 ± 4% turbostat.CPU%c3
79 ± 1% -0.4% 78 ± 3% -21.5% 62 ± 2% turbostat.CoreTmp
78 ± 0% +0.4% 79 ± 3% -21.2% 62 ± 1% turbostat.PkgTmp
4.74 ± 0% -2.7% 4.61 ± 1% -13.1% 4.12 ± 0% turbostat.RAMWatt
51 ± 0% +0.0% 51 ± 0% +333.3% 221 ± 10% slabinfo.dio.active_objs
51 ± 0% +0.0% 51 ± 0% +333.3% 221 ± 10% slabinfo.dio.num_objs
876 ± 6% +2.8% 900 ± 3% +16.7% 1022 ± 0% slabinfo.nsproxy.active_objs
876 ± 6% +2.8% 900 ± 3% +16.7% 1022 ± 0% slabinfo.nsproxy.num_objs
1975 ± 15% +63.2% 3224 ± 17% +45.5% 2874 ± 15% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% +45.5% 2874 ± 15% slabinfo.scsi_data_buffer.num_objs
464 ± 15% +63.3% 758 ± 17% +46.6% 680 ± 15% slabinfo.xfs_efd_item.active_objs
464 ± 15% +63.3% 758 ± 17% +46.6% 680 ± 15% slabinfo.xfs_efd_item.num_objs
1930 ± 0% +33.9% 2585 ± 3% +24.7% 2407 ± 5% numa-vmstat.node0.nr_active_file
466 ± 4% +29.3% 603 ± 14% +28.9% 601 ± 18% numa-vmstat.node0.nr_dirty
1977 ± 1% +23.6% 2444 ± 1% +33.6% 2641 ± 7% numa-vmstat.node1.nr_active_file
11671 ± 3% +55.9% 18197 ± 24% +43.3% 16730 ± 25% numa-vmstat.node1.nr_anon_pages
3809 ± 6% +16.1% 4422 ± 4% +21.6% 4633 ± 4% numa-vmstat.node1.nr_alloc_batch
12026 ± 4% +64.1% 19734 ± 20% +43.7% 17276 ± 22% numa-vmstat.node1.nr_active_anon
7723 ± 0% +32.6% 10238 ± 5% +19.5% 9228 ± 4% numa-meminfo.node0.Active(file)
8774 ± 29% +5.3% 9238 ± 28% +22.5% 10749 ± 24% numa-meminfo.node1.Mapped
7908 ± 1% +22.9% 9722 ± 3% +35.8% 10736 ± 3% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% +42.8% 66711 ± 26% numa-meminfo.node1.AnonPages
56052 ± 3% +58.2% 88666 ± 17% +42.2% 79696 ± 19% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% +43.2% 68960 ± 22% numa-meminfo.node1.Active(anon)
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% +25.9% 3.346e+11 ± 3% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% -9.4% 0.37 ± 1% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% +14.1% 1.244e+09 ± 2% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% -16.0% 823913 ± 1% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% +24.4% 1.88e+12 ± 4% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% -5.2% 97261 ± 1% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% -28.1% 0.19 ± 27% perf-stat.dTLB-load-miss-rate
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% +40.0% 4.43e+11 ± 1% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% -41.8% 0.02 ± 5% perf-stat.dTLB-store-miss-rate
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% +29.2% 2.903e+11 ± 5% perf-stat.dTLB-stores
34415974 ± 6% -1.7% 33840719 ± 12% -6.7% 32119462 ± 2% perf-stat.iTLB-load-misses
17863352 ± 4% +2.1% 18245848 ± 2% -7.9% 16460161 ± 2% perf-stat.iTLB-loads
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% +31.5% 1.959e+12 ± 3% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% +40.9% 61065 ± 5% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% +5.7% 1.04 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% +4.3% 274149 ± 0% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% +3.5% 35.30 ± 1% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% +6.6% 49534267 ± 0% perf-stat.node-load-misses
9.96 ± 0% +13.4% 11.30 ± 0% +13.6% 11.31 ± 2% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% +13.8% 27844903 ± 0% perf-stat.node-store-misses
262780 ± 0% +4.4% 274227 ± 1% +4.3% 274117 ± 0% perf-stat.page-faults
0.00 ± 0% +Inf% 52.94 ± 0% +Inf% 52.69 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± 0% +Inf% 52.29 ± 0% +Inf% 52.11 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± 0% +Inf% 34.35 ± 0% +Inf% 34.05 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± 0% +Inf% 16.48 ± 0% +Inf% 16.35 ± 1% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 16.05 ± 0% +Inf% 16.21 ± 1% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± 0% +Inf% 9.85 ± 0% +Inf% 9.75 ± 1% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 9.25 ± 0% +Inf% 9.18 ± 1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 9.08 ± 0% +Inf% 9.08 ± 1% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 7.91 ± 1% +Inf% 7.90 ± 0% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 4.69 ± 0% +Inf% 4.66 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 4.45 ± 1% +Inf% 4.45 ± 0% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 4.14 ± 0% +Inf% 4.12 ± 1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.69 ± 1% +Inf% 3.69 ± 2% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 3.64 ± 0% +Inf% 3.62 ± 0% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.44 ± 1% +Inf% 3.35 ± 2% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.04 ± 1% +Inf% 3.00 ± 3% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.22 ± 0% +Inf% 3.15 ± 1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.06 ± 1% +Inf% 3.09 ± 0% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.05 ± 1% +Inf% 3.05 ± 2% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 2.78 ± 0% +Inf% 2.83 ± 1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
0.00 ± 0% +Inf% 2.68 ± 2% +Inf% 2.60 ± 1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 2.56 ± 2% +Inf% 2.46 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 2.43 ± 0% +Inf% 2.42 ± 0% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.97 ± 2% +Inf% 1.90 ± 4% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.55 ± 3% +Inf% 1.62 ± 2% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 1.68 ± 1% +Inf% 1.66 ± 2% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 1.73 ± 1% +Inf% 1.71 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 1.61 ± 2% +Inf% 1.64 ± 3% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± 0% +Inf% 1.52 ± 2% +Inf% 1.51 ± 4% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.55 ± 1% +Inf% 1.55 ± 1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 1.53 ± 1% +Inf% 1.52 ± 1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 1.46 ± 1% +Inf% 1.45 ± 3% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 1.36 ± 1% +Inf% 1.39 ± 1% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.18 ± 1% +Inf% 1.19 ± 1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 1.21 ± 2% +Inf% 1.23 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
0.00 ± 0% +Inf% 1.24 ± 2% +Inf% 1.21 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± 0% +Inf% 1.14 ± 3% +Inf% 1.16 ± 3% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 1.09 ± 2% +Inf% 1.08 ± 1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 0.95 ± 0% +Inf% 1.01 ± 3% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
43.95 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
25.10 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
13.71 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
11.03 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
10.68 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.96 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
10.37 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
6.46 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
6.34 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
6.24 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
5.93 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
3.95 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
4.02 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
3.39 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
3.28 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
3.03 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
3.04 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
2.91 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.86 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.72 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
1.80 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.83 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
1.72 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.44 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
1.32 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
1.25 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
1.23 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
1.17 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.04 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.98 ± 5% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
1.79 ± 2% -28.2% 1.28 ± 3% -27.8% 1.29 ± 4% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% -27.7% 1.30 ± 4% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
1.27 ± 0% -22.5% 0.99 ± 4% -24.6% 0.96 ± 4% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% -24.1% 1.98 ± 1% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.58 ± 1% -24.1% 1.96 ± 0% -24.1% 1.96 ± 1% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% -21.1% 0.85 ± 2% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
2.66 ± 1% -24.3% 2.01 ± 1% -23.7% 2.03 ± 1% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.67 ± 1% -24.2% 2.02 ± 1% -23.8% 2.03 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% -24.2% 0.94 ± 4% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.21 ± 1% -23.4% 0.93 ± 4% -24.7% 0.91 ± 4% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
0.94 ± 4% -19.8% 0.76 ± 0% -21.6% 0.74 ± 4% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
1.32 ± 2% -21.5% 1.04 ± 1% -22.2% 1.03 ± 3% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
1.42 ± 2% -20.7% 1.13 ± 1% -22.1% 1.11 ± 3% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
2.35 ± 1% -21.0% 1.86 ± 1% -20.7% 1.86 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
1.91 ± 3% -16.4% 1.59 ± 1% -19.9% 1.53 ± 1% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
2.07 ± 1% -20.4% 1.65 ± 2% -19.9% 1.66 ± 1% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
1.42 ± 2% -20.5% 1.13 ± 1% -22.1% 1.10 ± 2% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
1.42 ± 2% -21.2% 1.12 ± 1% -22.4% 1.10 ± 3% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
1.12 ± 2% -17.6% 0.92 ± 4% -22.3% 0.87 ± 4% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
2.40 ± 1% -21.0% 1.89 ± 2% -20.6% 1.90 ± 1% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
1.29 ± 3% -18.9% 1.04 ± 1% -17.9% 1.06 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
3.42 ± 0% -20.9% 2.71 ± 2% -20.3% 2.73 ± 2% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
5.96 ± 1% -20.0% 4.77 ± 0% -19.4% 4.81 ± 1% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
3.54 ± 0% -20.8% 2.81 ± 1% -20.0% 2.83 ± 2% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.55 ± 3% -14.2% 2.19 ± 2% -17.5% 2.10 ± 1% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
1.04 ± 2% -18.9% 0.84 ± 1% -19.6% 0.84 ± 0% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
1.74 ± 2% -19.9% 1.40 ± 3% -19.3% 1.41 ± 1% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.01 ± 3% -17.9% 0.83 ± 2% -18.2% 0.82 ± 1% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.21 ± 2% -18.1% 9.18 ± 0% -18.4% 9.14 ± 1% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.24 ± 2% -18.1% 9.21 ± 0% -18.4% 9.18 ± 1% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.22 ± 2% -18.1% 9.19 ± 0% -18.4% 9.16 ± 1% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
1.79 ± 3% -22.2% 1.39 ± 0% -18.2% 1.46 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
11.26 ± 2% -18.1% 9.23 ± 0% -18.3% 9.20 ± 1% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.31 ± 1% -18.1% 9.27 ± 0% -18.2% 9.25 ± 1% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.34 ± 2% -18.1% 9.29 ± 0% -18.3% 9.27 ± 1% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.31 ± 2% -18.1% 9.26 ± 0% -18.3% 9.24 ± 1% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
11.32 ± 1% -18.0% 9.28 ± 0% -18.2% 9.26 ± 1% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% -18.2% 9.27 ± 1% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
2.06 ± 3% -22.5% 1.60 ± 2% -18.1% 1.69 ± 0% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
9.87 ± 2% -17.5% 8.15 ± 0% -17.6% 8.14 ± 1% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
9.89 ± 2% -17.4% 8.17 ± 0% -17.5% 8.16 ± 1% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
1.00 ± 1% -18.0% 0.82 ± 1% -14.3% 0.86 ± 3% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
51.83 ± 1% +14.3% 59.25 ± 0% +13.8% 58.97 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
1.38 ± 2% -13.3% 1.19 ± 1% -9.9% 1.24 ± 2% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
53.16 ± 1% +13.6% 60.40 ± 0% +13.0% 60.10 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
54.10 ± 1% +13.1% 61.20 ± 0% +12.5% 60.86 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% -14.9% 1.13 ± 1% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% -7.5% 18.31 ± 3% perf-profile.cycles-pp.start_secondary
19.75 ± 5% -9.8% 17.81 ± 0% -7.4% 18.28 ± 3% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
2.50 ± 3% -11.5% 2.21 ± 0% -13.1% 2.17 ± 0% perf-profile.cycles-pp.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
2.39 ± 3% -11.2% 2.12 ± 0% -13.0% 2.08 ± 0% perf-profile.cycles-pp.release_pages.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict
59.63 ± 1% +10.2% 65.72 ± 0% +9.7% 65.43 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± 0% +Inf% 1.91 ± 1% +Inf% 1.83 ± 1% perf-profile.func.cycles-pp.mark_page_accessed
0.00 ± 0% +Inf% 1.12 ± 1% +Inf% 1.12 ± 0% perf-profile.func.cycles-pp.iomap_write_actor
0.00 ± 0% +Inf% 1.10 ± 3% +Inf% 1.10 ± 2% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
1.30 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.func.cycles-pp.generic_perform_write
1.08 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.func.cycles-pp.__xfs_get_blocks
0.37 ± 2% +243.6% 1.26 ± 2% +236.4% 1.23 ± 0% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.70 ± 5% +219.5% 2.24 ± 0% +213.8% 2.20 ± 2% perf-profile.func.cycles-pp.xfs_bmapi_read
0.41 ± 1% +198.4% 1.22 ± 2% +190.2% 1.19 ± 1% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.64 ± 1% +182.8% 1.81 ± 4% +181.3% 1.80 ± 0% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.46 ± 4% +161.6% 1.20 ± 1% +163.0% 1.21 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
1.31 ± 2% -46.7% 0.70 ± 0% -46.9% 0.69 ± 2% perf-profile.func.cycles-pp.generic_write_end
2.49 ± 0% -34.5% 1.63 ± 1% -36.0% 1.59 ± 1% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.50 ± 1% -20.9% 1.19 ± 1% -21.3% 1.18 ± 1% perf-profile.func.cycles-pp.mark_buffer_dirty
3.24 ± 0% -19.8% 2.60 ± 0% -20.0% 2.59 ± 0% perf-profile.func.cycles-pp.memset_erms
3.96 ± 2% -18.4% 3.23 ± 0% -20.3% 3.16 ± 1% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.79 ± 4% -16.8% 1.49 ± 1% -19.6% 1.44 ± 0% perf-profile.func.cycles-pp.__mark_inode_dirty
1.41 ± 3% -20.6% 1.12 ± 3% -21.3% 1.11 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.16 ± 0% -18.1% 0.95 ± 1% -18.1% 0.95 ± 3% perf-profile.func.cycles-pp._raw_spin_lock
1.16 ± 1% -21.6% 0.91 ± 1% -20.1% 0.93 ± 2% perf-profile.func.cycles-pp.vfs_write
1.75 ± 2% -18.9% 1.42 ± 1% -17.7% 1.44 ± 2% perf-profile.func.cycles-pp.unlock_page
1.32 ± 0% -16.4% 1.10 ± 1% -14.1% 1.13 ± 3% perf-profile.func.cycles-pp.__radix_tree_lookup
1.51 ± 2% +15.4% 1.75 ± 1% +15.9% 1.75 ± 0% perf-profile.func.cycles-pp.__block_write_begin_int
1.02 ± 4% -7.5% 0.94 ± 2% -12.4% 0.89 ± 2% perf-profile.func.cycles-pp.pagecache_get_page
1.05 ± 2% -15.6% 0.88 ± 3% -15.6% 0.88 ± 5% perf-profile.func.cycles-pp.xfs_file_write_iter
raw perf profile data:
"perf-profile.func.cycles-pp.intel_idle": 17.0,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.15,
"perf-profile.func.cycles-pp.memset_erms": 2.59,
"perf-profile.func.cycles-pp.xfs_bmapi_read": 2.25,
"perf-profile.func.cycles-pp.___might_sleep": 2.13,
"perf-profile.func.cycles-pp.mark_page_accessed": 1.8,
"perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.8,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.74,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 1.62,
"perf-profile.func.cycles-pp.up_write": 1.62,
"perf-profile.func.cycles-pp.down_write": 1.48,
"perf-profile.func.cycles-pp.unlock_page": 1.48,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.43,
"perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.23,
"perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.23,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.19,
"perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.18,
"perf-profile.func.cycles-pp.__might_sleep": 1.17,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.16,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.12,
"perf-profile.func.cycles-pp.iomap_write_actor": 1.11,
"perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.07,
"perf-profile.func.cycles-pp._raw_spin_lock": 1.0,
"perf-profile.func.cycles-pp.vfs_write": 0.94,
"perf-profile.func.cycles-pp.pagecache_get_page": 0.92,
"perf-profile.func.cycles-pp.xfs_bmapi_delay": 0.92,
"perf-profile.func.cycles-pp.xfs_file_iomap_begin": 0.91,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.9,
"perf-profile.func.cycles-pp.workingset_activation": 0.82,
"perf-profile.func.cycles-pp.iomap_apply": 0.77,
"perf-profile.func.cycles-pp.xfs_bmapi_trim_map.isra.14": 0.75,
"perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.74,
"perf-profile.func.cycles-pp.mem_cgroup_zone_lruvec": 0.73,
"perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 0.73,
"perf-profile.func.cycles-pp.get_page_from_freelist": 0.72,
"perf-profile.func.cycles-pp.generic_write_end": 0.71,
"perf-profile.func.cycles-pp.__vfs_write": 0.66,
"perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.66,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.66,
"perf-profile.func.cycles-pp.release_pages": 0.65,
"perf-profile.func.cycles-pp.find_get_entry": 0.65,
Thanks,
Xiaolong
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-12 8:51 ` Ye Xiaolong
0 siblings, 0 replies; 219+ messages in thread
From: Ye Xiaolong @ 2016-08-12 8:51 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 40182 bytes --]
On 08/12, Ye Xiaolong wrote:
>On 08/12, Dave Chinner wrote:
[snip]
>>lkp-folk: the patch I've just tested it attached below - can you
>>feed that through your test and see if it fixes the regression?
>>
>
>Hi, Dave
>
>I am verifying your fix patch in lkp environment now, will send the
>result once I get it.
>
Here is the test result.
commit 636b594f38278080db93f2d67d11d31700924f5d
Author: Dave Chinner <dchinner@redhat.com>
AuthorDate: Fri Aug 12 14:23:44 2016 +0800
Commit: Xiaolong Ye <xiaolong.ye@intel.com>
CommitDate: Fri Aug 12 14:23:44 2016 +0800
When a write occurs that extends the file, we check to see if we
need to preallocate more delalloc space. When we do sub-page
writes, the new iomap write path passes a sub-block write length to
the block mapping code. xfs_iomap_write_delay does not expect to be
pased byte counts smaller than one filesystem block, so it ends up
checking the BMBT on for blocks beyond EOF on every write,
regardless of whether we need to or not. This causes a regression in
aim7 benchmarks as it is full of sub-page writes.
To fix this, clamp the minimum length of a mapping request coming
through xfs_file_iomap_begin() to one filesystem block. This ensures
we are passing the same length to xfs_iomap_write_delay() as we did
when calling through the get_blocks path. This substantially reduces
the amount of lookup load being placed on the BMBT during sub-block
write loads.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_iomap.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 620fc91..5eaace0 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1015,10 +1015,15 @@ xfs_file_iomap_begin(
* number pulled out of thin air as a best guess for initial
* testing.
*
+ * xfs_iomap_write_delay() only works if the length passed in is
+ * >= one filesystem block. Hence we need to clamp the minimum
+ * length we map, too.
+ *
* Note that the values needs to be less than 32-bits wide until
* the lower level functions are updated.
*/
length = min_t(loff_t, length, 1024 * PAGE_SIZE);
+ length = max_t(loff_t, length, (1 << inode->i_blkbits));
if (xfs_get_extsz_hint(ip)) {
/*
* xfs_iomap_write_direct() expects the shared lock. It
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 636b594f38278080db93f2d67d
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
484435 ± 0% -13.3% 420004 ± 0% -14.0% 416777 ± 0% aim7.jobs-per-min
6491 ± 3% +30.8% 8491 ± 0% +35.7% 8806 ± 1% aim7.time.involuntary_context_switches
376 ± 0% +28.4% 484 ± 0% +29.6% 488 ± 0% aim7.time.system_time
430512 ± 0% -20.1% 343838 ± 0% -19.7% 345708 ± 0% aim7.time.voluntary_context_switches
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% aim7.time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% aim7.time.elapsed_time.max
155184 ± 1% -2.1% 151864 ± 1% -2.7% 150937 ± 1% aim7.time.minor_page_faults
0 ± 0% +Inf% 215412 ±141% +Inf% 334416 ± 75% latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
24772 ± 0% -28.6% 17675 ± 0% -26.7% 18149 ± 2% vmstat.system.cs
26816 ± 8% +10.2% 29542 ± 1% +13.3% 30370 ± 1% interrupts.CAL:Function_call_interrupts
125122 ± 10% -10.7% 111758 ± 12% -11.1% 111223 ± 11% softirqs.SCHED
3906 ± 0% +28.8% 5032 ± 2% +29.1% 5045 ± 1% proc-vmstat.nr_active_file
3444 ± 5% +41.8% 4884 ± 0% +25.0% 4304 ± 11% proc-vmstat.nr_shmem
4092 ± 14% +61.2% 6595 ± 1% +40.0% 5728 ± 15% proc-vmstat.pgactivate
15627 ± 0% +27.7% 19956 ± 1% +27.4% 19902 ± 0% meminfo.Active(file)
16103 ± 3% +14.3% 18405 ± 8% +11.2% 17900 ± 1% meminfo.AnonHugePages
13777 ± 5% +43.1% 19709 ± 0% +25.0% 17220 ± 11% meminfo.Shmem
1724300 ± 27% -40.5% 1025538 ± 1% -41.3% 1012868 ± 0% sched_debug.cfs_rq:/.load.max
1724300 ± 27% -40.5% 1025538 ± 1% -41.3% 1012868 ± 0% sched_debug.cpu.load.max
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% time.elapsed_time
37.37 ± 0% +15.3% 43.09 ± 0% +16.1% 43.41 ± 0% time.elapsed_time.max
6491 ± 3% +30.8% 8491 ± 0% +35.7% 8806 ± 1% time.involuntary_context_switches
1037 ± 0% +10.8% 1148 ± 0% +10.9% 1149 ± 0% time.percent_of_cpu_this_job_got
376 ± 0% +28.4% 484 ± 0% +29.6% 488 ± 0% time.system_time
430512 ± 0% -20.1% 343838 ± 0% -19.7% 345708 ± 0% time.voluntary_context_switches
319584 ± 1% -26.5% 234868 ± 1% -23.9% 243331 ± 3% cpuidle.C1-IVT.usage
52991525 ± 1% -19.4% 42687208 ± 0% -20.0% 42368754 ± 0% cpuidle.C1-IVT.time
46760 ± 0% -22.4% 36298 ± 0% -21.6% 36681 ± 1% cpuidle.C1E-IVT.usage
3468808 ± 2% -19.8% 2783341 ± 3% -16.9% 2881608 ± 5% cpuidle.C1E-IVT.time
12590471 ± 0% -22.3% 9788585 ± 1% -21.6% 9866515 ± 1% cpuidle.C3-IVT.time
79965 ± 0% -19.0% 64749 ± 0% -19.1% 64654 ± 0% cpuidle.C3-IVT.usage
1.3e+09 ± 0% +13.3% 1.473e+09 ± 0% +13.9% 1.481e+09 ± 0% cpuidle.C6-IVT.time
24.18 ± 0% +9.0% 26.35 ± 0% +9.6% 26.49 ± 0% turbostat.%Busy
686 ± 0% +9.5% 751 ± 0% +9.2% 749 ± 1% turbostat.Avg_MHz
0.28 ± 0% -25.0% 0.21 ± 0% -23.8% 0.21 ± 4% turbostat.CPU%c3
79 ± 1% -0.4% 78 ± 3% -21.5% 62 ± 2% turbostat.CoreTmp
78 ± 0% +0.4% 79 ± 3% -21.2% 62 ± 1% turbostat.PkgTmp
4.74 ± 0% -2.7% 4.61 ± 1% -13.1% 4.12 ± 0% turbostat.RAMWatt
51 ± 0% +0.0% 51 ± 0% +333.3% 221 ± 10% slabinfo.dio.active_objs
51 ± 0% +0.0% 51 ± 0% +333.3% 221 ± 10% slabinfo.dio.num_objs
876 ± 6% +2.8% 900 ± 3% +16.7% 1022 ± 0% slabinfo.nsproxy.active_objs
876 ± 6% +2.8% 900 ± 3% +16.7% 1022 ± 0% slabinfo.nsproxy.num_objs
1975 ± 15% +63.2% 3224 ± 17% +45.5% 2874 ± 15% slabinfo.scsi_data_buffer.active_objs
1975 ± 15% +63.2% 3224 ± 17% +45.5% 2874 ± 15% slabinfo.scsi_data_buffer.num_objs
464 ± 15% +63.3% 758 ± 17% +46.6% 680 ± 15% slabinfo.xfs_efd_item.active_objs
464 ± 15% +63.3% 758 ± 17% +46.6% 680 ± 15% slabinfo.xfs_efd_item.num_objs
1930 ± 0% +33.9% 2585 ± 3% +24.7% 2407 ± 5% numa-vmstat.node0.nr_active_file
466 ± 4% +29.3% 603 ± 14% +28.9% 601 ± 18% numa-vmstat.node0.nr_dirty
1977 ± 1% +23.6% 2444 ± 1% +33.6% 2641 ± 7% numa-vmstat.node1.nr_active_file
11671 ± 3% +55.9% 18197 ± 24% +43.3% 16730 ± 25% numa-vmstat.node1.nr_anon_pages
3809 ± 6% +16.1% 4422 ± 4% +21.6% 4633 ± 4% numa-vmstat.node1.nr_alloc_batch
12026 ± 4% +64.1% 19734 ± 20% +43.7% 17276 ± 22% numa-vmstat.node1.nr_active_anon
7723 ± 0% +32.6% 10238 ± 5% +19.5% 9228 ± 4% numa-meminfo.node0.Active(file)
8774 ± 29% +5.3% 9238 ± 28% +22.5% 10749 ± 24% numa-meminfo.node1.Mapped
7908 ± 1% +22.9% 9722 ± 3% +35.8% 10736 ± 3% numa-meminfo.node1.Active(file)
46721 ± 3% +55.9% 72837 ± 24% +42.8% 66711 ± 26% numa-meminfo.node1.AnonPages
56052 ± 3% +58.2% 88666 ± 17% +42.2% 79696 ± 19% numa-meminfo.node1.Active
48142 ± 4% +64.0% 78943 ± 19% +43.2% 68960 ± 22% numa-meminfo.node1.Active(anon)
2.658e+11 ± 4% +24.7% 3.316e+11 ± 2% +25.9% 3.346e+11 ± 3% perf-stat.branch-instructions
0.41 ± 1% -9.1% 0.37 ± 1% -9.4% 0.37 ± 1% perf-stat.branch-miss-rate
1.09e+09 ± 3% +13.4% 1.237e+09 ± 1% +14.1% 1.244e+09 ± 2% perf-stat.branch-misses
981138 ± 0% -18.1% 803696 ± 0% -16.0% 823913 ± 1% perf-stat.context-switches
1.511e+12 ± 5% +23.4% 1.864e+12 ± 3% +24.4% 1.88e+12 ± 4% perf-stat.cpu-cycles
102600 ± 1% -7.3% 95075 ± 1% -5.2% 97261 ± 1% perf-stat.cpu-migrations
0.26 ± 12% -30.8% 0.18 ± 10% -28.1% 0.19 ± 27% perf-stat.dTLB-load-miss-rate
3.164e+11 ± 1% +39.9% 4.426e+11 ± 4% +40.0% 4.43e+11 ± 1% perf-stat.dTLB-loads
0.03 ± 26% -41.3% 0.02 ± 13% -41.8% 0.02 ± 5% perf-stat.dTLB-store-miss-rate
2.247e+11 ± 6% +26.4% 2.839e+11 ± 2% +29.2% 2.903e+11 ± 5% perf-stat.dTLB-stores
34415974 ± 6% -1.7% 33840719 ± 12% -6.7% 32119462 ± 2% perf-stat.iTLB-load-misses
17863352 ± 4% +2.1% 18245848 ± 2% -7.9% 16460161 ± 2% perf-stat.iTLB-loads
1.49e+12 ± 4% +30.1% 1.939e+12 ± 2% +31.5% 1.959e+12 ± 3% perf-stat.instructions
43348 ± 2% +34.2% 58161 ± 12% +40.9% 61065 ± 5% perf-stat.instructions-per-iTLB-miss
0.99 ± 0% +5.5% 1.04 ± 0% +5.7% 1.04 ± 0% perf-stat.ipc
262799 ± 0% +4.4% 274251 ± 1% +4.3% 274149 ± 0% perf-stat.minor-faults
34.12 ± 1% +2.1% 34.83 ± 0% +3.5% 35.30 ± 1% perf-stat.node-load-miss-rate
46476754 ± 2% +4.6% 48601269 ± 1% +6.6% 49534267 ± 0% perf-stat.node-load-misses
9.96 ± 0% +13.4% 11.30 ± 0% +13.6% 11.31 ± 2% perf-stat.node-store-miss-rate
24460859 ± 1% +14.4% 27971097 ± 1% +13.8% 27844903 ± 0% perf-stat.node-store-misses
262780 ± 0% +4.4% 274227 ± 1% +4.3% 274117 ± 0% perf-stat.page-faults
0.00 ± 0% +Inf% 52.94 ± 0% +Inf% 52.69 ± 0% perf-profile.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
0.00 ± 0% +Inf% 52.29 ± 0% +Inf% 52.11 ± 0% perf-profile.cycles-pp.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
0.00 ± 0% +Inf% 34.35 ± 0% +Inf% 34.05 ± 0% perf-profile.cycles-pp.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± 0% +Inf% 16.48 ± 0% +Inf% 16.35 ± 1% perf-profile.cycles-pp.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 16.05 ± 0% +Inf% 16.21 ± 1% perf-profile.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write.xfs_file_write_iter
0.00 ± 0% +Inf% 9.85 ± 0% +Inf% 9.75 ± 1% perf-profile.cycles-pp.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 9.25 ± 0% +Inf% 9.18 ± 1% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 9.08 ± 0% +Inf% 9.08 ± 1% perf-profile.cycles-pp.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 7.91 ± 1% +Inf% 7.90 ± 0% perf-profile.cycles-pp.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 4.69 ± 0% +Inf% 4.66 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 4.45 ± 1% +Inf% 4.45 ± 0% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 4.14 ± 0% +Inf% 4.12 ± 1% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.69 ± 1% +Inf% 3.69 ± 2% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 3.64 ± 0% +Inf% 3.62 ± 0% perf-profile.cycles-pp.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.44 ± 1% +Inf% 3.35 ± 2% perf-profile.cycles-pp.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.04 ± 1% +Inf% 3.00 ± 3% perf-profile.cycles-pp.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.22 ± 0% +Inf% 3.15 ± 1% perf-profile.cycles-pp.copy_user_enhanced_fast_string.iomap_write_actor.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 3.06 ± 1% +Inf% 3.09 ± 0% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 3.05 ± 1% +Inf% 3.05 ± 2% perf-profile.cycles-pp.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 2.78 ± 0% +Inf% 2.83 ± 1% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.iomap_write_actor
0.00 ± 0% +Inf% 2.68 ± 2% +Inf% 2.60 ± 1% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 2.56 ± 2% +Inf% 2.46 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 2.43 ± 0% +Inf% 2.42 ± 0% perf-profile.cycles-pp.memset_erms.iomap_write_begin.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.97 ± 2% +Inf% 1.90 ± 4% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.55 ± 3% +Inf% 1.62 ± 2% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 1.68 ± 1% +Inf% 1.66 ± 2% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 1.73 ± 1% +Inf% 1.71 ± 2% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 1.61 ± 2% +Inf% 1.64 ± 3% perf-profile.cycles-pp.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± 0% +Inf% 1.52 ± 2% +Inf% 1.51 ± 4% perf-profile.cycles-pp.workingset_activation.mark_page_accessed.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.55 ± 1% +Inf% 1.55 ± 1% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
0.00 ± 0% +Inf% 1.53 ± 1% +Inf% 1.52 ± 1% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor.iomap_apply
0.00 ± 0% +Inf% 1.46 ± 1% +Inf% 1.45 ± 3% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_file_iomap_begin.iomap_apply
0.00 ± 0% +Inf% 1.36 ± 1% +Inf% 1.39 ± 1% perf-profile.cycles-pp.unlock_page.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 1.18 ± 1% +Inf% 1.19 ± 1% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_write_begin.iomap_write_actor
0.00 ± 0% +Inf% 1.21 ± 2% +Inf% 1.23 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_read.xfs_iomap_eof_want_preallocate.constprop.8.xfs_iomap_write_delay
0.00 ± 0% +Inf% 1.24 ± 2% +Inf% 1.21 ± 2% perf-profile.cycles-pp.xfs_bmap_search_multi_extents.xfs_bmap_search_extents.xfs_bmapi_delay.xfs_iomap_write_delay.xfs_file_iomap_begin
0.00 ± 0% +Inf% 1.14 ± 3% +Inf% 1.16 ± 3% perf-profile.cycles-pp.xfs_ilock.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 ± 0% +Inf% 1.09 ± 2% +Inf% 1.08 ± 1% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.iomap_write_actor.iomap_apply.iomap_file_buffered_write
0.00 ± 0% +Inf% 0.95 ± 0% +Inf% 1.01 ± 3% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.iomap_write_begin
43.95 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
25.10 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
13.71 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
11.03 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
10.68 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.96 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter
10.36 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
10.37 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
6.46 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
6.34 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
6.24 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
5.93 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end.generic_perform_write
3.95 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.copy_user_enhanced_fast_string.generic_perform_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
4.02 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
3.39 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end.xfs_vm_write_end
3.28 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
3.03 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.memset_erms.__block_write_begin.xfs_vm_write_begin.generic_perform_write.xfs_file_buffered_aio_write
3.04 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
2.91 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.86 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin.generic_perform_write
1.72 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.unlock_page.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
1.80 ± 1% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.83 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin.generic_perform_write
1.72 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.lru_cache_add.add_to_page_cache_lru.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.44 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin.xfs_vm_write_begin
1.32 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.__mark_inode_dirty.generic_write_end.xfs_vm_write_end.generic_perform_write.xfs_file_buffered_aio_write
1.25 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_bmapi_delay.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
1.23 ± 4% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.6.xfs_iomap_write_delay.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int
1.17 ± 3% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin.xfs_vm_write_begin
1.04 ± 0% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.xfs_bmapi_read.__xfs_get_blocks.xfs_get_blocks.__block_write_begin_int.__block_write_begin
0.98 ± 5% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.__block_write_begin
1.79 ± 2% -28.2% 1.28 ± 3% -27.8% 1.29 ± 4% perf-profile.cycles-pp.do_unlinkat.sys_unlink.entry_SYSCALL_64_fastpath
1.79 ± 3% -27.9% 1.29 ± 3% -27.7% 1.30 ± 4% perf-profile.cycles-pp.sys_unlink.entry_SYSCALL_64_fastpath
1.27 ± 0% -22.5% 0.99 ± 4% -24.6% 0.96 ± 4% perf-profile.cycles-pp.destroy_inode.evict.iput.__dentry_kill.dput
2.61 ± 1% -24.3% 1.98 ± 1% -24.1% 1.98 ± 1% perf-profile.cycles-pp.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.58 ± 1% -24.1% 1.96 ± 0% -24.1% 1.96 ± 1% perf-profile.cycles-pp.path_openat.do_filp_open.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
1.07 ± 3% -23.3% 0.82 ± 3% -21.1% 0.85 ± 2% perf-profile.cycles-pp.down_write.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
2.66 ± 1% -24.3% 2.01 ± 1% -23.7% 2.03 ± 1% perf-profile.cycles-pp.do_sys_open.sys_creat.entry_SYSCALL_64_fastpath
2.67 ± 1% -24.2% 2.02 ± 1% -23.8% 2.03 ± 1% perf-profile.cycles-pp.sys_creat.entry_SYSCALL_64_fastpath
1.24 ± 1% -23.1% 0.95 ± 4% -24.2% 0.94 ± 4% perf-profile.cycles-pp.xfs_fs_destroy_inode.destroy_inode.evict.iput.__dentry_kill
1.21 ± 1% -23.4% 0.93 ± 4% -24.7% 0.91 ± 4% perf-profile.cycles-pp.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
0.94 ± 4% -19.8% 0.76 ± 0% -21.6% 0.74 ± 4% perf-profile.cycles-pp.cancel_dirty_page.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage
1.32 ± 2% -21.5% 1.04 ± 1% -22.2% 1.03 ± 3% perf-profile.cycles-pp.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat
1.42 ± 2% -20.7% 1.13 ± 1% -22.1% 1.11 ± 3% perf-profile.cycles-pp.xfs_vn_create.path_openat.do_filp_open.do_sys_open.sys_creat
2.35 ± 1% -21.0% 1.86 ± 1% -20.7% 1.86 ± 1% perf-profile.cycles-pp.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page
1.91 ± 3% -16.4% 1.59 ± 1% -19.9% 1.53 ± 1% perf-profile.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page
2.07 ± 1% -20.4% 1.65 ± 2% -19.9% 1.66 ± 1% perf-profile.cycles-pp.try_to_free_buffers.xfs_vm_releasepage.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage
1.42 ± 2% -20.5% 1.13 ± 1% -22.1% 1.10 ± 2% perf-profile.cycles-pp.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open
1.42 ± 2% -21.2% 1.12 ± 1% -22.4% 1.10 ± 3% perf-profile.cycles-pp.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open
1.12 ± 2% -17.6% 0.92 ± 4% -22.3% 0.87 ± 4% perf-profile.cycles-pp.__sb_start_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
2.40 ± 1% -21.0% 1.89 ± 2% -20.6% 1.90 ± 1% perf-profile.cycles-pp.try_to_release_page.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range
1.29 ± 3% -18.9% 1.04 ± 1% -17.9% 1.06 ± 1% perf-profile.cycles-pp.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
3.42 ± 0% -20.9% 2.71 ± 2% -20.3% 2.73 ± 2% perf-profile.cycles-pp.block_invalidatepage.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
5.96 ± 1% -20.0% 4.77 ± 0% -19.4% 4.81 ± 1% perf-profile.cycles-pp.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
3.54 ± 0% -20.8% 2.81 ± 1% -20.0% 2.83 ± 2% perf-profile.cycles-pp.xfs_vm_invalidatepage.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
2.55 ± 3% -14.2% 2.19 ± 2% -17.5% 2.10 ± 1% perf-profile.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.__page_cache_alloc.pagecache_get_page.grab_cache_page_write_begin
1.04 ± 2% -18.9% 0.84 ± 1% -19.6% 0.84 ± 0% perf-profile.cycles-pp.__delete_from_page_cache.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final
1.74 ± 2% -19.9% 1.40 ± 3% -19.3% 1.41 ± 1% perf-profile.cycles-pp.delete_from_page_cache.truncate_inode_page.truncate_inode_pages_range.truncate_inode_pages_final.evict
1.01 ± 3% -17.9% 0.83 ± 2% -18.2% 0.82 ± 1% perf-profile.cycles-pp.down_write.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write
11.21 ± 2% -18.1% 9.18 ± 0% -18.4% 9.14 ± 1% perf-profile.cycles-pp.evict.iput.__dentry_kill.dput.__fput
11.24 ± 2% -18.1% 9.21 ± 0% -18.4% 9.18 ± 1% perf-profile.cycles-pp.__dentry_kill.dput.__fput.____fput.task_work_run
11.22 ± 2% -18.1% 9.19 ± 0% -18.4% 9.16 ± 1% perf-profile.cycles-pp.iput.__dentry_kill.dput.__fput.____fput
1.79 ± 3% -22.2% 1.39 ± 0% -18.2% 1.46 ± 0% perf-profile.cycles-pp.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
11.26 ± 2% -18.1% 9.23 ± 0% -18.3% 9.20 ± 1% perf-profile.cycles-pp.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
11.31 ± 1% -18.1% 9.27 ± 0% -18.2% 9.25 ± 1% perf-profile.cycles-pp.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.34 ± 2% -18.1% 9.29 ± 0% -18.3% 9.27 ± 1% perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.31 ± 2% -18.1% 9.26 ± 0% -18.3% 9.24 ± 1% perf-profile.cycles-pp.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath
11.32 ± 1% -18.0% 9.28 ± 0% -18.2% 9.26 ± 1% perf-profile.cycles-pp.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
11.34 ± 1% -18.1% 9.29 ± 0% -18.2% 9.27 ± 1% perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
2.06 ± 3% -22.5% 1.60 ± 2% -18.1% 1.69 ± 0% perf-profile.cycles-pp.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
9.87 ± 2% -17.5% 8.15 ± 0% -17.6% 8.14 ± 1% perf-profile.cycles-pp.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.__dentry_kill
9.89 ± 2% -17.4% 8.17 ± 0% -17.5% 8.16 ± 1% perf-profile.cycles-pp.truncate_inode_pages_final.evict.iput.__dentry_kill.dput
1.00 ± 1% -18.0% 0.82 ± 1% -14.3% 0.86 ± 3% perf-profile.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.grab_cache_page_write_begin
51.83 ± 1% +14.3% 59.25 ± 0% +13.8% 58.97 ± 0% perf-profile.cycles-pp.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.sys_write
1.38 ± 2% -13.3% 1.19 ± 1% -9.9% 1.24 ± 2% perf-profile.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.isra.24.block_write_end.generic_write_end
53.16 ± 1% +13.6% 60.40 ± 0% +13.0% 60.10 ± 0% perf-profile.cycles-pp.xfs_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
54.10 ± 1% +13.1% 61.20 ± 0% +12.5% 60.86 ± 0% perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
1.32 ± 4% -21.4% 1.04 ± 0% -14.9% 1.13 ± 1% perf-profile.cycles-pp.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
19.79 ± 5% -9.9% 17.84 ± 0% -7.5% 18.31 ± 3% perf-profile.cycles-pp.start_secondary
19.75 ± 5% -9.8% 17.81 ± 0% -7.4% 18.28 ± 3% perf-profile.cycles-pp.cpu_startup_entry.start_secondary
2.50 ± 3% -11.5% 2.21 ± 0% -13.1% 2.17 ± 0% perf-profile.cycles-pp.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput
2.39 ± 3% -11.2% 2.12 ± 0% -13.0% 2.08 ± 0% perf-profile.cycles-pp.release_pages.__pagevec_release.truncate_inode_pages_range.truncate_inode_pages_final.evict
59.63 ± 1% +10.2% 65.72 ± 0% +9.7% 65.43 ± 0% perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
0.00 ± 0% +Inf% 1.91 ± 1% +Inf% 1.83 ± 1% perf-profile.func.cycles-pp.mark_page_accessed
0.00 ± 0% +Inf% 1.12 ± 1% +Inf% 1.12 ± 0% perf-profile.func.cycles-pp.iomap_write_actor
0.00 ± 0% +Inf% 1.10 ± 3% +Inf% 1.10 ± 2% perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8
1.30 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.func.cycles-pp.generic_perform_write
1.08 ± 2% -100.0% 0.00 ± 0% -100.0% 0.00 ± 0% perf-profile.func.cycles-pp.__xfs_get_blocks
0.37 ± 2% +243.6% 1.26 ± 2% +236.4% 1.23 ± 0% perf-profile.func.cycles-pp.xfs_bmap_search_extents
0.70 ± 5% +219.5% 2.24 ± 0% +213.8% 2.20 ± 2% perf-profile.func.cycles-pp.xfs_bmapi_read
0.41 ± 1% +198.4% 1.22 ± 2% +190.2% 1.19 ± 1% perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents
0.64 ± 1% +182.8% 1.81 ± 4% +181.3% 1.80 ± 0% perf-profile.func.cycles-pp.xfs_iext_bno_to_ext
0.46 ± 4% +161.6% 1.20 ± 1% +163.0% 1.21 ± 1% perf-profile.func.cycles-pp.xfs_iomap_write_delay
1.31 ± 2% -46.7% 0.70 ± 0% -46.9% 0.69 ± 2% perf-profile.func.cycles-pp.generic_write_end
2.49 ± 0% -34.5% 1.63 ± 1% -36.0% 1.59 ± 1% perf-profile.func.cycles-pp.__block_commit_write.isra.24
1.50 ± 1% -20.9% 1.19 ± 1% -21.3% 1.18 ± 1% perf-profile.func.cycles-pp.mark_buffer_dirty
3.24 ± 0% -19.8% 2.60 ± 0% -20.0% 2.59 ± 0% perf-profile.func.cycles-pp.memset_erms
3.96 ± 2% -18.4% 3.23 ± 0% -20.3% 3.16 ± 1% perf-profile.func.cycles-pp.copy_user_enhanced_fast_string
1.79 ± 4% -16.8% 1.49 ± 1% -19.6% 1.44 ± 0% perf-profile.func.cycles-pp.__mark_inode_dirty
1.41 ± 3% -20.6% 1.12 ± 3% -21.3% 1.11 ± 3% perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath
1.16 ± 0% -18.1% 0.95 ± 1% -18.1% 0.95 ± 3% perf-profile.func.cycles-pp._raw_spin_lock
1.16 ± 1% -21.6% 0.91 ± 1% -20.1% 0.93 ± 2% perf-profile.func.cycles-pp.vfs_write
1.75 ± 2% -18.9% 1.42 ± 1% -17.7% 1.44 ± 2% perf-profile.func.cycles-pp.unlock_page
1.32 ± 0% -16.4% 1.10 ± 1% -14.1% 1.13 ± 3% perf-profile.func.cycles-pp.__radix_tree_lookup
1.51 ± 2% +15.4% 1.75 ± 1% +15.9% 1.75 ± 0% perf-profile.func.cycles-pp.__block_write_begin_int
1.02 ± 4% -7.5% 0.94 ± 2% -12.4% 0.89 ± 2% perf-profile.func.cycles-pp.pagecache_get_page
1.05 ± 2% -15.6% 0.88 ± 3% -15.6% 0.88 ± 5% perf-profile.func.cycles-pp.xfs_file_write_iter
raw perf profile data:
"perf-profile.func.cycles-pp.intel_idle": 17.0,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 3.15,
"perf-profile.func.cycles-pp.memset_erms": 2.59,
"perf-profile.func.cycles-pp.xfs_bmapi_read": 2.25,
"perf-profile.func.cycles-pp.___might_sleep": 2.13,
"perf-profile.func.cycles-pp.mark_page_accessed": 1.8,
"perf-profile.func.cycles-pp.xfs_iext_bno_to_ext": 1.8,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.74,
"perf-profile.func.cycles-pp.__block_commit_write.isra.24": 1.62,
"perf-profile.func.cycles-pp.up_write": 1.62,
"perf-profile.func.cycles-pp.down_write": 1.48,
"perf-profile.func.cycles-pp.unlock_page": 1.48,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.43,
"perf-profile.func.cycles-pp.xfs_iomap_write_delay": 1.23,
"perf-profile.func.cycles-pp.xfs_bmap_search_extents": 1.23,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 1.19,
"perf-profile.func.cycles-pp.xfs_bmap_search_multi_extents": 1.18,
"perf-profile.func.cycles-pp.__might_sleep": 1.17,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 1.16,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 1.12,
"perf-profile.func.cycles-pp.iomap_write_actor": 1.11,
"perf-profile.func.cycles-pp.xfs_iomap_eof_want_preallocate.constprop.8": 1.07,
"perf-profile.func.cycles-pp._raw_spin_lock": 1.0,
"perf-profile.func.cycles-pp.vfs_write": 0.94,
"perf-profile.func.cycles-pp.pagecache_get_page": 0.92,
"perf-profile.func.cycles-pp.xfs_bmapi_delay": 0.92,
"perf-profile.func.cycles-pp.xfs_file_iomap_begin": 0.91,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.9,
"perf-profile.func.cycles-pp.workingset_activation": 0.82,
"perf-profile.func.cycles-pp.iomap_apply": 0.77,
"perf-profile.func.cycles-pp.xfs_bmapi_trim_map.isra.14": 0.75,
"perf-profile.func.cycles-pp.xfs_file_buffered_aio_write": 0.74,
"perf-profile.func.cycles-pp.mem_cgroup_zone_lruvec": 0.73,
"perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 0.73,
"perf-profile.func.cycles-pp.get_page_from_freelist": 0.72,
"perf-profile.func.cycles-pp.generic_write_end": 0.71,
"perf-profile.func.cycles-pp.__vfs_write": 0.66,
"perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.66,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 0.66,
"perf-profile.func.cycles-pp.release_pages": 0.65,
"perf-profile.func.cycles-pp.find_get_entry": 0.65,
Thanks,
Xiaolong
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 8:51 ` Ye Xiaolong
@ 2016-08-12 10:02 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-12 10:02 UTC (permalink / raw)
To: Ye Xiaolong
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Fri, Aug 12, 2016 at 04:51:24PM +0800, Ye Xiaolong wrote:
> On 08/12, Ye Xiaolong wrote:
> >On 08/12, Dave Chinner wrote:
>
> [snip]
>
> >>lkp-folk: the patch I've just tested is attached below - can you
> >>feed that through your test and see if it fixes the regression?
> >>
> >
> >Hi, Dave
> >
> >I am verifying your fix patch in lkp environment now, will send the
> >result once I get it.
> >
>
> Here is the test result.
Which says "no change". Oh well, back to the drawing board...
Can you send me the aim7 config file and command line you are using
for the test?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 10:02 ` Dave Chinner
@ 2016-08-12 10:43 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-12 10:43 UTC (permalink / raw)
To: Dave Chinner
Cc: Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP, Christoph Hellwig
Hi Dave,
On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
>On Fri, Aug 12, 2016 at 04:51:24PM +0800, Ye Xiaolong wrote:
>> On 08/12, Ye Xiaolong wrote:
>> >On 08/12, Dave Chinner wrote:
>>
>> [snip]
>>
>> >>lkp-folk: the patch I've just tested is attached below - can you
>> >>feed that through your test and see if it fixes the regression?
>> >>
>> >
>> >Hi, Dave
>> >
>> >I am verifying your fix patch in lkp environment now, will send the
>> >result once I get it.
>> >
>>
>> Here is the test result.
>
>Which says "no change". Oh well, back to the drawing board...
>
>Can you send me the aim7 config file and command line you are using
>for the test?
The test scripts can be found here:
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/setup/disk
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/setup/fs
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/tests/aim7
Here are the real commands they executed in this test:
modprobe -r brd
modprobe brd rd_nr=1 rd_size=50331648 part_show=1
dmsetup remove_all
wipefs -a --force /dev/ram0
mkfs -t xfs /dev/ram0
mkdir -p /fs/ram0
mount -t xfs -o nobarrier,inode64 /dev/ram0 /fs/ram0
for file in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
echo performance > $file
done
echo "500 32000 128 512" > /proc/sys/kernel/sem
cat > workfile <<EOF
FILESIZE: 1M
POOLSIZE: 10M
10 disk_wrt
EOF
echo "/fs/ram0" > config
(
echo ivb44
echo disk_wrt
echo 1
echo 3000
echo 2
echo 3000
echo 1
) | ./multitask -t
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 3:56 ` Dave Chinner
@ 2016-08-12 18:03 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-12 18:03 UTC (permalink / raw)
To: Dave Chinner, Tejun Heo, Wu Fengguang, Kirill A. Shutemov
Cc: Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, LKP
[-- Attachment #1: Type: text/plain, Size: 1739 bytes --]
On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
>>
>> I don't recall having ever seen the mapping tree_lock as a contention
>> point before, but it's not like I've tried that load either. So it
>> might be a regression (going back long, I suspect), or just an unusual
>> load that nobody has traditionally tested much.
>>
>> Single-threaded big file write one page at a time, was it?
>
> Yup. On a 4 node NUMA system.
Ok, I can't see any real contention on my single-node workstation
(running ext4 too, so there may be filesystem differences), but I
guess that shouldn't surprise me. The cacheline bouncing just isn't
expensive enough when it all stays on-die.
I can see the tree_lock in my profiles (just not very high), and at
least for ext4 the main caller seems to be
__set_page_dirty_nobuffers().
And yes, looking at that, the biggest cost by _far_ inside the
spinlock seems to be the accounting.
Which doesn't even have to be inside the mapping lock, as far as I can
tell, and as far as comments go.
So a stupid patch to just move the dirty page accounting to outside
the spinlock might help a lot.
Does this attached patch help your contention numbers?
Adding a few people who get blamed for account_page_dirtied() and
inode_attach_wb() just to make sure that nobody expected the
mapping_lock spinlock to be held when calling account_page_dirtied().
I realize that this has nothing to do with the AIM7 regression (the
spinlock just isn't high enough in that profile), but your contention
numbers just aren't right, and updating accounting statistics inside a
critical spinlock when not needed is just wrong.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 2276 bytes --]
fs/buffer.c | 5 ++++-
fs/xfs/xfs_aops.c | 5 ++++-
mm/page-writeback.c | 2 +-
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 9c8eb9b6db6a..f79a9d241589 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -628,15 +628,18 @@ static void __set_page_dirty(struct page *page, struct address_space *mapping,
int warn)
{
unsigned long flags;
+ bool account = false;
spin_lock_irqsave(&mapping->tree_lock, flags);
if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(warn && !PageUptodate(page));
- account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
+ account = true;
}
spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ if (account)
+ account_page_dirtied(page, mapping);
}
/*
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 7575cfc3ad15..59169c36765e 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1490,15 +1490,18 @@ xfs_vm_set_page_dirty(
if (newly_dirty) {
/* sigh - __set_page_dirty() is static, so copy it here, too */
unsigned long flags;
+ bool account = false;
spin_lock_irqsave(&mapping->tree_lock, flags);
if (page->mapping) { /* Race with truncate? */
WARN_ON_ONCE(!PageUptodate(page));
- account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
+ account = true;
}
spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ if (account)
+ account_page_dirtied(page, mapping);
}
unlock_page_memcg(page);
if (newly_dirty)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f4cd7d8005c9..9a6a6b99acfe 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2517,10 +2517,10 @@ int __set_page_dirty_nobuffers(struct page *page)
spin_lock_irqsave(&mapping->tree_lock, flags);
BUG_ON(page_mapping(page) != mapping);
WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
- account_page_dirtied(page, mapping);
radix_tree_tag_set(&mapping->page_tree, page_index(page),
PAGECACHE_TAG_DIRTY);
spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ account_page_dirtied(page, mapping);
unlock_page_memcg(page);
if (mapping->host) {
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 10:02 ` Dave Chinner
@ 2016-08-13 0:30 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-13 0:30 UTC (permalink / raw)
To: Dave Chinner
Cc: Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
> Which says "no change". Oh well, back to the drawing board...
I don't see how it would change things much - for all relevant calculations
we convert to block units first anyway.
But the whole xfs_iomap_write_delay is a giant mess anyway. For a usual
call we do at least four lookups in the extent btree, which seems rather
costly. Especially given that the low-level xfs_bmap_search_extents
interface would give us all required information in one single call.
Below is a patch I hacked up this morning to do just that. It passes
xfstests, but I've not done any real benchmarking with it. If the
reduced lookup overhead in it doesn't help enough we'll need some
sort of look-aside cache for the information, but I hope that we
can avoid that. And yes, it's a rather large patch - but the old
path was so entangled that I couldn't come up with something lighter.
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index b060bca..614803b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1388,7 +1388,7 @@ xfs_bmap_search_multi_extents(
* Else, *lastxp will be set to the index of the found
* entry; *gotp will contain the entry.
*/
-STATIC xfs_bmbt_rec_host_t * /* pointer to found extent entry */
+xfs_bmbt_rec_host_t * /* pointer to found extent entry */
xfs_bmap_search_extents(
xfs_inode_t *ip, /* incore inode pointer */
xfs_fileoff_t bno, /* block number searched for */
@@ -4074,7 +4074,7 @@ xfs_bmapi_read(
return 0;
}
-STATIC int
+int
xfs_bmapi_reserve_delalloc(
struct xfs_inode *ip,
xfs_fileoff_t aoff,
@@ -4170,91 +4170,6 @@ out_unreserve_quota:
return error;
}
-/*
- * Map file blocks to filesystem blocks, adding delayed allocations as needed.
- */
-int
-xfs_bmapi_delay(
- struct xfs_inode *ip, /* incore inode */
- xfs_fileoff_t bno, /* starting file offs. mapped */
- xfs_filblks_t len, /* length to map in file */
- struct xfs_bmbt_irec *mval, /* output: map values */
- int *nmap, /* i/o: mval size/count */
- int flags) /* XFS_BMAPI_... */
-{
- struct xfs_mount *mp = ip->i_mount;
- struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
- struct xfs_bmbt_irec got; /* current file extent record */
- struct xfs_bmbt_irec prev; /* previous file extent record */
- xfs_fileoff_t obno; /* old block number (offset) */
- xfs_fileoff_t end; /* end of mapped file region */
- xfs_extnum_t lastx; /* last useful extent number */
- int eof; /* we've hit the end of extents */
- int n = 0; /* current extent index */
- int error = 0;
-
- ASSERT(*nmap >= 1);
- ASSERT(*nmap <= XFS_BMAP_MAX_NMAP);
- ASSERT(!(flags & ~XFS_BMAPI_ENTIRE));
- ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
- if (unlikely(XFS_TEST_ERROR(
- (XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_EXTENTS &&
- XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_BTREE),
- mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
- XFS_ERROR_REPORT("xfs_bmapi_delay", XFS_ERRLEVEL_LOW, mp);
- return -EFSCORRUPTED;
- }
-
- if (XFS_FORCED_SHUTDOWN(mp))
- return -EIO;
-
- XFS_STATS_INC(mp, xs_blk_mapw);
-
- if (!(ifp->if_flags & XFS_IFEXTENTS)) {
- error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
- if (error)
- return error;
- }
-
- xfs_bmap_search_extents(ip, bno, XFS_DATA_FORK, &eof, &lastx, &got, &prev);
- end = bno + len;
- obno = bno;
-
- while (bno < end && n < *nmap) {
- if (eof || got.br_startoff > bno) {
- error = xfs_bmapi_reserve_delalloc(ip, bno, len, &got,
- &prev, &lastx, eof);
- if (error) {
- if (n == 0) {
- *nmap = 0;
- return error;
- }
- break;
- }
- }
-
- /* set up the extent map to return. */
- xfs_bmapi_trim_map(mval, &got, &bno, len, obno, end, n, flags);
- xfs_bmapi_update_map(&mval, &bno, &len, obno, end, &n, flags);
-
- /* If we're done, stop now. */
- if (bno >= end || n >= *nmap)
- break;
-
- /* Else go on to the next record. */
- prev = got;
- if (++lastx < ifp->if_bytes / sizeof(xfs_bmbt_rec_t))
- xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx), &got);
- else
- eof = 1;
- }
-
- *nmap = n;
- return 0;
-}
-
-
static int
xfs_bmapi_allocate(
struct xfs_bmalloca *bma)
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 254034f..d660069 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -181,9 +181,6 @@ int xfs_bmap_read_extents(struct xfs_trans *tp, struct xfs_inode *ip,
int xfs_bmapi_read(struct xfs_inode *ip, xfs_fileoff_t bno,
xfs_filblks_t len, struct xfs_bmbt_irec *mval,
int *nmap, int flags);
-int xfs_bmapi_delay(struct xfs_inode *ip, xfs_fileoff_t bno,
- xfs_filblks_t len, struct xfs_bmbt_irec *mval,
- int *nmap, int flags);
int xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t bno, xfs_filblks_t len, int flags,
xfs_fsblock_t *firstblock, xfs_extlen_t total,
@@ -202,5 +199,12 @@ int xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_defer_ops *dfops, enum shift_direction direction,
int num_exts);
int xfs_bmap_split_extent(struct xfs_inode *ip, xfs_fileoff_t split_offset);
+struct xfs_bmbt_rec_host *
+ xfs_bmap_search_extents(struct xfs_inode *ip, xfs_fileoff_t bno,
+ int fork, int *eofp, xfs_extnum_t *lastxp,
+ struct xfs_bmbt_irec *gotp, struct xfs_bmbt_irec *prevp);
+int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, xfs_fileoff_t aoff,
+ xfs_filblks_t len, struct xfs_bmbt_irec *got,
+ struct xfs_bmbt_irec *prev, xfs_extnum_t *lastx, int eof);
#endif /* __XFS_BMAP_H__ */
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 2114d53..b40e31b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -42,11 +42,62 @@
#define XFS_WRITEIO_ALIGN(mp,off) (((off) >> mp->m_writeio_log) \
<< mp->m_writeio_log)
-#define XFS_WRITE_IMAPS XFS_BMAP_MAX_NMAP
+
+void
+xfs_bmbt_to_iomap(
+ struct xfs_inode *ip,
+ struct iomap *iomap,
+ struct xfs_bmbt_irec *imap)
+{
+ struct xfs_mount *mp = ip->i_mount;
+
+ if (imap->br_startblock == HOLESTARTBLOCK) {
+ iomap->blkno = IOMAP_NULL_BLOCK;
+ iomap->type = IOMAP_HOLE;
+ } else if (imap->br_startblock == DELAYSTARTBLOCK) {
+ iomap->blkno = IOMAP_NULL_BLOCK;
+ iomap->type = IOMAP_DELALLOC;
+ } else {
+ iomap->blkno = xfs_fsb_to_db(ip, imap->br_startblock);
+ if (imap->br_state == XFS_EXT_UNWRITTEN)
+ iomap->type = IOMAP_UNWRITTEN;
+ else
+ iomap->type = IOMAP_MAPPED;
+ }
+ iomap->offset = XFS_FSB_TO_B(mp, imap->br_startoff);
+ iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
+ iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
+}
+
+static xfs_extlen_t
+xfs_align_eof(
+ struct xfs_inode *ip)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ xfs_extlen_t align = 0;
+
+ ASSERT(!XFS_IS_REALTIME_INODE(ip));
+
+ /*
+ * Round up the allocation request to a stripe unit (m_dalign)
+ * boundary if the file size is >= stripe unit size, and we are
+ * allocating past the allocation eof.
+ *
+ * If mounted with the "-o swalloc" option the alignment is
+ * increased from the stripe unit size to the stripe width.
+ */
+ if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
+ align = mp->m_swidth;
+ else if (mp->m_dalign)
+ align = mp->m_dalign;
+
+ if (align && XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, align))
+ align = 0;
+ return align;
+}
STATIC int
xfs_iomap_eof_align_last_fsb(
- xfs_mount_t *mp,
xfs_inode_t *ip,
xfs_extlen_t extsize,
xfs_fileoff_t *last_fsb)
@@ -54,23 +105,8 @@ xfs_iomap_eof_align_last_fsb(
xfs_extlen_t align = 0;
int eof, error;
- if (!XFS_IS_REALTIME_INODE(ip)) {
- /*
- * Round up the allocation request to a stripe unit
- * (m_dalign) boundary if the file size is >= stripe unit
- * size, and we are allocating past the allocation eof.
- *
- * If mounted with the "-o swalloc" option the alignment is
- * increased from the strip unit size to the stripe width.
- */
- if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
- align = mp->m_swidth;
- else if (mp->m_dalign)
- align = mp->m_dalign;
-
- if (align && XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, align))
- align = 0;
- }
+ if (!XFS_IS_REALTIME_INODE(ip))
+ align = xfs_align_eof(ip);
/*
* Always round up the allocation request to an extent boundary
@@ -154,7 +190,7 @@ xfs_iomap_write_direct(
*/
ASSERT(XFS_IFORK_PTR(ip, XFS_DATA_FORK)->if_flags &
XFS_IFEXTENTS);
- error = xfs_iomap_eof_align_last_fsb(mp, ip, extsz, &last_fsb);
+ error = xfs_iomap_eof_align_last_fsb(ip, extsz, &last_fsb);
if (error)
goto out_unlock;
} else {
@@ -274,130 +310,6 @@ out_trans_cancel:
goto out_unlock;
}
-/*
- * If the caller is doing a write at the end of the file, then extend the
- * allocation out to the file system's write iosize. We clean up any extra
- * space left over when the file is closed in xfs_inactive().
- *
- * If we find we already have delalloc preallocation beyond EOF, don't do more
- * preallocation as it it not needed.
- */
-STATIC int
-xfs_iomap_eof_want_preallocate(
- xfs_mount_t *mp,
- xfs_inode_t *ip,
- xfs_off_t offset,
- size_t count,
- xfs_bmbt_irec_t *imap,
- int nimaps,
- int *prealloc)
-{
- xfs_fileoff_t start_fsb;
- xfs_filblks_t count_fsb;
- int n, error, imaps;
- int found_delalloc = 0;
-
- *prealloc = 0;
- if (offset + count <= XFS_ISIZE(ip))
- return 0;
-
- /*
- * If the file is smaller than the minimum prealloc and we are using
- * dynamic preallocation, don't do any preallocation at all as it is
- * likely this is the only write to the file that is going to be done.
- */
- if (!(mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) &&
- XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_writeio_blocks))
- return 0;
-
- /*
- * If there are any real blocks past eof, then don't
- * do any speculative allocation.
- */
- start_fsb = XFS_B_TO_FSBT(mp, ((xfs_ufsize_t)(offset + count - 1)));
- count_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
- while (count_fsb > 0) {
- imaps = nimaps;
- error = xfs_bmapi_read(ip, start_fsb, count_fsb, imap, &imaps,
- 0);
- if (error)
- return error;
- for (n = 0; n < imaps; n++) {
- if ((imap[n].br_startblock != HOLESTARTBLOCK) &&
- (imap[n].br_startblock != DELAYSTARTBLOCK))
- return 0;
- start_fsb += imap[n].br_blockcount;
- count_fsb -= imap[n].br_blockcount;
-
- if (imap[n].br_startblock == DELAYSTARTBLOCK)
- found_delalloc = 1;
- }
- }
- if (!found_delalloc)
- *prealloc = 1;
- return 0;
-}
-
-/*
- * Determine the initial size of the preallocation. We are beyond the current
- * EOF here, but we need to take into account whether this is a sparse write or
- * an extending write when determining the preallocation size. Hence we need to
- * look up the extent that ends at the current write offset and use the result
- * to determine the preallocation size.
- *
- * If the extent is a hole, then preallocation is essentially disabled.
- * Otherwise we take the size of the preceeding data extent as the basis for the
- * preallocation size. If the size of the extent is greater than half the
- * maximum extent length, then use the current offset as the basis. This ensures
- * that for large files the preallocation size always extends to MAXEXTLEN
- * rather than falling short due to things like stripe unit/width alignment of
- * real extents.
- */
-STATIC xfs_fsblock_t
-xfs_iomap_eof_prealloc_initial_size(
- struct xfs_mount *mp,
- struct xfs_inode *ip,
- xfs_off_t offset,
- xfs_bmbt_irec_t *imap,
- int nimaps)
-{
- xfs_fileoff_t start_fsb;
- int imaps = 1;
- int error;
-
- ASSERT(nimaps >= imaps);
-
- /* if we are using a specific prealloc size, return now */
- if (mp->m_flags & XFS_MOUNT_DFLT_IOSIZE)
- return 0;
-
- /* If the file is small, then use the minimum prealloc */
- if (XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_dalign))
- return 0;
-
- /*
- * As we write multiple pages, the offset will always align to the
- * start of a page and hence point to a hole at EOF. i.e. if the size is
- * 4096 bytes, we only have one block at FSB 0, but XFS_B_TO_FSB(4096)
- * will return FSB 1. Hence if there are blocks in the file, we want to
- * point to the block prior to the EOF block and not the hole that maps
- * directly at @offset.
- */
- start_fsb = XFS_B_TO_FSB(mp, offset);
- if (start_fsb)
- start_fsb--;
- error = xfs_bmapi_read(ip, start_fsb, 1, imap, &imaps, XFS_BMAPI_ENTIRE);
- if (error)
- return 0;
-
- ASSERT(imaps == 1);
- if (imap[0].br_startblock == HOLESTARTBLOCK)
- return 0;
- if (imap[0].br_blockcount <= (MAXEXTLEN >> 1))
- return imap[0].br_blockcount << 1;
- return XFS_B_TO_FSB(mp, offset);
-}
-
STATIC bool
xfs_quota_need_throttle(
struct xfs_inode *ip,
@@ -466,20 +378,37 @@ xfs_quota_calc_throttle(
*/
STATIC xfs_fsblock_t
xfs_iomap_prealloc_size(
- struct xfs_mount *mp,
struct xfs_inode *ip,
xfs_off_t offset,
- struct xfs_bmbt_irec *imap,
- int nimaps)
+ struct xfs_bmbt_irec *prev)
{
- xfs_fsblock_t alloc_blocks = 0;
+ struct xfs_mount *mp = ip->i_mount;
int shift = 0;
int64_t freesp;
xfs_fsblock_t qblocks;
int qshift = 0;
+ xfs_fsblock_t alloc_blocks = 0;
- alloc_blocks = xfs_iomap_eof_prealloc_initial_size(mp, ip, offset,
- imap, nimaps);
+ /*
+ * Determine the initial size of the preallocation. We are beyond the
+ * current EOF here, but we need to take into account whether this is
+ * a sparse write or an extending write when determining the
+ * preallocation size. Hence we need to look up the extent that ends
+ * at the current write offset and use the result to determine the
+ * preallocation size.
+ *
+ * If the extent is a hole, then preallocation is essentially disabled.
+ * Otherwise we take the size of the preceding data extent as the basis
+ * for the preallocation size. If the size of the extent is greater than
+ * half the maximum extent length, then use the current offset as the
+ * basis. This ensures that for large files the preallocation size
+ * always extends to MAXEXTLEN rather than falling short due to things
+ * like stripe unit/width alignment of real extents.
+ */
+ if (prev->br_blockcount <= (MAXEXTLEN >> 1))
+ alloc_blocks = prev->br_blockcount << 1;
+ else
+ alloc_blocks = XFS_B_TO_FSB(mp, offset);
if (!alloc_blocks)
goto check_writeio;
qblocks = alloc_blocks;
@@ -550,120 +479,166 @@ xfs_iomap_prealloc_size(
*/
while (alloc_blocks && alloc_blocks >= freesp)
alloc_blocks >>= 4;
-
check_writeio:
if (alloc_blocks < mp->m_writeio_blocks)
alloc_blocks = mp->m_writeio_blocks;
-
trace_xfs_iomap_prealloc_size(ip, alloc_blocks, shift,
mp->m_writeio_blocks);
-
return alloc_blocks;
}
-int
-xfs_iomap_write_delay(
- xfs_inode_t *ip,
- xfs_off_t offset,
- size_t count,
- xfs_bmbt_irec_t *ret_imap)
+static int
+xfs_file_iomap_begin_delay(
+ struct inode *inode,
+ loff_t offset,
+ loff_t count,
+ unsigned flags,
+ struct iomap *iomap)
{
- xfs_mount_t *mp = ip->i_mount;
- xfs_fileoff_t offset_fsb;
- xfs_fileoff_t last_fsb;
- xfs_off_t aligned_offset;
- xfs_fileoff_t ioalign;
- xfs_extlen_t extsz;
- int nimaps;
- xfs_bmbt_irec_t imap[XFS_WRITE_IMAPS];
- int prealloc;
- int error;
-
- ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
- /*
- * Make sure that the dquots are there. This doesn't hold
- * the ilock across a disk read.
- */
- error = xfs_qm_dqattach_locked(ip, 0);
- if (error)
- return error;
-
- extsz = xfs_get_extsz_hint(ip);
- offset_fsb = XFS_B_TO_FSBT(mp, offset);
+ struct xfs_inode *ip = XFS_I(inode);
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+ xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
+ xfs_fileoff_t maxbytes_fsb =
+ XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+ xfs_fileoff_t end_fsb, orig_end_fsb;
+ int error = 0, eof = 0;
+ struct xfs_bmbt_irec got;
+ struct xfs_bmbt_irec prev;
+ xfs_extnum_t idx;
+
+ ASSERT(!XFS_IS_REALTIME_INODE(ip));
+ ASSERT(!xfs_get_extsz_hint(ip));
- error = xfs_iomap_eof_want_preallocate(mp, ip, offset, count,
- imap, XFS_WRITE_IMAPS, &prealloc);
- if (error)
- return error;
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
-retry:
- if (prealloc) {
- xfs_fsblock_t alloc_blocks;
+ if (unlikely(XFS_TEST_ERROR(
+ (XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_EXTENTS &&
+ XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_BTREE),
+ mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
+ XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, mp);
+ error = -EFSCORRUPTED;
+ goto out_unlock;
+ }
- alloc_blocks = xfs_iomap_prealloc_size(mp, ip, offset, imap,
- XFS_WRITE_IMAPS);
+ XFS_STATS_INC(mp, xs_blk_mapw);
- aligned_offset = XFS_WRITEIO_ALIGN(mp, (offset + count - 1));
- ioalign = XFS_B_TO_FSBT(mp, aligned_offset);
- last_fsb = ioalign + alloc_blocks;
- } else {
- last_fsb = XFS_B_TO_FSB(mp, ((xfs_ufsize_t)(offset + count)));
+ if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+ error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
+ if (error)
+ goto out_unlock;
}
- if (prealloc || extsz) {
- error = xfs_iomap_eof_align_last_fsb(mp, ip, extsz, &last_fsb);
- if (error)
- return error;
+ xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
+ &got, &prev);
+ if (!eof && got.br_startoff <= offset_fsb) {
+ trace_xfs_iomap_found(ip, offset, count, 0, &got);
+ goto done;
}
+ error = xfs_qm_dqattach_locked(ip, 0);
+ if (error)
+ goto out_unlock;
+
/*
- * Make sure preallocation does not create extents beyond the range we
- * actually support in this filesystem.
+ * We cap the maximum length we map here to MAX_WRITEBACK_PAGES pages
+ * to keep the chunks of work done here somewhat symmetric with the
+ * work writeback does. This is a completely arbitrary number pulled
+ * out of thin air as a best guess for initial testing.
+ *
+ * Note that the value needs to be less than 32-bits wide until
+ * the lower level functions are updated.
*/
- if (last_fsb > XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes))
- last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+ count = min_t(loff_t, count, 1024 * PAGE_SIZE);
+ end_fsb = orig_end_fsb =
+ min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
- ASSERT(last_fsb > offset_fsb);
+ /*
+ * If we are doing a write at the end of the file and there are no
+ * allocations past this one, then extend the allocation out to the
+ * file system's write iosize.
+ *
+ * As an exception we don't do any preallocation at all if the file
+ * is smaller than the minimum preallocation and we are using the
+ * default dynamic preallocation scheme, as it is likely this is the
+ * only write to the file that is going to be done.
+ *
+ * We clean up any extra space left over when the file is closed in
+ * xfs_inactive().
+ */
+ if (eof && offset + count > XFS_ISIZE(ip) &&
+ ((mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) ||
+ XFS_ISIZE(ip) >= XFS_FSB_TO_B(mp, mp->m_writeio_blocks))) {
+ xfs_fsblock_t alloc_blocks;
+ xfs_off_t aligned_offset;
+ xfs_extlen_t align;
- nimaps = XFS_WRITE_IMAPS;
- error = xfs_bmapi_delay(ip, offset_fsb, last_fsb - offset_fsb,
- imap, &nimaps, XFS_BMAPI_ENTIRE);
+ /*
+ * If an explicit allocsize is set, the file is small, or we
+ * are writing behind a hole, then use the minimum prealloc:
+ */
+ if ((mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) ||
+ XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_dalign) ||
+ prev.br_startoff + prev.br_blockcount < offset_fsb)
+ alloc_blocks = mp->m_writeio_blocks;
+ else
+ alloc_blocks =
+ xfs_iomap_prealloc_size(ip, offset, &prev);
+
+ aligned_offset = XFS_WRITEIO_ALIGN(mp, offset + count - 1);
+ end_fsb = XFS_B_TO_FSBT(mp, aligned_offset) + alloc_blocks;
+
+ align = xfs_align_eof(ip);
+ if (align)
+ end_fsb = roundup_64(end_fsb, align);
+
+ end_fsb = min(end_fsb, maxbytes_fsb);
+ ASSERT(end_fsb > offset_fsb);
+ }
+
+retry:
+ error = xfs_bmapi_reserve_delalloc(ip, offset_fsb,
+ end_fsb - offset_fsb, &got,
+ &prev, &idx, eof);
switch (error) {
case 0:
+ break;
case -ENOSPC:
case -EDQUOT:
- break;
- default:
- return error;
- }
-
- /*
- * If bmapi returned us nothing, we got either ENOSPC or EDQUOT. Retry
- * without EOF preallocation.
- */
- if (nimaps == 0) {
+ /* retry without any preallocation */
trace_xfs_delalloc_enospc(ip, offset, count);
- if (prealloc) {
- prealloc = 0;
- error = 0;
+ if (end_fsb != orig_end_fsb) {
+ end_fsb = orig_end_fsb;
goto retry;
}
- return error ? error : -ENOSPC;
+ /*FALLTHRU*/
+ default:
+ goto out_unlock;
}
- if (!(imap[0].br_startblock || XFS_IS_REALTIME_INODE(ip)))
- return xfs_alert_fsblock_zero(ip, &imap[0]);
+ trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
+done:
+ if (isnullstartblock(got.br_startblock))
+ got.br_startblock = DELAYSTARTBLOCK;
+
+ if (!got.br_startblock) {
+ error = xfs_alert_fsblock_zero(ip, &got);
+ if (error)
+ goto out_unlock;
+ }
+
+ xfs_bmbt_to_iomap(ip, iomap, &got);
/*
* Tag the inode as speculatively preallocated so we can reclaim this
* space on demand, if necessary.
*/
- if (prealloc)
+ if (end_fsb != orig_end_fsb)
xfs_inode_set_eofblocks_tag(ip);
- *ret_imap = imap[0];
- return 0;
+out_unlock:
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ return error;
}
/*
@@ -943,32 +918,6 @@ error_on_bmapi_transaction:
return error;
}
-void
-xfs_bmbt_to_iomap(
- struct xfs_inode *ip,
- struct iomap *iomap,
- struct xfs_bmbt_irec *imap)
-{
- struct xfs_mount *mp = ip->i_mount;
-
- if (imap->br_startblock == HOLESTARTBLOCK) {
- iomap->blkno = IOMAP_NULL_BLOCK;
- iomap->type = IOMAP_HOLE;
- } else if (imap->br_startblock == DELAYSTARTBLOCK) {
- iomap->blkno = IOMAP_NULL_BLOCK;
- iomap->type = IOMAP_DELALLOC;
- } else {
- iomap->blkno = xfs_fsb_to_db(ip, imap->br_startblock);
- if (imap->br_state == XFS_EXT_UNWRITTEN)
- iomap->type = IOMAP_UNWRITTEN;
- else
- iomap->type = IOMAP_MAPPED;
- }
- iomap->offset = XFS_FSB_TO_B(mp, imap->br_startoff);
- iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
- iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
-}
-
static inline bool imap_needs_alloc(struct xfs_bmbt_irec *imap, int nimaps)
{
return !nimaps ||
@@ -993,6 +942,11 @@ xfs_file_iomap_begin(
if (XFS_FORCED_SHUTDOWN(mp))
return -EIO;
+ if ((flags & IOMAP_WRITE) && !xfs_get_extsz_hint(ip)) {
+ return xfs_file_iomap_begin_delay(inode, offset, length, flags,
+ iomap);
+ }
+
xfs_ilock(ip, XFS_ILOCK_EXCL);
ASSERT(offset <= mp->m_super->s_maxbytes);
@@ -1020,19 +974,13 @@ xfs_file_iomap_begin(
* the lower level functions are updated.
*/
length = min_t(loff_t, length, 1024 * PAGE_SIZE);
- if (xfs_get_extsz_hint(ip)) {
- /*
- * xfs_iomap_write_direct() expects the shared lock. It
- * is unlocked on return.
- */
- xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
- error = xfs_iomap_write_direct(ip, offset, length, &imap,
- nimaps);
- } else {
- error = xfs_iomap_write_delay(ip, offset, length, &imap);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- }
-
+ /*
+ * xfs_iomap_write_direct() expects the shared lock. It
+ * is unlocked on return.
+ */
+ xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
+ error = xfs_iomap_write_direct(ip, offset, length, &imap,
+ nimaps);
if (error)
return error;
diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
index e066d04..1fdf68d 100644
--- a/fs/xfs/xfs_iomap.h
+++ b/fs/xfs/xfs_iomap.h
@@ -25,8 +25,6 @@ struct xfs_bmbt_irec;
int xfs_iomap_write_direct(struct xfs_inode *, xfs_off_t, size_t,
struct xfs_bmbt_irec *, int);
-int xfs_iomap_write_delay(struct xfs_inode *, xfs_off_t, size_t,
- struct xfs_bmbt_irec *);
int xfs_iomap_write_allocate(struct xfs_inode *, xfs_off_t,
struct xfs_bmbt_irec *);
int xfs_iomap_write_unwritten(struct xfs_inode *, xfs_off_t, xfs_off_t);
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-13 0:30 ` Christoph Hellwig
0 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-13 0:30 UTC (permalink / raw)
To: lkp
On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
> Which says "no change". Oh well, back to the drawing board...
I don't see how it would change things much - for all relevant calculations
we convert to block units first anyway.
But the whole xfs_iomap_write_delay is a giant mess anyway. For a usual
call we do at least four lookups in the extent btree, which seems rather
costly, especially given that the low-level xfs_bmap_search_extents
interface would give us all required information in one single call.
Below is a patch I hacked up this morning to do just that. It passes
xfstests, but I've not done any real benchmarking with it. If the
reduced lookup overhead in it doesn't help enough we'll need some
sort of look-aside cache for the information, but I hope that we
can avoid that. And yes, it's a rather large patch - but the old
path was so entangled that I couldn't come up with something lighter.
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index b060bca..614803b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1388,7 +1388,7 @@ xfs_bmap_search_multi_extents(
* Else, *lastxp will be set to the index of the found
* entry; *gotp will contain the entry.
*/
-STATIC xfs_bmbt_rec_host_t * /* pointer to found extent entry */
+xfs_bmbt_rec_host_t * /* pointer to found extent entry */
xfs_bmap_search_extents(
xfs_inode_t *ip, /* incore inode pointer */
xfs_fileoff_t bno, /* block number searched for */
@@ -4074,7 +4074,7 @@ xfs_bmapi_read(
return 0;
}
-STATIC int
+int
xfs_bmapi_reserve_delalloc(
struct xfs_inode *ip,
xfs_fileoff_t aoff,
@@ -4170,91 +4170,6 @@ out_unreserve_quota:
return error;
}
-/*
- * Map file blocks to filesystem blocks, adding delayed allocations as needed.
- */
-int
-xfs_bmapi_delay(
- struct xfs_inode *ip, /* incore inode */
- xfs_fileoff_t bno, /* starting file offs. mapped */
- xfs_filblks_t len, /* length to map in file */
- struct xfs_bmbt_irec *mval, /* output: map values */
- int *nmap, /* i/o: mval size/count */
- int flags) /* XFS_BMAPI_... */
-{
- struct xfs_mount *mp = ip->i_mount;
- struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
- struct xfs_bmbt_irec got; /* current file extent record */
- struct xfs_bmbt_irec prev; /* previous file extent record */
- xfs_fileoff_t obno; /* old block number (offset) */
- xfs_fileoff_t end; /* end of mapped file region */
- xfs_extnum_t lastx; /* last useful extent number */
- int eof; /* we've hit the end of extents */
- int n = 0; /* current extent index */
- int error = 0;
-
- ASSERT(*nmap >= 1);
- ASSERT(*nmap <= XFS_BMAP_MAX_NMAP);
- ASSERT(!(flags & ~XFS_BMAPI_ENTIRE));
- ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
- if (unlikely(XFS_TEST_ERROR(
- (XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_EXTENTS &&
- XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_BTREE),
- mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
- XFS_ERROR_REPORT("xfs_bmapi_delay", XFS_ERRLEVEL_LOW, mp);
- return -EFSCORRUPTED;
- }
-
- if (XFS_FORCED_SHUTDOWN(mp))
- return -EIO;
-
- XFS_STATS_INC(mp, xs_blk_mapw);
-
- if (!(ifp->if_flags & XFS_IFEXTENTS)) {
- error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
- if (error)
- return error;
- }
-
- xfs_bmap_search_extents(ip, bno, XFS_DATA_FORK, &eof, &lastx, &got, &prev);
- end = bno + len;
- obno = bno;
-
- while (bno < end && n < *nmap) {
- if (eof || got.br_startoff > bno) {
- error = xfs_bmapi_reserve_delalloc(ip, bno, len, &got,
- &prev, &lastx, eof);
- if (error) {
- if (n == 0) {
- *nmap = 0;
- return error;
- }
- break;
- }
- }
-
- /* set up the extent map to return. */
- xfs_bmapi_trim_map(mval, &got, &bno, len, obno, end, n, flags);
- xfs_bmapi_update_map(&mval, &bno, &len, obno, end, &n, flags);
-
- /* If we're done, stop now. */
- if (bno >= end || n >= *nmap)
- break;
-
- /* Else go on to the next record. */
- prev = got;
- if (++lastx < ifp->if_bytes / sizeof(xfs_bmbt_rec_t))
- xfs_bmbt_get_all(xfs_iext_get_ext(ifp, lastx), &got);
- else
- eof = 1;
- }
-
- *nmap = n;
- return 0;
-}
-
-
static int
xfs_bmapi_allocate(
struct xfs_bmalloca *bma)
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 254034f..d660069 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -181,9 +181,6 @@ int xfs_bmap_read_extents(struct xfs_trans *tp, struct xfs_inode *ip,
int xfs_bmapi_read(struct xfs_inode *ip, xfs_fileoff_t bno,
xfs_filblks_t len, struct xfs_bmbt_irec *mval,
int *nmap, int flags);
-int xfs_bmapi_delay(struct xfs_inode *ip, xfs_fileoff_t bno,
- xfs_filblks_t len, struct xfs_bmbt_irec *mval,
- int *nmap, int flags);
int xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t bno, xfs_filblks_t len, int flags,
xfs_fsblock_t *firstblock, xfs_extlen_t total,
@@ -202,5 +199,12 @@ int xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_defer_ops *dfops, enum shift_direction direction,
int num_exts);
int xfs_bmap_split_extent(struct xfs_inode *ip, xfs_fileoff_t split_offset);
+struct xfs_bmbt_rec_host *
+ xfs_bmap_search_extents(struct xfs_inode *ip, xfs_fileoff_t bno,
+ int fork, int *eofp, xfs_extnum_t *lastxp,
+ struct xfs_bmbt_irec *gotp, struct xfs_bmbt_irec *prevp);
+int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, xfs_fileoff_t aoff,
+ xfs_filblks_t len, struct xfs_bmbt_irec *got,
+ struct xfs_bmbt_irec *prev, xfs_extnum_t *lastx, int eof);
#endif /* __XFS_BMAP_H__ */
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 2114d53..b40e31b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -42,11 +42,62 @@
#define XFS_WRITEIO_ALIGN(mp,off) (((off) >> mp->m_writeio_log) \
<< mp->m_writeio_log)
-#define XFS_WRITE_IMAPS XFS_BMAP_MAX_NMAP
+
+void
+xfs_bmbt_to_iomap(
+ struct xfs_inode *ip,
+ struct iomap *iomap,
+ struct xfs_bmbt_irec *imap)
+{
+ struct xfs_mount *mp = ip->i_mount;
+
+ if (imap->br_startblock == HOLESTARTBLOCK) {
+ iomap->blkno = IOMAP_NULL_BLOCK;
+ iomap->type = IOMAP_HOLE;
+ } else if (imap->br_startblock == DELAYSTARTBLOCK) {
+ iomap->blkno = IOMAP_NULL_BLOCK;
+ iomap->type = IOMAP_DELALLOC;
+ } else {
+ iomap->blkno = xfs_fsb_to_db(ip, imap->br_startblock);
+ if (imap->br_state == XFS_EXT_UNWRITTEN)
+ iomap->type = IOMAP_UNWRITTEN;
+ else
+ iomap->type = IOMAP_MAPPED;
+ }
+ iomap->offset = XFS_FSB_TO_B(mp, imap->br_startoff);
+ iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
+ iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
+}
+
+static xfs_extlen_t
+xfs_align_eof(
+ struct xfs_inode *ip)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ xfs_extlen_t align = 0;
+
+ ASSERT(!XFS_IS_REALTIME_INODE(ip));
+
+ /*
+ * Round up the allocation request to a stripe unit (m_dalign)
+ * boundary if the file size is >= stripe unit size, and we are
+ * allocating past the allocation eof.
+ *
+ * If mounted with the "-o swalloc" option the alignment is
+ * increased from the stripe unit size to the stripe width.
+ */
+ if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
+ align = mp->m_swidth;
+ else if (mp->m_dalign)
+ align = mp->m_dalign;
+
+ if (align && XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, align))
+ align = 0;
+ return align;
+}
STATIC int
xfs_iomap_eof_align_last_fsb(
- xfs_mount_t *mp,
xfs_inode_t *ip,
xfs_extlen_t extsize,
xfs_fileoff_t *last_fsb)
@@ -54,23 +105,8 @@ xfs_iomap_eof_align_last_fsb(
xfs_extlen_t align = 0;
int eof, error;
- if (!XFS_IS_REALTIME_INODE(ip)) {
- /*
- * Round up the allocation request to a stripe unit
- * (m_dalign) boundary if the file size is >= stripe unit
- * size, and we are allocating past the allocation eof.
- *
- * If mounted with the "-o swalloc" option the alignment is
- * increased from the strip unit size to the stripe width.
- */
- if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
- align = mp->m_swidth;
- else if (mp->m_dalign)
- align = mp->m_dalign;
-
- if (align && XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, align))
- align = 0;
- }
+ if (!XFS_IS_REALTIME_INODE(ip))
+ align = xfs_align_eof(ip);
/*
* Always round up the allocation request to an extent boundary
@@ -154,7 +190,7 @@ xfs_iomap_write_direct(
*/
ASSERT(XFS_IFORK_PTR(ip, XFS_DATA_FORK)->if_flags &
XFS_IFEXTENTS);
- error = xfs_iomap_eof_align_last_fsb(mp, ip, extsz, &last_fsb);
+ error = xfs_iomap_eof_align_last_fsb(ip, extsz, &last_fsb);
if (error)
goto out_unlock;
} else {
@@ -274,130 +310,6 @@ out_trans_cancel:
goto out_unlock;
}
-/*
- * If the caller is doing a write at the end of the file, then extend the
- * allocation out to the file system's write iosize. We clean up any extra
- * space left over when the file is closed in xfs_inactive().
- *
- * If we find we already have delalloc preallocation beyond EOF, don't do more
- * preallocation as it it not needed.
- */
-STATIC int
-xfs_iomap_eof_want_preallocate(
- xfs_mount_t *mp,
- xfs_inode_t *ip,
- xfs_off_t offset,
- size_t count,
- xfs_bmbt_irec_t *imap,
- int nimaps,
- int *prealloc)
-{
- xfs_fileoff_t start_fsb;
- xfs_filblks_t count_fsb;
- int n, error, imaps;
- int found_delalloc = 0;
-
- *prealloc = 0;
- if (offset + count <= XFS_ISIZE(ip))
- return 0;
-
- /*
- * If the file is smaller than the minimum prealloc and we are using
- * dynamic preallocation, don't do any preallocation at all as it is
- * likely this is the only write to the file that is going to be done.
- */
- if (!(mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) &&
- XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_writeio_blocks))
- return 0;
-
- /*
- * If there are any real blocks past eof, then don't
- * do any speculative allocation.
- */
- start_fsb = XFS_B_TO_FSBT(mp, ((xfs_ufsize_t)(offset + count - 1)));
- count_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
- while (count_fsb > 0) {
- imaps = nimaps;
- error = xfs_bmapi_read(ip, start_fsb, count_fsb, imap, &imaps,
- 0);
- if (error)
- return error;
- for (n = 0; n < imaps; n++) {
- if ((imap[n].br_startblock != HOLESTARTBLOCK) &&
- (imap[n].br_startblock != DELAYSTARTBLOCK))
- return 0;
- start_fsb += imap[n].br_blockcount;
- count_fsb -= imap[n].br_blockcount;
-
- if (imap[n].br_startblock == DELAYSTARTBLOCK)
- found_delalloc = 1;
- }
- }
- if (!found_delalloc)
- *prealloc = 1;
- return 0;
-}
-
-/*
- * Determine the initial size of the preallocation. We are beyond the current
- * EOF here, but we need to take into account whether this is a sparse write or
- * an extending write when determining the preallocation size. Hence we need to
- * look up the extent that ends at the current write offset and use the result
- * to determine the preallocation size.
- *
- * If the extent is a hole, then preallocation is essentially disabled.
- * Otherwise we take the size of the preceeding data extent as the basis for the
- * preallocation size. If the size of the extent is greater than half the
- * maximum extent length, then use the current offset as the basis. This ensures
- * that for large files the preallocation size always extends to MAXEXTLEN
- * rather than falling short due to things like stripe unit/width alignment of
- * real extents.
- */
-STATIC xfs_fsblock_t
-xfs_iomap_eof_prealloc_initial_size(
- struct xfs_mount *mp,
- struct xfs_inode *ip,
- xfs_off_t offset,
- xfs_bmbt_irec_t *imap,
- int nimaps)
-{
- xfs_fileoff_t start_fsb;
- int imaps = 1;
- int error;
-
- ASSERT(nimaps >= imaps);
-
- /* if we are using a specific prealloc size, return now */
- if (mp->m_flags & XFS_MOUNT_DFLT_IOSIZE)
- return 0;
-
- /* If the file is small, then use the minimum prealloc */
- if (XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_dalign))
- return 0;
-
- /*
- * As we write multiple pages, the offset will always align to the
- * start of a page and hence point to a hole at EOF. i.e. if the size is
- * 4096 bytes, we only have one block at FSB 0, but XFS_B_TO_FSB(4096)
- * will return FSB 1. Hence if there are blocks in the file, we want to
- * point to the block prior to the EOF block and not the hole that maps
- * directly at @offset.
- */
- start_fsb = XFS_B_TO_FSB(mp, offset);
- if (start_fsb)
- start_fsb--;
- error = xfs_bmapi_read(ip, start_fsb, 1, imap, &imaps, XFS_BMAPI_ENTIRE);
- if (error)
- return 0;
-
- ASSERT(imaps == 1);
- if (imap[0].br_startblock == HOLESTARTBLOCK)
- return 0;
- if (imap[0].br_blockcount <= (MAXEXTLEN >> 1))
- return imap[0].br_blockcount << 1;
- return XFS_B_TO_FSB(mp, offset);
-}
-
STATIC bool
xfs_quota_need_throttle(
struct xfs_inode *ip,
@@ -466,20 +378,37 @@ xfs_quota_calc_throttle(
*/
STATIC xfs_fsblock_t
xfs_iomap_prealloc_size(
- struct xfs_mount *mp,
struct xfs_inode *ip,
xfs_off_t offset,
- struct xfs_bmbt_irec *imap,
- int nimaps)
+ struct xfs_bmbt_irec *prev)
{
- xfs_fsblock_t alloc_blocks = 0;
+ struct xfs_mount *mp = ip->i_mount;
int shift = 0;
int64_t freesp;
xfs_fsblock_t qblocks;
int qshift = 0;
+ xfs_fsblock_t alloc_blocks = 0;
- alloc_blocks = xfs_iomap_eof_prealloc_initial_size(mp, ip, offset,
- imap, nimaps);
+ /*
+ * Determine the initial size of the preallocation. We are beyond the
+ * current EOF here, but we need to take into account whether this is
+ * a sparse write or an extending write when determining the
+ * preallocation size. Hence we need to look up the extent that ends
+ * at the current write offset and use the result to determine the
+ * preallocation size.
+ *
+ * If the extent is a hole, then preallocation is essentially disabled.
+ * Otherwise we take the size of the preceding data extent as the basis
+ * for the preallocation size. If the size of the extent is greater than
+ * half the maximum extent length, then use the current offset as the
+ * basis. This ensures that for large files the preallocation size
+ * always extends to MAXEXTLEN rather than falling short due to things
+ * like stripe unit/width alignment of real extents.
+ */
+ if (prev->br_blockcount <= (MAXEXTLEN >> 1))
+ alloc_blocks = prev->br_blockcount << 1;
+ else
+ alloc_blocks = XFS_B_TO_FSB(mp, offset);
if (!alloc_blocks)
goto check_writeio;
qblocks = alloc_blocks;
@@ -550,120 +479,166 @@ xfs_iomap_prealloc_size(
*/
while (alloc_blocks && alloc_blocks >= freesp)
alloc_blocks >>= 4;
-
check_writeio:
if (alloc_blocks < mp->m_writeio_blocks)
alloc_blocks = mp->m_writeio_blocks;
-
trace_xfs_iomap_prealloc_size(ip, alloc_blocks, shift,
mp->m_writeio_blocks);
-
return alloc_blocks;
}
-int
-xfs_iomap_write_delay(
- xfs_inode_t *ip,
- xfs_off_t offset,
- size_t count,
- xfs_bmbt_irec_t *ret_imap)
+static int
+xfs_file_iomap_begin_delay(
+ struct inode *inode,
+ loff_t offset,
+ loff_t count,
+ unsigned flags,
+ struct iomap *iomap)
{
- xfs_mount_t *mp = ip->i_mount;
- xfs_fileoff_t offset_fsb;
- xfs_fileoff_t last_fsb;
- xfs_off_t aligned_offset;
- xfs_fileoff_t ioalign;
- xfs_extlen_t extsz;
- int nimaps;
- xfs_bmbt_irec_t imap[XFS_WRITE_IMAPS];
- int prealloc;
- int error;
-
- ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-
- /*
- * Make sure that the dquots are there. This doesn't hold
- * the ilock across a disk read.
- */
- error = xfs_qm_dqattach_locked(ip, 0);
- if (error)
- return error;
-
- extsz = xfs_get_extsz_hint(ip);
- offset_fsb = XFS_B_TO_FSBT(mp, offset);
+ struct xfs_inode *ip = XFS_I(inode);
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+ xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
+ xfs_fileoff_t maxbytes_fsb =
+ XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+ xfs_fileoff_t end_fsb, orig_end_fsb;
+ int error = 0, eof = 0;
+ struct xfs_bmbt_irec got;
+ struct xfs_bmbt_irec prev;
+ xfs_extnum_t idx;
+
+ ASSERT(!XFS_IS_REALTIME_INODE(ip));
+ ASSERT(!xfs_get_extsz_hint(ip));
- error = xfs_iomap_eof_want_preallocate(mp, ip, offset, count,
- imap, XFS_WRITE_IMAPS, &prealloc);
- if (error)
- return error;
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
-retry:
- if (prealloc) {
- xfs_fsblock_t alloc_blocks;
+ if (unlikely(XFS_TEST_ERROR(
+ (XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_EXTENTS &&
+ XFS_IFORK_FORMAT(ip, XFS_DATA_FORK) != XFS_DINODE_FMT_BTREE),
+ mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
+ XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, mp);
+ error = -EFSCORRUPTED;
+ goto out_unlock;
+ }
- alloc_blocks = xfs_iomap_prealloc_size(mp, ip, offset, imap,
- XFS_WRITE_IMAPS);
+ XFS_STATS_INC(mp, xs_blk_mapw);
- aligned_offset = XFS_WRITEIO_ALIGN(mp, (offset + count - 1));
- ioalign = XFS_B_TO_FSBT(mp, aligned_offset);
- last_fsb = ioalign + alloc_blocks;
- } else {
- last_fsb = XFS_B_TO_FSB(mp, ((xfs_ufsize_t)(offset + count)));
+ if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+ error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
+ if (error)
+ goto out_unlock;
}
- if (prealloc || extsz) {
- error = xfs_iomap_eof_align_last_fsb(mp, ip, extsz, &last_fsb);
- if (error)
- return error;
+ xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
+ &got, &prev);
+ if (!eof && got.br_startoff <= offset_fsb) {
+ trace_xfs_iomap_found(ip, offset, count, 0, &got);
+ goto done;
}
+ error = xfs_qm_dqattach_locked(ip, 0);
+ if (error)
+ goto out_unlock;
+
/*
- * Make sure preallocation does not create extents beyond the range we
- * actually support in this filesystem.
+ * We cap the maximum length we map here to MAX_WRITEBACK_PAGES pages
+ * to keep the chunks of work done here somewhat symmetric with the
+ * work writeback does. This is a completely arbitrary number pulled
+ * out of thin air as a best guess for initial testing.
+ *
+ * Note that the value needs to be less than 32 bits wide until
+ * the lower level functions are updated.
*/
- if (last_fsb > XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes))
- last_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
+ count = min_t(loff_t, count, 1024 * PAGE_SIZE);
+ end_fsb = orig_end_fsb =
+ min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
- ASSERT(last_fsb > offset_fsb);
+ /*
+ * If we are doing a write at the end of the file and there are no
+ * allocations past this one, then extend the allocation out to the
+ * file system's write iosize.
+ *
+ * As an exception we don't do any preallocation at all if the file
+ * is smaller than the minimum preallocation and we are using the
+ * default dynamic preallocation scheme, as it is likely this is the
+ * only write to the file that is going to be done.
+ *
+ * We clean up any extra space left over when the file is closed in
+ * xfs_inactive().
+ */
+ if (eof && offset + count > XFS_ISIZE(ip) &&
+ ((mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) ||
+ XFS_ISIZE(ip) >= XFS_FSB_TO_B(mp, mp->m_writeio_blocks))) {
+ xfs_fsblock_t alloc_blocks;
+ xfs_off_t aligned_offset;
+ xfs_extlen_t align;
- nimaps = XFS_WRITE_IMAPS;
- error = xfs_bmapi_delay(ip, offset_fsb, last_fsb - offset_fsb,
- imap, &nimaps, XFS_BMAPI_ENTIRE);
+ /*
+ * If an explicit allocsize is set, the file is small, or we
+ * are writing behind a hole, then use the minimum prealloc:
+ */
+ if ((mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) ||
+ XFS_ISIZE(ip) < XFS_FSB_TO_B(mp, mp->m_dalign) ||
+ prev.br_startoff + prev.br_blockcount < offset_fsb)
+ alloc_blocks = mp->m_writeio_blocks;
+ else
+ alloc_blocks =
+ xfs_iomap_prealloc_size(ip, offset, &prev);
+
+ aligned_offset = XFS_WRITEIO_ALIGN(mp, offset + count - 1);
+ end_fsb = XFS_B_TO_FSBT(mp, aligned_offset) + alloc_blocks;
+
+ align = xfs_align_eof(ip);
+ if (align)
+ end_fsb = roundup_64(end_fsb, align);
+
+ end_fsb = min(end_fsb, maxbytes_fsb);
+ ASSERT(end_fsb > offset_fsb);
+ }
+
+retry:
+ error = xfs_bmapi_reserve_delalloc(ip, offset_fsb,
+ end_fsb - offset_fsb, &got,
+ &prev, &idx, eof);
switch (error) {
case 0:
+ break;
case -ENOSPC:
case -EDQUOT:
- break;
- default:
- return error;
- }
-
- /*
- * If bmapi returned us nothing, we got either ENOSPC or EDQUOT. Retry
- * without EOF preallocation.
- */
- if (nimaps == 0) {
+ /* retry without any preallocation */
trace_xfs_delalloc_enospc(ip, offset, count);
- if (prealloc) {
- prealloc = 0;
- error = 0;
+ if (end_fsb != orig_end_fsb) {
+ end_fsb = orig_end_fsb;
goto retry;
}
- return error ? error : -ENOSPC;
+ /*FALLTHRU*/
+ default:
+ goto out_unlock;
}
- if (!(imap[0].br_startblock || XFS_IS_REALTIME_INODE(ip)))
- return xfs_alert_fsblock_zero(ip, &imap[0]);
+ trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
+done:
+ if (isnullstartblock(got.br_startblock))
+ got.br_startblock = DELAYSTARTBLOCK;
+
+ if (!got.br_startblock) {
+ error = xfs_alert_fsblock_zero(ip, &got);
+ if (error)
+ goto out_unlock;
+ }
+
+ xfs_bmbt_to_iomap(ip, iomap, &got);
/*
* Tag the inode as speculatively preallocated so we can reclaim this
* space on demand, if necessary.
*/
- if (prealloc)
+ if (end_fsb != orig_end_fsb)
xfs_inode_set_eofblocks_tag(ip);
- *ret_imap = imap[0];
- return 0;
+out_unlock:
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ return error;
}
/*
@@ -943,32 +918,6 @@ error_on_bmapi_transaction:
return error;
}
-void
-xfs_bmbt_to_iomap(
- struct xfs_inode *ip,
- struct iomap *iomap,
- struct xfs_bmbt_irec *imap)
-{
- struct xfs_mount *mp = ip->i_mount;
-
- if (imap->br_startblock == HOLESTARTBLOCK) {
- iomap->blkno = IOMAP_NULL_BLOCK;
- iomap->type = IOMAP_HOLE;
- } else if (imap->br_startblock == DELAYSTARTBLOCK) {
- iomap->blkno = IOMAP_NULL_BLOCK;
- iomap->type = IOMAP_DELALLOC;
- } else {
- iomap->blkno = xfs_fsb_to_db(ip, imap->br_startblock);
- if (imap->br_state == XFS_EXT_UNWRITTEN)
- iomap->type = IOMAP_UNWRITTEN;
- else
- iomap->type = IOMAP_MAPPED;
- }
- iomap->offset = XFS_FSB_TO_B(mp, imap->br_startoff);
- iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
- iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
-}
-
static inline bool imap_needs_alloc(struct xfs_bmbt_irec *imap, int nimaps)
{
return !nimaps ||
@@ -993,6 +942,11 @@ xfs_file_iomap_begin(
if (XFS_FORCED_SHUTDOWN(mp))
return -EIO;
+ if ((flags & IOMAP_WRITE) && !xfs_get_extsz_hint(ip)) {
+ return xfs_file_iomap_begin_delay(inode, offset, length, flags,
+ iomap);
+ }
+
xfs_ilock(ip, XFS_ILOCK_EXCL);
ASSERT(offset <= mp->m_super->s_maxbytes);
@@ -1020,19 +974,13 @@ xfs_file_iomap_begin(
* the lower level functions are updated.
*/
length = min_t(loff_t, length, 1024 * PAGE_SIZE);
- if (xfs_get_extsz_hint(ip)) {
- /*
- * xfs_iomap_write_direct() expects the shared lock. It
- * is unlocked on return.
- */
- xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
- error = xfs_iomap_write_direct(ip, offset, length, &imap,
- nimaps);
- } else {
- error = xfs_iomap_write_delay(ip, offset, length, &imap);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- }
-
+ /*
+ * xfs_iomap_write_direct() expects the shared lock. It
+ * is unlocked on return.
+ */
+ xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
+ error = xfs_iomap_write_direct(ip, offset, length, &imap,
+ nimaps);
if (error)
return error;
diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
index e066d04..1fdf68d 100644
--- a/fs/xfs/xfs_iomap.h
+++ b/fs/xfs/xfs_iomap.h
@@ -25,8 +25,6 @@ struct xfs_bmbt_irec;
int xfs_iomap_write_direct(struct xfs_inode *, xfs_off_t, size_t,
struct xfs_bmbt_irec *, int);
-int xfs_iomap_write_delay(struct xfs_inode *, xfs_off_t, size_t,
- struct xfs_bmbt_irec *);
int xfs_iomap_write_allocate(struct xfs_inode *, xfs_off_t,
struct xfs_bmbt_irec *);
int xfs_iomap_write_unwritten(struct xfs_inode *, xfs_off_t, xfs_off_t);
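The speculative preallocation sizing that the patch's comments describe (double the preceding extent up to half of MAXEXTLEN, throttle near ENOSPC, floor at the write I/O size) can be sketched as a rough userspace model. This is an illustration only, not the kernel code: the names and the simplified `freesp` handling are mine, and the real xfs_iomap_prealloc_size() also scales free space against dquot limits.

```python
MAXEXTLEN = 0x1FFFFF  # XFS maximum extent length in blocks (21-bit field)

def prealloc_size(prev_blockcount, offset_fsb, freesp, writeio_blocks):
    """Rough model of the prealloc sizing logic in the patch above.

    prev_blockcount: size (in FS blocks) of the data extent preceding the
                     write; 0 models a hole (preallocation disabled).
    offset_fsb:      current write offset in FS blocks.
    freesp:          free blocks left in the filesystem (throttle input).
    writeio_blocks:  minimum preallocation (mp->m_writeio_blocks).
    """
    # Double the preceding extent, unless it is already past half of
    # MAXEXTLEN -- then base the size on the file offset instead, so large
    # files still reach MAXEXTLEN despite stripe alignment of real extents.
    if prev_blockcount <= MAXEXTLEN >> 1:
        alloc_blocks = prev_blockcount << 1
    else:
        alloc_blocks = offset_fsb

    # Throttle hard as the filesystem approaches ENOSPC (simplified).
    while alloc_blocks and alloc_blocks >= freesp:
        alloc_blocks >>= 4

    # Never preallocate less than the write I/O size.
    return max(alloc_blocks, writeio_blocks)
```

With plenty of free space a 64-block preceding extent yields a 128-block preallocation; a write into a hole falls back to the minimum.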
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 0:30 ` Christoph Hellwig
@ 2016-08-13 21:48 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-13 21:48 UTC (permalink / raw)
To: Dave Chinner
Cc: Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
> Below is a patch I hacked up this morning to do just that. It passes
> xfstests, but I've not done any real benchmarking with it. If the
> reduced lookup overhead in it doesn't help enough we'll need some
> sort of look-aside cache for the information, but I hope that we
> can avoid that. And yes, it's a rather large patch - but the old
> path was so entangled that I couldn't come up with something lighter.
Hi Fengguang or Xiaolong,
any chance to add this thread to an lkp run? I've played around with
Dave's simplified xfs_io run, and while the end result for a 1k block
size looks pretty similar in terms of execution time and throughput,
the profiles look much better. For 512-byte or 1-byte tests the
runs complete a lot faster too.
Here is the perf report output for a 1k block size run; the first
item directly related to the block mapping to show up is
xfs_file_iomap_begin_delay at 0.75%. I'm a bit worried about
up_/down_read showing up so much, though. While we take the ilock
and iolock a lot, they should be mostly uncontended for such a
single-threaded write, so the overhead seems a bit worrisome.
(FYI, the tree this was tested on also has the mark_page_accessed
and pagefault_disable fixes applied)
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 7K of event 'cpu-clock'
# Event count (approx.): 1909250000
#
# Overhead Command Shared Object Symbol
# ........ ............ ................. .....................................
#
37.71% swapper [kernel.kallsyms] [k] native_safe_halt
9.85% kworker/u8:5 [kernel.kallsyms] [k] __copy_user_nocache
2.83% xfs_io [kernel.kallsyms] [k] copy_user_generic_string
2.33% xfs_io [kernel.kallsyms] [k] __memset
2.23% xfs_io [kernel.kallsyms] [k] __block_commit_write.isra.34
1.73% xfs_io [kernel.kallsyms] [k] down_write
1.64% xfs_io [kernel.kallsyms] [k] up_write
1.39% xfs_io [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.23% xfs_io [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath
1.18% xfs_io [kernel.kallsyms] [k] __mark_inode_dirty
1.18% xfs_io [kernel.kallsyms] [k] _raw_spin_lock
1.15% kworker/u8:5 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.13% xfs_io [kernel.kallsyms] [k] __block_write_begin_int
1.10% xfs_io [kernel.kallsyms] [k] mark_buffer_dirty
1.07% xfs_io [kernel.kallsyms] [k] __radix_tree_lookup
1.01% kworker/0:2 [kernel.kallsyms] [k] end_buffer_async_write
0.97% xfs_io [kernel.kallsyms] [k] unlock_page
0.92% kworker/0:2 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.89% xfs_io [kernel.kallsyms] [k] iov_iter_copy_from_user_atomic
0.84% xfs_io [kernel.kallsyms] [k] generic_write_end
0.80% xfs_io [kernel.kallsyms] [k] get_page_from_freelist
0.80% xfs_io [kernel.kallsyms] [k] xfs_perag_put
0.79% xfs_io [kernel.kallsyms] [k] __add_to_page_cache_locked
0.75% xfs_io libc-2.19.so [.] __libc_pwrite
0.72% xfs_io [kernel.kallsyms] [k] xfs_file_iomap_begin_delay.isra.5
0.71% xfs_io [kernel.kallsyms] [k] iomap_write_actor
0.67% xfs_io [kernel.kallsyms] [k] pagecache_get_page
0.64% xfs_io [kernel.kallsyms] [k] balance_dirty_pages_ratelimited
0.64% xfs_io [kernel.kallsyms] [k] vfs_write
0.63% kworker/u8:5 [kernel.kallsyms] [k] clear_page_dirty_for_io
0.62% xfs_io [kernel.kallsyms] [k] xfs_file_write_iter
0.60% xfs_io [kernel.kallsyms] [k] __vfs_write
0.55% xfs_io [kernel.kallsyms] [k] page_waitqueue
0.54% xfs_io [kernel.kallsyms] [k] xfs_perag_get
0.52% xfs_io [kernel.kallsyms] [k] __wake_up_bit
0.52% xfs_io [kernel.kallsyms] [k] radix_tree_tag_set
0.50% kworker/u8:5 [kernel.kallsyms] [k] xfs_do_writepage
0.50% xfs_io [kernel.kallsyms] [k] iov_iter_advance
0.47% xfs_io [kernel.kallsyms] [k] kmem_cache_alloc
0.46% xfs_io [kernel.kallsyms] [k] xfs_file_buffered_aio_write
0.46% xfs_io [kernel.kallsyms] [k] xfs_iunlock
0.45% kworker/u8:5 [kernel.kallsyms] [k] __wake_up_bit
0.45% xfs_io [kernel.kallsyms] [k] find_get_entry
0.45% xfs_io [kernel.kallsyms] [k] xfs_bmap_search_multi_extents
0.41% xfs_io [kernel.kallsyms] [k] xfs_iext_bno_to_ext
0.39% xfs_io [kernel.kallsyms] [k] iomap_apply
0.39% xfs_io [kernel.kallsyms] [k] xfs_file_aio_write_checks
0.38% xfs_io [kernel.kallsyms] [k] xfs_ilock
0.38% xfs_io [kernel.kallsyms] [k] xfs_inode_set_eofblocks_tag
0.37% xfs_io [kernel.kallsyms] [k] xfs_bmap_search_extents
0.30% xfs_io [kernel.kallsyms] [k] file_update_time
0.29% xfs_io [kernel.kallsyms] [k] __fget_light
0.27% xfs_io [kernel.kallsyms] [k] rw_verify_area
0.26% kworker/u8:5 [kernel.kallsyms] [k] unlock_page
0.26% xfs_io xfs_io [.] pwrite_f
0.25% xfs_io [kernel.kallsyms] [k] iomap_file_buffered_write
0.25% xfs_io [kernel.kallsyms] [k] node_dirty_ok
0.25% xfs_io [kernel.kallsyms] [k] xfs_bmbt_to_iomap
0.24% kworker/0:2 [kernel.kallsyms] [k] xfs_destroy_ioend
0.24% xfs_io [kernel.kallsyms] [k] __xfs_bmbt_get_all
0.24% xfs_io [kernel.kallsyms] [k] iov_iter_fault_in_readable
0.22% kworker/u8:5 [kernel.kallsyms] [k] xfs_start_buffer_writeback
0.22% xfs_io [kernel.kallsyms] [k] fsnotify
0.22% xfs_io [kernel.kallsyms] [k] sys_pwrite64
0.22% xfs_io [kernel.kallsyms] [k] xfs_file_iomap_begin
0.21% kworker/u8:5 [kernel.kallsyms] [k] pmem_do_bvec
0.21% kworker/u8:5 [kernel.kallsyms] [k] xfs_map_at_offset
0.18% kworker/u8:5 [kernel.kallsyms] [k] write_cache_pages
0.18% kworker/u8:5 [kernel.kallsyms] [k] xfs_map_buffer
0.17% kworker/u8:5 [kernel.kallsyms] [k] __test_set_page_writeback
0.17% xfs_io [kernel.kallsyms] [k] __alloc_pages_nodemask
0.17% xfs_io [kernel.kallsyms] [k] __fsnotify_parent
0.17% xfs_io [kernel.kallsyms] [k] block_write_end
0.17% xfs_io [kernel.kallsyms] [k] iomap_write_begin
0.17% xfs_io [kernel.kallsyms] [k] iov_iter_init
0.17% xfs_io [kernel.kallsyms] [k] percpu_up_read
0.17% xfs_io [kernel.kallsyms] [k] radix_tree_lookup_slot
0.16% xfs_io [kernel.kallsyms] [k] create_empty_buffers
0.16% xfs_io [kernel.kallsyms] [k] timespec_trunc
0.16% xfs_io [kernel.kallsyms] [k] wait_for_stable_page
0.16% xfs_io [kernel.kallsyms] [k] xfs_get_extsz_hint
0.14% kworker/0:2 [kernel.kallsyms] [k] test_clear_page_writeback
0.14% kworker/u8:5 [kernel.kallsyms] [k] release_pages
0.13% xfs_io [kernel.kallsyms] [k] iomap_write_end
0.13% xfs_io [kernel.kallsyms] [k] xfs_bmbt_get_startoff
0.12% kworker/u8:5 [kernel.kallsyms] [k] dec_zone_page_state
0.12% xfs_io [kernel.kallsyms] [k] alloc_page_buffers
0.12% xfs_io [kernel.kallsyms] [k] generic_write_checks
0.12% xfs_io [kernel.kallsyms] [k] percpu_down_read
0.12% xfs_io [kernel.kallsyms] [k] release_pages
0.12% xfs_io [kernel.kallsyms] [k] set_bh_page
0.12% xfs_io [kernel.kallsyms] [k] xfs_find_bdev_for_inode
0.12% xfs_io xfs_io [.] do_pwrite
0.10% kworker/u8:5 [kernel.kallsyms] [k] mark_buffer_async_write
0.10% kworker/u8:5 [kernel.kallsyms] [k] page_waitqueue
0.10% xfs_io [kernel.kallsyms] [k] PageHuge
0.10% xfs_io [kernel.kallsyms] [k] add_to_page_cache_lru
0.09% kworker/0:2 [kernel.kallsyms] [k] end_page_writeback
0.09% kworker/u8:5 [kernel.kallsyms] [k] find_get_pages_tag
0.09% kworker/u8:5 [kernel.kallsyms] [k] xfs_start_page_writeback
0.09% xfs_io [kernel.kallsyms] [k] create_page_buffers
0.09% xfs_io [kernel.kallsyms] [k] page_mapping
0.09% xfs_io [kernel.kallsyms] [k] xfs_bmbt_get_all
0.09% xfs_io [kernel.kallsyms] [k] xfs_file_iomap_end
0.08% kworker/u8:5 [kernel.kallsyms] [k] inc_node_page_state
0.08% kworker/u8:5 [kernel.kallsyms] [k] inc_zone_page_state
0.08% kworker/u8:5 [kernel.kallsyms] [k] page_mapping
0.08% kworker/u8:5 [kernel.kallsyms] [k] page_mkclean
0.08% xfs_io [kernel.kallsyms] [k] __sb_start_write
0.08% xfs_io [kernel.kallsyms] [k] current_kernel_time64
0.07% kworker/0:2 [kernel.kallsyms] [k] __wake_up_bit
0.07% kworker/u8:5 [kernel.kallsyms] [k] page_mapped
0.07% xfs_io [kernel.kallsyms] [k] __lru_cache_add
0.07% xfs_io [kernel.kallsyms] [k] current_fs_time
0.07% xfs_io [kernel.kallsyms] [k] grab_cache_page_write_begin
0.07% xfs_io [kernel.kallsyms] [k] xfs_iext_get_ext
0.05% kworker/u8:5 [kernel.kallsyms] [k] pmem_make_request
0.05% kworker/u8:5 [kernel.kallsyms] [k] xfs_add_to_ioend
0.05% xfs_io [kernel.kallsyms] [k] __fdget
0.05% xfs_io [kernel.kallsyms] [k] __find_get_block_slow
0.05% xfs_io [kernel.kallsyms] [k] __set_page_dirty
0.05% xfs_io [kernel.kallsyms] [k] alloc_buffer_head
0.05% xfs_io [kernel.kallsyms] [k] radix_tree_lookup
0.05% xfs_io [kernel.kallsyms] [k] radix_tree_tagged
0.05% xfs_io [kernel.kallsyms] [k] xfs_bmbt_get_blockcount
0.05% xfs_io [kernel.kallsyms] [k] xfs_fsb_to_db
0.04% kworker/0:2 [kernel.kallsyms] [k] dec_zone_page_state
0.04% kworker/u8:5 [kernel.kallsyms] [k] dec_node_page_state
0.04% xfs_io [kernel.kallsyms] [k] __radix_tree_preload
0.04% xfs_io [kernel.kallsyms] [k] __sb_end_write
0.03% kworker/0:2 [kernel.kallsyms] [k] dec_node_page_state
0.03% kworker/0:2 [kernel.kallsyms] [k] inc_node_page_state
0.03% kworker/0:2 [kernel.kallsyms] [k] page_mapping
0.03% kworker/u8:5 [kernel.kallsyms] [k] radix_tree_next_chunk
0.03% kworker/u8:5 [kernel.kallsyms] [k] xfs_fsb_to_db
0.03% xfs_io [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.03% xfs_io [kernel.kallsyms] [k] cache_alloc_refill
0.03% xfs_io [kernel.kallsyms] [k] lru_cache_add
0.01% kworker/0:2 [kernel.kallsyms] [k] cache_reap
0.01% kworker/0:2 [kernel.kallsyms] [k] mempool_free
0.01% kworker/0:2 [kernel.kallsyms] [k] page_waitqueue
0.01% kworker/u8:5 [kernel.kallsyms] [k] bio_add_page
0.01% kworker/u8:5 [kernel.kallsyms] [k] kmem_cache_alloc
0.01% kworker/u8:5 [kernel.kallsyms] [k] lru_add_drain_cpu
0.01% kworker/u8:5 [kernel.kallsyms] [k] mempool_alloc
0.01% kworker/u8:5 [kernel.kallsyms] [k] pagevec_lookup_tag
0.01% kworker/u8:5 [kernel.kallsyms] [k] queue_delayed_work_on
0.01% kworker/u8:5 [kernel.kallsyms] [k] queue_work_on
0.01% kworker/u8:5 [kernel.kallsyms] [k] run_timer_softirq
0.01% kworker/u8:5 [kernel.kallsyms] [k] update_group_capacity
0.01% kworker/u8:5 [kernel.kallsyms] [k] xfs_trans_reserve
0.01% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.01% xfs_io [kernel.kallsyms] [k] _cond_resched
0.01% xfs_io [kernel.kallsyms] [k] mod_zone_page_state
0.01% xfs_io [kernel.kallsyms] [k] pagevec_lru_move_fn
0.01% xfs_io [kernel.kallsyms] [k] radix_tree_maybe_preload
0.01% xfs_io [kernel.kallsyms] [k] unmap_underlying_metadata
0.01% xfs_io [kernel.kallsyms] [k] xfs_bmap_worst_indlen
0.01% xfs_io ld-2.19.so [.] 0x000000000000d866
0.01% xfs_io libc-2.19.so [.] 0x000000000008a8da
0.01% xfs_io xfs_io [.] pwrite64@plt
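Several of the symbols in the profile (iomap_apply, iomap_write_actor, xfs_file_iomap_begin_delay) belong to the iomap-based write loop: the filesystem maps a range, a generic actor consumes up to the mapped length, and the loop repeats until the request is done. The sketch below is a toy model of that control flow; the function names, signatures, and return conventions are simplifications of mine, not the kernel API.

```python
def iomap_apply(offset, length, begin, actor, end=None):
    """Toy model of the iomap apply loop driving the buffered write path.

    begin(pos, remaining) -> (mapped_length, mapping): stands in for the
    filesystem's iomap_begin (e.g. xfs_file_iomap_begin_delay), mapping at
    most `remaining` bytes starting at `pos`.
    actor(pos, length, mapping) -> bytes processed (may be short).
    end(pos, written, mapping): optional cleanup hook per mapping.
    """
    done = 0
    while done < length:
        pos = offset + done
        mapped, iomap = begin(pos, length - done)
        if mapped <= 0:
            break
        # The actor may process less than the mapped range (short copy).
        written = actor(pos, min(mapped, length - done), iomap)
        if end:
            end(pos, written, iomap)
        if written <= 0:
            break
        done += written
    return done
```

The point of the design is that one `begin` call can cover many pages, so the per-page block lookup of the old get_blocks path is amortized over the whole mapping.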
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 21:48 ` Christoph Hellwig
@ 2016-08-13 22:07 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-13 22:07 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
Hi Christoph,
On Sat, Aug 13, 2016 at 11:48:25PM +0200, Christoph Hellwig wrote:
>On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
>> Below is a patch I hacked up this morning to do just that. It passes
>> xfstests, but I've not done any real benchmarking with it. If the
>> reduced lookup overhead in it doesn't help enough we'll need some
>> sort of look-aside cache for the information, but I hope that we
>> can avoid that. And yes, it's a rather large patch - but the old
>> path was so entangled that I couldn't come up with something lighter.
>
>Hi Fengguang or Xiaolong,
>
>any chance to add this thread to an lkp run?
Sure. To which base should I apply it? Or if you already pushed the
git tree, I'll test your commit directly.
Thanks,
Fengguang
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 22:07 ` Fengguang Wu
@ 2016-08-13 22:15 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-13 22:15 UTC (permalink / raw)
To: Fengguang Wu
Cc: Christoph Hellwig, Dave Chinner, Ye Xiaolong, Linus Torvalds,
LKML, Bob Peterson, LKP
Hi Fengguang,
feel free to try this git tree:
git://git.infradead.org/users/hch/vfs.git iomap-fixes
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 22:15 ` Christoph Hellwig
@ 2016-08-13 22:51 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-13 22:51 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
Hi Christoph,
On Sun, Aug 14, 2016 at 12:15:08AM +0200, Christoph Hellwig wrote:
>Hi Fengguang,
>
>feel free to try this git tree:
>
> git://git.infradead.org/users/hch/vfs.git iomap-fixes
I just queued some test jobs for it.
% queue -q vip -t ivb44 -b hch-vfs/iomap-fixes aim7-fs-1brd.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
That job file can be found here:
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/jobs/aim7-fs-1brd.yaml
It specifies a matrix of the below atom tests:
wfg /c/lkp-tests% split-job jobs/aim7-fs-1brd.yaml -s 'fs: xfs'
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_src-3000-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rr-3000-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rw-3000-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_cp-3000-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_wrt-3000-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-sync_disk_rw-600-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-creat-clo-1500-performance.yaml
jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rd-9000-performance.yaml
If you see other suitable tests for this patch, feel free to drop me a
hint. I've queued these jobs to the other machines to make them run in
parallel.
% queue -q vip -t ivb43 -b hch-vfs/iomap-fixes fsmark-stress-journal-1hdd.yaml fsmark-stress-journal-1brd.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
% queue -q vip -t ivb44 -b hch-vfs/iomap-fixes fsmark-generic-1brd.yaml dd-write-1hdd.yaml fsmark-generic-1hdd.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
% queue -q vip -t lkp-hsx02 -b hch-vfs/iomap-fixes fsmark-generic-brd-raid.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
% queue -q vip -t lkp-hsw-ep4 -b hch-vfs/iomap-fixes fsmark-1ssd-nvme-small.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 0:30 ` Christoph Hellwig
@ 2016-08-13 23:32 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-13 23:32 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP
On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
> > Which says "no change". Oh well, back to the drawing board...
>
> I don't see how it would change thing much - for all relevant calculations
> we convert to block units first anyway.
There was definitely an off-by-one in the code, which meant that for
1-byte writes it never triggered speculative prealloc, so it was
doing the past-EOF real block check for every write. With it also
passing less than a block size, when the > XFS_ISIZE check passed,
3 out of every 4 want_preallocate checks were landing on an already
allocated block, too, so it was doing 3x as many lookups as needed
for 1k writes on a 4k block size filesystem. Amongst other things...
> But the whole xfs_iomap_write_delay is a giant mess anyway. For a usual
> call we do at least four lookups in the extent btree, which seems rather
> costly. Especially given that the low-level xfs_bmap_search_extents
> interface would give us all required information in one single call.
I noticed, though I was looking for a smaller, targeted fix rather
than rewriting the whole thing. Don't get me wrong, I think it needs
a rewrite to be efficient for the iomap infrastructure; I just didn't
want to do that as a regression fix if a 1-liner might be
sufficient...
> Below is a patch I hacked up this morning to do just that. It passes
> xfstests, but I've not done any real benchmarking with it. If the
> reduced lookup overhead in it doesn't help enough we'll need to some
> sort of look aside cache for the information, but I hope that we
> can avoid that. And yes, it's a rather large patch - but the old
> path was so entangled that I couldn't come up with something lighter.
I'll run some tests on it. If it does solve the regression, I'm
going to hold it back until we get a decent amount of review and
test coverage on it, though...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 18:03 ` Linus Torvalds
@ 2016-08-13 23:58 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-13 23:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Tejun Heo, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
Hi Linus,
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
>On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner <david@fromorbit.com> wrote:
>> On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
>>>
>>> I don't recall having ever seen the mapping tree_lock as a contention
>>> point before, but it's not like I've tried that load either. So it
>>> might be a regression (going back long, I suspect), or just an unusual
>>> load that nobody has traditionally tested much.
>>>
>>> Single-threaded big file write one page at a time, was it?
>>
>> Yup. On a 4 node NUMA system.
>
>Ok, I can't see any real contention on my single-node workstation
>(running ext4 too, so there may be filesystem differences), but I
>guess that shouldn't surprise me. The cacheline bouncing just isn't
>expensive enough when it all stays on-die.
>
>I can see the tree_lock in my profiles (just not very high), and at
>least for ext4 the main caller ssems to be
>__set_page_dirty_nobuffers().
>
>And yes, looking at that, the biggest cost by _far_ inside the
>spinlock seems to be the accounting.
>
>Which doesn't even have to be inside the mapping lock, as far as I can
>tell, and as far as comments go.
>
>So a stupid patch to just move the dirty page accounting to outside
>the spinlock might help a lot.
>
>Does this attached patch help your contention numbers?
>
>Adding a few people who get blamed for account_page_dirtied() and
>inode_attach_wb() just to make sure that nobody expected the
>mapping_lock spinlock to be held when calling account_page_dirtied().
>
>I realize that this has nothing to do with the AIM7 regression (the
>spinlock just isn't high enough in that profile), but your contention
>numbers just aren't right, and updating accounting statistics inside a
>critical spinlock when not needed is just wrong.
I'm testing this patch on top of 9909170065 ("Merge tag 'nfs-for-4.8-2'
of git://git.linux-nfs.org/projects/trondmy/linux-nfs").
The BRD (RAM-backed block device, drivers/block/brd.c) tests enable
pretty fast IO, and the fsmark-generic-brd-raid.yaml job on lkp-hsx02 will
simulate 8 RAID disks on a 4-node NUMA machine.
queue -q vip -t ivb44 -b wfg/account_page_dirtied-linus aim7-fs-1brd.yaml -R3 -k 1b5f2eb4a752e1fa7102f37545f92e64fabd0cf8 -k 99091700659f4df965e138b38b4fa26a29b7eade
queue -q vip -t ivb43 -b wfg/account_page_dirtied-linus fsmark-stress-journal-1hdd.yaml fsmark-stress-journal-1brd.yaml -R3 -k 1b5f2eb4a752e1fa7102f37545f92e64fabd0cf8 -k 99091700659f4df965e138b38b4fa26a29b7eade
queue -q vip -t ivb44 -b wfg/account_page_dirtied-linus fsmark-generic-1brd.yaml dd-write-1hdd.yaml fsmark-generic-1hdd.yaml -R3 -k 1b5f2eb4a752e1fa7102f37545f92e64fabd0cf8 -k 99091700659f4df965e138b38b4fa26a29b7eade
queue -q vip -t lkp-hsx02 -b wfg/account_page_dirtied-linus fsmark-generic-brd-raid.yaml -R3 -k 1b5f2eb4a752e1fa7102f37545f92e64fabd0cf8 -k 99091700659f4df965e138b38b4fa26a29b7eade
queue -q vip -t lkp-hsw-ep4 -b wfg/account_page_dirtied-linus fsmark-1ssd-nvme-small.yaml -R3 -k 1b5f2eb4a752e1fa7102f37545f92e64fabd0cf8 -k 99091700659f4df965e138b38b4fa26a29b7eade
Thanks,
Fengguang
> fs/buffer.c | 5 ++++-
> fs/xfs/xfs_aops.c | 5 ++++-
> mm/page-writeback.c | 2 +-
> 3 files changed, 9 insertions(+), 3 deletions(-)
>
>diff --git a/fs/buffer.c b/fs/buffer.c
>index 9c8eb9b6db6a..f79a9d241589 100644
>--- a/fs/buffer.c
>+++ b/fs/buffer.c
>@@ -628,15 +628,18 @@ static void __set_page_dirty(struct page *page, struct address_space *mapping,
> int warn)
> {
> unsigned long flags;
>+ bool account = false;
>
> spin_lock_irqsave(&mapping->tree_lock, flags);
> if (page->mapping) { /* Race with truncate? */
> WARN_ON_ONCE(warn && !PageUptodate(page));
>- account_page_dirtied(page, mapping);
> radix_tree_tag_set(&mapping->page_tree,
> page_index(page), PAGECACHE_TAG_DIRTY);
>+ account = true;
> }
> spin_unlock_irqrestore(&mapping->tree_lock, flags);
>+ if (account)
>+ account_page_dirtied(page, mapping);
> }
>
> /*
>diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
>index 7575cfc3ad15..59169c36765e 100644
>--- a/fs/xfs/xfs_aops.c
>+++ b/fs/xfs/xfs_aops.c
>@@ -1490,15 +1490,18 @@ xfs_vm_set_page_dirty(
> if (newly_dirty) {
> /* sigh - __set_page_dirty() is static, so copy it here, too */
> unsigned long flags;
>+ bool account = false;
>
> spin_lock_irqsave(&mapping->tree_lock, flags);
> if (page->mapping) { /* Race with truncate? */
> WARN_ON_ONCE(!PageUptodate(page));
>- account_page_dirtied(page, mapping);
> radix_tree_tag_set(&mapping->page_tree,
> page_index(page), PAGECACHE_TAG_DIRTY);
>+ account = true;
> }
> spin_unlock_irqrestore(&mapping->tree_lock, flags);
>+ if (account)
>+ account_page_dirtied(page, mapping);
> }
> unlock_page_memcg(page);
> if (newly_dirty)
>diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>index f4cd7d8005c9..9a6a6b99acfe 100644
>--- a/mm/page-writeback.c
>+++ b/mm/page-writeback.c
>@@ -2517,10 +2517,10 @@ int __set_page_dirty_nobuffers(struct page *page)
> spin_lock_irqsave(&mapping->tree_lock, flags);
> BUG_ON(page_mapping(page) != mapping);
> WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
>- account_page_dirtied(page, mapping);
> radix_tree_tag_set(&mapping->page_tree, page_index(page),
> PAGECACHE_TAG_DIRTY);
> spin_unlock_irqrestore(&mapping->tree_lock, flags);
>+ account_page_dirtied(page, mapping);
> unlock_page_memcg(page);
>
> if (mapping->host) {
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-13 22:51 ` Fengguang Wu
@ 2016-08-14 14:50 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-14 14:50 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
Hi Christoph,
On Sun, Aug 14, 2016 at 06:51:28AM +0800, Fengguang Wu wrote:
>Hi Christoph,
>
>On Sun, Aug 14, 2016 at 12:15:08AM +0200, Christoph Hellwig wrote:
>>Hi Fengguang,
>>
>>feel free to try this git tree:
>>
>> git://git.infradead.org/users/hch/vfs.git iomap-fixes
>
>I just queued some test jobs for it.
>
>% queue -q vip -t ivb44 -b hch-vfs/iomap-fixes aim7-fs-1brd.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
>
>That job file can be found here:
>
> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/jobs/aim7-fs-1brd.yaml
>
>It specifies a matrix of the below atom tests:
>
> wfg /c/lkp-tests% split-job jobs/aim7-fs-1brd.yaml -s 'fs: xfs'
>
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_src-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rr-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rw-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_cp-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_wrt-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-sync_disk_rw-600-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-creat-clo-1500-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rd-9000-performance.yaml
I have some results now. Several of the finished aim7 tests show
performance regressions for commit fe9c2c81 ("xfs: rewrite and
optimize the delalloc write path") compared to its parent commit
ca2edab2e and their base mainline commit 990917006 ("Merge tag
'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs").
wfg@inn ~% compare -g aim7 -ai 99091700659f4df965e138b38b4fa26a29b7eade ca2edab2e1d8f30dda874b7f717c2d4664991e9b fe9c2c81ed073878768785a985295cbacc349e42
tests: 4
60 perf-index fe9c2c81ed073878768785a985295cbacc349e42
97 power-index fe9c2c81ed073878768785a985295cbacc349e42
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985 testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
270459 272267 ± 3% -48% 139834 ± 3% aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
473257 468546 5% 497512 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360578 -18% 296589 -60% 144974 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
358701 -6% 335712 -40% 216057 GEO-MEAN aim7.jobs-per-min
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
48.48 48.15 36% 65.85 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
89.50 89.76 88.75 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
35.78 23% 43.93 76% 63.09 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
53.75 7% 57.48 33% 71.71 GEO-MEAN turbostat.%Busy
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
1439 1431 36% 1964 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
2671 2674 2650 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
1057 23% 1303 78% 1883 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
1595 7% 1708 34% 2139 GEO-MEAN turbostat.Avg_MHz
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
167 167 6% 177 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
175 175 176 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
150 8% 162 19% 178 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
164 168 8% 177 GEO-MEAN turbostat.PkgWatt
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
10.27 10.43 -14% 8.79 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
6.85 6.66 6.88 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
9.96 14% 11.36 -7% 9.23 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
8.88 4% 9.24 -7% 8.23 GEO-MEAN turbostat.RAMWatt
Here are the detailed numbers for each test case. The perf-profile and
latency_stats numbers are now sorted by absolute change in each sub-category,
and all perf-profile numbers > 5 are shown.
It may be more pleasant to view the long trace.call.funcs lines with
vim's ":set nowrap" option.
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
360578 -18% 294351 -60% 144974 aim7.jobs-per-min
12835 458% 71658 480% 74445 aim7.time.involuntary_context_switches
755 50% 1136 373% 3570 aim7.time.system_time
155970 152810 73% 269438 aim7.time.minor_page_faults
50.15 22% 61.39 148% 124.39 aim7.time.elapsed_time
50.15 22% 61.39 148% 124.39 aim7.time.elapsed_time.max
438660 428601 -7% 407807 aim7.time.voluntary_context_switches
2452 2480 5% 2584 aim7.time.maximum_resident_set_size
34293 ± 4% 70% 58129 ± 19% 213% 107483 interrupts.CAL:Function_call_interrupts
79.70 ± 6% 16% 92.63 ± 6% 89% 150.33 uptime.boot
2890 ± 8% 6% 3077 ± 8% 15% 3329 uptime.idle
150186 ± 9% 41% 212090 122% 333727 softirqs.RCU
161166 9% 176318 16% 186527 softirqs.SCHED
648051 33% 864346 222% 2089349 softirqs.TIMER
50.15 22% 61.39 148% 124.39 time.elapsed_time
50.15 22% 61.39 148% 124.39 time.elapsed_time.max
12835 458% 71658 480% 74445 time.involuntary_context_switches
155970 152810 73% 269438 time.minor_page_faults
1563 21% 1898 85% 2895 time.percent_of_cpu_this_job_got
755 50% 1136 373% 3570 time.system_time
4564660 ± 4% 68% 7651587 79% 8159302 numa-numastat.node0.numa_foreign
3929898 81% 7129718 46% 5733813 numa-numastat.node0.numa_miss
0 2 ± 20% 2 numa-numastat.node1.other_node
4569811 ± 4% 68% 7654689 79% 8163206 numa-numastat.node1.numa_miss
3935075 81% 7132850 46% 5737410 numa-numastat.node1.numa_foreign
34767917 4% 36214694 11% 38627727 numa-numastat.node1.numa_hit
34767917 4% 36214691 11% 38627725 numa-numastat.node1.local_node
12377 ± 18% 3615% 459790 2848% 364868 vmstat.io.bo
119 -8% 110 ± 4% -16% 101 vmstat.memory.buff
18826454 -16% 15748045 -37% 11882562 vmstat.memory.free
16 25% 20 106% 33 vmstat.procs.r
19407 469% 110509 520% 120350 vmstat.system.cs
48215 10% 52977 3% 49819 vmstat.system.in
142459 -11% 126667 -23% 109481 cpuidle.C1-IVT.usage
29494441 ± 3% -18% 24206809 -36% 18889149 cpuidle.C1-IVT.time
5736732 28% 7315830 525% 35868316 cpuidle.C1E-IVT.time
51148 9% 55743 98% 101021 cpuidle.C1E-IVT.usage
18347890 27% 23243942 21% 22154105 cpuidle.C3-IVT.time
96127 9% 104487 -29% 68552 cpuidle.C3-IVT.usage
1.525e+09 6% 1.617e+09 41% 2.147e+09 cpuidle.C6-IVT.time
1805218 11% 1998052 33% 2397285 cpuidle.C6-IVT.usage
286 ± 11% 14% 328 ± 7% 389% 1402 cpuidle.POLL.usage
1013526 ± 41% 98% 2003264 ± 20% 272% 3774675 cpuidle.POLL.time
35.78 24% 44.22 76% 63.09 turbostat.%Busy
1057 24% 1312 78% 1883 turbostat.Avg_MHz
34.80 -3% 33.63 -22% 27.18 turbostat.CPU%c1
0.34 -5% 0.33 -77% 0.08 turbostat.CPU%c3
29.07 -25% 21.82 -67% 9.65 turbostat.CPU%c6
118 11% 130 23% 145 turbostat.CorWatt
9.39 ± 13% -19% 7.61 ± 6% -61% 3.67 turbostat.Pkg%pc2
3.04 ± 33% -49% 1.55 ± 14% -76% 0.72 turbostat.Pkg%pc6
150 9% 164 19% 178 turbostat.PkgWatt
9.96 14% 11.34 -7% 9.23 turbostat.RAMWatt
18232 ± 8% -8% 16747 ± 10% 11% 20267 meminfo.AnonHugePages
80723 78330 -24% 61572 meminfo.CmaFree
4690642 ± 10% -15% 3981312 -15% 3983392 meminfo.DirectMap2M
1060897 -21% 834807 -22% 828755 meminfo.Dirty
2362330 26% 2983603 44% 3391287 meminfo.Inactive
2353250 26% 2974520 44% 3382139 meminfo.Inactive(file)
19388991 -18% 15966408 -38% 12038822 meminfo.MemFree
1186231 4% 1236627 13% 1341728 meminfo.SReclaimable
179570 3% 185696 14% 204382 meminfo.SUnreclaim
1365802 4% 1422323 13% 1546111 meminfo.Slab
318863 10% 352026 16% 368386 meminfo.Unevictable
0.00 0.00 9.15 perf-profile.cycles-pp.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 0.00 8.90 perf-profile.cycles-pp.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 0.00 8.61 perf-profile.cycles-pp._raw_spin_lock.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply
0.00 0.00 8.50 perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin
6.05 -11% 5.42 ± 4% -15% 5.14 perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
6.54 -11% 5.80 ± 4% -16% 5.51 perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
16.78 -9% 15.34 ± 9% -11% 14.90 perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
16.51 ± 3% -9% 14.99 ± 9% -12% 14.49 perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
0.23 ± 23% 20% 0.28 ± 12% 3683% 8.70 perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath
4.369e+11 ± 4% 20% 5.239e+11 97% 8.601e+11 perf-stat.branch-instructions
0.38 5% 0.40 -27% 0.28 perf-stat.branch-miss-rate
1.678e+09 ± 3% 26% 2.117e+09 44% 2.413e+09 perf-stat.branch-misses
42.30 -7% 39.31 -5% 40.38 perf-stat.cache-miss-rate
6.874e+09 ± 4% 19% 8.21e+09 51% 1.041e+10 perf-stat.cache-misses
1.625e+10 ± 3% 29% 2.089e+10 59% 2.578e+10 perf-stat.cache-references
1017846 588% 7005227 1401% 15273586 perf-stat.context-switches
2.757e+12 ± 4% 48% 4.092e+12 318% 1.151e+13 perf-stat.cpu-cycles
177918 15% 204776 35% 241051 perf-stat.cpu-migrations
0.37 ± 14% 60% 0.60 ± 3% 45% 0.54 perf-stat.dTLB-load-miss-rate
2.413e+09 ± 14% 97% 4.757e+09 ± 4% 149% 6.001e+09 perf-stat.dTLB-load-misses
6.438e+11 23% 7.893e+11 71% 1.103e+12 perf-stat.dTLB-loads
0.06 ± 38% 100% 0.11 ± 6% 207% 0.17 perf-stat.dTLB-store-miss-rate
2.656e+08 ± 34% 123% 5.91e+08 ± 7% 203% 8.038e+08 perf-stat.dTLB-store-misses
45.99 ± 5% 8% 49.56 ± 11% 14% 52.61 perf-stat.iTLB-load-miss-rate
45151945 45832755 72% 77697494 perf-stat.iTLB-load-misses
53205262 ± 7% -10% 47792612 ± 21% 32% 69997751 perf-stat.iTLB-loads
2.457e+12 ± 4% 16% 2.851e+12 66% 4.084e+12 perf-stat.instructions
0.89 -22% 0.70 -60% 0.35 perf-stat.ipc
286640 8% 310690 99% 571225 perf-stat.minor-faults
29.16 7% 31.25 8% 31.42 perf-stat.node-load-miss-rate
4.86e+08 ± 3% 123% 1.084e+09 250% 1.7e+09 perf-stat.node-load-misses
1.18e+09 102% 2.385e+09 214% 3.711e+09 perf-stat.node-loads
21.51 30% 27.95 62% 34.86 perf-stat.node-store-miss-rate
1.262e+09 58% 1.989e+09 177% 3.499e+09 perf-stat.node-store-misses
4.606e+09 11% 5.126e+09 42% 6.539e+09 perf-stat.node-stores
286617 8% 310730 99% 571253 perf-stat.page-faults
1166432 23% 1429828 42% 1653754 numa-meminfo.node0.Inactive(file)
1175123 22% 1434274 41% 1662351 numa-meminfo.node0.Inactive
513534 -23% 394773 -24% 392567 numa-meminfo.node0.Dirty
9717968 -17% 8082393 -37% 6159862 numa-meminfo.node0.MemFree
159470 11% 176717 16% 184229 numa-meminfo.node0.Unevictable
23148226 7% 24783802 15% 26706333 numa-meminfo.node0.MemUsed
103531 ± 32% -10% 93669 ± 40% 40% 144469 numa-meminfo.node0.SUnreclaim
1187035 30% 1549075 46% 1727751 numa-meminfo.node1.Inactive
1186646 30% 1544438 46% 1727201 numa-meminfo.node1.Inactive(file)
21000905 3% 21647702 13% 23741428 numa-meminfo.node1.Active(file)
21083707 3% 21748741 13% 23822391 numa-meminfo.node1.Active
547021 -20% 438525 -21% 433706 numa-meminfo.node1.Dirty
9663240 -19% 7870896 -39% 5869977 numa-meminfo.node1.MemFree
561241 12% 625903 21% 679671 numa-meminfo.node1.SReclaimable
637259 ± 4% 13% 717863 ± 5% 16% 739482 numa-meminfo.node1.Slab
23329350 8% 25121687 16% 27122606 numa-meminfo.node1.MemUsed
159394 10% 175315 16% 184159 numa-meminfo.node1.Unevictable
521615 33% 695562 267% 1916159 latency_stats.avg.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
500644 33% 667614 261% 1805608 latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
8932 ± 46% -70% 2717 ± 4% -95% 464 latency_stats.avg.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
0 0 73327 latency_stats.hits.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
43 ± 37% 7923% 3503 ± 4% 31792% 13926 latency_stats.hits.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1422573 30% 1852368 ± 5% 228% 4672496 latency_stats.max.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
1423130 30% 1851873 ± 5% 228% 4661765 latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
627 ± 66% 3788% 24404 ± 17% 6254% 39883 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
3922 ± 18% 56% 6134 ± 29% 634% 28786 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
0 0 16665 latency_stats.max.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5.15e+10 25% 6.454e+10 220% 1.649e+11 latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
0 0 1.385e+08 latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
11666476 45% 16905624 755% 99756088 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
2216 ± 69% 80030% 1775681 ± 4% 3e+06% 67521154 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1601815 28% 2053992 288% 6213577 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1774397 20% 2120576 244% 6099374 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
628 ±141% 125% 1416 ± 5% 4e+05% 2677036 latency_stats.sum.xfs_iget.xfs_ialloc.xfs_dir_ialloc.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
6087 ± 92% 1277% 83839 ± 3% 11105% 682063 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
0 0 116108 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1212 ± 59% 1842% 23546 ± 7% 4861% 60149 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
1624 ± 22% 1356% 23637 ± 3% 1596% 27545 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
2068 ± 27% 834% 19319 ± 23% 1125% 25334 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
0 0 22155 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode
39 ± 71% 41280% 16414 ± 14% 51951% 20647 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
0 0 15600 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode
10 ±141% 6795% 689 ± 70% 1e+05% 10637 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode.__dentry_kill
99 ±112% 86% 185 ± 80% 9978% 10011 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_imap_to_bp.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
18232 ±134% -16% 15260 ± 54% -40% 10975 latency_stats.sum.xfs_lock_two_inodes.xfs_remove.xfs_vn_unlink.vfs_unlink.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
647 ± 3% -97% 21 ± 19% 34% 868 proc-vmstat.kswapd_high_wmark_hit_quickly
1091 -97% 36 ± 9% 29% 1411 proc-vmstat.kswapd_low_wmark_hit_quickly
265066 -21% 208142 -22% 206344 proc-vmstat.nr_dirty
20118 19574 -23% 15432 proc-vmstat.nr_free_cma
4844108 -18% 3988031 -38% 3008251 proc-vmstat.nr_free_pages
588262 26% 743537 44% 845765 proc-vmstat.nr_inactive_file
50 ± 25% 192% 148 ± 15% 103% 103 proc-vmstat.nr_pages_scanned
296623 4% 309201 13% 335474 proc-vmstat.nr_slab_reclaimable
44880 3% 46405 14% 51078 proc-vmstat.nr_slab_unreclaimable
79716 10% 88008 16% 92097 proc-vmstat.nr_unevictable
167 ± 9% 9e+06% 14513434 2e+06% 3569348 proc-vmstat.nr_vmscan_immediate_reclaim
162380 ± 18% 4392% 7294622 7024% 11567602 proc-vmstat.nr_written
588257 26% 743537 44% 845784 proc-vmstat.nr_zone_inactive_file
79716 10% 88008 16% 92097 proc-vmstat.nr_zone_unevictable
265092 -21% 208154 -22% 206388 proc-vmstat.nr_zone_write_pending
8507451 ± 3% 74% 14784261 64% 13918067 proc-vmstat.numa_foreign
10 ± 4% 10 ± 4% 6e+05% 57855 proc-vmstat.numa_hint_faults
8507451 ± 3% 74% 14784187 64% 13918067 proc-vmstat.numa_miss
72 72 3e+05% 213175 proc-vmstat.numa_pte_updates
1740 -97% 59 ± 12% 33% 2306 proc-vmstat.pageoutrun
5322372 1068% 62167111 1024% 59824114 proc-vmstat.pgactivate
2816355 27% 3575784 14% 3203214 proc-vmstat.pgalloc_dma32
74392338 11% 82333943 14% 84954110 proc-vmstat.pgalloc_normal
60958397 -18% 49976330 -26% 45055885 proc-vmstat.pgdeactivate
302790 9% 329088 94% 586116 proc-vmstat.pgfault
61061205 14% 69758545 18% 72000453 proc-vmstat.pgfree
655652 ± 18% 4352% 29190304 6967% 46338056 proc-vmstat.pgpgout
60965725 -18% 49983704 -26% 45063375 proc-vmstat.pgrefill
2 ± 17% 4e+07% 985929 ± 8% 7e+07% 1952629 proc-vmstat.pgrotated
82046 ± 36% 50634% 41625211 5397% 4510385 proc-vmstat.pgscan_direct
60128369 -37% 38068394 10% 66306637 proc-vmstat.pgscan_kswapd
2030 ± 46% 1e+06% 27038054 ± 3% 78642% 1598733 proc-vmstat.pgsteal_direct
0 2414551 ± 3% 3694833 proc-vmstat.workingset_activate
0 2414551 ± 3% 3694833 proc-vmstat.workingset_refault
26 ± 39% 1e+07% 2657286 3e+06% 658792 numa-vmstat.node0.nr_vmscan_immediate_reclaim
40449 ± 22% 3135% 1308601 ± 4% 4723% 1950670 numa-vmstat.node0.nr_written
291648 22% 357059 42% 413612 numa-vmstat.node0.nr_zone_inactive_file
291655 22% 357053 42% 413596 numa-vmstat.node0.nr_inactive_file
1542314 ± 5% 77% 2731911 98% 3056411 numa-vmstat.node0.numa_foreign
1366073 ± 4% 103% 2766780 ± 3% 68% 2293117 numa-vmstat.node0.numa_miss
128634 -23% 99104 -24% 98062 numa-vmstat.node0.nr_dirty
128663 -23% 99130 -24% 98051 numa-vmstat.node0.nr_zone_write_pending
2424918 -16% 2033425 -37% 1537826 numa-vmstat.node0.nr_free_pages
14037168 10% 15473174 20% 16883787 numa-vmstat.node0.numa_local
14037172 10% 15473174 20% 16883790 numa-vmstat.node0.numa_hit
39867 10% 44022 16% 46058 numa-vmstat.node0.nr_zone_unevictable
39867 10% 44022 16% 46058 numa-vmstat.node0.nr_unevictable
25871 ± 32% -9% 23414 ± 40% 40% 36094 numa-vmstat.node0.nr_slab_unreclaimable
14851187 6% 15749527 11% 16497187 numa-vmstat.node0.nr_dirtied
0 1225299 ± 4% 2008478 numa-vmstat.node1.workingset_refault
0 1225299 ± 4% 2008478 numa-vmstat.node1.workingset_activate
23 ± 35% 1e+07% 2974198 ± 3% 3e+06% 683002 numa-vmstat.node1.nr_vmscan_immediate_reclaim
40769 ± 26% 3264% 1371611 ± 3% 5569% 2311374 numa-vmstat.node1.nr_written
25 ± 8% 216% 81 ± 3% 356% 117 numa-vmstat.node1.nr_pages_scanned
296681 30% 385708 45% 431591 numa-vmstat.node1.nr_zone_inactive_file
296681 30% 385709 45% 431591 numa-vmstat.node1.nr_inactive_file
5252547 5401234 13% 5936151 numa-vmstat.node1.nr_zone_active_file
5252547 5401238 13% 5936151 numa-vmstat.node1.nr_active_file
136060 -19% 110021 -21% 107114 numa-vmstat.node1.nr_zone_write_pending
136060 -19% 110019 -21% 107107 numa-vmstat.node1.nr_dirty
1520682 ± 3% 76% 2681012 98% 3008493 numa-vmstat.node1.numa_miss
2413468 -18% 1980184 -39% 1466738 numa-vmstat.node1.nr_free_pages
1344474 ± 3% 102% 2715690 ± 4% 67% 2245159 numa-vmstat.node1.numa_foreign
20160 19698 -22% 15673 numa-vmstat.node1.nr_free_cma
14350439 12% 16005551 27% 18257157 numa-vmstat.node1.numa_local
14350440 12% 16005552 27% 18257158 numa-vmstat.node1.numa_hit
15381788 9% 16829619 21% 18645441 numa-vmstat.node1.nr_dirtied
140354 11% 156202 21% 169950 numa-vmstat.node1.nr_slab_reclaimable
39848 10% 43676 16% 46041 numa-vmstat.node1.nr_zone_unevictable
39848 10% 43676 16% 46041 numa-vmstat.node1.nr_unevictable
377 ± 9% 370 ± 5% 24% 468 slabinfo.bdev_cache.active_objs
377 ± 9% 370 ± 5% 24% 468 slabinfo.bdev_cache.num_objs
389 ± 13% 604% 2737 ± 23% 3371% 13501 slabinfo.bio-1.active_objs
389 ± 13% 612% 2770 ± 24% 3441% 13774 slabinfo.bio-1.num_objs
7 ± 17% 1039% 83 ± 24% 3623% 273 slabinfo.bio-1.active_slabs
7 ± 17% 1039% 83 ± 24% 3623% 273 slabinfo.bio-1.num_slabs
978 ± 4% 10% 1075 17% 1144 slabinfo.blkdev_requests.active_objs
978 ± 4% 10% 1075 17% 1144 slabinfo.blkdev_requests.num_objs
10942119 3% 11286505 13% 12389701 slabinfo.buffer_head.num_objs
280566 3% 289397 13% 317684 slabinfo.buffer_head.active_slabs
280566 3% 289397 13% 317684 slabinfo.buffer_head.num_slabs
10941627 10693692 11% 12140372 slabinfo.buffer_head.active_objs
7436 ± 3% 7558 20% 8922 slabinfo.cred_jar.active_objs
7436 ± 3% 7558 20% 8922 slabinfo.cred_jar.num_objs
4734 85% 8767 ± 8% 60% 7554 slabinfo.kmalloc-128.num_objs
4734 78% 8418 ± 8% 45% 6848 slabinfo.kmalloc-128.active_objs
17074 -11% 15121 -10% 15379 slabinfo.kmalloc-256.num_objs
3105 4% 3216 14% 3527 slabinfo.kmalloc-4096.num_objs
3061 4% 3170 12% 3419 slabinfo.kmalloc-4096.active_objs
13131 ± 3% 17% 15379 12% 14714 slabinfo.kmalloc-512.num_objs
1623 ± 3% 1664 ± 3% 16% 1889 slabinfo.mnt_cache.active_objs
1623 ± 3% 1664 ± 3% 16% 1889 slabinfo.mnt_cache.num_objs
2670 6% 2821 19% 3178 slabinfo.nsproxy.active_objs
2670 6% 2821 19% 3178 slabinfo.nsproxy.num_objs
2532 5% 2656 17% 2959 slabinfo.posix_timers_cache.active_objs
2532 5% 2656 17% 2959 slabinfo.posix_timers_cache.num_objs
20689 87% 38595 ± 13% 47% 30452 slabinfo.radix_tree_node.active_objs
399 83% 730 ± 13% 47% 587 slabinfo.radix_tree_node.active_slabs
399 83% 730 ± 13% 47% 587 slabinfo.radix_tree_node.num_slabs
22379 83% 40931 ± 13% 47% 32872 slabinfo.radix_tree_node.num_objs
4688 4706 22% 5712 slabinfo.sigqueue.active_objs
4688 4706 22% 5712 slabinfo.sigqueue.num_objs
979 ± 4% 7% 1046 ± 3% -15% 833 slabinfo.task_group.active_objs
979 ± 4% 7% 1046 ± 3% -15% 833 slabinfo.task_group.num_objs
1344 5% 1410 17% 1570 slabinfo.xfs_btree_cur.active_objs
1344 5% 1410 17% 1570 slabinfo.xfs_btree_cur.num_objs
2500 5% 2632 18% 2946 slabinfo.xfs_da_state.active_objs
2500 5% 2632 18% 2946 slabinfo.xfs_da_state.num_objs
1299 279% 4917 ± 17% 134% 3035 slabinfo.xfs_efd_item.num_objs
1299 278% 4911 ± 17% 126% 2940 slabinfo.xfs_efd_item.active_objs
1904 ± 3% 4% 1982 42% 2703 slabinfo.xfs_inode.num_objs
1904 ± 3% 4% 1982 39% 2644 slabinfo.xfs_inode.active_objs
1659 113% 3538 ± 27% 1360% 24227 slabinfo.xfs_log_ticket.active_objs
1659 116% 3588 ± 27% 1369% 24383 slabinfo.xfs_log_ticket.num_objs
37 169% 99 ± 29% 1405% 557 slabinfo.xfs_log_ticket.active_slabs
37 169% 99 ± 29% 1405% 557 slabinfo.xfs_log_ticket.num_slabs
2615 84% 4821 ± 28% 1549% 43132 slabinfo.xfs_trans.active_objs
2615 86% 4860 ± 28% 1551% 43171 slabinfo.xfs_trans.num_objs
37 162% 97 ± 30% 1614% 634 slabinfo.xfs_trans.active_slabs
37 162% 97 ± 30% 1614% 634 slabinfo.xfs_trans.num_slabs
3255 ± 12% 9210% 303094 38966% 1271810 sched_debug.cfs_rq:/.min_vruntime.avg
8273 ± 10% 382% 39836 ± 17% 309% 33806 sched_debug.cfs_rq:/.load.avg
716 ± 34% 28783% 206899 1e+05% 1034000 sched_debug.cfs_rq:/.min_vruntime.min
1830 ± 5% 4365% 81731 10579% 195502 sched_debug.cfs_rq:/.min_vruntime.stddev
1845 ± 4% 4330% 81754 10503% 195683 sched_debug.cfs_rq:/.spread0.stddev
73578 ± 34% 1043% 841209 ± 34% 452% 405848 sched_debug.cfs_rq:/.load.max
12.67 ± 35% 3999% 519.25 1979% 263.33 sched_debug.cfs_rq:/.runnable_load_avg.max
2.34 ± 33% 4268% 102.01 1854% 45.63 sched_debug.cfs_rq:/.runnable_load_avg.stddev
10284 ± 12% 4107% 432665 ± 7% 15350% 1588973 sched_debug.cfs_rq:/.min_vruntime.max
1.05 ± 20% 2335% 25.54 1631% 18.15 sched_debug.cfs_rq:/.runnable_load_avg.avg
44.06 ± 28% 254% 155.90 ± 16% 310% 180.49 sched_debug.cfs_rq:/.util_avg.stddev
15448 ± 19% 831% 143829 ± 22% 422% 80585 sched_debug.cfs_rq:/.load.stddev
597 ± 13% -39% 367 ± 17% -49% 303 sched_debug.cfs_rq:/.util_avg.min
1464 ± 23% -55% 664 ± 30% -63% 546 sched_debug.cfs_rq:/.load_avg.min
1830 ± 3% -50% 911 ± 5% -65% 642 sched_debug.cfs_rq:/.load_avg.avg
0.30 ± 13% 22% 0.36 ± 11% 86% 0.56 sched_debug.cfs_rq:/.nr_running.avg
2302 ± 11% -31% 1589 -50% 1157 sched_debug.cfs_rq:/.load_avg.max
819 ± 3% 36% 1116 15% 940 sched_debug.cfs_rq:/.util_avg.max
728 -14% 630 -9% 664 sched_debug.cfs_rq:/.util_avg.avg
73578 ± 34% 1043% 841209 ± 34% 452% 405848 sched_debug.cpu.load.max
1.81 ± 11% 77% 3.22 395% 8.98 sched_debug.cpu.clock.stddev
1.81 ± 11% 77% 3.22 395% 8.98 sched_debug.cpu.clock_task.stddev
8278 ± 10% 379% 39671 ± 18% 305% 33517 sched_debug.cpu.load.avg
3600 385% 17452 1023% 40419 sched_debug.cpu.nr_load_updates.min
5446 305% 22069 754% 46492 sched_debug.cpu.nr_load_updates.avg
8627 ± 5% 217% 27314 517% 53222 sched_debug.cpu.nr_load_updates.max
6221 ± 3% 2137% 139191 3486% 223092 sched_debug.cpu.nr_switches.max
15.67 ± 40% 3187% 515.00 1579% 263.00 sched_debug.cpu.cpu_load[0].max
2.55 ± 33% 3886% 101.45 1697% 45.73 sched_debug.cpu.cpu_load[0].stddev
15452 ± 19% 831% 143937 ± 22% 421% 80431 sched_debug.cpu.load.stddev
1144 236% 3839 329% 4911 sched_debug.cpu.nr_load_updates.stddev
23.67 ± 41% 709% 191.50 ± 6% 637% 174.33 sched_debug.cpu.nr_uninterruptible.max
978 7241% 71831 ± 3% 13746% 135493 sched_debug.cpu.nr_switches.avg
0.96 ± 19% 2503% 24.95 1720% 17.44 sched_debug.cpu.cpu_load[0].avg
957 ± 4% 3406% 33568 3626% 35679 sched_debug.cpu.nr_switches.stddev
29644 ± 16% 107% 61350 ± 8% 190% 86111 sched_debug.cpu.clock.max
29644 ± 16% 107% 61350 ± 8% 190% 86111 sched_debug.cpu.clock_task.max
29640 ± 16% 107% 61344 ± 8% 190% 86096 sched_debug.cpu.clock.avg
29640 ± 16% 107% 61344 ± 8% 190% 86096 sched_debug.cpu.clock_task.avg
29635 ± 16% 107% 61338 ± 8% 190% 86079 sched_debug.cpu.clock.min
29635 ± 16% 107% 61338 ± 8% 190% 86079 sched_debug.cpu.clock_task.min
335 ± 4% 7948% 27014 22596% 76183 sched_debug.cpu.nr_switches.min
1.62 ± 32% 1784% 30.61 ± 3% 1100% 19.51 sched_debug.cpu.cpu_load[4].avg
5.46 ± 15% 2325% 132.40 1031% 61.73 sched_debug.cpu.nr_uninterruptible.stddev
424 ± 11% 106% 875 ± 13% 263% 1541 sched_debug.cpu.curr->pid.avg
1400 166% 3721 264% 5100 sched_debug.cpu.curr->pid.max
610 ± 3% 108% 1269 126% 1380 sched_debug.cpu.curr->pid.stddev
0.43 ± 15% 4% 0.45 ± 16% 48% 0.64 sched_debug.cpu.nr_running.avg
253789 ± 13% -5% 241499 ± 3% -22% 198383 sched_debug.cpu.avg_idle.stddev
29638 ± 16% 107% 61339 ± 8% 190% 86079 sched_debug.cpu_clk
28529 ± 17% 111% 60238 ± 8% 198% 84957 sched_debug.ktime
0.17 -74% 0.04 ± 8% -83% 0.03 sched_debug.rt_rq:/.rt_time.avg
0.85 ± 3% -74% 0.22 ± 8% -83% 0.14 sched_debug.rt_rq:/.rt_time.stddev
5.14 ± 10% -75% 1.28 ± 6% -83% 0.88 sched_debug.rt_rq:/.rt_time.max
29638 ± 16% 107% 61339 ± 8% 190% 86079 sched_debug.sched_clk
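The %change columns throughout these tables compare each commit's metric against the base commit (f0c6bcba74 in the table above). A minimal sketch of how such a relative change is computed — the helper name `pct_change` is my own, not part of lkp-tests:

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` vs `base`, in percent, as shown
    in the %change columns of the comparison tables."""
    return (new - base) / base * 100.0

# e.g. aim7.jobs-per-min dropped from 486586 to 420342:
print(round(pct_change(486586, 420342), 1))  # -> -13.6
```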
aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
473257 468546 5% 497512 aim7.jobs-per-min
613996 11% 681283 -7% 571701 aim7.time.involuntary_context_switches
4914 4977 -6% 4634 aim7.time.system_time
114.83 115.98 -5% 109.23 aim7.time.elapsed_time
114.83 115.98 -5% 109.23 aim7.time.elapsed_time.max
60711 ± 8% 20% 73007 -9% 55449 aim7.time.voluntary_context_switches
2509 -6% 2360 -4% 2416 aim7.time.maximum_resident_set_size
362268 19% 430263 -8% 332046 softirqs.RCU
352 ± 7% -32% 238 -35% 230 vmstat.procs.r
5 ± 16% 80% 9 -40% 3 vmstat.procs.b
9584 7% 10255 -10% 8585 vmstat.system.cs
20442 ± 5% 38% 28201 -40% 12270 cpuidle.C1-IVT.usage
3.95 -3% 3.81 9% 4.29 turbostat.CPU%c1
0.81 ± 14% 44% 1.17 28% 1.04 turbostat.Pkg%pc6
19711 ± 5% -7% 18413 -17% 16384 meminfo.AnonHugePages
3974485 3977216 27% 5046310 meminfo.DirectMap2M
139742 ± 4% 137012 -17% 116493 meminfo.DirectMap4k
244933 ± 4% -7% 228626 15% 280670 meminfo.PageTables
12.47 ± 39% 84% 22.89 64% 20.46 perf-profile.func.cycles-pp.poll_idle
57.44 ± 6% -10% 51.55 -13% 50.13 perf-profile.func.cycles-pp.intel_idle
0.20 3% 0.20 -5% 0.19 perf-stat.branch-miss-rate
5.356e+08 4% 5.552e+08 -6% 5.046e+08 perf-stat.branch-misses
1113549 7% 1187535 -15% 951607 perf-stat.context-switches
1.48e+13 1.491e+13 -6% 1.397e+13 perf-stat.cpu-cycles
101697 ± 3% 9% 111167 -3% 98319 perf-stat.cpu-migrations
0.69 ± 20% -17% 0.57 139% 1.65 perf-stat.dTLB-load-miss-rate
3.264e+09 ± 19% -17% 2.712e+09 148% 8.084e+09 perf-stat.dTLB-load-misses
4.695e+11 4.718e+11 4.818e+11 perf-stat.dTLB-loads
3.276e+11 ± 3% 3.303e+11 8% 3.528e+11 perf-stat.dTLB-stores
56.47 ± 19% 41% 79.48 -58% 23.96 perf-stat.iTLB-load-miss-rate
48864487 ± 4% 7% 52183944 -12% 43166037 perf-stat.iTLB-load-misses
40455495 ± 41% -67% 13468883 239% 1.37e+08 perf-stat.iTLB-loads
29278 ± 4% -6% 27480 12% 32844 perf-stat.instructions-per-iTLB-miss
0.10 0.10 5% 0.10 perf-stat.ipc
47.16 46.36 46.51 perf-stat.node-store-miss-rate
6568 ± 44% -59% 2721 -71% 1916 numa-meminfo.node0.Shmem
194395 7% 207086 15% 224164 numa-meminfo.node0.Active
10218 ± 24% -37% 6471 -36% 6494 numa-meminfo.node0.Mapped
7496 ± 34% -97% 204 37% 10278 numa-meminfo.node0.AnonHugePages
178888 6% 188799 16% 208213 numa-meminfo.node0.AnonPages
179468 6% 191062 17% 209704 numa-meminfo.node0.Active(anon)
256890 -15% 219489 -15% 219503 numa-meminfo.node1.Active
12213 ± 24% 49% 18208 -50% 6105 numa-meminfo.node1.AnonHugePages
45080 ± 23% -33% 30138 87% 84468 numa-meminfo.node1.PageTables
241623 -15% 204604 -16% 203913 numa-meminfo.node1.Active(anon)
240637 -15% 204491 -15% 203847 numa-meminfo.node1.AnonPages
23782392 ±139% 673% 1.838e+08 -100% 0 latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
61157 ± 4% -6% 57187 14% 69751 proc-vmstat.nr_page_table_pages
1641 ± 44% -59% 679 -71% 478 numa-vmstat.node0.nr_shmem
2655 ± 23% -35% 1715 -35% 1726 numa-vmstat.node0.nr_mapped
44867 5% 47231 16% 52261 numa-vmstat.node0.nr_anon_pages
45014 6% 47793 17% 52636 numa-vmstat.node0.nr_zone_active_anon
45014 6% 47793 17% 52636 numa-vmstat.node0.nr_active_anon
11300 ± 23% -33% 7542 88% 21209 numa-vmstat.node1.nr_page_table_pages
60581 -16% 51156 -15% 51193 numa-vmstat.node1.nr_zone_active_anon
60581 -16% 51156 -15% 51193 numa-vmstat.node1.nr_active_anon
60328 -15% 51127 -15% 51174 numa-vmstat.node1.nr_anon_pages
13671 13608 11% 15190 slabinfo.cred_jar.active_objs
13707 13608 11% 15231 slabinfo.cred_jar.num_objs
24109 24386 -11% 21574 slabinfo.kmalloc-16.active_objs
24109 24386 -11% 21574 slabinfo.kmalloc-16.num_objs
13709 ± 6% 13391 -15% 11600 slabinfo.kmalloc-512.active_objs
13808 ± 6% 13454 -16% 11657 slabinfo.kmalloc-512.num_objs
1456658 4% 1511260 15% 1675984 sched_debug.cfs_rq:/.min_vruntime.min
441613 ± 3% -28% 316751 -76% 105734 sched_debug.cfs_rq:/.min_vruntime.stddev
443999 ± 3% -28% 318033 -76% 106909 sched_debug.cfs_rq:/.spread0.stddev
2657974 2625551 -19% 2158111 sched_debug.cfs_rq:/.min_vruntime.max
0.22 ± 23% 96% 0.43 109% 0.46 sched_debug.cfs_rq:/.nr_spread_over.stddev
1.50 100% 3.00 133% 3.50 sched_debug.cfs_rq:/.nr_spread_over.max
111.95 ± 26% 15% 128.92 128% 254.81 sched_debug.cfs_rq:/.exec_clock.stddev
802 3% 829 -16% 671 sched_debug.cfs_rq:/.load_avg.min
874 879 -11% 780 sched_debug.cfs_rq:/.load_avg.avg
1256 ± 17% -20% 1011 -24% 957 sched_debug.cfs_rq:/.load_avg.max
1.33 ± 35% -100% 0.00 200% 4.00 sched_debug.cpu.cpu_load[4].min
4.56 ± 6% -11% 4.07 -27% 3.33 sched_debug.cpu.cpu_load[4].stddev
4.76 ± 3% -13% 4.14 -30% 3.35 sched_debug.cpu.cpu_load[3].stddev
25.17 ± 12% -26% 18.50 -21% 20.00 sched_debug.cpu.cpu_load[3].max
25.67 ± 9% -32% 17.50 -24% 19.50 sched_debug.cpu.cpu_load[0].max
4.67 ± 3% -17% 3.90 -22% 3.62 sched_debug.cpu.cpu_load[0].stddev
4.88 -15% 4.14 -31% 3.39 sched_debug.cpu.cpu_load[2].stddev
26.17 ± 10% -29% 18.50 -25% 19.50 sched_debug.cpu.cpu_load[2].max
7265 4% 7556 -12% 6419 sched_debug.cpu.nr_switches.avg
9.41 ± 10% 9.67 21% 11.38 sched_debug.cpu.cpu_load[1].avg
9.03 ± 12% 3% 9.32 23% 11.09 sched_debug.cpu.cpu_load[0].avg
4140 ± 4% -11% 3698 -11% 3703 sched_debug.cpu.nr_switches.stddev
9.41 ± 10% 3% 9.71 22% 11.49 sched_debug.cpu.cpu_load[3].avg
4690 4821 -9% 4257 sched_debug.cpu.nr_switches.min
9.39 ± 9% 3% 9.69 23% 11.52 sched_debug.cpu.cpu_load[4].avg
9.43 ± 10% 9.71 21% 11.44 sched_debug.cpu.cpu_load[2].avg
57.92 ± 18% -4% 55.55 -23% 44.50 sched_debug.cpu.nr_uninterruptible.stddev
3002 ± 3% 10% 3288 31% 3919 sched_debug.cpu.curr->pid.avg
6666 6652 -10% 6025 sched_debug.cpu.curr->pid.max
1379 1361 -19% 1118 sched_debug.cpu.ttwu_local.avg
1849 ± 3% -12% 1628 -18% 1517 sched_debug.cpu.ttwu_local.stddev
1679 ± 8% 4% 1738 -15% 1423 sched_debug.cpu.curr->pid.stddev
1.58 ± 33% -11% 1.41 65% 2.60 sched_debug.cpu.nr_running.avg
1767 6% 1880 -16% 1489 sched_debug.cpu.ttwu_count.avg
506 ± 6% -15% 430 -17% 419 sched_debug.cpu.ttwu_count.min
7139 8% 7745 -11% 6355 sched_debug.cpu.sched_count.avg
4355 6% 4625 -11% 3884 sched_debug.cpu.sched_count.min
4.91 ± 3% -16% 4.13 -28% 3.52 sched_debug.cpu.cpu_load[1].stddev
26.67 ± 9% -29% 19.00 -27% 19.50 sched_debug.cpu.cpu_load[1].max
209 ± 8% 19% 247 -15% 178 sched_debug.cpu.sched_goidle.avg
5.67 ± 27% -12% 5.00 50% 8.50 sched_debug.cpu.nr_running.max
36072 ± 7% 70% 61152 17% 42236 sched_debug.cpu.sched_count.max
2008 -8% 1847 -18% 1645 sched_debug.cpu.ttwu_count.stddev
0.07 ± 19% -20% 0.06 186% 0.21 sched_debug.rt_rq:/.rt_time.avg
0.36 ± 17% -23% 0.28 142% 0.88 sched_debug.rt_rq:/.rt_time.stddev
2.33 ± 15% -27% 1.70 87% 4.35 sched_debug.rt_rq:/.rt_time.max
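Rows carrying a "± N%" marker had noticeable run-to-run variance across samples, so a reported %change is only trustworthy when it clearly exceeds that noise. A rough heuristic for filtering such rows — my own rule of thumb, not something lkp-tests applies:

```python
def significant(change_pct: float,
                base_stddev_pct: float,
                new_stddev_pct: float) -> bool:
    """Treat a reported %change as meaningful only if it exceeds
    the combined run-to-run noise of both samples (a heuristic)."""
    noise = base_stddev_pct + new_stddev_pct
    return abs(change_pct) > noise

# e.g. aim7.time.voluntary_context_switches above: +20% change
# against +/-8% base-sample noise (no marker on the new sample):
print(significant(20, 8, 0))  # -> True
```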
aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
270459 272267 ± 3% -48% 139834 ± 3% aim7.jobs-per-min
21229 ± 5% 20896 ± 3% 449% 116516 ± 6% aim7.time.involuntary_context_switches
1461 ± 5% 1454 ± 5% 174% 3998 ± 3% aim7.time.system_time
155368 153149 149% 386164 aim7.time.minor_page_faults
66.84 66.41 ± 3% 93% 129.07 ± 3% aim7.time.elapsed_time
66.84 66.41 ± 3% 93% 129.07 ± 3% aim7.time.elapsed_time.max
328369 3% 339077 96% 644393 aim7.time.voluntary_context_switches
49489 ± 11% -8% 45459 39% 68941 ± 4% interrupts.CAL:Function_call_interrupts
96.62 ± 7% 97.09 61% 155.12 uptime.boot
186640 ± 10% 186707 127% 424522 ± 4% softirqs.RCU
146596 147043 37% 201373 softirqs.SCHED
1005660 ± 3% 991053 ± 4% 118% 2196513 softirqs.TIMER
66.84 66.41 ± 3% 93% 129.07 ± 3% time.elapsed_time
66.84 66.41 ± 3% 93% 129.07 ± 3% time.elapsed_time.max
21229 ± 5% 20896 ± 3% 449% 116516 ± 6% time.involuntary_context_switches
155368 153149 149% 386164 time.minor_page_faults
2212 2215 41% 3112 time.percent_of_cpu_this_job_got
1461 ± 5% 1454 ± 5% 174% 3998 ± 3% time.system_time
328369 3% 339077 96% 644393 time.voluntary_context_switches
1197810 ± 16% -67% 393936 ± 40% -56% 530668 ± 43% numa-numastat.node0.numa_miss
1196269 ± 16% -68% 387751 ± 40% -55% 533013 ± 42% numa-numastat.node1.numa_foreign
112 159% 292 ± 4% 146% 277 vmstat.memory.buff
16422228 16461619 -28% 11832310 vmstat.memory.free
22 -3% 22 87% 42 ± 3% vmstat.procs.r
48853 48768 50273 vmstat.system.in
125202 8% 135626 51% 189515 ± 4% cpuidle.C1-IVT.usage
28088338 ± 3% 11% 31082173 17% 32997314 ± 5% cpuidle.C1-IVT.time
3471814 27% 4422338 ± 15% 2877% 1.034e+08 ± 3% cpuidle.C1E-IVT.time
33353 8% 36128 703% 267725 cpuidle.C1E-IVT.usage
11371800 9% 12381174 244% 39113028 cpuidle.C3-IVT.time
64048 5% 67490 62% 103940 cpuidle.C3-IVT.usage
1.637e+09 1.631e+09 20% 1.959e+09 cpuidle.C6-IVT.time
1861259 4% 1931551 19% 2223599 cpuidle.C6-IVT.usage
230 ± 9% 42% 326 1631% 3986 cpuidle.POLL.usage
1724995 ± 41% 54% 2656939 ± 10% 112% 3662791 cpuidle.POLL.time
48.48 48.15 36% 65.85 turbostat.%Busy
1439 1431 36% 1964 turbostat.Avg_MHz
33.28 33.45 -25% 24.85 turbostat.CPU%c1
18.09 ± 3% 18.24 ± 4% -49% 9.16 turbostat.CPU%c6
134 133 8% 144 turbostat.CorWatt
5.39 ± 17% 4% 5.63 ± 8% -34% 3.54 turbostat.Pkg%pc2
2.97 ± 44% -17% 2.48 ± 32% -70% 0.91 ± 22% turbostat.Pkg%pc6
167 167 6% 177 turbostat.PkgWatt
10.27 10.43 -14% 8.79 turbostat.RAMWatt
44376005 -100% 205734 -100% 214640 meminfo.Active
44199835 -100% 30412 -100% 30241 meminfo.Active(file)
103029 ± 3% 27% 130507 ± 6% 29% 133114 ± 8% meminfo.CmaFree
124701 ± 4% 123685 ± 14% 16% 144180 ± 3% meminfo.DirectMap4k
7886 ± 4% 7993 ± 5% 144% 19231 ± 7% meminfo.Dirty
2472446 1791% 46747572 1976% 51320420 meminfo.Inactive
2463353 1797% 46738477 1983% 51311261 meminfo.Inactive(file)
16631615 16664565 -28% 11936074 meminfo.MemFree
4.125e+11 -5% 3.927e+11 103% 8.36e+11 perf-stat.branch-instructions
0.41 -20% 0.33 -43% 0.23 perf-stat.branch-miss-rate
1.671e+09 -23% 1.28e+09 16% 1.946e+09 perf-stat.branch-misses
7.138e+09 -3% 6.917e+09 23% 8.746e+09 perf-stat.cache-misses
2.036e+10 -4% 1.956e+10 22% 2.476e+10 perf-stat.cache-references
821470 4% 851532 88% 1548125 ± 3% perf-stat.context-switches
4.93e+12 ± 3% -4% 4.755e+12 ± 4% 154% 1.25e+13 perf-stat.cpu-cycles
125073 4% 129993 167% 333599 perf-stat.cpu-migrations
3.595e+09 ± 16% -19% 2.895e+09 ± 17% 39% 4.987e+09 ± 10% perf-stat.dTLB-load-misses
6.411e+11 6.339e+11 ± 3% 57% 1.004e+12 perf-stat.dTLB-loads
0.06 ± 3% -42% 0.04 87% 0.12 ± 3% perf-stat.dTLB-store-miss-rate
2.738e+08 -39% 1.675e+08 64% 4.502e+08 ± 5% perf-stat.dTLB-store-misses
4.321e+11 5% 4.552e+11 -12% 3.81e+11 ± 8% perf-stat.dTLB-stores
2.343e+12 -5% 2.229e+12 67% 3.918e+12 perf-stat.instructions
46162 ± 41% 46733 ± 3% 55% 71500 perf-stat.instructions-per-iTLB-miss
0.48 ± 4% 0.47 ± 5% -34% 0.31 perf-stat.ipc
325877 322934 115% 699924 perf-stat.minor-faults
42.88 3% 44.33 43.65 perf-stat.node-load-miss-rate
9.499e+08 9.578e+08 66% 1.581e+09 perf-stat.node-load-misses
1.266e+09 -5% 1.203e+09 61% 2.04e+09 perf-stat.node-loads
39.17 40.00 8% 42.12 perf-stat.node-store-miss-rate
3.198e+09 4% 3.318e+09 36% 4.344e+09 perf-stat.node-store-misses
4.966e+09 4.977e+09 20% 5.968e+09 perf-stat.node-stores
325852 322963 115% 699918 perf-stat.page-faults
21719324 -100% 15215 ± 3% -100% 14631 numa-meminfo.node0.Active(file)
1221037 1806% 23278263 1969% 25269114 numa-meminfo.node0.Inactive(file)
1223564 1803% 23286857 1965% 25269597 numa-meminfo.node0.Inactive
21811771 -100% 102448 -100% 104424 numa-meminfo.node0.Active
2971 ± 13% -8% 2734 ± 3% 157% 7626 ± 4% numa-meminfo.node0.Dirty
8476780 8356206 -27% 6162743 numa-meminfo.node0.MemFree
617361 611434 11% 687829 numa-meminfo.node0.SReclaimable
1249068 1779% 23471025 1985% 26046948 numa-meminfo.node1.Inactive
1242501 1789% 23470523 1996% 26038272 numa-meminfo.node1.Inactive(file)
22500867 -100% 15202 ± 4% -100% 15613 numa-meminfo.node1.Active(file)
22584509 -100% 103192 ± 6% -100% 109976 numa-meminfo.node1.Active
4814 ± 13% 4957 ± 5% 135% 11335 numa-meminfo.node1.Dirty
8132889 8297084 ± 3% -29% 5777419 ± 3% numa-meminfo.node1.MemFree
83641 ± 7% 5% 87990 ± 7% 13% 94363 numa-meminfo.node1.Active(anon)
82877 ± 7% 4% 86528 ± 6% 13% 93620 numa-meminfo.node1.AnonPages
0 0 842360 ±100% latency_stats.avg.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
113 ±173% 232% 376 ±100% 2e+05% 203269 ± 4% latency_stats.hits.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5245 ± 14% 5325 ± 3% 535% 33286 ± 23% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1133 ±173% 113% 2416 ±100% 1351% 16434 latency_stats.max.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
0 0 842360 ±100% latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
7813 ± 13% -33% 5197 ± 9% 403% 39305 ± 18% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
5271 ± 13% -3% 5091 ± 5% 288% 20467 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
10369 ± 17% -41% 6086 ± 21% -96% 385 ±100% latency_stats.max.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
94417 ±173% 556% 619712 ±100% 3e+05% 3.061e+08 ± 5% latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
22126648 ± 4% 22776886 1311% 3.123e+08 ± 7% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
2536 ±117% -98% 48 ± 43% 2059% 54765 ±100% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1702264 ± 3% 5% 1790192 509% 10359205 ± 6% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1180839 ± 3% 5% 1238547 453% 6527115 ± 5% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
467 ±173% 680% 3644 ± 4% 7e+05% 3196407 ± 3% latency_stats.sum.xfs_iget.xfs_ialloc.xfs_dir_ialloc.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
0 0 842360 ±100% latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
159018 ± 43% -49% 81514 ± 19% -99% 999 ±100% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
1084 ± 5% 21% 1313 ± 3% 86% 2018 proc-vmstat.kswapd_high_wmark_hit_quickly
1817 ± 3% 38% 2511 ± 3% 175% 4989 proc-vmstat.kswapd_low_wmark_hit_quickly
11055004 -100% 7603 -100% 7559 proc-vmstat.nr_active_file
1993 2013 ± 4% 128% 4553 ± 5% proc-vmstat.nr_dirty
25746 ± 3% 26% 32494 ± 6% 29% 33319 ± 8% proc-vmstat.nr_free_cma
4152484 4162399 -28% 2984494 proc-vmstat.nr_free_pages
615907 1798% 11688190 1983% 12827366 proc-vmstat.nr_inactive_file
11055042 -100% 7603 -100% 7559 proc-vmstat.nr_zone_active_file
615904 1798% 11688234 1983% 12827434 proc-vmstat.nr_zone_inactive_file
2016 ± 3% 2025 ± 4% 127% 4582 ± 4% proc-vmstat.nr_zone_write_pending
2912 ± 3% 32% 3834 ± 3% 141% 7009 proc-vmstat.pageoutrun
5380414 -100% 2502 ± 3% -100% 2602 ± 3% proc-vmstat.pgactivate
61925072 -100% 0 -100% 0 proc-vmstat.pgdeactivate
348105 343315 108% 723517 proc-vmstat.pgfault
61932469 -100% 0 -100% 0 proc-vmstat.pgrefill
5432311 -100% 3802 ± 3% -100% 3657 numa-vmstat.node0.nr_zone_active_file
5432276 -100% 3802 ± 3% -100% 3657 numa-vmstat.node0.nr_active_file
305236 1802% 5806215 1969% 6314975 numa-vmstat.node0.nr_zone_inactive_file
305239 1802% 5806170 1969% 6314910 numa-vmstat.node0.nr_inactive_file
748 ± 7% -20% 597 ± 10% 114% 1602 numa-vmstat.node0.nr_dirty
775 ± 7% -21% 610 ± 12% 112% 1642 numa-vmstat.node0.nr_zone_write_pending
2116796 2102494 ± 3% -27% 1543100 numa-vmstat.node0.nr_free_pages
154392 152538 11% 171898 numa-vmstat.node0.nr_slab_reclaimable
310642 1784% 5853811 1995% 6507801 numa-vmstat.node1.nr_zone_inactive_file
310642 1784% 5853814 1995% 6507801 numa-vmstat.node1.nr_inactive_file
5627293 -100% 3799 ± 4% -100% 3903 numa-vmstat.node1.nr_zone_active_file
5627293 -100% 3799 ± 4% -100% 3903 numa-vmstat.node1.nr_active_file
1206 ± 16% 14% 1373 129% 2758 ± 10% numa-vmstat.node1.nr_zone_write_pending
1205 ± 16% 14% 1373 129% 2757 ± 10% numa-vmstat.node1.nr_dirty
2031121 2088592 ± 3% -29% 1446172 ± 3% numa-vmstat.node1.nr_free_pages
25743 ± 3% 27% 32608 ± 7% 30% 33415 ± 8% numa-vmstat.node1.nr_free_cma
20877 ± 7% 6% 22077 ± 6% 13% 23620 numa-vmstat.node1.nr_zone_active_anon
20877 ± 7% 6% 22077 ± 6% 13% 23620 numa-vmstat.node1.nr_active_anon
20684 ± 7% 5% 21709 ± 6% 13% 23431 numa-vmstat.node1.nr_anon_pages
4687 4704 11% 5205 ± 3% slabinfo.kmalloc-128.num_objs
4687 4704 11% 5205 ± 3% slabinfo.kmalloc-128.active_objs
1401 -19% 1142 8% 1516 ± 6% slabinfo.xfs_efd_item.num_objs
1401 -19% 1142 8% 1516 ± 6% slabinfo.xfs_efd_item.active_objs
1725 ± 5% -8% 1589 -12% 1518 slabinfo.xfs_inode.num_objs
1725 ± 5% -8% 1589 -12% 1518 slabinfo.xfs_inode.active_objs
382810 ± 4% 383813 ± 3% 301% 1535378 sched_debug.cfs_rq:/.min_vruntime.avg
249011 ± 6% 245840 ± 3% 420% 1294704 sched_debug.cfs_rq:/.min_vruntime.min
105216 106278 79% 188096 sched_debug.cfs_rq:/.min_vruntime.stddev
105260 106358 79% 188314 sched_debug.cfs_rq:/.spread0.stddev
9414 ± 4% 9361 ± 4% 230% 31092 sched_debug.cfs_rq:/.exec_clock.min
541056 ± 9% 540188 ± 3% 236% 1820030 sched_debug.cfs_rq:/.min_vruntime.max
150.87 ± 11% -21% 119.80 ± 10% 34% 202.73 ± 7% sched_debug.cfs_rq:/.util_avg.stddev
13783 13656 170% 37192 sched_debug.cfs_rq:/.exec_clock.avg
17625 17508 141% 42564 sched_debug.cfs_rq:/.exec_clock.max
3410.74 ± 3% 3458.30 38% 4706.14 sched_debug.cfs_rq:/.exec_clock.stddev
732 ± 11% 11% 809 ± 3% -34% 480 ± 7% sched_debug.cfs_rq:/.load_avg.min
844 ± 8% 7% 901 -33% 569 ± 4% sched_debug.cfs_rq:/.load_avg.avg
0.41 ± 7% 11% 0.46 ± 11% 21% 0.50 ± 5% sched_debug.cfs_rq:/.nr_running.avg
1339 ± 5% 1338 -32% 909 sched_debug.cfs_rq:/.load_avg.max
0.53 ± 4% -4% 0.51 32% 0.70 sched_debug.cfs_rq:/.nr_spread_over.avg
0.50 0.50 33% 0.67 sched_debug.cfs_rq:/.nr_spread_over.min
355.00 ± 26% -67% 118.75 ± 4% -82% 64.83 ± 20% sched_debug.cpu.cpu_load[4].max
18042 17697 135% 42380 sched_debug.cpu.nr_load_updates.min
51.83 ± 22% -66% 17.44 -78% 11.18 ± 5% sched_debug.cpu.cpu_load[4].stddev
22708 22546 111% 47986 sched_debug.cpu.nr_load_updates.avg
29633 ± 7% -7% 27554 83% 54243 sched_debug.cpu.nr_load_updates.max
48.83 ± 29% -65% 16.91 ± 29% -73% 13.34 ± 13% sched_debug.cpu.cpu_load[3].stddev
329.25 ± 34% -65% 113.75 ± 30% -76% 79.67 ± 28% sched_debug.cpu.cpu_load[3].max
17106 14% 19541 ± 19% 34% 22978 ± 6% sched_debug.cpu.nr_switches.max
1168 ± 4% -3% 1131 ± 4% 144% 2846 ± 21% sched_debug.cpu.ttwu_local.max
3826 ± 3% 3766 17% 4487 sched_debug.cpu.nr_load_updates.stddev
19.73 ± 12% -4% 18.86 ± 14% 59% 31.42 ± 8% sched_debug.cpu.nr_uninterruptible.avg
149.75 ± 8% 150.00 ± 11% 42% 212.50 sched_debug.cpu.nr_uninterruptible.max
98147 ± 34% 97985 ± 42% 59% 156085 ± 8% sched_debug.cpu.avg_idle.min
8554 ± 3% 4% 8896 ± 5% 62% 13822 sched_debug.cpu.nr_switches.avg
2582 ± 3% 11% 2857 ± 11% 19% 3083 ± 3% sched_debug.cpu.nr_switches.stddev
60029 ± 9% 60817 ± 7% 44% 86205 sched_debug.cpu.clock.max
60029 ± 9% 60817 ± 7% 44% 86205 sched_debug.cpu.clock_task.max
60020 ± 9% 60807 ± 7% 44% 86188 sched_debug.cpu.clock.avg
60020 ± 9% 60807 ± 7% 44% 86188 sched_debug.cpu.clock_task.avg
60008 ± 9% 60793 ± 7% 44% 86169 sched_debug.cpu.clock.min
60008 ± 9% 60793 ± 7% 44% 86169 sched_debug.cpu.clock_task.min
18.36 ± 7% -37% 11.60 ± 5% -33% 12.21 sched_debug.cpu.cpu_load[3].avg
5577 ± 6% 3% 5772 ± 6% 81% 10121 sched_debug.cpu.nr_switches.min
19.14 ± 3% -36% 12.24 -36% 12.33 sched_debug.cpu.cpu_load[4].avg
17.21 ± 14% -31% 11.90 ± 18% -27% 12.56 ± 6% sched_debug.cpu.cpu_load[2].avg
83.49 ± 7% 5% 87.64 ± 3% 17% 97.56 ± 4% sched_debug.cpu.nr_uninterruptible.stddev
3729 3735 18% 4409 ± 13% sched_debug.cpu.curr->pid.max
374 ± 9% -4% 360 ± 9% 157% 962 sched_debug.cpu.ttwu_local.min
665 671 122% 1479 sched_debug.cpu.ttwu_local.avg
196 ± 7% 5% 207 ± 8% 88% 369 ± 14% sched_debug.cpu.ttwu_local.stddev
1196 ± 4% 5% 1261 ± 6% 11% 1333 ± 10% sched_debug.cpu.curr->pid.stddev
0.45 ± 7% 17% 0.53 ± 16% 29% 0.58 ± 16% sched_debug.cpu.nr_running.avg
6738 ± 16% 8% 7296 ± 20% 52% 10236 sched_debug.cpu.ttwu_count.max
3952 ± 4% 5% 4150 ± 5% 75% 6917 sched_debug.cpu.ttwu_count.avg
913 22% 1117 ± 18% 42% 1302 ± 3% sched_debug.cpu.sched_goidle.stddev
2546 ± 4% 4% 2653 ± 7% 89% 4816 sched_debug.cpu.ttwu_count.min
5301 ± 6% 36% 7190 ± 33% 61% 8513 ± 8% sched_debug.cpu.sched_goidle.max
4683 ± 16% 14% 5355 ± 25% 52% 7125 sched_debug.cpu.sched_count.stddev
8262 ± 3% 6% 8746 ± 7% 68% 13912 sched_debug.cpu.sched_count.avg
5139 ± 5% 4% 5362 ± 6% 90% 9773 sched_debug.cpu.sched_count.min
2088 ± 6% 7% 2229 ± 5% 55% 3242 sched_debug.cpu.sched_goidle.min
3258 ± 4% 6% 3445 ± 6% 44% 4706 sched_debug.cpu.sched_goidle.avg
37088 ± 17% 12% 41540 ± 23% 60% 59447 sched_debug.cpu.sched_count.max
1007 ± 7% 13% 1139 ± 14% 38% 1386 ± 3% sched_debug.cpu.ttwu_count.stddev
262591 ± 4% -3% 253748 ± 4% -11% 232974 sched_debug.cpu.avg_idle.stddev
60009 ± 9% 60795 ± 7% 44% 86169 sched_debug.cpu_clk
58763 ± 9% 59673 ± 7% 45% 85068 sched_debug.ktime
60009 ± 9% 60795 ± 7% 44% 86169 sched_debug.sched_clk
aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
99091700659f4df9 fe9c2c81ed073878768785a985
---------------- --------------------------
69789 5% 73162 aim7.jobs-per-min
81603 -7% 75897 ± 5% aim7.time.involuntary_context_switches
3825 -6% 3583 aim7.time.system_time
129.08 -5% 123.16 aim7.time.elapsed_time
129.08 -5% 123.16 aim7.time.elapsed_time.max
2536 -4% 2424 aim7.time.maximum_resident_set_size
3145 131% 7253 ± 20% numa-numastat.node1.numa_miss
3145 131% 7253 ± 20% numa-numastat.node1.numa_foreign
7059 4% 7362 vmstat.system.cs
7481848 40% 10487336 ± 8% cpuidle.C1-IVT.time
1491314 75% 2607219 ± 10% cpuidle.POLL.time
67 10% 73 ± 4% turbostat.CoreTmp
66 12% 73 ± 4% turbostat.PkgTmp
5025792 -21% 3973802 meminfo.DirectMap2M
49098 12% 54859 meminfo.PageTables
3.94 97% 7.76 ± 18% perf-profile.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
11.88 -24% 8.99 ± 14% perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
11.63 -25% 8.78 ± 13% perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
8.412e+11 -7% 7.83e+11 perf-stat.branch-instructions
0.30 0.29 perf-stat.branch-miss-rate
2.495e+09 -8% 2.292e+09 perf-stat.branch-misses
4.277e+09 -6% 4.003e+09 perf-stat.cache-misses
1.396e+10 -5% 1.327e+10 perf-stat.cache-references
1.224e+13 -8% 1.12e+13 perf-stat.cpu-cycles
0.58 -57% 0.25 ± 16% perf-stat.dTLB-load-miss-rate
5.407e+09 -60% 2.175e+09 ± 18% perf-stat.dTLB-load-misses
9.243e+11 -6% 8.708e+11 perf-stat.dTLB-loads
0.17 -58% 0.07 ± 4% perf-stat.dTLB-store-miss-rate
4.368e+08 -50% 2.177e+08 ± 3% perf-stat.dTLB-store-misses
2.549e+11 19% 3.041e+11 perf-stat.dTLB-stores
3.737e+12 -6% 3.498e+12 perf-stat.instructions
0.31 0.31 perf-stat.ipc
439716 426816 perf-stat.minor-faults
2.164e+09 -7% 2.012e+09 perf-stat.node-load-misses
2.417e+09 -7% 2.259e+09 perf-stat.node-loads
1.24e+09 -3% 1.198e+09 perf-stat.node-store-misses
1.556e+09 -4% 1.501e+09 perf-stat.node-stores
439435 426823 perf-stat.page-faults
51452 14% 58403 ± 8% numa-meminfo.node0.Active(anon)
10472 -36% 6692 ± 45% numa-meminfo.node1.Shmem
7665 74% 13316 numa-meminfo.node1.PageTables
6724 144% 16416 ± 43% latency_stats.avg.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
6724 144% 16416 ± 43% latency_stats.max.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
6724 144% 16416 ± 43% latency_stats.sum.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
12237 12% 13693 proc-vmstat.nr_page_table_pages
12824 14% 14578 ± 8% numa-vmstat.node0.nr_zone_active_anon
12824 14% 14578 ± 8% numa-vmstat.node0.nr_active_anon
2618 -36% 1672 ± 45% numa-vmstat.node1.nr_shmem
17453 24% 21726 ± 6% numa-vmstat.node1.numa_miss
1909 74% 3323 numa-vmstat.node1.nr_page_table_pages
17453 24% 21726 ± 6% numa-vmstat.node1.numa_foreign
922 24% 1143 ± 6% slabinfo.blkdev_requests.active_objs
922 24% 1143 ± 6% slabinfo.blkdev_requests.num_objs
569 21% 686 ± 11% slabinfo.file_lock_cache.active_objs
569 21% 686 ± 11% slabinfo.file_lock_cache.num_objs
9.07 16% 10.56 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.avg
18406 -14% 15835 ± 10% sched_debug.cfs_rq:/.load.stddev
0.67 150% 1.67 ± 43% sched_debug.cfs_rq:/.nr_spread_over.max
581 -11% 517 ± 4% sched_debug.cfs_rq:/.load_avg.min
659 -10% 596 ± 4% sched_debug.cfs_rq:/.load_avg.avg
784 -12% 692 ± 4% sched_debug.cfs_rq:/.load_avg.max
18086 -12% 15845 ± 9% sched_debug.cpu.load.stddev
18.72 -17% 15.49 ± 8% sched_debug.cpu.nr_uninterruptible.avg
69.33 42% 98.67 ± 7% sched_debug.cpu.nr_uninterruptible.max
317829 -12% 280218 ± 4% sched_debug.cpu.avg_idle.min
9.80 18% 11.54 ± 10% sched_debug.cpu.cpu_load[1].avg
8.91 15% 10.28 ± 9% sched_debug.cpu.cpu_load[0].avg
9.53 22% 11.64 ± 10% sched_debug.cpu.cpu_load[3].avg
7083 11% 7853 sched_debug.cpu.nr_switches.min
9.73 22% 11.90 ± 7% sched_debug.cpu.cpu_load[4].avg
9.68 20% 11.59 ± 11% sched_debug.cpu.cpu_load[2].avg
24.59 49% 36.53 ± 17% sched_debug.cpu.nr_uninterruptible.stddev
1176 12% 1319 ± 4% sched_debug.cpu.curr->pid.avg
373 35% 502 ± 6% sched_debug.cpu.ttwu_local.min
3644 13% 4120 ± 3% sched_debug.cpu.ttwu_count.min
4855 13% 5463 ± 6% sched_debug.cpu.sched_goidle.max
7019 10% 7745 sched_debug.cpu.sched_count.min
2305 10% 2529 ± 3% sched_debug.cpu.sched_goidle.min
0.00 -19% 0.00 ± 7% sched_debug.cpu.next_balance.stddev
0.68 -17% 0.57 ± 11% sched_debug.cpu.nr_running.stddev
0.05 27% 0.06 ± 14% sched_debug.rt_rq:/.rt_nr_running.stddev
Thanks,
Fengguang
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-14 14:50 ` Fengguang Wu
From: Fengguang Wu @ 2016-08-14 14:50 UTC (permalink / raw)
To: lkp
Hi Christoph,
On Sun, Aug 14, 2016 at 06:51:28AM +0800, Fengguang Wu wrote:
>Hi Christoph,
>
>On Sun, Aug 14, 2016 at 12:15:08AM +0200, Christoph Hellwig wrote:
>>Hi Fengguang,
>>
>>feel free to try this git tree:
>>
>> git://git.infradead.org/users/hch/vfs.git iomap-fixes
>
>I just queued some test jobs for it.
>
>% queue -q vip -t ivb44 -b hch-vfs/iomap-fixes aim7-fs-1brd.yaml fs=xfs -r3 -k fe9c2c81ed073878768785a985295cbacc349e42 -k ca2edab2e1d8f30dda874b7f717c2d4664991e9b -k 99091700659f4df965e138b38b4fa26a29b7eade
>
>That job file can be found here:
>
> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/jobs/aim7-fs-1brd.yaml
>
>It specifies a matrix of the below atom tests:
>
> wfg /c/lkp-tests% split-job jobs/aim7-fs-1brd.yaml -s 'fs: xfs'
>
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_src-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rr-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rw-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_cp-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_wrt-3000-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-sync_disk_rw-600-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-creat-clo-1500-performance.yaml
> jobs/aim7-fs-1brd.yaml => ./aim7-fs-1brd-1BRD_48G-xfs-disk_rd-9000-performance.yaml
I have some results now. The aim7 tests that have finished show
performance regressions for commit fe9c2c81 ("xfs: rewrite and
optimize the delalloc write path") compared to its parent commit
ca2edab2e and their base mainline commit 990917006 ("Merge tag
'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs").
wfg(a)inn ~% compare -g aim7 -ai 99091700659f4df965e138b38b4fa26a29b7eade ca2edab2e1d8f30dda874b7f717c2d4664991e9b fe9c2c81ed073878768785a985295cbacc349e42
tests: 4
60 perf-index fe9c2c81ed073878768785a985295cbacc349e42
97 power-index fe9c2c81ed073878768785a985295cbacc349e42
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985 testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
270459 272267 ± 3% -48% 139834 ± 3% aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
473257 468546 5% 497512 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360578 -18% 296589 -60% 144974 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
358701 -6% 335712 -40% 216057 GEO-MEAN aim7.jobs-per-min
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
48.48 48.15 36% 65.85 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
89.50 89.76 88.75 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
35.78 23% 43.93 76% 63.09 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
53.75 7% 57.48 33% 71.71 GEO-MEAN turbostat.%Busy
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
1439 1431 36% 1964 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
2671 2674 2650 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
1057 23% 1303 78% 1883 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
1595 7% 1708 34% 2139 GEO-MEAN turbostat.Avg_MHz
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
167 167 6% 177 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
175 175 176 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
150 8% 162 19% 178 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
164 168 8% 177 GEO-MEAN turbostat.PkgWatt
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
10.27 10.43 -14% 8.79 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
6.85 6.66 6.88 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
9.96 14% 11.36 -7% 9.23 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
8.88 4% 9.24 -7% 8.23 GEO-MEAN turbostat.RAMWatt
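The GEO-MEAN rows above aggregate each metric across the test cases.
As a minimal sketch (the exact lkp-tests implementation may differ),
the geometric mean can be computed in log space like this, reproducing
the turbostat.RAMWatt base-commit value of 8.88 from its three
per-testcase entries:

```python
import math

def geo_mean(values):
    # Geometric mean: the n-th root of the product of n values,
    # computed in log space to avoid overflow on large products.
    return math.exp(sum(math.log(v) for v in values) / len(values))

# turbostat.RAMWatt for the 99091700659f4df9 column above:
print(round(geo_mean([10.27, 6.85, 9.96]), 2))  # -> 8.88
```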
Here are the detailed numbers for each test case. The perf-profile and
latency_stats entries are now sorted by absolute change within each
sub-category, and all perf-profile entries > 5 are shown.
The long trace.call.funcs lines may be easier to read in vim with the
":set nowrap" option.
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
360578 -18% 294351 -60% 144974 aim7.jobs-per-min
12835 458% 71658 480% 74445 aim7.time.involuntary_context_switches
755 50% 1136 373% 3570 aim7.time.system_time
155970 152810 73% 269438 aim7.time.minor_page_faults
50.15 22% 61.39 148% 124.39 aim7.time.elapsed_time
50.15 22% 61.39 148% 124.39 aim7.time.elapsed_time.max
438660 428601 -7% 407807 aim7.time.voluntary_context_switches
2452 2480 5% 2584 aim7.time.maximum_resident_set_size
34293 ± 4% 70% 58129 ± 19% 213% 107483 interrupts.CAL:Function_call_interrupts
79.70 ± 6% 16% 92.63 ± 6% 89% 150.33 uptime.boot
2890 ± 8% 6% 3077 ± 8% 15% 3329 uptime.idle
150186 ± 9% 41% 212090 122% 333727 softirqs.RCU
161166 9% 176318 16% 186527 softirqs.SCHED
648051 33% 864346 222% 2089349 softirqs.TIMER
50.15 22% 61.39 148% 124.39 time.elapsed_time
50.15 22% 61.39 148% 124.39 time.elapsed_time.max
12835 458% 71658 480% 74445 time.involuntary_context_switches
155970 152810 73% 269438 time.minor_page_faults
1563 21% 1898 85% 2895 time.percent_of_cpu_this_job_got
755 50% 1136 373% 3570 time.system_time
4564660 ± 4% 68% 7651587 79% 8159302 numa-numastat.node0.numa_foreign
3929898 81% 7129718 46% 5733813 numa-numastat.node0.numa_miss
0 2 ± 20% 2 numa-numastat.node1.other_node
4569811 ± 4% 68% 7654689 79% 8163206 numa-numastat.node1.numa_miss
3935075 81% 7132850 46% 5737410 numa-numastat.node1.numa_foreign
34767917 4% 36214694 11% 38627727 numa-numastat.node1.numa_hit
34767917 4% 36214691 11% 38627725 numa-numastat.node1.local_node
12377 ± 18% 3615% 459790 2848% 364868 vmstat.io.bo
119 -8% 110 ± 4% -16% 101 vmstat.memory.buff
18826454 -16% 15748045 -37% 11882562 vmstat.memory.free
16 25% 20 106% 33 vmstat.procs.r
19407 469% 110509 520% 120350 vmstat.system.cs
48215 10% 52977 3% 49819 vmstat.system.in
142459 -11% 126667 -23% 109481 cpuidle.C1-IVT.usage
29494441 ± 3% -18% 24206809 -36% 18889149 cpuidle.C1-IVT.time
5736732 28% 7315830 525% 35868316 cpuidle.C1E-IVT.time
51148 9% 55743 98% 101021 cpuidle.C1E-IVT.usage
18347890 27% 23243942 21% 22154105 cpuidle.C3-IVT.time
96127 9% 104487 -29% 68552 cpuidle.C3-IVT.usage
1.525e+09 6% 1.617e+09 41% 2.147e+09 cpuidle.C6-IVT.time
1805218 11% 1998052 33% 2397285 cpuidle.C6-IVT.usage
286 ± 11% 14% 328 ± 7% 389% 1402 cpuidle.POLL.usage
1013526 ± 41% 98% 2003264 ± 20% 272% 3774675 cpuidle.POLL.time
35.78 24% 44.22 76% 63.09 turbostat.%Busy
1057 24% 1312 78% 1883 turbostat.Avg_MHz
34.80 -3% 33.63 -22% 27.18 turbostat.CPU%c1
0.34 -5% 0.33 -77% 0.08 turbostat.CPU%c3
29.07 -25% 21.82 -67% 9.65 turbostat.CPU%c6
118 11% 130 23% 145 turbostat.CorWatt
9.39 ± 13% -19% 7.61 ± 6% -61% 3.67 turbostat.Pkg%pc2
3.04 ± 33% -49% 1.55 ± 14% -76% 0.72 turbostat.Pkg%pc6
150 9% 164 19% 178 turbostat.PkgWatt
9.96 14% 11.34 -7% 9.23 turbostat.RAMWatt
18232 ± 8% -8% 16747 ± 10% 11% 20267 meminfo.AnonHugePages
80723 78330 -24% 61572 meminfo.CmaFree
4690642 ± 10% -15% 3981312 -15% 3983392 meminfo.DirectMap2M
1060897 -21% 834807 -22% 828755 meminfo.Dirty
2362330 26% 2983603 44% 3391287 meminfo.Inactive
2353250 26% 2974520 44% 3382139 meminfo.Inactive(file)
19388991 -18% 15966408 -38% 12038822 meminfo.MemFree
1186231 4% 1236627 13% 1341728 meminfo.SReclaimable
179570 3% 185696 14% 204382 meminfo.SUnreclaim
1365802 4% 1422323 13% 1546111 meminfo.Slab
318863 10% 352026 16% 368386 meminfo.Unevictable
0.00 0.00 9.15 perf-profile.cycles-pp.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write.xfs_file_buffered_aio_write
0.00 0.00 8.90 perf-profile.cycles-pp.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply.iomap_file_buffered_write
0.00 0.00 8.61 perf-profile.cycles-pp._raw_spin_lock.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin.iomap_apply
0.00 0.00 8.50 perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.xfs_inode_set_eofblocks_tag.xfs_file_iomap_begin_delay.isra.9.xfs_file_iomap_begin
6.05 -11% 5.42 ± 4% -15% 5.14 perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
6.54 -11% 5.80 ± 4% -16% 5.51 perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
16.78 -9% 15.34 ± 9% -11% 14.90 perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
16.51 ± 3% -9% 14.99 ± 9% -12% 14.49 perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
0.23 ± 23% 20% 0.28 ± 12% 3683% 8.70 perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath
4.369e+11 ± 4% 20% 5.239e+11 97% 8.601e+11 perf-stat.branch-instructions
0.38 5% 0.40 -27% 0.28 perf-stat.branch-miss-rate
1.678e+09 ± 3% 26% 2.117e+09 44% 2.413e+09 perf-stat.branch-misses
42.30 -7% 39.31 -5% 40.38 perf-stat.cache-miss-rate
6.874e+09 ± 4% 19% 8.21e+09 51% 1.041e+10 perf-stat.cache-misses
1.625e+10 ± 3% 29% 2.089e+10 59% 2.578e+10 perf-stat.cache-references
1017846 588% 7005227 1401% 15273586 perf-stat.context-switches
2.757e+12 ± 4% 48% 4.092e+12 318% 1.151e+13 perf-stat.cpu-cycles
177918 15% 204776 35% 241051 perf-stat.cpu-migrations
0.37 ± 14% 60% 0.60 ± 3% 45% 0.54 perf-stat.dTLB-load-miss-rate
2.413e+09 ± 14% 97% 4.757e+09 ± 4% 149% 6.001e+09 perf-stat.dTLB-load-misses
6.438e+11 23% 7.893e+11 71% 1.103e+12 perf-stat.dTLB-loads
0.06 ± 38% 100% 0.11 ± 6% 207% 0.17 perf-stat.dTLB-store-miss-rate
2.656e+08 ± 34% 123% 5.91e+08 ± 7% 203% 8.038e+08 perf-stat.dTLB-store-misses
45.99 ± 5% 8% 49.56 ± 11% 14% 52.61 perf-stat.iTLB-load-miss-rate
45151945 45832755 72% 77697494 perf-stat.iTLB-load-misses
53205262 ± 7% -10% 47792612 ± 21% 32% 69997751 perf-stat.iTLB-loads
2.457e+12 ± 4% 16% 2.851e+12 66% 4.084e+12 perf-stat.instructions
0.89 -22% 0.70 -60% 0.35 perf-stat.ipc
286640 8% 310690 99% 571225 perf-stat.minor-faults
29.16 7% 31.25 8% 31.42 perf-stat.node-load-miss-rate
4.86e+08 ± 3% 123% 1.084e+09 250% 1.7e+09 perf-stat.node-load-misses
1.18e+09 102% 2.385e+09 214% 3.711e+09 perf-stat.node-loads
21.51 30% 27.95 62% 34.86 perf-stat.node-store-miss-rate
1.262e+09 58% 1.989e+09 177% 3.499e+09 perf-stat.node-store-misses
4.606e+09 11% 5.126e+09 42% 6.539e+09 perf-stat.node-stores
286617 8% 310730 99% 571253 perf-stat.page-faults
1166432 23% 1429828 42% 1653754 numa-meminfo.node0.Inactive(file)
1175123 22% 1434274 41% 1662351 numa-meminfo.node0.Inactive
513534 -23% 394773 -24% 392567 numa-meminfo.node0.Dirty
9717968 -17% 8082393 -37% 6159862 numa-meminfo.node0.MemFree
159470 11% 176717 16% 184229 numa-meminfo.node0.Unevictable
23148226 7% 24783802 15% 26706333 numa-meminfo.node0.MemUsed
103531 ± 32% -10% 93669 ± 40% 40% 144469 numa-meminfo.node0.SUnreclaim
1187035 30% 1549075 46% 1727751 numa-meminfo.node1.Inactive
1186646 30% 1544438 46% 1727201 numa-meminfo.node1.Inactive(file)
21000905 3% 21647702 13% 23741428 numa-meminfo.node1.Active(file)
21083707 3% 21748741 13% 23822391 numa-meminfo.node1.Active
547021 -20% 438525 -21% 433706 numa-meminfo.node1.Dirty
9663240 -19% 7870896 -39% 5869977 numa-meminfo.node1.MemFree
561241 12% 625903 21% 679671 numa-meminfo.node1.SReclaimable
637259 ± 4% 13% 717863 ± 5% 16% 739482 numa-meminfo.node1.Slab
23329350 8% 25121687 16% 27122606 numa-meminfo.node1.MemUsed
159394 10% 175315 16% 184159 numa-meminfo.node1.Unevictable
521615 33% 695562 267% 1916159 latency_stats.avg.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
500644 33% 667614 261% 1805608 latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
8932 ± 46% -70% 2717 ± 4% -95% 464 latency_stats.avg.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
0 0 73327 latency_stats.hits.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
43 ± 37% 7923% 3503 ± 4% 31792% 13926 latency_stats.hits.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1422573 30% 1852368 ± 5% 228% 4672496 latency_stats.max.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
1423130 30% 1851873 ± 5% 228% 4661765 latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
627 ± 66% 3788% 24404 ± 17% 6254% 39883 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
3922 ± 18% 56% 6134 ± 29% 634% 28786 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
0 0 16665 latency_stats.max.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5.15e+10 25% 6.454e+10 220% 1.649e+11 latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
0 0 1.385e+08 latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
11666476 45% 16905624 755% 99756088 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
2216 ± 69% 80030% 1775681 ± 4% 3e+06% 67521154 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1601815 28% 2053992 288% 6213577 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1774397 20% 2120576 244% 6099374 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
628 ±141% 125% 1416 ± 5% 4e+05% 2677036 latency_stats.sum.xfs_iget.xfs_ialloc.xfs_dir_ialloc.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
6087 ± 92% 1277% 83839 ± 3% 11105% 682063 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
0 0 116108 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write
1212 ± 59% 1842% 23546 ± 7% 4861% 60149 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
1624 ± 22% 1356% 23637 ± 3% 1596% 27545 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
2068 ± 27% 834% 19319 ± 23% 1125% 25334 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
0 0 22155 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode
39 ± 71% 41280% 16414 ± 14% 51951% 20647 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
0 0 15600 latency_stats.sum.xlog_grant_head_wait.xlog_grant_head_check.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode
10 ±141% 6795% 689 ± 70% 1e+05% 10637 latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput.dentry_unlink_inode.__dentry_kill
99 ±112% 86% 185 ± 80% 9978% 10011 latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_imap_to_bp.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
18232 ±134% -16% 15260 ± 54% -40% 10975 latency_stats.sum.xfs_lock_two_inodes.xfs_remove.xfs_vn_unlink.vfs_unlink.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
647 ± 3% -97% 21 ± 19% 34% 868 proc-vmstat.kswapd_high_wmark_hit_quickly
1091 -97% 36 ± 9% 29% 1411 proc-vmstat.kswapd_low_wmark_hit_quickly
265066 -21% 208142 -22% 206344 proc-vmstat.nr_dirty
20118 19574 -23% 15432 proc-vmstat.nr_free_cma
4844108 -18% 3988031 -38% 3008251 proc-vmstat.nr_free_pages
588262 26% 743537 44% 845765 proc-vmstat.nr_inactive_file
50 ± 25% 192% 148 ± 15% 103% 103 proc-vmstat.nr_pages_scanned
296623 4% 309201 13% 335474 proc-vmstat.nr_slab_reclaimable
44880 3% 46405 14% 51078 proc-vmstat.nr_slab_unreclaimable
79716 10% 88008 16% 92097 proc-vmstat.nr_unevictable
167 ± 9% 9e+06% 14513434 2e+06% 3569348 proc-vmstat.nr_vmscan_immediate_reclaim
162380 ± 18% 4392% 7294622 7024% 11567602 proc-vmstat.nr_written
588257 26% 743537 44% 845784 proc-vmstat.nr_zone_inactive_file
79716 10% 88008 16% 92097 proc-vmstat.nr_zone_unevictable
265092 -21% 208154 -22% 206388 proc-vmstat.nr_zone_write_pending
8507451 ± 3% 74% 14784261 64% 13918067 proc-vmstat.numa_foreign
10 ± 4% 10 ± 4% 6e+05% 57855 proc-vmstat.numa_hint_faults
8507451 ± 3% 74% 14784187 64% 13918067 proc-vmstat.numa_miss
72 72 3e+05% 213175 proc-vmstat.numa_pte_updates
1740 -97% 59 ± 12% 33% 2306 proc-vmstat.pageoutrun
5322372 1068% 62167111 1024% 59824114 proc-vmstat.pgactivate
2816355 27% 3575784 14% 3203214 proc-vmstat.pgalloc_dma32
74392338 11% 82333943 14% 84954110 proc-vmstat.pgalloc_normal
60958397 -18% 49976330 -26% 45055885 proc-vmstat.pgdeactivate
302790 9% 329088 94% 586116 proc-vmstat.pgfault
61061205 14% 69758545 18% 72000453 proc-vmstat.pgfree
655652 ± 18% 4352% 29190304 6967% 46338056 proc-vmstat.pgpgout
60965725 -18% 49983704 -26% 45063375 proc-vmstat.pgrefill
2 ± 17% 4e+07% 985929 ± 8% 7e+07% 1952629 proc-vmstat.pgrotated
82046 ± 36% 50634% 41625211 5397% 4510385 proc-vmstat.pgscan_direct
60128369 -37% 38068394 10% 66306637 proc-vmstat.pgscan_kswapd
2030 ± 46% 1e+06% 27038054 ± 3% 78642% 1598733 proc-vmstat.pgsteal_direct
0 2414551 ± 3% 3694833 proc-vmstat.workingset_activate
0 2414551 ± 3% 3694833 proc-vmstat.workingset_refault
26 ± 39% 1e+07% 2657286 3e+06% 658792 numa-vmstat.node0.nr_vmscan_immediate_reclaim
40449 ± 22% 3135% 1308601 ± 4% 4723% 1950670 numa-vmstat.node0.nr_written
291648 22% 357059 42% 413612 numa-vmstat.node0.nr_zone_inactive_file
291655 22% 357053 42% 413596 numa-vmstat.node0.nr_inactive_file
1542314 ± 5% 77% 2731911 98% 3056411 numa-vmstat.node0.numa_foreign
1366073 ± 4% 103% 2766780 ± 3% 68% 2293117 numa-vmstat.node0.numa_miss
128634 -23% 99104 -24% 98062 numa-vmstat.node0.nr_dirty
128663 -23% 99130 -24% 98051 numa-vmstat.node0.nr_zone_write_pending
2424918 -16% 2033425 -37% 1537826 numa-vmstat.node0.nr_free_pages
14037168 10% 15473174 20% 16883787 numa-vmstat.node0.numa_local
14037172 10% 15473174 20% 16883790 numa-vmstat.node0.numa_hit
39867 10% 44022 16% 46058 numa-vmstat.node0.nr_zone_unevictable
39867 10% 44022 16% 46058 numa-vmstat.node0.nr_unevictable
25871 ± 32% -9% 23414 ± 40% 40% 36094 numa-vmstat.node0.nr_slab_unreclaimable
14851187 6% 15749527 11% 16497187 numa-vmstat.node0.nr_dirtied
0 1225299 ± 4% 2008478 numa-vmstat.node1.workingset_refault
0 1225299 ± 4% 2008478 numa-vmstat.node1.workingset_activate
23 ± 35% 1e+07% 2974198 ± 3% 3e+06% 683002 numa-vmstat.node1.nr_vmscan_immediate_reclaim
40769 ± 26% 3264% 1371611 ± 3% 5569% 2311374 numa-vmstat.node1.nr_written
25 ± 8% 216% 81 ± 3% 356% 117 numa-vmstat.node1.nr_pages_scanned
296681 30% 385708 45% 431591 numa-vmstat.node1.nr_zone_inactive_file
296681 30% 385709 45% 431591 numa-vmstat.node1.nr_inactive_file
5252547 5401234 13% 5936151 numa-vmstat.node1.nr_zone_active_file
5252547 5401238 13% 5936151 numa-vmstat.node1.nr_active_file
136060 -19% 110021 -21% 107114 numa-vmstat.node1.nr_zone_write_pending
136060 -19% 110019 -21% 107107 numa-vmstat.node1.nr_dirty
1520682 ± 3% 76% 2681012 98% 3008493 numa-vmstat.node1.numa_miss
2413468 -18% 1980184 -39% 1466738 numa-vmstat.node1.nr_free_pages
1344474 ± 3% 102% 2715690 ± 4% 67% 2245159 numa-vmstat.node1.numa_foreign
20160 19698 -22% 15673 numa-vmstat.node1.nr_free_cma
14350439 12% 16005551 27% 18257157 numa-vmstat.node1.numa_local
14350440 12% 16005552 27% 18257158 numa-vmstat.node1.numa_hit
15381788 9% 16829619 21% 18645441 numa-vmstat.node1.nr_dirtied
140354 11% 156202 21% 169950 numa-vmstat.node1.nr_slab_reclaimable
39848 10% 43676 16% 46041 numa-vmstat.node1.nr_zone_unevictable
39848 10% 43676 16% 46041 numa-vmstat.node1.nr_unevictable
377 ± 9% 370 ± 5% 24% 468 slabinfo.bdev_cache.active_objs
377 ± 9% 370 ± 5% 24% 468 slabinfo.bdev_cache.num_objs
389 ± 13% 604% 2737 ± 23% 3371% 13501 slabinfo.bio-1.active_objs
389 ± 13% 612% 2770 ± 24% 3441% 13774 slabinfo.bio-1.num_objs
7 ± 17% 1039% 83 ± 24% 3623% 273 slabinfo.bio-1.active_slabs
7 ± 17% 1039% 83 ± 24% 3623% 273 slabinfo.bio-1.num_slabs
978 ± 4% 10% 1075 17% 1144 slabinfo.blkdev_requests.active_objs
978 ± 4% 10% 1075 17% 1144 slabinfo.blkdev_requests.num_objs
10942119 3% 11286505 13% 12389701 slabinfo.buffer_head.num_objs
280566 3% 289397 13% 317684 slabinfo.buffer_head.active_slabs
280566 3% 289397 13% 317684 slabinfo.buffer_head.num_slabs
10941627 10693692 11% 12140372 slabinfo.buffer_head.active_objs
7436 ± 3% 7558 20% 8922 slabinfo.cred_jar.active_objs
7436 ± 3% 7558 20% 8922 slabinfo.cred_jar.num_objs
4734 85% 8767 ± 8% 60% 7554 slabinfo.kmalloc-128.num_objs
4734 78% 8418 ± 8% 45% 6848 slabinfo.kmalloc-128.active_objs
17074 -11% 15121 -10% 15379 slabinfo.kmalloc-256.num_objs
3105 4% 3216 14% 3527 slabinfo.kmalloc-4096.num_objs
3061 4% 3170 12% 3419 slabinfo.kmalloc-4096.active_objs
13131 ± 3% 17% 15379 12% 14714 slabinfo.kmalloc-512.num_objs
1623 ± 3% 1664 ± 3% 16% 1889 slabinfo.mnt_cache.active_objs
1623 ± 3% 1664 ± 3% 16% 1889 slabinfo.mnt_cache.num_objs
2670 6% 2821 19% 3178 slabinfo.nsproxy.active_objs
2670 6% 2821 19% 3178 slabinfo.nsproxy.num_objs
2532 5% 2656 17% 2959 slabinfo.posix_timers_cache.active_objs
2532 5% 2656 17% 2959 slabinfo.posix_timers_cache.num_objs
20689 87% 38595 ± 13% 47% 30452 slabinfo.radix_tree_node.active_objs
399 83% 730 ± 13% 47% 587 slabinfo.radix_tree_node.active_slabs
399 83% 730 ± 13% 47% 587 slabinfo.radix_tree_node.num_slabs
22379 83% 40931 ± 13% 47% 32872 slabinfo.radix_tree_node.num_objs
4688 4706 22% 5712 slabinfo.sigqueue.active_objs
4688 4706 22% 5712 slabinfo.sigqueue.num_objs
979 ± 4% 7% 1046 ± 3% -15% 833 slabinfo.task_group.active_objs
979 ± 4% 7% 1046 ± 3% -15% 833 slabinfo.task_group.num_objs
1344 5% 1410 17% 1570 slabinfo.xfs_btree_cur.active_objs
1344 5% 1410 17% 1570 slabinfo.xfs_btree_cur.num_objs
2500 5% 2632 18% 2946 slabinfo.xfs_da_state.active_objs
2500 5% 2632 18% 2946 slabinfo.xfs_da_state.num_objs
1299 279% 4917 ± 17% 134% 3035 slabinfo.xfs_efd_item.num_objs
1299 278% 4911 ± 17% 126% 2940 slabinfo.xfs_efd_item.active_objs
1904 ± 3% 4% 1982 42% 2703 slabinfo.xfs_inode.num_objs
1904 ± 3% 4% 1982 39% 2644 slabinfo.xfs_inode.active_objs
1659 113% 3538 ± 27% 1360% 24227 slabinfo.xfs_log_ticket.active_objs
1659 116% 3588 ± 27% 1369% 24383 slabinfo.xfs_log_ticket.num_objs
37 169% 99 ± 29% 1405% 557 slabinfo.xfs_log_ticket.active_slabs
37 169% 99 ± 29% 1405% 557 slabinfo.xfs_log_ticket.num_slabs
2615 84% 4821 ± 28% 1549% 43132 slabinfo.xfs_trans.active_objs
2615 86% 4860 ± 28% 1551% 43171 slabinfo.xfs_trans.num_objs
37 162% 97 ± 30% 1614% 634 slabinfo.xfs_trans.active_slabs
37 162% 97 ± 30% 1614% 634 slabinfo.xfs_trans.num_slabs
3255 ± 12% 9210% 303094 38966% 1271810 sched_debug.cfs_rq:/.min_vruntime.avg
8273 ± 10% 382% 39836 ± 17% 309% 33806 sched_debug.cfs_rq:/.load.avg
716 ± 34% 28783% 206899 1e+05% 1034000 sched_debug.cfs_rq:/.min_vruntime.min
1830 ± 5% 4365% 81731 10579% 195502 sched_debug.cfs_rq:/.min_vruntime.stddev
1845 ± 4% 4330% 81754 10503% 195683 sched_debug.cfs_rq:/.spread0.stddev
73578 ± 34% 1043% 841209 ± 34% 452% 405848 sched_debug.cfs_rq:/.load.max
12.67 ± 35% 3999% 519.25 1979% 263.33 sched_debug.cfs_rq:/.runnable_load_avg.max
2.34 ± 33% 4268% 102.01 1854% 45.63 sched_debug.cfs_rq:/.runnable_load_avg.stddev
10284 ± 12% 4107% 432665 ± 7% 15350% 1588973 sched_debug.cfs_rq:/.min_vruntime.max
1.05 ± 20% 2335% 25.54 1631% 18.15 sched_debug.cfs_rq:/.runnable_load_avg.avg
44.06 ± 28% 254% 155.90 ± 16% 310% 180.49 sched_debug.cfs_rq:/.util_avg.stddev
15448 ± 19% 831% 143829 ± 22% 422% 80585 sched_debug.cfs_rq:/.load.stddev
597 ± 13% -39% 367 ± 17% -49% 303 sched_debug.cfs_rq:/.util_avg.min
1464 ± 23% -55% 664 ± 30% -63% 546 sched_debug.cfs_rq:/.load_avg.min
1830 ± 3% -50% 911 ± 5% -65% 642 sched_debug.cfs_rq:/.load_avg.avg
0.30 ± 13% 22% 0.36 ± 11% 86% 0.56 sched_debug.cfs_rq:/.nr_running.avg
2302 ± 11% -31% 1589 -50% 1157 sched_debug.cfs_rq:/.load_avg.max
819 ± 3% 36% 1116 15% 940 sched_debug.cfs_rq:/.util_avg.max
728 -14% 630 -9% 664 sched_debug.cfs_rq:/.util_avg.avg
73578 ± 34% 1043% 841209 ± 34% 452% 405848 sched_debug.cpu.load.max
1.81 ± 11% 77% 3.22 395% 8.98 sched_debug.cpu.clock.stddev
1.81 ± 11% 77% 3.22 395% 8.98 sched_debug.cpu.clock_task.stddev
8278 ± 10% 379% 39671 ± 18% 305% 33517 sched_debug.cpu.load.avg
3600 385% 17452 1023% 40419 sched_debug.cpu.nr_load_updates.min
5446 305% 22069 754% 46492 sched_debug.cpu.nr_load_updates.avg
8627 ± 5% 217% 27314 517% 53222 sched_debug.cpu.nr_load_updates.max
6221 ± 3% 2137% 139191 3486% 223092 sched_debug.cpu.nr_switches.max
15.67 ± 40% 3187% 515.00 1579% 263.00 sched_debug.cpu.cpu_load[0].max
2.55 ± 33% 3886% 101.45 1697% 45.73 sched_debug.cpu.cpu_load[0].stddev
15452 ± 19% 831% 143937 ± 22% 421% 80431 sched_debug.cpu.load.stddev
1144 236% 3839 329% 4911 sched_debug.cpu.nr_load_updates.stddev
23.67 ± 41% 709% 191.50 ± 6% 637% 174.33 sched_debug.cpu.nr_uninterruptible.max
978 7241% 71831 ± 3% 13746% 135493 sched_debug.cpu.nr_switches.avg
0.96 ± 19% 2503% 24.95 1720% 17.44 sched_debug.cpu.cpu_load[0].avg
957 ± 4% 3406% 33568 3626% 35679 sched_debug.cpu.nr_switches.stddev
29644 ± 16% 107% 61350 ± 8% 190% 86111 sched_debug.cpu.clock.max
29644 ± 16% 107% 61350 ± 8% 190% 86111 sched_debug.cpu.clock_task.max
29640 ± 16% 107% 61344 ± 8% 190% 86096 sched_debug.cpu.clock.avg
29640 ± 16% 107% 61344 ± 8% 190% 86096 sched_debug.cpu.clock_task.avg
29635 ± 16% 107% 61338 ± 8% 190% 86079 sched_debug.cpu.clock.min
29635 ± 16% 107% 61338 ± 8% 190% 86079 sched_debug.cpu.clock_task.min
335 ± 4% 7948% 27014 22596% 76183 sched_debug.cpu.nr_switches.min
1.62 ± 32% 1784% 30.61 ± 3% 1100% 19.51 sched_debug.cpu.cpu_load[4].avg
5.46 ± 15% 2325% 132.40 1031% 61.73 sched_debug.cpu.nr_uninterruptible.stddev
424 ± 11% 106% 875 ± 13% 263% 1541 sched_debug.cpu.curr->pid.avg
1400 166% 3721 264% 5100 sched_debug.cpu.curr->pid.max
610 ± 3% 108% 1269 126% 1380 sched_debug.cpu.curr->pid.stddev
0.43 ± 15% 4% 0.45 ± 16% 48% 0.64 sched_debug.cpu.nr_running.avg
253789 ± 13% -5% 241499 ± 3% -22% 198383 sched_debug.cpu.avg_idle.stddev
29638 ± 16% 107% 61339 ± 8% 190% 86079 sched_debug.cpu_clk
28529 ± 17% 111% 60238 ± 8% 198% 84957 sched_debug.ktime
0.17 -74% 0.04 ± 8% -83% 0.03 sched_debug.rt_rq:/.rt_time.avg
0.85 ± 3% -74% 0.22 ± 8% -83% 0.14 sched_debug.rt_rq:/.rt_time.stddev
5.14 ± 10% -75% 1.28 ± 6% -83% 0.88 sched_debug.rt_rq:/.rt_time.max
29638 ± 16% 107% 61339 ± 8% 190% 86079 sched_debug.sched_clk
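For readers recomputing the comparison columns above offline: the %change value is the relative delta of each metric's mean against the base commit, and ± %stddev is the relative standard deviation across runs. A minimal sketch of that arithmetic (not part of lkp-tests; the helper names are illustrative only):

```python
import statistics

def pct_change(base_samples, new_samples):
    """Relative change of the mean vs. the base commit (the %change column)."""
    base = statistics.mean(base_samples)
    new = statistics.mean(new_samples)
    return (new - base) / base * 100.0

def rel_stddev(samples):
    """Relative standard deviation across runs (the +- %stddev column)."""
    if len(samples) < 2:
        return 0.0
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0

# Headline aim7.jobs-per-min numbers from the summary at the top of this report:
print(round(pct_change([486586], [420342]), 1))  # -13.6
```

With only a single sample per commit shown here, the %stddev column degenerates to 0%; lkp averages multiple runs, which is where the ± N% annotations come from.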
aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
473257 468546 5% 497512 aim7.jobs-per-min
613996 11% 681283 -7% 571701 aim7.time.involuntary_context_switches
4914 4977 -6% 4634 aim7.time.system_time
114.83 115.98 -5% 109.23 aim7.time.elapsed_time
114.83 115.98 -5% 109.23 aim7.time.elapsed_time.max
60711 ± 8% 20% 73007 -9% 55449 aim7.time.voluntary_context_switches
2509 -6% 2360 -4% 2416 aim7.time.maximum_resident_set_size
362268 19% 430263 -8% 332046 softirqs.RCU
352 ± 7% -32% 238 -35% 230 vmstat.procs.r
5 ± 16% 80% 9 -40% 3 vmstat.procs.b
9584 7% 10255 -10% 8585 vmstat.system.cs
20442 ± 5% 38% 28201 -40% 12270 cpuidle.C1-IVT.usage
3.95 -3% 3.81 9% 4.29 turbostat.CPU%c1
0.81 ± 14% 44% 1.17 28% 1.04 turbostat.Pkg%pc6
19711 ± 5% -7% 18413 -17% 16384 meminfo.AnonHugePages
3974485 3977216 27% 5046310 meminfo.DirectMap2M
139742 ± 4% 137012 -17% 116493 meminfo.DirectMap4k
244933 ± 4% -7% 228626 15% 280670 meminfo.PageTables
12.47 ± 39% 84% 22.89 64% 20.46 perf-profile.func.cycles-pp.poll_idle
57.44 ± 6% -10% 51.55 -13% 50.13 perf-profile.func.cycles-pp.intel_idle
0.20 3% 0.20 -5% 0.19 perf-stat.branch-miss-rate
5.356e+08 4% 5.552e+08 -6% 5.046e+08 perf-stat.branch-misses
1113549 7% 1187535 -15% 951607 perf-stat.context-switches
1.48e+13 1.491e+13 -6% 1.397e+13 perf-stat.cpu-cycles
101697 ± 3% 9% 111167 -3% 98319 perf-stat.cpu-migrations
0.69 ± 20% -17% 0.57 139% 1.65 perf-stat.dTLB-load-miss-rate
3.264e+09 ± 19% -17% 2.712e+09 148% 8.084e+09 perf-stat.dTLB-load-misses
4.695e+11 4.718e+11 4.818e+11 perf-stat.dTLB-loads
3.276e+11 ± 3% 3.303e+11 8% 3.528e+11 perf-stat.dTLB-stores
56.47 ± 19% 41% 79.48 -58% 23.96 perf-stat.iTLB-load-miss-rate
48864487 ± 4% 7% 52183944 -12% 43166037 perf-stat.iTLB-load-misses
40455495 ± 41% -67% 13468883 239% 1.37e+08 perf-stat.iTLB-loads
29278 ± 4% -6% 27480 12% 32844 perf-stat.instructions-per-iTLB-miss
0.10 0.10 5% 0.10 perf-stat.ipc
47.16 46.36 46.51 perf-stat.node-store-miss-rate
6568 ± 44% -59% 2721 -71% 1916 numa-meminfo.node0.Shmem
194395 7% 207086 15% 224164 numa-meminfo.node0.Active
10218 ± 24% -37% 6471 -36% 6494 numa-meminfo.node0.Mapped
7496 ± 34% -97% 204 37% 10278 numa-meminfo.node0.AnonHugePages
178888 6% 188799 16% 208213 numa-meminfo.node0.AnonPages
179468 6% 191062 17% 209704 numa-meminfo.node0.Active(anon)
256890 -15% 219489 -15% 219503 numa-meminfo.node1.Active
12213 ± 24% 49% 18208 -50% 6105 numa-meminfo.node1.AnonHugePages
45080 ± 23% -33% 30138 87% 84468 numa-meminfo.node1.PageTables
241623 -15% 204604 -16% 203913 numa-meminfo.node1.Active(anon)
240637 -15% 204491 -15% 203847 numa-meminfo.node1.AnonPages
23782392 ±139% 673% 1.838e+08 -100% 0 latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
61157 ± 4% -6% 57187 14% 69751 proc-vmstat.nr_page_table_pages
1641 ± 44% -59% 679 -71% 478 numa-vmstat.node0.nr_shmem
2655 ± 23% -35% 1715 -35% 1726 numa-vmstat.node0.nr_mapped
44867 5% 47231 16% 52261 numa-vmstat.node0.nr_anon_pages
45014 6% 47793 17% 52636 numa-vmstat.node0.nr_zone_active_anon
45014 6% 47793 17% 52636 numa-vmstat.node0.nr_active_anon
11300 ± 23% -33% 7542 88% 21209 numa-vmstat.node1.nr_page_table_pages
60581 -16% 51156 -15% 51193 numa-vmstat.node1.nr_zone_active_anon
60581 -16% 51156 -15% 51193 numa-vmstat.node1.nr_active_anon
60328 -15% 51127 -15% 51174 numa-vmstat.node1.nr_anon_pages
13671 13608 11% 15190 slabinfo.cred_jar.active_objs
13707 13608 11% 15231 slabinfo.cred_jar.num_objs
24109 24386 -11% 21574 slabinfo.kmalloc-16.active_objs
24109 24386 -11% 21574 slabinfo.kmalloc-16.num_objs
13709 ± 6% 13391 -15% 11600 slabinfo.kmalloc-512.active_objs
13808 ± 6% 13454 -16% 11657 slabinfo.kmalloc-512.num_objs
1456658 4% 1511260 15% 1675984 sched_debug.cfs_rq:/.min_vruntime.min
441613 ± 3% -28% 316751 -76% 105734 sched_debug.cfs_rq:/.min_vruntime.stddev
443999 ± 3% -28% 318033 -76% 106909 sched_debug.cfs_rq:/.spread0.stddev
2657974 2625551 -19% 2158111 sched_debug.cfs_rq:/.min_vruntime.max
0.22 ± 23% 96% 0.43 109% 0.46 sched_debug.cfs_rq:/.nr_spread_over.stddev
1.50 100% 3.00 133% 3.50 sched_debug.cfs_rq:/.nr_spread_over.max
111.95 ± 26% 15% 128.92 128% 254.81 sched_debug.cfs_rq:/.exec_clock.stddev
802 3% 829 -16% 671 sched_debug.cfs_rq:/.load_avg.min
874 879 -11% 780 sched_debug.cfs_rq:/.load_avg.avg
1256 ± 17% -20% 1011 -24% 957 sched_debug.cfs_rq:/.load_avg.max
1.33 ± 35% -100% 0.00 200% 4.00 sched_debug.cpu.cpu_load[4].min
4.56 ± 6% -11% 4.07 -27% 3.33 sched_debug.cpu.cpu_load[4].stddev
4.76 ± 3% -13% 4.14 -30% 3.35 sched_debug.cpu.cpu_load[3].stddev
25.17 ± 12% -26% 18.50 -21% 20.00 sched_debug.cpu.cpu_load[3].max
25.67 ± 9% -32% 17.50 -24% 19.50 sched_debug.cpu.cpu_load[0].max
4.67 ± 3% -17% 3.90 -22% 3.62 sched_debug.cpu.cpu_load[0].stddev
4.88 -15% 4.14 -31% 3.39 sched_debug.cpu.cpu_load[2].stddev
26.17 ± 10% -29% 18.50 -25% 19.50 sched_debug.cpu.cpu_load[2].max
7265 4% 7556 -12% 6419 sched_debug.cpu.nr_switches.avg
9.41 ± 10% 9.67 21% 11.38 sched_debug.cpu.cpu_load[1].avg
9.03 ± 12% 3% 9.32 23% 11.09 sched_debug.cpu.cpu_load[0].avg
4140 ± 4% -11% 3698 -11% 3703 sched_debug.cpu.nr_switches.stddev
9.41 ± 10% 3% 9.71 22% 11.49 sched_debug.cpu.cpu_load[3].avg
4690 4821 -9% 4257 sched_debug.cpu.nr_switches.min
9.39 ± 9% 3% 9.69 23% 11.52 sched_debug.cpu.cpu_load[4].avg
9.43 ± 10% 9.71 21% 11.44 sched_debug.cpu.cpu_load[2].avg
57.92 ± 18% -4% 55.55 -23% 44.50 sched_debug.cpu.nr_uninterruptible.stddev
3002 ± 3% 10% 3288 31% 3919 sched_debug.cpu.curr->pid.avg
6666 6652 -10% 6025 sched_debug.cpu.curr->pid.max
1379 1361 -19% 1118 sched_debug.cpu.ttwu_local.avg
1849 ± 3% -12% 1628 -18% 1517 sched_debug.cpu.ttwu_local.stddev
1679 ± 8% 4% 1738 -15% 1423 sched_debug.cpu.curr->pid.stddev
1.58 ± 33% -11% 1.41 65% 2.60 sched_debug.cpu.nr_running.avg
1767 6% 1880 -16% 1489 sched_debug.cpu.ttwu_count.avg
506 ± 6% -15% 430 -17% 419 sched_debug.cpu.ttwu_count.min
7139 8% 7745 -11% 6355 sched_debug.cpu.sched_count.avg
4355 6% 4625 -11% 3884 sched_debug.cpu.sched_count.min
4.91 ± 3% -16% 4.13 -28% 3.52 sched_debug.cpu.cpu_load[1].stddev
26.67 ± 9% -29% 19.00 -27% 19.50 sched_debug.cpu.cpu_load[1].max
209 ± 8% 19% 247 -15% 178 sched_debug.cpu.sched_goidle.avg
5.67 ± 27% -12% 5.00 50% 8.50 sched_debug.cpu.nr_running.max
36072 ± 7% 70% 61152 17% 42236 sched_debug.cpu.sched_count.max
2008 -8% 1847 -18% 1645 sched_debug.cpu.ttwu_count.stddev
0.07 ± 19% -20% 0.06 186% 0.21 sched_debug.rt_rq:/.rt_time.avg
0.36 ± 17% -23% 0.28 142% 0.88 sched_debug.rt_rq:/.rt_time.stddev
2.33 ± 15% -27% 1.70 87% 4.35 sched_debug.rt_rq:/.rt_time.max
aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
99091700659f4df9 ca2edab2e1d8f30dda874b7f71 fe9c2c81ed073878768785a985
---------------- -------------------------- --------------------------
270459 272267 ± 3% -48% 139834 ± 3% aim7.jobs-per-min
21229 ± 5% 20896 ± 3% 449% 116516 ± 6% aim7.time.involuntary_context_switches
1461 ± 5% 1454 ± 5% 174% 3998 ± 3% aim7.time.system_time
155368 153149 149% 386164 aim7.time.minor_page_faults
66.84 66.41 ± 3% 93% 129.07 ± 3% aim7.time.elapsed_time
66.84 66.41 ± 3% 93% 129.07 ± 3% aim7.time.elapsed_time.max
328369 3% 339077 96% 644393 aim7.time.voluntary_context_switches
49489 ± 11% -8% 45459 39% 68941 ± 4% interrupts.CAL:Function_call_interrupts
96.62 ± 7% 97.09 61% 155.12 uptime.boot
186640 ± 10% 186707 127% 424522 ± 4% softirqs.RCU
146596 147043 37% 201373 softirqs.SCHED
1005660 ± 3% 991053 ± 4% 118% 2196513 softirqs.TIMER
66.84 66.41 ± 3% 93% 129.07 ± 3% time.elapsed_time
66.84 66.41 ± 3% 93% 129.07 ± 3% time.elapsed_time.max
21229 ± 5% 20896 ± 3% 449% 116516 ± 6% time.involuntary_context_switches
155368 153149 149% 386164 time.minor_page_faults
2212 2215 41% 3112 time.percent_of_cpu_this_job_got
1461 ± 5% 1454 ± 5% 174% 3998 ± 3% time.system_time
328369 3% 339077 96% 644393 time.voluntary_context_switches
1197810 ± 16% -67% 393936 ± 40% -56% 530668 ± 43% numa-numastat.node0.numa_miss
1196269 ± 16% -68% 387751 ± 40% -55% 533013 ± 42% numa-numastat.node1.numa_foreign
112 159% 292 ± 4% 146% 277 vmstat.memory.buff
16422228 16461619 -28% 11832310 vmstat.memory.free
22 -3% 22 87% 42 ± 3% vmstat.procs.r
48853 48768 50273 vmstat.system.in
125202 8% 135626 51% 189515 ± 4% cpuidle.C1-IVT.usage
28088338 ± 3% 11% 31082173 17% 32997314 ± 5% cpuidle.C1-IVT.time
3471814 27% 4422338 ± 15% 2877% 1.034e+08 ± 3% cpuidle.C1E-IVT.time
33353 8% 36128 703% 267725 cpuidle.C1E-IVT.usage
11371800 9% 12381174 244% 39113028 cpuidle.C3-IVT.time
64048 5% 67490 62% 103940 cpuidle.C3-IVT.usage
1.637e+09 1.631e+09 20% 1.959e+09 cpuidle.C6-IVT.time
1861259 4% 1931551 19% 2223599 cpuidle.C6-IVT.usage
230 ± 9% 42% 326 1631% 3986 cpuidle.POLL.usage
1724995 ± 41% 54% 2656939 ± 10% 112% 3662791 cpuidle.POLL.time
48.48 48.15 36% 65.85 turbostat.%Busy
1439 1431 36% 1964 turbostat.Avg_MHz
33.28 33.45 -25% 24.85 turbostat.CPU%c1
18.09 ± 3% 18.24 ± 4% -49% 9.16 turbostat.CPU%c6
134 133 8% 144 turbostat.CorWatt
5.39 ± 17% 4% 5.63 ± 8% -34% 3.54 turbostat.Pkg%pc2
2.97 ± 44% -17% 2.48 ± 32% -70% 0.91 ± 22% turbostat.Pkg%pc6
167 167 6% 177 turbostat.PkgWatt
10.27 10.43 -14% 8.79 turbostat.RAMWatt
44376005 -100% 205734 -100% 214640 meminfo.Active
44199835 -100% 30412 -100% 30241 meminfo.Active(file)
103029 ± 3% 27% 130507 ± 6% 29% 133114 ± 8% meminfo.CmaFree
124701 ± 4% 123685 ± 14% 16% 144180 ± 3% meminfo.DirectMap4k
7886 ± 4% 7993 ± 5% 144% 19231 ± 7% meminfo.Dirty
2472446 1791% 46747572 1976% 51320420 meminfo.Inactive
2463353 1797% 46738477 1983% 51311261 meminfo.Inactive(file)
16631615 16664565 -28% 11936074 meminfo.MemFree
4.125e+11 -5% 3.927e+11 103% 8.36e+11 perf-stat.branch-instructions
0.41 -20% 0.33 -43% 0.23 perf-stat.branch-miss-rate
1.671e+09 -23% 1.28e+09 16% 1.946e+09 perf-stat.branch-misses
7.138e+09 -3% 6.917e+09 23% 8.746e+09 perf-stat.cache-misses
2.036e+10 -4% 1.956e+10 22% 2.476e+10 perf-stat.cache-references
821470 4% 851532 88% 1548125 ± 3% perf-stat.context-switches
4.93e+12 ± 3% -4% 4.755e+12 ± 4% 154% 1.25e+13 perf-stat.cpu-cycles
125073 4% 129993 167% 333599 perf-stat.cpu-migrations
3.595e+09 ± 16% -19% 2.895e+09 ± 17% 39% 4.987e+09 ± 10% perf-stat.dTLB-load-misses
6.411e+11 6.339e+11 ± 3% 57% 1.004e+12 perf-stat.dTLB-loads
0.06 ± 3% -42% 0.04 87% 0.12 ± 3% perf-stat.dTLB-store-miss-rate
2.738e+08 -39% 1.675e+08 64% 4.502e+08 ± 5% perf-stat.dTLB-store-misses
4.321e+11 5% 4.552e+11 -12% 3.81e+11 ± 8% perf-stat.dTLB-stores
2.343e+12 -5% 2.229e+12 67% 3.918e+12 perf-stat.instructions
46162 ± 41% 46733 ± 3% 55% 71500 perf-stat.instructions-per-iTLB-miss
0.48 ± 4% 0.47 ± 5% -34% 0.31 perf-stat.ipc
325877 322934 115% 699924 perf-stat.minor-faults
42.88 3% 44.33 43.65 perf-stat.node-load-miss-rate
9.499e+08 9.578e+08 66% 1.581e+09 perf-stat.node-load-misses
1.266e+09 -5% 1.203e+09 61% 2.04e+09 perf-stat.node-loads
39.17 40.00 8% 42.12 perf-stat.node-store-miss-rate
3.198e+09 4% 3.318e+09 36% 4.344e+09 perf-stat.node-store-misses
4.966e+09 4.977e+09 20% 5.968e+09 perf-stat.node-stores
325852 322963 115% 699918 perf-stat.page-faults
21719324 -100% 15215 ± 3% -100% 14631 numa-meminfo.node0.Active(file)
1221037 1806% 23278263 1969% 25269114 numa-meminfo.node0.Inactive(file)
1223564 1803% 23286857 1965% 25269597 numa-meminfo.node0.Inactive
21811771 -100% 102448 -100% 104424 numa-meminfo.node0.Active
2971 ± 13% -8% 2734 ± 3% 157% 7626 ± 4% numa-meminfo.node0.Dirty
8476780 8356206 -27% 6162743 numa-meminfo.node0.MemFree
617361 611434 11% 687829 numa-meminfo.node0.SReclaimable
1249068 1779% 23471025 1985% 26046948 numa-meminfo.node1.Inactive
1242501 1789% 23470523 1996% 26038272 numa-meminfo.node1.Inactive(file)
22500867 -100% 15202 ± 4% -100% 15613 numa-meminfo.node1.Active(file)
22584509 -100% 103192 ± 6% -100% 109976 numa-meminfo.node1.Active
4814 ± 13% 4957 ± 5% 135% 11335 numa-meminfo.node1.Dirty
8132889 8297084 ± 3% -29% 5777419 ± 3% numa-meminfo.node1.MemFree
83641 ± 7% 5% 87990 ± 7% 13% 94363 numa-meminfo.node1.Active(anon)
82877 ± 7% 4% 86528 ± 6% 13% 93620 numa-meminfo.node1.AnonPages
0 0 842360 ±100% latency_stats.avg.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
113 ±173% 232% 376 ±100% 2e+05% 203269 ± 4% latency_stats.hits.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5245 ± 14% 5325 ± 3% 535% 33286 ± 23% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1133 ±173% 113% 2416 ±100% 1351% 16434 latency_stats.max.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
0 0 842360 ±100% latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
7813 ± 13% -33% 5197 ± 9% 403% 39305 ± 18% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
5271 ± 13% -3% 5091 ± 5% 288% 20467 latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
10369 ± 17% -41% 6086 ± 21% -96% 385 ±100% latency_stats.max.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
94417 ±173% 556% 619712 ±100% 3e+05% 3.061e+08 ± 5% latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
22126648 ± 4% 22776886 1311% 3.123e+08 ± 7% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink_remove.xfs_ifree.xfs_inactive_ifree.xfs_inactive.xfs_fs_destroy_inode
2536 ±117% -98% 48 ± 43% 2059% 54765 ±100% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1702264 ± 3% 5% 1790192 509% 10359205 ± 6% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_ialloc_read_agi.xfs_dialloc.xfs_ialloc.xfs_dir_ialloc.xfs_create
1180839 ± 3% 5% 1238547 453% 6527115 ± 5% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agi.xfs_iunlink.xfs_droplink.xfs_remove.xfs_vn_unlink.vfs_unlink
467 ±173% 680% 3644 ± 4% 7e+05% 3196407 ± 3% latency_stats.sum.xfs_iget.xfs_ialloc.xfs_dir_ialloc.xfs_create.xfs_generic_create.xfs_vn_mknod.xfs_vn_create.path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
0 0 842360 ±100% latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.do_syscall_64.return_from_SYSCALL_64
159018 ± 43% -49% 81514 ± 19% -99% 999 ±100% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
1084 ± 5% 21% 1313 ± 3% 86% 2018 proc-vmstat.kswapd_high_wmark_hit_quickly
1817 ± 3% 38% 2511 ± 3% 175% 4989 proc-vmstat.kswapd_low_wmark_hit_quickly
11055004 -100% 7603 -100% 7559 proc-vmstat.nr_active_file
1993 2013 ± 4% 128% 4553 ± 5% proc-vmstat.nr_dirty
25746 ± 3% 26% 32494 ± 6% 29% 33319 ± 8% proc-vmstat.nr_free_cma
4152484 4162399 -28% 2984494 proc-vmstat.nr_free_pages
615907 1798% 11688190 1983% 12827366 proc-vmstat.nr_inactive_file
11055042 -100% 7603 -100% 7559 proc-vmstat.nr_zone_active_file
615904 1798% 11688234 1983% 12827434 proc-vmstat.nr_zone_inactive_file
2016 ± 3% 2025 ± 4% 127% 4582 ± 4% proc-vmstat.nr_zone_write_pending
2912 ± 3% 32% 3834 ± 3% 141% 7009 proc-vmstat.pageoutrun
5380414 -100% 2502 ± 3% -100% 2602 ± 3% proc-vmstat.pgactivate
61925072 -100% 0 -100% 0 proc-vmstat.pgdeactivate
348105 343315 108% 723517 proc-vmstat.pgfault
61932469 -100% 0 -100% 0 proc-vmstat.pgrefill
5432311 -100% 3802 ± 3% -100% 3657 numa-vmstat.node0.nr_zone_active_file
5432276 -100% 3802 ± 3% -100% 3657 numa-vmstat.node0.nr_active_file
305236 1802% 5806215 1969% 6314975 numa-vmstat.node0.nr_zone_inactive_file
305239 1802% 5806170 1969% 6314910 numa-vmstat.node0.nr_inactive_file
748 ± 7% -20% 597 ± 10% 114% 1602 numa-vmstat.node0.nr_dirty
775 ± 7% -21% 610 ± 12% 112% 1642 numa-vmstat.node0.nr_zone_write_pending
2116796 2102494 ± 3% -27% 1543100 numa-vmstat.node0.nr_free_pages
154392 152538 11% 171898 numa-vmstat.node0.nr_slab_reclaimable
310642 1784% 5853811 1995% 6507801 numa-vmstat.node1.nr_zone_inactive_file
310642 1784% 5853814 1995% 6507801 numa-vmstat.node1.nr_inactive_file
5627293 -100% 3799 ± 4% -100% 3903 numa-vmstat.node1.nr_zone_active_file
5627293 -100% 3799 ± 4% -100% 3903 numa-vmstat.node1.nr_active_file
1206 ± 16% 14% 1373 129% 2758 ± 10% numa-vmstat.node1.nr_zone_write_pending
1205 ± 16% 14% 1373 129% 2757 ± 10% numa-vmstat.node1.nr_dirty
2031121 2088592 ± 3% -29% 1446172 ± 3% numa-vmstat.node1.nr_free_pages
25743 ± 3% 27% 32608 ± 7% 30% 33415 ± 8% numa-vmstat.node1.nr_free_cma
20877 ± 7% 6% 22077 ± 6% 13% 23620 numa-vmstat.node1.nr_zone_active_anon
20877 ± 7% 6% 22077 ± 6% 13% 23620 numa-vmstat.node1.nr_active_anon
20684 ± 7% 5% 21709 ± 6% 13% 23431 numa-vmstat.node1.nr_anon_pages
4687 4704 11% 5205 ± 3% slabinfo.kmalloc-128.num_objs
4687 4704 11% 5205 ± 3% slabinfo.kmalloc-128.active_objs
1401 -19% 1142 8% 1516 ± 6% slabinfo.xfs_efd_item.num_objs
1401 -19% 1142 8% 1516 ± 6% slabinfo.xfs_efd_item.active_objs
1725 ± 5% -8% 1589 -12% 1518 slabinfo.xfs_inode.num_objs
1725 ± 5% -8% 1589 -12% 1518 slabinfo.xfs_inode.active_objs
382810 ± 4% 383813 ± 3% 301% 1535378 sched_debug.cfs_rq:/.min_vruntime.avg
249011 ± 6% 245840 ± 3% 420% 1294704 sched_debug.cfs_rq:/.min_vruntime.min
105216 106278 79% 188096 sched_debug.cfs_rq:/.min_vruntime.stddev
105260 106358 79% 188314 sched_debug.cfs_rq:/.spread0.stddev
9414 ± 4% 9361 ± 4% 230% 31092 sched_debug.cfs_rq:/.exec_clock.min
541056 ± 9% 540188 ± 3% 236% 1820030 sched_debug.cfs_rq:/.min_vruntime.max
150.87 ± 11% -21% 119.80 ± 10% 34% 202.73 ± 7% sched_debug.cfs_rq:/.util_avg.stddev
13783 13656 170% 37192 sched_debug.cfs_rq:/.exec_clock.avg
17625 17508 141% 42564 sched_debug.cfs_rq:/.exec_clock.max
3410.74 ± 3% 3458.30 38% 4706.14 sched_debug.cfs_rq:/.exec_clock.stddev
732 ± 11% 11% 809 ± 3% -34% 480 ± 7% sched_debug.cfs_rq:/.load_avg.min
844 ± 8% 7% 901 -33% 569 ± 4% sched_debug.cfs_rq:/.load_avg.avg
0.41 ± 7% 11% 0.46 ± 11% 21% 0.50 ± 5% sched_debug.cfs_rq:/.nr_running.avg
1339 ± 5% 1338 -32% 909 sched_debug.cfs_rq:/.load_avg.max
0.53 ± 4% -4% 0.51 32% 0.70 sched_debug.cfs_rq:/.nr_spread_over.avg
0.50 0.50 33% 0.67 sched_debug.cfs_rq:/.nr_spread_over.min
355.00 ± 26% -67% 118.75 ± 4% -82% 64.83 ± 20% sched_debug.cpu.cpu_load[4].max
18042 17697 135% 42380 sched_debug.cpu.nr_load_updates.min
51.83 ± 22% -66% 17.44 -78% 11.18 ± 5% sched_debug.cpu.cpu_load[4].stddev
22708 22546 111% 47986 sched_debug.cpu.nr_load_updates.avg
29633 ± 7% -7% 27554 83% 54243 sched_debug.cpu.nr_load_updates.max
48.83 ± 29% -65% 16.91 ± 29% -73% 13.34 ± 13% sched_debug.cpu.cpu_load[3].stddev
329.25 ± 34% -65% 113.75 ± 30% -76% 79.67 ± 28% sched_debug.cpu.cpu_load[3].max
17106 14% 19541 ± 19% 34% 22978 ± 6% sched_debug.cpu.nr_switches.max
1168 ± 4% -3% 1131 ± 4% 144% 2846 ± 21% sched_debug.cpu.ttwu_local.max
3826 ± 3% 3766 17% 4487 sched_debug.cpu.nr_load_updates.stddev
19.73 ± 12% -4% 18.86 ± 14% 59% 31.42 ± 8% sched_debug.cpu.nr_uninterruptible.avg
149.75 ± 8% 150.00 ± 11% 42% 212.50 sched_debug.cpu.nr_uninterruptible.max
98147 ± 34% 97985 ± 42% 59% 156085 ± 8% sched_debug.cpu.avg_idle.min
8554 ± 3% 4% 8896 ± 5% 62% 13822 sched_debug.cpu.nr_switches.avg
2582 ± 3% 11% 2857 ± 11% 19% 3083 ± 3% sched_debug.cpu.nr_switches.stddev
60029 ± 9% 60817 ± 7% 44% 86205 sched_debug.cpu.clock.max
60029 ± 9% 60817 ± 7% 44% 86205 sched_debug.cpu.clock_task.max
60020 ± 9% 60807 ± 7% 44% 86188 sched_debug.cpu.clock.avg
60020 ± 9% 60807 ± 7% 44% 86188 sched_debug.cpu.clock_task.avg
60008 ± 9% 60793 ± 7% 44% 86169 sched_debug.cpu.clock.min
60008 ± 9% 60793 ± 7% 44% 86169 sched_debug.cpu.clock_task.min
18.36 ± 7% -37% 11.60 ± 5% -33% 12.21 sched_debug.cpu.cpu_load[3].avg
5577 ± 6% 3% 5772 ± 6% 81% 10121 sched_debug.cpu.nr_switches.min
19.14 ± 3% -36% 12.24 -36% 12.33 sched_debug.cpu.cpu_load[4].avg
17.21 ± 14% -31% 11.90 ± 18% -27% 12.56 ± 6% sched_debug.cpu.cpu_load[2].avg
83.49 ± 7% 5% 87.64 ± 3% 17% 97.56 ± 4% sched_debug.cpu.nr_uninterruptible.stddev
3729 3735 18% 4409 ± 13% sched_debug.cpu.curr->pid.max
374 ± 9% -4% 360 ± 9% 157% 962 sched_debug.cpu.ttwu_local.min
665 671 122% 1479 sched_debug.cpu.ttwu_local.avg
196 ± 7% 5% 207 ± 8% 88% 369 ± 14% sched_debug.cpu.ttwu_local.stddev
1196 ± 4% 5% 1261 ± 6% 11% 1333 ± 10% sched_debug.cpu.curr->pid.stddev
0.45 ± 7% 17% 0.53 ± 16% 29% 0.58 ± 16% sched_debug.cpu.nr_running.avg
6738 ± 16% 8% 7296 ± 20% 52% 10236 sched_debug.cpu.ttwu_count.max
3952 ± 4% 5% 4150 ± 5% 75% 6917 sched_debug.cpu.ttwu_count.avg
913 22% 1117 ± 18% 42% 1302 ± 3% sched_debug.cpu.sched_goidle.stddev
2546 ± 4% 4% 2653 ± 7% 89% 4816 sched_debug.cpu.ttwu_count.min
5301 ± 6% 36% 7190 ± 33% 61% 8513 ± 8% sched_debug.cpu.sched_goidle.max
4683 ± 16% 14% 5355 ± 25% 52% 7125 sched_debug.cpu.sched_count.stddev
8262 ± 3% 6% 8746 ± 7% 68% 13912 sched_debug.cpu.sched_count.avg
5139 ± 5% 4% 5362 ± 6% 90% 9773 sched_debug.cpu.sched_count.min
2088 ± 6% 7% 2229 ± 5% 55% 3242 sched_debug.cpu.sched_goidle.min
3258 ± 4% 6% 3445 ± 6% 44% 4706 sched_debug.cpu.sched_goidle.avg
37088 ± 17% 12% 41540 ± 23% 60% 59447 sched_debug.cpu.sched_count.max
1007 ± 7% 13% 1139 ± 14% 38% 1386 ± 3% sched_debug.cpu.ttwu_count.stddev
262591 ± 4% -3% 253748 ± 4% -11% 232974 sched_debug.cpu.avg_idle.stddev
60009 ± 9% 60795 ± 7% 44% 86169 sched_debug.cpu_clk
58763 ± 9% 59673 ± 7% 45% 85068 sched_debug.ktime
60009 ± 9% 60795 ± 7% 44% 86169 sched_debug.sched_clk
aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
99091700659f4df9 fe9c2c81ed073878768785a985
---------------- --------------------------
69789 5% 73162 aim7.jobs-per-min
81603 -7% 75897 ± 5% aim7.time.involuntary_context_switches
3825 -6% 3583 aim7.time.system_time
129.08 -5% 123.16 aim7.time.elapsed_time
129.08 -5% 123.16 aim7.time.elapsed_time.max
2536 -4% 2424 aim7.time.maximum_resident_set_size
3145 131% 7253 ± 20% numa-numastat.node1.numa_miss
3145 131% 7253 ± 20% numa-numastat.node1.numa_foreign
7059 4% 7362 vmstat.system.cs
7481848 40% 10487336 ± 8% cpuidle.C1-IVT.time
1491314 75% 2607219 ± 10% cpuidle.POLL.time
67 10% 73 ± 4% turbostat.CoreTmp
66 12% 73 ± 4% turbostat.PkgTmp
5025792 -21% 3973802 meminfo.DirectMap2M
49098 12% 54859 meminfo.PageTables
3.94 97% 7.76 ± 18% perf-profile.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
11.88 -24% 8.99 ± 14% perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
11.63 -25% 8.78 ± 13% perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
8.412e+11 -7% 7.83e+11 perf-stat.branch-instructions
0.30 0.29 perf-stat.branch-miss-rate
2.495e+09 -8% 2.292e+09 perf-stat.branch-misses
4.277e+09 -6% 4.003e+09 perf-stat.cache-misses
1.396e+10 -5% 1.327e+10 perf-stat.cache-references
1.224e+13 -8% 1.12e+13 perf-stat.cpu-cycles
0.58 -57% 0.25 ± 16% perf-stat.dTLB-load-miss-rate
5.407e+09 -60% 2.175e+09 ± 18% perf-stat.dTLB-load-misses
9.243e+11 -6% 8.708e+11 perf-stat.dTLB-loads
0.17 -58% 0.07 ± 4% perf-stat.dTLB-store-miss-rate
4.368e+08 -50% 2.177e+08 ± 3% perf-stat.dTLB-store-misses
2.549e+11 19% 3.041e+11 perf-stat.dTLB-stores
3.737e+12 -6% 3.498e+12 perf-stat.instructions
0.31 0.31 perf-stat.ipc
439716 426816 perf-stat.minor-faults
2.164e+09 -7% 2.012e+09 perf-stat.node-load-misses
2.417e+09 -7% 2.259e+09 perf-stat.node-loads
1.24e+09 -3% 1.198e+09 perf-stat.node-store-misses
1.556e+09 -4% 1.501e+09 perf-stat.node-stores
439435 426823 perf-stat.page-faults
51452 14% 58403 ± 8% numa-meminfo.node0.Active(anon)
10472 -36% 6692 ± 45% numa-meminfo.node1.Shmem
7665 74% 13316 numa-meminfo.node1.PageTables
6724 144% 16416 ± 43% latency_stats.avg.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
6724 144% 16416 ± 43% latency_stats.max.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
6724 144% 16416 ± 43% latency_stats.sum.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
12237 12% 13693 proc-vmstat.nr_page_table_pages
12824 14% 14578 ± 8% numa-vmstat.node0.nr_zone_active_anon
12824 14% 14578 ± 8% numa-vmstat.node0.nr_active_anon
2618 -36% 1672 ± 45% numa-vmstat.node1.nr_shmem
17453 24% 21726 ± 6% numa-vmstat.node1.numa_miss
1909 74% 3323 numa-vmstat.node1.nr_page_table_pages
17453 24% 21726 ± 6% numa-vmstat.node1.numa_foreign
922 24% 1143 ± 6% slabinfo.blkdev_requests.active_objs
922 24% 1143 ± 6% slabinfo.blkdev_requests.num_objs
569 21% 686 ± 11% slabinfo.file_lock_cache.active_objs
569 21% 686 ± 11% slabinfo.file_lock_cache.num_objs
9.07 16% 10.56 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.avg
18406 -14% 15835 ± 10% sched_debug.cfs_rq:/.load.stddev
0.67 150% 1.67 ± 43% sched_debug.cfs_rq:/.nr_spread_over.max
581 -11% 517 ± 4% sched_debug.cfs_rq:/.load_avg.min
659 -10% 596 ± 4% sched_debug.cfs_rq:/.load_avg.avg
784 -12% 692 ± 4% sched_debug.cfs_rq:/.load_avg.max
18086 -12% 15845 ± 9% sched_debug.cpu.load.stddev
18.72 -17% 15.49 ± 8% sched_debug.cpu.nr_uninterruptible.avg
69.33 42% 98.67 ± 7% sched_debug.cpu.nr_uninterruptible.max
317829 -12% 280218 ± 4% sched_debug.cpu.avg_idle.min
9.80 18% 11.54 ± 10% sched_debug.cpu.cpu_load[1].avg
8.91 15% 10.28 ± 9% sched_debug.cpu.cpu_load[0].avg
9.53 22% 11.64 ± 10% sched_debug.cpu.cpu_load[3].avg
7083 11% 7853 sched_debug.cpu.nr_switches.min
9.73 22% 11.90 ± 7% sched_debug.cpu.cpu_load[4].avg
9.68 20% 11.59 ± 11% sched_debug.cpu.cpu_load[2].avg
24.59 49% 36.53 ± 17% sched_debug.cpu.nr_uninterruptible.stddev
1176 12% 1319 ± 4% sched_debug.cpu.curr->pid.avg
373 35% 502 ± 6% sched_debug.cpu.ttwu_local.min
3644 13% 4120 ± 3% sched_debug.cpu.ttwu_count.min
4855 13% 5463 ± 6% sched_debug.cpu.sched_goidle.max
7019 10% 7745 sched_debug.cpu.sched_count.min
2305 10% 2529 ± 3% sched_debug.cpu.sched_goidle.min
0.00 -19% 0.00 ± 7% sched_debug.cpu.next_balance.stddev
0.68 -17% 0.57 ± 11% sched_debug.cpu.nr_running.stddev
0.05 27% 0.06 ± 14% sched_debug.rt_rq:/.rt_nr_running.stddev
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 14:50 ` Fengguang Wu
@ 2016-08-14 16:17 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-08-14 16:17 UTC (permalink / raw)
To: Fengguang Wu
Cc: Christoph Hellwig, Dave Chinner, Ye Xiaolong, Linus Torvalds,
LKML, Bob Peterson, LKP
Snipping the long contest:
I think there are three observations here:
(1) removing the mark_page_accessed (which is the only significant
change in the parent commit) hurts the
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
I'd still rather stick to the filemap version and let the
VM people sort it out. How do the numbers for this test
look for XFS vs say ext4 and btrfs?
(2) lots of additional spinlock contention in the new case. A quick
check shows that I fat-fingered my rewrite so that we do
the xfs_inode_set_eofblocks_tag call now for the pure lookup
case, and pretty much all new cycles come from that.
(3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
we're already doing way too many even without my little bug above.
So I've force pushed a new version of the iomap-fixes branch with
(2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
lot less expensive slotted in before that. Would be good to see
the numbers with that.
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 16:17 ` Christoph Hellwig
@ 2016-08-14 23:46 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-14 23:46 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Fengguang Wu, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
> Snipping the long contest:
>
> I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
> (2) lots of additional spinlock contention in the new case. A quick
> check shows that I fat-fingered my rewrite so that we do
> the xfs_inode_set_eofblocks_tag call now for the pure lookup
> case, and pretty much all new cycles come from that.
> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
> we're already doing way too many even without my little bug above.
>
> So I've force pushed a new version of the iomap-fixes branch with
> (2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
> lot less expensive slotted in before that. Would be good to see
> the numbers with that.
With this new set of fixes, the 1byte write test runs ~30% faster on
my test machine (130k writes/s vs 100k writes/s), and the 1k write
on the pmem device runs about 10% faster (660MB/s vs 590MB/s).
dbench numbers on the pmem device also go through the roof (they
didn't show any regression to begin with) - 50% faster at 16 clients
on a 16AG filesystem (5700MB/s vs 3800MB/s).
The 10Mx4k file create fsmark workload I run (on the sparse 500TB
XFS filesystem backed by a pair of SSDs) is giving the highest
throughput *and* the lowest std dev I've ever recorded
(55014.8+/-1.3e+04 files/s) and that shows in the runtime which also
drops from 3m57s to 3m22s.
So regardless of what aim7 results we get from these changes, I'll
be merging them pending review and further testing...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 16:17 ` Christoph Hellwig
@ 2016-08-14 23:57 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-14 23:57 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>Snipping the long contest:
>
>I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
We'll be able to compare between filesystems when the tests for Linus'
patch finish.
> (2) lots of additional spinlock contention in the new case. A quick
> check shows that I fat-fingered my rewrite so that we do
> the xfs_inode_set_eofblocks_tag call now for the pure lookup
> case, and pretty much all new cycles come from that.
> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
> we're already doing way to many even without my little bug above.
>
>So I've force pushed a new version of the iomap-fixes branch with
>(2) fixed, and also a little patch to xfs_inode_set_eofblocks_tag a
>lot less expensive slotted in before that. Would be good to see
>the numbers with that.
I just queued these jobs. The commented-out ones will be submitted as
the 2nd stage when the 1st-round quick tests finish.
queue=(
queue
-q vip
--repeat-to 3
fs=xfs
perf-profile.delay=1
-b hch-vfs/iomap-fixes
-k bf4dc6e4ecc2a3d042029319bc8cd4204c185610
-k 74a242ad94d13436a1644c0b4586700e39871491
-k 99091700659f4df965e138b38b4fa26a29b7eade
)
"${queue[@]}" -t ivb44 aim7-fs-1brd.yaml
"${queue[@]}" -t ivb44 fsmark-generic-1brd.yaml
"${queue[@]}" -t ivb43 fsmark-stress-journal-1brd.yaml
"${queue[@]}" -t lkp-hsx02 fsmark-generic-brd-raid.yaml
"${queue[@]}" -t lkp-hsw-ep4 fsmark-1ssd-nvme-small.yaml
#"${queue[@]}" -t ivb43 fsmark-stress-journal-1hdd.yaml
#"${queue[@]}" -t ivb44 dd-write-1hdd.yaml fsmark-generic-1hdd.yaml
Thanks,
Fengguang
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 18:03 ` Linus Torvalds
@ 2016-08-15 0:48 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 0:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Tejun Heo, Wu Fengguang, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> >>
> >> I don't recall having ever seen the mapping tree_lock as a contention
> >> point before, but it's not like I've tried that load either. So it
> >> might be a regression (going back long, I suspect), or just an unusual
> >> load that nobody has traditionally tested much.
> >>
> >> Single-threaded big file write one page at a time, was it?
> >
> > Yup. On a 4 node NUMA system.
>
> Ok, I can't see any real contention on my single-node workstation
> (running ext4 too, so there may be filesystem differences), but I
> guess that shouldn't surprise me. The cacheline bouncing just isn't
> expensive enough when it all stays on-die.
>
> I can see the tree_lock in my profiles (just not very high), and at
> least for ext4 the main caller seems to be
> __set_page_dirty_nobuffers().
>
> And yes, looking at that, the biggest cost by _far_ inside the
> spinlock seems to be the accounting.
>
> Which doesn't even have to be inside the mapping lock, as far as I can
> tell, and as far as comments go.
>
> So a stupid patch to just move the dirty page accounting to outside
> the spinlock might help a lot.
>
> Does this attached patch help your contention numbers?
No. If anything, it makes it worse. Without the patch, I was
measuring 36-37% in _raw_spin_unlock_irqrestore. With the patch, it
is 42-43%. Write throughput is the same at ~505MB/s.
There's a couple of interesting things showing up in the profile:
41.64% [kernel] [k] _raw_spin_unlock_irqrestore
7.92% [kernel] [k] copy_user_generic_string
5.87% [kernel] [k] _raw_spin_unlock_irq
3.18% [kernel] [k] do_raw_spin_lock
2.51% [kernel] [k] cancel_dirty_page <<<<<<<<<<<<<<<
2.35% [kernel] [k] get_page_from_freelist
2.22% [kernel] [k] shrink_page_list
2.04% [kernel] [k] __block_commit_write.isra.30
1.40% [kernel] [k] xfs_vm_releasepage
1.21% [kernel] [k] free_hot_cold_page
1.17% [kernel] [k] delay_tsc
1.12% [kernel] [k] __wake_up_bit
0.92% [kernel] [k] __slab_free
0.91% [kernel] [k] clear_page_dirty_for_io
0.82% [kernel] [k] __radix_tree_lookup
0.76% [kernel] [k] node_dirty_ok
0.68% [kernel] [k] page_evictable
0.63% [kernel] [k] do_raw_spin_unlock
0.62% [kernel] [k] up_write
.....
Why are we even calling into cancel_dirty_page() if the page isn't
dirty? xfs_vm_releasepage() won't let dirty pages through to
try_to_free_buffers(), so all this is just pure overhead for XFS.
FWIW, this is not under the mapping->tree_lock, but the profile shows
that reclaiming bufferheads is roughly 20% of all the work kswapd is
doing. If we take away the mapping->tree_lock contention from the
usage totals, it's actually closer to 50% of the real work kswapd is
doing. The call graph profile is pretty grim:
- 41.89% 0.00% [kernel] [k] kswapd
- kswapd
- 41.84% shrink_node
- 41.61% shrink_node_memcg.isra.75
- 41.50% shrink_inactive_list
- 40.21% shrink_page_list
- 26.47% __remove_mapping
26.30% _raw_spin_unlock_irqrestore
+ 9.03% try_to_release_page
- 8.82% try_to_release_page
- 8.80% xfs_vm_releasepage
7.55% try_to_free_buffers
+ 1.56% free_hot_cold_page_list
0.73% page_evictable
0.88% _raw_spin_unlock_irq
I guess now that the iomap code is in and we no longer really depend
on bufferheads in the writeback path, it's time to take the next
step in removing bufferheads from XFS altogether....
> I realize that this has nothing to do with the AIM7 regression (the
> spinlock just isn't high enough in that profile), but your contention
> numbers just aren't right, and updating accounting statistics inside a
> critical spinlock when not needed is just wrong.
Yup, but the above profile shows that the lock contention is mainly
coming from concurrent access in memory reclaim so I don't think the
accounting has anything to do with it.
Perhaps some kind of per-mapping reclaim batching to reduce the
__remove_mapping() locking overhead is in order here. Especially as
this problem will only get worse on larger NUMA machines....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 0:48 ` Dave Chinner
@ 2016-08-15 1:37 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 1:37 UTC (permalink / raw)
To: Dave Chinner
Cc: Tejun Heo, Wu Fengguang, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Sun, Aug 14, 2016 at 5:48 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Does this attached patch help your contention numbers?
>
> No. If anything, it makes it worse. Without the patch, I was
> measuring 36-37% in _raw_spin_unlock_irqrestore. With the patch, it
> is 42-43%. Write throughtput is the same at ~505MB/s.
Not helping any I can see, but I don't see how it could hurt...
Did you perhaps test it together with the other patches that improved
xfs performance? If other things improve, then I'd expect the
contention to get worse.
Not that it matters. Clearly that patch isn't even a stop-gap solution.
> There's a couple of interesting things showing up in the profile:
>
> 41.64% [kernel] [k] _raw_spin_unlock_irqrestore
Actually, you didn't point this one out, but *this* is the real kicker.
There's no way a *unlock* should show up that high. It's not spinning.
It's doing a single store and a pushq/popfq sequence.
Sure, it's going to take a cross-node cachemiss in the presence of
contention, but even then it should never be more expensive than the
locking side - which will *also* do the node changes.
So there's something really odd in your profile. I don't think that's valid.
Maybe your symbol table came from an old kernel, and functions moved
around enough that the profile attributions ended up bogus.
I suspect it's actually supposed to be _raw_spin_lock_irqsave()
which is right next to that function. Although I'd actually expect
that if it's lock contention, you should see the contention mostly in
queued_spin_lock_slowpath().
Unless you have spinlock debugging turned on, in which case your
contention is all from *that*. That's possible, of course.
> 7.92% [kernel] [k] copy_user_generic_string
> 5.87% [kernel] [k] _raw_spin_unlock_irq
> 3.18% [kernel] [k] do_raw_spin_lock
> 2.51% [kernel] [k] cancel_dirty_page <<<<<<<<<<<<<<<
...
> Why are we even calling into cancel_dirty_page() if the page isn't
> dirty? xfs_vm_release_page() won't let dirty pages through to
> try_to_free_buffers(), so all this is just pure overhead for XFS.
See above: there's something screwy with your profile, you should
check that first. Maybe it's not actually cancel_dirty_page() but
something close-by.
(Although I don't see anything close by normally, so even if the
spin_unlock_irq is bogus, I think *that* part may be incorrect.)
Anyway, the reason you'd get cancel_dirty_page() is either due to
truncate, or due to try_to_free_buffers() having dropped the buffers
successfully because the filesystem had already written them out, but
the page is still marked dirty.
> FWIW, this is not under the mapping->tree_lock, but the profile shows
> that reclaiming bufferheads is roughly 20% of all the work kswapd is
> doing.
Well, that may not actually be wrong. That's the most expensive part
of reclaiming memory.
But please double-check your profile, because something is seriously
wrong in it.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 1:37 ` Linus Torvalds
@ 2016-08-15 2:28 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 2:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Tejun Heo, Wu Fengguang, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Sun, Aug 14, 2016 at 06:37:33PM -0700, Linus Torvalds wrote:
> On Sun, Aug 14, 2016 at 5:48 PM, Dave Chinner <david@fromorbit.com> wrote:
> >>
> >> Does this attached patch help your contention numbers?
> >
> > No. If anything, it makes it worse. Without the patch, I was
> > measuring 36-37% in _raw_spin_unlock_irqrestore. With the patch, it
> > is 42-43%. Write throughput is the same at ~505MB/s.
>
> Not helping any I can see, but I don't see how it could hurt...
>
> Did you perhaps test it together with the other patches that improved
> xfs performance? If other things improve, then I'd expect the
> contention to get worse.
>
> Not that it matters. Clearly that patch isn't even a stop-gap solution.
Tried it with and without. Same result.
> > There's a couple of interesting things showing up in the profile:
> >
> > 41.64% [kernel] [k] _raw_spin_unlock_irqrestore
>
> Actually, you didn't point this one out, but *this* is the real kicker.
>
> There's no way a *unlock* should show up that high. It's not spinning.
> It's doing a single store and a pushq/popfq sequence.
>
> Sure, it's going to take a cross-node cachemiss in the presence of
> contention, but even then it should never be more expensive than the
> locking side - which will *also* do the node changes.
>
> So there's something really odd in your profile. I don't think that's valid.
>
> Maybe your symbol table came from a old kernel, and functions moved
> around enough that the profile attributions ended up bogus.
No, I don't think so. I don't install symbol tables on my test VMs,
I let /proc/kallsyms do that work for me. From an strace of 'perf
top -U -g':
18916 open("vmlinux", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/boot/vmlinux", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/boot/vmlinux-4.8.0-rc1-dgc+", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/usr/lib/debug/boot/vmlinux-4.8.0-rc1-dgc+", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/lib/modules/4.8.0-rc1-dgc+/build/vmlinux", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/usr/lib/debug/lib/modules/4.8.0-rc1-dgc+/vmlinux", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/usr/lib/debug/boot/vmlinux-4.8.0-rc1-dgc+.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/root/.debug/.build-id/63/aab665ce90bd81763b90ff2cf103d8e8e823bc", O_RDONLY) = -1 ENOENT (No such file or directory)
18916 open("/sys/kernel/notes", O_RDONLY) = 56
18916 read(56, "\4\0\0\0\24\0\0\0\3\0\0\0", 12) = 12
18916 read(56, "GNU\0", 4) = 4
18916 read(56, "c\252\266e\316\220\275\201v;\220\377,\361\3\330\350\350#\274", 20) = 20
18916 close(56) = 0
18916 open("/root/.debug/[kernel.kcore]/63aab665ce90bd81763b90ff2cf103d8e8e823bc", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
18916 open("/proc/kallsyms", O_RDONLY) = 56
18916 fstat(56, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
18916 read(56, "0000000000000000 A irq_stack_uni"..., 1024) = 1024
18916 read(56, "a\n000000000000b8c0 A rsp_scratch"..., 1024) = 1024
18916 read(56, "0000000c6e0 A cmci_storm_state\n0"..., 1024) = 1024
18916 read(56, "000000ccd8 A sd_llc_id\n000000000"..., 1024) = 1024
You can see that perf is pulling the symbol table from the running
kernel, so I don't think there's a symbol mismatch here at all.
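For illustration, the address-to-symbol resolution perf does from /proc/kallsyms amounts to a floor lookup over sorted symbol start addresses. A minimal sketch (hypothetical addresses, simplified "address type name" line format):

```python
# Simplified sketch of kallsyms-based symbol resolution (not perf's
# actual code): each sampled address maps to the last symbol whose
# start address is at or below it.
sample = """ffffffff81e628b0 T _raw_spin_unlock_irqrestore
ffffffff81e62900 T _raw_spin_unlock_irq"""

symbols = sorted(
    (int(addr, 16), name)
    for addr, _type, name in (line.split() for line in sample.splitlines())
)

def resolve(addr):
    """Return the name of the last symbol starting at or before addr."""
    best = None
    for start, name in symbols:
        if start <= addr:
            best = name
    return best

print(resolve(0xffffffff81e628b4))  # -> _raw_spin_unlock_irqrestore
```

Since the table comes from the running kernel itself, a stale-vmlinux mismatch of the kind suggested above cannot happen.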
> I suspect it's actually supposed to be _raw_spin_lock_irqrestore()
> which is right next to that function. Although I'd actually expect
> that if it's lock contention, you should see the contention mostly in
> queued_spin_lock_slowpath().
>
> Unless you have spinlock debugging turned on, in which case your
> contention is all from *that*. That's possible, of course.
$ grep SPINLOCK .config
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_PARAVIRT_SPINLOCKS=y
CONFIG_DEBUG_SPINLOCK=y
$
So, turn off CONFIG_DEBUG_SPINLOCK, and:
41.06% [kernel] [k] _raw_spin_unlock_irqrestore
7.68% [kernel] [k] copy_user_generic_string
4.52% [kernel] [k] _raw_spin_unlock_irq
2.78% [kernel] [k] _raw_spin_lock
2.30% [kernel] [k] get_page_from_freelist
2.21% [kernel] [k] native_queued_spin_lock_slowpath
2.16% [kernel] [k] __slab_free
2.12% [kernel] [k] __block_commit_write.isra.29
1.55% [kernel] [k] __list_add
1.49% [kernel] [k] shrink_page_list
1.23% [kernel] [k] free_hot_cold_page
1.14% [kernel] [k] __wake_up_bit
1.01% [kernel] [k] try_to_release_page
1.00% [kernel] [k] page_evictable
0.90% [kernel] [k] cancel_dirty_page
0.80% [kernel] [k] unlock_page
0.80% [kernel] [k] up_write
0.73% [kernel] [k] ___might_sleep
0.68% [kernel] [k] clear_page_dirty_for_io
0.64% [kernel] [k] __radix_tree_lookup
0.61% [kernel] [k] __block_write_begin_int
0.60% [kernel] [k] xfs_do_writepage
0.59% [kernel] [k] node_dirty_ok
0.55% [kernel] [k] down_write
0.50% [kernel] [k] page_mapping
0.47% [kernel] [k] iomap_write_actor
- 38.29% 0.01% [kernel] [k] kswapd
- 38.28% kswapd
- 38.23% shrink_node
- 38.14% shrink_node_memcg.isra.75
- 38.09% shrink_inactive_list
- 36.90% shrink_page_list
- 24.41% __remove_mapping
24.16% _raw_spin_unlock_irqrestore
- 7.42% try_to_release_page
- 6.77% xfs_vm_releasepage
- 4.76% try_to_free_buffers
- 2.05% free_buffer_head
- 2.01% kmem_cache_free
1.94% __slab_free
- 1.24% _raw_spin_lock
native_queued_spin_lock_slowpath
0.89% cancel_dirty_page
1.61% _raw_spin_lock
+ 1.53% free_hot_cold_page_list
1.03% __list_add
0.74% page_evictable
0.86% _raw_spin_unlock_irq
No change in behaviour, and there's no obvious problems with the
call chain.
> > 7.92% [kernel] [k] copy_user_generic_string
> > 5.87% [kernel] [k] _raw_spin_unlock_irq
> > 3.18% [kernel] [k] do_raw_spin_lock
> > 2.51% [kernel] [k] cancel_dirty_page <<<<<<<<<<<<<<<
> ...
> > Why are we even calling into cancel_dirty_page() if the page isn't
> > dirty? xfs_vm_release_page() won't let dirty pages through to
> > try_to_free_buffers(), so all this is just pure overhead for XFS.
>
> See above: there's something screwy with your profile, you should
> check that first. Maybe it's not actually cancel_dirty_page() but
> something close-by.
No. try_to_free_buffers() calls drop_buffers(), which returns 1 when
the buffers are to be dropped. And when that happens, it *always*
calls cancel_dirty_page(), regardless of whether the page is
actually dirty or not.
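That control flow can be sketched as a toy model (plain Python, not kernel code; the real functions live in fs/buffer.c):

```python
# Toy model of the call flow described above: once drop_buffers()
# succeeds, cancel_dirty_page() is called whether or not the page was
# dirty, so clean pages pay the accounting cost too.
cancel_calls = 0

def drop_buffers(page):
    return page["has_buffers"]      # returns truthy when buffers are dropped

def cancel_dirty_page(page):
    global cancel_calls
    cancel_calls += 1               # accounting work happens here...
    page["dirty"] = False

def try_to_free_buffers(page):
    ret = drop_buffers(page)
    if ret:
        cancel_dirty_page(page)     # ...even for an already-clean page
    return ret

clean_page = {"dirty": False, "has_buffers": True}
try_to_free_buffers(clean_page)
print(cancel_calls)  # -> 1
```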
fmeh. This was all screwed up by the memcg-aware writeback. Starting
with commit 11f81be ("page_writeback: revive cancel_dirty_page() in a
restricted form") and then adding unconditional functionality that
can, in fact, *take the mapping->tree_lock* under the covers. i.e.
unlocked_inode_to_wb_begin() hides that gem, which appears to be
necessary for the accounting done when cleaning up a dirty page in
this location.
Still, why is it doing all this work on *clean pages*?
> > FWIW, this is not under the mapping->tree_lock, but the profile shows
> > that reclaiming bufferheads is roughly 20% of all the work kswapd is
> > doing.
>
> Well, that may not actually be wrong. That's the most expensive part
> of reclaiming memory.
All the more reason for not using them.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 2:28 ` Dave Chinner
@ 2016-08-15 2:53 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 2:53 UTC (permalink / raw)
To: Dave Chinner
Cc: Tejun Heo, Wu Fengguang, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> Maybe your symbol table came from a old kernel, and functions moved
>> around enough that the profile attributions ended up bogus.
>
> No, I don't think so. I don't install symbol tables on my test VMs,
> I let /proc/kallsyms do that work for me. From an strace of 'perf
> top -U -g":
Ok. But something is definitely wrong with your profile.
What does it say if you annotate that _raw_spin_unlock_irqrestore() function?
I guarantee you that no, it's not spending 41% of time in
spin_unlock_irqrestore. That just isn't a valid profile. There's
something seriously wrong somewhere.
The fact that you then get the same profile when you turn _off_
CONFIG_DEBUG_SPINLOCK only proves there is something going on that is
pure garbage.
I suspect that what you did was to edit the .config file, remove
DEBUG_SPINLOCK, and then do "make oldconfig" again.
And it got turned on again, because you have one of the lock debugging
options enabled that forces spinlock debugging back on:
- DEBUG_WW_MUTEX_SLOWPATH
- DEBUG_LOCK_ALLOC
- PROVE_LOCKING
all of which would make any profiles entirely pointless.
[ Light goes on ]
Oh, no, I can see another possibility: you're not doing proper CPU
profiles, you're doing some timer-irq profile, and the reason you get
41% on _raw_spin_unlock_irqrestore() is that that is where the interrupts
are enabled again.
Timer-interrupt based profiles are not useful either.
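A toy simulation (my own model, not perf internals) of that skew: timer ticks that arrive while IRQs are disabled are deferred, and the deferred sample fires right after interrupts are re-enabled, so the whole IRQs-off region gets charged to the unlock path.

```python
# Model: one timer tick per time unit; ticks during IRQs-off sections
# are deferred and attributed to the function running when IRQs come
# back on (the unlock). The numbers here are made up for illustration.
from collections import Counter

# hypothetical execution trace: (function, irqs_enabled) per time unit
trace = ([("critical_section", False)] * 40 +
         [("_raw_spin_unlock_irqrestore", True)] +
         [("other_work", True)] * 20)

samples = Counter()
pending = 0
for func, irqs_on in trace:
    if not irqs_on:
        pending += 1                  # tick deferred while IRQs are off
    else:
        samples[func] += 1 + pending  # deferred ticks land here
        pending = 0

print(samples.most_common(2))
# -> [('_raw_spin_unlock_irqrestore', 41), ('other_work', 20)]
```

The unlock dominates the profile even though it did almost no work, which is exactly the 41% pattern under discussion.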
Make sure you actually use "perf record -e cycles:pp" or something
that uses PEBS to get real profiles using CPU performance counters.
Because right now the profile data is worthless.
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 2:53 ` Linus Torvalds
@ 2016-08-15 5:00 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 5:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Tejun Heo, Wu Fengguang, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Sun, Aug 14, 2016 at 07:53:40PM -0700, Linus Torvalds wrote:
> On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner <david@fromorbit.com> wrote:
> >>
> >> Maybe your symbol table came from a old kernel, and functions moved
> >> around enough that the profile attributions ended up bogus.
> >
> > No, I don't think so. I don't install symbol tables on my test VMs,
> > I let /proc/kallsyms do that work for me. From an strace of 'perf
> > top -U -g":
>
> Ok. But something is definitely wrong with your profile.
>
> What does it say if you annotate that _raw_spin_unlock_irqrestore() function?
....
raw_spin_unlock_irqrestore /proc/kcore
│
│ Disassembly of section load0:
│
│ ffffffff81e628b0 <load0>:
│        nop
│        push %rbp
│        mov %rsp,%rbp
│        movb $0x0,(%rdi)
│        nop
│        mov %rsi,%rdi
│        push %rdi
│        popfq
99.35 │  nop
│        decl %gs:0x7e1a9bc7(%rip)
0.65 │ ↓ je 25
│        pop %rbp
│      ← retq
│ 25:    callq 0xffffffff81002000
│        pop %rbp
│      ← retq
> I guarantee you that no, it's not spending 41% of time in
> spin_unlock_irqrestore. That just isn't a valid profile. There's
> something seriously wrong somewhere.
>
> The fact that you then get the same profile when you turn _off_
> CONFIG_DEBUG_SPINLOCK only proves there is something going on that is
> pure garbage.
>
> I suspect that what you did was to edit the .config file, remove
> DEBUG_SPINLOCK, and then do "make oldconfig" again.
Yes.
> And it got turned on again,
No. I'm not that stupid - I checked:
$ grep SPINLOCK .config
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_DEBUG_SPINLOCK is not set
$
> because you have one of the lock debugging
> options on that force spinlock debuggin on again:
> - DEBUG_WW_MUTEX_SLOWPATH
> - DEBUG_LOCK_ALLOC
> - PROVE_LOCKING
None of which are set:
$ grep 'DEBUG\|PROVE' .config |grep -v '#'
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_RODATA=y
CONFIG_SLUB_DEBUG=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PM_DEBUG=y
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_DEBUG_DEVRES=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_XFS_DEBUG=y
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_CIFS_DEBUG=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
CONFIG_DEBUG_BOOT_PARAMS=y
$
> [ Light goes on ]
>
> Oh, no, I can see another possibility: you're not doing proper CPU
> profiles, you're doing some timer-irq profile, and the reason you get
> 41% on spin_unlock_irq_restore() is that that is where the interrupts
> are enabled again.
>
> Timer-interrupt based profiles are not useful either.
I've just been using whatever perf defaults to. Defaults are
supposed to be useful - if they aren't then perf needs to be fixed.
perf top reports this by default:
Samples: 118K of event 'cpu-clock', Event count (approx.): 793748915
Overhead Shared O Symbol
34.48% [kernel] [k] _raw_spin_unlock_irqrestore
7.89% [kernel] [k] copy_user_generic_string
5.08% [kernel] [k] _raw_spin_unlock_irq
...
> Make sure you actually use "perf record -e cycles:pp" or something
> that uses PEBS to get real profiles using CPU performance counters.
WTF is PEBS? I'm not a CPU nerd, and I certainly don't expect to
have to learn all the intricacies of hardware performance counters
just to profile the kernel in a correct and sane manner. That's what
the *perf defaults* are supposed to do.
Anyway: `perf top -U -e cycles:pp`:
Samples: 301K of event 'cpu-clock:ppH', Event count (approx.): 69364814
Overhead Shared O Symbol
30.89% [kernel] [k] _raw_spin_unlock_irqrestore
7.04% [kernel] [k] _raw_spin_unlock_irq
4.08% [kernel] [k] copy_user_generic_string
2.44% [kernel] [k] get_page_from_freelist
1.81% [kernel] [k] _raw_spin_lock
No change.
$ sudo perf record -e cycles:pp -a --all-kernel -- xfs_io -f -c "pwrite 0 47g" /mnt/scratch/fooey
# Samples: 2M of event 'cpu-clock:khppH'
# Event count (approx.): 588517250000
#
# Overhead Command Shared Object Symbol
# ........ ............... ................. ..........................................
#
83.09% swapper [kernel.kallsyms] [k] native_safe_halt
1.42% xfs_io [kernel.kallsyms] [k] copy_user_generic_string
1.26% kswapd3 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.24% kswapd1 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.09% kswapd2 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.98% kswapd0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.80% xfs_io [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.77% kworker/u34:2 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.73% xfs_io [kernel.kallsyms] [k] _raw_spin_unlock_irq
0.51% xfs_io [kernel.kallsyms] [k] get_page_from_freelist
0.39% xfs_io [kernel.kallsyms] [k] __block_commit_write.isra.29
0.16% xfs_io [kernel.kallsyms] [k] _raw_spin_lock
0.14% xfs_io [kernel.kallsyms] [k] up_write
0.14% kworker/u34:2 [kernel.kallsyms] [k] clear_page_dirty_for_io
0.14% kworker/u34:2 [kernel.kallsyms] [k] xfs_do_writepage
....
It's exactly the same profile, just reported as a percentage of 16
CPUs rather than normalised to a single CPU. From my ignorant
viewpoint, I'd say that's expected because perf is still using the
"cpu-clock" event configuration.
The hardware event counters are undocumented in the perf man pages;
perf-list doesn't output a single "cpu" or "cycles" event counter,
or even say what hardware event counters are available. Hence I've
got no idea if it's broken, why "cycles" (or "cpu-cycles") apparently
doesn't record cycle-triggered events, or even how perf is supposed
to tell me that it's recording cycle-triggered events.
-Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-15 5:00 ` Dave Chinner
0 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 5:00 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 7635 bytes --]
On Sun, Aug 14, 2016 at 07:53:40PM -0700, Linus Torvalds wrote:
> On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner <david@fromorbit.com> wrote:
> >>
> >> Maybe your symbol table came from a old kernel, and functions moved
> >> around enough that the profile attributions ended up bogus.
> >
> > No, I don't think so. I don't install symbol tables on my test VMs,
> > I let /proc/kallsyms do that work for me. From an strace of 'perf
> > top -U -g":
>
> Ok. But something is definitely wrong with your profile.
>
> What does it say if you annotate that _raw_spin_unlock_irqrestore() function?
....
raw_spin_unlock_irqrestore /proc/kcore
¿
¿
¿
¿ Disassembly of section load0:
¿
¿ ffffffff81e628b0 <load0>:
¿ nop
¿ push %rbp
¿ mov %rsp,%rbp
¿ movb $0x0,(%rdi)
¿ nop
¿ mov %rsi,%rdi
¿ push %rdi
¿ popfq
99.35 ¿ nop
¿ decl %gs:0x7e1a9bc7(%rip)
0.65 ¿ ¿ je 25
¿ pop %rbp
¿ ¿ retq
¿25: callq 0xffffffff81002000
¿ pop %rbp
¿ ¿ retq
> I guarantee you that no, it's not spending 41% of time in
> spin_unlock_irqrestore. That just isn't a valid profile. There's
> something seriously wrong somewhere.
>
> The fact that you then get the same profile when you turn _off_
> CONFIG_DEBUG_SPINLOCK only proves there is something going on that is
> pure garbage.
>
> I suspect that what you did was to edit the .config file, remove
> DEBUG_SPINLOCK, and then do "make oldconfig" again.
Yes.
> And it got turned on again,
No. I'm not that stupid - I checked:
$ grep SPINLOCK .config
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_DEBUG_SPINLOCK is not set
$
> because you have one of the lock debugging
> options on that force spinlock debuggin on again:
> - DEBUG_WW_MUTEX_SLOWPATH
> - DEBUG_LOCK_ALLOC
> - PROVE_LOCKING
None of which are set:
$ grep 'DEBUG\|PROVE' .config |grep -v '#'
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_RODATA=y
CONFIG_SLUB_DEBUG=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PM_DEBUG=y
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_DEBUG_DEVRES=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_XFS_DEBUG=y
CONFIG_OCFS2_DEBUG_MASKLOG=y
CONFIG_CIFS_DEBUG=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_LIST=y
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS=y
CONFIG_DEBUG_BOOT_PARAMS=y
$
> [ Light goes on ]
>
> Oh, no, I can see another possibility: you're not doing proper CPU
> profiles, you're doing some timer-irq profile, and the reason you get
> 41% on spin_unlock_irq_restore() is that that is where the interrupts
> are enabled again.
>
> Timer-interrupt based profiles are not useful either.
I've just been using whatever perf defaults to. Defaults are
supposed to be useful - if they aren't then perf needs to be fixed.
perf top reports this by default:
Samples: 118K of event 'cpu-clock', Event count (approx.): 793748915
Overhead  Shared Object  Symbol
  34.48%  [kernel]       [k] _raw_spin_unlock_irqrestore
   7.89%  [kernel]       [k] copy_user_generic_string
   5.08%  [kernel]       [k] _raw_spin_unlock_irq
...
> Make sure you actually use "perf record -e cycles:pp" or something
> that uses PEBS to get real profiles using CPU performance counters.
WTF is PEBS? I'm not a CPU nerd, and I certainly don't expect to
have to learn all the intricacies of hardware performance counters
just to profile the kernel in a correct and sane manner. That's what
the *perf defaults* are supposed to do.
Anyway: `perf top -U -e cycles:pp`:
Samples: 301K of event 'cpu-clock:ppH', Event count (approx.): 69364814
Overhead  Shared Object  Symbol
  30.89%  [kernel]       [k] _raw_spin_unlock_irqrestore
   7.04%  [kernel]       [k] _raw_spin_unlock_irq
   4.08%  [kernel]       [k] copy_user_generic_string
   2.44%  [kernel]       [k] get_page_from_freelist
   1.81%  [kernel]       [k] _raw_spin_lock
No change.
$ sudo perf record -e cycles:pp -a --all-kernel -- xfs_io -f -c "pwrite 0 47g" /mnt/scratch/fooey
# Samples: 2M of event 'cpu-clock:khppH'
# Event count (approx.): 588517250000
#
# Overhead Command Shared Object Symbol
# ........ ............... ................. ..........................................
#
83.09% swapper [kernel.kallsyms] [k] native_safe_halt
1.42% xfs_io [kernel.kallsyms] [k] copy_user_generic_string
1.26% kswapd3 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.24% kswapd1 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.09% kswapd2 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.98% kswapd0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.80% xfs_io [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.77% kworker/u34:2 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.73% xfs_io [kernel.kallsyms] [k] _raw_spin_unlock_irq
0.51% xfs_io [kernel.kallsyms] [k] get_page_from_freelist
0.39% xfs_io [kernel.kallsyms] [k] __block_commit_write.isra.29
0.16% xfs_io [kernel.kallsyms] [k] _raw_spin_lock
0.14% xfs_io [kernel.kallsyms] [k] up_write
0.14% kworker/u34:2 [kernel.kallsyms] [k] clear_page_dirty_for_io
0.14% kworker/u34:2 [kernel.kallsyms] [k] xfs_do_writepage
....
It's exactly the same profile, just reported as a percentage of 16
CPUs rather than normalised to a single CPU. From my ignorant
viewpoint, I'd say that's expected because perf is still using the
"cpu-clock" event configuration.
The hardware event counters are undocumented in the perf man pages;
perf-list doesn't output a single "cpu" or "cycles" event counter,
nor even say what hardware event counters are available. Hence I've
got no idea whether it's broken, why "cycles" (or "cpu-cycles")
doesn't apparently record cycle-triggered events, or even how perf
is supposed to tell me it's recording cycle-triggered events.
-Dave.
--
Dave Chinner
david(a)fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 2:53 ` Linus Torvalds
@ 2016-08-15 5:03 ` Ingo Molnar
-1 siblings, 0 replies; 219+ messages in thread
From: Ingo Molnar @ 2016-08-15 5:03 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Tejun Heo, Wu Fengguang, Kirill A. Shutemov,
Christoph Hellwig, Huang, Ying, LKML, Bob Peterson, LKP,
Arnaldo Carvalho de Melo, Peter Zijlstra
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Make sure you actually use "perf record -e cycles:pp" or something
> that uses PEBS to get real profiles using CPU performance counters.
Btw., 'perf record -e cycles:pp' is the default now for modern versions
of perf tooling (on most x86 systems) - if you do 'perf record' it will
just use the most precise profiling mode available on that particular
CPU model.
If unsure you can check the event that was used, via:
triton:~> perf report --stdio 2>&1 | grep '# Samples'
# Samples: 27K of event 'cycles:pp'
Thanks,
Ingo
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
[not found] ` <CA+55aFy14nUnJQ_GdF=j8Fa9xiH70c6fY2G3q5HQ01+8z1z3qQ@mail.gmail.com>
@ 2016-08-15 5:12 ` Linus Torvalds
2016-08-15 22:22 ` Dave Chinner
0 siblings, 1 reply; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 5:12 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 1363 bytes --]
On Aug 14, 2016 10:00 PM, "Dave Chinner" <david@fromorbit.com> wrote:
>
> > What does it say if you annotate that _raw_spin_unlock_irqrestore() function?
> ....
>
>   Disassembly of section load0:
>
>   ffffffff81e628b0 <load0>:
>           nop
>           push   %rbp
>           mov    %rsp,%rbp
>           movb   $0x0,(%rdi)
>           nop
>           mov    %rsi,%rdi
>           push   %rdi
>           popfq
> 99.35     nop
Yeah, that's a good disassembly of a non-debug spin unlock, and the symbols
are fine, but the profile is not valid. That's an interrupt point, right
after the popf that enables interrupts again.
I don't know why 'perf' isn't working on your machine, but it clearly
isn't.
Has it ever worked on that machine? What cpu is it? Are you running in some
virtualized environment without performance counters, perhaps?
It's not actually the unlock that is expensive, and there is no contention
on the lock (if there had been, the numbers would have been entirely
different for the debug case, which makes locking an order of magnitude
more expensive). All the cost of everything that happened while interrupts
were disabled is just accounted to the instruction after they were enabled
again.
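[Editor's aside: the accounting effect Linus describes can be illustrated
with a toy model -- hedged as an editor's sketch, not kernel code. Timer
ticks that arrive while interrupts are masked are only delivered once the
unlock path executes popf, so all of them are charged to the instruction
right after interrupts come back on:]

```python
# Toy model of timer-based sampling vs. masked interrupts (an
# illustrative sketch, not kernel code).  Ticks that land while IRQs
# are off are deferred and all charged to the first symbol executing
# once IRQs are re-enabled -- the spin_unlock_irqrestore() popfq.
from collections import Counter

def profile(timeline, tick_every=4):
    """timeline: list of (symbol, irqs_enabled) per time unit."""
    samples = Counter()
    pending = 0
    for t, (sym, irqs_on) in enumerate(timeline):
        if t % tick_every == 0:
            pending += 1              # a timer tick fires
        if irqs_on and pending:
            samples[sym] += pending   # deferred ticks all land here
            pending = 0
    return samples

# 20 time units of real work under the lock, 4 in the unlock path:
timeline = [("locked_work", False)] * 20 + \
           [("_raw_spin_unlock_irqrestore", True)] * 4
print(profile(timeline))  # every sample lands on the unlock, none on the work
```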
Linus
[-- Attachment #2: attachment.html --]
[-- Type: text/html, Size: 1754 bytes --]
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-12 18:03 ` Linus Torvalds
@ 2016-08-15 12:58 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-15 12:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Tejun Heo, Kirill A. Shutemov, Christoph Hellwig,
Huang, Ying, LKML, Bob Peterson, LKP
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
>On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner <david@fromorbit.com> wrote:
>> On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
>>>
>>> I don't recall having ever seen the mapping tree_lock as a contention
>>> point before, but it's not like I've tried that load either. So it
>>> might be a regression (going back long, I suspect), or just an unusual
>>> load that nobody has traditionally tested much.
>>>
>>> Single-threaded big file write one page at a time, was it?
>>
>> Yup. On a 4 node NUMA system.
>
>Ok, I can't see any real contention on my single-node workstation
>(running ext4 too, so there may be filesystem differences), but I
>guess that shouldn't surprise me. The cacheline bouncing just isn't
>expensive enough when it all stays on-die.
>
>I can see the tree_lock in my profiles (just not very high), and at
>least for ext4 the main caller seems to be
>__set_page_dirty_nobuffers().
>
>And yes, looking at that, the biggest cost by _far_ inside the
>spinlock seems to be the accounting.
>
>Which doesn't even have to be inside the mapping lock, as far as I can
>tell, and as far as comments go.
>
>So a stupid patch to just move the dirty page accounting to outside
>the spinlock might help a lot.
>
>Does this attached patch help your contention numbers?
Hi Linus,
The 1BRD tests finished and there are no conclusive changes.
The overall aim7 jobs-per-min slightly decreases and the overall
fsmark files_per_sec slightly increases, but both changes are small
enough (less than 1%) to be within the expected noise.
NUMA test results should be available tomorrow.
99091700659f4df9 1b5f2eb4a752e1fa7102f37545 testcase/testparams/testbox
---------------- -------------------------- ---------------------------
%stddev %change %stddev
\ | \
71443 71286 GEO-MEAN aim7.jobs-per-min
972 961 aim7/1BRD_48G-btrfs-creat-clo-4-performance/ivb44
52205 51525 aim7/1BRD_48G-btrfs-disk_cp-1500-performance/ivb44
2184471 ± 4% -6% 2051740 ± 3% aim7/1BRD_48G-btrfs-disk_rd-9000-performance/ivb44
47049 46630 ± 3% aim7/1BRD_48G-btrfs-disk_rr-1500-performance/ivb44
24932 -4% 23812 aim7/1BRD_48G-btrfs-disk_rw-1500-performance/ivb44
5884 5856 aim7/1BRD_48G-btrfs-disk_src-500-performance/ivb44
51430 51286 aim7/1BRD_48G-btrfs-disk_wrt-1500-performance/ivb44
218 220 aim7/1BRD_48G-btrfs-sync_disk_rw-10-performance/ivb44
22777 23199 aim7/1BRD_48G-ext4-creat-clo-1000-performance/ivb44
130085 128991 aim7/1BRD_48G-ext4-disk_cp-3000-performance/ivb44
2434088 ± 3% -8% 2232211 ± 4% aim7/1BRD_48G-ext4-disk_rd-9000-performance/ivb44
130351 128977 aim7/1BRD_48G-ext4-disk_rr-3000-performance/ivb44
73280 74044 aim7/1BRD_48G-ext4-disk_rw-3000-performance/ivb44
277035 -3% 268057 aim7/1BRD_48G-ext4-disk_src-3000-performance/ivb44
127584 4% 132639 aim7/1BRD_48G-ext4-disk_wrt-3000-performance/ivb44
10571 10659 aim7/1BRD_48G-ext4-sync_disk_rw-600-performance/ivb44
36924 ± 7% 36327 aim7/1BRD_48G-f2fs-creat-clo-1500-performance/ivb44
117238 119130 aim7/1BRD_48G-f2fs-disk_cp-3000-performance/ivb44
2340512 ± 5% 2352619 ± 10% aim7/1BRD_48G-f2fs-disk_rd-9000-performance/ivb44
107506 ± 9% 7% 114869 aim7/1BRD_48G-f2fs-disk_rr-3000-performance/ivb44
105642 106835 aim7/1BRD_48G-f2fs-disk_rw-3000-performance/ivb44
26900 ± 3% 26442 ± 3% aim7/1BRD_48G-f2fs-disk_src-3000-performance/ivb44
117124 ± 3% 117678 aim7/1BRD_48G-f2fs-disk_wrt-3000-performance/ivb44
3689 3616 aim7/1BRD_48G-f2fs-sync_disk_rw-600-performance/ivb44
70897 72758 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
267649 ± 3% 270867 aim7/1BRD_48G-xfs-disk_cp-3000-performance/ivb44
485217 ± 3% 489403 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360451 359042 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
338114 336838 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
60130 ± 5% 4% 62663 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
403144 401476 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
26327 26513 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
99091700659f4df9 1b5f2eb4a752e1fa7102f37545
---------------- --------------------------
2117 2138 GEO-MEAN fsmark.files_per_sec
4325 4379 fsmark/1x-1t-1BRD_32G-btrfs-4K-4G-fsyncBeforeClose-1fpd-performance/ivb43
9466 ± 3% 4% 9804 fsmark/1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd-performance/ivb43
433 ± 5% 424 fsmark/1x-1t-1BRD_48G-btrfs-4M-40G-NoSync-performance/ivb44
185 ± 6% 5% 194 fsmark/1x-1t-1BRD_48G-btrfs-4M-40G-fsyncBeforeClose-performance/ivb44
368 ± 3% -4% 355 ± 6% fsmark/1x-1t-1BRD_48G-ext4-4M-40G-NoSync-performance/ivb44
191 191 fsmark/1x-1t-1BRD_48G-ext4-4M-40G-fsyncBeforeClose-performance/ivb44
393 ± 4% 397 ± 4% fsmark/1x-1t-1BRD_48G-xfs-4M-40G-NoSync-performance/ivb44
200 201 fsmark/1x-1t-1BRD_48G-xfs-4M-40G-fsyncBeforeClose-performance/ivb44
924 -3% 896 ± 3% fsmark/1x-1t-1HDD-xfs-4K-400M-fsyncBeforeClose-1fpd-performance/ivb43
488 ± 3% 6% 516 fsmark/1x-64t-1BRD_48G-btrfs-4M-40G-NoSync-performance/ivb44
559 564 fsmark/1x-64t-1BRD_48G-ext4-4M-40G-NoSync-performance/ivb44
1130 1111 fsmark/1x-64t-1BRD_48G-ext4-4M-40G-fsyncBeforeClose-performance/ivb44
526 ± 7% 6% 557 fsmark/1x-64t-1BRD_48G-xfs-4M-40G-NoSync-performance/ivb44
1583 ± 3% 1620 fsmark/1x-64t-1BRD_48G-xfs-4M-40G-fsyncBeforeClose-performance/ivb44
33202 33208 fsmark/8-1SSD-16-ext4-8K-75G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
33889 33784 fsmark/8-1SSD-16-ext4-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
25576 25509 fsmark/8-1SSD-32-xfs-9B-30G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
9117 9079 fsmark/8-1SSD-4-btrfs-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
13288 13261 fsmark/8-1SSD-4-btrfs-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
18851 ± 11% 11% 21013 fsmark/8-1SSD-4-f2fs-8K-72G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
24343 -4% 23473 ± 4% fsmark/8-1SSD-4-f2fs-9B-40G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
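[Editor's aside: the GEO-MEAN rows in the tables above summarise each
group of testcases with a geometric mean. A hedged sketch of that summary
statistic -- the actual lkp-tests reporting code may differ in detail:]

```python
# Hedged sketch of the GEO-MEAN summary rows (editor's illustration).
from math import prod

def geomean(values):
    """Geometric mean: the n-th root of the product of n values."""
    return prod(values) ** (1.0 / len(values))

print(round(geomean([100, 400])))  # → 200
```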
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 16:17 ` Christoph Hellwig
@ 2016-08-15 14:14 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-15 14:14 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>Snipping the long contest:
>
>I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
> (2) lots of additional spinlock contention in the new case. A quick
> check shows that I fat-fingered my rewrite so that we do
> the xfs_inode_set_eofblocks_tag call now for the pure lookup
> case, and pretty much all new cycles come from that.
> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>     we're already doing way too many even without my little bug above.
>
>So I've force pushed a new version of the iomap-fixes branch with
>(2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag
>a lot less expensive slotted in before that. Would be good to see
>the numbers with that.
The aim7 1BRD tests finished and there are ups and downs, with overall
performance remaining flat.
99091700659f4df9 74a242ad94d13436a1644c0b45 bf4dc6e4ecc2a3d042029319bc testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
159926 157324 158574 GEO-MEAN aim7.jobs-per-min
70897 5% 74137 4% 73775 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
485217 ± 3% 492431 477533 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360451 -19% 292980 -17% 299377 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
338114 338410 5% 354078 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
60130 ± 5% 4% 62438 5% 62923 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
403144 397790 410648 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
26327 26534 26128 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
The new commit bf4dc6e ("xfs: rewrite and optimize the delalloc write
path") improves the aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
case by 5%. Here are the detailed numbers:
aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
74a242ad94d13436 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev %change %stddev
\ | \
338410 5% 354078 aim7.jobs-per-min
404390 8% 435117 aim7.time.voluntary_context_switches
2502 -4% 2396 aim7.time.maximum_resident_set_size
15018 -9% 13701 aim7.time.involuntary_context_switches
900 -11% 801 aim7.time.system_time
17432 11% 19365 vmstat.system.cs
47736 ± 19% -24% 36087 interrupts.CAL:Function_call_interrupts
2129646 31% 2790638 proc-vmstat.pgalloc_dma32
379503 13% 429384 numa-meminfo.node0.Dirty
15018 -9% 13701 time.involuntary_context_switches
900 -11% 801 time.system_time
1560 10% 1716 slabinfo.mnt_cache.active_objs
1560 10% 1716 slabinfo.mnt_cache.num_objs
61.53 -4 57.45 ± 4% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
61.63 -4 57.55 ± 4% perf-profile.func.cycles-pp.intel_idle
1007188 ± 16% 156% 2577911 ± 6% numa-numastat.node0.numa_miss
9662857 ± 4% -13% 8420159 ± 3% numa-numastat.node0.numa_foreign
1008220 ± 16% 155% 2570630 ± 6% numa-numastat.node1.numa_foreign
9664033 ± 4% -13% 8413184 ± 3% numa-numastat.node1.numa_miss
26519887 ± 3% 18% 31322674 cpuidle.C1-IVT.time
122238 16% 142383 cpuidle.C1-IVT.usage
46548 11% 51645 cpuidle.C1E-IVT.usage
17253419 13% 19567582 cpuidle.C3-IVT.time
86847 13% 98333 cpuidle.C3-IVT.usage
482033 ± 12% 108% 1000665 ± 8% numa-vmstat.node0.numa_miss
94689 14% 107744 numa-vmstat.node0.nr_zone_write_pending
94677 14% 107718 numa-vmstat.node0.nr_dirty
3156643 ± 3% -20% 2527460 ± 3% numa-vmstat.node0.numa_foreign
429288 ± 12% 129% 983053 ± 8% numa-vmstat.node1.numa_foreign
3104193 ± 3% -19% 2510128 numa-vmstat.node1.numa_miss
6.43 ± 5% 51% 9.70 ± 11% turbostat.Pkg%pc2
0.30 28% 0.38 turbostat.CPU%c3
9.71 9.92 turbostat.RAMWatt
158 154 turbostat.PkgWatt
125 -3% 121 turbostat.CorWatt
1141 -6% 1078 turbostat.Avg_MHz
38.70 -6% 36.48 turbostat.%Busy
5.03 ± 11% -51% 2.46 ± 40% turbostat.Pkg%pc6
8.33 ± 48% 88% 15.67 ± 36% sched_debug.cfs_rq:/.runnable_load_avg.max
1947 ± 3% -12% 1710 ± 7% sched_debug.cfs_rq:/.spread0.stddev
1936 ± 3% -12% 1698 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
2170 ± 10% -14% 1863 ± 6% sched_debug.cfs_rq:/.load_avg.max
220926 ± 18% 37% 303192 ± 5% sched_debug.cpu.avg_idle.stddev
0.06 ± 13% 357% 0.28 ± 23% sched_debug.rt_rq:/.rt_time.avg
0.37 ± 10% 240% 1.25 ± 15% sched_debug.rt_rq:/.rt_time.stddev
2.54 ± 10% 160% 6.59 ± 10% sched_debug.rt_rq:/.rt_time.max
0.32 ± 19% 29% 0.42 ± 10% perf-stat.dTLB-load-miss-rate
964727 7% 1028830 perf-stat.context-switches
176406 4% 184289 perf-stat.cpu-migrations
0.29 4% 0.30 perf-stat.branch-miss-rate
1.634e+09 1.673e+09 perf-stat.node-store-misses
23.60 23.99 perf-stat.node-store-miss-rate
40.01 40.57 perf-stat.cache-miss-rate
0.95 -8% 0.87 perf-stat.ipc
3.203e+12 -9% 2.928e+12 perf-stat.cpu-cycles
1.506e+09 -11% 1.345e+09 perf-stat.branch-misses
50.64 ± 13% -14% 43.45 ± 4% perf-stat.iTLB-load-miss-rate
5.285e+11 -14% 4.523e+11 perf-stat.branch-instructions
3.042e+12 -16% 2.551e+12 perf-stat.instructions
7.996e+11 -18% 6.584e+11 perf-stat.dTLB-loads
5.569e+11 ± 4% -18% 4.578e+11 perf-stat.dTLB-stores
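[Editor's aside: the %change columns in these comparisons are plain
relative deltas of the new commit's metric against the base commit's. A
hedged one-liner for reproducing them, using the aim7.jobs-per-min row
from the table above:]

```python
# Hedged sketch of the %change columns (editor's illustration).
def pct_change(base, new):
    return (new - base) / base * 100.0

# aim7.jobs-per-min, 338410 -> 354078 in the improved case above:
print(round(pct_change(338410, 354078)))  # → 5
```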
Here are the detailed numbers for the slowed down case:
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
99091700659f4df9 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev     %change         %stddev
\          |                \
360451 -17% 299377 aim7.jobs-per-min
12806 481% 74447 aim7.time.involuntary_context_switches
755 44% 1086 aim7.time.system_time
50.17 20% 60.36 aim7.time.elapsed_time
50.17 20% 60.36 aim7.time.elapsed_time.max
438148 446012 aim7.time.voluntary_context_switches
37798 ± 16% 780% 332583 ± 8% interrupts.CAL:Function_call_interrupts
78.82 ± 5% 18% 93.35 ± 5% uptime.boot
2847 ± 7% 11% 3160 ± 7% uptime.idle
147490 ± 8% 34% 197261 ± 3% softirqs.RCU
648159 29% 839283 softirqs.TIMER
160830 10% 177144 softirqs.SCHED
3845352 ± 4% 91% 7349133 numa-numastat.node0.numa_miss
4686838 ± 5% 67% 7835640 numa-numastat.node0.numa_foreign
3848455 ± 4% 91% 7352436 numa-numastat.node1.numa_foreign
4689920 ± 5% 67% 7838734 numa-numastat.node1.numa_miss
50.17 20% 60.36 time.elapsed_time.max
12806 481% 74447 time.involuntary_context_switches
755 44% 1086 time.system_time
50.17 20% 60.36 time.elapsed_time
1563 18% 1846 time.percent_of_cpu_this_job_got
11699 ± 19% 3738% 449048 vmstat.io.bo
18836969 -16% 15789996 vmstat.memory.free
16 19% 19 vmstat.procs.r
19377 459% 108364 vmstat.system.cs
48255 11% 53537 vmstat.system.in
2357299 25% 2951384 meminfo.Inactive(file)
2366381 25% 2960468 meminfo.Inactive
1575292 -9% 1429971 meminfo.Cached
19342499 -17% 16100340 meminfo.MemFree
1057904 -20% 842987 meminfo.Dirty
1057 21% 1284 turbostat.Avg_MHz
35.78 21% 43.24 turbostat.%Busy
9.95 15% 11.47 turbostat.RAMWatt
74 ± 5% 10% 81 turbostat.CoreTmp
74 ± 4% 10% 81 turbostat.PkgTmp
118 8% 128 turbostat.CorWatt
151 7% 162 turbostat.PkgWatt
29.06 -23% 22.39 turbostat.CPU%c6
487 ± 89% 3e+04 26448 ± 57% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1823 ± 82% 2e+06 1913796 ± 38% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
208475 ± 43% 1e+06 1409494 ± 5% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
6884 ± 73% 8e+04 90790 ± 9% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
1598 ± 20% 3e+04 35015 ± 27% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
2006 ± 25% 3e+04 31143 ± 35% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
29 ±101% 1e+04 10214 ± 29% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
1206 ± 51% 9e+03 9919 ± 25% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
29869205 ± 4% -10% 26804569 cpuidle.C1-IVT.time
5737726 39% 7952214 cpuidle.C1E-IVT.time
51141 17% 59958 cpuidle.C1E-IVT.usage
18377551 37% 25176426 cpuidle.C3-IVT.time
96067 17% 112045 cpuidle.C3-IVT.usage
1806811 12% 2024041 cpuidle.C6-IVT.usage
1104420 ± 36% 204% 3361085 ± 27% cpuidle.POLL.time
281 ± 10% 20% 338 cpuidle.POLL.usage
5.61 ± 11% -0.5 5.12 ± 18% perf-profile.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
5.85 ± 6% -0.8 5.06 ± 15% perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
6.32 ± 6% -0.9 5.42 ± 15% perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
15.77 ± 8% -2 13.83 ± 17% perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
16.04 ± 8% -2 14.01 ± 15% perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
60.25 ± 4% -7 53.03 ± 7% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
60.41 ± 4% -7 53.12 ± 7% perf-profile.func.cycles-pp.intel_idle
1174104 22% 1436859 numa-meminfo.node0.Inactive
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-15 14:14 ` Fengguang Wu
0 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-15 14:14 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 27403 bytes --]
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>Snipping the long context:
>
>I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
> (2) lots of additional spinlock contention in the new case. A quick
> check shows that I fat-fingered my rewrite so that we do
> the xfs_inode_set_eofblocks_tag call now for the pure lookup
> case, and pretty much all new cycles come from that.
> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
> we're already doing way too many even without my little bug above.
>
>So I've force pushed a new version of the iomap-fixes branch with
>(2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
>lot less expensive slotted in before that. Would be good to see
>the numbers with that.
The aim7 1BRD tests finished and there are ups and downs, with overall
performance remaining flat.
99091700659f4df9 74a242ad94d13436a1644c0b45 bf4dc6e4ecc2a3d042029319bc testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
159926 157324 158574 GEO-MEAN aim7.jobs-per-min
70897 5% 74137 4% 73775 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
485217 ± 3% 492431 477533 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
360451 -19% 292980 -17% 299377 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
338114 338410 5% 354078 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
60130 ± 5% 4% 62438 5% 62923 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
403144 397790 410648 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
26327 26534 26128 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
The new commit bf4dc6e ("xfs: rewrite and optimize the delalloc write
path") improves the aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
case by 5%. Here are the detailed numbers:
aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
74a242ad94d13436 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev %change %stddev
\ | \
338410 5% 354078 aim7.jobs-per-min
404390 8% 435117 aim7.time.voluntary_context_switches
2502 -4% 2396 aim7.time.maximum_resident_set_size
15018 -9% 13701 aim7.time.involuntary_context_switches
900 -11% 801 aim7.time.system_time
17432 11% 19365 vmstat.system.cs
47736 ± 19% -24% 36087 interrupts.CAL:Function_call_interrupts
2129646 31% 2790638 proc-vmstat.pgalloc_dma32
379503 13% 429384 numa-meminfo.node0.Dirty
15018 -9% 13701 time.involuntary_context_switches
900 -11% 801 time.system_time
1560 10% 1716 slabinfo.mnt_cache.active_objs
1560 10% 1716 slabinfo.mnt_cache.num_objs
61.53 -4 57.45 ± 4% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
61.63 -4 57.55 ± 4% perf-profile.func.cycles-pp.intel_idle
1007188 ± 16% 156% 2577911 ± 6% numa-numastat.node0.numa_miss
9662857 ± 4% -13% 8420159 ± 3% numa-numastat.node0.numa_foreign
1008220 ± 16% 155% 2570630 ± 6% numa-numastat.node1.numa_foreign
9664033 ± 4% -13% 8413184 ± 3% numa-numastat.node1.numa_miss
26519887 ± 3% 18% 31322674 cpuidle.C1-IVT.time
122238 16% 142383 cpuidle.C1-IVT.usage
46548 11% 51645 cpuidle.C1E-IVT.usage
17253419 13% 19567582 cpuidle.C3-IVT.time
86847 13% 98333 cpuidle.C3-IVT.usage
482033 ± 12% 108% 1000665 ± 8% numa-vmstat.node0.numa_miss
94689 14% 107744 numa-vmstat.node0.nr_zone_write_pending
94677 14% 107718 numa-vmstat.node0.nr_dirty
3156643 ± 3% -20% 2527460 ± 3% numa-vmstat.node0.numa_foreign
429288 ± 12% 129% 983053 ± 8% numa-vmstat.node1.numa_foreign
3104193 ± 3% -19% 2510128 numa-vmstat.node1.numa_miss
6.43 ± 5% 51% 9.70 ± 11% turbostat.Pkg%pc2
0.30 28% 0.38 turbostat.CPU%c3
9.71 9.92 turbostat.RAMWatt
158 154 turbostat.PkgWatt
125 -3% 121 turbostat.CorWatt
1141 -6% 1078 turbostat.Avg_MHz
38.70 -6% 36.48 turbostat.%Busy
5.03 ± 11% -51% 2.46 ± 40% turbostat.Pkg%pc6
8.33 ± 48% 88% 15.67 ± 36% sched_debug.cfs_rq:/.runnable_load_avg.max
1947 ± 3% -12% 1710 ± 7% sched_debug.cfs_rq:/.spread0.stddev
1936 ± 3% -12% 1698 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
2170 ± 10% -14% 1863 ± 6% sched_debug.cfs_rq:/.load_avg.max
220926 ± 18% 37% 303192 ± 5% sched_debug.cpu.avg_idle.stddev
0.06 ± 13% 357% 0.28 ± 23% sched_debug.rt_rq:/.rt_time.avg
0.37 ± 10% 240% 1.25 ± 15% sched_debug.rt_rq:/.rt_time.stddev
2.54 ± 10% 160% 6.59 ± 10% sched_debug.rt_rq:/.rt_time.max
0.32 ± 19% 29% 0.42 ± 10% perf-stat.dTLB-load-miss-rate
964727 7% 1028830 perf-stat.context-switches
176406 4% 184289 perf-stat.cpu-migrations
0.29 4% 0.30 perf-stat.branch-miss-rate
1.634e+09 1.673e+09 perf-stat.node-store-misses
23.60 23.99 perf-stat.node-store-miss-rate
40.01 40.57 perf-stat.cache-miss-rate
0.95 -8% 0.87 perf-stat.ipc
3.203e+12 -9% 2.928e+12 perf-stat.cpu-cycles
1.506e+09 -11% 1.345e+09 perf-stat.branch-misses
50.64 ± 13% -14% 43.45 ± 4% perf-stat.iTLB-load-miss-rate
5.285e+11 -14% 4.523e+11 perf-stat.branch-instructions
3.042e+12 -16% 2.551e+12 perf-stat.instructions
7.996e+11 -18% 6.584e+11 perf-stat.dTLB-loads
5.569e+11 ± 4% -18% 4.578e+11 perf-stat.dTLB-stores
Here are the detailed numbers for the slowed-down case:
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
99091700659f4df9 bf4dc6e4ecc2a3d042029319bc
---------------- --------------------------
%stddev %change %stddev
\ | \
360451 -17% 299377 aim7.jobs-per-min
12806 481% 74447 aim7.time.involuntary_context_switches
755 44% 1086 aim7.time.system_time
50.17 20% 60.36 aim7.time.elapsed_time
50.17 20% 60.36 aim7.time.elapsed_time.max
438148 446012 aim7.time.voluntary_context_switches
37798 ± 16% 780% 332583 ± 8% interrupts.CAL:Function_call_interrupts
78.82 ± 5% 18% 93.35 ± 5% uptime.boot
2847 ± 7% 11% 3160 ± 7% uptime.idle
147490 ± 8% 34% 197261 ± 3% softirqs.RCU
648159 29% 839283 softirqs.TIMER
160830 10% 177144 softirqs.SCHED
3845352 ± 4% 91% 7349133 numa-numastat.node0.numa_miss
4686838 ± 5% 67% 7835640 numa-numastat.node0.numa_foreign
3848455 ± 4% 91% 7352436 numa-numastat.node1.numa_foreign
4689920 ± 5% 67% 7838734 numa-numastat.node1.numa_miss
50.17 20% 60.36 time.elapsed_time.max
12806 481% 74447 time.involuntary_context_switches
755 44% 1086 time.system_time
50.17 20% 60.36 time.elapsed_time
1563 18% 1846 time.percent_of_cpu_this_job_got
11699 ± 19% 3738% 449048 vmstat.io.bo
18836969 -16% 15789996 vmstat.memory.free
16 19% 19 vmstat.procs.r
19377 459% 108364 vmstat.system.cs
48255 11% 53537 vmstat.system.in
2357299 25% 2951384 meminfo.Inactive(file)
2366381 25% 2960468 meminfo.Inactive
1575292 -9% 1429971 meminfo.Cached
19342499 -17% 16100340 meminfo.MemFree
1057904 -20% 842987 meminfo.Dirty
1057 21% 1284 turbostat.Avg_MHz
35.78 21% 43.24 turbostat.%Busy
9.95 15% 11.47 turbostat.RAMWatt
74 ± 5% 10% 81 turbostat.CoreTmp
74 ± 4% 10% 81 turbostat.PkgTmp
118 8% 128 turbostat.CorWatt
151 7% 162 turbostat.PkgWatt
29.06 -23% 22.39 turbostat.CPU%c6
487 ± 89% 3e+04 26448 ± 57% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
1823 ± 82% 2e+06 1913796 ± 38% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
208475 ± 43% 1e+06 1409494 ± 5% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
6884 ± 73% 8e+04 90790 ± 9% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
1598 ± 20% 3e+04 35015 ± 27% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
2006 ± 25% 3e+04 31143 ± 35% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
29 ±101% 1e+04 10214 ± 29% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
1206 ± 51% 9e+03 9919 ± 25% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
29869205 ± 4% -10% 26804569 cpuidle.C1-IVT.time
5737726 39% 7952214 cpuidle.C1E-IVT.time
51141 17% 59958 cpuidle.C1E-IVT.usage
18377551 37% 25176426 cpuidle.C3-IVT.time
96067 17% 112045 cpuidle.C3-IVT.usage
1806811 12% 2024041 cpuidle.C6-IVT.usage
1104420 ± 36% 204% 3361085 ± 27% cpuidle.POLL.time
281 ± 10% 20% 338 cpuidle.POLL.usage
5.61 ± 11% -0.5 5.12 ± 18% perf-profile.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
5.85 ± 6% -0.8 5.06 ± 15% perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
6.32 ± 6% -0.9 5.42 ± 15% perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
15.77 ± 8% -2 13.83 ± 17% perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry
16.04 ± 8% -2 14.01 ± 15% perf-profile.cycles-pp.apic_timer_interrupt.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
60.25 ± 4% -7 53.03 ± 7% perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
60.41 ± 4% -7 53.12 ± 7% perf-profile.func.cycles-pp.intel_idle
1174104 22% 1436859 numa-meminfo.node0.Inactive
1167471 22% 1428271 numa-meminfo.node0.Inactive(file)
770811 -9% 698147 numa-meminfo.node0.FilePages
20707294 -12% 18281509 ± 6% numa-meminfo.node0.Active
20613745 -12% 18180987 ± 6% numa-meminfo.node0.Active(file)
9676639 -17% 8003627 numa-meminfo.node0.MemFree
509906 -22% 396192 numa-meminfo.node0.Dirty
1189539 28% 1524697 numa-meminfo.node1.Inactive(file)
1191989 28% 1525194 numa-meminfo.node1.Inactive
804508 -10% 727067 numa-meminfo.node1.FilePages
9654540 -16% 8077810 numa-meminfo.node1.MemFree
547956 -19% 441933 numa-meminfo.node1.Dirty
396 ± 12% 485% 2320 ± 37% slabinfo.bio-1.num_objs
396 ± 12% 481% 2303 ± 37% slabinfo.bio-1.active_objs
73 140% 176 ± 14% slabinfo.kmalloc-128.active_slabs
73 140% 176 ± 14% slabinfo.kmalloc-128.num_slabs
4734 94% 9171 ± 11% slabinfo.kmalloc-128.num_objs
4734 88% 8917 ± 13% slabinfo.kmalloc-128.active_objs
16238 -10% 14552 ± 3% slabinfo.kmalloc-256.active_objs
17189 -13% 15033 ± 3% slabinfo.kmalloc-256.num_objs
20651 96% 40387 ± 17% slabinfo.radix_tree_node.active_objs
398 91% 761 ± 17% slabinfo.radix_tree_node.active_slabs
398 91% 761 ± 17% slabinfo.radix_tree_node.num_slabs
22313 91% 42650 ± 17% slabinfo.radix_tree_node.num_objs
32 638% 236 ± 28% slabinfo.xfs_efd_item.active_slabs
32 638% 236 ± 28% slabinfo.xfs_efd_item.num_slabs
1295 281% 4934 ± 23% slabinfo.xfs_efd_item.num_objs
1295 280% 4923 ± 23% slabinfo.xfs_efd_item.active_objs
1661 81% 3000 ± 42% slabinfo.xfs_log_ticket.num_objs
1661 78% 2952 ± 42% slabinfo.xfs_log_ticket.active_objs
2617 49% 3905 ± 30% slabinfo.xfs_trans.num_objs
2617 48% 3870 ± 31% slabinfo.xfs_trans.active_objs
1015933 567% 6779099 perf-stat.context-switches
4.864e+08 126% 1.101e+09 perf-stat.node-load-misses
1.179e+09 103% 2.399e+09 perf-stat.node-loads
0.06 ± 34% 92% 0.12 ± 11% perf-stat.dTLB-store-miss-rate
2.985e+08 ± 32% 86% 5.542e+08 ± 11% perf-stat.dTLB-store-misses
2.551e+09 ± 15% 81% 4.625e+09 ± 13% perf-stat.dTLB-load-misses
0.39 ± 14% 66% 0.65 ± 13% perf-stat.dTLB-load-miss-rate
1.26e+09 60% 2.019e+09 perf-stat.node-store-misses
46072661 ± 27% 49% 68472915 perf-stat.iTLB-loads
2.738e+12 ± 4% 43% 3.916e+12 perf-stat.cpu-cycles
21.48 32% 28.35 perf-stat.node-store-miss-rate
1.612e+10 ± 3% 28% 2.066e+10 perf-stat.cache-references
1.669e+09 ± 3% 24% 2.063e+09 perf-stat.branch-misses
6.816e+09 ± 3% 20% 8.179e+09 perf-stat.cache-misses
177699 18% 209145 perf-stat.cpu-migrations
0.39 13% 0.44 perf-stat.branch-miss-rate
4.606e+09 11% 5.102e+09 perf-stat.node-stores
4.329e+11 ± 4% 9% 4.727e+11 perf-stat.branch-instructions
6.458e+11 9% 7.046e+11 perf-stat.dTLB-loads
29.19 8% 31.45 perf-stat.node-load-miss-rate
286173 8% 308115 perf-stat.page-faults
286191 8% 308109 perf-stat.minor-faults
45084934 4% 47073719 perf-stat.iTLB-load-misses
42.28 -6% 39.58 perf-stat.cache-miss-rate
50.62 ± 16% -19% 40.75 perf-stat.iTLB-load-miss-rate
0.89 -28% 0.64 perf-stat.ipc
2 ± 36% 4e+07% 970191 proc-vmstat.pgrotated
150 ± 21% 1e+07% 15356485 ± 3% proc-vmstat.nr_vmscan_immediate_reclaim
76823 ± 35% 56899% 43788651 proc-vmstat.pgscan_direct
153407 ± 19% 4483% 7031431 proc-vmstat.nr_written
619699 ± 19% 4441% 28139689 proc-vmstat.pgpgout
5342421 1061% 62050709 proc-vmstat.pgactivate
47 ± 25% 354% 217 proc-vmstat.nr_pages_scanned
8542963 ± 3% 78% 15182914 proc-vmstat.numa_miss
8542963 ± 3% 78% 15182715 proc-vmstat.numa_foreign
2820568 31% 3699073 proc-vmstat.pgalloc_dma32
589234 25% 738160 proc-vmstat.nr_zone_inactive_file
589240 25% 738155 proc-vmstat.nr_inactive_file
61347830 13% 69522958 proc-vmstat.pgfree
393711 -9% 356981 proc-vmstat.nr_file_pages
4831749 -17% 4020131 proc-vmstat.nr_free_pages
61252784 -18% 50183773 proc-vmstat.pgrefill
61245420 -18% 50176301 proc-vmstat.pgdeactivate
264397 -20% 210222 proc-vmstat.nr_zone_write_pending
264367 -20% 210188 proc-vmstat.nr_dirty
60420248 -39% 36646178 proc-vmstat.pgscan_kswapd
60373976 -44% 33735064 proc-vmstat.pgsteal_kswapd
1753 -98% 43 ± 18% proc-vmstat.pageoutrun
1095 -98% 25 ± 17% proc-vmstat.kswapd_low_wmark_hit_quickly
656 ± 3% -98% 15 ± 24% proc-vmstat.kswapd_high_wmark_hit_quickly
0 1136221 numa-vmstat.node0.workingset_refault
0 1136221 numa-vmstat.node0.workingset_activate
23 ± 45% 1e+07% 2756907 numa-vmstat.node0.nr_vmscan_immediate_reclaim
37618 ± 24% 3234% 1254165 numa-vmstat.node0.nr_written
1346538 ± 4% 104% 2748439 numa-vmstat.node0.numa_miss
1577620 ± 5% 80% 2842882 numa-vmstat.node0.numa_foreign
291242 23% 357407 numa-vmstat.node0.nr_inactive_file
291237 23% 357390 numa-vmstat.node0.nr_zone_inactive_file
13961935 12% 15577331 numa-vmstat.node0.numa_local
13961938 12% 15577332 numa-vmstat.node0.numa_hit
39831 10% 43768 numa-vmstat.node0.nr_unevictable
39831 10% 43768 numa-vmstat.node0.nr_zone_unevictable
193467 -10% 174639 numa-vmstat.node0.nr_file_pages
5147212 -12% 4542321 ± 6% numa-vmstat.node0.nr_active_file
5147237 -12% 4542325 ± 6% numa-vmstat.node0.nr_zone_active_file
2426129 -17% 2008637 numa-vmstat.node0.nr_free_pages
128285 -23% 99206 numa-vmstat.node0.nr_zone_write_pending
128259 -23% 99183 numa-vmstat.node0.nr_dirty
0 1190594 numa-vmstat.node1.workingset_refault
0 1190594 numa-vmstat.node1.workingset_activate
21 ± 36% 1e+07% 3120425 ± 4% numa-vmstat.node1.nr_vmscan_immediate_reclaim
38541 ± 26% 3336% 1324185 numa-vmstat.node1.nr_written
1316819 ± 4% 105% 2699075 numa-vmstat.node1.numa_foreign
1547929 ± 4% 80% 2793491 numa-vmstat.node1.numa_miss
296714 28% 381124 numa-vmstat.node1.nr_zone_inactive_file
296714 28% 381123 numa-vmstat.node1.nr_inactive_file
14311131 10% 15750908 numa-vmstat.node1.numa_hit
14311130 10% 15750905 numa-vmstat.node1.numa_local
201164 -10% 181742 numa-vmstat.node1.nr_file_pages
2422825 -16% 2027750 numa-vmstat.node1.nr_free_pages
137069 -19% 110501 numa-vmstat.node1.nr_zone_write_pending
137069 -19% 110497 numa-vmstat.node1.nr_dirty
737 ± 29% 27349% 202387 sched_debug.cfs_rq:/.min_vruntime.min
3637 ± 20% 7919% 291675 sched_debug.cfs_rq:/.min_vruntime.avg
11.00 ± 44% 4892% 549.17 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.max
2.12 ± 36% 4853% 105.12 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.stddev
1885 ± 6% 4189% 80870 sched_debug.cfs_rq:/.min_vruntime.stddev
1896 ± 6% 4166% 80895 sched_debug.cfs_rq:/.spread0.stddev
10774 ± 13% 4113% 453925 sched_debug.cfs_rq:/.min_vruntime.max
1.02 ± 19% 2630% 27.72 ± 7% sched_debug.cfs_rq:/.runnable_load_avg.avg
63060 ± 45% 776% 552157 sched_debug.cfs_rq:/.load.max
14442 ± 21% 590% 99615 ± 14% sched_debug.cfs_rq:/.load.stddev
8397 ± 9% 309% 34370 ± 12% sched_debug.cfs_rq:/.load.avg
46.02 ± 24% 176% 126.96 ± 6% sched_debug.cfs_rq:/.util_avg.stddev
817 19% 974 ± 3% sched_debug.cfs_rq:/.util_avg.max
721 -17% 600 ± 3% sched_debug.cfs_rq:/.util_avg.avg
595 ± 11% -38% 371 ± 7% sched_debug.cfs_rq:/.util_avg.min
1484 ± 20% -47% 792 ± 5% sched_debug.cfs_rq:/.load_avg.min
1798 ± 4% -50% 903 ± 5% sched_debug.cfs_rq:/.load_avg.avg
322 ± 8% 7726% 25239 ± 8% sched_debug.cpu.nr_switches.min
969 7238% 71158 sched_debug.cpu.nr_switches.avg
2.23 ± 40% 4650% 106.14 ± 4% sched_debug.cpu.cpu_load[0].stddev
943 ± 4% 3475% 33730 ± 3% sched_debug.cpu.nr_switches.stddev
0.87 ± 25% 3057% 27.46 ± 7% sched_debug.cpu.cpu_load[0].avg
5.43 ± 13% 2232% 126.61 sched_debug.cpu.nr_uninterruptible.stddev
6131 ± 3% 2028% 130453 sched_debug.cpu.nr_switches.max
1.58 ± 29% 1852% 30.90 ± 4% sched_debug.cpu.cpu_load[4].avg
2.00 ± 49% 1422% 30.44 ± 5% sched_debug.cpu.cpu_load[3].avg
63060 ± 45% 1053% 726920 ± 32% sched_debug.cpu.load.max
21.25 ± 44% 777% 186.33 ± 7% sched_debug.cpu.nr_uninterruptible.max
14419 ± 21% 731% 119865 ± 31% sched_debug.cpu.load.stddev
3586 381% 17262 sched_debug.cpu.nr_load_updates.min
8286 ± 8% 364% 38414 ± 17% sched_debug.cpu.load.avg
5444 303% 21956 sched_debug.cpu.nr_load_updates.avg
1156 231% 3827 sched_debug.cpu.nr_load_updates.stddev
8603 ± 4% 222% 27662 sched_debug.cpu.nr_load_updates.max
1410 165% 3735 sched_debug.cpu.curr->pid.max
28742 ± 15% 120% 63101 ± 7% sched_debug.cpu.clock.min
28742 ± 15% 120% 63101 ± 7% sched_debug.cpu.clock_task.min
28748 ± 15% 120% 63107 ± 7% sched_debug.cpu.clock.avg
28748 ± 15% 120% 63107 ± 7% sched_debug.cpu.clock_task.avg
28751 ± 15% 120% 63113 ± 7% sched_debug.cpu.clock.max
28751 ± 15% 120% 63113 ± 7% sched_debug.cpu.clock_task.max
442 ± 11% 93% 854 ± 15% sched_debug.cpu.curr->pid.avg
618 ± 3% 72% 1065 ± 4% sched_debug.cpu.curr->pid.stddev
1.88 ± 11% 50% 2.83 ± 8% sched_debug.cpu.clock.stddev
1.88 ± 11% 50% 2.83 ± 8% sched_debug.cpu.clock_task.stddev
5.22 ± 9% -55% 2.34 ± 23% sched_debug.rt_rq:/.rt_time.max
0.85 -55% 0.38 ± 28% sched_debug.rt_rq:/.rt_time.stddev
0.17 -56% 0.07 ± 33% sched_debug.rt_rq:/.rt_time.avg
27633 ± 16% 124% 61980 ± 8% sched_debug.ktime
28745 ± 15% 120% 63102 ± 7% sched_debug.sched_clk
28745 ± 15% 120% 63102 ± 7% sched_debug.cpu_clk
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-11 4:46 ` Dave Chinner
@ 2016-08-15 17:22 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-15 17:22 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Huang, Ying, LKML, Bob Peterson, Wu Fengguang,
LKP, Christoph Hellwig
Hi, Chinner,
Dave Chinner <david@fromorbit.com> writes:
> On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
>> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
>> >
>> > Here it is,
>>
>> Thanks.
>>
>> Appended is a munged "after" list, with the "before" values in
>> parenthesis. It actually looks fairly similar.
>>
>> The biggest difference is that we have "mark_page_accessed()" show up
>> after, and not before. There was also a lot of LRU noise in the
>> non-profile data. I wonder if that is the reason here: the old model
>> of using generic_perform_write/block_page_mkwrite didn't mark the
>> pages accessed, and now with iomap_file_buffered_write() they get
>> marked as active and that screws up the LRU list, and makes us not
>> flush out the dirty pages well (because they are seen as active and
>> not good for writeback), and then you get bad memory use.
>>
>> I'm not seeing anything that looks like locking-related.
>
> Not in that profile. I've been doing some local testing inside a
> 4-node fake-numa 16p/16GB RAM VM to see what I can find.
You ran the test in a virtual machine; I think that is why your perf
data looks strange (the high value of _raw_spin_unlock_irqrestore).
To set up KVM to use perf, you may refer to:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Monitoring_Tools-vPMU.html
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-perf-mon.html
I haven't tested them myself. You may Google for more information, or
the perf/kvm people can give you more details.
> I'm yet to work out how I can trigger a profile like the one that
> was reported (I really need to see the event traces), but in the
> mean time I found this....
>
> Doing a large sequential single threaded buffered write using a 4k
> buffer (so single page per syscall to make the XFS IO path allocator
> behave the same way as in 4.7), I'm seeing a CPU profile that
> indicates we have a potential mapping->tree_lock issue:
>
> # xfs_io -f -c "truncate 0" -c "pwrite 0 47g" /mnt/scratch/fooey
> wrote 50465865728/50465865728 bytes at offset 0
> 47.000 GiB, 12320768 ops; 0:01:36.00 (499.418 MiB/sec and 127850.9132 ops/sec)
>
> ....
>
> 24.15% [kernel] [k] _raw_spin_unlock_irqrestore
> 9.67% [kernel] [k] copy_user_generic_string
> 5.64% [kernel] [k] _raw_spin_unlock_irq
> 3.34% [kernel] [k] get_page_from_freelist
> 2.57% [kernel] [k] mark_page_accessed
> 2.45% [kernel] [k] do_raw_spin_lock
> 1.83% [kernel] [k] shrink_page_list
> 1.70% [kernel] [k] free_hot_cold_page
> 1.26% [kernel] [k] xfs_do_writepage
Best Regards,
Huang, Ying
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-15 17:22 ` Huang, Ying
0 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-15 17:22 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 2890 bytes --]
Hi, Chinner,
Dave Chinner <david@fromorbit.com> writes:
> On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
>> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
>> >
>> > Here it is,
>>
>> Thanks.
>>
>> Appended is a munged "after" list, with the "before" values in
>> parenthesis. It actually looks fairly similar.
>>
>> The biggest difference is that we have "mark_page_accessed()" show up
>> after, and not before. There was also a lot of LRU noise in the
>> non-profile data. I wonder if that is the reason here: the old model
>> of using generic_perform_write/block_page_mkwrite didn't mark the
>> pages accessed, and now with iomap_file_buffered_write() they get
>> marked as active and that screws up the LRU list, and makes us not
>> flush out the dirty pages well (because they are seen as active and
>> not good for writeback), and then you get bad memory use.
>>
>> I'm not seeing anything that looks like locking-related.
>
> Not in that profile. I've been doing some local testing inside a
> 4-node fake-numa 16p/16GB RAM VM to see what I can find.
You run the test in a virtual machine, I think that is why your perf
data looks strange (high value of _raw_spin_unlock_irqrestore).
To setup KVM to use perf, you may refer to,
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Monitoring_Tools-vPMU.html
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-perf-mon.html
I haven't tested them. You may Google to find more information. Or the
perf/kvm people can give you more information.
> I'm yet to work out how I can trigger a profile like the one that
> was reported (I really need to see the event traces), but in the
> mean time I found this....
>
> Doing a large sequential single threaded buffered write using a 4k
> buffer (so single page per syscall to make the XFS IO path allocator
> behave the same way as in 4.7), I'm seeing a CPU profile that
> indicates we have a potential mapping->tree_lock issue:
>
> # xfs_io -f -c "truncate 0" -c "pwrite 0 47g" /mnt/scratch/fooey
> wrote 50465865728/50465865728 bytes at offset 0
> 47.000 GiB, 12320768 ops; 0:01:36.00 (499.418 MiB/sec and 127850.9132 ops/sec)
>
> ....
>
> 24.15% [kernel] [k] _raw_spin_unlock_irqrestore
> 9.67% [kernel] [k] copy_user_generic_string
> 5.64% [kernel] [k] _raw_spin_unlock_irq
> 3.34% [kernel] [k] get_page_from_freelist
> 2.57% [kernel] [k] mark_page_accessed
> 2.45% [kernel] [k] do_raw_spin_lock
> 1.83% [kernel] [k] shrink_page_list
> 1.70% [kernel] [k] free_hot_cold_page
> 1.26% [kernel] [k] xfs_do_writepage
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 16:17 ` Christoph Hellwig
@ 2016-08-15 20:30 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-15 20:30 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Fengguang Wu, LKP, Dave Chinner, LKML, Bob Peterson, Linus Torvalds
Christoph Hellwig <hch@lst.de> writes:
> Snipping the long context:
>
> I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
> (2) lots of additional spinlock contention in the new case. A quick
> check shows that I fat-fingered my rewrite so that we do
> the xfs_inode_set_eofblocks_tag call now for the pure lookup
> case, and pretty much all new cycles come from that.
> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
> we're already doing way too many even without my little bug above.
>
> So I've force pushed a new version of the iomap-fixes branch with
> (2) fixed, and also a little patch to xfs_inode_set_eofblocks_tag a
> lot less expensive slotted in before that. Would be good to see
> the numbers with that.
For the originally reported regression, the test result is as follows:
=========================================================================================
compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
commit:
f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506 (parent of first bad commit)
68a9f5e7007c1afa2cf6830b690a90d0187c0684 (first bad commit)
99091700659f4df965e138b38b4fa26a29b7eade (base of your fixes branch)
bf4dc6e4ecc2a3d042029319bc8cd4204c185610 (head of your fixes branch)
f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 99091700659f4df965e138b38b bf4dc6e4ecc2a3d042029319bc
---------------- -------------------------- -------------------------- --------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
484435 ± 0% -13.3% 420004 ± 0% -17.0% 402250 ± 0% -15.6% 408998 ± 0% aim7.jobs-per-min
And the perf data is as follows:
"perf-profile.func.cycles-pp.intel_idle": 20.25,
"perf-profile.func.cycles-pp.memset_erms": 11.72,
"perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 8.37,
"perf-profile.func.cycles-pp.__block_commit_write.isra.21": 3.49,
"perf-profile.func.cycles-pp.block_write_end": 1.77,
"perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 1.63,
"perf-profile.func.cycles-pp.unlock_page": 1.58,
"perf-profile.func.cycles-pp.___might_sleep": 1.56,
"perf-profile.func.cycles-pp.__block_write_begin_int": 1.33,
"perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 1.23,
"perf-profile.func.cycles-pp.up_write": 1.21,
"perf-profile.func.cycles-pp.__mark_inode_dirty": 1.18,
"perf-profile.func.cycles-pp.down_write": 1.06,
"perf-profile.func.cycles-pp.mark_buffer_dirty": 0.94,
"perf-profile.func.cycles-pp.generic_write_end": 0.92,
"perf-profile.func.cycles-pp.__radix_tree_lookup": 0.91,
"perf-profile.func.cycles-pp._raw_spin_lock": 0.81,
"perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 0.79,
"perf-profile.func.cycles-pp.__might_sleep": 0.79,
"perf-profile.func.cycles-pp.xfs_file_iomap_begin_delay.isra.9": 0.7,
"perf-profile.func.cycles-pp.__list_del_entry": 0.7,
"perf-profile.func.cycles-pp.vfs_write": 0.69,
"perf-profile.func.cycles-pp.drop_buffers": 0.68,
"perf-profile.func.cycles-pp.xfs_file_write_iter": 0.67,
"perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.67,
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 14:14 ` Fengguang Wu
@ 2016-08-15 21:22 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 21:22 UTC (permalink / raw)
To: Fengguang Wu
Cc: Christoph Hellwig, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
On Mon, Aug 15, 2016 at 10:14:55PM +0800, Fengguang Wu wrote:
> Hi Christoph,
>
> On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
> >Snipping the long context:
> >
> >I think there are three observations here:
> >
> >(1) removing the mark_page_accessed (which is the only significant
> > change in the parent commit) hurts the
> > aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> > I'd still rather stick to the filemap version and let the
> > VM people sort it out. How do the numbers for this test
> > look for XFS vs say ext4 and btrfs?
> >(2) lots of additional spinlock contention in the new case. A quick
> > check shows that I fat-fingered my rewrite so that we do
> > the xfs_inode_set_eofblocks_tag call now for the pure lookup
> > case, and pretty much all new cycles come from that.
> >(3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
> > we're already doing way too many even without my little bug above.
> >
> >So I've force pushed a new version of the iomap-fixes branch with
> >(2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
> >lot less expensive slotted in before that. Would be good to see
> >the numbers with that.
>
> The aim7 1BRD tests finished and there are ups and downs, with overall
> performance remaining flat.
>
> 99091700659f4df9 74a242ad94d13436a1644c0b45 bf4dc6e4ecc2a3d042029319bc testcase/testparams/testbox
> ---------------- -------------------------- -------------------------- ---------------------------
What do these commits refer to, please? They mean nothing without
the commit names....
/me goes searching. Ok:
99091700659 is the top of Linus' tree
74a242ad94d is ????
bf4dc6e4ecc is the latest in Christoph's tree (because it's
mentioned below)
> %stddev %change %stddev %change %stddev
> \ | \ |
> \ 159926 157324 158574
> GEO-MEAN aim7.jobs-per-min
> 70897 5% 74137 4% 73775 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
> 485217 ± 3% 492431 477533 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
> 360451 -19% 292980 -17% 299377 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
So, why does random read go backwards by 20%? The iomap IO path
patches we are testing only affect the write path, so this
doesn't make a whole lot of sense.
> 338114 338410 5% 354078 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
> 60130 ± 5% 4% 62438 5% 62923 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
> 403144 397790 410648 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
And this is the test the original regression was reported for:
gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
And that shows no improvement at all. The original regression was:
484435 ± 0% -13.3% 420004 ± 0% aim7.jobs-per-min
So it's still 15% down on the original performance, which, again,
doesn't make a whole lot of sense given the improvement in so many
other tests I've run....
> 26327 26534 26128 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
>
> The new commit bf4dc6e ("xfs: rewrite and optimize the delalloc write
> path") improves the aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
> case by 5%. Here are the detailed numbers:
>
> aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
Not important at all. We need the results for the disk_wrt regression
we are chasing (disk_wrt-3000) so we can see how the code change
affected behaviour.
> Here are the detailed numbers for the slowed down case:
>
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
>
> 99091700659f4df9 bf4dc6e4ecc2a3d042029319bc
> ---------------- --------------------------
> %stddev change %stddev
> \ | \
> 360451 -17% 299377 aim7.jobs-per-min
> 12806 481% 74447 aim7.time.involuntary_context_switches
.....
> 19377 459% 108364 vmstat.system.cs
.....
> 487 ± 89% 3e+04 26448 ± 57% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
> 1823 ± 82% 2e+06 1913796 ± 38% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
> 208475 ± 43% 1e+06 1409494 ± 5% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
> 6884 ± 73% 8e+04 90790 ± 9% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
> 1598 ± 20% 3e+04 35015 ± 27% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
> 2006 ± 25% 3e+04 31143 ± 35% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
> 29 ±101% 1e+04 10214 ± 29% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
> 1206 ± 51% 9e+03 9919 ± 25% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
Significant increase in blocking delays in the journal during atime
updates. There's nothing in Christoph's tree that would affect that
behaviour. This smells like either a mount option change, or
individual tests not being 100% isolated, with the previous test run
affecting this one?
-Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 5:12 ` Linus Torvalds
@ 2016-08-15 22:22 ` Dave Chinner
0 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 22:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote:
> On Aug 14, 2016 10:00 PM, "Dave Chinner" <david@fromorbit.com> wrote:
> >
> > > What does it say if you annotate that _raw_spin_unlock_irqrestore()
> function?
> > ....
> > ¿
> > ¿ Disassembly of section load0:
> > ¿
> > ¿ ffffffff81e628b0 <load0>:
> > ¿ nop
> > ¿ push %rbp
> > ¿ mov %rsp,%rbp
> > ¿ movb $0x0,(%rdi)
> > ¿ nop
> > ¿ mov %rsi,%rdi
> > ¿ push %rdi
> > ¿ popfq
> > 99.35 ¿ nop
>
> Yeah, that's a good disassembly of a non-debug spin unlock, and the symbols
> are fine, but the profile is not valid. That's an interrupt point, right
> after the popf that enables interrupts again.
>
> I don't know why 'perf' isn't working on your machine, but it clearly
> isn't.
>
> Has it ever worked on that machine?
It's working the same as it's worked since I started using it many
years ago.
> What cpu is it?
Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz
> Are you running in some
> virtualized environment without performance counters, perhaps?
I've mentioned a couple of times in this thread that I'm testing
inside a VM. It's the same VM I've been running performance tests in
since early 2010. Nobody has complained that the profiles I've
posted are useless before, and not once in all that time have they
been wrong in indicating a spinning lock contention point.
i.e. In previous cases where I've measured double digit CPU usage
numbers in a spin_unlock variant, it's always been a result of
spinlock contention. And fixing the algorithmic problem that led to
the spinlock showing up in the profile in the first place has always
substantially improved performance and scalability.
As such, I'm always going to treat a locking profile like that as
contention because even if it isn't contending *on my machine*,
that amount of work being done under a spinning lock is /way too
much/ and it *will* cause contention problems with larger machines.
> It's not actually the unlock that is expensive, and there is no contention
> on the lock (if there had been, the numbers would have been entirely
> different for the debug case, which makes locking an order of magnitude
> more expensive). All the cost of everything that happened while interrupts
> were disabled is just accounted to the instruction after they were enabled
> again.
Right, but that does not make the profile data useless, nor should
you shoot the messenger because the message wasn't supplied with the
information you think should have been in it. The message
still says that the majority of the overhead is in
__remove_mapping(), and it's an excessive amount of work being done
inside the tree_lock with interrupts disabled....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 22:22 ` Dave Chinner
@ 2016-08-15 22:42 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-15 22:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Tue, Aug 16, 2016 at 08:22:11AM +1000, Dave Chinner wrote:
> On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote:
> > On Aug 14, 2016 10:00 PM, "Dave Chinner" <david@fromorbit.com> wrote:
> > >
> > > > What does it say if you annotate that _raw_spin_unlock_irqrestore()
> > function?
> > > ....
> > > ¿
> > > ¿ Disassembly of section load0:
> > > ¿
> > > ¿ ffffffff81e628b0 <load0>:
> > > ¿ nop
> > > ¿ push %rbp
> > > ¿ mov %rsp,%rbp
> > > ¿ movb $0x0,(%rdi)
> > > ¿ nop
> > > ¿ mov %rsi,%rdi
> > > ¿ push %rdi
> > > ¿ popfq
> > > 99.35 ¿ nop
> >
> > Yeah, that's a good disassembly of a non-debug spin unlock, and the symbols
> > are fine, but the profile is not valid. That's an interrupt point, right
> > after the popf that enables interiors again.
> >
> > I don't know why 'perf' isn't working on your machine, but it clearly
> > isn't.
> >
> > Has it ever worked on that machine?
>
> It's working the same as it's worked since I started using it many
> years ago.
>
> > What cpu is it?
>
> Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz
>
> > Are you running in some
> > virtualized environment without performance counters, perhaps?
>
> I've mentioned a couple of times in this thread that I'm testing
> inside a VM. It's the same VM I've been running performance tests in
> since early 2010. Nobody has complained that the profiles I've
> posted are useless before, and not once in all that time have they
> been wrong in indicating a spinning lock contention point.
>
> i.e. In previous cases where I've measured double digit CPU usage
> numbers in a spin_unlock variant, it's always been a result of
> spinlock contention. And fixing the algorithmic problem that led to
> the spinlock showing up in the profile in the first place has always
> substantially improved performance and scalability.
>
> As such, I'm always going to treat a locking profile like that as
> contention because even if it isn't contending *on my machine*,
> that amount of work being done under a spinning lock is /way too
> much/ and it *will* cause contention problems with larger machines.
And, so, after helpfully being pointed at the magic kvm "-cpu host"
flag to enable access to the performance counters from the guest
(using "-e cycles", because more precise counters aren't available),
the profile looks like this:
31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
9.90% [kernel] [k] copy_user_generic_string
3.65% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
2.62% [kernel] [k] __block_commit_write.isra.29
2.26% [kernel] [k] _raw_spin_lock_irqsave
1.72% [kernel] [k] _raw_spin_lock
1.33% [kernel] [k] __wake_up_bit
1.20% [kernel] [k] __radix_tree_lookup
1.19% [kernel] [k] __remove_mapping
1.12% [kernel] [k] __delete_from_page_cache
0.97% [kernel] [k] xfs_do_writepage
0.91% [kernel] [k] get_page_from_freelist
0.90% [kernel] [k] up_write
0.88% [kernel] [k] clear_page_dirty_for_io
0.83% [kernel] [k] radix_tree_tag_set
0.81% [kernel] [k] radix_tree_tag_clear
0.80% [kernel] [k] down_write
0.78% [kernel] [k] _raw_spin_unlock_irqrestore
0.77% [kernel] [k] shrink_page_list
0.76% [kernel] [k] ___might_sleep
0.76% [kernel] [k] unlock_page
0.74% [kernel] [k] __list_del_entry
0.67% [kernel] [k] __add_to_page_cache_locked
0.65% [kernel] [k] node_dirty_ok
0.61% [kernel] [k] __rmqueue
0.61% [kernel] [k] __block_write_begin_int
0.61% [kernel] [k] cancel_dirty_page
0.61% [kernel] [k] __test_set_page_writeback
0.59% [kernel] [k] page_mapping
0.57% [kernel] [k] __list_add
0.56% [kernel] [k] free_pcppages_bulk
0.54% [kernel] [k] _raw_spin_lock_irq
0.54% [kernel] [k] generic_write_end
0.51% [kernel] [k] drop_buffers
The call graph should be familiar by now:
36.60% 0.00% [kernel] [k] kswapd
- 30.29% kswapd
- 30.23% shrink_node
- 30.07% shrink_node_memcg.isra.75
- 30.15% shrink_inactive_list
- 29.49% shrink_page_list
- 22.79% __remove_mapping
- 22.27% _raw_spin_lock_irqsave
__pv_queued_spin_lock_slowpath
+ 1.86% __delete_from_page_cache
+ 1.27% _raw_spin_unlock_irqrestore
+ 4.31% try_to_release_page
+ 1.21% free_hot_cold_page_list
0.56% page_evictable
0.77% isolate_lru_pages.isra.72
That sure looks like spin lock contention to me....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 22:22 ` Dave Chinner
@ 2016-08-15 23:01 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 23:01 UTC (permalink / raw)
To: Dave Chinner
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 3:22 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Right, but that does not make the profile data useless,
Yes it does. Because it basically hides everything that happens inside
the lock, which is what causes the contention in the first place.
So stop making inane and stupid arguments, Dave.
Your profiles are shit. Deal with it, or accept that nobody is ever
going to bother working on them because your profiles don't give
useful information.
I see that you actually fixed your profiles, but quite frankly, the
amount of pure unadulterated crap you posted in this email is worth
reacting negatively to.
You generally make so much sense that it's shocking to see you then
make these crazy excuses for your completely broken profiles.
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 22:42 ` Dave Chinner
@ 2016-08-15 23:20 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 23:20 UTC (permalink / raw)
To: Dave Chinner
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 3:42 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
> 9.90% [kernel] [k] copy_user_generic_string
> 3.65% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
> 2.62% [kernel] [k] __block_commit_write.isra.29
> 2.26% [kernel] [k] _raw_spin_lock_irqsave
> 1.72% [kernel] [k] _raw_spin_lock
Ok, this is more like it.
I'd still like to see it on raw hardware, just to see if we may have a
bug in the PV code. Because that code has been buggy before. I
*thought* we fixed it, but ...
In fact, you don't even need to do it outside of virtualization, but
with paravirt disabled (so that it runs the native non-pv locking in
the virtual machine).
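A minimal sketch of that experiment, assuming a 4.x-era kernel tree (the config option name should be checked against the tree actually being built):

```shell
# Rebuild the guest kernel with PV spinlocks compiled out, so the VM
# exercises the native queued-spinlock code instead of the PV slowpath.
cd linux
scripts/config --disable PARAVIRT_SPINLOCKS
make olddefconfig
make -j"$(nproc)"
```

Booting the guest on this kernel keeps the virtualized environment identical while taking the paravirt locking code out of the picture.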
> 36.60% 0.00% [kernel] [k] kswapd
> - 30.29% kswapd
> - 30.23% shrink_node
> - 30.07% shrink_node_memcg.isra.75
> - 30.15% shrink_inactive_list
> - 29.49% shrink_page_list
> - 22.79% __remove_mapping
> - 22.27% _raw_spin_lock_irqsave
> __pv_queued_spin_lock_slowpath
How I dislike the way perf shows the call graph data... Just last week
I was talking to Arnaldo about how to better visualize the cost of
spinlocks, because the normal way "perf" shows costs is so nasty.
What happens is that you see that 36% of CPU time is attributed to
kswapd, and then you can drill down and see where that 36% comes from.
So far so good, and that's what perf does fairly well.
But then when you find the spinlock, you actually want to go the other
way, and instead ask it to show "who were the callers to this routine
and what were the percentages", so that you can then see whether (for
example) it's just that __remove_mapping() use that contends with
itself, or whether it's contending with the page additions or
whatever..
And perf makes that unnecessarily much too hard to see.
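For what it's worth, the caller-ordered view being asked for can be approximated from the same perf.data; the exact flag spellings vary between perf releases, so treat the following as a sketch to check against `perf help report`:

```shell
# Record system-wide with call graphs, as in the profiles above.
perf record -g -e cycles -a -- sleep 10

# Default callee-ordered view: drill down from kswapd to the spinlock.
perf report

# Inverted (caller-ordered) view: start at __pv_queued_spin_lock_slowpath
# and fan out to its callers with their percentages.
perf report --inverted
# or spell the call-graph order out explicitly:
perf report -g graph,0.5,caller
```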
So what I'd like to see (and this is where it becomes *so* much more
useful to be able to recreate it myself so that I can play with the
perf data several different ways) is to see what the profile looks
like in that spinlocked region.
Hmm. I guess you could just send me the "perf.data" and "vmlinux"
files, and I can look at it that way. But I'll try to see what happens
on my profile, even if I can't recreate the contention itself, just
trying to see what happens inside of that region.
None of this code is all that new, which is annoying. This must have
gone on forever,
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:20 ` Linus Torvalds
@ 2016-08-15 23:48 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-15 23:48 UTC (permalink / raw)
To: Dave Chinner, Mel Gorman, Johannes Weiner, Vlastimil Babka,
Andrew Morton
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> None of this code is all that new, which is annoying. This must have
> gone on forever,
... ooh.
Wait, I take that back.
We actually have some very recent changes that I didn't even think
about that went into this very merge window.
In particular, I wonder if it's all (or at least partly) due to the
new per-node LRU lists.
So in shrink_page_list(), when kswapd is encountering a page that is
under page writeback due to page reclaim, it does:
        if (current_is_kswapd() &&
            PageReclaim(page) &&
            test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
                nr_immediate++;
                goto keep_locked;
        }
which basically ignores that page and puts it back on the LRU list.
But that "is this node under writeback" is new - it now does that per
node, and it *used* to do it per zone (so it _used_ to test "is this
zone under writeback").
All the mapping pages used to be in the same zone, so I think it
effectively single-threaded the kswapd reclaim for one mapping under
reclaim writeback. But in your case, you have multiple nodes...
Ok, that's a lot of hand-wavy new-age crystal healing thinking.
Really, I haven't looked at it more than "this is one thing that has
changed recently, I wonder if it changes the patterns and could
explain much higher spin_lock contention on the mapping->tree_lock".
I'm adding Mel Gorman and his band of miscreants to the cc, so that
they can tell me that I'm full of shit, and completely missed on what
that zone->node change actually ends up meaning.
Mel? The issue is that Dave Chinner is seeing some nasty spinlock
contention on "mapping->tree_lock":
> 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
and one of the main paths is this:
> - 30.29% kswapd
> - 30.23% shrink_node
> - 30.07% shrink_node_memcg.isra.75
> - 30.15% shrink_inactive_list
> - 29.49% shrink_page_list
> - 22.79% __remove_mapping
> - 22.27% _raw_spin_lock_irqsave
> __pv_queued_spin_lock_slowpath
so there's something ridiculously bad going on with a fairly simple benchmark.
Dave's benchmark is literally just a "write a new 48GB file in
single-page chunks on a 4-node machine". Nothing odd - not rewriting
files, not seeking around, no nothing.
You can probably recreate it with a silly
dd bs=4096 count=$((12*1024*1024)) if=/dev/zero of=bigfile
although Dave actually had something rather fancier, I think.
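As a sanity check on that arithmetic (the numbers below are just the count and block size from the dd line above):

```shell
# 12Mi blocks of 4096 bytes each is exactly 48 GiB.
bytes=$((12 * 1024 * 1024 * 4096))
echo "$bytes"                                # 51539607552
echo "$((bytes / 1024 / 1024 / 1024)) GiB"   # 48 GiB
```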
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 17:22 ` Huang, Ying
@ 2016-08-16 0:08 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 0:08 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, LKML, Bob Peterson, Wu Fengguang, LKP, Christoph Hellwig
On Mon, Aug 15, 2016 at 10:22:43AM -0700, Huang, Ying wrote:
> Hi, Chinner,
>
> Dave Chinner <david@fromorbit.com> writes:
>
> > On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> >> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
> >> >
> >> > Here it is,
> >>
> >> Thanks.
> >>
> >> Appended is a munged "after" list, with the "before" values in
> >> parenthesis. It actually looks fairly similar.
> >>
> >> The biggest difference is that we have "mark_page_accessed()" show up
> >> after, and not before. There was also a lot of LRU noise in the
> >> non-profile data. I wonder if that is the reason here: the old model
> >> of using generic_perform_write/block_page_mkwrite didn't mark the
> >> pages accessed, and now with iomap_file_buffered_write() they get
> >> marked as active and that screws up the LRU list, and makes us not
> >> flush out the dirty pages well (because they are seen as active and
> >> not good for writeback), and then you get bad memory use.
> >>
> >> I'm not seeing anything that looks like locking-related.
> >
> > Not in that profile. I've been doing some local testing inside a
> > 4-node fake-numa 16p/16GB RAM VM to see what I can find.
>
> You run the test in a virtual machine, I think that is why your perf
> data looks strange (high value of _raw_spin_unlock_irqrestore).
>
> To setup KVM to use perf, you may refer to,
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Monitoring_Tools-vPMU.html
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-perf-mon.html
>
> I haven't tested them. You may Google to find more information. Or the
> perf/kvm people can give you more information.
Thanks, "-cpu host" on the qemu command line works. I hate magic,
undocumented(*) features that are necessary to make basic stuff work.
-Dave.
(*) yeah, try working out from the qemu/kvm man page that this
capability even exists.
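For reference, the shape of a working setup looks roughly like this (the qemu command line is illustrative, not the actual test rig):

```shell
# Pass the host CPU model, including its PMU, through to the guest.
qemu-system-x86_64 -enable-kvm -cpu host -smp 16 -m 16G \
    -drive file=test.img,if=virtio

# Inside the guest, plain cycle counting then works:
perf record -g -e cycles -a -- sleep 30
```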
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:20 ` Linus Torvalds
@ 2016-08-16 0:15 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 0:15 UTC (permalink / raw)
To: Dave Chinner, Mel Gorman, Johannes Weiner, Vlastimil Babka
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But I'll try to see what happens
> on my profile, even if I can't recreate the contention itself, just
> trying to see what happens inside of that region.
Yeah, since I run my machines on encrypted disks, my profile shows 60%
kthread, but that's just because 55% is crypto.
I only have 5% in kswapd. And the spinlock doesn't even show up for me
(but "__delete_from_page_cache()" does, which doesn't look
unreasonable).
And while the biggest reason the spinlock doesn't show up is likely
simply my single-node "everything is on one die", I still think the
lower kswapd CPU use might be partly due to the node-vs-zone thing.
For me, with just one node, the new
test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
ends up being very similar to what we used to have before, ie
test_bit(ZONE_WRITEBACK, &zone->flags)) {
but on a multi-node machine it would be rather different.
So I might never see contention anyway.
The basic logic in shrink_page_list() goes back to commit 283aba9f9e0
("mm: vmscan: block kswapd if it is encountering pages under
writeback") but it has been messed around with a lot (and something
else existed there before - we've always had some "throttle kswapd so
that it doesn't use insane amounts of CPU time").
DaveC - does the spinlock contention go away if you just go back to
4.7? If so, I think it's the new zone thing. But it would be good to
verify - maybe it's something entirely different and it goes back much
further.
Mel - I may be barking up entirely the wrong tree, but it would be
good if you could take a look just in case this is actually it.
Linus
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:01 ` Linus Torvalds
@ 2016-08-16 0:17 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 0:17 UTC (permalink / raw)
To: Linus Torvalds
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 04:01:00PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 3:22 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Right, but that does not make the profile data useless,
>
> Yes it does. Because it basically hides everything that happens inside
> the lock, which is what causes the contention in the first place.
Read the code, Linus?
> So stop making inane and stupid arguments, Dave.
We know what happens inside the lock, and we know exactly how much
it is supposed to cost. And it isn't anywhere near as much as the
profiles indicate the function that contains the lock is costing.
Occam's Razor leads to only one conclusion, like it or not....
> Your profiles are shit. Deal with it, or accept that nobody is ever
> going to bother working on them because your profiles don't give
> useful information.
>
> I see that you actually fixed your profiles, but quite frankly, the
> amount of pure unadulterated crap you posted in this email is worth
> reacting negatively to.
I'm happy to be told that I'm wrong *when I'm wrong*, but you always
say "read the code to understand a problem" rather than depending on
potentially unreliable tools and debug information that is gathered.
Yet when I do that using partial profile information, your reaction
is to tell me I am "full of shit" because my information isn't 100%
reliable? Really, Linus?
> You generally make so much sense that it's shocking to see you then
> make these crazy excuses for your completely broken profiles.
Except they *aren't broken*. They are simply *less accurate* than
they could be. That does not invalidate the profile nor does it mean
that the insight it gives us into the functioning of the code is
wrong.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:20 ` Linus Torvalds
@ 2016-08-16 0:19 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 0:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 04:20:55PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 3:42 PM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
> > 9.90% [kernel] [k] copy_user_generic_string
> > 3.65% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
> > 2.62% [kernel] [k] __block_commit_write.isra.29
> > 2.26% [kernel] [k] _raw_spin_lock_irqsave
> > 1.72% [kernel] [k] _raw_spin_lock
>
> Ok, this is more like it.
>
> I'd still like to see it on raw hardware, just to see if we may have a
> bug in the PV code. Because that code has been buggy before. I
> *thought* we fixed it, but ...
>
> In fact, you don't even need to do it outside of virtualization, but
> with paravirt disabled (so that it runs the native non-pv locking in
> the virtual machine).
>
> >   36.60%  0.00%  [kernel]  [k] kswapd
> >    - 30.29% kswapd
> >       - 30.23% shrink_node
> >          - 30.07% shrink_node_memcg.isra.75
> >             - 30.15% shrink_inactive_list
> >                - 29.49% shrink_page_list
> >                   - 22.79% __remove_mapping
> >                      - 22.27% _raw_spin_lock_irqsave
> >                           __pv_queued_spin_lock_slowpath
>
> How I dislike the way perf shows the call graph data... Just last week
> I was talking to Arnaldo about how to better visualize the cost of
> spinlocks, because the normal way "perf" shows costs is so nasty.
Do not change it - it's the way call graph profiles have been
presented for the past 20 years. I hate it when long-standing
conventions are changed because one person doesn't like them and
everyone else has to relearn skills they haven't had to think about
for years....
> What happens is that you see that 36% of CPU time is attributed to
> kswapd, and then you can drill down and see where that 36% comes from.
> So far so good, and that's what perf does fairly well.
>
> But then when you find the spinlock, you actually want to go the other
> way, and instead ask it to show "who were the callers to this routine
> and what were the percentages", so that you can then see whether (for
> example) it's just that __remove_mapping() use that contends with
> itself, or whether it's contending with the page additions or
> whatever..
Um, perf already does that:
-   31.55%  31.55%  [kernel]  [k] __pv_queued_spin_lock_slowpath
   - 19.83% ret_from_fork
      - kthread
         - 18.55% kswapd
              shrink_node
              shrink_node_memcg.isra.75
              shrink_inactive_list
           1.76% worker_thread
              process_one_work
              wb_workfn
              wb_writeback
              __writeback_inodes_wb
              writeback_sb_inodes
              __writeback_single_inode
              do_writepages
              xfs_vm_writepages
              write_cache_pages
              xfs_do_writepage
   + 5.95% __libc_pwrite
I have that right here because *it's a view of the profile I've
already looked at*. I didn't post it because, well, it's shorter to
simply say "contention is from kswapd".
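For reference, the two orderings being discussed map directly onto perf's call-graph options. A sketch of the invocations (per perf's documentation; exact syntax varies somewhat between perf versions):

```shell
# Record system-wide with call graphs while the workload runs.
perf record -a -g -- sleep 30

# Callee-rooted view (the default being described): start at kswapd
# and drill down towards the contended spinlock.
perf report -g graph,0.5,callee

# Caller-rooted view (what Dave shows above): invert the graph so the
# hot symbol is at the top and its callers are broken out by percentage.
perf report -g graph,0.5,caller
```

On some versions `perf report -G` (`--inverted`) is a shorthand for the caller-based ordering.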
> So what I'd like to see (and this is where it becomes *so* much more
> useful to be able to recreate it myself so that I can play with the
> perf data several different ways) is to see what the profile looks
> like in that spinlocked region.
Boot your machine with "fake_numa=4", and play to your heart's
content. That's all I do with my test VMs to make them exercise NUMA
paths.
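As a sketch of that VM setup (Dave writes "fake_numa=4"; the mainline x86 boot parameter is spelled `numa=fake=4` and requires CONFIG_NUMA_EMU — the qemu invocation, kernel image, and disk paths below are illustrative):

```shell
# Guest with 16 vCPUs / 16G RAM; the appended boot parameter makes the
# guest kernel split its flat memory into 4 emulated NUMA nodes.
qemu-system-x86_64 -enable-kvm -smp 16 -m 16G \
    -kernel ./vmlinuz -initrd ./initrd.img \
    -append "root=/dev/vda console=ttyS0 numa=fake=4" \
    -drive file=root.img,if=virtio -drive file=scratch.img,if=virtio \
    -nographic

# Inside the guest, confirm the emulated topology:
numactl --hardware
```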
> None of this code is all that new, which is annoying. This must have
> gone on forever,
Yes, it has been. Just worse than I've noticed before, probably
because of all the stuff put under the tree lock in the past couple
of years.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 0:15 ` Linus Torvalds
@ 2016-08-16 0:38 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 0:38 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 05:15:47PM -0700, Linus Torvalds wrote:
> DaveC - does the spinlock contention go away if you just go back to
> 4.7? If so, I think it's the new zone thing. But it would be good to
> verify - maybe it's something entirely different and it goes back much
> further.
Same in 4.7 (flat profile numbers climbed higher after this
snapshot was taken, as can be seen by the callgraph numbers):
29.47% [kernel] [k] __pv_queued_spin_lock_slowpath
11.59% [kernel] [k] copy_user_generic_string
3.13% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
2.87% [kernel] [k] __block_commit_write.isra.29
2.02% [kernel] [k] _raw_spin_lock_irqsave
1.77% [kernel] [k] get_page_from_freelist
1.36% [kernel] [k] __wake_up_bit
1.31% [kernel] [k] __radix_tree_lookup
1.22% [kernel] [k] radix_tree_tag_set
1.16% [kernel] [k] clear_page_dirty_for_io
1.14% [kernel] [k] __remove_mapping
1.14% [kernel] [k] _raw_spin_lock
1.00% [kernel] [k] zone_dirty_ok
0.95% [kernel] [k] radix_tree_tag_clear
0.90% [kernel] [k] generic_write_end
0.89% [kernel] [k] __delete_from_page_cache
0.87% [kernel] [k] unlock_page
0.86% [kernel] [k] cancel_dirty_page
0.81% [kernel] [k] up_write
0.80% [kernel] [k] ___might_sleep
0.77% [kernel] [k] _raw_spin_unlock_irqrestore
0.75% [kernel] [k] generic_perform_write
0.72% [kernel] [k] xfs_do_writepage
0.69% [kernel] [k] down_write
0.63% [kernel] [k] shrink_page_list
0.63% [kernel] [k] __xfs_get_blocks
0.61% [kernel] [k] __test_set_page_writeback
0.59% [kernel] [k] free_hot_cold_page
0.57% [kernel] [k] write_cache_pages
0.56% [kernel] [k] __radix_tree_create
0.55% [kernel] [k] __list_add
0.53% [kernel] [k] page_mapping
0.53% [kernel] [k] drop_buffers
0.51% [kernel] [k] xfs_vm_releasepage
0.51% [kernel] [k] free_pcppages_bulk
0.50% [kernel] [k] __list_del_entry
   38.07%  38.07%  [kernel]  [k] __pv_queued_spin_lock_slowpath
   - 25.52% ret_from_fork
      - kthread
         - 24.36% kswapd
              shrink_zone
              shrink_zone_memcg.isra.73
              shrink_inactive_list
         - 3.21% worker_thread
              process_one_work
              wb_workfn
              wb_writeback
              __writeback_inodes_wb
              writeback_sb_inodes
              __writeback_single_inode
              do_writepages
              xfs_vm_writepages
              write_cache_pages
   - 10.06% __libc_pwrite
        entry_SYSCALL_64_fastpath
        sys_pwrite64
        vfs_write
        __vfs_write
        xfs_file_write_iter
        xfs_file_buffered_aio_write
      - generic_perform_write
         - 5.51% xfs_vm_write_begin
            - 4.94% grab_cache_page_write_begin
                 pagecache_get_page
              0.57% __block_write_begin
                 create_page_buffers
                 create_empty_buffers
                 _raw_spin_lock
                 __pv_queued_spin_lock_slowpath
         - 4.88% xfs_vm_write_end
              generic_write_end
              block_write_end
              __block_commit_write.isra.29
              mark_buffer_dirty
              __set_page_dirty
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:48 ` Linus Torvalds
@ 2016-08-16 0:44 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 0:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Andrew Morton,
Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 04:48:36PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > None of this code is all that new, which is annoying. This must have
> > gone on forever,
>
> ... ooh.
>
> Wait, I take that back.
>
> We actually have some very recent changes that I didn't even think
> about that went into this very merge window.
....
> Mel? The issue is that Dave Chinner is seeing some nasty spinlock
> contention on "mapping->tree_lock":
>
> > 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
>
> and one of the main paths is this:
>
> > - 30.29% kswapd
> >    - 30.23% shrink_node
> >       - 30.07% shrink_node_memcg.isra.75
> >          - 30.15% shrink_inactive_list
> >             - 29.49% shrink_page_list
> >                - 22.79% __remove_mapping
> >                   - 22.27% _raw_spin_lock_irqsave
> >                        __pv_queued_spin_lock_slowpath
>
> so there's something ridiculously bad going on with a fairly simple benchmark.
>
> Dave's benchmark is literally just a "write a new 48GB file in
> single-page chunks on a 4-node machine". Nothing odd - not rewriting
> files, not seeking around, no nothing.
>
> You can probably recreate it with a silly
>
> dd bs=4096 count=$((12*1024*1024)) if=/dev/zero of=bigfile
>
> although Dave actually had something rather fancier, I think.
16p, 16GB RAM, fake_numa=4. Overwrite a 47GB file on a 48GB
filesystem:
# mkfs.xfs -f -d size=48g /dev/vdc
# mount /dev/vdc /mnt/scratch
# xfs_io -f -c "pwrite 0 47g" /mnt/scratch/fooey
Wait for memory to fill and reclaim to kick in, then look at the
profile. If you run it a second time, reclaim kicks in straight
away.
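Dave's recipe, gathered into one script (devices, sizes, and paths are from the message; the perf capture around the second run is an added suggestion, not part of his original steps):

```shell
#!/bin/sh
# Trigger mapping->tree_lock contention: buffered overwrite of a 47G
# file on a 48G XFS filesystem, so writeback and reclaim run together.
set -e

mkfs.xfs -f -d size=48g /dev/vdc
mount /dev/vdc /mnt/scratch

# First pass fills the page cache; reclaim kicks in near the end.
xfs_io -f -c "pwrite 0 47g" /mnt/scratch/fooey

# Second pass hits reclaim straight away -- profile this one.
perf record -a -g -- xfs_io -f -c "pwrite 0 47g" /mnt/scratch/fooey
perf report
```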
It's not the new code in 4.8 - it reproduces on 4.7 just fine, and
probably will reproduce all the way back to when the memcg-aware
writeback code was added....
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 0:17 ` Dave Chinner
@ 2016-08-16 0:45 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 0:45 UTC (permalink / raw)
To: Dave Chinner
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 5:17 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Read the code, Linus?
I am. It's how I came up with my current pet theory.
But I don't actually have enough sane numbers to make it much more
than a cute pet theory. It *might* explain why you see tons of kswap
time and bad lock contention where it didn't use to exist, but ..
I can't recreate the problem, and your old profiles were bad enough
that they aren't really worth looking at.
> Except they *aren't broken*. They are simply *less accurate* than
> they could be.
They are so much less accurate that quite frankly, there's no point in
looking at them outside of "there is contention on the lock".
And considering that the numbers didn't even change when you had
spinlock debugging on, it's not the lock itself that causes this, I'm
pretty sure.
Because when you have normal contention due to the *locking* itself
being the problem, it tends to absolutely _explode_ with the debugging
spinlocks, because the lock itself becomes much more expensive.
Usually super-linearly.
But that wasn't the case here. The numbers stayed constant.
So yeah, I started looking at bigger behavioral issues, which is why I
zeroed in on that zone-vs-node change. But it might be a completely
broken theory. For example, if you still have the contention when
running plain 4.7, that theory was clearly complete BS.
And this is where "less accurate" means that they are almost entirely useless.
More detail needed. It might not be in the profiles themselves, of
course. There might be other much more informative sources if you can
come up with anything...
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 0:38 ` Dave Chinner
@ 2016-08-16 0:50 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 0:50 UTC (permalink / raw)
To: Dave Chinner
Cc: Mel Gorman, Johannes Weiner, Vlastimil Babka, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 5:38 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Same in 4.7 (flat profile numbers climbed higher after this
> snapshot was taken, as can be seen by the callgraph numbers):
Ok, so it's not the zone-vs-node thing. It's just that nobody has
looked at that load in recent times.
Where "recent" may be years, of course.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 0:19 ` Dave Chinner
@ 2016-08-16 1:51 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 1:51 UTC (permalink / raw)
To: Dave Chinner
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 5:19 PM, Dave Chinner <david@fromorbit.com> wrote:
>
>> None of this code is all that new, which is annoying. This must have
>> gone on forever,
>
> Yes, it has been. Just worse than I've noticed before, probably
> because of all the stuff put under the tree lock in the past couple
> of years.
So this is where a good profile can matter.
Particularly if it's all about kswapd, and all the contention is just
from __remove_mapping(), what should matter is the "all the stuff"
added *there* and absolutely nowhere else.
Sadly (well, not for me), in my profiles I have
--3.37%--kswapd
   |
   --3.36%--shrink_node
      |
      |--2.88%--shrink_node_memcg
      |    |
      |    --2.87%--shrink_inactive_list
      |       |
      |       |--2.55%--shrink_page_list
      |       |    |
      |       |    |--0.84%--__remove_mapping
      |       |    |    |
      |       |    |    |--0.37%--__delete_from_page_cache
      |       |    |    |    |
      |       |    |    |    --0.21%--radix_tree_replace_clear_tags
      |       |    |    |         |
      |       |    |    |         --0.12%--__radix_tree_lookup
      |       |    |    |
      |       |    |    --0.23%--_raw_spin_lock_irqsave
      |       |    |         |
      |       |    |         --0.11%--queued_spin_lock_slowpath
      |       |    |
      ................
which is rather different from your 22% spin-lock overhead.
Anyway, including the direct reclaim call paths gets
__remove_mapping() a bit higher, and _raw_spin_lock_irqsave climbs to
0.26%. But perhaps more importantly, looking at what __remove_mapping
actually *does* (apart from the spinlock) gives us:
- inside remove_mapping itself (0.11% on its own - flat cost, no
child accounting)
48.50 │ lock cmpxchg %edx,0x1c(%rbx)
so that's about 0.05%
- 0.40% __delete_from_page_cache (0.22%
radix_tree_replace_clear_tags, 0.13%__radix_tree_lookup)
- 0.06% workingset_eviction()
so I'm not actually seeing anything *new* expensive in there. The
__delete_from_page_cache() overhead may have changed a bit with the
tagged tree changes, but this doesn't look like memcg.
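The per-instruction numbers above (e.g. the 48.50% `lock cmpxchg` line) come from perf's annotate view; roughly:

```shell
# Instruction-level breakdown of a single symbol; percentages are of
# the samples that landed inside that symbol. Needs vmlinux with
# debug info for source/asm interleaving.
perf annotate --stdio __remove_mapping

# Or interactively: run "perf report", highlight the symbol, press 'a'.
```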
But we clearly have very different situations.
What does your profile show when you actually dig into
__remove_mapping() itself? Looking at your flat profile, I'm assuming
you get
1.31% [kernel] [k] __radix_tree_lookup
1.22% [kernel] [k] radix_tree_tag_set
1.14% [kernel] [k] __remove_mapping
which is higher (but part of why my percentages are lower is that I
have that "50% CPU used for encryption" on my machine).
But I'm not seeing anything I'd attribute to "all the stuff added".
For example, originally I would have blamed memcg, but that's not
actually in this path at all.
I come back to wondering whether maybe you're hitting some PV-lock problem.
I know queued_spin_lock_slowpath() is ok. I'm not entirely sure
__pv_queued_spin_lock_slowpath() is.
So I'd love to see you try the non-PV case, but I also think it might
be interesting to see what the instruction profile for
__pv_queued_spin_lock_slowpath() itself is. They share a lot of code
(there's some interesting #include games going on to make
queued_spin_lock_slowpath() actually *be*
__pv_queued_spin_lock_slowpath() with some magic hooks), but there
might be issues.
For example, if you run a virtual 16-core system on a physical machine
that then doesn't consistently give 16 cores to the virtual machine,
you'll get no end of hiccups.
Because as mentioned, we've had bugs ("performance anomalies") there before.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 21:22 ` Dave Chinner
@ 2016-08-16 12:20 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-16 12:20 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson, LKP
On Tue, Aug 16, 2016 at 07:22:40AM +1000, Dave Chinner wrote:
>On Mon, Aug 15, 2016 at 10:14:55PM +0800, Fengguang Wu wrote:
>> Hi Christoph,
>>
>> On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>> >Snipping the long contest:
>> >
>> >I think there are three observations here:
>> >
>> >(1) removing the mark_page_accessed (which is the only significant
>> > change in the parent commit) hurts the
>> > aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
>> > I'd still rather stick to the filemap version and let the
>> > VM people sort it out. How do the numbers for this test
>> > look for XFS vs say ext4 and btrfs?
>> >(2) lots of additional spinlock contention in the new case. A quick
>> > check shows that I fat-fingered my rewrite so that we do
>> > the xfs_inode_set_eofblocks_tag call now for the pure lookup
>> > case, and pretty much all new cycles come from that.
>> >(3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>> > we're already doing way too many even without my little bug above.
>> >
>> >So I've force pushed a new version of the iomap-fixes branch with
>> >(2) fixed, and also a little patch to xfs_inode_set_eofblocks_tag a
>> >lot less expensive slotted in before that. Would be good to see
>> >the numbers with that.
>>
>> The aim7 1BRD tests finished and there are ups and downs, with overall
>> performance remain flat.
>>
>> 99091700659f4df9 74a242ad94d13436a1644c0b45 bf4dc6e4ecc2a3d042029319bc testcase/testparams/testbox
>> ---------------- -------------------------- -------------------------- ---------------------------
>
>What do these commits refer to, please? They mean nothing without
>the commit names....
>
>/me goes searching. Ok:
>
>99091700659 is the top of Linus' tree
>74a242ad94d is ????
That's the parent commit of the one below, 74a242ad94d ("xfs: make
xfs_inode_set_eofblocks_tag cheaper for the common case").
Typically we'll compare a commit with its parent commit and/or the
branch's base commit, which is normally on the mainline kernel.
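The %change columns in the tables throughout this thread boil down to a
relative delta against the baseline commit's value. A minimal sketch of that
arithmetic (the helper name is hypothetical, not from the lkp-tests code):

```python
def pct_change(base, new):
    """Relative change of `new` versus the baseline value, in percent."""
    return (new - base) / base * 100.0

# The originally reported regression: aim7.jobs-per-min on the parent
# commit f0c6bcba74 vs. the iomap commit 68a9f5e700.
print(f"{pct_change(486586, 420342):+.1f}%")  # -13.6%, matching the report
```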
>bf4dc6e4ecc is the latest in Christoph's tree (because it's
> mentioned below)
>
>> %stddev %change %stddev %change %stddev
>> \ | \ |
>> \ 159926 157324 158574
>> GEO-MEAN aim7.jobs-per-min
>> 70897 5% 74137 4% 73775 aim7/1BRD_48G-xfs-creat-clo-1500-performance/ivb44
>> 485217 ± 3% 492431 477533 aim7/1BRD_48G-xfs-disk_rd-9000-performance/ivb44
>> 360451 -19% 292980 -17% 299377 aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
>
>So, why does random read go backwards by 20%? The iomap IO path
>patches we are testing only affect the write path, so this
>doesn't make a whole lot of sense.
>
>> 338114 338410 5% 354078 aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
>> 60130 ± 5% 4% 62438 5% 62923 aim7/1BRD_48G-xfs-disk_src-3000-performance/ivb44
>> 403144 397790 410648 aim7/1BRD_48G-xfs-disk_wrt-3000-performance/ivb44
>
>And this is the test the original regression was reported for:
>
>gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
>
>And that shows no improvement at all. The original regression was:
>
> 484435 ± 0% -13.3% 420004 ± 0% aim7.jobs-per-min
>
>So it's still 15% down on the original performance which, again,
>doesn't make a whole lot of sense given the improvement in so many
>other tests I've run....
Yes, the performance matching 4.8-rc1 means the regression still has
not recovered relative to the originally reported first-bad commit's
parent f0c6bcba74ac51cb ("xfs: reorder zeroing and flushing sequence
in truncate"), which is on 4.7-rc1.
>> 26327 26534 26128 aim7/1BRD_48G-xfs-sync_disk_rw-600-performance/ivb44
>>
>> The new commit bf4dc6e ("xfs: rewrite and optimize the delalloc write
>> path") improves the aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
>> case by 5%. Here are the detailed numbers:
>>
>> aim7/1BRD_48G-xfs-disk_rw-3000-performance/ivb44
>
>Not important at all. We need the results for the disk_wrt regression
>we are chasing (disk_wrt-3000) so we can see how the code change
>affected behaviour.
Yeah, it may not be relevant to this case study; however, it should
help evaluate the patch in a more complete way.
>> Here are the detailed numbers for the slowed down case:
>>
>> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44
>>
>> 99091700659f4df9 bf4dc6e4ecc2a3d042029319bc
>> ---------------- --------------------------
>> %stddev change %stddev
>> \ | \
>> 360451 -17% 299377 aim7.jobs-per-min
>> 12806 481% 74447 aim7.time.involuntary_context_switches
>.....
>> 19377 459% 108364 vmstat.system.cs
>.....
>> 487 ± 89% 3e+04 26448 ± 57% latency_stats.max.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
>> 1823 ± 82% 2e+06 1913796 ± 38% latency_stats.sum.down.xfs_buf_lock._xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_read_agf.xfs_alloc_read_agf.xfs_alloc_fix_freelist.xfs_free_extent_fix_freelist.xfs_free_extent.xfs_trans_free_extent
>> 208475 ± 43% 1e+06 1409494 ± 5% latency_stats.sum.wait_on_page_bit.truncate_inode_pages_range.truncate_inode_pages_final.evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput.task_work_run.exit_to_usermode_loop
>> 6884 ± 73% 8e+04 90790 ± 9% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.file_update_time.xfs_file_aio_write_checks.xfs_file_buffered_aio_write.xfs_file_write_iter.__vfs_write.vfs_write.SyS_write
>> 1598 ± 20% 3e+04 35015 ± 27% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_free_eofblocks.xfs_release.xfs_file_release.__fput.____fput.task_work_run
>> 2006 ± 25% 3e+04 31143 ± 35% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode.evict.iput
>> 29 ±101% 1e+04 10214 ± 29% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.__xfs_trans_roll.xfs_trans_roll.xfs_defer_trans_roll.xfs_defer_finish.xfs_itruncate_extents.xfs_inactive_truncate.xfs_inactive.xfs_fs_destroy_inode.destroy_inode
>> 1206 ± 51% 9e+03 9919 ± 25% latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.__xfs_trans_commit.xfs_trans_commit.xfs_vn_update_time.touch_atime.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.SyS_read
>
>Significant increase in blocking delays in the journal during atime
>updates. There's nothing in Christoph's tree that would affect that
>behaviour. This smells like either a mount option change or
>individual tests not being 100% isolated and the previous test run
>is affecting this one?
We kexec-reboot the machines between tests to ensure zero influence
from the previous test. The test jobs are queued in a batch and are
not likely to change mount options etc. in between (just confirmed).
The kernels are built by a random build server, and some builds will
reuse previous .o files (no distclean). To make sure, I rebuilt the
kernels on the same build server with distclean; however, the new
tests still show the same numbers.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-14 16:17 ` Christoph Hellwig
@ 2016-08-16 13:25 ` Fengguang Wu
-1 siblings, 0 replies; 219+ messages in thread
From: Fengguang Wu @ 2016-08-16 13:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Chinner, Ye Xiaolong, Linus Torvalds, LKML, Bob Peterson,
LKP, linux-fsdevel
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>Snipping the long contest:
>
>I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd still rather stick to the filemap version and let the
> VM people sort it out. How do the numbers for this test
> look for XFS vs say ext4 and btrfs?
Here is a basic comparison of the 3 filesystems, based on commit 99091700
("Merge tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs"):
% compare -a -g 99091700659f4df965e138b38b4fa26a29b7eade -d fs xfs ext4 btrfs
xfs ext4 btrfs testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev change %stddev change %stddev
\ | \ | \
193335 -27% 141400 -100% 8 GEO-MEAN aim7.jobs-per-min
267649 ± 3% -51% 130085 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
485217 ± 3% 402% 2434088 ± 3% 350% 2184471 ± 4% aim7/1BRD_48G-disk_rd-9000-performance/ivb44
360286 -64% 130351 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
338114 -78% 73280 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
60130 ± 5% 361% 277035 aim7/1BRD_48G-disk_src-3000-performance/ivb44
403144 -68% 127584 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
26327 -60% 10571 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
2652 -96% 118 -82% 468 GEO-MEAN fsmark.files_per_sec
393 ± 4% -6% 368 ± 3% 10% 433 ± 5% fsmark/1x-1t-1BRD_48G-4M-40G-NoSync-performance/ivb44
200 -4% 191 -7% 185 ± 6% fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
1583 ± 3% -29% 1130 -31% 1088 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
21363 59% 33958 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
11033 -17% 9117 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
11833 12% 13234 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
2381976 -100% 6598 -96% 100973 GEO-MEAN fsmark.app_overhead
564520 ± 7% 21% 681192 ± 3% 63% 919364 ± 3% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
860074 ± 5% 112% 1820590 ± 14% 47% 1262443 ± 3% fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
12232633 -18% 10085199 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
3143334 -11% 2784178 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
4107347 -21% 3248210 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
Thanks,
Fengguang
---
Some less important numbers.
xfs ext4 btrfs
---------------- -------------------------- --------------------------
1314 222% 4225 -100% 2 GEO-MEAN aim7.time.system_time
1491 ± 6% 302% 6004 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
4786 ± 3% -89% 502 ± 7% -87% 632 ± 7% aim7/1BRD_48G-disk_rd-9000-performance/ivb44
756 689% 5971 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
891 1146% 11108 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
940 ± 5% 70% 1598 aim7/1BRD_48G-disk_src-3000-performance/ivb44
599 925% 6148 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
2496 390% 12225 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
154597 185% 440025 GEO-MEAN aim7.time.minor_page_faults
156203 144% 381038 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
155952 132% 362294 ± 3% aim7/1BRD_48G-disk_rr-3000-performance/ivb44
157550 266% 577044 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
153880 152% 387212 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
149531 ± 5% 258% 534809 ± 4% aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
86.94 37% 119.10 -98% 1.59 GEO-MEAN aim7.time.elapsed_time
67.56 ± 3% 105% 138.61 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
112.15 ± 3% -80% 22.93 ± 3% -77% 25.48 ± 4% aim7/1BRD_48G-disk_rd-9000-performance/ivb44
50.19 176% 138.32 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
53.46 360% 245.91 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
300.63 ± 5% -78% 65.30 aim7/1BRD_48G-disk_src-3000-performance/ivb44
44.88 215% 141.35 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
136.82 149% 340.59 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
22.55 27% 28.54 GEO-MEAN aim7.time.user_time
18.74 46% 27.27 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
28.71 28% 36.88 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
29.59 38% 40.74 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
41.42 ± 4% -50% 20.90 aim7/1BRD_48G-disk_src-3000-performance/ivb44
10.93 61% 17.61 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
18.26 96% 35.85 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
1171009 -44% 660859 -100% 4 GEO-MEAN aim7.time.voluntary_context_switches
325355 -7% 303228 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
58321 ± 8% -48% 30407 -44% 32487 ± 3% aim7/1BRD_48G-disk_rd-9000-performance/ivb44
437880 -37% 275709 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
395047 31% 518201 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
31067301 -93% 2034955 ± 5% aim7/1BRD_48G-disk_src-3000-performance/ivb44
506749 -38% 315597 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
58429810 11% 65070475 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
41445 658% 314065 -100% 4 GEO-MEAN aim7.time.involuntary_context_switches
21627 ± 5% 3118% 695989 aim7/1BRD_48G-disk_cp-3000-performance/ivb44
594383 ± 5% -96% 25928 ± 7% -94% 38082 ± 11% aim7/1BRD_48G-disk_rd-9000-performance/ivb44
12980 5128% 678629 aim7/1BRD_48G-disk_rr-3000-performance/ivb44
14729 10572% 1571856 aim7/1BRD_48G-disk_rw-3000-performance/ivb44
5894 ± 3% 249% 20595 ± 8% aim7/1BRD_48G-disk_src-3000-performance/ivb44
8950 7842% 710852 aim7/1BRD_48G-disk_wrt-3000-performance/ivb44
1620035 -34% 1069492 aim7/1BRD_48G-sync_disk_rw-600-performance/ivb44
xfs ext4 btrfs
---------------- -------------------------- --------------------------
132117 -100% 216 -99% 990 GEO-MEAN fsmark.time.involuntary_context_switches
14651 -99% 86 ± 3% -99% 120 ± 9% fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
553 413% 2840 ± 3% 8078% 45278 ± 31% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
19895 ± 3% -33% 13242 487% 116776 ± 3% fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
6206551 -99% 31906 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
1992225 -100% 1236 ± 5% fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
2664982 -100% 1202 ± 10% fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
1755485 -100% 3248 -93% 120902 GEO-MEAN fsmark.time.voluntary_context_switches
542900 -98% 10270 -98% 10291 fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
36634 ± 6% -17% 30317 3893% 1462650 ± 9% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
162300 ± 15% -55% 72808 89% 306208 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
58499255 -11% 51789210 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
10792062 168% 28969245 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
14361647 63% 23390858 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
391 -98% 7 -88% 47 GEO-MEAN fsmark.time.elapsed_time
591 -37% 371 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
285 21% 345 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
354 -11% 317 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
294.98 -88% 36.23 -51% 145.31 GEO-MEAN fsmark.time.system_time
45.16 ± 5% 157% 116.15 820% 415.73 ± 7% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
262.79 ± 4% 29% 338.49 21% 316.76 ± 4% fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
1419.35 12% 1587.76 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
320.95 124% 719.99 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
413.11 65% 683.20 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
222 -68% 70 -27% 161 GEO-MEAN fsmark.time.percent_of_cpu_this_job_got
91 ± 4% 9% 99 7% 97 fsmark/1x-1t-1BRD_48G-4M-40G-NoSync-performance/ivb44
95 3% 98 3% 98 fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
214 ± 3% 152% 540 807% 1940 ± 9% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
3938 -7% 3668 -17% 3286 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
253 75% 443 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
118 80% 213 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
123 80% 221 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
15893 -96% 649 46% 23186 GEO-MEAN fsmark.time.minor_page_faults
10532 34% 14137 125% 23697 ± 5% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
17053 14% 19388 9% 18557 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
22352 ± 34% 27% 28346 ± 28% fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
30857819 -100% 5089 405% 1.559e+08 GEO-MEAN fsmark.time.file_system_outputs
6400000 ± 50% 25% 8000000 2051% 1.377e+08 fsmark/1x-1t-1BRD_32G-4K-4G-fsyncBeforeClose-1fpd-performance/ivb43
83886080 83886080 85633962 fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
50331648 352% 2.277e+08 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
33554432 555% 2.199e+08 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
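The GEO-MEAN rows in the tables above summarize each metric across testcases
with a geometric mean, the conventional way to aggregate throughput numbers
of very different magnitudes so no single test dominates. A minimal sketch of
that computation (my own helper, not the lkp-tests implementation):

```python
import math

def geomean(values):
    """Geometric mean: the n-th root of the product of the values,
    computed in log space to avoid overflow on large throughputs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geomean([2, 8]))        # 4.0
print(geomean([1, 10, 100]))  # 10.0 (up to floating-point rounding)
```

One caveat worth noting: when a testcase has no result for one filesystem,
the set being averaged shrinks, so the GEO-MEAN rows can swing dramatically;
that is likely why several rows above show extremes such as -100%.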
^ permalink raw reply [flat|nested] 219+ messages in thread
95 3% 98 3% 98 fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
214 ± 3% 152% 540 807% 1940 ± 9% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
3938 -7% 3668 -17% 3286 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
253 75% 443 fsmark/8-1SSD-16-9B-48G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
118 80% 213 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
123 80% 221 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
15893 -96% 649 46% 23186 GEO-MEAN fsmark.time.minor_page_faults
10532 34% 14137 125% 23697 ± 5% fsmark/1x-64t-1BRD_48G-4M-40G-NoSync-performance/ivb44
17053 14% 19388 9% 18557 fsmark/1x-64t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
22352 ± 34% 27% 28346 ± 28% fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
xfs ext4 btrfs
---------------- -------------------------- --------------------------
30857819 -100% 5089 405% 1.559e+08 GEO-MEAN fsmark.time.file_system_outputs
6400000 ± 50% 25% 8000000 2051% 1.377e+08 fsmark/1x-1t-1BRD_32G-4K-4G-fsyncBeforeClose-1fpd-performance/ivb43
83886080 83886080 85633962 fsmark/1x-1t-1BRD_48G-4M-40G-fsyncBeforeClose-performance/ivb44
50331648 352% 2.277e+08 fsmark/8-1SSD-4-8K-24G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
33554432 555% 2.199e+08 fsmark/8-1SSD-4-9B-16G-fsyncBeforeClose-16d-256fpd-performance/lkp-hsw-ep4
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 23:48 ` Linus Torvalds
@ 2016-08-16 15:05 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-16 15:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Johannes Weiner, Vlastimil Babka, Andrew Morton,
Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 04:48:36PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > None of this code is all that new, which is annoying. This must have
> > gone on forever,
>
> ... ooh.
>
> Wait, I take that back.
>
> We actually have some very recent changes that I didn't even think
> about that went into this very merge window.
>
> In particular, I wonder if it's all (or at least partly) due to the
> new per-node LRU lists.
>
> So in shrink_page_list(), when kswapd is encountering a page that is
> under page writeback due to page reclaim, it does:
>
> if (current_is_kswapd() &&
> PageReclaim(page) &&
> test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
> nr_immediate++;
> goto keep_locked;
>
I have a limited view of the full topic as I've been in meetings all day
and have another 3 hours to go. I'll set time aside tomorrow to look closer
but there is a theory at the end of the mail.
Node-lru does alter what locks are contended and affects the timing of some
issues but this spot feels like a bad fit. That logic controls whether kswapd
will stall due to dirty/writeback pages reaching the tail of the LRU too
quickly. It can affect lru_lock contention that may be worse with node-lru,
particularly on single-node machines but a workload of a streaming writer
is unlikely to hit that unless the underlying storage is extremely slow.
Another alteration of node-lru potentially affects when buffer heads get
stripped, but that's also a poor fit.
I'm not willing to rule out node-lru because it may be wishful thinking
but it feels unlikely.
> which basically ignores that page and puts it back on the LRU list.
>
> But that "is this node under writeback" is new - it now does that per
> node, and it *used* to do it per zone (so it _used_ to test "is this
> zone under writeback").
>
Superficially, a small high zone would affect the timing of when a zone
got marked congested and triggered a sleep. Sleeping avoids new pages being
allocated/dirtied and may reduce contention. However, quick sleeps due to
small zones were offset by the fair zone allocation policy and are still
offset by GFP_WRITE distributing dirty pages on different zones. The
timing of when sleeps occur due to excessive dirty pages at the tail of
the LRU should be roughly similar with either zone-lru or node-lru.
> All the mapping pages used to be in the same zone, so I think it
> effectively single-threaded the kswapd reclaim for one mapping under
> reclaim writeback. But in your cases, you have multiple nodes...
>
> Ok, that's a lot of hand-wavy new-age crystal healing thinking.
>
> Really, I haven't looked at it more than "this is one thing that has
> changed recently, I wonder if it changes the patterns and could
> explain much higher spin_lock contention on the mapping->tree_lock".
>
> I'm adding Mel Gorman and his band of miscreants to the cc, so that
> they can tell me that I'm full of shit, and completely missed on what
> that zone->node change actually ends up meaning.
>
> Mel? The issue is that Dave Chinner is seeing some nasty spinlock
> contention on "mapping->tree_lock":
>
Band Of Miscreants may be the new name for the MM track at LSF/MM. In the
meantime let's try some hand-waving:
A single-threaded file write on a 4-node system is going to have 4 kswapd
instances, writeback and potentially the writer itself all reclaiming.
Given the workload, it's likely that almost all pages have the same
mapping. As they are contending on __remove_mapping, the pages must be
clean when the attempt to reclaim was made and buffers stripped.
The throttling mechanisms for kswapd and direct reclaim rely on either
too many pages being isolated (unlikely to fire in this case) or too many
dirty/writeback pages reaching the end of the LRU. There is not a direct
throttling mechanism for excessive lock contention.
However, historically there have been multiple indirect throttling mechanisms
that were branded as congestion control but basically said "I don't know
what's going on so it's nap time". Many of these have been removed over
time and the last major one was ede37713737 ("mm: throttle on IO only when
there are too many dirty and writeback pages").
Before that commit, a process that entered direct reclaim and failed to make
progress would sleep before retrying. It's possible that sleep was enough
to reduce contention by temporarily stalling the writer and letting reclaim
make progress. After that commit, it may only do a cond_resched() check
and go back to allocating/reclaiming as quickly as possible. This active
writer may be enough to increase contention. If so, it also means it
stops kswapd making forward progress, leading to more direct reclaim and
more contention.
It's not a perfect theory and assumes:
1. The writer is direct reclaiming
2. The writer was previously failing to __remove_mapping
3. The writer calling congestion_wait due to __remove_mapping failing
was enough to allow kswapd or writeback to make enough progress to
avoid contention
4. The writer staying awake allocating and dirtying pages is keeping all
the kswapd instances awake and writeback continually active and
increasing the contention overall.
If it was possible to trigger this problem in 4.7 then it would also be
worth checking 4.6. If 4.6 is immune, check that before and after commit
ede37713737.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 15:05 ` Mel Gorman
@ 2016-08-16 17:47 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 17:47 UTC (permalink / raw)
To: Mel Gorman, Michal Hocko, Minchan Kim, Vladimir Davydov
Cc: Dave Chinner, Johannes Weiner, Vlastimil Babka, Andrew Morton,
Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
Mel,
thanks for taking a look. Your theory sounds more complete than mine,
and since Dave is able to see the problem with 4.7, it would be nice
to hear about the 4.6 behavior and commit ede37713737 in particular.
That one seems more likely to affect contention than the zone/node one
I found during the merge window anyway, since it actually removes a
sleep in kswapd during congestion.
I've always preferred to see direct reclaim as the primary model for
reclaim, partly in order to throttle the actual "bad" process, but
also because "kswapd uses lots of CPU time" is such a nasty thing to
even begin guessing about.
So I have to admit to liking that "make kswapd sleep a bit if it's
just looping" logic that got removed in that commit.
And looking at DaveC's numbers, it really feels like it's not even
what we do inside the locked region that is the problem. Sure,
__delete_from_page_cache() (which is most of it) is at 1.86% of CPU
time (when including all the things it calls), but that still isn't
all that much. Especially when compared to just:
0.78% [kernel] [k] _raw_spin_unlock_irqrestore
from his flat profile. That's not some spinning wait, that's just
releasing the lock with a single write (and the popf, but while that's
an expensive instruction, it's just tens of cpu cycles).
So I'm more and more getting the feeling that it's not what we do
inside the lock that is problematic. I started out blaming memcg
accounting or something, but none of the numbers seem to back that up.
So it's primarily really just the fact that kswapd is simply hammering
on that lock way too much.
So yeah, I'm blaming kswapd itself doing something wrong. It's not a
problem in a single-node environment (since there's only one), but
with multiple nodes it clearly just devolves.
Yes, we could try to batch the locking like DaveC already suggested
(ie we could move the locking to the caller, and then make
shrink_page_list() just try to keep the lock held for a few pages if
the mapping doesn't change), and that might result in fewer crazy
cacheline ping-pongs overall. But that feels like exactly the wrong
kind of workaround.
I'd much rather re-instate some "if kswapd is just spinning on CPU
time and not actually improving IO parallelism, kswapd should just get
the hell out" logic.
Adding Michal Hocko to the participant list too, I think he's one of
the gang in this area. Who else should be made aware of this thread?
Minchan? Vladimir?
[ I'm assuming the new people can look up this thread on lkml. Note to
new people: the subject line (and about 75% of the posts) are about an
unrelated AIM7 regression, but there's this sub-thread about nasty
lock contention on mapping->tree_lock within that bigger context ]
Linus
On Tue, Aug 16, 2016 at 8:05 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> However, historically there have been multiple indirect throttling mechanism
> that were branded as congestion control but basically said "I don't know
> what's going on so it's nap time". Many of these have been removed over
> time and the last major one was ede37713737 ("mm: throttle on IO only when
> there are too many dirty and writeback pages").
>
> Before that commit, a process that entered direct reclaim and failed to make
> progress would sleep before retrying. It's possible that sleep was enough
> to reduce contention by temporarily stalling the writer and letting reclaim
> make progress. After that commit, it may only do a cond_resched() check
> and go back to allocating/reclaiming as quickly as possible. This active
> writer may be enough to increase contention. If so, it also means it
> stops kswapd making forward progress, leading to more direct reclaim and
> more contention.
>
> It's not a perfect theory and assumes;
>
> 1. The writer is direct reclaiming
> 2. The writer was previously failing to __remove_mapping
> 3. The writer calling congestion_wait due to __remove_mapping failing
> was enough to allow kswapd or writeback to make enough progress to
> avoid contention
> 4. The writer staying awake allocating and dirtying pages is keeping all
> the kswapd instances awake and writeback continually active and
> increasing the contention overall.
>
> If it was possible to trigger this problem in 4.7 then it would also be
> worth checking 4.6. If 4.6 is immune, check that before and after commit
> ede37713737.
>
> --
> Mel Gorman
> SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 1:51 ` Linus Torvalds
@ 2016-08-16 22:02 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-16 22:02 UTC (permalink / raw)
To: Linus Torvalds
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Mon, Aug 15, 2016 at 06:51:42PM -0700, Linus Torvalds wrote:
> Anyway, including the direct reclaim call paths gets
> __remove_mapping() a bit higher, and _raw_spin_lock_irqsave climbs to
> 0.26%. But perhaps more importantly, looking at what __remove_mapping
> actually *does* (apart from the spinlock) gives us:
>
> - inside remove_mapping itself (0.11% on its own - flat cost, no
> child accounting)
>
> 48.50 │ lock cmpxchg %edx,0x1c(%rbx)
>
> so that's about 0.05%
>
> - 0.40% __delete_from_page_cache (0.22%
> radix_tree_replace_clear_tags, 0.13%__radix_tree_lookup)
>
> - 0.06% workingset_eviction()
>
> so I'm not actually seeing anything *new* expensive in there. The
> __delete_from_page_cache() overhead may have changed a bit with the
> tagged tree changes, but this doesn't look like memcg.
>
> But we clearly have very different situations.
>
> What does your profile show for when you actually dig into
> __remove_mapping() itself?, Looking at your flat profile, I'm assuming
> you get
- 22.26% 0.93% [kernel] [k] __remove_mapping
- 3.86% __remove_mapping
- 18.35% _raw_spin_lock_irqsave
__pv_queued_spin_lock_slowpath
1.32% __delete_from_page_cache
- 0.92% _raw_spin_unlock_irqrestore
__raw_callee_save___pv_queued_spin_unlock
And the instruction level profile:
.....
│ xor %ecx,%ecx
│ mov %rax,%r15
0.39 │ mov $0x2,%eax
│ lock cmpxchg %ecx,0x1c(%rbx)
32.56 │ cmp $0x2,%eax
│ ↓ jne 12e
│ mov 0x20(%rbx),%rax
│ lea -0x1(%rax),%rdx
0.39 │ test $0x1,%al
│ cmove %rbx,%rdx
│ mov (%rdx),%rax
0.39 │ test $0x10,%al
│ ↓ jne 127
│ mov (%rbx),%rcx
│ shr $0xf,%rcx
│ and $0x1,%ecx
│ ↓ jne 14a
│ mov 0x68(%r14),%rax
36.03 │ xor %esi,%esi
│ test %r13b,%r13b
│ mov 0x50(%rax),%rdx
1.16 │ ↓ jne e8
0.96 │ a9: mov %rbx,%rdi
.....
This indicates most time is spent on the cmpxchg for the page ref, followed
by grabbing the ->freepage op vector:
freepage = mapping->a_ops->freepage;
> I come back to wondering whether maybe you're hitting some PV-lock problem.
>
> I know queued_spin_lock_slowpath() is ok. I'm not entirely sure
> __pv_queued_spin_lock_slowpath() is.
It's the same code AFAICT, except the pv version jumps straight to
the "queue" case.
> So I'd love to see you try the non-PV case, but I also think it might
> be interesting to see what the instruction profile for
> __pv_queued_spin_lock_slowpath() itself is. They share a lot of code
> (there's some interesting #include games going on to make
> queued_spin_lock_slowpath() actually *be*
> __pv_queued_spin_lock_slowpath() with some magic hooks), but there
> might be issues.
0.03 │ data16 data16 data16 xchg %ax,%ax
│ push %rbp
0.00 │ mov %rsp,%rbp
0.01 │ push %r15
│ push %r14
│ push %r13
0.01 │ push %r12
│ mov $0x18740,%r12
│ push %rbx
│ mov %rdi,%rbx
│ sub $0x10,%rsp
│ add %gs:0x7ef0d0e0(%rip),%r12
│ movslq 0xc(%r12),%rax
0.02 │ mov %gs:0x7ef0d0db(%rip),%r15d
│ add $0x1,%r15d
│ shl $0x12,%r15d
│ lea 0x1(%rax),%edx
0.01 │ mov %edx,0xc(%r12)
│ mov %eax,%edx
│ shl $0x4,%rax
│ add %rax,%r12
│ shl $0x10,%edx
│ movq $0x0,(%r12)
0.02 │ or %edx,%r15d
│ mov %gs:0x7ef0d0ad(%rip),%eax
0.00 │ movl $0x0,0x8(%r12)
0.01 │ mov %eax,0x40(%r12)
│ movb $0x0,0x44(%r12)
│ mov (%rdi),%eax
0.88 │ test %ax,%ax
│ ↓ jne 8f
0.02 │ mov $0x1,%edx
│ lock cmpxchg %dl,(%rdi)
0.38 │ test %al,%al
│ ↓ je 14a
0.02 │ 8f: mov %r15d,%eax
│ shr $0x10,%eax
│ xchg %ax,0x2(%rbx)
2.07 │ shl $0x10,%eax
│ test %eax,%eax
│ ↓ jne 171
│ movq $0x0,-0x30(%rbp)
0.02 │ ac: movzbl 0x44(%r12),%eax
0.97 │ mov $0x1,%r13d
│ mov $0x100,%r14d
│ cmp $0x2,%al
│ sete %al
│ movzbl %al,%eax
│ mov %rax,-0x38(%rbp)
0.00 │ ca: movb $0x0,0x44(%r12)
0.00 │ mov $0x8000,%edx
│ movb $0x1,0x1(%rbx)
│ ↓ jmp e6
0.04 │ db: pause
8.04 │ sub $0x1,%edx
│ ↓ je 229
│ e6: movzbl (%rbx),%eax
7.54 │ test %al,%al
│ ↑ jne db
0.10 │ mov %r14d,%eax
0.06 │ lock cmpxchg %r13w,(%rbx)
2.93 │ cmp $0x100,%ax
│ ↑ jne db
│ fc: mov (%rbx),%edx
0.37 │ mov $0x1,%ecx
│ or $0x1,%edx
│ ↓ jmp 114
0.01 │108: mov %edx,%eax
│ lock cmpxchg %ecx,(%rbx)
0.26 │ cmp %edx,%eax
│ ↓ je 14a
│ mov %eax,%edx
│114: mov %edx,%eax
0.00 │ xor %ax,%ax
│ cmp %r15d,%eax
│ ↑ je 108
0.01 │ cmpq $0x0,-0x30(%rbp)
│ movb $0x1,(%rbx)
│ ↓ je 251
│12c: mov -0x30(%rbp),%rsi
0.01 │ mov $0x1,%eax
│ mov $0x2,%edx
│ movl $0x1,0x8(%rsi)
0.11 │ lock cmpxchg %dl,0x44(%rsi)
2.34 │ cmp $0x1,%al
│ ↓ je 160
│14a: decl %gs:0x7ef1b5bb(%rip)
0.02 │ add $0x10,%rsp
│ pop %rbx
│ pop %r12
0.00 │ pop %r13
│ pop %r14
│ pop %r15
│ pop %rbp
│ ← retq
│160: mov -0x30(%rbp),%rsi
│ movb $0x3,(%rbx)
│ mov %rbx,%rdi
│ → callq 0xffffffff810fcf90
│ ↑ jmp 14a
│171: lea 0x44(%r12),%r14
│ mov %rax,%r13
│ shr $0x12,%eax
│ shr $0xc,%r13
│ sub $0x1,%eax
│ and $0x30,%r13d
│ cltq
│ add $0x18740,%r13
│ add -0x7d8164c0(,%rax,8),%r13
0.03 │ mov %r12,0x0(%r13)
0.38 │19c: mov $0x8000,%eax
│ ↓ jmp 1b7
0.04 │1a3: test %al,%al
│ ↓ jne 1b0
│ movzbl 0x44(%r13),%edx
1.66 │ test %dl,%dl
│ ↓ jne 1f1
1.75 │1b0: pause
64.57 │ sub $0x1
0.04 │ │ mov 0x8(%r12),%eax
0.03 │ │ test %eax,%eax
│ ↓ jne 1d4
│1c9:│ pause
│ │ mov 0x8(%r12),%eax
│ │ test %eax,%eax
│ ↑ je 1c9
│1d4:│ mov (%r12),%rax
│ │ test %rax,%rax
│ │ mov %rax,-0x30(%rbp)
0.05 │ ↑ je ac
│ │ mov -0x30(%rbp),%rax
│ │ prefet (%rax)
0.25 │ ↑ jmpq ac
│1f1:│ mov $0x1,%eax
│ │ xchg %al,0x44(%r12)
│ │ mov 0x8(%r12),%eax
│ │ test %eax,%eax
│ ↓ jne 213
│ │ mov %r14,%rdi
│ │ mov $0x1,%esi
│ → callq 0xffffffff8109f7a0
│ │ xchg %ax,%ax
│213:│ mov $0x1,%eax
│ │ xor %edi,%edi
│ │ lock cmpxchg %dil,(%r14)
│ │ mov 0x8(%r12),%eax
│ ↑ jmpq 19c
│229:│ cmpq $0x0,-0x38(%rbp)
│ │ movb $0x0,0x1(%rbx)
│ ↓ je 276
│234:│ movb $0x1,0x44(%r12)
│ │ mov $0x3,%esi
│ │ mov %rbx,%rdi
│ → callq 0xffffffff8109f7a0
│ │ xchg %ax,%ax
│ │ movzbl (%rbx),%eax
│ ↑ jmpq ca
│251:│ mov (%r12),%rax
0.14 │ │ test %rax,%rax
│ │ mov %rax,-0x30(%rbp)
│ ↑ jne 12c
│262:│ pause
0.31 │ │ mov (%r12),%rax
│ │ test %rax,%rax
│ ↑ je 262
│ │ mov %rax,-0x30(%rbp)
│ ↑ jmpq 12c
│276:│ mov %r12,%rsi
│ │ mov %rbx,%rdi
│ → callq 0xffffffff810fcf90
│ │ mov %rax,-0x38(%rbp)
│ │ mov $0x3,%eax
│ │ xchg %al,(%rbx)
│ │ test %al,%al
│ ↑ jne 234
│ │ mov -0x38(%rbp),%rax
│ │ movb $0x1,(%rbx)
│ │ movq $0x0,(%rax)
│ ↑ jmpq fc
> For example, if you run a virtual 16-core system on a physical machine
> that then doesn't consistently give 16 cores to the virtual machine,
> you'll get no end of hiccups.
I learnt that lesson 6-7 years ago when I first started doing
baseline benchmarking to compare bare metal to virtualised IO
performance.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 22:02 ` Dave Chinner
@ 2016-08-16 23:23 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-16 23:23 UTC (permalink / raw)
To: Dave Chinner
Cc: Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Tue, Aug 16, 2016 at 3:02 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>> What does your profile show for when you actually dig into
>> __remove_mapping() itself? Looking at your flat profile, I'm assuming
>> you get
>
> - 22.26% 0.93% [kernel] [k] __remove_mapping
> - 3.86% __remove_mapping
> - 18.35% _raw_spin_lock_irqsave
> __pv_queued_spin_lock_slowpath
> 1.32% __delete_from_page_cache
> - 0.92% _raw_spin_unlock_irqrestore
> __raw_callee_save___pv_queued_spin_unlock
Ok, that's all very consistent with my profiles, except - obviously -
for the crazy spinlock thing.
One difference is that your unlock has that PV unlock thing - on raw
hardware it's just a single store. But I don't think I saw the
unlock_slowpath in there.
There's nothing really expensive going on there that I can tell.
> And the instruction level profile:
Yup. The bulk is in the cmpxchg and a cache miss (it just shows up in
the instruction after it: you can use "cycles:pp" to get perf to
actually try to fix up the blame to the instruction that _causes_
things rather than the instruction following, but in this case it's
all trivial).
> It's the same code AFAICT, except the pv version jumps straight to
> the "queue" case.
Yes. Your profile looks perfectly fine. Most of the profile is right
after the 'pause', which you'd expect.
From a quick look, it seems like only about 2/3rds of the time is
actually spent in the "pause" loop, but the control flow is complex
enough that maybe I didn't follow it right. The native case is
simpler. But since I suspect that it's not so much about the
spinlocked region being too costly, but just about locking too damn
much, that 2/3rds actually makes sense: it's not that it's
necessarily spinning waiting for the lock all that long in any
individual case, it's just that the spin_lock code is called so much.
So I still kind of just blame kswapd, rather than any new expense. It
would be interesting to hear if Mel is right about that kswapd
sleeping change between 4.6 and 4.7.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 17:47 ` Linus Torvalds
@ 2016-08-17 15:48 ` Michal Hocko
-1 siblings, 0 replies; 219+ messages in thread
From: Michal Hocko @ 2016-08-17 15:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mel Gorman, Minchan Kim, Vladimir Davydov, Dave Chinner,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Tue 16-08-16 10:47:36, Linus Torvalds wrote:
> Mel,
> thanks for taking a look. Your theory sounds more complete than mine,
> and since Dave is able to see the problem with 4.7, it would be nice
> to hear about the 4.6 behavior and commit ede37713737 in particular.
>
> That one seems more likely to affect contention than the zone/node one
> I found during the merge window anyway, since it actually removes a
> sleep in kswapd during congestion.
Hmm, the patch removes a short sleep from wait_iff_congested for
kworkers but that cannot affect kswapd context. Then it removes
wait_iff_congested from should_reclaim_retry but that is not kswapd path
and the sleep was added in the same merge window, so it wasn't in 4.6 and
shouldn't make any difference either.
So I am not really sure how it could make any difference.
I will try to catch up with the rest of the email thread but from a
quick glance it just feels like we are doing more work under the
lock.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-16 17:47 ` Linus Torvalds
@ 2016-08-17 15:49 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-17 15:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michal Hocko, Minchan Kim, Vladimir Davydov, Dave Chinner,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Tue, Aug 16, 2016 at 10:47:36AM -0700, Linus Torvalds wrote:
> I've always preferred to see direct reclaim as the primary model for
> reclaim, partly in order to throttle the actual "bad" process, but
> also because "kswapd uses lots of CPU time" is such a nasty thing to
> even begin guessing about.
>
While I agree that bugs with high CPU usage from kswapd are a pain,
I'm reluctant to move towards direct reclaim being the primary mode. The
stalls can be severe and there is no guarantee that the process punished
is the process responsible. I'm basing this assumption on observations
of severe performance regressions when I accidentally broke kswapd during
the development of node-lru.
> So I have to admit to liking that "make kswapd sleep a bit if it's
> just looping" logic that got removed in that commit.
>
It's primarily the direct reclaimer that is affected by that patch.
> And looking at DaveC's numbers, it really feels like it's not even
> what we do inside the locked region that is the problem. Sure,
> __delete_from_page_cache() (which is most of it) is at 1.86% of CPU
> time (when including all the things it calls), but that still isn't
> all that much. Especially when compared to just:
>
> 0.78% [kernel] [k] _raw_spin_unlock_irqrestore
>
The profile is shocking for such a basic workload. I automated what Dave
described with xfs_io except that the file size is 2*RAM. The filesystem
is sized to be roughly the same size as the file to minimise variances
due to block layout. A call-graph profile collected on bare metal UMA with
numa=fake=4 and paravirt spinlocks showed
1.40% 0.16% kswapd1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.36% 0.16% kswapd2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.21% 0.12% kswapd0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.12% 0.13% kswapd3 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.81% 0.45% xfs_io [kernel.vmlinux] [k] _raw_spin_lock_irqsave
Those contention figures are not great but they are not terrible either. The
vmstats told me there was no direct reclaim activity so either my theory
is wrong or this machine is not reproducing the same problem Dave is seeing.
I have partial results from a 2-socket and 4-socket machine. The 2-socket spends
roughly 1.8% in _raw_spin_lock_irqsave and the 4-socket spends roughly 3%,
both with no direct reclaim. Clearly the problem gets worse the more NUMA
nodes there are but not to the same extent Dave reports.
I believe potential reasons why I do not see the same problem as Dave are;
1. Different memory sizes changing timing
2. Dave has fast storage and I'm using a spinning disk
3. Lock contention problems are magnified inside KVM
I think 3 is a good possibility if contended locks result in expensive
exiting and re-entry of the guest. I have a vague recollection that a
spinning vcpu exits the guest but I did not confirm that. I can set up a
KVM instance and run the tests but it'll take a few hours and possibly
will be pushed out until tomorrow.
> So I'm more and more getting the feeling that it's not what we do
> inside the lock that is problematic. I started out blaming memcg
> accounting or something, but none of the numbers seem to back that up.
> So it's primarily really just the fact that kswapd is simply hammering
> on that lock way too much.
>
Agreed.
> So yeah, I'm blaming kswapd itself doing something wrong. It's not a
> problem in a single-node environment (since there's only one), but
> with multiple nodes it clearly just devolves.
>
> Yes, we could try to batch the locking like DaveC already suggested
> (ie we could move the locking to the caller, and then make
> shrink_page_list() just try to keep the lock held for a few pages if
> the mapping doesn't change), and that might result in fewer crazy
> cacheline ping-pongs overall. But that feels like exactly the wrong
> kind of workaround.
>
Even if such batching was implemented, it would be very specific to the
case of a single large file filling LRUs on multiple nodes.
> I'd much rather re-instate some "if kswapd is just spinning on CPU
> time and not actually improving IO parallelism, kswapd should just get
> the hell out" logic.
>
I'm having trouble right now thinking of a good way of identifying when
kswapd should give up and force direct reclaim to take a hit.
I'd like to pass something else by the wtf-o-meter. I had a prototype
patch lying around that replaced a congestion_wait if too many LRU pages
were isolated with a waitqueue for an unrelated theoretical problem. It's
the bulk of the patch below but can be trivially extended for the case of
tree_lock contention.
The interesting part is the change to __remove_mapping. It stalls a
reclaimer (direct or kswapd) if the lock is contended for either a
timeout or a local reclaimer finishing some reclaim. This stalls for a
"real" reason instead of blindly calling congestion_wait. The downside
is that I do see xfs_io briefly enter direct reclaim when kswapd stalled
on contention but the overall impact to completion times was 0.01 seconds
in the UMA with fake NUMA nodes case.
This could be made specific to direct reclaimers but it makes a certain
amount of sense to stall kswapd instances contending with each other.
With the patch applied I see a drop in cycles spent in spin_lock_irqsave
Before
1.40% 0.16% kswapd1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.36% 0.16% kswapd2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.21% 0.12% kswapd0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
1.12% 0.13% kswapd3 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.81% 0.45% xfs_io [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.26% 0.23% kworker/u20:1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.25% 0.19% kworker/3:2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.23% 0.19% rm [kernel.vmlinux] [k] _raw_spin_lock_irqsave
After
0.57% 0.50% xfs_io [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.24% 0.20% rm [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.24% 0.21% kthreadd [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.21% 0.17% kworker/6:1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.12% 0.09% kworker/2:1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.11% 0.10% kworker/u20:0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.10% 0.10% swapper [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.09% 0.08% kworker/7:2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
....
0.01% 0.00% kswapd0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.01% 0.00% kswapd1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.01% 0.00% kswapd3 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
0.01% 0.00% kswapd2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
kswapd time on locking is almost eliminated.
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d572b78b65e1..72f92f67bd0c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -653,6 +653,7 @@ typedef struct pglist_data {
int node_id;
wait_queue_head_t kswapd_wait;
wait_queue_head_t pfmemalloc_wait;
+ wait_queue_head_t contention_wait;
struct task_struct *kswapd; /* Protected by mem_hotplug_begin/end() */
int kswapd_order;
diff --git a/mm/compaction.c b/mm/compaction.c
index 9affb2908304..57351dddcd9a 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1616,6 +1616,10 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_control *cc)
zone->compact_cached_free_pfn = free_pfn;
}
+ /* Page reclaim could have stalled due to isolated pages */
+ if (waitqueue_active(&zone->zone_pgdat->contention_wait))
+ wake_up(&zone->zone_pgdat->contention_wait);
+
trace_mm_compaction_end(start_pfn, cc->migrate_pfn,
cc->free_pfn, end_pfn, sync, ret);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3fbe73a6fe4b..5af4eecdb4c9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5825,6 +5825,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
#endif
init_waitqueue_head(&pgdat->kswapd_wait);
init_waitqueue_head(&pgdat->pfmemalloc_wait);
+ init_waitqueue_head(&pgdat->contention_wait);
#ifdef CONFIG_COMPACTION
init_waitqueue_head(&pgdat->kcompactd_wait);
#endif
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 374d95d04178..42c37bf88cb7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -633,7 +633,25 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
BUG_ON(!PageLocked(page));
BUG_ON(mapping != page_mapping(page));
- spin_lock_irqsave(&mapping->tree_lock, flags);
+ if (!reclaimed) {
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ } else {
+ /*
+ * If a reclaimer encounters a contended tree_lock then briefly
+ * stall and allow the parallel reclaim to make progress. The
+ * full HZ/10 penalty is incurred if the lock holder is
+ * reclaiming on a remote node.
+ */
+ if (!spin_trylock_irqsave(&mapping->tree_lock, flags)) {
+ pg_data_t *pgdat = page_pgdat(page);
+
+ try_to_unmap_flush();
+ wait_event_interruptible_timeout(pgdat->contention_wait,
+ spin_is_locked(&mapping->tree_lock), HZ/10);
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ }
+ }
+
/*
* The non racy check for a busy page.
*
@@ -1554,16 +1572,16 @@ int isolate_lru_page(struct page *page)
* the LRU list will go small and be scanned faster than necessary, leading to
* unnecessary swapping, thrashing and OOM.
*/
-static int too_many_isolated(struct pglist_data *pgdat, int file,
+static bool safe_to_isolate(struct pglist_data *pgdat, int file,
struct scan_control *sc)
{
unsigned long inactive, isolated;
if (current_is_kswapd())
- return 0;
+ return true;
- if (!sane_reclaim(sc))
- return 0;
+ if (sane_reclaim(sc))
+ return true;
if (file) {
inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
@@ -1581,7 +1599,7 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
inactive >>= 3;
- return isolated > inactive;
+ return isolated < inactive;
}
static noinline_for_stack void
@@ -1701,12 +1719,15 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
if (!inactive_reclaimable_pages(lruvec, sc, lru))
return 0;
- while (unlikely(too_many_isolated(pgdat, file, sc))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
+ while (!safe_to_isolate(pgdat, file, sc)) {
+ wait_event_interruptible_timeout(pgdat->contention_wait,
+ safe_to_isolate(pgdat, file, sc), HZ/10);
/* We are about to die and free our memory. Return now. */
- if (fatal_signal_pending(current))
- return SWAP_CLUSTER_MAX;
+ if (fatal_signal_pending(current)) {
+ nr_reclaimed = SWAP_CLUSTER_MAX;
+ goto out;
+ }
}
lru_add_drain();
@@ -1819,6 +1840,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
nr_scanned, nr_reclaimed,
sc->priority, file);
+
+out:
+ if (waitqueue_active(&pgdat->contention_wait))
+ wake_up(&pgdat->contention_wait);
return nr_reclaimed;
}
^ permalink raw reply related [flat|nested] 219+ messages in thread
+ } else {
+ /*
+ * If a reclaimer encounters a contended tree_lock then briefly
+ * stall and allow the parallel reclaim to make progress. The
+ * full HZ/10 penalty is incurred if the lock holder is
+ * reclaiming on a remote node.
+ */
+ if (!spin_trylock_irqsave(&mapping->tree_lock, flags)) {
+ pg_data_t *pgdat = page_pgdat(page);
+
+ try_to_unmap_flush();
+ wait_event_interruptible_timeout(pgdat->contention_wait,
+ spin_is_locked(&mapping->tree_lock), HZ/10);
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ }
+ }
+
/*
* The non racy check for a busy page.
*
@@ -1554,16 +1572,16 @@ int isolate_lru_page(struct page *page)
* the LRU list will go small and be scanned faster than necessary, leading to
* unnecessary swapping, thrashing and OOM.
*/
-static int too_many_isolated(struct pglist_data *pgdat, int file,
+static bool safe_to_isolate(struct pglist_data *pgdat, int file,
struct scan_control *sc)
{
unsigned long inactive, isolated;
if (current_is_kswapd())
- return 0;
+ return true;
- if (!sane_reclaim(sc))
- return 0;
+ if (sane_reclaim(sc))
+ return true;
if (file) {
inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
@@ -1581,7 +1599,7 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
inactive >>= 3;
- return isolated > inactive;
+ return isolated < inactive;
}
static noinline_for_stack void
@@ -1701,12 +1719,15 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
if (!inactive_reclaimable_pages(lruvec, sc, lru))
return 0;
- while (unlikely(too_many_isolated(pgdat, file, sc))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
+ while (!safe_to_isolate(pgdat, file, sc)) {
+ wait_event_interruptible_timeout(pgdat->contention_wait,
+ safe_to_isolate(pgdat, file, sc), HZ/10);
/* We are about to die and free our memory. Return now. */
- if (fatal_signal_pending(current))
- return SWAP_CLUSTER_MAX;
+ if (fatal_signal_pending(current)) {
+ nr_reclaimed = SWAP_CLUSTER_MAX;
+ goto out;
+ }
}
lru_add_drain();
@@ -1819,6 +1840,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
nr_scanned, nr_reclaimed,
sc->priority, file);
+
+out:
+ if (waitqueue_active(&pgdat->contention_wait))
+ wake_up(&pgdat->contention_wait);
return nr_reclaimed;
}
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 5:03 ` Ingo Molnar
@ 2016-08-17 16:24 ` Peter Zijlstra
-1 siblings, 0 replies; 219+ messages in thread
From: Peter Zijlstra @ 2016-08-17 16:24 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Dave Chinner, Tejun Heo, Wu Fengguang,
Kirill A. Shutemov, Christoph Hellwig, Huang, Ying, LKML,
Bob Peterson, LKP, Arnaldo Carvalho de Melo
On Mon, Aug 15, 2016 at 07:03:00AM +0200, Ingo Molnar wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > Make sure you actually use "perf record -e cycles:pp" or something
> > that uses PEBS to get real profiles using CPU performance counters.
>
> Btw., 'perf record -e cycles:pp' is the default now for modern versions
> of perf tooling (on most x86 systems) - if you do 'perf record' it will
> just use the most precise profiling mode available on that particular
> CPU model.
>
> If unsure you can check the event that was used, via:
>
> triton:~> perf report --stdio 2>&1 | grep '# Samples'
> # Samples: 27K of event 'cycles:pp'
Problem here is that Dave is using a KVM thingy. Getting hardware
counters in a guest is somewhat tricky but doable, but PEBS does not
virtualize.
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-17 15:48 ` Michal Hocko
@ 2016-08-17 16:42 ` Michal Hocko
-1 siblings, 0 replies; 219+ messages in thread
From: Michal Hocko @ 2016-08-17 16:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mel Gorman, Minchan Kim, Vladimir Davydov, Dave Chinner,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Wed 17-08-16 17:48:25, Michal Hocko wrote:
[...]
> I will try to catch up with the rest of the email thread but from a
> quick glance it just feels like we are doing more work under the
> lock.
Hmm, so it doesn't seem to be more work in __remove_mapping as pointed
out in http://lkml.kernel.org/r/20160816220250.GI16044@dastard
As Mel already pointed out, the LRU will contain basically a single mapping
for this workload, so any subtle change in timing might make a difference.
I was looking through 4.6..4.7 and one thing that has changed is the
inactive vs. active LRU size ratio. See 59dc76b0d4df ("mm: vmscan:
reduce size of inactive file list"). The machine has quite a lot of
memory, so the LRUs will be large as well, and I guess this could have
changed the timing somehow, but it feels like a wild guess so I would be
careful about blaming this commit...
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-17 15:49 ` Mel Gorman
@ 2016-08-18 0:45 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-18 0:45 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michal Hocko, Minchan Kim, Vladimir Davydov, Dave Chinner,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > Yes, we could try to batch the locking like DaveC already suggested
> > (ie we could move the locking to the caller, and then make
> > shrink_page_list() just try to keep the lock held for a few pages if
> > the mapping doesn't change), and that might result in fewer crazy
> > cacheline ping-pongs overall. But that feels like exactly the wrong
> > kind of workaround.
> >
>
> Even if such batching was implemented, it would be very specific to the
> case of a single large file filling LRUs on multiple nodes.
>
The latest Jason Bourne movie was sufficiently bad that I spent time
thinking about how the tree_lock could be batched during reclaim. It's not
straight-forward, but this prototype did not blow up on UMA and may be
worth considering if Dave can test whether either approach has a positive impact.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 374d95d04178..926110219cd9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -621,19 +621,39 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
return PAGE_CLEAN;
}
+static void finalise_remove_mapping(struct list_head *swapcache,
+ struct list_head *filecache,
+ void (*freepage)(struct page *))
+{
+ struct page *page;
+
+	while (!list_empty(swapcache)) {
+		swp_entry_t swap;
+
+		page = lru_to_page(swapcache);
+		swap.val = page_private(page);
+		list_del(&page->lru);
+		swapcache_free(swap);
+		set_page_private(page, 0);
+	}
+
+	while (!list_empty(filecache)) {
+		page = lru_to_page(filecache);
+		list_del(&page->lru);
+		freepage(page);
+	}
+}
+
/*
* Same as remove_mapping, but if the page is removed from the mapping, it
* gets returned with a refcount of 0.
*/
-static int __remove_mapping(struct address_space *mapping, struct page *page,
- bool reclaimed)
+static int __remove_mapping_page(struct address_space *mapping,
+ struct page *page, bool reclaimed,
+ struct list_head *swapcache,
+ struct list_head *filecache)
{
- unsigned long flags;
-
BUG_ON(!PageLocked(page));
BUG_ON(mapping != page_mapping(page));
- spin_lock_irqsave(&mapping->tree_lock, flags);
/*
* The non racy check for a busy page.
*
@@ -668,16 +688,18 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
}
if (PageSwapCache(page)) {
- swp_entry_t swap = { .val = page_private(page) };
+ unsigned long swapval = page_private(page);
+ swp_entry_t swap = { .val = swapval };
mem_cgroup_swapout(page, swap);
__delete_from_swap_cache(page);
- spin_unlock_irqrestore(&mapping->tree_lock, flags);
- swapcache_free(swap);
+ set_page_private(page, swapval);
+ list_add(&page->lru, swapcache);
} else {
- void (*freepage)(struct page *);
void *shadow = NULL;
+ void (*freepage)(struct page *);
freepage = mapping->a_ops->freepage;
+
/*
* Remember a shadow entry for reclaimed file cache in
* order to detect refaults, thus thrashing, later on.
@@ -698,16 +720,13 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
!mapping_exiting(mapping) && !dax_mapping(mapping))
shadow = workingset_eviction(mapping, page);
__delete_from_page_cache(page, shadow);
- spin_unlock_irqrestore(&mapping->tree_lock, flags);
-
- if (freepage != NULL)
- freepage(page);
+ if (freepage)
+ list_add(&page->lru, filecache);
}
return 1;
cannot_free:
- spin_unlock_irqrestore(&mapping->tree_lock, flags);
return 0;
}
@@ -719,16 +738,68 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
*/
int remove_mapping(struct address_space *mapping, struct page *page)
{
- if (__remove_mapping(mapping, page, false)) {
+ unsigned long flags;
+ LIST_HEAD(swapcache);
+ LIST_HEAD(filecache);
+ void (*freepage)(struct page *);
+ int ret = 0;
+
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ freepage = mapping->a_ops->freepage;
+
+ if (__remove_mapping_page(mapping, page, false, &swapcache, &filecache)) {
/*
* Unfreezing the refcount with 1 rather than 2 effectively
* drops the pagecache ref for us without requiring another
* atomic operation.
*/
page_ref_unfreeze(page, 1);
- return 1;
+ ret = 1;
+ }
+ spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ finalise_remove_mapping(&swapcache, &filecache, freepage);
+ return ret;
+}
+
+static void remove_mapping_list(struct list_head *mapping_list,
+ struct list_head *free_pages,
+ struct list_head *ret_pages)
+{
+ unsigned long flags;
+ struct address_space *mapping = NULL;
+ void (*freepage)(struct page *);
+ LIST_HEAD(swapcache);
+ LIST_HEAD(filecache);
+ struct page *page;
+
+ while (!list_empty(mapping_list)) {
+ page = lru_to_page(mapping_list);
+ list_del(&page->lru);
+
+ if (!mapping || page->mapping != mapping) {
+ if (mapping) {
+ spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ finalise_remove_mapping(&swapcache, &filecache, freepage);
+ }
+
+ mapping = page->mapping;
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ freepage = mapping->a_ops->freepage;
+ }
+
+ if (!__remove_mapping_page(mapping, page, true, &swapcache, &filecache)) {
+ unlock_page(page);
+ list_add(&page->lru, ret_pages);
+ } else {
+ __ClearPageLocked(page);
+ list_add(&page->lru, free_pages);
+ }
+ }
+
+ if (mapping) {
+ spin_unlock_irqrestore(&mapping->tree_lock, flags);
+ finalise_remove_mapping(&swapcache, &filecache, freepage);
}
- return 0;
}
/**
@@ -910,6 +981,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
{
LIST_HEAD(ret_pages);
LIST_HEAD(free_pages);
+ LIST_HEAD(mapping_pages);
int pgactivate = 0;
unsigned long nr_unqueued_dirty = 0;
unsigned long nr_dirty = 0;
@@ -1206,17 +1278,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
}
lazyfree:
- if (!mapping || !__remove_mapping(mapping, page, true))
+ if (!mapping)
goto keep_locked;
- /*
- * At this point, we have no other references and there is
- * no way to pick any more up (removed from LRU, removed
- * from pagecache). Can use non-atomic bitops now (and
- * we obviously don't have to worry about waking up a process
- * waiting on the page lock, because there are no references.
- */
- __ClearPageLocked(page);
+ list_add(&page->lru, &mapping_pages);
+ if (ret == SWAP_LZFREE)
+ count_vm_event(PGLAZYFREED);
+ continue;
+
free_it:
if (ret == SWAP_LZFREE)
count_vm_event(PGLAZYFREED);
@@ -1251,6 +1320,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
}
+ remove_mapping_list(&mapping_pages, &free_pages, &ret_pages);
mem_cgroup_uncharge_list(&free_pages);
try_to_unmap_flush();
free_hot_cold_page_list(&free_pages, true);
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-17 15:49 ` Mel Gorman
@ 2016-08-18 2:44 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-18 2:44 UTC (permalink / raw)
To: Mel Gorman
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> On Tue, Aug 16, 2016 at 10:47:36AM -0700, Linus Torvalds wrote:
> > I've always preferred to see direct reclaim as the primary model for
> > reclaim, partly in order to throttle the actual "bad" process, but
> > also because "kswapd uses lots of CPU time" is such a nasty thing to
> > even begin guessing about.
> >
>
> While I agree that bugs with high CPU usage from kswapd are a pain,
> I'm reluctant to move towards direct reclaim being the primary mode. The
> stalls can be severe and there is no guarantee that the process punished
> is the process responsible. I'm basing this assumption on observations
> of severe performance regressions when I accidentally broke kswapd during
> the development of node-lru.
>
> > So I have to admit to liking that "make kswapd sleep a bit if it's
> > just looping" logic that got removed in that commit.
> >
>
> It's primarily the direct reclaimer that is affected by that patch.
>
> > And looking at DaveC's numbers, it really feels like it's not even
> > what we do inside the locked region that is the problem. Sure,
> > __delete_from_page_cache() (which is most of it) is at 1.86% of CPU
> > time (when including all the things it calls), but that still isn't
> > all that much. Especially when compared to just:
> >
> > 0.78% [kernel] [k] _raw_spin_unlock_irqrestore
> >
>
> The profile is shocking for such a basic workload. I automated what Dave
> described with xfs_io except that the file size is 2*RAM. The filesystem
> is sized to be roughly the same size as the file to minimise variances
> due to block layout. A call-graph profile collected on bare metal UMA with
> numa=fake=4 and paravirt spinlocks showed
>
> 1.40% 0.16% kswapd1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.36% 0.16% kswapd2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.21% 0.12% kswapd0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.12% 0.13% kswapd3 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 0.81% 0.45% xfs_io [kernel.vmlinux] [k] _raw_spin_lock_irqsave
>
> Those contention figures are not great but they are not terrible either. The
> vmstats told me there was no direct reclaim activity so either my theory
> is wrong or this machine is not reproducing the same problem Dave is seeing.
No, that's roughly the same un-normalised CPU percentage I am seeing
in spinlock contention. i.e. take away the idle CPU in the profile
(probably upwards of 80% if it's a 16p machine), and instead look at
that figure as a percentage of the total CPU used by the workload. Then
you'll see that it's 30-40% of the total CPU consumed by the workload.
> I have partial results from a 2-socket and 4-socket machine. 2-socket spends
> roughly 1.8% in _raw_spin_lock_irqsave and 4-socket spends roughly 3%,
> both with no direct reclaim. Clearly the problem gets worse the more NUMA
> nodes there are but not to the same extent Dave reports.
>
> I believe potential reasons why I do not see the same problem as Dave are;
>
> 1. Different memory sizes changing timing
> 2. Dave has fast storage and I'm using a spinning disk
This particular machine is using an abused 3 year old SATA SSD that still
runs at 500MB/s on sequential writes. This is "cheap desktop"
capability these days and is nowhere near what I'd call "fast".
> 3. Lock contention problems are magnified inside KVM
>
> I think 3 is a good possibility if contended locks result in expensive
> exiting and reentry of the guest. I have a vague recollection that a
> spinning vcpu exits the guest but I did not confirm that.
I don't think anything like that has been implemented in the pv
spinlocks yet. They just spin right now - it's the same lock
implementation as the host. Also, context switch rates measured on
the host are not significantly higher than what is measured in the
guest, so there doesn't appear to be any extra scheduling on the
host side occurring.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-18 2:44 ` Dave Chinner
0 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-18 2:44 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 4218 bytes --]
On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> On Tue, Aug 16, 2016 at 10:47:36AM -0700, Linus Torvalds wrote:
> > I've always preferred to see direct reclaim as the primary model for
> > reclaim, partly in order to throttle the actual "bad" process, but
> > also because "kswapd uses lots of CPU time" is such a nasty thing to
> > even begin guessing about.
> >
>
> While I agree that bugs with high CPU usage from kswapd are a pain,
> I'm reluctant to move towards direct reclaim being the primary mode. The
> stalls can be severe and there is no guarantee that the process punished
> is the process responsible. I'm basing this assumption on observations
> of severe performance regressions when I accidentally broke kswapd during
> the development of node-lru.
>
> > So I have to admit to liking that "make kswapd sleep a bit if it's
> > just looping" logic that got removed in that commit.
> >
>
> It's primarily the direct reclaimer that is affected by that patch.
>
> > And looking at DaveC's numbers, it really feels like it's not even
> > what we do inside the locked region that is the problem. Sure,
> > __delete_from_page_cache() (which is most of it) is at 1.86% of CPU
> > time (when including all the things it calls), but that still isn't
> > all that much. Especially when compared to just:
> >
> > 0.78% [kernel] [k] _raw_spin_unlock_irqrestore
> >
>
> The profile is shocking for such a basic workload. I automated what Dave
> described with xfs_io except that the file size is 2*RAM. The filesystem
> is sized to be roughly the same size as the file to minimise variances
> due to block layout. A call-graph profile collected on bare metal UMA with
> numa=fake=4 and paravirt spinlocks showed
>
> 1.40% 0.16% kswapd1 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.36% 0.16% kswapd2 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.21% 0.12% kswapd0 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 1.12% 0.13% kswapd3 [kernel.vmlinux] [k] _raw_spin_lock_irqsave
> 0.81% 0.45% xfs_io [kernel.vmlinux] [k] _raw_spin_lock_irqsave
>
> Those contention figures are not great but they are not terrible either. The
> vmstats told me there was no direct reclaim activity so either my theory
> is wrong or this machine is not reproducing the same problem Dave is seeing.
No, that's roughly the same un-normalised CPU percentage I am seeing
in spinlock contention. i.e. take way the idle CPU in the profile
(probably upwards of 80% if it's a 16p machine), and instead look at
that figure as a percentage of total CPU used by the workload. Then
you'll that it's 30-40% of the total CPU consumed by the workload.
> I have partial results from a 2-socket and 4-socket machine. 2-socket spends
> roughtly 1.8% in _raw_spin_lock_irqsave and 4-socket spends roughtly 3%,
> both with no direct reclaim. Clearly the problem gets worse the more NUMA
> nodes there are but not to the same extent Dave reports.
>
> I believe potential reasons why I do not see the same problem as Dave are;
>
> 1. Different memory sizes changing timing
> 2. Dave has fast storage and I'm using a spinning disk
This particular is using an abused 3 year old SATA SSD that still
runs at 500MB/s on sequential writes. This is "cheap desktop"
capability these days and is nowhere near what I'd call "fast".
> 3. Lock contention problems are magnified inside KVM
>
> I think 3 is a good possibility if contended locks result in expensive
> exiting and reentery of the guest. I have a vague recollection that a
> spinning vcpu exits the guest but I did not confirm that.
I don't think anything like that has been implemented in the pv
spinlocks yet. They just spin right now - it's the same lock
implementation as the host. Also, Context switch rates measured on
the host are not significantly higher than what is measured in the
guest, so there doesn't appear to be any extra scheduling on the
host side occurring.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 0:45 ` Mel Gorman
@ 2016-08-18 7:11 ` Dave Chinner
0 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-18 7:11 UTC (permalink / raw)
To: Mel Gorman
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > Yes, we could try to batch the locking like DaveC already suggested
> > > (ie we could move the locking to the caller, and then make
> > > shrink_page_list() just try to keep the lock held for a few pages if
> > > the mapping doesn't change), and that might result in fewer crazy
> > > cacheline ping-pongs overall. But that feels like exactly the wrong
> > > kind of workaround.
> > >
> >
> > Even if such batching was implemented, it would be very specific to the
> > case of a single large file filling LRUs on multiple nodes.
> >
>
> The latest Jason Bourne movie was sufficiently bad that I spent time
> thinking how the tree_lock could be batched during reclaim. It's not
> straight-forward but this prototype did not blow up on UMA and may be
> worth considering if Dave can test whether either approach has a positive impact.
So, I just did a couple of tests. I'll call the two patches "sleepy"
for the contention backoff patch and "bourney" for the Jason Bourne
inspired batching patch. This is an average of 3 runs, overwriting
a 47GB file on a machine with 16GB RAM:
IO throughput wall time __pv_queued_spin_lock_slowpath
vanilla 470MB/s 1m42s 25-30%
sleepy 295MB/s 2m43s <1%
bourney 425MB/s 1m53s 25-30%
The overall CPU usage of sleepy was much lower than the others, but
it was also much slower. Too much sleeping and not enough reclaim
work being done, I think.
As for bourney, it's not immediately clear why it's nearly as
bad as the movie. At worst I would have expected it to have no
noticeable impact, but maybe we are delaying freeing of pages too
long and so stalling allocation of new pages? It also doesn't do
much to reduce contention, especially considering the reduction in
throughput.
On a hunch that the batch list isn't all one mapping, I sorted it.
Patch is below if you're curious.
IO throughput wall time __pv_queued_spin_lock_slowpath
vanilla 470MB/s 1m42s 25-30%
sleepy 295MB/s 2m43s <1%
bourney 425MB/s 1m53s 25-30%
sorted-bourney 465MB/s 1m43s 20%
The number of reclaim batches (from multiple runs) in which sorting
the list made any difference at all is the number of batches whose
list-swap count (ls) is greater than 1:
# grep " c " /var/log/syslog |grep -v "ls 1" |wc -l
7429
# grep " c " /var/log/syslog |grep "ls 1" |wc -l
1061767
IOWs in 1.07 million batches of pages reclaimed, only ~0.695% of
batches switched to a different mapping tree lock more than once.
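The ~0.695% figure follows directly from the two grep counts above; a quick sanity check of the arithmetic:

```python
# Recompute the batch-switch percentage from the syslog grep counts above.
multi_switch = 7429       # batches whose list-swap count "ls" was > 1
single_switch = 1061767   # batches with "ls 1" (no extra lock switches)

total = multi_switch + single_switch
pct = 100.0 * multi_switch / total
print(f"{total} batches, {pct:.3f}% switched mapping lock more than once")
# → 1069196 batches, 0.695% switched mapping lock more than once
```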
From those numbers I would not have expected sorting the page list
to have any measurable impact on performance. However, performance
seems very sensitive to the number of times the mapping tree lock
is bounced around.
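That sensitivity is easy to illustrate with a toy model (an editorial sketch, not code from this thread): count how often a walk over a batch must drop one mapping's tree_lock and take another's, with and without sorting the batch by mapping.

```python
# Toy model of per-mapping lock hand-offs while walking a reclaim batch.
# Each page is represented only by an integer mapping ID; a "switch" is
# any point where the walk moves to a page under a different mapping,
# i.e. where remove_mapping_list() would unlock one tree_lock and take
# another.

def lock_switches(batch):
    """Count mapping changes (lock hand-offs) along the batch."""
    switches = 0
    current = None
    for mapping_id in batch:
        if mapping_id != current:
            switches += 1
            current = mapping_id
    return switches

# Worst case: pages from two mappings perfectly interleaved.
batch = [1, 2] * 4
print(lock_switches(batch))          # 8: one hand-off per page
print(lock_switches(sorted(batch)))  # 2: one hand-off per mapping
```

Even when only a tiny fraction of batches mix mappings, each mixed batch can bounce the lock once per page instead of once per mapping.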
FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
IO throughput wall time __pv_queued_spin_lock_slowpath
vanilla 470MB/s 1m42s 25-30%
zr=1 470MB/s 1m42s 2-3%
So isolating the page cache usage to a single node maintains
performance and shows a significant reduction in pressure on the
mapping tree lock. Same as a single node system, I'd guess.
Anyway, I've burnt enough erase cycles on this SSD for today....
-Dave.
---
mm/vmscan.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9261102..5cf1bd6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -56,6 +56,8 @@
#include "internal.h"
+#include <linux/list_sort.h>
+
#define CREATE_TRACE_POINTS
#include <trace/events/vmscan.h>
@@ -761,6 +763,17 @@ int remove_mapping(struct address_space *mapping, struct page *page)
return ret;
}
+static int mapping_cmp(void *priv, struct list_head *a, struct list_head *b)
+{
+ struct address_space *ma = container_of(a, struct page, lru)->mapping;
+ struct address_space *mb = container_of(b, struct page, lru)->mapping;
+
+ if (ma == mb)
+ return 0;
+ if (ma > mb)
+ return 1;
+ return -1;
+}
static void remove_mapping_list(struct list_head *mapping_list,
struct list_head *free_pages,
struct list_head *ret_pages)
@@ -771,12 +784,17 @@ static void remove_mapping_list(struct list_head *mapping_list,
LIST_HEAD(swapcache);
LIST_HEAD(filecache);
struct page *page;
+ int c = 0, ls = 0;
+
+ list_sort(NULL, mapping_list, mapping_cmp);
while (!list_empty(mapping_list)) {
+ c++;
page = lru_to_page(mapping_list);
list_del(&page->lru);
if (!mapping || page->mapping != mapping) {
+ ls++;
if (mapping) {
spin_unlock_irqrestore(&mapping->tree_lock, flags);
finalise_remove_mapping(&swapcache, &filecache, freepage);
@@ -800,6 +818,7 @@ static void remove_mapping_list(struct list_head *mapping_list,
spin_unlock_irqrestore(&mapping->tree_lock, flags);
finalise_remove_mapping(&swapcache, &filecache, freepage);
}
+ printk("c %d, ls %d\n", c, ls);
}
/**
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 374d95d04178..926110219cd9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -621,19 +621,39 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
> return PAGE_CLEAN;
> }
>
> +static void finalise_remove_mapping(struct list_head *swapcache,
> + struct list_head *filecache,
> + void (*freepage)(struct page *))
> +{
> + struct page *page;
> +
> + while (!list_empty(swapcache)) {
> + swp_entry_t swap;
> +
> + page = lru_to_page(swapcache);
> + list_del(&page->lru);
> + swap.val = page_private(page);
> + swapcache_free(swap);
> + set_page_private(page, 0);
> + }
> +
> + while (!list_empty(filecache)) {
> + page = lru_to_page(filecache);
> + list_del(&page->lru);
> + freepage(page);
> + }
> +}
> +
> /*
> * Same as remove_mapping, but if the page is removed from the mapping, it
> * gets returned with a refcount of 0.
> */
> -static int __remove_mapping(struct address_space *mapping, struct page *page,
> - bool reclaimed)
> +static int __remove_mapping_page(struct address_space *mapping,
> + struct page *page, bool reclaimed,
> + struct list_head *swapcache,
> + struct list_head *filecache)
> {
> - unsigned long flags;
> -
> BUG_ON(!PageLocked(page));
> BUG_ON(mapping != page_mapping(page));
>
> - spin_lock_irqsave(&mapping->tree_lock, flags);
> /*
> * The non racy check for a busy page.
> *
> @@ -668,16 +688,18 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
> }
>
> if (PageSwapCache(page)) {
> - swp_entry_t swap = { .val = page_private(page) };
> + unsigned long swapval = page_private(page);
> + swp_entry_t swap = { .val = swapval };
> mem_cgroup_swapout(page, swap);
> __delete_from_swap_cache(page);
> - spin_unlock_irqrestore(&mapping->tree_lock, flags);
> - swapcache_free(swap);
> + set_page_private(page, swapval);
> + list_add(&page->lru, swapcache);
> } else {
> - void (*freepage)(struct page *);
> void *shadow = NULL;
> + void (*freepage)(struct page *);
>
> freepage = mapping->a_ops->freepage;
> +
> /*
> * Remember a shadow entry for reclaimed file cache in
> * order to detect refaults, thus thrashing, later on.
> @@ -698,16 +720,13 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
> !mapping_exiting(mapping) && !dax_mapping(mapping))
> shadow = workingset_eviction(mapping, page);
> __delete_from_page_cache(page, shadow);
> - spin_unlock_irqrestore(&mapping->tree_lock, flags);
> -
> - if (freepage != NULL)
> - freepage(page);
> + if (freepage)
> + list_add(&page->lru, filecache);
> }
>
> return 1;
>
> cannot_free:
> - spin_unlock_irqrestore(&mapping->tree_lock, flags);
> return 0;
> }
>
> @@ -719,16 +738,68 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
> */
> int remove_mapping(struct address_space *mapping, struct page *page)
> {
> - if (__remove_mapping(mapping, page, false)) {
> + unsigned long flags;
> + LIST_HEAD(swapcache);
> + LIST_HEAD(filecache);
> + void (*freepage)(struct page *);
> + int ret = 0;
> +
> + spin_lock_irqsave(&mapping->tree_lock, flags);
> + freepage = mapping->a_ops->freepage;
> +
> + if (__remove_mapping_page(mapping, page, false, &swapcache, &filecache)) {
> /*
> * Unfreezing the refcount with 1 rather than 2 effectively
> * drops the pagecache ref for us without requiring another
> * atomic operation.
> */
> page_ref_unfreeze(page, 1);
> - return 1;
> + ret = 1;
> + }
> + spin_unlock_irqrestore(&mapping->tree_lock, flags);
> + finalise_remove_mapping(&swapcache, &filecache, freepage);
> + return ret;
> +}
> +
> +static void remove_mapping_list(struct list_head *mapping_list,
> + struct list_head *free_pages,
> + struct list_head *ret_pages)
> +{
> + unsigned long flags;
> + struct address_space *mapping = NULL;
> + void (*freepage)(struct page *);
> + LIST_HEAD(swapcache);
> + LIST_HEAD(filecache);
> + struct page *page;
> +
> + while (!list_empty(mapping_list)) {
> + page = lru_to_page(mapping_list);
> + list_del(&page->lru);
> +
> + if (!mapping || page->mapping != mapping) {
> + if (mapping) {
> + spin_unlock_irqrestore(&mapping->tree_lock, flags);
> + finalise_remove_mapping(&swapcache, &filecache, freepage);
> + }
> +
> + mapping = page->mapping;
> + spin_lock_irqsave(&mapping->tree_lock, flags);
> + freepage = mapping->a_ops->freepage;
> + }
> +
> + if (!__remove_mapping_page(mapping, page, true, &swapcache, &filecache)) {
> + unlock_page(page);
> + list_add(&page->lru, ret_pages);
> + } else {
> + __ClearPageLocked(page);
> + list_add(&page->lru, free_pages);
> + }
> + }
> +
> + if (mapping) {
> + spin_unlock_irqrestore(&mapping->tree_lock, flags);
> + finalise_remove_mapping(&swapcache, &filecache, freepage);
> }
> - return 0;
> }
>
> /**
> @@ -910,6 +981,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> {
> LIST_HEAD(ret_pages);
> LIST_HEAD(free_pages);
> + LIST_HEAD(mapping_pages);
> int pgactivate = 0;
> unsigned long nr_unqueued_dirty = 0;
> unsigned long nr_dirty = 0;
> @@ -1206,17 +1278,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> }
>
> lazyfree:
> - if (!mapping || !__remove_mapping(mapping, page, true))
> + if (!mapping)
> goto keep_locked;
>
> - /*
> - * At this point, we have no other references and there is
> - * no way to pick any more up (removed from LRU, removed
> - * from pagecache). Can use non-atomic bitops now (and
> - * we obviously don't have to worry about waking up a process
> - * waiting on the page lock, because there are no references.
> - */
> - __ClearPageLocked(page);
> + list_add(&page->lru, &mapping_pages);
> + if (ret == SWAP_LZFREE)
> + count_vm_event(PGLAZYFREED);
> + continue;
> +
> free_it:
> if (ret == SWAP_LZFREE)
> count_vm_event(PGLAZYFREED);
> @@ -1251,6 +1320,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
> }
>
> + remove_mapping_list(&mapping_pages, &free_pages, &ret_pages);
> mem_cgroup_uncharge_list(&free_pages);
> try_to_unmap_flush();
> free_hot_cold_page_list(&free_pages, true);
>
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 7:11 ` Dave Chinner
@ 2016-08-18 13:24 ` Mel Gorman
0 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-18 13:24 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > (ie we could move the locking to the caller, and then make
> > > > shrink_page_list() just try to keep the lock held for a few pages if
> > > > the mapping doesn't change), and that might result in fewer crazy
> > > > cacheline ping-pongs overall. But that feels like exactly the wrong
> > > > kind of workaround.
> > > >
> > >
> > > Even if such batching was implemented, it would be very specific to the
> > > case of a single large file filling LRUs on multiple nodes.
> > >
> >
> > The latest Jason Bourne movie was sufficiently bad that I spent time
> > thinking how the tree_lock could be batched during reclaim. It's not
> > straight-forward but this prototype did not blow up on UMA and may be
> > worth considering if Dave can test whether either approach has a positive impact.
>
> So, I just did a couple of tests. I'll call the two patches "sleepy"
> for the contention backoff patch and "bourney" for the Jason Bourne
> inspired batching patch. This is an average of 3 runs, overwriting
> a 47GB file on a machine with 16GB RAM:
>
> IO throughput wall time __pv_queued_spin_lock_slowpath
> vanilla 470MB/s 1m42s 25-30%
> sleepy 295MB/s 2m43s <1%
> bourney 425MB/s 1m53s 25-30%
>
Thanks. I updated the tests today and reran them trying to reproduce what
you saw but I'm simply not seeing it on bare metal with a spinning disk.
xfsio Throughput
4.8.0-rc2 4.8.0-rc2 4.8.0-rc2
vanilla sleepy bourney
Min tput 147.4450 ( 0.00%) 147.2580 ( 0.13%) 147.3900 ( 0.04%)
Hmean tput 147.5853 ( 0.00%) 147.5101 ( 0.05%) 147.6121 ( -0.02%)
Stddev tput 0.1041 ( 0.00%) 0.1785 (-71.47%) 0.2036 (-95.63%)
CoeffVar tput 0.0705 ( 0.00%) 0.1210 (-71.56%) 0.1379 (-95.59%)
Max tput 147.6940 ( 0.00%) 147.6420 ( 0.04%) 147.8820 ( -0.13%)
I'm currently setting up a KVM instance that may fare better. Due to
quirks of where machines are, I have to set up the KVM instance on real
NUMA hardware but maybe that'll make the problem even more obvious.
> The overall CPU usage of sleepy was much lower than the others, but
> it was also much slower. Too much sleeping and not enough reclaim
> work being done, I think.
>
Looks like it. On my initial test, there was barely any sleeping.
> As for bourney, it's not immediately clear as to why it's nearly as
> bad as the movie. At worst I would have expected it to have no
> noticeable impact, but maybe we are delaying freeing of pages too
> long and so stalling allocation of new pages? It also doesn't do
> much to reduce contention, especially considering the reduction in
> throughput.
>
> On a hunch that the batch list isn't all one mapping, I sorted it.
> Patch is below if you're curious.
>
The fact that sorting makes such a difference makes me think that it's
the wrong direction. It's far too specific to this test case and does
nothing to throttle a reclaimer. It's also fairly complex and I expected
that normal users of remove_mapping such as truncation would take a hit.
The hit of bouncing the lock around just hurts too much.
> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
>
That is a terrifying "fix" for this problem. It just happens to work
because there is no spillover to other nodes so only one kswapd instance
is potentially active.
> Anyway, I've burnt enough erase cycles on this SSD for today....
>
I'll continue looking at getting KVM up and running and then consider
other possibilities for throttling.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 13:24 ` Mel Gorman
@ 2016-08-18 17:55 ` Linus Torvalds
0 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-18 17:55 UTC (permalink / raw)
To: Mel Gorman
Cc: Dave Chinner, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 6:24 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
> On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
>> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
>>
>
> That is a terrifying "fix" for this problem. It just happens to work
> because there is no spillover to other nodes so only one kswapd instance
> is potentially active.
Well, it may be a terrifying fix, but it does bring up an intriguing
notion: maybe what we should think about is to make the actual page
cache allocations be more "node-sticky" for a particular mapping? Not
some hard node binding, but if we were to make a single mapping *tend*
to allocate pages primarily within the same node, that would have the
kind of secondary advantage that it would avoid the cross-node mapping
locking.
Think of it as a gentler "guiding" fix to the spinlock contention
issue than a hard hammer.
And trying to (at least initially) keep the allocations of one
particular file to one particular node sounds like it could have other
locality advantages too.
In fact, looking at the __page_cache_alloc(), we already have that
"spread pages out" logic. I'm assuming Dave doesn't actually have that
bit set (I don't think it's the default), but I'm also envisioning
that maybe we could extend on that notion, and try to spread out
allocations in general, but keep page allocations from one particular
mapping within one node.
The fact that zone_reclaim_mode really improves on Dave's numbers
*that* dramatically does seem to imply that there is something to be
said for this.
We do *not* want to limit the whole page cache to a particular node -
that sounds very unreasonable in general. But limiting any particular
file mapping (by default - I'm sure there are things like databases
that just want their one DB file to take over all of memory) to a
single node sounds much less unreasonable.
What do you guys think? Worth exploring?
Linus
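[Editorial note: the per-mapping "node-sticky" idea above can be sketched as a userspace toy model. The names (`mapping_model`, `sticky_alloc_node`, the capacity constant) are illustrative, not kernel API: the first allocation pins the mapping to the allocating CPU's node, and later allocations prefer that node, spilling over only when it is full.]

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES      4
#define NODE_CAPACITY 1024

/* Toy model of a per-mapping "sticky" node: the first allocation
 * pins the mapping to the local node; later allocations prefer
 * that node and only spill over when it is full. */
struct mapping_model {
	int preferred_node;            /* -1 until first allocation */
	int pages_on_node[NR_NODES];
};

static int sticky_alloc_node(struct mapping_model *m, int local_node)
{
	int node, i;

	if (m->preferred_node < 0)
		m->preferred_node = local_node;   /* pin on first use */

	node = m->preferred_node;
	if (m->pages_on_node[node] < NODE_CAPACITY) {
		m->pages_on_node[node]++;
		return node;
	}
	/* Soft policy, not a hard binding: spill to any node with room. */
	for (i = 0; i < NR_NODES; i++) {
		if (m->pages_on_node[i] < NODE_CAPACITY) {
			m->pages_on_node[i]++;
			return i;
		}
	}
	return -1; /* all nodes full */
}
```

In this model a writer that migrates to another CPU still allocates from the mapping's pinned node, which is exactly the property that would keep the mapping's tree_lock traffic on one node.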
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 17:55 ` Linus Torvalds
@ 2016-08-18 21:19 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-18 21:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mel Gorman, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 10:55:01AM -0700, Linus Torvalds wrote:
> On Thu, Aug 18, 2016 at 6:24 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> >> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
> >>
> >
> > That is a terrifying "fix" for this problem. It just happens to work
> > because there is no spillover to other nodes so only one kswapd instance
> > is potentially active.
>
> Well, it may be a terrifying fix, but it does bring up an intriguing
> notion: maybe what we should think about is to make the actual page
> cache allocations be more "node-sticky" for a particular mapping? Not
> some hard node binding, but if we were to make a single mapping *tend*
> to allocate pages primarily within the same node, that would have the
> kind of secondary advantage that it would avoid the cross-node mapping
> locking.
For streaming or use-once IO it makes a lot of sense to restrict the
locality of the page cache. The faster the IO device, the less dirty
page buffering we need to maintain full device bandwidth. And the
larger the machine the greater the effect of global page cache
pollution on the other applications is.
> Think of it as a gentler "guiding" fix to the spinlock contention
> issue than a hard hammer.
>
> And trying to (at least initially) keep the allocations of one
> particular file to one particular node sounds like it could have other
> locality advantages too.
>
> In fact, looking at the __page_cache_alloc(), we already have that
> "spread pages out" logic. I'm assuming Dave doesn't actually have that
> bit set (I don't think it's the default), but I'm also envisioning
> that maybe we could extend on that notion, and try to spread out
> allocations in general, but keep page allocations from one particular
> mapping within one node.
CONFIG_CPUSETS=y
But I don't have any cpusets configured (unless systemd is doing
something wacky under the covers) so the page spread bit should not
be set.
> The fact that zone_reclaim_mode really improves on Dave's numbers
> *that* dramatically does seem to imply that there is something to be
> said for this.
>
> We do *not* want to limit the whole page cache to a particular node -
> that sounds very unreasonable in general. But limiting any particular
> file mapping (by default - I'm sure there are things like databases
> that just want their one DB file to take over all of memory) to a
> single node sounds much less unreasonable.
>
> What do you guys think? Worth exploring?
The problem is that whenever we turn this sort of behaviour on, some
benchmark regresses because it no longer holds its working set in
the page cache, leading to the change being immediately reverted.
Enterprise java benchmarks ring a bell, for some reason.
Hence my comment above about needing it to be tied into specific
"use-once-only" page cache behaviours. I know we have working set
estimation, fadvise modes and things like readahead that help track
sequential and use-once access patterns, but I'm not sure how we can
tie that all together....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 21:19 ` Dave Chinner
@ 2016-08-18 22:25 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-18 22:25 UTC (permalink / raw)
To: Dave Chinner
Cc: Mel Gorman, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 2:19 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> For streaming or use-once IO it makes a lot of sense to restrict the
> locality of the page cache. The faster the IO device, the less dirty
> page buffering we need to maintain full device bandwidth. And the
> larger the machine the greater the effect of global page cache
> pollution on the other applications is.
Yes. But I agree with you that it might be very hard to actually get
something that does a good job automagically.
>> In fact, looking at the __page_cache_alloc(), we already have that
>> "spread pages out" logic. I'm assuming Dave doesn't actually have that
>> bit set (I don't think it's the default), but I'm also envisioning
>> that maybe we could extend on that notion, and try to spread out
>> allocations in general, but keep page allocations from one particular
>> mapping within one node.
>
> CONFIG_CPUSETS=y
>
> But I don't have any cpusets configured (unless systemd is doing
> something wacky under the covers) so the page spread bit should not
> be set.
Yeah, but even when it's not set we just do a generic alloc_pages(),
which is just going to fill up all nodes. Not perhaps quite as "spread
out", but there's obviously no attempt to try to be node-aware either.
So _if_ we come up with some reasonable way to say "let's keep the
pages of this mapping together", we could try to do it in that
numa-aware __page_cache_alloc().
It *could* be as simple/stupid as just saying "let's allocate the page
cache for new pages from the current node" - and if the process that
dirties pages just stays around on one single node, that might already
be sufficient.
So just for testing purposes, you could try changing that
return alloc_pages(gfp, 0);
in __page_cache_alloc() into something like
return alloc_pages_node(cpu_to_node(raw_smp_processor_id()), gfp, 0);
or something.
>> The fact that zone_reclaim_mode really improves on Dave's numbers
>> *that* dramatically does seem to imply that there is something to be
>> said for this.
>>
>> We do *not* want to limit the whole page cache to a particular node -
>> that sounds very unreasonable in general. But limiting any particular
>> file mapping (by default - I'm sure there are things like databases
>> that just want their one DB file to take over all of memory) to a
>> single node sounds much less unreasonable.
>>
>> What do you guys think? Worth exploring?
>
> The problem is that whenever we turn this sort of behaviour on, some
> benchmark regresses because it no longer holds its working set in
> the page cache, leading to the change being immediately reverted.
> Enterprise java benchmarks ring a bell, for some reason.
Yeah. It might be ok if we limit the new behavior to just new pages
that get allocated for writing, which is where we want to limit the
page cache more anyway (we already have all those dirty limits etc).
But from a testing standpoint, you can probably try the above
"alloc_pages_node()" hack and see if it even makes a difference. It
might not work, and the dirtier might be moving around too much etc.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 22:25 ` Linus Torvalds
@ 2016-08-19 9:00 ` Michal Hocko
-1 siblings, 0 replies; 219+ messages in thread
From: Michal Hocko @ 2016-08-19 9:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Mel Gorman, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu 18-08-16 15:25:40, Linus Torvalds wrote:
[...]
> So just for testing purposes, you could try changing that
>
> return alloc_pages(gfp, 0);
>
> in __page_cache_alloc() into something like
>
> return alloc_pages_node(cpu_to_node(raw_smp_processor_id()), gfp, 0);
That would break mempolicies AFAICS. Anyway, I might be missing
something (the mempolicy code has a strange sense of aesthetics) but the
normal case without any explicit mempolicy should use default_policy,
which is MPOL_PREFERRED with MPOL_F_LOCAL, which means numa_node_id(),
i.e. the local node. So the above two should do the same thing unless
I'm missing something.
--
Michal Hocko
SUSE Labs
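[Editorial note: Michal's observation — that the default policy resolves MPOL_PREFERRED + MPOL_F_LOCAL to the local node — can be modeled with a toy policy-resolution function. This is a simplification; the enum values and struct layout are illustrative, not the kernel's definitions.]

```c
#include <assert.h>

/* Simplified model of mempolicy node selection. */
enum mpol_mode { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND };
#define MPOL_F_LOCAL 1

struct mempolicy {
	enum mpol_mode mode;
	int flags;
	int preferred_node;
};

/* Which node does an allocation under this policy target? */
static int policy_node(const struct mempolicy *pol, int local_node)
{
	if (pol->mode == MPOL_PREFERRED && !(pol->flags & MPOL_F_LOCAL))
		return pol->preferred_node;
	/* MPOL_PREFERRED + MPOL_F_LOCAL (the default policy) falls
	 * back to the local node, so it matches the plain
	 * alloc_pages_node(local_node, ...) variant. */
	return local_node;
}
```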
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 22:25 ` Linus Torvalds
@ 2016-08-19 10:49 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-19 10:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> >> In fact, looking at the __page_cache_alloc(), we already have that
> >> "spread pages out" logic. I'm assuming Dave doesn't actually have that
> >> bit set (I don't think it's the default), but I'm also envisioning
> >> that maybe we could extend on that notion, and try to spread out
> >> allocations in general, but keep page allocations from one particular
> >> mapping within one node.
> >
> > CONFIG_CPUSETS=y
> >
> > But I don't have any cpusets configured (unless systemd is doing
> > something wacky under the covers) so the page spread bit should not
> > be set.
>
> Yeah, but even when it's not set we just do a generic alloc_pages(),
> which is just going to fill up all nodes. Not perhaps quite as "spread
> out", but there's obviously no attempt to try to be node-aware either.
>
There is a slight difference. Reads should fill the nodes in turn but
dirty pages (__GFP_WRITE) get distributed to balance the number of dirty
pages on each node to avoid hitting dirty balance limits prematurely.
Yesterday I tried a patch that avoids distributing to remote nodes close
to the high watermark to avoid waking remote kswapd instances. It added a
lot of overhead to the fast path (3%) which hurts every writer but did not
reduce contention enough on the special case of writing a single large file.
As an aside, the dirty distribution check itself is very expensive so I
prototyped something that does the expensive calculations on a vmstat
update. Not sure if it'll work but it's a side issue.
> So _if_ we come up with some reasonable way to say "let's keep the
> pages of this mapping together", we could try to do it in that
> numa-aware __page_cache_alloc().
>
> It *could* be as simple/stupid as just saying "let's allocate the page
> cache for new pages from the current node" - and if the process that
> dirties pages just stays around on one single node, that might already
> be sufficient.
>
> So just for testing purposes, you could try changing that
>
> return alloc_pages(gfp, 0);
>
> in __page_cache_alloc() into something like
>
> return alloc_pages_node(cpu_to_node(raw_smp_processor_id()), gfp, 0);
>
> or something.
>
The test would be interesting but I believe that keeping heavy writers
on one node will force them to stall early on dirty balancing even if
there is plenty of free memory on other nodes.
--
Mel Gorman
SUSE Labs
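[Editorial note: the __GFP_WRITE spreading Mel describes can be sketched as a least-dirty-node pick. This is purely illustrative — the real node_dirty_ok() logic compares each node against per-node dirty limits rather than selecting a minimum — but it shows why heavy writers get distributed across nodes instead of filling one in turn.]

```c
#include <assert.h>

/* Toy model of spreading page-cache writes across NUMA nodes:
 * route each new dirty page to the node with the fewest dirty
 * pages so no single node hits its dirty limit prematurely. */
static int pick_write_node(const long dirty[], int nr_nodes)
{
	int best = 0;
	for (int i = 1; i < nr_nodes; i++)
		if (dirty[i] < dirty[best])
			best = i;
	return best;
}
```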
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 7:11 ` Dave Chinner
@ 2016-08-19 15:08 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-19 15:08 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > (ie we could move the locking to the caller, and then make
> > > > shrink_page_list() just try to keep the lock held for a few pages if
> > > > the mapping doesn't change), and that might result in fewer crazy
> > > > cacheline ping-pongs overall. But that feels like exactly the wrong
> > > > kind of workaround.
> > > >
> > >
> > > Even if such batching was implemented, it would be very specific to the
> > > case of a single large file filling LRUs on multiple nodes.
> > >
> >
> > The latest Jason Bourne movie was sufficiently bad that I spent time
> > thinking how the tree_lock could be batched during reclaim. It's not
> > straight-forward but this prototype did not blow up on UMA and may be
> > worth considering if Dave can test either approach has a positive impact.
>
> SO, I just did a couple of tests. I'll call the two patches "sleepy"
> for the contention backoff patch and "bourney" for the Jason Bourne
> inspired batching patch. This is an average of 3 runs, overwriting
> a 47GB file on a machine with 16GB RAM:
>
> IO throughput wall time __pv_queued_spin_lock_slowpath
> vanilla 470MB/s 1m42s 25-30%
> sleepy 295MB/s 2m43s <1%
> bourney 425MB/s 1m53s 25-30%
>
This is another blunt-force patch that
a) stalls all but one kswapd instance when treelock contention occurs
b) marks a pgdat congested when tree_lock contention is encountered
which may cause direct reclaimers to wait_iff_congested until
kswapd finishes balancing the node
I tested this on a KVM instance running on a 4-socket box. The vCPUs
were bound to pCPUs and the memory nodes in the KVM mapped to physical
memory nodes. Without the patch 3% of kswapd cycles were spent on
locking. With the patch, the cycle count was 0.23%
xfs_io contention was reduced from 0.63% to 0.39% which is not perfect.
It can be reduced by stalling all kswapd instances, but then xfs_io
enters direct reclaim and throughput drops.
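[Editorial note: the stall decision in the patch below can be modeled in isolation as a single-threaded toy. The real code uses test_and_set_bit and an atomic counter; the names here mirror the patch but are not the kernel's.]

```c
#include <assert.h>
#include <stdbool.h>

/* The first kswapd to hit tree_lock contention on its node marks
 * the node contended and bumps a global count; a kswapd only backs
 * off (congestion_wait in the real patch) once more than one
 * node's kswapd is contending. */
static int kswapd_contended_count;

struct pgdat_model {
	bool contended;                /* PGDAT_CONTENDED in the patch */
};

/* Returns true when this kswapd should stall. */
static bool note_contention(struct pgdat_model *pgdat)
{
	int nr_kswapd;

	if (!pgdat->contended) {
		pgdat->contended = true;
		nr_kswapd = ++kswapd_contended_count;
	} else {
		nr_kswapd = kswapd_contended_count;
	}
	return nr_kswapd > 1;
}
```

A lone contended kswapd keeps running at full speed; only when a second node's kswapd also contends does anyone wait, which is why a single-node workload is unaffected.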
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d572b78b65e1..f6d3e886f405 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -532,6 +532,7 @@ enum pgdat_flags {
* many pages under writeback
*/
PGDAT_RECLAIM_LOCKED, /* prevents concurrent reclaim */
+ PGDAT_CONTENDED, /* kswapd contending on tree_lock */
};
static inline unsigned long zone_end_pfn(const struct zone *zone)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 374d95d04178..64ca2148755c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -621,19 +621,43 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
return PAGE_CLEAN;
}
+static atomic_t kswapd_contended = ATOMIC_INIT(0);
+
/*
* Same as remove_mapping, but if the page is removed from the mapping, it
* gets returned with a refcount of 0.
*/
static int __remove_mapping(struct address_space *mapping, struct page *page,
- bool reclaimed)
+ bool reclaimed, unsigned long *nr_contended)
{
unsigned long flags;
BUG_ON(!PageLocked(page));
BUG_ON(mapping != page_mapping(page));
- spin_lock_irqsave(&mapping->tree_lock, flags);
+ if (!nr_contended || !current_is_kswapd())
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ else {
+ /* Account for trylock contentions in kswapd */
+ if (!spin_trylock_irqsave(&mapping->tree_lock, flags)) {
+ pg_data_t *pgdat = page_pgdat(page);
+ int nr_kswapd;
+
+ /* Account for contended pages and contended kswapds */
+ (*nr_contended)++;
+ if (!test_and_set_bit(PGDAT_CONTENDED, &pgdat->flags))
+ nr_kswapd = atomic_inc_return(&kswapd_contended);
+ else
+ nr_kswapd = atomic_read(&kswapd_contended);
+ BUG_ON(nr_kswapd > nr_online_nodes || nr_kswapd < 0);
+
+ /* Stall kswapd if multiple kswapds are contending */
+ if (nr_kswapd > 1)
+ congestion_wait(BLK_RW_ASYNC, HZ/10);
+
+ spin_lock_irqsave(&mapping->tree_lock, flags);
+ }
+ }
/*
* The non racy check for a busy page.
*
@@ -719,7 +743,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
*/
int remove_mapping(struct address_space *mapping, struct page *page)
{
- if (__remove_mapping(mapping, page, false)) {
+ if (__remove_mapping(mapping, page, false, NULL)) {
/*
* Unfreezing the refcount with 1 rather than 2 effectively
* drops the pagecache ref for us without requiring another
@@ -906,6 +930,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
unsigned long *ret_nr_congested,
unsigned long *ret_nr_writeback,
unsigned long *ret_nr_immediate,
+ unsigned long *ret_nr_contended,
bool force_reclaim)
{
LIST_HEAD(ret_pages);
@@ -917,6 +942,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
unsigned long nr_reclaimed = 0;
unsigned long nr_writeback = 0;
unsigned long nr_immediate = 0;
+ unsigned long nr_contended = 0;
cond_resched();
@@ -1206,7 +1232,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
}
lazyfree:
- if (!mapping || !__remove_mapping(mapping, page, true))
+ if (!mapping || !__remove_mapping(mapping, page, true, &nr_contended))
goto keep_locked;
/*
@@ -1263,6 +1289,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
*ret_nr_unqueued_dirty += nr_unqueued_dirty;
*ret_nr_writeback += nr_writeback;
*ret_nr_immediate += nr_immediate;
+ *ret_nr_contended += nr_contended;
return nr_reclaimed;
}
@@ -1274,7 +1301,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
.priority = DEF_PRIORITY,
.may_unmap = 1,
};
- unsigned long ret, dummy1, dummy2, dummy3, dummy4, dummy5;
+ unsigned long ret, dummy1, dummy2, dummy3, dummy4, dummy5, dummy6;
struct page *page, *next;
LIST_HEAD(clean_pages);
@@ -1288,7 +1315,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
ret = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
TTU_UNMAP|TTU_IGNORE_ACCESS,
- &dummy1, &dummy2, &dummy3, &dummy4, &dummy5, true);
+ &dummy1, &dummy2, &dummy3, &dummy4, &dummy5, &dummy6, true);
list_splice(&clean_pages, page_list);
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -ret);
return ret;
@@ -1693,6 +1720,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
unsigned long nr_unqueued_dirty = 0;
unsigned long nr_writeback = 0;
unsigned long nr_immediate = 0;
+ unsigned long nr_contended = 0;
isolate_mode_t isolate_mode = 0;
int file = is_file_lru(lru);
struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -1738,7 +1766,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, TTU_UNMAP,
&nr_dirty, &nr_unqueued_dirty, &nr_congested,
- &nr_writeback, &nr_immediate,
+ &nr_writeback, &nr_immediate, &nr_contended,
false);
spin_lock_irq(&pgdat->lru_lock);
@@ -1789,6 +1817,15 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
set_bit(PGDAT_CONGESTED, &pgdat->flags);
/*
+ * Tag a zone as congested if kswapd encounters contended pages
+ * as it may indicate contention with a heavy writer or
+ * other kswapd instances. The tag may stall direct reclaimers
+ * in wait_iff_congested.
+ */
+ if (nr_contended && current_is_kswapd())
+ set_bit(PGDAT_CONGESTED, &pgdat->flags);
+
+ /*
* If dirty pages are scanned that are not queued for IO, it
* implies that flushers are not keeping up. In this case, flag
* the pgdat PGDAT_DIRTY and kswapd will start writing pages from
@@ -1805,6 +1842,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
*/
if (nr_immediate && current_may_throttle())
congestion_wait(BLK_RW_ASYNC, HZ/10);
+
}
/*
@@ -3109,6 +3147,9 @@ static bool zone_balanced(struct zone *zone, int order, int classzone_idx)
clear_bit(PGDAT_CONGESTED, &zone->zone_pgdat->flags);
clear_bit(PGDAT_DIRTY, &zone->zone_pgdat->flags);
+ if (test_and_clear_bit(PGDAT_CONTENDED, &zone->zone_pgdat->flags))
+ atomic_dec(&kswapd_contended);
+
return true;
}
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-19 10:49 ` Mel Gorman
@ 2016-08-19 23:48 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-08-19 23:48 UTC (permalink / raw)
To: Mel Gorman
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Fri, Aug 19, 2016 at 11:49:46AM +0100, Mel Gorman wrote:
> On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> > It *could* be as simple/stupid as just saying "let's allocate the page
> > cache for new pages from the current node" - and if the process that
> > dirties pages just stays around on one single node, that might already
> > be sufficient.
> >
> > So just for testing purposes, you could try changing that
> >
> > return alloc_pages(gfp, 0);
> >
> > in __page_cache_alloc() into something like
> >
> > return alloc_pages_node(cpu_to_node(raw_smp_processor_id())), gfp, 0);
> >
> > or something.
> >
>
> The test would be interesting but I believe that keeping heavy writers
> on one node will force them to stall early on dirty balancing even if
> there is plenty of free memory on other nodes.
Well, it depends on the speed of the storage. The higher the speed
of the storage, the less we care about stalling on dirty pages
during reclaim. i.e. faster storage == shorter stalls. We really
should stop thinking we need to optimise reclaim purely for the
benefit of slow disks. 500MB/s write speed with latencies of
under a couple of milliseconds is common hardware these days. pcie
based storage (e.g. m2, nvme) is rapidly becoming commonplace and
they can easily do 1-2GB/s write speeds.
The fast storage devices that are arriving need to be treated
more like a fast network device (e.g. a pci-e 4x nvme SSD has the
throughput of 2x10GbE devices). We have to consider if buffering
streaming data in the page cache for any longer than it takes to get
the data to userspace or to disk is worth the cost of reclaiming it
from the page cache.
Really, the question that needs to be answered is this: if we can
pull data from the storage at similar speeds and latencies as we can
from the page cache, then *why are we caching that data*?
We've already made that "don't cache for fast storage" decision in
the case of pmem - the DAX IO path is slowly moving towards making
full use of the mapping infrastructure for all its tracking
requirements. pcie based storage is a bit slower than pmem, but
the principle is the same - the storage is sufficiently fast that
caching only really makes sense for data that is really hot...
I think the underlying principle here is that the faster the backing
device, the less we should cache and buffer the device in the OS. I
suspect a good initial approximation of "stickiness" for the page
cache would the speed of writeback as measured by the BDI underlying
the mapping....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-19 23:48 ` Dave Chinner
@ 2016-08-20 1:08 ` Linus Torvalds
-1 siblings, 0 replies; 219+ messages in thread
From: Linus Torvalds @ 2016-08-20 1:08 UTC (permalink / raw)
To: Dave Chinner
Cc: Mel Gorman, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Fri, Aug 19, 2016 at 4:48 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> Well, it depends on the speed of the storage. The higher the speed
> of the storage, the less we care about stalling on dirty pages
> during reclaim
Actually, that's largely true independently of the speed of the storage, I feel.
On really fast storage, you might as well push it out, and buffering
lots of dirty memory is pointless. And on really slow storage, buffering
lots of dirty memory is absolutely *horrible* from a latency
standpoint.
So I don't think this is about fast-vs-slow disks.
I think a lot of our "let's aggressively buffer dirty data" is
entirely historical. When you had 16MB of RAM in a workstation,
aggressively using half of it for writeback caches meant that you
could do things like untar source trees without waiting for IO.
But when you have 16GB of RAM in a workstation, and terabytes of RAM
in multi-node big machines, it's kind of silly to talk about
"percentages of memory available" for dirty data. I think it's likely
silly to even see "one node worth of memory" as being some limiter.
So I think we should try to avoid stalling on dirty pages during
reclaim by simply aiming to have fewer dirty pages in the first place.
Not because the stall is shorter on a fast disk, but because we just
shouldn't use that much memory for dirty data.
Linus
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-19 23:48 ` Dave Chinner
@ 2016-08-20 12:16 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-20 12:16 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Sat, Aug 20, 2016 at 09:48:39AM +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2016 at 11:49:46AM +0100, Mel Gorman wrote:
> > On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> > > It *could* be as simple/stupid as just saying "let's allocate the page
> > > cache for new pages from the current node" - and if the process that
> > > dirties pages just stays around on one single node, that might already
> > > be sufficient.
> > >
> > > So just for testing purposes, you could try changing that
> > >
> > > return alloc_pages(gfp, 0);
> > >
> > > in __page_cache_alloc() into something like
> > >
> > > return alloc_pages_node(cpu_to_node(raw_smp_processor_id()), gfp, 0);
> > >
> > > or something.
> > >
> >
> > The test would be interesting but I believe that keeping heavy writers
> > on one node will force them to stall early on dirty balancing even if
> > there is plenty of free memory on other nodes.
>
> Well, it depends on the speed of the storage. The higher the speed
> of the storage, the less we care about stalling on dirty pages
> during reclaim. i.e. faster storage == shorter stalls. We really
> should stop thinking we need to optimise reclaim purely for the
> benefit of slow disks. 500MB/s write speed with latencies of a
> under a couple of milliseconds is common hardware these days. pcie
> based storage (e.g. m2, nvme) is rapidly becoming commonplace and
> they can easily do 1-2GB/s write speeds.
>
I partially agree. I've long been of the opinion that a dirty_time
limit would be desirable: cap the amount of dirty data by the time in
microseconds required to sync it, with a default of something like 5
seconds. It's non-trivial, as the write speed of every BDI would have
to be estimated, and on rotary storage the estimate would be unreliable.
A short-term practical idea would be to distribute pages for writing
only when the dirty limit is almost reached on a given node. For fast
storage, the distribution may never happen.
Neither idea would actually impact the current problem though unless it
was combined with discarding clean cache aggressively if the underlying
storage is fast. Hence, it would still be nice if the contention problem
could be mitigated. Did that last patch help any?
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-15 20:30 ` Huang, Ying
@ 2016-08-22 22:09 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-22 22:09 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Linus Torvalds, Dave Chinner, LKML, Bob Peterson, Fengguang Wu,
LKP, Huang, Ying
Hi, Christoph,
"Huang, Ying" <ying.huang@intel.com> writes:
> Christoph Hellwig <hch@lst.de> writes:
>
>> Snipping the long contest:
>>
>> I think there are three observations here:
>>
>> (1) removing the mark_page_accessed (which is the only significant
>> change in the parent commit) hurts the
>> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
>> I'd still rather stick to the filemap version and let the
>> VM people sort it out. How do the numbers for this test
>> look for XFS vs say ext4 and btrfs?
>> (2) lots of additional spinlock contention in the new case. A quick
>> check shows that I fat-fingered my rewrite so that we do
>> the xfs_inode_set_eofblocks_tag call now for the pure lookup
>> case, and pretty much all new cycles come from that.
>> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>> we're already doing way too many even without my little bug above.
>>
>> So I've force pushed a new version of the iomap-fixes branch with
>> (2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
>> lot less expensive slotted in before that. Would be good to see
>> the numbers with that.
>
> For the original reported regression, the test result is as follow,
>
> =========================================================================================
> compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
> gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
>
> commit:
> f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506 (parent of first bad commit)
> 68a9f5e7007c1afa2cf6830b690a90d0187c0684 (first bad commit)
> 99091700659f4df965e138b38b4fa26a29b7eade (base of your fixes branch)
> bf4dc6e4ecc2a3d042029319bc8cd4204c185610 (head of your fixes branch)
>
> f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 99091700659f4df965e138b38b bf4dc6e4ecc2a3d042029319bc
> ---------------- -------------------------- -------------------------- --------------------------
> %stddev %change %stddev %change %stddev %change %stddev
> \ | \ | \ | \
> 484435 ± 0% -13.3% 420004 ± 0% -17.0% 402250 ± 0% -15.6% 408998 ± 0% aim7.jobs-per-min
It appears the originally reported regression hasn't been resolved by your
commit. Could you take a look at the test results and the perf data?
Best Regards,
Huang, Ying
>
> And the perf data is as follow,
>
> "perf-profile.func.cycles-pp.intel_idle": 20.25,
> "perf-profile.func.cycles-pp.memset_erms": 11.72,
> "perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 8.37,
> "perf-profile.func.cycles-pp.__block_commit_write.isra.21": 3.49,
> "perf-profile.func.cycles-pp.block_write_end": 1.77,
> "perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 1.63,
> "perf-profile.func.cycles-pp.unlock_page": 1.58,
> "perf-profile.func.cycles-pp.___might_sleep": 1.56,
> "perf-profile.func.cycles-pp.__block_write_begin_int": 1.33,
> "perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 1.23,
> "perf-profile.func.cycles-pp.up_write": 1.21,
> "perf-profile.func.cycles-pp.__mark_inode_dirty": 1.18,
> "perf-profile.func.cycles-pp.down_write": 1.06,
> "perf-profile.func.cycles-pp.mark_buffer_dirty": 0.94,
> "perf-profile.func.cycles-pp.generic_write_end": 0.92,
> "perf-profile.func.cycles-pp.__radix_tree_lookup": 0.91,
> "perf-profile.func.cycles-pp._raw_spin_lock": 0.81,
> "perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 0.79,
> "perf-profile.func.cycles-pp.__might_sleep": 0.79,
> "perf-profile.func.cycles-pp.xfs_file_iomap_begin_delay.isra.9": 0.7,
> "perf-profile.func.cycles-pp.__list_del_entry": 0.7,
> "perf-profile.func.cycles-pp.vfs_write": 0.69,
> "perf-profile.func.cycles-pp.drop_buffers": 0.68,
> "perf-profile.func.cycles-pp.xfs_file_write_iter": 0.67,
> "perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.67,
>
> Best Regards,
> Huang, Ying
> _______________________________________________
> LKP mailing list
> LKP@lists.01.org
> https://lists.01.org/mailman/listinfo/lkp
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
@ 2016-08-22 22:09 ` Huang, Ying
0 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-22 22:09 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 4362 bytes --]
Hi, Christoph,
"Huang, Ying" <ying.huang@intel.com> writes:
> Christoph Hellwig <hch@lst.de> writes:
>
>> Snipping the long contest:
>>
>> I think there are three observations here:
>>
>> (1) removing the mark_page_accessed (which is the only significant
>> change in the parent commit) hurts the
>> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
>> I'd still rather stick to the filemap version and let the
>> VM people sort it out. How do the numbers for this test
>> look for XFS vs say ext4 and btrfs?
>> (2) lots of additional spinlock contention in the new case. A quick
>> check shows that I fat-fingered my rewrite so that we do
>> the xfs_inode_set_eofblocks_tag call now for the pure lookup
>> case, and pretty much all new cycles come from that.
>> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>> we're already doing way to many even without my little bug above.
>>
>> So I've force pushed a new version of the iomap-fixes branch with
>> (2) fixed, and also a little patch to xfs_inode_set_eofblocks_tag a
>> lot less expensive slotted in before that. Would be good to see
>> the numbers with that.
>
> For the original reported regression, the test result is as follow,
>
> =========================================================================================
> compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
> gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
>
> commit:
> f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506 (parent of first bad commit)
> 68a9f5e7007c1afa2cf6830b690a90d0187c0684 (first bad commit)
> 99091700659f4df965e138b38b4fa26a29b7eade (base of your fixes branch)
> bf4dc6e4ecc2a3d042029319bc8cd4204c185610 (head of your fixes branch)
>
> f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 99091700659f4df965e138b38b bf4dc6e4ecc2a3d042029319bc
> ---------------- -------------------------- -------------------------- --------------------------
> %stddev %change %stddev %change %stddev %change %stddev
> \ | \ | \ | \
> 484435 ± 0% -13.3% 420004 ± 0% -17.0% 402250 ± 0% -15.6% 408998 ± 0% aim7.jobs-per-min
It appears the original reported regression hasn't been resolved by your
commit. Could you take a look at the test results and the perf data?
Best Regards,
Huang, Ying
>
> And the perf data is as follow,
>
> "perf-profile.func.cycles-pp.intel_idle": 20.25,
> "perf-profile.func.cycles-pp.memset_erms": 11.72,
> "perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 8.37,
> "perf-profile.func.cycles-pp.__block_commit_write.isra.21": 3.49,
> "perf-profile.func.cycles-pp.block_write_end": 1.77,
> "perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 1.63,
> "perf-profile.func.cycles-pp.unlock_page": 1.58,
> "perf-profile.func.cycles-pp.___might_sleep": 1.56,
> "perf-profile.func.cycles-pp.__block_write_begin_int": 1.33,
> "perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 1.23,
> "perf-profile.func.cycles-pp.up_write": 1.21,
> "perf-profile.func.cycles-pp.__mark_inode_dirty": 1.18,
> "perf-profile.func.cycles-pp.down_write": 1.06,
> "perf-profile.func.cycles-pp.mark_buffer_dirty": 0.94,
> "perf-profile.func.cycles-pp.generic_write_end": 0.92,
> "perf-profile.func.cycles-pp.__radix_tree_lookup": 0.91,
> "perf-profile.func.cycles-pp._raw_spin_lock": 0.81,
> "perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 0.79,
> "perf-profile.func.cycles-pp.__might_sleep": 0.79,
> "perf-profile.func.cycles-pp.xfs_file_iomap_begin_delay.isra.9": 0.7,
> "perf-profile.func.cycles-pp.__list_del_entry": 0.7,
> "perf-profile.func.cycles-pp.vfs_write": 0.69,
> "perf-profile.func.cycles-pp.drop_buffers": 0.68,
> "perf-profile.func.cycles-pp.xfs_file_write_iter": 0.67,
> "perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.67,
>
> Best Regards,
> Huang, Ying
> _______________________________________________
> LKP mailing list
> LKP(a)lists.01.org
> https://lists.01.org/mailman/listinfo/lkp
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-18 0:45 ` Mel Gorman
@ 2016-08-24 15:40 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-08-24 15:40 UTC (permalink / raw)
To: Mel Gorman
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Dave Chinner, Johannes Weiner, Vlastimil Babka, Andrew Morton,
Bob Peterson, Kirill A. Shutemov, Huang, Ying, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
Hi, Mel,
Mel Gorman <mgorman@techsingularity.net> writes:
> On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
>> > Yes, we could try to batch the locking like DaveC already suggested
>> > (ie we could move the locking to the caller, and then make
>> > shrink_page_list() just try to keep the lock held for a few pages if
>> > the mapping doesn't change), and that might result in fewer crazy
>> > cacheline ping-pongs overall. But that feels like exactly the wrong
>> > kind of workaround.
>> >
>>
>> Even if such batching was implemented, it would be very specific to the
>> case of a single large file filling LRUs on multiple nodes.
>>
>
> The latest Jason Bourne movie was sufficiently bad that I spent time
> thinking how the tree_lock could be batched during reclaim. It's not
> straight-forward but this prototype did not blow up on UMA and may be
> worth considering if Dave can test either approach has a positive impact.
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 374d95d04178..926110219cd9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -621,19 +621,39 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
> return PAGE_CLEAN;
> }
We found this patch helps swap-out performance a lot, since there is
usually only one mapping for all swap pages. In our 16-process
sequential swap write test case for a ramdisk on a Xeon E5 v3 machine,
the swap-out throughput improved 40.4%, from ~0.97GB/s to ~1.36GB/s.
What's your plan for this patch? If it can be merged soon, that will be
great!
I found some issues in the original patch when working with the swap
cache. Below are my fixes to make it work for the swap cache.
Best Regards,
Huang, Ying
-------------------------------------------------------------------->
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ac5fbff..dcaf295 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -623,22 +623,28 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
static void finalise_remove_mapping(struct list_head *swapcache,
struct list_head *filecache,
+ struct list_head *free_pages,
void (*freepage)(struct page *))
{
struct page *page;
while (!list_empty(swapcache)) {
- swp_entry_t swap = { .val = page_private(page) };
+ swp_entry_t swap;
page = lru_to_page(swapcache);
list_del(&page->lru);
+ swap.val = page_private(page);
swapcache_free(swap);
set_page_private(page, 0);
+ if (free_pages)
+ list_add(&page->lru, free_pages);
}
while (!list_empty(filecache)) {
- page = lru_to_page(swapcache);
+ page = lru_to_page(filecache);
list_del(&page->lru);
freepage(page);
+ if (free_pages)
+ list_add(&page->lru, free_pages);
}
}
@@ -649,7 +655,8 @@ static void finalise_remove_mapping(struct list_head *swapcache,
static int __remove_mapping_page(struct address_space *mapping,
struct page *page, bool reclaimed,
struct list_head *swapcache,
- struct list_head *filecache)
+ struct list_head *filecache,
+ struct list_head *free_pages)
{
BUG_ON(!PageLocked(page));
BUG_ON(mapping != page_mapping(page));
@@ -722,6 +729,8 @@ static int __remove_mapping_page(struct address_space *mapping,
__delete_from_page_cache(page, shadow);
if (freepage)
list_add(&page->lru, filecache);
+ else if (free_pages)
+ list_add(&page->lru, free_pages);
}
return 1;
@@ -747,7 +756,7 @@ int remove_mapping(struct address_space *mapping, struct page *page)
spin_lock_irqsave(&mapping->tree_lock, flags);
freepage = mapping->a_ops->freepage;
- if (__remove_mapping_page(mapping, page, false, &swapcache, &filecache)) {
+ if (__remove_mapping_page(mapping, page, false, &swapcache, &filecache, NULL)) {
/*
* Unfreezing the refcount with 1 rather than 2 effectively
* drops the pagecache ref for us without requiring another
@@ -757,7 +766,7 @@ int remove_mapping(struct address_space *mapping, struct page *page)
ret = 1;
}
spin_unlock_irqrestore(&mapping->tree_lock, flags);
- finalise_remove_mapping(&swapcache, &filecache, freepage);
+ finalise_remove_mapping(&swapcache, &filecache, NULL, freepage);
return ret;
}
@@ -776,29 +785,28 @@ static void remove_mapping_list(struct list_head *mapping_list,
page = lru_to_page(mapping_list);
list_del(&page->lru);
- if (!mapping || page->mapping != mapping) {
+ if (!mapping || page_mapping(page) != mapping) {
if (mapping) {
spin_unlock_irqrestore(&mapping->tree_lock, flags);
- finalise_remove_mapping(&swapcache, &filecache, freepage);
+ finalise_remove_mapping(&swapcache, &filecache, free_pages, freepage);
}
- mapping = page->mapping;
+ mapping = page_mapping(page);
spin_lock_irqsave(&mapping->tree_lock, flags);
freepage = mapping->a_ops->freepage;
}
- if (!__remove_mapping_page(mapping, page, true, &swapcache, &filecache)) {
+ if (!__remove_mapping_page(mapping, page, true, &swapcache,
+ &filecache, free_pages)) {
unlock_page(page);
list_add(&page->lru, ret_pages);
- } else {
+ } else
__ClearPageLocked(page);
- list_add(&page->lru, free_pages);
- }
}
if (mapping) {
spin_unlock_irqrestore(&mapping->tree_lock, flags);
- finalise_remove_mapping(&swapcache, &filecache, freepage);
+ finalise_remove_mapping(&swapcache, &filecache, free_pages, freepage);
}
}
^ permalink raw reply related [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-24 15:40 ` Huang, Ying
@ 2016-08-25 9:37 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-08-25 9:37 UTC (permalink / raw)
To: Huang, Ying
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Dave Chinner, Johannes Weiner, Vlastimil Babka, Andrew Morton,
Bob Peterson, Kirill A. Shutemov, Christoph Hellwig,
Wu Fengguang, LKP, Tejun Heo, LKML
On Wed, Aug 24, 2016 at 08:40:37AM -0700, Huang, Ying wrote:
> Mel Gorman <mgorman@techsingularity.net> writes:
>
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> >> > Yes, we could try to batch the locking like DaveC already suggested
> >> > (ie we could move the locking to the caller, and then make
> >> > shrink_page_list() just try to keep the lock held for a few pages if
> >> > the mapping doesn't change), and that might result in fewer crazy
> >> > cacheline ping-pongs overall. But that feels like exactly the wrong
> >> > kind of workaround.
> >> >
> >>
> >> Even if such batching was implemented, it would be very specific to the
> >> case of a single large file filling LRUs on multiple nodes.
> >>
> >
> > The latest Jason Bourne movie was sufficiently bad that I spent time
> > thinking how the tree_lock could be batched during reclaim. It's not
> > straight-forward but this prototype did not blow up on UMA and may be
> > worth considering if Dave can test either approach has a positive impact.
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 374d95d04178..926110219cd9 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -621,19 +621,39 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
> > return PAGE_CLEAN;
> > }
>
> We found this patch helps much for swap out performance, where there are
> usually only one mapping for all swap pages.
Yeah, I expected it would be an unconditional win on swapping. I just
did not concentrate on it very much as it was not the problem at hand.
> In our 16 processes
> sequential swap write test case for a ramdisk on a Xeon E5 v3 machine,
> the swap out throughput improved 40.4%, from ~0.97GB/s to ~1.36GB/s.
Ok, so the main benefit would be for ultra-fast storage. I doubt it's
noticeable on slow disks.
> What's your plan for this patch? If it can be merged soon, that will be
> great!
>
Until this mail, no plan. I'm still waiting to hear if Dave's test case
has improved with the latest prototype for reducing contention.
> I found some issues in the original patch to work with swap cache. Below
> is my fixes to make it work for swap cache.
>
Thanks for the fix. I'm going offline today for a few days but I added a
todo item to finish this patch at some point. I won't be rushing it but
it'll get done eventually.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-19 15:08 ` Mel Gorman
@ 2016-09-01 23:32 ` Dave Chinner
-1 siblings, 0 replies; 219+ messages in thread
From: Dave Chinner @ 2016-09-01 23:32 UTC (permalink / raw)
To: Mel Gorman
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
> On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > > (ie we could move the locking to the caller, and then make
> > > > > shrink_page_list() just try to keep the lock held for a few pages if
> > > > > the mapping doesn't change), and that might result in fewer crazy
> > > > > cacheline ping-pongs overall. But that feels like exactly the wrong
> > > > > kind of workaround.
> > > > >
> > > >
> > > > Even if such batching was implemented, it would be very specific to the
> > > > case of a single large file filling LRUs on multiple nodes.
> > > >
> > >
> > > The latest Jason Bourne movie was sufficiently bad that I spent time
> > > thinking how the tree_lock could be batched during reclaim. It's not
> > > straight-forward but this prototype did not blow up on UMA and may be
> > > worth considering if Dave can test either approach has a positive impact.
> >
> > SO, I just did a couple of tests. I'll call the two patches "sleepy"
> > for the contention backoff patch and "bourney" for the Jason Bourne
> > inspired batching patch. This is an average of 3 runs, overwriting
> > a 47GB file on a machine with 16GB RAM:
> >
> > IO throughput wall time __pv_queued_spin_lock_slowpath
> > vanilla 470MB/s 1m42s 25-30%
> > sleepy 295MB/s 2m43s <1%
> > bourney 425MB/s 1m53s 25-30%
> >
>
> This is another blunt-force patch that
Sorry for taking so long to get back to this - had a bunch of other
stuff to do (e.g. XFS metadata CRCs have found their first compiler
bug) and haven't had time to test this.
The blunt force approach seems to work ok:
          IO throughput   wall time   __pv_queued_spin_lock_slowpath
vanilla   470MB/s         1m42s       25-30%
sleepy    295MB/s         2m43s       <1%
bourney   425MB/s         1m53s       25-30%
blunt     470MB/s         1m41s       ~2%
Performance is pretty much the same as the vanilla kernel - maybe
a little bit faster if we consider median rather than mean results.
A snapshot profile from 'perf top -U' looks like:
11.31% [kernel] [k] copy_user_generic_string
3.59% [kernel] [k] get_page_from_freelist
3.22% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
2.80% [kernel] [k] __block_commit_write.isra.29
2.14% [kernel] [k] __pv_queued_spin_lock_slowpath
1.99% [kernel] [k] _raw_spin_lock
1.98% [kernel] [k] wake_all_kswapds
1.92% [kernel] [k] _raw_spin_lock_irqsave
1.90% [kernel] [k] node_dirty_ok
1.69% [kernel] [k] __wake_up_bit
1.57% [kernel] [k] ___might_sleep
1.49% [kernel] [k] __might_sleep
1.24% [kernel] [k] __radix_tree_lookup
1.18% [kernel] [k] kmem_cache_alloc
1.13% [kernel] [k] update_fast_ctr
1.11% [kernel] [k] radix_tree_tag_set
1.08% [kernel] [k] clear_page_dirty_for_io
1.06% [kernel] [k] down_write
1.06% [kernel] [k] up_write
1.01% [kernel] [k] unlock_page
0.99% [kernel] [k] xfs_log_commit_cil
0.97% [kernel] [k] __inc_node_state
0.95% [kernel] [k] __memset
0.89% [kernel] [k] xfs_do_writepage
0.89% [kernel] [k] __list_del_entry
0.87% [kernel] [k] __vfs_write
0.85% [kernel] [k] xfs_inode_item_format
0.84% [kernel] [k] shrink_page_list
0.82% [kernel] [k] kmem_cache_free
0.79% [kernel] [k] radix_tree_tag_clear
0.78% [kernel] [k] _raw_spin_lock_irq
0.77% [kernel] [k] _raw_spin_unlock_irqrestore
0.76% [kernel] [k] node_page_state
0.72% [kernel] [k] xfs_count_page_state
0.68% [kernel] [k] xfs_file_aio_write_checks
0.65% [kernel] [k] wakeup_kswapd
There's still a lot of time in locking, but it's no longer obviously
being spent by spinning contention. We seem to be spending a lot of
time trying to wake kswapds now - the context switch rate of the
workload is only 400-500/s, so there aren't a lot of sleeps and
wakeups actually occurring....
Regardless, throughput and locking behaviour seem to be a lot
better than with the other patches...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-09-01 23:32 ` Dave Chinner
@ 2016-09-06 15:37 ` Mel Gorman
-1 siblings, 0 replies; 219+ messages in thread
From: Mel Gorman @ 2016-09-06 15:37 UTC (permalink / raw)
To: Dave Chinner
Cc: Linus Torvalds, Michal Hocko, Minchan Kim, Vladimir Davydov,
Johannes Weiner, Vlastimil Babka, Andrew Morton, Bob Peterson,
Kirill A. Shutemov, Huang, Ying, Christoph Hellwig, Wu Fengguang,
LKP, Tejun Heo, LKML
On Fri, Sep 02, 2016 at 09:32:58AM +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> > > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > > > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > > > (ie we could move the locking to the caller, and then make
> > > > > > shrink_page_list() just try to keep the lock held for a few pages if
> > > > > > the mapping doesn't change), and that might result in fewer crazy
> > > > > > cacheline ping-pongs overall. But that feels like exactly the wrong
> > > > > > kind of workaround.
> > > > > >
> > > > >
> > > > > Even if such batching was implemented, it would be very specific to the
> > > > > case of a single large file filling LRUs on multiple nodes.
> > > > >
> > > >
> > > > The latest Jason Bourne movie was sufficiently bad that I spent time
> > > > thinking how the tree_lock could be batched during reclaim. It's not
> > > > straight-forward but this prototype did not blow up on UMA and may be
> > > > worth considering if Dave can test either approach has a positive impact.
> > >
> > > SO, I just did a couple of tests. I'll call the two patches "sleepy"
> > > for the contention backoff patch and "bourney" for the Jason Bourne
> > > inspired batching patch. This is an average of 3 runs, overwriting
> > > a 47GB file on a machine with 16GB RAM:
> > >
> > > IO throughput wall time __pv_queued_spin_lock_slowpath
> > > vanilla 470MB/s 1m42s 25-30%
> > > sleepy 295MB/s 2m43s <1%
> > > bourney 425MB/s 1m53s 25-30%
> > >
> >
> > This is another blunt-force patch that
>
> Sorry for taking so long to get back to this - had a bunch of other
> stuff to do (e.g. XFS metadata CRCs have found their first compiler
> bug) and haven't had to time test this.
>
No problem. Thanks for getting back to me.
> The blunt force approach seems to work ok:
>
Ok, good to know. Unfortunately I found that it's not a universal win. For
the swapping-to-fast-storage case (simulated with ramdisk), the batching is
a bigger gain *except* in the single-threaded case. Stalling kswapd in the
"blunt force approach" severely regressed a streaming anonymous reader
for all thread counts, so it's not the right answer.
I'm working on a series during spare time that tries to balance all the
issues for both swapcache and filecache on different workloads, but right
now the complexity is high and it's still "win some, lose some".
As an aside for the LKP people using ramdisk for swap -- ramdisk considers
itself to be rotational storage. It takes the paths that are optimised to
minimise seeks but it's quite slow. When tree_lock contention is reduced,
workload is dominated by scan_swap_map. It's a one-line fix and I have
a patch for it but it only really matters if ramdisk is being used as a
simulator for swapping to fast storage.
--
Mel Gorman
SUSE Labs
^ permalink raw reply [flat|nested] 219+ messages in thread
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-09-06 15:37 ` Mel Gorman
@ 2016-09-06 15:52 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-09-06 15:52 UTC (permalink / raw)
To: Mel Gorman
Cc: Dave Chinner, Linus Torvalds, Michal Hocko, Minchan Kim,
Vladimir Davydov, Johannes Weiner, Vlastimil Babka,
Andrew Morton, Bob Peterson, Kirill A. Shutemov, Huang, Ying,
Christoph Hellwig, Wu Fengguang, LKP, Tejun Heo, LKML,
Tim C. Chen, Dave Hansen, Andi Kleen
Mel Gorman <mgorman@techsingularity.net> writes:
> On Fri, Sep 02, 2016 at 09:32:58AM +1000, Dave Chinner wrote:
>> On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
>> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
>> > > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
>> > > > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
>> > > > > > Yes, we could try to batch the locking like DaveC already suggested
>> > > > > > (ie we could move the locking to the caller, and then make
>> > > > > > shrink_page_list() just try to keep the lock held for a few pages if
>> > > > > > the mapping doesn't change), and that might result in fewer crazy
>> > > > > > cacheline ping-pongs overall. But that feels like exactly the wrong
>> > > > > > kind of workaround.
>> > > > > >
>> > > > >
>> > > > > Even if such batching was implemented, it would be very specific to the
>> > > > > case of a single large file filling LRUs on multiple nodes.
>> > > > >
>> > > >
>> > > > The latest Jason Bourne movie was sufficiently bad that I spent time
>> > > > thinking how the tree_lock could be batched during reclaim. It's not
>> > > > straight-forward but this prototype did not blow up on UMA and may be
>> > > > worth considering if Dave can test whether either approach has a positive impact.
>> > >
>> > > So, I just did a couple of tests. I'll call the two patches "sleepy"
>> > > for the contention backoff patch and "bourney" for the Jason Bourne
>> > > inspired batching patch. This is an average of 3 runs, overwriting
>> > > a 47GB file on a machine with 16GB RAM:
>> > >
>> > > IO throughput wall time __pv_queued_spin_lock_slowpath
>> > > vanilla 470MB/s 1m42s 25-30%
>> > > sleepy 295MB/s 2m43s <1%
>> > > bourney 425MB/s 1m53s 25-30%
>> > >
>> >
>> > This is another blunt-force patch that
>>
>> Sorry for taking so long to get back to this - had a bunch of other
>> stuff to do (e.g. XFS metadata CRCs have found their first compiler
>> bug) and haven't had time to test this.
>>
>
> No problem. Thanks for getting back to me.
>
>> The blunt force approach seems to work ok:
>>
>
> Ok, good to know. Unfortunately I found that it's not a universal win. For
> the swapping-to-fast-storage case (simulated with ramdisk), the batching is
> a bigger gain *except* in the single-threaded case. Stalling kswapd in the
> "blunt force approach" severely regressed a streaming anonymous reader
> for all thread counts so it's not the right answer.
>
> I'm working on a series during spare time that tries to balance all the
> issues for both swapcache and filecache on different workloads but right
> now, the complexity is high and it's still "win some, lose some".
>
> As an aside for the LKP people using ramdisk for swap -- ramdisk considers
> itself to be rotational storage. It takes the paths that are optimised to
> minimise seeks but it's quite slow. When tree_lock contention is reduced,
> workload is dominated by scan_swap_map. It's a one-line fix and I have
> a patch for it but it only really matters if ramdisk is being used as a
> simulator for swapping to fast storage.
We (LKP people) use drivers/nvdimm/pmem.c instead of drivers/block/brd.c
as the ramdisk, which considers itself to be non-rotational storage.
We also have a series that optimizes other locks in the swap path, for
example by batching swap-space allocation and freeing. If your
solution for batching the removal of pages from the swap cache can be
merged, that will help us a lot!
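The allocation side of that batching can be sketched the same way in
userspace C: hand out several swap slots per acquisition of the (global)
allocator lock instead of locking once per slot. Names and the batch size
are hypothetical, not the kernel's:

```c
#include <assert.h>
#include <pthread.h>

#define SLOT_BATCH 8

struct swap_allocator {
    pthread_mutex_t lock;   /* stand-in for the swap area's lock */
    int next_free;          /* next unallocated slot number      */
    int nr_slots;           /* total slots in this swap area     */
    int lock_acquisitions;  /* instrumentation for the demo      */
};

/* Grab up to SLOT_BATCH consecutive free slots under one lock hold.
 * Returns how many slots were placed in out[]. */
static int swap_alloc_batch(struct swap_allocator *sa, int *out)
{
    int n = 0;

    pthread_mutex_lock(&sa->lock);
    sa->lock_acquisitions++;
    while (n < SLOT_BATCH && sa->next_free < sa->nr_slots)
        out[n++] = sa->next_free++;
    pthread_mutex_unlock(&sa->lock);
    return n;
}
```

With a batch of 8, draining 20 slots takes four lock acquisitions (8 + 8 +
4 + one empty probe) instead of twenty-one.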
Best Regards,
Huang, Ying
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-08-22 22:09 ` Huang, Ying
@ 2016-09-26 6:25 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-09-26 6:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Linus Torvalds, Dave Chinner, LKML, Bob Peterson, Fengguang Wu,
LKP, Huang, Ying
Hi, Christoph,
"Huang, Ying" <ying.huang@intel.com> writes:
> Hi, Christoph,
>
> "Huang, Ying" <ying.huang@intel.com> writes:
>
>> Christoph Hellwig <hch@lst.de> writes:
>>
>>> Snipping the long context:
>>>
>>> I think there are three observations here:
>>>
>>> (1) removing the mark_page_accessed (which is the only significant
>>> change in the parent commit) hurts the
>>> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
>>> I'd still rather stick to the filemap version and let the
>>> VM people sort it out. How do the numbers for this test
>>> look for XFS vs say ext4 and btrfs?
>>> (2) lots of additional spinlock contention in the new case. A quick
>>> check shows that I fat-fingered my rewrite so that we do
>>> the xfs_inode_set_eofblocks_tag call now for the pure lookup
>>> case, and pretty much all new cycles come from that.
>>> (3) Boy, are those xfs_inode_set_eofblocks_tag calls expensive, and
>>> we're already doing way too many even without my little bug above.
>>>
>>> So I've force pushed a new version of the iomap-fixes branch with
>>> (2) fixed, and also a little patch to make xfs_inode_set_eofblocks_tag a
>>> lot less expensive slotted in before that. Would be good to see
>>> the numbers with that.
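Christoph's point (3) -- repeated tagging calls are expensive -- suggests a
check-before-lock pattern: cache whether the inode is already tagged and
skip the locked radix-tree update on repeat calls. A userspace sketch, with
hypothetical names rather than XFS's actual fields:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct fake_inode {
    pthread_mutex_t pag_lock;  /* stand-in for the per-AG lock  */
    bool eofblocks_tagged;     /* cached "already tagged" state */
    int lock_acquisitions;     /* instrumentation for the demo  */
};

static void set_eofblocks_tag(struct fake_inode *ip)
{
    /* Unlocked fast path: nothing to do if the tag is already set. */
    if (ip->eofblocks_tagged)
        return;

    pthread_mutex_lock(&ip->pag_lock);
    ip->eofblocks_tagged = true;   /* radix-tree tag update goes here */
    ip->lock_acquisitions++;
    pthread_mutex_unlock(&ip->pag_lock);
}
```

Repeated calls for the same inode then take the lock only once, which is
the shape of saving the thread is after.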
>>
>> For the original reported regression, the test result is as follow,
>>
>> =========================================================================================
>> compiler/cpufreq_governor/debug-setup/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
>> gcc-6/performance/profile/1BRD_48G/xfs/x86_64-rhel/3000/debian-x86_64-2015-02-07.cgz/ivb44/disk_wrt/aim7
>>
>> commit:
>> f0c6bcba74ac51cb77aadb33ad35cb2dc1ad1506 (parent of first bad commit)
>> 68a9f5e7007c1afa2cf6830b690a90d0187c0684 (first bad commit)
>> 99091700659f4df965e138b38b4fa26a29b7eade (base of your fixes branch)
>> bf4dc6e4ecc2a3d042029319bc8cd4204c185610 (head of your fixes branch)
>>
>> f0c6bcba74ac51cb 68a9f5e7007c1afa2cf6830b69 99091700659f4df965e138b38b bf4dc6e4ecc2a3d042029319bc
>> ---------------- -------------------------- -------------------------- --------------------------
>> %stddev %change %stddev %change %stddev %change %stddev
>> \ | \ | \ | \
>> 484435 ± 0% -13.3% 420004 ± 0% -17.0% 402250 ± 0% -15.6% 408998 ± 0% aim7.jobs-per-min
>
> It appears the original reported regression hasn't been resolved by your
> commit. Could you take a look at the test results and the perf data?
Any update on this regression?
Best Regards,
Huang, Ying
>>
>> And the perf data is as follow,
>>
>> "perf-profile.func.cycles-pp.intel_idle": 20.25,
>> "perf-profile.func.cycles-pp.memset_erms": 11.72,
>> "perf-profile.func.cycles-pp.copy_user_enhanced_fast_string": 8.37,
>> "perf-profile.func.cycles-pp.__block_commit_write.isra.21": 3.49,
>> "perf-profile.func.cycles-pp.block_write_end": 1.77,
>> "perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath": 1.63,
>> "perf-profile.func.cycles-pp.unlock_page": 1.58,
>> "perf-profile.func.cycles-pp.___might_sleep": 1.56,
>> "perf-profile.func.cycles-pp.__block_write_begin_int": 1.33,
>> "perf-profile.func.cycles-pp.iov_iter_copy_from_user_atomic": 1.23,
>> "perf-profile.func.cycles-pp.up_write": 1.21,
>> "perf-profile.func.cycles-pp.__mark_inode_dirty": 1.18,
>> "perf-profile.func.cycles-pp.down_write": 1.06,
>> "perf-profile.func.cycles-pp.mark_buffer_dirty": 0.94,
>> "perf-profile.func.cycles-pp.generic_write_end": 0.92,
>> "perf-profile.func.cycles-pp.__radix_tree_lookup": 0.91,
>> "perf-profile.func.cycles-pp._raw_spin_lock": 0.81,
>> "perf-profile.func.cycles-pp.entry_SYSCALL_64_fastpath": 0.79,
>> "perf-profile.func.cycles-pp.__might_sleep": 0.79,
>> "perf-profile.func.cycles-pp.xfs_file_iomap_begin_delay.isra.9": 0.7,
>> "perf-profile.func.cycles-pp.__list_del_entry": 0.7,
>> "perf-profile.func.cycles-pp.vfs_write": 0.69,
>> "perf-profile.func.cycles-pp.drop_buffers": 0.68,
>> "perf-profile.func.cycles-pp.xfs_file_write_iter": 0.67,
>> "perf-profile.func.cycles-pp.rwsem_spin_on_owner": 0.67,
>>
>> Best Regards,
>> Huang, Ying
>> _______________________________________________
>> LKP mailing list
>> LKP@lists.01.org
>> https://lists.01.org/mailman/listinfo/lkp
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-09-26 6:25 ` Huang, Ying
@ 2016-09-26 14:55 ` Christoph Hellwig
-1 siblings, 0 replies; 219+ messages in thread
From: Christoph Hellwig @ 2016-09-26 14:55 UTC (permalink / raw)
To: Huang, Ying
Cc: Christoph Hellwig, Linus Torvalds, Dave Chinner, LKML,
Bob Peterson, Fengguang Wu, LKP, Huang, Ying
Hi Ying,
> Any update on this regression?
Not really. We've optimized everything we could in XFS without
dropping the architecture that we really want to move to. Now we're
waiting for some MM behavior to be fixed that this uncovered. But
in the end we'll probably be stuck with a slight regression in this
artificial workload.
* Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
2016-09-26 14:55 ` Christoph Hellwig
@ 2016-09-27 0:52 ` Huang, Ying
-1 siblings, 0 replies; 219+ messages in thread
From: Huang, Ying @ 2016-09-27 0:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Linus Torvalds, Dave Chinner, LKML, Bob Peterson, Fengguang Wu,
LKP, Huang, Ying
Christoph Hellwig <hch@lst.de> writes:
> Hi Ying,
>
>> Any update on this regression?
>
> Not really. We've optimized everything we could in XFS without
> dropping the architecture that we really want to move to. Now we're
> waiting for some MM behavior to be fixed that this uncovered. But
> in the end we'll probably be stuck with a slight regression in this
> artificial workload.
I see. Thanks for the update. Please keep me posted.
Best Regards,
Huang, Ying
end of thread, other threads:[~2016-09-27 0:52 UTC | newest]
Thread overview: 219+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-09 14:33 [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression kernel test robot
2016-08-09 14:33 ` kernel test robot
2016-08-10 18:24 ` [lkp] " Linus Torvalds
2016-08-10 18:24 ` Linus Torvalds
2016-08-10 23:08 ` [lkp] " Dave Chinner
2016-08-10 23:08 ` Dave Chinner
2016-08-10 23:51 ` [lkp] " Linus Torvalds
2016-08-10 23:51 ` Linus Torvalds
2016-08-10 23:58 ` [LKP] [lkp] " Huang, Ying
2016-08-10 23:58 ` Huang, Ying
2016-08-11 0:11 ` [LKP] [lkp] " Huang, Ying
2016-08-11 0:11 ` Huang, Ying
2016-08-11 0:23 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 0:23 ` Linus Torvalds
2016-08-11 0:33 ` [LKP] [lkp] " Huang, Ying
2016-08-11 0:33 ` Huang, Ying
2016-08-11 1:00 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 1:00 ` Linus Torvalds
2016-08-11 4:46 ` [LKP] [lkp] " Dave Chinner
2016-08-11 4:46 ` Dave Chinner
2016-08-15 17:22 ` [LKP] [lkp] " Huang, Ying
2016-08-15 17:22 ` Huang, Ying
2016-08-16 0:08 ` [LKP] [lkp] " Dave Chinner
2016-08-16 0:08 ` Dave Chinner
2016-08-11 15:57 ` [LKP] [lkp] " Christoph Hellwig
2016-08-11 15:57 ` Christoph Hellwig
2016-08-11 16:55 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 16:55 ` Linus Torvalds
2016-08-11 17:51 ` [LKP] [lkp] " Huang, Ying
2016-08-11 17:51 ` Huang, Ying
2016-08-11 19:51 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 19:51 ` Linus Torvalds
2016-08-11 20:00 ` [LKP] [lkp] " Christoph Hellwig
2016-08-11 20:00 ` Christoph Hellwig
2016-08-11 20:35 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 20:35 ` Linus Torvalds
2016-08-11 22:16 ` [LKP] [lkp] " Al Viro
2016-08-11 22:16 ` Al Viro
2016-08-11 22:30 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 22:30 ` Linus Torvalds
2016-08-11 21:16 ` [LKP] [lkp] " Huang, Ying
2016-08-11 21:16 ` Huang, Ying
2016-08-11 21:40 ` [LKP] [lkp] " Linus Torvalds
2016-08-11 21:40 ` Linus Torvalds
2016-08-11 22:08 ` [LKP] [lkp] " Christoph Hellwig
2016-08-11 22:08 ` Christoph Hellwig
2016-08-12 0:54 ` [LKP] [lkp] " Dave Chinner
2016-08-12 0:54 ` Dave Chinner
2016-08-12 2:23 ` [LKP] [lkp] " Dave Chinner
2016-08-12 2:23 ` Dave Chinner
2016-08-12 2:32 ` [LKP] [lkp] " Linus Torvalds
2016-08-12 2:32 ` Linus Torvalds
2016-08-12 2:52 ` [LKP] [lkp] " Christoph Hellwig
2016-08-12 2:52 ` Christoph Hellwig
2016-08-12 3:20 ` [LKP] [lkp] " Linus Torvalds
2016-08-12 3:20 ` Linus Torvalds
2016-08-12 4:16 ` [LKP] [lkp] " Dave Chinner
2016-08-12 4:16 ` Dave Chinner
2016-08-12 5:02 ` [LKP] [lkp] " Linus Torvalds
2016-08-12 5:02 ` Linus Torvalds
2016-08-12 6:04 ` [LKP] [lkp] " Dave Chinner
2016-08-12 6:04 ` Dave Chinner
2016-08-12 6:29 ` [LKP] [lkp] " Ye Xiaolong
2016-08-12 6:29 ` Ye Xiaolong
2016-08-12 8:51 ` [LKP] [lkp] " Ye Xiaolong
2016-08-12 8:51 ` Ye Xiaolong
2016-08-12 10:02 ` [LKP] [lkp] " Dave Chinner
2016-08-12 10:02 ` Dave Chinner
2016-08-12 10:43 ` Fengguang Wu
2016-08-12 10:43 ` Fengguang Wu
2016-08-13 0:30 ` [LKP] [lkp] " Christoph Hellwig
2016-08-13 0:30 ` Christoph Hellwig
2016-08-13 21:48 ` [LKP] [lkp] " Christoph Hellwig
2016-08-13 21:48 ` Christoph Hellwig
2016-08-13 22:07 ` [LKP] [lkp] " Fengguang Wu
2016-08-13 22:07 ` Fengguang Wu
2016-08-13 22:15 ` [LKP] [lkp] " Christoph Hellwig
2016-08-13 22:15 ` Christoph Hellwig
2016-08-13 22:51 ` [LKP] [lkp] " Fengguang Wu
2016-08-13 22:51 ` Fengguang Wu
2016-08-14 14:50 ` [LKP] [lkp] " Fengguang Wu
2016-08-14 14:50 ` Fengguang Wu
2016-08-14 16:17 ` [LKP] [lkp] " Christoph Hellwig
2016-08-14 16:17 ` Christoph Hellwig
2016-08-14 23:46 ` [LKP] [lkp] " Dave Chinner
2016-08-14 23:46 ` Dave Chinner
2016-08-14 23:57 ` [LKP] [lkp] " Fengguang Wu
2016-08-14 23:57 ` Fengguang Wu
2016-08-15 14:14 ` [LKP] [lkp] " Fengguang Wu
2016-08-15 14:14 ` Fengguang Wu
2016-08-15 21:22 ` [LKP] [lkp] " Dave Chinner
2016-08-15 21:22 ` Dave Chinner
2016-08-16 12:20 ` [LKP] [lkp] " Fengguang Wu
2016-08-16 12:20 ` Fengguang Wu
2016-08-15 20:30 ` [LKP] [lkp] " Huang, Ying
2016-08-15 20:30 ` Huang, Ying
2016-08-22 22:09 ` [LKP] [lkp] " Huang, Ying
2016-08-22 22:09 ` Huang, Ying
2016-09-26 6:25 ` [LKP] [lkp] " Huang, Ying
2016-09-26 6:25 ` Huang, Ying
2016-09-26 14:55 ` [LKP] [lkp] " Christoph Hellwig
2016-09-26 14:55 ` Christoph Hellwig
2016-09-27 0:52 ` [LKP] [lkp] " Huang, Ying
2016-09-27 0:52 ` Huang, Ying
2016-08-16 13:25 ` [LKP] [lkp] " Fengguang Wu
2016-08-16 13:25 ` Fengguang Wu
2016-08-13 23:32 ` [LKP] [lkp] " Dave Chinner
2016-08-13 23:32 ` Dave Chinner
2016-08-12 2:27 ` [LKP] [lkp] " Linus Torvalds
2016-08-12 2:27 ` Linus Torvalds
2016-08-12 3:56 ` [LKP] [lkp] " Dave Chinner
2016-08-12 3:56 ` Dave Chinner
2016-08-12 18:03 ` [LKP] [lkp] " Linus Torvalds
2016-08-12 18:03 ` Linus Torvalds
2016-08-13 23:58 ` [LKP] [lkp] " Fengguang Wu
2016-08-13 23:58 ` Fengguang Wu
2016-08-15 0:48 ` [LKP] [lkp] " Dave Chinner
2016-08-15 0:48 ` Dave Chinner
2016-08-15 1:37 ` [LKP] [lkp] " Linus Torvalds
2016-08-15 1:37 ` Linus Torvalds
2016-08-15 2:28 ` [LKP] [lkp] " Dave Chinner
2016-08-15 2:28 ` Dave Chinner
2016-08-15 2:53 ` [LKP] [lkp] " Linus Torvalds
2016-08-15 2:53 ` Linus Torvalds
2016-08-15 5:00 ` [LKP] [lkp] " Dave Chinner
2016-08-15 5:00 ` Dave Chinner
[not found] ` <CA+55aFwva2Xffai+Eqv1Jn_NGryk3YJ2i5JoHOQnbQv6qVPAsw@mail.gmail.com>
[not found] ` <CA+55aFy14nUnJQ_GdF=j8Fa9xiH70c6fY2G3q5HQ01+8z1z3qQ@mail.gmail.com>
2016-08-15 5:12 ` Linus Torvalds
2016-08-15 22:22 ` [LKP] [lkp] " Dave Chinner
2016-08-15 22:22 ` Dave Chinner
2016-08-15 22:42 ` [LKP] [lkp] " Dave Chinner
2016-08-15 22:42 ` Dave Chinner
2016-08-15 23:20 ` [LKP] [lkp] " Linus Torvalds
2016-08-15 23:20 ` Linus Torvalds
2016-08-15 23:48 ` [LKP] [lkp] " Linus Torvalds
2016-08-15 23:48 ` Linus Torvalds
2016-08-16 0:44 ` [LKP] [lkp] " Dave Chinner
2016-08-16 0:44 ` Dave Chinner
2016-08-16 15:05 ` [LKP] [lkp] " Mel Gorman
2016-08-16 15:05 ` Mel Gorman
2016-08-16 17:47 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 17:47 ` Linus Torvalds
2016-08-17 15:48 ` [LKP] [lkp] " Michal Hocko
2016-08-17 15:48 ` Michal Hocko
2016-08-17 16:42 ` [LKP] [lkp] " Michal Hocko
2016-08-17 16:42 ` Michal Hocko
2016-08-17 15:49 ` [LKP] [lkp] " Mel Gorman
2016-08-17 15:49 ` Mel Gorman
2016-08-18 0:45 ` [LKP] [lkp] " Mel Gorman
2016-08-18 0:45 ` Mel Gorman
2016-08-18 7:11 ` [LKP] [lkp] " Dave Chinner
2016-08-18 7:11 ` Dave Chinner
2016-08-18 13:24 ` [LKP] [lkp] " Mel Gorman
2016-08-18 13:24 ` Mel Gorman
2016-08-18 17:55 ` [LKP] [lkp] " Linus Torvalds
2016-08-18 17:55 ` Linus Torvalds
2016-08-18 21:19 ` [LKP] [lkp] " Dave Chinner
2016-08-18 21:19 ` Dave Chinner
2016-08-18 22:25 ` [LKP] [lkp] " Linus Torvalds
2016-08-18 22:25 ` Linus Torvalds
2016-08-19 9:00 ` [LKP] [lkp] " Michal Hocko
2016-08-19 9:00 ` Michal Hocko
2016-08-19 10:49 ` [LKP] [lkp] " Mel Gorman
2016-08-19 10:49 ` Mel Gorman
2016-08-19 23:48 ` [LKP] [lkp] " Dave Chinner
2016-08-19 23:48 ` Dave Chinner
2016-08-20 1:08 ` [LKP] [lkp] " Linus Torvalds
2016-08-20 1:08 ` Linus Torvalds
2016-08-20 12:16 ` [LKP] [lkp] " Mel Gorman
2016-08-20 12:16 ` Mel Gorman
2016-08-19 15:08 ` [LKP] [lkp] " Mel Gorman
2016-08-19 15:08 ` Mel Gorman
2016-09-01 23:32 ` [LKP] [lkp] " Dave Chinner
2016-09-01 23:32 ` Dave Chinner
2016-09-06 15:37 ` [LKP] [lkp] " Mel Gorman
2016-09-06 15:37 ` Mel Gorman
2016-09-06 15:52 ` [LKP] [lkp] " Huang, Ying
2016-09-06 15:52 ` Huang, Ying
2016-08-24 15:40 ` [LKP] [lkp] " Huang, Ying
2016-08-24 15:40 ` Huang, Ying
2016-08-25 9:37 ` [LKP] [lkp] " Mel Gorman
2016-08-25 9:37 ` Mel Gorman
2016-08-18 2:44 ` [LKP] [lkp] " Dave Chinner
2016-08-18 2:44 ` Dave Chinner
2016-08-16 0:15 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 0:15 ` Linus Torvalds
2016-08-16 0:38 ` [LKP] [lkp] " Dave Chinner
2016-08-16 0:38 ` Dave Chinner
2016-08-16 0:50 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 0:50 ` Linus Torvalds
2016-08-16 0:19 ` [LKP] [lkp] " Dave Chinner
2016-08-16 0:19 ` Dave Chinner
2016-08-16 1:51 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 1:51 ` Linus Torvalds
2016-08-16 22:02 ` [LKP] [lkp] " Dave Chinner
2016-08-16 22:02 ` Dave Chinner
2016-08-16 23:23 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 23:23 ` Linus Torvalds
2016-08-15 23:01 ` [LKP] [lkp] " Linus Torvalds
2016-08-15 23:01 ` Linus Torvalds
2016-08-16 0:17 ` [LKP] [lkp] " Dave Chinner
2016-08-16 0:17 ` Dave Chinner
2016-08-16 0:45 ` [LKP] [lkp] " Linus Torvalds
2016-08-16 0:45 ` Linus Torvalds
2016-08-15 5:03 ` [LKP] [lkp] " Ingo Molnar
2016-08-15 5:03 ` Ingo Molnar
2016-08-17 16:24 ` [LKP] [lkp] " Peter Zijlstra
2016-08-17 16:24 ` Peter Zijlstra
2016-08-15 12:58 ` [LKP] [lkp] " Fengguang Wu
2016-08-15 12:58 ` Fengguang Wu
2016-08-11 1:16 ` [LKP] [lkp] " Dave Chinner
2016-08-11 1:16 ` Dave Chinner
2016-08-11 1:32 ` [LKP] [lkp] " Dave Chinner
2016-08-11 1:32 ` Dave Chinner
2016-08-11 2:36 ` [LKP] [lkp] " Ye Xiaolong
2016-08-11 2:36 ` Ye Xiaolong
2016-08-11 3:05 ` [LKP] [lkp] " Dave Chinner
2016-08-11 3:05 ` Dave Chinner
2016-08-12 1:26 ` [LKP] [lkp] " Dave Chinner
2016-08-12 1:26 ` Dave Chinner