* [RFC][PATCH 0/7] trace memory objects
From: Wu Fengguang @ 2011-08-29  3:29 UTC
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Linux Memory Management List, Wu Fengguang, LKML

Andrew, Ingo,

I'd like to introduce three memory object interfaces

/debug/tracing/objects/mm/pages/dump-pfn
/debug/tracing/objects/mm/pages/dump-file
/debug/tracing/objects/mm/pages/dump-fs

for dumping

- a range of PFNs
- the cached inodes (icache)
- the cached inode pages (page cache)

The "dump-pfn" interface is a superset of the existing /proc/kpageflags and
/proc/kpagecount interfaces. Example output:

# echo 10000 +10 > /debug/tracing/objects/mm/pages/dump-pfn
# cat /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-3128  [000]  1393.460292: dump_page_frame: pfn=10001 page=ffffea000009c400 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460296: dump_page_frame: pfn=10002 page=ffffea000009c440 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460298: dump_page_frame: pfn=10003 page=ffffea000009c480 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460299: dump_page_frame: pfn=10004 page=ffffea000009c4c0 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460300: dump_page_frame: pfn=10005 page=ffffea000009c500 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460301: dump_page_frame: pfn=10006 page=ffffea000009c540 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460301: dump_page_frame: pfn=10007 page=ffffea000009c580 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460302: dump_page_frame: pfn=10008 page=ffffea000009c5c0 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460303: dump_page_frame: pfn=10009 page=ffffea000009c600 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
             zsh-3128  [000]  1393.460304: dump_page_frame: pfn=10010 page=ffffea000009c640 count=1 mapcount=0 memcg=0 private=0 mapping=0 index=0 flags=reserved
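
As a cross-check on the output above, the pfn can be recomputed from the
page= address. The sketch below is only an illustration: it assumes x86_64
with a virtual memory map starting at 0xffffea0000000000 (not stated in this
posting) and takes the 64-byte struct page size from the 0x40 stride between
consecutive page= values. page_addr_to_pfn() is a hypothetical userspace
helper, not a kernel function.

```c
#include <stdint.h>

/* Assumption: x86_64 vmemmap base; not stated in this posting. */
#define VMEMMAP_BASE     UINT64_C(0xffffea0000000000)
/* Taken from the 0x40 stride between consecutive page= values above. */
#define STRUCT_PAGE_SIZE UINT64_C(64)

/* Hypothetical helper: recover the pfn that a page= address maps to. */
static uint64_t page_addr_to_pfn(uint64_t page_addr)
{
	return (page_addr - VMEMMAP_BASE) / STRUCT_PAGE_SIZE;
}
```

Under these assumptions page_addr_to_pfn(0xffffea000009c400) gives 10000,
while the first line prints pfn=10001, so the pfn field appears to run one
ahead of the page pointer in this output.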

The "dump-file" and "dump-fs" interfaces could be pretty useful for examining
the page cache status. Example output:

# echo > /debug/tracing/trace
# echo / > /debug/tracing/objects/mm/pages/dump-fs
# head -50 /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-3128  [000]  1482.623149: dump_inode_cache: ino=1507329 size=4096 cached=4096 dirtied_when=4294676467 age=1482 state=____ type=DIR name=/
             zsh-3128  [000]  1482.623157: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623340: dump_inode_cache: ino=1786836 size=12288 cached=12288 dirtied_when=4294676472 age=1482 state=____ type=DIR name=/sbin
             zsh-3128  [000]  1482.623355: dump_page_cache: index=0 len=3 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623393: dump_inode_cache: ino=1786946 size=37312 cached=40960 dirtied_when=4294676473 age=1482 state=____ type=REG name=/sbin/init
             zsh-3128  [000]  1482.623404: dump_page_cache: index=0 len=6 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]  1482.623405: dump_page_cache: index=6 len=1 flags=M__A_U_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]  1482.623406: dump_page_cache: index=7 len=1 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]  1482.623408: dump_page_cache: index=8 len=2 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623422: dump_inode_cache: ino=1507464 size=4 cached=4096 dirtied_when=4294676477 age=1482 state=____ type=LNK name=/lib64
             zsh-3128  [000]  1482.623427: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623441: dump_inode_cache: ino=1590173 size=12288 cached=0 dirtied_when=4294676477 age=1482 state=____ type=DIR name=/lib
             zsh-3128  [000]  1482.623458: dump_inode_cache: ino=1590265 size=27 cached=4096 dirtied_when=4294676478 age=1482 state=____ type=LNK name=/lib/ld-linux-x86-64.so.2
             zsh-3128  [000]  1482.623462: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623478: dump_inode_cache: ino=1663440 size=4096 cached=0 dirtied_when=4294676479 age=1482 state=____ type=DIR name=/lib/x86_64-linux-gnu
             zsh-3128  [000]  1482.623495: dump_inode_cache: ino=3293287 size=136936 cached=139264 dirtied_when=4294676480 age=1482 state=____ type=REG name=/lib/x86_64-linux-gnu/ld-2.13.so
             zsh-3128  [000]  1482.623499: dump_page_cache: index=0 len=1 flags=M__ARU_____ count=28 mapcount=27 memcg=1
             zsh-3128  [000]  1482.623501: dump_page_cache: index=1 len=3 flags=M__ARU_____ count=23 mapcount=22 memcg=1
             zsh-3128  [000]  1482.623503: dump_page_cache: index=4 len=4 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623504: dump_page_cache: index=8 len=2 flags=M__ARU_____ count=34 mapcount=33 memcg=1
             zsh-3128  [000]  1482.623506: dump_page_cache: index=10 len=3 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623508: dump_page_cache: index=13 len=1 flags=M__ARU_____ count=34 mapcount=33 memcg=1
             zsh-3128  [000]  1482.623509: dump_page_cache: index=14 len=1 flags=M__ARU_____ count=33 mapcount=32 memcg=1
             zsh-3128  [000]  1482.623510: dump_page_cache: index=15 len=1 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623511: dump_page_cache: index=16 len=1 flags=M__ARU_____ count=25 mapcount=24 memcg=1
             zsh-3128  [000]  1482.623512: dump_page_cache: index=17 len=1 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623513: dump_page_cache: index=18 len=1 flags=M__ARU_____ count=10 mapcount=9 memcg=1
             zsh-3128  [000]  1482.623514: dump_page_cache: index=19 len=1 flags=M__ARU_____ count=34 mapcount=33 memcg=1
             zsh-3128  [000]  1482.623515: dump_page_cache: index=20 len=1 flags=M__ARU_____ count=23 mapcount=22 memcg=1
             zsh-3128  [000]  1482.623516: dump_page_cache: index=21 len=1 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623517: dump_page_cache: index=22 len=1 flags=M__ARU_____ count=34 mapcount=33 memcg=1
             zsh-3128  [000]  1482.623518: dump_page_cache: index=23 len=2 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623519: dump_page_cache: index=25 len=1 flags=M__ARU_____ count=33 mapcount=32 memcg=1
             zsh-3128  [000]  1482.623520: dump_page_cache: index=26 len=1 flags=M__ARU_____ count=10 mapcount=9 memcg=1
             zsh-3128  [000]  1482.623522: dump_page_cache: index=27 len=1 flags=M__ARU_____ count=4 mapcount=3 memcg=1
             zsh-3128  [000]  1482.623523: dump_page_cache: index=28 len=1 flags=M__ARU_____ count=26 mapcount=25 memcg=1
             zsh-3128  [000]  1482.623525: dump_page_cache: index=29 len=5 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623540: dump_inode_cache: ino=1525477 size=12288 cached=0 dirtied_when=4294676489 age=1482 state=____ type=DIR name=/etc
             zsh-3128  [000]  1482.623556: dump_inode_cache: ino=1526474 size=69589 cached=69632 dirtied_when=4294676491 age=1482 state=____ type=REG name=/etc/ld.so.cache
             zsh-3128  [000]  1482.623561: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623562: dump_page_cache: index=1 len=1 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623569: dump_page_cache: index=2 len=15 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623585: dump_inode_cache: ino=1590258 size=241632 cached=241664 dirtied_when=4294676493 age=1482 state=____ type=REG name=/lib/libsepol.so.1
             zsh-3128  [000]  1482.623591: dump_page_cache: index=0 len=5 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]  1482.623609: dump_page_cache: index=5 len=42 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]  1482.623610: dump_page_cache: index=47 len=1 flags=M__ARU_____ count=2 mapcount=1 memcg=1
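
The cached= field can be cross-checked against the dump_page_cache extents
that follow an inode line: summing len over the extents and multiplying by
the page size should reproduce it. A minimal sketch, assuming 4096-byte pages
(consistent with the values above); struct extent and cached_bytes() are
hypothetical names for illustration and are not part of the patch set.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical mirror of one dump_page_cache line: a run of pages. */
struct extent {
	unsigned long index;	/* first page offset in the file */
	unsigned long len;	/* number of contiguous pages */
};

/* Total cached bytes described by n extents. */
static unsigned long cached_bytes(const struct extent *ext, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += ext[i].len;

	return pages * PAGE_SIZE_BYTES;
}
```

For /sbin/init above, the extents have len 6, 1, 1 and 2, i.e. 10 pages or
40960 bytes, which matches cached=40960.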

patches:

dump-pfn interface
(it would be cleaner and easier to review with patches 1-5 folded into one,
but let's keep the changelog for the initial post)

	[RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace
	[RFC][PATCH 2/7] tracing/mm: rename trigger file to dump-pfn
	[RFC][PATCH 3/7] tracing/mm: create trace_objects.c
	[RFC][PATCH 4/7] tracing/mm: dump more page frame information
	[RFC][PATCH 5/7] tracing/mm: accept echo-able input format for pfn range

dump-file and dump-fs interfaces

	[RFC][PATCH 6/7] tracing/mm: add dump-file and dump-fs interfaces

add memcg support

	[RFC][PATCH 7/7] tracing/mm: add memcg field

diffstat:

 fs/inode.c                   |    8 
 fs/internal.h                |    5 
 include/linux/fs.h           |    1 
 include/linux/memcontrol.h   |    6 
 include/linux/page-flags.h   |    1 
 include/trace/events/mm.h    |  170 ++++++++++++++
 kernel/trace/Makefile        |    2 
 kernel/trace/trace.h         |    1 
 kernel/trace/trace_mm.c      |  378 +++++++++++++++++++++++++++++++++
 kernel/trace/trace_objects.c |   26 ++
 mm/memcontrol.c              |   18 +
 mm/page_alloc.c              |    4 
 12 files changed, 612 insertions(+), 8 deletions(-)

Thanks,
Fengguang



* [RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace
From: Wu Fengguang @ 2011-08-29  3:29 UTC
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Steven Rostedt, Wu Fengguang,
	Linux Memory Management List, LKML

[-- Attachment #1: 0001-tracing-mm-add-page-frame-snapshot-trace.patch --]
[-- Type: text/plain, Size: 6345 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This is a prototype to dump out a snapshot of the page frames to the
tracing buffer. Currently it is very primitive, and just writes out
the events. There is no synchronization to avoid losing events,
so /debug/tracing/buffer_size_kb has to be large enough for all
events to fit.

We will do something about synchronization later. That is, have a way
to read the buffer through the tracing/objects/mm/pages/X file and have
the two in sync.

But this is just a prototype to get the ball rolling.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 include/trace/events/mm.h |   48 ++++++++++
 kernel/trace/Makefile     |    1 
 kernel/trace/trace_mm.c   |  172 ++++++++++++++++++++++++++++++++++++
 3 files changed, 221 insertions(+)
 create mode 100644 include/trace/events/mm.h
 create mode 100644 kernel/trace/trace_mm.c

--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mmotm/include/trace/events/mm.h	2011-03-03 19:18:02.000000000 +0800
@@ -0,0 +1,48 @@
+#if !defined(_TRACE_MM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MM_H
+
+#include <linux/tracepoint.h>
+#include <linux/mm.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mm
+
+/**
+ * dump_pages - called by the trace page dump trigger
+ * @pfn: page frame number
+ * @page: pointer to the page frame
+ *
+ * This is a helper trace point into the dumping of the page frames.
+ * It will record various information about a page frame.
+ */
+TRACE_EVENT(dump_pages,
+
+	TP_PROTO(unsigned long pfn, struct page *page),
+
+	TP_ARGS(pfn, page),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	pfn		)
+		__field(	unsigned long,	flags		)
+		__field(	unsigned long,	index		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+	),
+
+	TP_fast_assign(
+		__entry->pfn		= pfn;
+		__entry->flags		= page->flags;
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+		__entry->index		= page->index;
+	),
+
+	TP_printk("pfn=%lu flags=%lx count=%u mapcount=%u index=%lu",
+		  __entry->pfn, __entry->flags, __entry->count,
+		  __entry->mapcount, __entry->index)
+);
+
+#endif /*  _TRACE_MM_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--- mmotm.orig/kernel/trace/Makefile	2011-03-03 19:12:42.000000000 +0800
+++ mmotm/kernel/trace/Makefile	2011-03-03 19:17:47.000000000 +0800
@@ -53,6 +53,7 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
+obj-$(CONFIG_EVENT_TRACING) += trace_mm.o
 ifeq ($(CONFIG_TRACING),y)
 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mmotm/kernel/trace/trace_mm.c	2011-03-03 19:17:48.000000000 +0800
@@ -0,0 +1,172 @@
+/*
+ * Trace mm pages
+ *
+ * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ *
+ * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
+ */
+#include <linux/module.h>
+#include <linux/bootmem.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+
+#include "trace_output.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/mm.h>
+
+void trace_read_page_frames(unsigned long start, unsigned long end,
+			    void (*trace)(unsigned long pfn, struct page *page))
+{
+	unsigned long pfn = start;
+	struct page *page;
+
+	if (start > max_pfn - 1)
+		return;
+
+	if (end > max_pfn - 1)
+		end = max_pfn - 1;
+
+	while (pfn < end) {
+		page = NULL;
+		if (pfn_valid(pfn))
+			page = pfn_to_page(pfn);
+		if (page)
+			trace(pfn, page);
+		pfn++;
+	}
+}
+
+static void trace_do_dump_pages(unsigned long pfn, struct page *page)
+{
+	trace_dump_pages(pfn, page);
+}
+
+static ssize_t
+trace_mm_trigger_read(struct file *filp, char __user *ubuf, size_t cnt,
+		 loff_t *ppos)
+{
+	return simple_read_from_buffer(ubuf, cnt, ppos, "0\n", 2);
+}
+
+static ssize_t
+trace_mm_trigger_write(struct file *filp, const char __user *ubuf, size_t cnt,
+		       loff_t *ppos)
+{
+	unsigned long start, end;
+	long val;
+	char buf[64];
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	if (tracing_update_buffers() < 0)
+		return -ENOMEM;
+
+	if (trace_set_clr_event("mm", "dump_pages", 1))
+		return -EINVAL;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtol(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+
+	start = *ppos;
+	if (val < 0)
+		end = max_pfn - 1;
+	else
+		end = start + val;
+
+	trace_read_page_frames(start, end, trace_do_dump_pages);
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static const struct file_operations trace_mm_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_trigger_read,
+	.write		= trace_mm_trigger_write,
+};
+
+/* move this into trace_objects.c when that file is created */
+static struct dentry *trace_objects_dir(void)
+{
+	static struct dentry *d_objects;
+	struct dentry *d_tracer;
+
+	if (d_objects)
+		return d_objects;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_objects = debugfs_create_dir("objects", d_tracer);
+	if (!d_objects)
+		pr_warning("Could not create debugfs "
+			   "'objects' directory\n");
+
+	return d_objects;
+}
+
+
+static struct dentry *trace_objects_mm_dir(void)
+{
+	static struct dentry *d_mm;
+	struct dentry *d_objects;
+
+	if (d_mm)
+		return d_mm;
+
+	d_objects = trace_objects_dir();
+	if (!d_objects)
+		return NULL;
+
+	d_mm = debugfs_create_dir("mm", d_objects);
+	if (!d_mm)
+		pr_warning("Could not create 'objects/mm' directory\n");
+
+	return d_mm;
+}
+
+static struct dentry *trace_objects_mm_pages_dir(void)
+{
+	static struct dentry *d_pages;
+	struct dentry *d_mm;
+
+	if (d_pages)
+		return d_pages;
+
+	d_mm = trace_objects_mm_dir();
+	if (!d_mm)
+		return NULL;
+
+	d_pages = debugfs_create_dir("pages", d_mm);
+	if (!d_pages)
+		pr_warning("Could not create debugfs "
+			   "'objects/mm/pages' directory\n");
+
+	return d_pages;
+}
+
+static __init int trace_objects_mm_init(void)
+{
+	struct dentry *d_pages;
+
+	d_pages = trace_objects_mm_pages_dir();
+	if (!d_pages)
+		return 0;
+
+	trace_create_file("trigger", 0600, d_pages, NULL,
+			  &trace_mm_fops);
+
+	return 0;
+}
+fs_initcall(trace_objects_mm_init);
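
The range walk in trace_read_page_frames() can be modelled in userspace as
below. This is a sketch under stated assumptions, not the patch's code:
walk_pfn_range(), pfn_visitor, struct pfn_span, record_span() and
every_pfn_valid() are hypothetical names, the valid() predicate stands in for
pfn_valid()/pfn_to_page(), and the range is treated as half-open
[start, end) clamped to max_pfn. The pfn is advanced only after the callback,
so the visitor sees the frame number that was actually looked up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type; stands in for the tracepoint call. */
typedef void (*pfn_visitor)(unsigned long pfn, void *ctx);

/*
 * Walk [start, end), clamped to max_pfn, and call visit() for each
 * pfn that valid() accepts. Returns the number of pfns visited.
 */
static size_t walk_pfn_range(unsigned long start, unsigned long end,
			     unsigned long max_pfn,
			     bool (*valid)(unsigned long pfn),
			     pfn_visitor visit, void *ctx)
{
	size_t visited = 0;
	unsigned long pfn;

	if (start >= max_pfn)
		return 0;
	if (end > max_pfn)
		end = max_pfn;

	for (pfn = start; pfn < end; pfn++) {
		if (!valid(pfn))
			continue;
		visit(pfn, ctx);	/* report before advancing */
		visited++;
	}
	return visited;
}

/* Example context: remembers the first and last pfn reported. */
struct pfn_span {
	unsigned long first, last;
	bool seen;
};

static void record_span(unsigned long pfn, void *ctx)
{
	struct pfn_span *s = ctx;

	if (!s->seen) {
		s->first = pfn;
		s->seen = true;
	}
	s->last = pfn;
}

/* Example predicate: treat every pfn as backed by a struct page. */
static bool every_pfn_valid(unsigned long pfn)
{
	(void)pfn;
	return true;
}
```

Walking 10000..10010 with record_span() as the visitor reports ten frames,
the first being 10000 and the last 10009.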




* [RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace
From: Wu Fengguang @ 2011-08-29  3:29 UTC
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Steven Rostedt, Wu Fengguang,
	Linux Memory Management List, LKML

[-- Attachment #1: 0001-tracing-mm-add-page-frame-snapshot-trace.patch --]
[-- Type: text/plain, Size: 6648 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This is a prototype to dump out a snapshot of the page frames to the
tracing buffer. Currently it is very primitive, and just writes out
the events. There is no synchronization to avoid losing events,
so /debug/tracing/buffer_size_kb has to be large enough for all
events to fit.

We will do something about synchronization later. That is, have a way
to read the buffer through the tracing/objects/mm/pages/X file and have
the two in sync.

But this is just a prototype to get the ball rolling.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 include/trace/events/mm.h |   48 ++++++++++
 kernel/trace/Makefile     |    1 
 kernel/trace/trace_mm.c   |  172 ++++++++++++++++++++++++++++++++++++
 3 files changed, 221 insertions(+)
 create mode 100644 include/trace/events/mm.h
 create mode 100644 kernel/trace/trace_mm.c

--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mmotm/include/trace/events/mm.h	2011-03-03 19:18:02.000000000 +0800
@@ -0,0 +1,48 @@
+#if !defined(_TRACE_MM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MM_H
+
+#include <linux/tracepoint.h>
+#include <linux/mm.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mm
+
+/**
+ * dump_pages - called by the trace page dump trigger
+ * @pfn: page frame number
+ * @page: pointer to the page frame
+ *
+ * This is a helper trace point into the dumping of the page frames.
+ * It will record various information about a page frame.
+ */
+TRACE_EVENT(dump_pages,
+
+	TP_PROTO(unsigned long pfn, struct page *page),
+
+	TP_ARGS(pfn, page),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	pfn		)
+		__field(	unsigned long,	flags		)
+		__field(	unsigned long,	index		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+	),
+
+	TP_fast_assign(
+		__entry->pfn		= pfn;
+		__entry->flags		= page->flags;
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+		__entry->index		= page->index;
+	),
+
+	TP_printk("pfn=%lu flags=%lx count=%u mapcount=%u index=%lu",
+		  __entry->pfn, __entry->flags, __entry->count,
+		  __entry->mapcount, __entry->index)
+);
+
+#endif /*  _TRACE_MM_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--- mmotm.orig/kernel/trace/Makefile	2011-03-03 19:12:42.000000000 +0800
+++ mmotm/kernel/trace/Makefile	2011-03-03 19:17:47.000000000 +0800
@@ -53,6 +53,7 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
+obj-$(CONFIG_EVENT_TRACING) += trace_mm.o
 ifeq ($(CONFIG_TRACING),y)
 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ mmotm/kernel/trace/trace_mm.c	2011-03-03 19:17:48.000000000 +0800
@@ -0,0 +1,172 @@
+/*
+ * Trace mm pages
+ *
+ * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ *
+ * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
+ */
+#include <linux/module.h>
+#include <linux/bootmem.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+
+#include "trace_output.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/mm.h>
+
+void trace_read_page_frames(unsigned long start, unsigned long end,
+			    void (*trace)(unsigned long pfn, struct page *page))
+{
+	unsigned long pfn = start;
+	struct page *page;
+
+	if (start > max_pfn - 1)
+		return;
+
+	if (end > max_pfn - 1)
+		end = max_pfn - 1;
+
+	while (pfn < end) {
+		page = NULL;
+		if (pfn_valid(pfn))
+			page = pfn_to_page(pfn);
+		if (page)
+			trace(pfn, page);
+		pfn++;
+	}
+}
+
+static void trace_do_dump_pages(unsigned long pfn, struct page *page)
+{
+	trace_dump_pages(pfn, page);
+}
+
+static ssize_t
+trace_mm_trigger_read(struct file *filp, char __user *ubuf, size_t cnt,
+		 loff_t *ppos)
+{
+	return simple_read_from_buffer(ubuf, cnt, ppos, "0\n", 2);
+}
+
+
+static ssize_t
+trace_mm_trigger_write(struct file *filp, const char __user *ubuf, size_t cnt,
+		       loff_t *ppos)
+{
+	unsigned long val, start, end;
+	char buf[64];
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	if (tracing_update_buffers() < 0)
+		return -ENOMEM;
+
+	if (trace_set_clr_event("mm", "dump_pages", 1))
+		return -EINVAL;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtol(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+
+	start = *ppos;
+	if (val < 0)
+		end = max_pfn - 1;
+	else
+		end = start + val;
+
+	trace_read_page_frames(start, end, trace_do_dump_pages);
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static const struct file_operations trace_mm_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_trigger_read,
+	.write		= trace_mm_trigger_write,
+};
+
+/* move this into trace_objects.c when that file is created */
+static struct dentry *trace_objects_dir(void)
+{
+	static struct dentry *d_objects;
+	struct dentry *d_tracer;
+
+	if (d_objects)
+		return d_objects;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_objects = debugfs_create_dir("objects", d_tracer);
+	if (!d_objects)
+		pr_warning("Could not create debugfs "
+			   "'objects' directory\n");
+
+	return d_objects;
+}
+
+
+static struct dentry *trace_objects_mm_dir(void)
+{
+	static struct dentry *d_mm;
+	struct dentry *d_objects;
+
+	if (d_mm)
+		return d_mm;
+
+	d_objects = trace_objects_dir();
+	if (!d_objects)
+		return NULL;
+
+	d_mm = debugfs_create_dir("mm", d_objects);
+	if (!d_mm)
+		pr_warning("Could not create 'objects/mm' directory\n");
+
+	return d_mm;
+}
+
+static struct dentry *trace_objects_mm_pages_dir(void)
+{
+	static struct dentry *d_pages;
+	struct dentry *d_mm;
+
+	if (d_pages)
+		return d_pages;
+
+	d_mm = trace_objects_mm_dir();
+	if (!d_mm)
+		return NULL;
+
+	d_pages = debugfs_create_dir("pages", d_mm);
+	if (!d_pages)
+		pr_warning("Could not create debugfs "
+			   "'objects/mm/pages' directory\n");
+
+	return d_pages;
+}
+
+static __init int trace_objects_mm_init(void)
+{
+	struct dentry *d_pages;
+
+	d_pages = trace_objects_mm_pages_dir();
+	if (!d_pages)
+		return 0;
+
+	trace_create_file("trigger", 0600, d_pages, NULL,
+			  &trace_mm_fops);
+
+	return 0;
+}
+fs_initcall(trace_objects_mm_init);
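
The PFN walk in trace_read_page_frames() can be tried out in user space.
As posted, the loop advances pfn before calling trace(), so the callback
appears to receive pfn + 1 together with the page for pfn. Below is a
minimal sketch that calls back first and advances afterwards. The names
walk_pfns(), log_pfn() and struct pfn_log are hypothetical, and max_pfn and
pfn_valid() are simple stand-ins for the kernel symbols. The clamping of
"end" mirrors the patch.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's max_pfn and pfn_valid(); values are made up. */
static unsigned long max_pfn = 16;

static bool pfn_valid(unsigned long pfn)
{
	return pfn != 5;	/* pretend PFN 5 is a memory hole */
}

typedef void (*pfn_cb)(unsigned long pfn, void *ctx);

/* Hypothetical recorder so the visited PFNs can be inspected afterwards. */
struct pfn_log {
	unsigned long pfns[32];
	unsigned long n;
};

static void log_pfn(unsigned long pfn, void *ctx)
{
	struct pfn_log *log = ctx;

	if (log->n < 32)
		log->pfns[log->n++] = pfn;
}

/*
 * Walk [start, end), with end clamped to max_pfn - 1 as in the patch.
 * The callback sees the same pfn that was validated; the increment
 * happens only after the callback has run.
 * Returns the number of frames reported.
 */
static unsigned long walk_pfns(unsigned long start, unsigned long end,
			       pfn_cb cb, void *ctx)
{
	unsigned long pfn, visited = 0;

	if (start > max_pfn - 1)
		return 0;

	if (end > max_pfn - 1)
		end = max_pfn - 1;

	for (pfn = start; pfn < end; pfn++) {
		if (!pfn_valid(pfn))
			continue;
		cb(pfn, ctx);
		visited++;
	}

	return visited;
}
```

With these stand-ins, walk_pfns(3, 8, log_pfn, &log) reports PFNs 3, 4, 6
and 7 and skips the hole at 5.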



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 2/7] tracing/mm: rename trigger file to dump-pfn
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Wu Fengguang, Linux Memory Management List, LKML

[-- Attachment #1: 0003-tracing-mm-rename-trigger-file-to-dump_range.patch --]
[-- Type: text/plain, Size: 3614 bytes --]

From: Ingo Molnar <mingo@elte.hu>

Wu Fengguang noted that /debug/tracing/objects/mm/pages/trigger was
not very intuitively named - rename it to 'dump-pfn', which covers
its functionality better.

[ Impact: rename /debug/tracing file ]

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 include/trace/events/mm.h |    4 ++--
 kernel/trace/trace_mm.c   |   26 +++++++++++++-------------
 2 files changed, 15 insertions(+), 15 deletions(-)

--- mmotm.orig/kernel/trace/trace_mm.c	2011-03-03 19:17:48.000000000 +0800
+++ mmotm/kernel/trace/trace_mm.c	2011-03-03 19:18:17.000000000 +0800
@@ -15,8 +15,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/mm.h>
 
-void trace_read_page_frames(unsigned long start, unsigned long end,
-			    void (*trace)(unsigned long pfn, struct page *page))
+void trace_mm_page_frames(unsigned long start, unsigned long end,
+			  void (*trace)(unsigned long pfn, struct page *page))
 {
 	unsigned long pfn = start;
 	struct page *page;
@@ -37,22 +37,22 @@ void trace_read_page_frames(unsigned lon
 	}
 }
 
-static void trace_do_dump_pages(unsigned long pfn, struct page *page)
+static void trace_mm_page_frame(unsigned long pfn, struct page *page)
 {
-	trace_dump_pages(pfn, page);
+	trace_dump_page_frame(pfn, page);
 }
 
 static ssize_t
-trace_mm_trigger_read(struct file *filp, char __user *ubuf, size_t cnt,
-		 loff_t *ppos)
+trace_mm_pfn_range_read(struct file *filp, char __user *ubuf, size_t cnt,
+			loff_t *ppos)
 {
 	return simple_read_from_buffer(ubuf, cnt, ppos, "0\n", 2);
 }
 
 
 static ssize_t
-trace_mm_trigger_write(struct file *filp, const char __user *ubuf, size_t cnt,
-		       loff_t *ppos)
+trace_mm_pfn_range_write(struct file *filp, const char __user *ubuf, size_t cnt,
+			 loff_t *ppos)
 {
 	unsigned long val, start, end;
 	char buf[64];
@@ -67,7 +67,7 @@ trace_mm_trigger_write(struct file *filp
 	if (tracing_update_buffers() < 0)
 		return -ENOMEM;
 
-	if (trace_set_clr_event("mm", "dump_pages", 1))
+	if (trace_set_clr_event("mm", "dump_page_frame", 1))
 		return -EINVAL;
 
 	buf[cnt] = 0;
@@ -82,7 +82,7 @@ trace_mm_trigger_write(struct file *filp
 	else
 		end = start + val;
 
-	trace_read_page_frames(start, end, trace_do_dump_pages);
+	trace_mm_page_frames(start, end, trace_mm_page_frame);
 
 	*ppos += cnt;
 
@@ -91,8 +91,8 @@ trace_mm_trigger_write(struct file *filp
 
 static const struct file_operations trace_mm_fops = {
 	.open		= tracing_open_generic,
-	.read		= trace_mm_trigger_read,
-	.write		= trace_mm_trigger_write,
+	.read		= trace_mm_pfn_range_read,
+	.write		= trace_mm_pfn_range_write,
 };
 
 /* move this into trace_objects.c when that file is created */
@@ -164,7 +164,7 @@ static __init int trace_objects_mm_init(
 	if (!d_pages)
 		return 0;
 
-	trace_create_file("trigger", 0600, d_pages, NULL,
+	trace_create_file("dump-pfn", 0600, d_pages, NULL,
 			  &trace_mm_fops);
 
 	return 0;
--- mmotm.orig/include/trace/events/mm.h	2011-03-03 19:18:02.000000000 +0800
+++ mmotm/include/trace/events/mm.h	2011-03-03 19:18:17.000000000 +0800
@@ -8,14 +8,14 @@
 #define TRACE_SYSTEM mm
 
 /**
- * dump_pages - called by the trace page dump trigger
+ * dump_page_frame - called by the trace page dump trigger
  * @pfn: page frame number
  * @page: pointer to the page frame
  *
  * This is a helper trace point into the dumping of the page frames.
  * It will record various infromation about a page frame.
  */
-TRACE_EVENT(dump_pages,
+TRACE_EVENT(dump_page_frame,
 
 	TP_PROTO(unsigned long pfn, struct page *page),
 

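The write handler renamed here takes a decimal page count and uses the file
offset as the starting PFN. As posted, the count is parsed with
strict_strtol() into an unsigned long, so the "val < 0" branch (dump to the
end of memory) appears unreachable. Below is a minimal user-space sketch of
the range computation with a signed count. parse_pfn_range() is a
hypothetical helper, strtol() stands in for strict_strtol(), and max_pfn is
passed in as a plain parameter.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal page count from buf and compute the exclusive end PFN.
 * A negative count means "up to the last PFN". The result is clamped to
 * max_pfn - 1, which also avoids overflow in start + count.
 * Returns 0 on success, -EINVAL on malformed input.
 */
static int parse_pfn_range(const char *buf, unsigned long start,
			   unsigned long max_pfn, unsigned long *end)
{
	unsigned long last;
	char *tail;
	long val;

	if (max_pfn == 0)
		return -EINVAL;
	last = max_pfn - 1;

	errno = 0;
	val = strtol(buf, &tail, 10);
	if (errno || tail == buf)
		return -EINVAL;

	/* Accept a single trailing newline, as "echo" would send. */
	if (*tail == '\n')
		tail++;
	if (*tail != '\0')
		return -EINVAL;

	if (val < 0 || start > last || (unsigned long)val > last - start)
		*end = last;
	else
		*end = start + (unsigned long)val;

	return 0;
}
```

For example, with start = 100 and max_pfn = 1000, the input "10\n" yields
end = 110, while "-1" and "5000" both yield end = 999.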


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 3/7] tracing/mm: create trace_objects.c
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Steven Rostedt, Wu Fengguang,
	Linux Memory Management List, LKML

[-- Attachment #1: trace-objects.patch --]
[-- Type: text/plain, Size: 2712 bytes --]

Code refactor: create trace_objects.c and move relevant code from trace_mm.c

CC: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 kernel/trace/Makefile        |    1 +
 kernel/trace/trace.h         |    1 +
 kernel/trace/trace_mm.c      |   22 ----------------------
 kernel/trace/trace_objects.c |   26 ++++++++++++++++++++++++++
 4 files changed, 28 insertions(+), 22 deletions(-)

--- linux-mmotm.orig/kernel/trace/Makefile	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/Makefile	2011-08-28 10:09:28.000000000 +0800
@@ -26,6 +26,7 @@ obj-$(CONFIG_RING_BUFFER) += ring_buffer
 obj-$(CONFIG_RING_BUFFER_BENCHMARK) += ring_buffer_benchmark.o
 
 obj-$(CONFIG_TRACING) += trace.o
+obj-$(CONFIG_TRACING) += trace_objects.o
 obj-$(CONFIG_TRACING) += trace_output.o
 obj-$(CONFIG_TRACING) += trace_stat.o
 obj-$(CONFIG_TRACING) += trace_printk.o
--- linux-mmotm.orig/kernel/trace/trace.h	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/trace.h	2011-08-28 10:09:28.000000000 +0800
@@ -318,6 +318,7 @@ struct dentry *trace_create_file(const c
 				 const struct file_operations *fops);
 
 struct dentry *tracing_init_dentry(void);
+struct dentry *trace_objects_dir(void);
 
 struct ring_buffer_event;
 
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/kernel/trace/trace_objects.c	2011-08-28 10:09:28.000000000 +0800
@@ -0,0 +1,26 @@
+#include <linux/debugfs.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+struct dentry *trace_objects_dir(void)
+{
+	static struct dentry *d_objects;
+	struct dentry *d_tracer;
+
+	if (d_objects)
+		return d_objects;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_objects = debugfs_create_dir("objects", d_tracer);
+	if (!d_objects)
+		pr_warning("Could not create debugfs "
+			   "'objects' directory\n");
+
+	return d_objects;
+}
+
+
--- linux-mmotm.orig/kernel/trace/trace_mm.c	2011-08-28 10:09:27.000000000 +0800
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-28 10:09:28.000000000 +0800
@@ -95,28 +95,6 @@ static const struct file_operations trac
 	.write		= trace_mm_pfn_range_write,
 };
 
-/* move this into trace_objects.c when that file is created */
-static struct dentry *trace_objects_dir(void)
-{
-	static struct dentry *d_objects;
-	struct dentry *d_tracer;
-
-	if (d_objects)
-		return d_objects;
-
-	d_tracer = tracing_init_dentry();
-	if (!d_tracer)
-		return NULL;
-
-	d_objects = debugfs_create_dir("objects", d_tracer);
-	if (!d_objects)
-		pr_warning("Could not create debugfs "
-			   "'objects' directory\n");
-
-	return d_objects;
-}
-
-
 static struct dentry *trace_objects_mm_dir(void)
 {
 	static struct dentry *d_mm;



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 4/7] tracing/mm: dump more page frame information
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Wu Fengguang, Linux Memory Management List, LKML

[-- Attachment #1: mm-export-pageflag_names.patch --]
[-- Type: text/plain, Size: 3655 bytes --]

Add 4 more fields to dump_page_frame trace event.

1) stable page flags in addition to the raw page flags

User space should only make use of the stable page flags.  The raw page
flags are stored mainly to take advantage of ftrace_print_flags_seq()
for showing symbolic flag names.

2) struct page address
3) page->private
4) page->mapping

The last 3 fields are mainly intended as VM debugging aids.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 include/linux/page-flags.h |    1 +
 include/trace/events/mm.h  |   29 +++++++++++++++++++++++++----
 mm/page_alloc.c            |    4 ++--
 3 files changed, 28 insertions(+), 6 deletions(-)

--- linux-mmotm.orig/mm/page_alloc.c	2011-08-28 10:09:24.000000000 +0800
+++ linux-mmotm/mm/page_alloc.c	2011-08-28 10:09:31.000000000 +0800
@@ -5743,7 +5743,7 @@ bool is_free_buddy_page(struct page *pag
 }
 #endif
 
-static struct trace_print_flags pageflag_names[] = {
+struct trace_print_flags pageflag_names[] = {
 	{1UL << PG_locked,		"locked"	},
 	{1UL << PG_error,		"error"		},
 	{1UL << PG_referenced,		"referenced"	},
@@ -5790,7 +5790,7 @@ static void dump_page_flags(unsigned lon
 	printk(KERN_ALERT "page flags: %#lx(", flags);
 
 	/* remove zone id */
-	flags &= (1UL << NR_PAGEFLAGS) - 1;
+	flags &= PAGE_FLAGS_MASK;
 
 	for (i = 0; pageflag_names[i].name && flags; i++) {
 
--- linux-mmotm.orig/include/linux/page-flags.h	2011-08-28 10:09:24.000000000 +0800
+++ linux-mmotm/include/linux/page-flags.h	2011-08-28 10:09:31.000000000 +0800
@@ -462,6 +462,7 @@ static inline int PageTransCompound(stru
  * there has been a kernel bug or struct page corruption.
  */
 #define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_MASK			((1 << NR_PAGEFLAGS) - 1)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
--- linux-mmotm.orig/include/trace/events/mm.h	2011-08-28 10:09:27.000000000 +0800
+++ linux-mmotm/include/trace/events/mm.h	2011-08-28 10:43:38.000000000 +0800
@@ -2,11 +2,14 @@
 #define _TRACE_MM_H
 
 #include <linux/tracepoint.h>
+#include <linux/page-flags.h>
 #include <linux/mm.h>
 
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM mm
 
+extern struct trace_print_flags pageflag_names[];
+
 /**
  * dump_page_frame - called by the trace page dump trigger
  * @pfn: page frame number
@@ -23,23 +26,41 @@ TRACE_EVENT(dump_page_frame,
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	pfn		)
+		__field(	struct page *,	page		)
+		__field(	u64,		stable_flags	)
 		__field(	unsigned long,	flags		)
-		__field(	unsigned long,	index		)
 		__field(	unsigned int,	count		)
 		__field(	unsigned int,	mapcount	)
+		__field(	unsigned long,	private		)
+		__field(	unsigned long,	mapping		)
+		__field(	unsigned long,	index		)
 	),
 
 	TP_fast_assign(
 		__entry->pfn		= pfn;
+		__entry->page		= page;
+		__entry->stable_flags	= stable_page_flags(page);
 		__entry->flags		= page->flags;
 		__entry->count		= atomic_read(&page->_count);
 		__entry->mapcount	= page_mapcount(page);
+		__entry->private	= page->private;
+		__entry->mapping	= (unsigned long)page->mapping;
 		__entry->index		= page->index;
 	),
 
-	TP_printk("pfn=%lu flags=%lx count=%u mapcount=%u index=%lu",
-		  __entry->pfn, __entry->flags, __entry->count,
-		  __entry->mapcount, __entry->index)
+	TP_printk("pfn=%lu page=%p count=%u mapcount=%u "
+		  "private=%lx mapping=%lx index=%lx flags=%s",
+		  __entry->pfn,
+		  __entry->page,
+		  __entry->count,
+		  __entry->mapcount,
+		  __entry->private,
+		  __entry->mapping,
+		  __entry->index,
+		  ftrace_print_flags_seq(p, "|",
+					 __entry->flags & PAGE_FLAGS_MASK,
+					 pageflag_names)
+	)
 );
 
 #endif /*  _TRACE_MM_H */
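
The flags= string that TP_printk builds above can be approximated in user
space. The following is a minimal sketch, not the kernel's code:
struct flag_name, demo_flag_names and format_flags() are hypothetical names,
and the bit positions are illustrative rather than the real PG_* values. In
this sketch any bits without a name are appended in hex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct trace_print_flags. */
struct flag_name {
	unsigned long mask;
	const char *name;
};

/* Illustrative bit positions only; real PG_* values depend on the kernel. */
static const struct flag_name demo_flag_names[] = {
	{ 1UL << 0, "locked" },
	{ 1UL << 1, "error" },
	{ 1UL << 2, "referenced" },
	{ 0, NULL },
};

/*
 * Write the names of all set flags into buf, joined by delim.
 * Bits that have no name in the table are appended in hex.
 * Output is silently truncated if buf is too small. Returns buf.
 */
static char *format_flags(char *buf, size_t size, const char *delim,
			  unsigned long flags, const struct flag_name *names)
{
	size_t used = 0;
	int i;

	assert(buf && size > 0);
	buf[0] = '\0';

	for (i = 0; names[i].name && flags; i++) {
		if (!(flags & names[i].mask))
			continue;
		used += snprintf(buf + used, size - used, "%s%s",
				 used ? delim : "", names[i].name);
		if (used >= size)
			return buf;	/* truncated */
		flags &= ~names[i].mask;
	}
	if (flags)
		snprintf(buf + used, size - used, "%s0x%lx",
			 used ? delim : "", flags);
	return buf;
}
```

With this table, format_flags(buf, sizeof(buf), "|", 0x5, demo_flag_names)
fills buf with "locked|referenced".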



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 5/7] tracing/mm: accept echo-able input format for pfn range
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Wu Fengguang, Linux Memory Management List, LKML

[-- Attachment #1: trace-mm-pfn-range-input.patch --]
[-- Type: text/plain, Size: 2164 bytes --]

The seek+write style input for specifying a pfn range is not scriptable.
Change it to a more user-friendly echo-able format.

Before patch:

	fd = open("/debug/tracing/objects/mm/pages/dump-pfn");
	seek(fd, start);
	write(fd, "size");

After patch:

	echo start +size > /debug/tracing/objects/mm/pages/dump-pfn
or
	echo start end   > /debug/tracing/objects/mm/pages/dump-pfn

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 kernel/trace/trace_mm.c |   39 +++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

--- mmotm.orig/kernel/trace/trace_mm.c	2010-12-26 20:05:26.000000000 +0800
+++ mmotm/kernel/trace/trace_mm.c	2010-12-26 20:20:13.000000000 +0800
@@ -9,6 +9,7 @@
 #include <linux/bootmem.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
+#include <linux/ctype.h>
 
 #include "trace_output.h"
 
@@ -24,8 +25,8 @@ void trace_mm_page_frames(unsigned long 
 	if (start > max_pfn - 1)
 		return;
 
-	if (end > max_pfn - 1)
-		end = max_pfn - 1;
+	if (end > max_pfn)
+		end = max_pfn;
 
 	while (pfn < end) {
 		page = NULL;
@@ -50,13 +51,20 @@ trace_mm_pfn_range_read(struct file *fil
 }
 
 
+/*
+ * recognized formats (end is exclusive):
+ * 		"M N"	start=M, end=N
+ * 		"M"	start=M, end=M+1
+ * 		"M +N"	start=M, end=M+N
+ */
 static ssize_t
 trace_mm_pfn_range_write(struct file *filp, const char __user *ubuf, size_t cnt,
 			 loff_t *ppos)
 {
-	unsigned long val, start, end;
+	unsigned long start;
+	unsigned long end = 0;
 	char buf[64];
-	int ret;
+	char *ptr;
 
 	if (cnt >= sizeof(buf))
 		return -EINVAL;
@@ -72,19 +80,20 @@ trace_mm_pfn_range_write(struct file *fi
 
 	buf[cnt] = 0;
 
-	ret = strict_strtol(buf, 10, &val);
-	if (ret < 0)
-		return ret;
-
-	start = *ppos;
-	if (val < 0)
-		end = max_pfn - 1;
-	else
-		end = start + val;
+	start = simple_strtoul(buf, &ptr, 0);
 
-	trace_mm_page_frames(start, end, trace_mm_page_frame);
+	for (; *ptr; ptr++) {
+		if (isdigit(*ptr)) {
+			if (*(ptr - 1) == '+')
+				end = start;
+			end += simple_strtoul(ptr, NULL, 0);
+			break;
+		}
+	}
+	if (!*ptr)
+		end = start + 1;
 
-	*ppos += cnt;
+	trace_mm_page_frames(start, end, trace_mm_page_frame);
 
 	return cnt;
 }
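
The three accepted input forms can be tried out in user space. This is a
minimal sketch assuming the behaviour of the code above (end is exclusive, so
"M +N" yields end = M+N): parse_pfn_range() is a hypothetical name, and
strtoul() stands in for the kernel's simple_strtoul().

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Parse "M", "M N" or "M +N" into the half-open range [start, end). */
static void parse_pfn_range(const char *buf, unsigned long *start,
			    unsigned long *end)
{
	char *ptr;

	assert(buf && start && end);

	*start = strtoul(buf, &ptr, 0);
	*end = *start + 1;	/* default: a lone "M" covers one page */

	/* Scan for the second number; a '+' right before it means "size". */
	for (; *ptr; ptr++) {
		if (isdigit((unsigned char)*ptr)) {
			unsigned long val = strtoul(ptr, NULL, 0);

			/* A digit found here is never at buf[0], so ptr[-1] is valid. */
			*end = (ptr[-1] == '+') ? *start + val : val;
			break;
		}
	}
}
```

For example, "10000 +10" parses to start=10000, end=10010, and "42" parses to
start=42, end=43.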



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 6/7] tracing/mm: add dump-file and dump-fs interfaces
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Chris Frost, Steven Rostedt, Peter Zijlstra,
	Frederic Weisbecker, Wu Fengguang, Linux Memory Management List,
	LKML

[-- Attachment #1: trace-mm-pages.patch --]
[-- Type: text/plain, Size: 13546 bytes --]

This dumps
- all cached files of a mounted fs  (the inode-cache)
- all cached pages of a cached file (the page-cache)

Usage and Sample output:

# echo / > /debug/tracing/objects/mm/pages/dump-fs
# head -20 /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-3128  [000]   432.050943: dump_inode_cache: ino=1507329 size=4096 cached=4096 dirtied_when=4294676467 age=444 state=____ type=DIR name=/
             zsh-3128  [000]   432.050949: dump_page_cache: index=0 len=1 flags=____RU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050962: dump_inode_cache: ino=1786836 size=12288 cached=12288 dirtied_when=4294676472 age=444 state=____ type=DIR name=/sbin
             zsh-3128  [000]   432.050966: dump_page_cache: index=0 len=3 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050973: dump_inode_cache: ino=1786946 size=37312 cached=40960 dirtied_when=4294676473 age=444 state=____ type=REG name=/sbin/init
             zsh-3128  [000]   432.050977: dump_page_cache: index=0 len=6 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050978: dump_page_cache: index=6 len=1 flags=M__A_U_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050978: dump_page_cache: index=7 len=1 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050979: dump_page_cache: index=8 len=2 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050986: dump_inode_cache: ino=1507464 size=4 cached=4096 dirtied_when=4294676477 age=444 state=____ type=LNK name=/lib64
             zsh-3128  [000]   432.050989: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050995: dump_inode_cache: ino=1590173 size=12288 cached=0 dirtied_when=4294676477 age=444 state=____ type=DIR name=/lib
             zsh-3128  [000]   432.051003: dump_inode_cache: ino=1590265 size=27 cached=4096 dirtied_when=4294676478 age=444 state=____ type=LNK name=/lib/ld-linux-x86-64.so.2
             zsh-3128  [000]   432.051006: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.051013: dump_inode_cache: ino=1663440 size=4096 cached=0 dirtied_when=4294676479 age=444 state=____ type=DIR name=/lib/x86_64-linux-gnu
             zsh-3128  [000]   432.051022: dump_inode_cache: ino=3293287 size=136936 cached=139264 dirtied_when=4294676480 age=444 state=____ type=REG name=/lib/x86_64-linux-gnu/ld-2.13.so

Here "age" is the number of seconds since the inode was created or, for
dirtied inodes, since it was last dirtied. "memcg" is the memory controller
group id, to be added by the next patch.
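
The "len" field in the dump_page_cache lines counts consecutive pages whose
flags, count and mapcount are identical, which is how dump_pagecache() in the
patch below coalesces its output. A user-space sketch of that run-length
step follows; struct demo_page, struct demo_run and coalesce_pages() are
hypothetical names used only for illustration, and the input is assumed to be
sorted by index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct page fields that are compared. */
struct demo_page {
	unsigned long index;
	unsigned long flags;
	unsigned int count;
	unsigned int mapcount;
};

/* One output record: a run of len similar pages starting at index. */
struct demo_run {
	unsigned long index;
	unsigned long len;
};

static int demo_pages_similar(const struct demo_page *a,
			      const struct demo_page *b)
{
	return a->flags == b->flags && a->count == b->count &&
	       a->mapcount == b->mapcount;
}

/*
 * Coalesce pages (sorted by index) into runs of contiguous, similar pages.
 * Returns the number of runs written, at most max_runs.
 */
static size_t coalesce_pages(const struct demo_page *pages, size_t nr,
			     struct demo_run *runs, size_t max_runs)
{
	const struct demo_page *first = NULL;	/* first page of current run */
	size_t i, nr_runs = 0;

	assert(pages || nr == 0);
	assert(runs || max_runs == 0);

	for (i = 0; i < nr; i++) {
		const struct demo_page *page = &pages[i];

		/* Extend the current run if the page is adjacent and similar. */
		if (first &&
		    page->index == runs[nr_runs - 1].index +
				   runs[nr_runs - 1].len &&
		    demo_pages_similar(first, page)) {
			runs[nr_runs - 1].len++;
			continue;
		}

		/* Otherwise start a new run, if there is room for one. */
		if (nr_runs == max_runs)
			break;
		first = page;
		runs[nr_runs].index = page->index;
		runs[nr_runs].len = 1;
		nr_runs++;
	}
	return nr_runs;
}
```

Fed ten pages shaped like the /sbin/init lines above (six alike, one that
differs, one like the first six, then two clean unmapped pages), this produces
the same four runs: index=0 len=6, index=6 len=1, index=7 len=1, index=8 len=2.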

CC: Ingo Molnar <mingo@elte.hu>
CC: Chris Frost <frost@cs.ucla.edu>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/inode.c                |    8 +
 fs/internal.h             |    5 
 include/linux/fs.h        |    1 
 include/trace/events/mm.h |   95 ++++++++++++++++
 kernel/trace/trace_mm.c   |  213 ++++++++++++++++++++++++++++++++++++
 5 files changed, 316 insertions(+), 6 deletions(-)

--- linux-mmotm.orig/include/trace/events/mm.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/include/trace/events/mm.h	2011-08-29 10:26:27.000000000 +0800
@@ -3,7 +3,10 @@
 
 #include <linux/tracepoint.h>
 #include <linux/page-flags.h>
+#include <linux/memcontrol.h>
+#include <linux/pagemap.h>
 #include <linux/mm.h>
+#include <linux/kernel-page-flags.h>
 
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM mm
@@ -63,6 +66,98 @@ TRACE_EVENT(dump_page_frame,
 	)
 );
 
+TRACE_EVENT(dump_page_cache,
+
+	TP_PROTO(struct page *page, unsigned long len),
+
+	TP_ARGS(page, len),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	index		)
+		__field(	unsigned long,	len		)
+		__field(	u64,		flags		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+	),
+
+	TP_fast_assign(
+		__entry->index		= page->index;
+		__entry->len		= len;
+		__entry->flags		= stable_page_flags(page);
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+	),
+
+	TP_printk("index=%lu len=%lu flags=%c%c%c%c%c%c%c%c%c%c%c "
+		  "count=%u mapcount=%u",
+		  __entry->index,
+		  __entry->len,
+		  __entry->flags & (1ULL << KPF_MMAP)		? 'M' : '_',
+		  __entry->flags & (1ULL << KPF_MLOCKED)	? 'm' : '_',
+		  __entry->flags & (1ULL << KPF_UNEVICTABLE)	? 'u' : '_',
+		  __entry->flags & (1ULL << KPF_ACTIVE)		? 'A' : '_',
+		  __entry->flags & (1ULL << KPF_REFERENCED)	? 'R' : '_',
+		  __entry->flags & (1ULL << KPF_UPTODATE)	? 'U' : '_',
+		  __entry->flags & (1ULL << KPF_DIRTY)		? 'D' : '_',
+		  __entry->flags & (1ULL << KPF_WRITEBACK)	? 'W' : '_',
+		  __entry->flags & (1ULL << KPF_RECLAIM)	? 'I' : '_',
+		  __entry->flags & (1ULL << KPF_MAPPEDTODISK)	? 'd' : '_',
+		  __entry->flags & (1ULL << KPF_PRIVATE)	? 'P' : '_',
+		  __entry->count,
+		  __entry->mapcount)
+);
+
+
+#define show_inode_type(val)	__print_symbolic(val, 	   \
+				{ S_IFREG,	"REG"	}, \
+				{ S_IFDIR,	"DIR"	}, \
+				{ S_IFLNK,	"LNK"	}, \
+				{ S_IFBLK,	"BLK"	}, \
+				{ S_IFCHR,	"CHR"	}, \
+				{ S_IFIFO,	"FIFO"	}, \
+				{ S_IFSOCK,	"SOCK"	})
+
+TRACE_EVENT(dump_inode_cache,
+
+	TP_PROTO(struct inode *inode, char *name, int len),
+
+	TP_ARGS(inode, name, len),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	ino		)
+		__field(	loff_t,		size		) /* bytes */
+		__field(	loff_t,		cached		) /* bytes */
+		__field(	unsigned long,	dirtied_when	)
+		__field(	unsigned long,	state		)
+		__field(	umode_t,	mode		)
+		__dynamic_array(char,		file,	len	)
+	),
+
+	TP_fast_assign(
+		__entry->ino	= inode->i_ino;
+		__entry->size	= i_size_read(inode);
+		__entry->cached	= inode->i_mapping->nrpages << PAGE_CACHE_SHIFT;
+		__entry->dirtied_when	= inode->dirtied_when;
+		__entry->state	= inode->i_state;
+		__entry->mode	= inode->i_mode;
+		memcpy(__get_str(file), name, len);
+	),
+
+	TP_printk("ino=%lu size=%llu cached=%llu dirtied_when=%lu age=%lu "
+		  "state=%c%c%c%c type=%s name=%s",
+		  __entry->ino,
+		  __entry->size,
+		  __entry->cached,
+		  __entry->dirtied_when,
+		  (jiffies - __entry->dirtied_when) / HZ,
+		  __entry->state & I_DIRTY_PAGES	? 'D' : '_',
+		  __entry->state & I_DIRTY_DATASYNC	? 'd' : '_',
+		  __entry->state & I_DIRTY_SYNC		? 'm' : '_',
+		  __entry->state & I_SYNC		? 'S' : '_',
+		  show_inode_type(__entry->mode & S_IFMT),
+		  __get_str(file))
+);
+
 #endif /*  _TRACE_MM_H */
 
 /* This part must be outside protection */
--- linux-mmotm.orig/kernel/trace/trace_mm.c	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-29 09:50:56.000000000 +0800
@@ -2,6 +2,7 @@
  * Trace mm pages
  *
  * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ * Copyright (C) 2009 Intel Corporation, Wu Fengguang <fengguang.wu@intel.com>
  *
  * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
  */
@@ -10,6 +11,10 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
+#include <linux/pagevec.h>
+#include <linux/writeback.h>
+#include <linux/file.h>
+#include <linux/slab.h>
 
 #include "trace_output.h"
 
@@ -123,6 +128,208 @@ static struct dentry *trace_objects_mm_d
 	return d_mm;
 }
 
+static unsigned long page_flags(struct page *page)
+{
+	return page->flags & PAGE_FLAGS_MASK;
+}
+
+static int pages_similar(struct page *page0, struct page *page)
+{
+	if (page_flags(page0) != page_flags(page))
+		return 0;
+
+	if (page_count(page0) != page_count(page))
+		return 0;
+
+	if (page_mapcount(page0) != page_mapcount(page))
+		return 0;
+
+	return 1;
+}
+
+static void dump_pagecache(struct address_space *mapping)
+{
+	unsigned long nr_pages;
+	struct page *pages[PAGEVEC_SIZE];
+	struct page *uninitialized_var(page0);
+	struct page *page;
+	unsigned long start = 0;
+	unsigned long len = 0;
+	int i;
+
+	for (;;) {
+		rcu_read_lock();
+		nr_pages = radix_tree_gang_lookup(&mapping->page_tree,
+				(void **)pages, start + len, PAGEVEC_SIZE);
+		rcu_read_unlock();
+
+		if (nr_pages == 0) {
+			if (len)
+				trace_dump_page_cache(page0, len);
+			return;
+		}
+
+		for (i = 0; i < nr_pages; i++) {
+			page = pages[i];
+
+			if (len &&
+			    page->index == start + len &&
+			    pages_similar(page0, page))
+				len++;
+			else {
+				if (len)
+					trace_dump_page_cache(page0, len);
+				page0 = page;
+				start = page->index;
+				len = 1;
+			}
+		}
+		cond_resched();
+	}
+}
+
+static void dump_inode_cache(struct inode *inode,
+			     char *name_buf,
+			     struct vfsmount *mnt)
+{
+	struct path path = {
+		.mnt = mnt,
+		.dentry = d_find_alias(inode)
+	};
+	char *name;
+	int len;
+
+	if (!mnt) {
+		trace_dump_inode_cache(inode, name_buf, strlen(name_buf));
+		return;
+	}
+
+	if (!path.dentry) {
+		trace_dump_inode_cache(inode, "", 1);
+		return;
+	}
+
+	name = d_path(&path, name_buf, PAGE_SIZE);
+	if (IS_ERR(name)) {
+		name = "";
+		len = 1;
+	} else
+		len = PAGE_SIZE + name_buf - name;
+
+	trace_dump_inode_cache(inode, name, len);
+
+	if (path.dentry)
+		dput(path.dentry);
+}
+
+static void dump_fs_pagecache(struct super_block *sb, struct vfsmount *mnt)
+{
+	struct inode *inode;
+	struct inode *prev_inode = NULL;
+	char *name_buf;
+
+	name_buf = (char *)__get_free_page(GFP_TEMPORARY);
+	if (!name_buf)
+		return;
+
+	down_read(&sb->s_umount);
+	if (!sb->s_root)
+		goto out;
+
+	spin_lock(&inode_sb_list_lock);
+	list_for_each_entry_reverse(inode, &sb->s_inodes, i_sb_list) {
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE|I_NEW)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(&inode_sb_list_lock);
+		dump_inode_cache(inode, name_buf, mnt);
+		if (inode->i_mapping->nrpages)
+			dump_pagecache(inode->i_mapping);
+		iput(prev_inode);
+		prev_inode = inode;
+		cond_resched();
+		spin_lock(&inode_sb_list_lock);
+	}
+	spin_unlock(&inode_sb_list_lock);
+	iput(prev_inode);
+out:
+	up_read(&sb->s_umount);
+	free_page((unsigned long)name_buf);
+}
+
+static ssize_t
+trace_pagecache_write(struct file *filp, const char __user *ubuf, size_t count,
+		      loff_t *ppos)
+{
+	struct file *file = NULL;
+	char *name;
+	int err = 0;
+
+	if (count <= 1)
+		return -EINVAL;
+	if (count >= PAGE_SIZE)
+		return -ENAMETOOLONG;
+
+	name = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+
+	if (copy_from_user(name, ubuf, count)) {
+		err = -EFAULT;
+		goto out;
+	}
+
+	/* strip the newline added by `echo` */
+	if (name[count-1] == '\n')
+		name[count-1] = '\0';
+	else
+		name[count] = '\0';
+
+	file = filp_open(name, O_RDONLY|O_LARGEFILE, 0);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		file = NULL;
+		goto out;
+	}
+
+	if (tracing_update_buffers() < 0) {
+		err = -ENOMEM;
+		goto out;
+	}
+	if (trace_set_clr_event("mm", "dump_page_cache", 1)) {
+		err = -EINVAL;
+		goto out;
+	}
+	if (trace_set_clr_event("mm", "dump_inode_cache", 1)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (filp->f_path.dentry->d_inode->i_private) {
+		dump_fs_pagecache(file->f_path.dentry->d_sb, file->f_path.mnt);
+	} else {
+		dump_inode_cache(file->f_mapping->host, name, NULL);
+		dump_pagecache(file->f_mapping);
+	}
+
+out:
+	if (file)
+		fput(file);
+	kfree(name);
+
+	return err ? err : count;
+}
+
+static const struct file_operations trace_pagecache_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_pfn_range_read,
+	.write		= trace_pagecache_write,
+};
+
 static struct dentry *trace_objects_mm_pages_dir(void)
 {
 	static struct dentry *d_pages;
@@ -154,6 +361,12 @@ static __init int trace_objects_mm_init(
 	trace_create_file("dump-pfn", 0600, d_pages, NULL,
 			  &trace_mm_fops);
 
+	trace_create_file("dump-file", 0600, d_pages, NULL,
+			  &trace_pagecache_fops);
+
+	trace_create_file("dump-fs", 0600, d_pages, (void *)1,
+			  &trace_pagecache_fops);
+
 	return 0;
 }
 fs_initcall(trace_objects_mm_init);
--- linux-mmotm.orig/fs/inode.c	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/fs/inode.c	2011-08-29 09:50:56.000000000 +0800
@@ -158,7 +158,13 @@ int inode_init_always(struct super_block
 	inode->i_bdev = NULL;
 	inode->i_cdev = NULL;
 	inode->i_rdev = 0;
-	inode->dirtied_when = 0;
+
+	/*
+	 * This records inode load time. It will be invalidated once inode is
+	 * dirtied, or jiffies wraps around. Despite the pitfalls it still
+	 * provides useful information for some use cases like fastboot.
+	 */
+	inode->dirtied_when = jiffies;
 
 	if (security_inode_alloc(inode))
 		goto out;
--- linux-mmotm.orig/fs/internal.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/fs/internal.h	2011-08-29 09:50:56.000000000 +0800
@@ -124,11 +124,6 @@ extern long do_handle_open(int mountdirf
 			   struct file_handle __user *ufh, int open_flag);
 
 /*
- * inode.c
- */
-extern spinlock_t inode_sb_list_lock;
-
-/*
  * fs-writeback.c
  */
 extern void inode_wb_list_del(struct inode *inode);
--- linux-mmotm.orig/include/linux/fs.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/include/linux/fs.h	2011-08-29 09:50:56.000000000 +0800
@@ -1378,6 +1378,7 @@ extern int send_sigurg(struct fown_struc
 
 extern struct list_head super_blocks;
 extern spinlock_t sb_lock;
+extern spinlock_t inode_sb_list_lock;
 
 struct super_block {
 	struct list_head	s_list;		/* Keep this first */

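Note on the run merging in dump_pagecache() above: the idea can be tried
out in userspace. What follows is only a minimal sketch under stated
assumptions, not code from the patch. struct page_info, struct run and
coalesce_runs() are made-up names, and only the index and the flags are
compared here, whereas pages_similar() in the patch also compares
page_count() and page_mapcount().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct page fields that get compared. */
struct page_info {
	unsigned long index;
	unsigned long flags;
};

/* One output record: 'len' consecutive pages starting at 'start'. */
struct run {
	unsigned long start;
	unsigned long len;
};

/*
 * Merge pages (sorted by index) into runs of consecutive indexes that
 * carry identical flags. Stops early once max_runs runs have been
 * stored and returns the number of runs written to 'runs'.
 */
static size_t coalesce_runs(const struct page_info *pages, size_t n,
			    struct run *runs, size_t max_runs)
{
	size_t nr = 0;
	size_t i;

	assert(pages && runs);

	for (i = 0; i < n; i++) {
		/* Extend the current run if this page continues it. */
		if (nr &&
		    pages[i].index == runs[nr - 1].start + runs[nr - 1].len &&
		    pages[i].flags == pages[i - 1].flags) {
			runs[nr - 1].len++;
			continue;
		}
		if (nr == max_runs)
			break;
		/* Otherwise start a new run at this page. */
		runs[nr].start = pages[i].index;
		runs[nr].len = 1;
		nr++;
	}
	return nr;
}
```

If pages 0-5 and 7 share one flags value, page 6 has another and pages
8-9 a third, this gives four runs, (0,6) (6,1) (7,1) (8,2), which is the
shape of the /sbin/init lines in the sample output of the changelog.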


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 6/7] tracing/mm: add dump-file and dump-fs interfaces
@ 2011-08-29  3:29   ` Wu Fengguang
  0 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Chris Frost, Steven Rostedt, Peter Zijlstra,
	Frederic Weisbecker, Wu Fengguang, Linux Memory Management List,
	LKML

[-- Attachment #1: trace-mm-pages.patch --]
[-- Type: text/plain, Size: 13849 bytes --]

This dumps
- all cached files of a mounted fs  (the inode-cache)
- all cached pages of a cached file (the page-cache)

Usage and Sample output:

# echo / > /debug/tracing/objects/mm/pages/dump-fs
# head -20 /debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-3128  [000]   432.050943: dump_inode_cache: ino=1507329 size=4096 cached=4096 dirtied_when=4294676467 age=444 state=____ type=DIR name=/
             zsh-3128  [000]   432.050949: dump_page_cache: index=0 len=1 flags=____RU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050962: dump_inode_cache: ino=1786836 size=12288 cached=12288 dirtied_when=4294676472 age=444 state=____ type=DIR name=/sbin
             zsh-3128  [000]   432.050966: dump_page_cache: index=0 len=3 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050973: dump_inode_cache: ino=1786946 size=37312 cached=40960 dirtied_when=4294676473 age=444 state=____ type=REG name=/sbin/init
             zsh-3128  [000]   432.050977: dump_page_cache: index=0 len=6 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050978: dump_page_cache: index=6 len=1 flags=M__A_U_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050978: dump_page_cache: index=7 len=1 flags=M__ARU_____ count=2 mapcount=1 memcg=1
             zsh-3128  [000]   432.050979: dump_page_cache: index=8 len=2 flags=_____U_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050986: dump_inode_cache: ino=1507464 size=4 cached=4096 dirtied_when=4294676477 age=444 state=____ type=LNK name=/lib64
             zsh-3128  [000]   432.050989: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.050995: dump_inode_cache: ino=1590173 size=12288 cached=0 dirtied_when=4294676477 age=444 state=____ type=DIR name=/lib
             zsh-3128  [000]   432.051003: dump_inode_cache: ino=1590265 size=27 cached=4096 dirtied_when=4294676478 age=444 state=____ type=LNK name=/lib/ld-linux-x86-64.so.2
             zsh-3128  [000]   432.051006: dump_page_cache: index=0 len=1 flags=___ARU_____ count=1 mapcount=0 memcg=1
             zsh-3128  [000]   432.051013: dump_inode_cache: ino=1663440 size=4096 cached=0 dirtied_when=4294676479 age=444 state=____ type=DIR name=/lib/x86_64-linux-gnu
             zsh-3128  [000]   432.051022: dump_inode_cache: ino=3293287 size=136936 cached=139264 dirtied_when=4294676480 age=444 state=____ type=REG name=/lib/x86_64-linux-gnu/ld-2.13.so

Here "age" is the number of seconds since the inode was loaded into the
inode cache or, for a dirtied inode, since it was last dirtied. "memcg"
is the memory controller group id, which is added by the next patch.
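
The flags= column is not spelled out above. Going by the TP_printk()
format in the patch below, the eleven positions are, in order: M (mmap),
m (mlocked), u (unevictable), A (active), R (referenced), U (uptodate),
D (dirty), W (writeback), I (reclaim), d (mappedtodisk), P (private).
A small userspace decoder could look like the sketch below. It is for
illustration only and not part of the patch; decode_page_flags() and
flag_names[] are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_FLAG_COLS 11

/* Column order copied from TP_printk() of the dump_page_cache event. */
static const char *const flag_names[NR_FLAG_COLS] = {
	"mmap", "mlocked", "unevictable", "active", "referenced",
	"uptodate", "dirty", "writeback", "reclaim", "mappedtodisk",
	"private",
};

/*
 * Hypothetical helper: expand e.g. "M__ARU_____" into
 * "mmap,active,referenced,uptodate". Returns 0 on success, or -1 if
 * the input is not NR_FLAG_COLS characters long or 'out' is too small.
 */
static int decode_page_flags(const char *flags, char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;

	assert(flags && out);

	if (strlen(flags) != NR_FLAG_COLS || outlen == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < NR_FLAG_COLS; i++) {
		size_t namelen;

		if (flags[i] == '_')	/* flag not set */
			continue;
		namelen = strlen(flag_names[i]);
		/* room for optional comma, the name and the final NUL */
		if (used + (used ? 1 : 0) + namelen >= outlen)
			return -1;
		if (used)
			out[used++] = ',';
		memcpy(out + used, flag_names[i], namelen + 1);
		used += namelen;
	}
	return 0;
}
```

With this, the "M__ARU_____" of /sbin/init reads as a mapped, active,
referenced and uptodate page.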

CC: Ingo Molnar <mingo@elte.hu>
CC: Chris Frost <frost@cs.ucla.edu>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/inode.c                |    8 +
 fs/internal.h             |    5 
 include/linux/fs.h        |    1 
 include/trace/events/mm.h |   95 ++++++++++++++++
 kernel/trace/trace_mm.c   |  213 ++++++++++++++++++++++++++++++++++++
 5 files changed, 316 insertions(+), 6 deletions(-)

--- linux-mmotm.orig/include/trace/events/mm.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/include/trace/events/mm.h	2011-08-29 10:26:27.000000000 +0800
@@ -3,7 +3,10 @@
 
 #include <linux/tracepoint.h>
 #include <linux/page-flags.h>
+#include <linux/memcontrol.h>
+#include <linux/pagemap.h>
 #include <linux/mm.h>
+#include <linux/kernel-page-flags.h>
 
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM mm
@@ -63,6 +66,98 @@ TRACE_EVENT(dump_page_frame,
 	)
 );
 
+TRACE_EVENT(dump_page_cache,
+
+	TP_PROTO(struct page *page, unsigned long len),
+
+	TP_ARGS(page, len),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	index		)
+		__field(	unsigned long,	len		)
+		__field(	u64,		flags		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+	),
+
+	TP_fast_assign(
+		__entry->index		= page->index;
+		__entry->len		= len;
+		__entry->flags		= stable_page_flags(page);
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+	),
+
+	TP_printk("index=%lu len=%lu flags=%c%c%c%c%c%c%c%c%c%c%c "
+		  "count=%u mapcount=%u",
+		  __entry->index,
+		  __entry->len,
+		  __entry->flags & (1ULL << KPF_MMAP)		? 'M' : '_',
+		  __entry->flags & (1ULL << KPF_MLOCKED)	? 'm' : '_',
+		  __entry->flags & (1ULL << KPF_UNEVICTABLE)	? 'u' : '_',
+		  __entry->flags & (1ULL << KPF_ACTIVE)		? 'A' : '_',
+		  __entry->flags & (1ULL << KPF_REFERENCED)	? 'R' : '_',
+		  __entry->flags & (1ULL << KPF_UPTODATE)	? 'U' : '_',
+		  __entry->flags & (1ULL << KPF_DIRTY)		? 'D' : '_',
+		  __entry->flags & (1ULL << KPF_WRITEBACK)	? 'W' : '_',
+		  __entry->flags & (1ULL << KPF_RECLAIM)	? 'I' : '_',
+		  __entry->flags & (1ULL << KPF_MAPPEDTODISK)	? 'd' : '_',
+		  __entry->flags & (1ULL << KPF_PRIVATE)	? 'P' : '_',
+		  __entry->count,
+		  __entry->mapcount)
+);
+
+
+#define show_inode_type(val)	__print_symbolic(val, 	   \
+				{ S_IFREG,	"REG"	}, \
+				{ S_IFDIR,	"DIR"	}, \
+				{ S_IFLNK,	"LNK"	}, \
+				{ S_IFBLK,	"BLK"	}, \
+				{ S_IFCHR,	"CHR"	}, \
+				{ S_IFIFO,	"FIFO"	}, \
+				{ S_IFSOCK,	"SOCK"	})
+
+TRACE_EVENT(dump_inode_cache,
+
+	TP_PROTO(struct inode *inode, char *name, int len),
+
+	TP_ARGS(inode, name, len),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	ino		)
+		__field(	loff_t,		size		) /* bytes */
+		__field(	loff_t,		cached		) /* bytes */
+		__field(	unsigned long,	dirtied_when	)
+		__field(	unsigned long,	state		)
+		__field(	umode_t,	mode		)
+		__dynamic_array(char,		file,	len	)
+	),
+
+	TP_fast_assign(
+		__entry->ino	= inode->i_ino;
+		__entry->size	= i_size_read(inode);
+		__entry->cached	= inode->i_mapping->nrpages << PAGE_CACHE_SHIFT;
+		__entry->dirtied_when	= inode->dirtied_when;
+		__entry->state	= inode->i_state;
+		__entry->mode	= inode->i_mode;
+		memcpy(__get_str(file), name, len);
+	),
+
+	TP_printk("ino=%lu size=%llu cached=%llu dirtied_when=%lu age=%lu "
+		  "state=%c%c%c%c type=%s name=%s",
+		  __entry->ino,
+		  __entry->size,
+		  __entry->cached,
+		  __entry->dirtied_when,
+		  (jiffies - __entry->dirtied_when) / HZ,
+		  __entry->state & I_DIRTY_PAGES	? 'D' : '_',
+		  __entry->state & I_DIRTY_DATASYNC	? 'd' : '_',
+		  __entry->state & I_DIRTY_SYNC		? 'm' : '_',
+		  __entry->state & I_SYNC		? 'S' : '_',
+		  show_inode_type(__entry->mode & S_IFMT),
+		  __get_str(file))
+);
+
 #endif /*  _TRACE_MM_H */
 
 /* This part must be outside protection */
--- linux-mmotm.orig/kernel/trace/trace_mm.c	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-29 09:50:56.000000000 +0800
@@ -2,6 +2,7 @@
  * Trace mm pages
  *
  * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ * Copyright (C) 2009 Intel Corporation, Wu Fengguang <fengguang.wu@intel.com>
  *
  * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
  */
@@ -10,6 +11,10 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
+#include <linux/pagevec.h>
+#include <linux/writeback.h>
+#include <linux/file.h>
+#include <linux/slab.h>
 
 #include "trace_output.h"
 
@@ -123,6 +128,208 @@ static struct dentry *trace_objects_mm_d
 	return d_mm;
 }
 
+static unsigned long page_flags(struct page *page)
+{
+	return page->flags & PAGE_FLAGS_MASK;
+}
+
+static int pages_similar(struct page *page0, struct page *page)
+{
+	if (page_flags(page0) != page_flags(page))
+		return 0;
+
+	if (page_count(page0) != page_count(page))
+		return 0;
+
+	if (page_mapcount(page0) != page_mapcount(page))
+		return 0;
+
+	return 1;
+}
+
+static void dump_pagecache(struct address_space *mapping)
+{
+	unsigned long nr_pages;
+	struct page *pages[PAGEVEC_SIZE];
+	struct page *uninitialized_var(page0);
+	struct page *page;
+	unsigned long start = 0;
+	unsigned long len = 0;
+	int i;
+
+	for (;;) {
+		rcu_read_lock();
+		nr_pages = radix_tree_gang_lookup(&mapping->page_tree,
+				(void **)pages, start + len, PAGEVEC_SIZE);
+		rcu_read_unlock();
+
+		if (nr_pages == 0) {
+			if (len)
+				trace_dump_page_cache(page0, len);
+			return;
+		}
+
+		for (i = 0; i < nr_pages; i++) {
+			page = pages[i];
+
+			if (len &&
+			    page->index == start + len &&
+			    pages_similar(page0, page))
+				len++;
+			else {
+				if (len)
+					trace_dump_page_cache(page0, len);
+				page0 = page;
+				start = page->index;
+				len = 1;
+			}
+		}
+		cond_resched();
+	}
+}
+
+static void dump_inode_cache(struct inode *inode,
+			     char *name_buf,
+			     struct vfsmount *mnt)
+{
+	struct path path = {
+		.mnt = mnt,
+		.dentry = d_find_alias(inode)
+	};
+	char *name;
+	int len;
+
+	if (!mnt) {
+		trace_dump_inode_cache(inode, name_buf, strlen(name_buf) + 1);
+		goto out;
+	}
+
+	if (!path.dentry) {
+		trace_dump_inode_cache(inode, "", 1);
+		return;
+	}
+
+	name = d_path(&path, name_buf, PAGE_SIZE);
+	if (IS_ERR(name)) {
+		name = "";
+		len = 1;
+	} else
+		len = PAGE_SIZE + name_buf - name;
+
+	trace_dump_inode_cache(inode, name, len);
+
+out:
+	dput(path.dentry);	/* drops the d_find_alias() ref; NULL is OK */
+}
+
+static void dump_fs_pagecache(struct super_block *sb, struct vfsmount *mnt)
+{
+	struct inode *inode;
+	struct inode *prev_inode = NULL;
+	char *name_buf;
+
+	name_buf = (char *)__get_free_page(GFP_TEMPORARY);
+	if (!name_buf)
+		return;
+
+	down_read(&sb->s_umount);
+	if (!sb->s_root)
+		goto out;
+
+	spin_lock(&inode_sb_list_lock);
+	list_for_each_entry_reverse(inode, &sb->s_inodes, i_sb_list) {
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE|I_NEW)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(&inode_sb_list_lock);
+		dump_inode_cache(inode, name_buf, mnt);
+		if (inode->i_mapping->nrpages)
+			dump_pagecache(inode->i_mapping);
+		iput(prev_inode);
+		prev_inode = inode;
+		cond_resched();
+		spin_lock(&inode_sb_list_lock);
+	}
+	spin_unlock(&inode_sb_list_lock);
+	iput(prev_inode);
+out:
+	up_read(&sb->s_umount);
+	free_page((unsigned long)name_buf);
+}
+
+static ssize_t
+trace_pagecache_write(struct file *filp, const char __user *ubuf, size_t count,
+		      loff_t *ppos)
+{
+	struct file *file = NULL;
+	char *name;
+	int err = 0;
+
+	if (count <= 1)
+		return -EINVAL;
+	if (count >= PAGE_SIZE)
+		return -ENAMETOOLONG;
+
+	name = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+
+	if (copy_from_user(name, ubuf, count)) {
+		err = -EFAULT;
+		goto out;
+	}
+
+	/* strip the newline added by `echo` */
+	if (name[count-1] == '\n')
+		name[count-1] = '\0';
+	else
+		name[count] = '\0';
+
+	file = filp_open(name, O_RDONLY|O_LARGEFILE, 0);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		file = NULL;
+		goto out;
+	}
+
+	if (tracing_update_buffers() < 0) {
+		err = -ENOMEM;
+		goto out;
+	}
+	if (trace_set_clr_event("mm", "dump_page_cache", 1)) {
+		err = -EINVAL;
+		goto out;
+	}
+	if (trace_set_clr_event("mm", "dump_inode_cache", 1)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (filp->f_path.dentry->d_inode->i_private) {
+		dump_fs_pagecache(file->f_path.dentry->d_sb, file->f_path.mnt);
+	} else {
+		dump_inode_cache(file->f_mapping->host, name, NULL);
+		dump_pagecache(file->f_mapping);
+	}
+
+out:
+	if (file)
+		fput(file);
+	kfree(name);
+
+	return err ? err : count;
+}
+
+static const struct file_operations trace_pagecache_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_pfn_range_read,
+	.write		= trace_pagecache_write,
+};
+
 static struct dentry *trace_objects_mm_pages_dir(void)
 {
 	static struct dentry *d_pages;
@@ -154,6 +361,12 @@ static __init int trace_objects_mm_init(
 	trace_create_file("dump-pfn", 0600, d_pages, NULL,
 			  &trace_mm_fops);
 
+	trace_create_file("dump-file", 0600, d_pages, NULL,
+			  &trace_pagecache_fops);
+
+	trace_create_file("dump-fs", 0600, d_pages, (void *)1,
+			  &trace_pagecache_fops);
+
 	return 0;
 }
 fs_initcall(trace_objects_mm_init);
--- linux-mmotm.orig/fs/inode.c	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/fs/inode.c	2011-08-29 09:50:56.000000000 +0800
@@ -158,7 +158,13 @@ int inode_init_always(struct super_block
 	inode->i_bdev = NULL;
 	inode->i_cdev = NULL;
 	inode->i_rdev = 0;
-	inode->dirtied_when = 0;
+
+	/*
+	 * This records inode load time. It will be invalidated once inode is
+	 * dirtied, or jiffies wraps around. Despite the pitfalls it still
+	 * provides useful information for some use cases like fastboot.
+	 */
+	inode->dirtied_when = jiffies;
 
 	if (security_inode_alloc(inode))
 		goto out;
--- linux-mmotm.orig/fs/internal.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/fs/internal.h	2011-08-29 09:50:56.000000000 +0800
@@ -124,11 +124,6 @@ extern long do_handle_open(int mountdirf
 			   struct file_handle __user *ufh, int open_flag);
 
 /*
- * inode.c
- */
-extern spinlock_t inode_sb_list_lock;
-
-/*
  * fs-writeback.c
  */
 extern void inode_wb_list_del(struct inode *inode);
--- linux-mmotm.orig/include/linux/fs.h	2011-08-29 09:50:55.000000000 +0800
+++ linux-mmotm/include/linux/fs.h	2011-08-29 09:50:56.000000000 +0800
@@ -1378,6 +1378,7 @@ extern int send_sigurg(struct fown_struc
 
 extern struct list_head super_blocks;
 extern spinlock_t sb_lock;
+extern spinlock_t inode_sb_list_lock;
 
 struct super_block {
 	struct list_head	s_list;		/* Keep this first */

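On the "age" field and the wrap-around caveat in the fs/inode.c comment
above: as far as I can tell the subtraction sits in TP_printk(), so the
age is computed when the trace is read, not when the event is recorded.
The arithmetic itself can be sketched in userspace as below. This is an
illustration only; age_seconds() is a made-up name, the 32-bit counter
models a 32-bit kernel, and HZ=1000 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: seconds elapsed between
 * two samples of a 32-bit jiffies counter ticking at 'hz' per second.
 */
static uint32_t age_seconds(uint32_t now, uint32_t dirtied_when, uint32_t hz)
{
	assert(hz > 0);
	/*
	 * Unsigned subtraction is modulo 2^32, so the result stays
	 * correct when the counter has passed the wrap point once.
	 * After a full 2^32 ticks the value is meaningless.
	 */
	return (uint32_t)(now - dirtied_when) / hz;
}
```

For example, dirtied_when=4294676467 from the sample output and a later
reading of 153171 (the counter has wrapped in between) give 444000 ticks,
i.e. an age of 444 seconds at HZ=1000.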


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC][PATCH 7/7] tracing/mm: add memcg field
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:29   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:29 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar
  Cc: Mel Gorman, Balbir Singh, Daisuke Nishimura, KAMEZAWA Hiroyuki,
	Wu Fengguang, Linux Memory Management List, LKML

[-- Attachment #1: memcg-page-id.patch --]
[-- Type: text/plain, Size: 5628 bytes --]

Dump the memcg id associated with a pagecache page.

The downside is that page_memcg_id() is a fairly heavyweight function:
it has to lock and unlock the page.

CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 include/linux/memcontrol.h |    6 ++++++
 include/trace/events/mm.h  |   16 +++++++++++-----
 kernel/trace/trace_mm.c    |   12 +++++++++---
 mm/memcontrol.c            |   18 ++++++++++++++++++
 4 files changed, 44 insertions(+), 8 deletions(-)

--- linux-mmotm.orig/include/linux/memcontrol.h	2011-08-29 10:55:48.000000000 +0800
+++ linux-mmotm/include/linux/memcontrol.h	2011-08-29 10:55:53.000000000 +0800
@@ -91,6 +91,7 @@ extern void mem_cgroup_uncharge_cache_pa
 extern void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask);
 int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *memcg);
 
+extern unsigned short page_memcg_id(struct page *page);
 extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
 extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 extern struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm);
@@ -199,6 +200,11 @@ static inline int mem_cgroup_try_charge_
 	return 0;
 }
 
+static inline unsigned short page_memcg_id(struct page *page)
+{
+	return 0;
+}
+
 static inline void mem_cgroup_commit_charge_swapin(struct page *page,
 					  struct mem_cgroup *ptr)
 {
--- linux-mmotm.orig/mm/memcontrol.c	2011-08-29 10:55:48.000000000 +0800
+++ linux-mmotm/mm/memcontrol.c	2011-08-29 10:59:28.000000000 +0800
@@ -591,6 +591,24 @@ mem_cgroup_largest_soft_limit_node(struc
 	return mz;
 }
 
+unsigned short page_memcg_id(struct page *page)
+{
+	struct mem_cgroup *memcg;
+	struct cgroup_subsys_state *css;
+	unsigned short id = 0;
+
+	lock_page(page);
+	memcg = try_get_mem_cgroup_from_page(page);
+	if (memcg) {
+		css = mem_cgroup_css(memcg);
+		id = css_id(css);
+		css_put(css);
+	}
+	unlock_page(page);
+
+	return id;
+}
+
 /*
  * Implementation Note: reading percpu statistics for memcg.
  *
--- linux-mmotm.orig/include/trace/events/mm.h	2011-08-29 10:55:52.000000000 +0800
+++ linux-mmotm/include/trace/events/mm.h	2011-08-29 10:55:53.000000000 +0800
@@ -34,6 +34,7 @@ TRACE_EVENT(dump_page_frame,
 		__field(	unsigned long,	flags		)
 		__field(	unsigned int,	count		)
 		__field(	unsigned int,	mapcount	)
+		__field(	unsigned int,	memcg		)
 		__field(	unsigned long,	private		)
 		__field(	unsigned long,	mapping		)
 		__field(	unsigned long,	index		)
@@ -46,17 +47,19 @@ TRACE_EVENT(dump_page_frame,
 		__entry->flags		= page->flags;
 		__entry->count		= atomic_read(&page->_count);
 		__entry->mapcount	= page_mapcount(page);
+		__entry->memcg		= page_memcg_id(page);
 		__entry->private	= page->private;
 		__entry->mapping	= (unsigned long)page->mapping;
 		__entry->index		= page->index;
 	),
 
 	TP_printk("pfn=%lu page=%p count=%u mapcount=%u "
-		  "private=%lx mapping=%lx index=%lx flags=%s",
+		  "memcg=%u private=%lx mapping=%lx index=%lx flags=%s",
 		  __entry->pfn,
 		  __entry->page,
 		  __entry->count,
 		  __entry->mapcount,
+		  __entry->memcg,
 		  __entry->private,
 		  __entry->mapping,
 		  __entry->index,
@@ -68,9 +71,9 @@ TRACE_EVENT(dump_page_frame,
 
 TRACE_EVENT(dump_page_cache,
 
-	TP_PROTO(struct page *page, unsigned long len),
+	TP_PROTO(struct page *page, unsigned long len, unsigned int memcg),
 
-	TP_ARGS(page, len),
+	TP_ARGS(page, len, memcg),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	index		)
@@ -78,6 +81,7 @@ TRACE_EVENT(dump_page_cache,
 		__field(	u64,		flags		)
 		__field(	unsigned int,	count		)
 		__field(	unsigned int,	mapcount	)
+		__field(	unsigned int,	memcg		)
 	),
 
 	TP_fast_assign(
@@ -86,10 +90,11 @@ TRACE_EVENT(dump_page_cache,
 		__entry->flags		= stable_page_flags(page);
 		__entry->count		= atomic_read(&page->_count);
 		__entry->mapcount	= page_mapcount(page);
+		__entry->memcg		= memcg;
 	),
 
 	TP_printk("index=%lu len=%lu flags=%c%c%c%c%c%c%c%c%c%c%c "
-		  "count=%u mapcount=%u",
+		  "count=%u mapcount=%u memcg=%u",
 		  __entry->index,
 		  __entry->len,
 		  __entry->flags & (1ULL << KPF_MMAP)		? 'M' : '_',
@@ -104,7 +109,8 @@ TRACE_EVENT(dump_page_cache,
 		  __entry->flags & (1ULL << KPF_MAPPEDTODISK)	? 'd' : '_',
 		  __entry->flags & (1ULL << KPF_PRIVATE)	? 'P' : '_',
 		  __entry->count,
-		  __entry->mapcount)
+		  __entry->mapcount,
+		  __entry->memcg)
 );
 
 
--- linux-mmotm.orig/kernel/trace/trace_mm.c	2011-08-29 10:56:02.000000000 +0800
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-29 11:01:04.000000000 +0800
@@ -155,6 +155,8 @@ static void dump_pagecache(struct addres
 	struct page *page;
 	unsigned long start = 0;
 	unsigned long len = 0;
+	unsigned int memcg0;
+	unsigned int memcg;
 	int i;
 
 	for (;;) {
@@ -165,23 +167,27 @@ static void dump_pagecache(struct addres
 
 		if (nr_pages == 0) {
 			if (len)
-				trace_dump_page_cache(page0, len);
+				trace_dump_page_cache(page0, len, memcg0);
 			return;
 		}
 
 		for (i = 0; i < nr_pages; i++) {
 			page = pages[i];
+			memcg = page_memcg_id(page);
 
 			if (len &&
 			    page->index == start + len &&
-			    pages_similar(page0, page))
+			    pages_similar(page0, page) &&
+			    memcg0 == memcg)
 				len++;
 			else {
 				if (len)
-					trace_dump_page_cache(page0, len);
+					trace_dump_page_cache(page0, len,
+							      memcg0);
 				page0 = page;
 				start = page->index;
 				len = 1;
+				memcg0 = memcg;
 			}
 		}
 		cond_resched();
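
For readers who want to experiment with the run merging done in dump_pagecache() above, here is a minimal userspace sketch. It is an illustration under stated assumptions, not the kernel code: struct page_info, struct run and merge_runs() are hypothetical names, a plain flags comparison stands in for pages_similar(), and the memcg field stands in for the value returned by page_memcg_id(). A page extends the current run only if its index is contiguous and both flags and memcg id match; otherwise a new run is started.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-page fields dump_pagecache() compares. */
struct page_info {
	unsigned long index;	/* page offset within the file */
	unsigned int flags;	/* stand-in for pages_similar() */
	unsigned int memcg;	/* stand-in for page_memcg_id() */
};

/* One emitted record: len contiguous pages sharing flags and memcg. */
struct run {
	unsigned long start;
	unsigned long len;
	unsigned int flags;
	unsigned int memcg;
};

/*
 * Coalesce pages[] (assumed sorted by index) into runs.
 * Returns the number of runs written to out[], which must have
 * room for n entries.
 */
static size_t merge_runs(const struct page_info *pages, size_t n,
			 struct run *out)
{
	size_t nr = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct page_info *p = &pages[i];

		if (nr &&
		    p->index == out[nr - 1].start + out[nr - 1].len &&
		    p->flags == out[nr - 1].flags &&
		    p->memcg == out[nr - 1].memcg) {
			/* contiguous and similar: extend the current run */
			out[nr - 1].len++;
		} else {
			/* gap or mismatch: start a new run */
			out[nr].start = p->index;
			out[nr].len = 1;
			out[nr].flags = p->flags;
			out[nr].memcg = p->memcg;
			nr++;
		}
	}
	return nr;
}
```

Adding the memcg id to the comparison means a file whose pages are charged to several cgroups produces more, shorter records than before.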



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC][PATCH 0/7] trace memory objects
  2011-08-29  3:29 ` Wu Fengguang
@ 2011-08-29  3:59   ` Wu Fengguang
  -1 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:59 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar; +Cc: Mel Gorman, Linux Memory Management List, LKML

> dump-pfn interface
> (it looks more clean and easier for review to fold patches 1-5 into one, but
> let's keep the changelog for the initial post)
> 
> 	[RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace
> 	[RFC][PATCH 2/7] tracing/mm: rename trigger file to dump-pfn
> 	[RFC][PATCH 3/7] tracing/mm: create trace_objects.c
> 	[RFC][PATCH 4/7] tracing/mm: dump more page frame information
> 	[RFC][PATCH 5/7] tracing/mm: accept echo-able input format for pfn range

For your convenience, here is the combined diff for the above 5 incremental ones.

--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/include/trace/events/mm.h	2011-08-29 11:57:16.000000000 +0800
@@ -0,0 +1,69 @@
+#if !defined(_TRACE_MM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MM_H
+
+#include <linux/tracepoint.h>
+#include <linux/page-flags.h>
+#include <linux/mm.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mm
+
+extern struct trace_print_flags pageflag_names[];
+
+/**
+ * dump_page_frame - called by the trace page dump trigger
+ * @pfn: page frame number
+ * @page: pointer to the page frame
+ *
+ * This is a helper trace point into the dumping of the page frames.
+ * It will record various information about a page frame.
+ */
+TRACE_EVENT(dump_page_frame,
+
+	TP_PROTO(unsigned long pfn, struct page *page),
+
+	TP_ARGS(pfn, page),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	pfn		)
+		__field(	struct page *,	page		)
+		__field(	u64,		stable_flags	)
+		__field(	unsigned long,	flags		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+		__field(	unsigned long,	private		)
+		__field(	unsigned long,	mapping		)
+		__field(	unsigned long,	index		)
+	),
+
+	TP_fast_assign(
+		__entry->pfn		= pfn;
+		__entry->page		= page;
+		__entry->stable_flags	= stable_page_flags(page);
+		__entry->flags		= page->flags;
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+		__entry->private	= page->private;
+		__entry->mapping	= (unsigned long)page->mapping;
+		__entry->index		= page->index;
+	),
+
+	TP_printk("pfn=%lu page=%p count=%u mapcount=%u "
+		  "private=%lx mapping=%lx index=%lx flags=%s",
+		  __entry->pfn,
+		  __entry->page,
+		  __entry->count,
+		  __entry->mapcount,
+		  __entry->private,
+		  __entry->mapping,
+		  __entry->index,
+		  ftrace_print_flags_seq(p, "|",
+					 __entry->flags & PAGE_FLAGS_MASK,
+					 pageflag_names)
+	)
+);
+
+#endif /*  _TRACE_MM_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--- linux-mmotm.orig/kernel/trace/Makefile	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/Makefile	2011-08-28 10:09:28.000000000 +0800
@@ -26,6 +26,7 @@ obj-$(CONFIG_RING_BUFFER) += ring_buffer
 obj-$(CONFIG_RING_BUFFER_BENCHMARK) += ring_buffer_benchmark.o
 
 obj-$(CONFIG_TRACING) += trace.o
+obj-$(CONFIG_TRACING) += trace_objects.o
 obj-$(CONFIG_TRACING) += trace_output.o
 obj-$(CONFIG_TRACING) += trace_stat.o
 obj-$(CONFIG_TRACING) += trace_printk.o
@@ -53,6 +54,7 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
+obj-$(CONFIG_EVENT_TRACING) += trace_mm.o
 ifeq ($(CONFIG_TRACING),y)
 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-29 11:57:16.000000000 +0800
@@ -0,0 +1,150 @@
+/*
+ * Trace mm pages
+ *
+ * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ *
+ * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
+ */
+#include <linux/module.h>
+#include <linux/bootmem.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+
+#include "trace_output.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/mm.h>
+
+void trace_mm_page_frames(unsigned long start, unsigned long end,
+			  void (*trace)(unsigned long pfn, struct page *page))
+{
+	unsigned long pfn = start;
+	struct page *page;
+
+	if (start > max_pfn - 1)
+		return;
+
+	if (end > max_pfn - 1)
+		end = max_pfn - 1;
+
+	while (pfn < end) {
+		page = NULL;
+		if (pfn_valid(pfn))
+			page = pfn_to_page(pfn);
+		if (page)
+			trace(pfn, page);
+		pfn++;
+	}
+}
+
+static void trace_mm_page_frame(unsigned long pfn, struct page *page)
+{
+	trace_dump_page_frame(pfn, page);
+}
+
+static ssize_t
+trace_mm_pfn_range_read(struct file *filp, char __user *ubuf, size_t cnt,
+			loff_t *ppos)
+{
+	return simple_read_from_buffer(ubuf, cnt, ppos, "0\n", 2);
+}
+
+
+static ssize_t
+trace_mm_pfn_range_write(struct file *filp, const char __user *ubuf, size_t cnt,
+			 loff_t *ppos)
+{
+	unsigned long val, start, end;
+	char buf[64];
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	if (tracing_update_buffers() < 0)
+		return -ENOMEM;
+
+	if (trace_set_clr_event("mm", "dump_page_frame", 1))
+		return -EINVAL;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtol(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+
+	start = *ppos;
+	if (val < 0)
+		end = max_pfn - 1;
+	else
+		end = start + val;
+
+	trace_mm_page_frames(start, end, trace_mm_page_frame);
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static const struct file_operations trace_mm_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_pfn_range_read,
+	.write		= trace_mm_pfn_range_write,
+};
+
+static struct dentry *trace_objects_mm_dir(void)
+{
+	static struct dentry *d_mm;
+	struct dentry *d_objects;
+
+	if (d_mm)
+		return d_mm;
+
+	d_objects = trace_objects_dir();
+	if (!d_objects)
+		return NULL;
+
+	d_mm = debugfs_create_dir("mm", d_objects);
+	if (!d_mm)
+		pr_warning("Could not create 'objects/mm' directory\n");
+
+	return d_mm;
+}
+
+static struct dentry *trace_objects_mm_pages_dir(void)
+{
+	static struct dentry *d_pages;
+	struct dentry *d_mm;
+
+	if (d_pages)
+		return d_pages;
+
+	d_mm = trace_objects_mm_dir();
+	if (!d_mm)
+		return NULL;
+
+	d_pages = debugfs_create_dir("pages", d_mm);
+	if (!d_pages)
+		pr_warning("Could not create debugfs "
+			   "'objects/mm/pages' directory\n");
+
+	return d_pages;
+}
+
+static __init int trace_objects_mm_init(void)
+{
+	struct dentry *d_pages;
+
+	d_pages = trace_objects_mm_pages_dir();
+	if (!d_pages)
+		return 0;
+
+	trace_create_file("dump-pfn", 0600, d_pages, NULL,
+			  &trace_mm_fops);
+
+	return 0;
+}
+fs_initcall(trace_objects_mm_init);
--- linux-mmotm.orig/kernel/trace/trace.h	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/trace.h	2011-08-28 10:09:28.000000000 +0800
@@ -318,6 +318,7 @@ struct dentry *trace_create_file(const c
 				 const struct file_operations *fops);
 
 struct dentry *tracing_init_dentry(void);
+struct dentry *trace_objects_dir(void);
 
 struct ring_buffer_event;
 
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/kernel/trace/trace_objects.c	2011-08-28 10:09:28.000000000 +0800
@@ -0,0 +1,26 @@
+#include <linux/debugfs.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+struct dentry *trace_objects_dir(void)
+{
+	static struct dentry *d_objects;
+	struct dentry *d_tracer;
+
+	if (d_objects)
+		return d_objects;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_objects = debugfs_create_dir("objects", d_tracer);
+	if (!d_objects)
+		pr_warning("Could not create debugfs "
+			   "'objects' directory\n");
+
+	return d_objects;
+}
+
+
--- linux-mmotm.orig/mm/page_alloc.c	2011-08-29 10:43:01.000000000 +0800
+++ linux-mmotm/mm/page_alloc.c	2011-08-29 10:43:03.000000000 +0800
@@ -5743,7 +5743,7 @@ bool is_free_buddy_page(struct page *pag
 }
 #endif
 
-static struct trace_print_flags pageflag_names[] = {
+struct trace_print_flags pageflag_names[] = {
 	{1UL << PG_locked,		"locked"	},
 	{1UL << PG_error,		"error"		},
 	{1UL << PG_referenced,		"referenced"	},
@@ -5790,7 +5790,7 @@ static void dump_page_flags(unsigned lon
 	printk(KERN_ALERT "page flags: %#lx(", flags);
 
 	/* remove zone id */
-	flags &= (1UL << NR_PAGEFLAGS) - 1;
+	flags &= PAGE_FLAGS_MASK;
 
 	for (i = 0; pageflag_names[i].name && flags; i++) {
 
--- linux-mmotm.orig/include/linux/page-flags.h	2011-08-29 10:43:01.000000000 +0800
+++ linux-mmotm/include/linux/page-flags.h	2011-08-29 10:43:03.000000000 +0800
@@ -462,6 +462,7 @@ static inline int PageTransCompound(stru
  * there has been a kernel bug or struct page corruption.
  */
 #define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_MASK			((1 << NR_PAGEFLAGS) - 1)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
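
To make the range handling in trace_mm_page_frames() easier to reason about, here is a minimal userspace sketch of a clamped PFN walk. It is not the kernel code: walk_pfn_range(), pfn_visit_fn, struct pfn_span and record_span() are hypothetical names, and the sketch assumes a half-open range [start, end) with max_pfn as an exclusive upper limit, which is a simplification of the clamping in the patch. The callback is invoked before the counter is advanced, so it sees the PFN that was actually looked up.

```c
/* Hypothetical callback type; the patch passes (pfn, struct page *). */
typedef void (*pfn_visit_fn)(unsigned long pfn, void *ctx);

/* Hypothetical accumulator used by the sample callback below. */
struct pfn_span {
	unsigned long first;
	unsigned long last;
	unsigned long count;
};

/* Sample callback: remember the first and last PFN seen. */
static void record_span(unsigned long pfn, void *ctx)
{
	struct pfn_span *s = ctx;

	if (s->count++ == 0)
		s->first = pfn;
	s->last = pfn;
}

/*
 * Visit every PFN in [start, end), clamped to [0, max_pfn).
 * Returns the number of PFNs visited.
 */
static unsigned long walk_pfn_range(unsigned long start, unsigned long end,
				    unsigned long max_pfn,
				    pfn_visit_fn visit, void *ctx)
{
	unsigned long pfn;
	unsigned long visited = 0;

	if (start >= max_pfn)
		return 0;
	if (end > max_pfn)
		end = max_pfn;

	for (pfn = start; pfn < end; pfn++) {
		visit(pfn, ctx);
		visited++;
	}
	return visited;
}
```

With start = 10000 and end = 10010 this visits PFNs 10000 through 10009, which is what a request of the form "10000 +10" would be expected to cover.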

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC][PATCH 0/7] trace memory objects
@ 2011-08-29  3:59   ` Wu Fengguang
  0 siblings, 0 replies; 26+ messages in thread
From: Wu Fengguang @ 2011-08-29  3:59 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar; +Cc: Mel Gorman, Linux Memory Management List, LKML

> dump-pfn interface
> (it looks more clean and easier for review to fold patches 1-5 into one, but
> let's keep the changelog for the initial post)
> 
> 	[RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace
> 	[RFC][PATCH 2/7] tracing/mm: rename trigger file to dump-pfn
> 	[RFC][PATCH 3/7] tracing/mm: create trace_objects.c
> 	[RFC][PATCH 4/7] tracing/mm: dump more page frame information
> 	[RFC][PATCH 5/7] tracing/mm: accept echo-able input format for pfn range

For your convenience, here is the combined diff for the above 5 incremental ones.

--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/include/trace/events/mm.h	2011-08-29 11:57:16.000000000 +0800
@@ -0,0 +1,69 @@
+#if !defined(_TRACE_MM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MM_H
+
+#include <linux/tracepoint.h>
+#include <linux/page-flags.h>
+#include <linux/mm.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mm
+
+extern struct trace_print_flags pageflag_names[];
+
+/**
+ * dump_page_frame - called by the trace page dump trigger
+ * @pfn: page frame number
+ * @page: pointer to the page frame
+ *
+ * This is a helper trace point into the dumping of the page frames.
+ * It will record various information about a page frame.
+ */
+TRACE_EVENT(dump_page_frame,
+
+	TP_PROTO(unsigned long pfn, struct page *page),
+
+	TP_ARGS(pfn, page),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	pfn		)
+		__field(	struct page *,	page		)
+		__field(	u64,		stable_flags	)
+		__field(	unsigned long,	flags		)
+		__field(	unsigned int,	count		)
+		__field(	unsigned int,	mapcount	)
+		__field(	unsigned long,	private		)
+		__field(	unsigned long,	mapping		)
+		__field(	unsigned long,	index		)
+	),
+
+	TP_fast_assign(
+		__entry->pfn		= pfn;
+		__entry->page		= page;
+		__entry->stable_flags	= stable_page_flags(page);
+		__entry->flags		= page->flags;
+		__entry->count		= atomic_read(&page->_count);
+		__entry->mapcount	= page_mapcount(page);
+		__entry->private	= page->private;
+		__entry->mapping	= (unsigned long)page->mapping;
+		__entry->index		= page->index;
+	),
+
+	TP_printk("pfn=%lu page=%p count=%u mapcount=%u "
+		  "private=%lx mapping=%lx index=%lx flags=%s",
+		  __entry->pfn,
+		  __entry->page,
+		  __entry->count,
+		  __entry->mapcount,
+		  __entry->private,
+		  __entry->mapping,
+		  __entry->index,
+		  ftrace_print_flags_seq(p, "|",
+					 __entry->flags & PAGE_FLAGS_MASK,
+					 pageflag_names)
+	)
+);
+
+#endif /*  _TRACE_MM_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--- linux-mmotm.orig/kernel/trace/Makefile	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/Makefile	2011-08-28 10:09:28.000000000 +0800
@@ -26,6 +26,7 @@ obj-$(CONFIG_RING_BUFFER) += ring_buffer
 obj-$(CONFIG_RING_BUFFER_BENCHMARK) += ring_buffer_benchmark.o
 
 obj-$(CONFIG_TRACING) += trace.o
+obj-$(CONFIG_TRACING) += trace_objects.o
 obj-$(CONFIG_TRACING) += trace_output.o
 obj-$(CONFIG_TRACING) += trace_stat.o
 obj-$(CONFIG_TRACING) += trace_printk.o
@@ -53,6 +54,7 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
+obj-$(CONFIG_EVENT_TRACING) += trace_mm.o
 ifeq ($(CONFIG_TRACING),y)
 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/kernel/trace/trace_mm.c	2011-08-29 11:57:16.000000000 +0800
@@ -0,0 +1,150 @@
+/*
+ * Trace mm pages
+ *
+ * Copyright (C) 2009 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ *
+ * Code based on Matt Mackall's /proc/[kpagecount|kpageflags] code.
+ */
+#include <linux/module.h>
+#include <linux/bootmem.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+
+#include "trace_output.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/mm.h>
+
+void trace_mm_page_frames(unsigned long start, unsigned long end,
+			  void (*trace)(unsigned long pfn, struct page *page))
+{
+	unsigned long pfn = start;
+	struct page *page;
+
+	if (start > max_pfn - 1)
+		return;
+
+	if (end > max_pfn - 1)
+		end = max_pfn - 1;
+
+	while (pfn < end) {
+		page = NULL;
+		if (pfn_valid(pfn))
+			page = pfn_to_page(pfn);
+		if (page)
+			trace(pfn, page);
+		pfn++;
+	}
+}
+
+static void trace_mm_page_frame(unsigned long pfn, struct page *page)
+{
+	trace_dump_page_frame(pfn, page);
+}
+
+static ssize_t
+trace_mm_pfn_range_read(struct file *filp, char __user *ubuf, size_t cnt,
+			loff_t *ppos)
+{
+	return simple_read_from_buffer(ubuf, cnt, ppos, "0\n", 2);
+}
+
+static ssize_t
+trace_mm_pfn_range_write(struct file *filp, const char __user *ubuf, size_t cnt,
+			 loff_t *ppos)
+{
+	unsigned long start, end;
+	char buf[64];
+	long val;
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	if (tracing_update_buffers() < 0)
+		return -ENOMEM;
+
+	if (trace_set_clr_event("mm", "dump_page_frame", 1))
+		return -EINVAL;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtol(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+
+	start = *ppos;
+	if (val < 0)
+		end = max_pfn - 1;
+	else
+		end = start + val;
+
+	trace_mm_page_frames(start, end, trace_mm_page_frame);
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static const struct file_operations trace_mm_fops = {
+	.open		= tracing_open_generic,
+	.read		= trace_mm_pfn_range_read,
+	.write		= trace_mm_pfn_range_write,
+};
+
+static struct dentry *trace_objects_mm_dir(void)
+{
+	static struct dentry *d_mm;
+	struct dentry *d_objects;
+
+	if (d_mm)
+		return d_mm;
+
+	d_objects = trace_objects_dir();
+	if (!d_objects)
+		return NULL;
+
+	d_mm = debugfs_create_dir("mm", d_objects);
+	if (!d_mm)
+		pr_warning("Could not create 'objects/mm' directory\n");
+
+	return d_mm;
+}
+
+static struct dentry *trace_objects_mm_pages_dir(void)
+{
+	static struct dentry *d_pages;
+	struct dentry *d_mm;
+
+	if (d_pages)
+		return d_pages;
+
+	d_mm = trace_objects_mm_dir();
+	if (!d_mm)
+		return NULL;
+
+	d_pages = debugfs_create_dir("pages", d_mm);
+	if (!d_pages)
+		pr_warning("Could not create debugfs "
+			   "'objects/mm/pages' directory\n");
+
+	return d_pages;
+}
+
+static __init int trace_objects_mm_init(void)
+{
+	struct dentry *d_pages;
+
+	d_pages = trace_objects_mm_pages_dir();
+	if (!d_pages)
+		return 0;
+
+	trace_create_file("dump-pfn", 0600, d_pages, NULL,
+			  &trace_mm_fops);
+
+	return 0;
+}
+fs_initcall(trace_objects_mm_init);
--- linux-mmotm.orig/kernel/trace/trace.h	2011-08-28 10:09:25.000000000 +0800
+++ linux-mmotm/kernel/trace/trace.h	2011-08-28 10:09:28.000000000 +0800
@@ -318,6 +318,7 @@ struct dentry *trace_create_file(const c
 				 const struct file_operations *fops);
 
 struct dentry *tracing_init_dentry(void);
+struct dentry *trace_objects_dir(void);
 
 struct ring_buffer_event;
 
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-mmotm/kernel/trace/trace_objects.c	2011-08-28 10:09:28.000000000 +0800
@@ -0,0 +1,26 @@
+#include <linux/debugfs.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+struct dentry *trace_objects_dir(void)
+{
+	static struct dentry *d_objects;
+	struct dentry *d_tracer;
+
+	if (d_objects)
+		return d_objects;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_objects = debugfs_create_dir("objects", d_tracer);
+	if (!d_objects)
+		pr_warning("Could not create debugfs "
+			   "'objects' directory\n");
+
+	return d_objects;
+}
+
+
--- linux-mmotm.orig/mm/page_alloc.c	2011-08-29 10:43:01.000000000 +0800
+++ linux-mmotm/mm/page_alloc.c	2011-08-29 10:43:03.000000000 +0800
@@ -5743,7 +5743,7 @@ bool is_free_buddy_page(struct page *pag
 }
 #endif
 
-static struct trace_print_flags pageflag_names[] = {
+struct trace_print_flags pageflag_names[] = {
 	{1UL << PG_locked,		"locked"	},
 	{1UL << PG_error,		"error"		},
 	{1UL << PG_referenced,		"referenced"	},
@@ -5790,7 +5790,7 @@ static void dump_page_flags(unsigned lon
 	printk(KERN_ALERT "page flags: %#lx(", flags);
 
 	/* remove zone id */
-	flags &= (1UL << NR_PAGEFLAGS) - 1;
+	flags &= PAGE_FLAGS_MASK;
 
 	for (i = 0; pageflag_names[i].name && flags; i++) {
 
--- linux-mmotm.orig/include/linux/page-flags.h	2011-08-29 10:43:01.000000000 +0800
+++ linux-mmotm/include/linux/page-flags.h	2011-08-29 10:43:03.000000000 +0800
@@ -462,6 +462,7 @@ static inline int PageTransCompound(stru
  * there has been a kernel bug or struct page corruption.
  */
 #define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_MASK			((1UL << NR_PAGEFLAGS) - 1)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)

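For readers who want to experiment with the range walk outside the kernel,
here is a minimal user-space sketch of the loop in trace_mm_page_frames().
It is a model under stated assumptions, not the patch itself: model_max_pfn,
model_pfn_valid(), struct pfn_log, log_pfn() and model_walk_pfns() are all
hypothetical names, and the "every fourth frame is a hole" rule is made up
purely so that the skip path is exercised. The sketch reports each PFN
before advancing the counter, so the reported number matches the frame that
was looked up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's max_pfn. */
static const unsigned long model_max_pfn = 16;

/* Hypothetical stand-in for pfn_valid(): pretend every fourth frame is a hole. */
static bool model_pfn_valid(unsigned long pfn)
{
	return pfn % 4 != 3;
}

/* Records what the walk reported, in place of emitting a trace event. */
struct pfn_log {
	unsigned long first;
	unsigned long last;
	size_t count;
};

static void log_pfn(unsigned long pfn, void *ctx)
{
	struct pfn_log *log = ctx;

	if (log->count == 0)
		log->first = pfn;
	log->last = pfn;
	log->count++;
}

/*
 * Walk [start, end): bail out if start is past the last frame, clamp end
 * to model_max_pfn - 1, skip invalid frames, and report the PFN that was
 * actually checked before moving on to the next one.
 */
static void model_walk_pfns(unsigned long start, unsigned long end,
			    void (*report)(unsigned long pfn, void *ctx),
			    void *ctx)
{
	unsigned long pfn;

	if (start > model_max_pfn - 1)
		return;

	if (end > model_max_pfn - 1)
		end = model_max_pfn - 1;

	for (pfn = start; pfn < end; pfn++) {
		if (!model_pfn_valid(pfn))
			continue;
		report(pfn, ctx);
	}
}
```

Calling model_walk_pfns(4, 9, log_pfn, &log) on a zeroed struct pfn_log
visits frames 4, 5, 6 and 8 in this model; frame 7 is skipped as a hole.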


* [PATCH] kernel.h/checkpatch: Mark strict_strto<foo> and simple_strto<foo> as obsolete
       [not found]   ` <20110829152034.cb5d2c28.akpm@linux-foundation.org>
@ 2011-08-30  2:12     ` Joe Perches
  0 siblings, 0 replies; 26+ messages in thread
From: Joe Perches @ 2011-08-30  2:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: LKML

Mark the deprecated strict_strto<foo> macros and simple_strto<foo>
functions as obsolete.

Update checkpatch to warn about their use.
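
For illustration only, the difference between the two parsing styles can be
sketched in user space. The helper below is a hypothetical stand-in, not the
kernel's kstrtol(): strict_parse_long() is an invented name, it is built on
the C library's strtol(), and unlike the kernel helpers it inherits
strtol()'s tolerance for leading whitespace. What it does show is the strict
contract: the whole string must be a number, optionally followed by a single
newline, and errors are returned rather than silently ignored.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space approximation of a strict parser.
 * Returns 0 and stores the value in *res on success, -EINVAL if the
 * string is empty or has trailing garbage, -ERANGE on overflow.
 */
static int strict_parse_long(const char *s, int base, long *res)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, base);

	if (end == s)			/* no digits consumed at all */
		return -EINVAL;
	if (errno == ERANGE)		/* value does not fit in a long */
		return -ERANGE;
	if (*end == '\n')		/* tolerate one trailing newline */
		end++;
	if (*end != '\0')		/* anything else is garbage */
		return -EINVAL;

	*res = val;
	return 0;
}
```

A lenient call such as strtol("42abc", NULL, 10) yields 42 and gives the
caller no hint that "abc" was dropped, whereas strict_parse_long("42abc",
10, &v) fails with -EINVAL.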

Signed-off-by: Joe Perches <joe@perches.com>

---

 include/linux/kernel.h |    4 ++++
 scripts/checkpatch.pl  |    6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 46ac9a5..1f9bc02 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -287,6 +287,8 @@ static inline int __must_check kstrtos32_from_user(const char __user *s, size_t
 	return kstrtoint_from_user(s, count, base, res);
 }
 
+/* Obsolete, do not use.  Use kstrto<foo> instead */
+
 extern unsigned long simple_strtoul(const char *,char **,unsigned int);
 extern long simple_strtol(const char *,char **,unsigned int);
 extern unsigned long long simple_strtoull(const char *,char **,unsigned int);
@@ -296,6 +298,8 @@ extern long long simple_strtoll(const char *,char **,unsigned int);
 #define strict_strtoull	kstrtoull
 #define strict_strtoll	kstrtoll
 
+/* lib/printf utilities */
+
 extern int sprintf(char * buf, const char * fmt, ...)
 	__attribute__ ((format (printf, 2, 3)));
 extern int vsprintf(char *buf, const char *, va_list)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9d761c9..bf1734a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3150,10 +3150,10 @@ sub process {
 			     "consider using a completion\n" . $herecurr);
 
 		}
-# recommend kstrto* over simple_strto*
-		if ($line =~ /\bsimple_(strto.*?)\s*\(/) {
+# recommend kstrto* over simple_strto* and strict_strto*
+		if ($line =~ /\b((simple|strict)_(strto(l|ll|ul|ull)))\s*\(/) {
 			WARN("CONSIDER_KSTRTO",
-			     "consider using kstrto* in preference to simple_$1\n" . $herecurr);
+			     "$1 is obsolete, use k$3 instead\n" . $herecurr);
 		}
 # check for __initcall(), use device_initcall() explicitly please
 		if ($line =~ /^.\s*__initcall\s*\(/) {




Thread overview: 26+ messages
2011-08-29  3:29 [RFC][PATCH 0/7] trace memory objects Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 1/7] tracing/mm: add page frame snapshot trace Wu Fengguang
     [not found]   ` <20110829152034.cb5d2c28.akpm@linux-foundation.org>
2011-08-30  2:12     ` [PATCH] kernel.h/checkpatch: Mark strict_strto<foo> and simple_strto<foo> as obsolete Joe Perches
2011-08-29  3:29 ` [RFC][PATCH 2/7] tracing/mm: rename trigger file to dump-pfn Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 3/7] tracing/mm: create trace_objects.c Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 4/7] tracing/mm: dump more page frame information Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 5/7] tracing/mm: accept echo-able input format for pfn range Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 6/7] tracing/mm: add dump-file and dump-fs interfaces Wu Fengguang
2011-08-29  3:29 ` [RFC][PATCH 7/7] tracing/mm: add memcg field Wu Fengguang
2011-08-29  3:59 ` [RFC][PATCH 0/7] trace memory objects Wu Fengguang
