From: Mathieu Desnoyers <compudj@krystal.dyndns.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Jonathan Corbet <corbet@lwn.net>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
prasad@linux.vnet.ibm.com, "Frank Ch. Eigler" <fche@redhat.com>,
David Wilder <dwilder@us.ibm.com>,
hch@lst.de, Martin Bligh <mbligh@google.com>,
Christoph Hellwig <hch@infradead.org>,
Masami Hiramatsu <mhiramat@redhat.com>,
Steven Rostedt <srostedt@redhat.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Subject: Re: [PATCH] ring_buffer: allocate buffer page pointer
Date: Wed, 1 Oct 2008 13:36:53 -0400 [thread overview]
Message-ID: <20081001173653.GA14538@Krystal> (raw)
In-Reply-To: <alpine.DEB.1.10.0810011114000.30777@gandalf.stny.rr.com>
* Steven Rostedt (rostedt@goodmis.org) wrote:
>
> The current method of overlaying the page frame as the buffer page pointer
> can be very dangerous and limits our ability to do other things with
> a page from the buffer, like send it off to disk.
>
> This patch allocates the buffer_page instead of overlaying the page's
> page frame. The use of the buffer_page has hardly changed due to this.
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> kernel/trace/ring_buffer.c | 54 ++++++++++++++++++++++++++-------------------
> 1 file changed, 32 insertions(+), 22 deletions(-)
>
> Index: linux-tip.git/kernel/trace/ring_buffer.c
> ===================================================================
> --- linux-tip.git.orig/kernel/trace/ring_buffer.c 2008-10-01 09:37:23.000000000 -0400
> +++ linux-tip.git/kernel/trace/ring_buffer.c 2008-10-01 11:03:16.000000000 -0400
> @@ -115,16 +115,10 @@ void *ring_buffer_event_data(struct ring
> * Thanks to Peter Zijlstra for suggesting this idea.
> */
> struct buffer_page {
> - union {
> - struct {
> - unsigned long flags; /* mandatory */
> - atomic_t _count; /* mandatory */
> - u64 time_stamp; /* page time stamp */
> - unsigned size; /* size of page data */
> - struct list_head list; /* list of free pages */
> - };
> - struct page page;
> - };
> + u64 time_stamp; /* page time stamp */
> + unsigned size; /* size of page data */
> + struct list_head list; /* list of free pages */
> + void *page; /* Actual data page */
> };
>
> /*
> @@ -133,9 +127,9 @@ struct buffer_page {
> */
> static inline void free_buffer_page(struct buffer_page *bpage)
> {
> - reset_page_mapcount(&bpage->page);
> - bpage->page.mapping = NULL;
> - __free_page(&bpage->page);
> + if (bpage->page)
> + __free_page(bpage->page);
> + kfree(bpage);
> }
>
> /*
> @@ -237,11 +231,16 @@ static int rb_allocate_pages(struct ring
> unsigned i;
>
> for (i = 0; i < nr_pages; i++) {
> + page = kzalloc_node(ALIGN(sizeof(*page), cache_line_size()),
> + GFP_KERNEL, cpu_to_node(cpu));
> + if (!page)
> + goto free_pages;
> + list_add(&page->list, &pages);
> +
> addr = __get_free_page(GFP_KERNEL);
> if (!addr)
> goto free_pages;
> - page = (struct buffer_page *)virt_to_page(addr);
> - list_add(&page->list, &pages);
> + page->page = (void *)addr;
> }
>
> list_splice(&pages, head);
> @@ -262,6 +261,7 @@ static struct ring_buffer_per_cpu *
> rb_allocate_cpu_buffer(struct ring_buffer *buffer, int cpu)
> {
> struct ring_buffer_per_cpu *cpu_buffer;
> + struct buffer_page *page;
> unsigned long addr;
> int ret;
>
> @@ -275,10 +275,17 @@ rb_allocate_cpu_buffer(struct ring_buffe
> spin_lock_init(&cpu_buffer->lock);
> INIT_LIST_HEAD(&cpu_buffer->pages);
>
> + page = kzalloc_node(ALIGN(sizeof(*page), cache_line_size()),
> + GFP_KERNEL, cpu_to_node(cpu));
Hi Steven,
I understand that you want to allocate these struct buffer_page in
memory local to a given cpu node, which is great, but why do you feel
you need to align them on cache_line_size() ?
Hrm.. you put the timestamp in there, so I guess you're concerned about
having a writer on one CPU, a reader on another, and the fact that you
will have cache line bouncing because of that.
Note that if you put the timestamp and the unused bytes in a tiny header
at the beginning of the page, you
1 - make this information directly accessible for disk, network I/O
without any other abstraction layer.
2 - won't have to do such alignment on the struct buffer_page, because
it will only be read once it's been allocated.
My 2 cents ;)
Mathieu
> + if (!page)
> + goto fail_free_buffer;
> +
> + cpu_buffer->reader_page = page;
> addr = __get_free_page(GFP_KERNEL);
> if (!addr)
> - goto fail_free_buffer;
> - cpu_buffer->reader_page = (struct buffer_page *)virt_to_page(addr);
> + goto fail_free_reader;
> + page->page = (void *)addr;
> +
> INIT_LIST_HEAD(&cpu_buffer->reader_page->list);
> cpu_buffer->reader_page->size = 0;
>
> @@ -523,11 +530,16 @@ int ring_buffer_resize(struct ring_buffe
>
> for_each_buffer_cpu(buffer, cpu) {
> for (i = 0; i < new_pages; i++) {
> + page = kzalloc_node(ALIGN(sizeof(*page),
> + cache_line_size()),
> + GFP_KERNEL, cpu_to_node(cpu));
> + if (!page)
> + goto free_pages;
> + list_add(&page->list, &pages);
> addr = __get_free_page(GFP_KERNEL);
> if (!addr)
> goto free_pages;
> - page = (struct buffer_page *)virt_to_page(addr);
> - list_add(&page->list, &pages);
> + page->page = (void *)addr;
> }
> }
>
> @@ -567,9 +579,7 @@ static inline int rb_null_event(struct r
>
> static inline void *rb_page_index(struct buffer_page *page, unsigned index)
> {
> - void *addr = page_address(&page->page);
> -
> - return addr + index;
> + return page->page + index;
> }
>
> static inline struct ring_buffer_event *
>
>
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2008-10-01 17:37 UTC|newest]
Thread overview: 102+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-25 18:51 [RFC PATCH 0/2 v3] Unified trace buffer Steven Rostedt
2008-09-25 18:51 ` [RFC PATCH 1/2 " Steven Rostedt
2008-09-26 1:02 ` [RFC PATCH v4] " Steven Rostedt
2008-09-26 1:52 ` Masami Hiramatsu
2008-09-26 2:11 ` Steven Rostedt
2008-09-26 2:47 ` Masami Hiramatsu
2008-09-26 3:20 ` Mathieu Desnoyers
2008-09-26 7:18 ` Peter Zijlstra
2008-09-26 10:45 ` Steven Rostedt
2008-09-26 11:00 ` Peter Zijlstra
2008-09-26 16:57 ` Masami Hiramatsu
2008-09-26 17:14 ` Steven Rostedt
2008-09-26 10:47 ` Steven Rostedt
2008-09-26 16:04 ` Mathieu Desnoyers
2008-09-26 17:11 ` [PATCH v5] " Steven Rostedt
2008-09-26 17:31 ` Arnaldo Carvalho de Melo
2008-09-26 17:37 ` Linus Torvalds
2008-09-26 17:46 ` Steven Rostedt
2008-09-27 17:02 ` Ingo Molnar
2008-09-27 17:18 ` Steven Rostedt
2008-09-26 18:05 ` [PATCH v6] " Steven Rostedt
2008-09-26 18:30 ` Richard Holden
2008-09-26 18:39 ` Steven Rostedt
2008-09-26 18:59 ` Peter Zijlstra
2008-09-26 19:46 ` Martin Bligh
2008-09-26 19:52 ` Steven Rostedt
2008-09-26 21:37 ` Steven Rostedt
2008-09-26 19:14 ` Peter Zijlstra
2008-09-26 22:28 ` Mike Travis
2008-09-26 23:56 ` Steven Rostedt
2008-09-27 0:05 ` Mike Travis
2008-09-27 0:18 ` Steven Rostedt
2008-09-27 0:46 ` Mike Travis
2008-09-27 0:52 ` Steven Rostedt
2008-09-26 19:17 ` Peter Zijlstra
2008-09-26 23:16 ` Arjan van de Ven
2008-09-26 20:08 ` Peter Zijlstra
2008-09-26 21:14 ` Masami Hiramatsu
2008-09-26 21:26 ` Steven Rostedt
2008-09-26 21:13 ` [PATCH v7] " Steven Rostedt
2008-09-27 2:02 ` [PATCH v8] " Steven Rostedt
2008-09-27 6:06 ` [PATCH v9] " Steven Rostedt
2008-09-27 18:39 ` Ingo Molnar
2008-09-27 19:24 ` Steven Rostedt
2008-09-27 19:41 ` Ingo Molnar
2008-09-27 19:54 ` Steven Rostedt
2008-09-27 20:00 ` Ingo Molnar
2008-09-29 15:05 ` Steven Rostedt
2008-09-27 20:07 ` Martin Bligh
2008-09-27 20:34 ` Ingo Molnar
2008-09-29 16:10 ` [PATCH v10 Golden] " Steven Rostedt
2008-09-29 16:11 ` Steven Rostedt
2008-09-29 23:35 ` Mathieu Desnoyers
2008-09-30 0:01 ` Steven Rostedt
2008-09-30 0:03 ` Mathieu Desnoyers
2008-09-30 0:12 ` Steven Rostedt
2008-09-30 3:46 ` Mathieu Desnoyers
2008-09-30 4:00 ` Steven Rostedt
2008-09-30 15:20 ` Jonathan Corbet
2008-09-30 15:54 ` Peter Zijlstra
2008-09-30 16:38 ` Linus Torvalds
2008-09-30 16:48 ` Steven Rostedt
2008-09-30 17:00 ` Peter Zijlstra
2008-09-30 17:41 ` Steven Rostedt
2008-09-30 17:49 ` Peter Zijlstra
2008-09-30 17:56 ` Steven Rostedt
2008-09-30 18:02 ` Steven Rostedt
2008-09-30 17:01 ` Linus Torvalds
2008-10-01 15:14 ` [PATCH] ring_buffer: allocate buffer page pointer Steven Rostedt
2008-10-01 17:36 ` Mathieu Desnoyers [this message]
2008-10-01 17:49 ` Steven Rostedt
2008-10-01 18:21 ` Mathieu Desnoyers
2008-10-02 8:50 ` Ingo Molnar
2008-10-02 8:51 ` Ingo Molnar
2008-10-02 9:05 ` [PATCH] ring-buffer: fix build error Ingo Molnar
2008-10-02 9:38 ` [boot crash] " Ingo Molnar
2008-10-02 13:16 ` Steven Rostedt
2008-10-02 13:17 ` Steven Rostedt
2008-10-02 15:50 ` Ingo Molnar
2008-10-02 18:27 ` Steven Rostedt
2008-10-02 18:55 ` Ingo Molnar
2008-10-02 23:18 ` [PATCH] ring_buffer: map to cpu not page Steven Rostedt
2008-10-02 23:36 ` Steven Rostedt
2008-10-03 4:56 ` [PATCH] x86 Topology cpu_to_node parameter check Mathieu Desnoyers
2008-10-03 5:20 ` Steven Rostedt
2008-10-03 15:56 ` Mathieu Desnoyers
2008-10-03 16:26 ` Steven Rostedt
2008-10-03 17:21 ` Mathieu Desnoyers
2008-10-03 17:54 ` Steven Rostedt
2008-10-03 18:53 ` [PATCH] topology.h define mess fix Mathieu Desnoyers
2008-10-03 20:14 ` Luck, Tony
2008-10-03 22:47 ` [PATCH] topology.h define mess fix v2 Mathieu Desnoyers
2008-10-03 7:27 ` [PATCH] ring_buffer: map to cpu not page Ingo Molnar
2008-10-02 9:06 ` [PATCH] ring_buffer: allocate buffer page pointer Andrew Morton
2008-10-02 9:41 ` Ingo Molnar
2008-10-02 13:06 ` Steven Rostedt
2008-09-26 22:31 ` [PATCH v6] Unified trace buffer Arnaldo Carvalho de Melo
2008-09-26 23:58 ` Steven Rostedt
2008-09-27 0:13 ` Linus Torvalds
2008-09-27 0:23 ` Steven Rostedt
2008-09-27 0:28 ` Steven Rostedt
2008-09-25 18:51 ` [RFC PATCH 2/2 v3] ftrace: make work with new ring buffer Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081001173653.GA14538@Krystal \
--to=compudj@krystal.dyndns.org \
--cc=acme@ghostprotocols.net \
--cc=akpm@linux-foundation.org \
--cc=corbet@lwn.net \
--cc=dwilder@us.ibm.com \
--cc=fche@redhat.com \
--cc=hch@infradead.org \
--cc=hch@lst.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mbligh@google.com \
--cc=mhiramat@redhat.com \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=prasad@linux.vnet.ibm.com \
--cc=rostedt@goodmis.org \
--cc=srostedt@redhat.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).