* 2.6.1 sendfile regression
@ 2004-01-10 0:01 Felix von Leitner
2004-01-10 0:21 ` Andrew Morton
2004-01-10 5:23 ` Lennert Buytenhek
0 siblings, 2 replies; 7+ messages in thread
From: Felix von Leitner @ 2004-01-10 0:01 UTC (permalink / raw)
To: linux-kernel
I'm getting a huge sendfile regression in 2.6.1; when serving a large
file on an IPv4 TCP socket via sendfile64, the transfer starts at about
4 MB (2.6.0: >7 MB) and the hard disk like stays on 100% (which normally
only happens if the file is badly fragmented, fragmentation of this file
is 0%). Then, suddenly, the network performance drops dramatically,
getting worse and worse. strace shows that the process is hanging
inside sendfile64 (which should not happen since the socket is
non-blocking). The process then stays inside sendfile for up to a
minute or so and can't be interrupted or killed in that time.
Please fix!
Felix
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-10 0:01 2.6.1 sendfile regression Felix von Leitner
@ 2004-01-10 0:21 ` Andrew Morton
2004-01-10 0:52 ` Ram Pai
2004-01-10 5:23 ` Lennert Buytenhek
1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2004-01-10 0:21 UTC (permalink / raw)
To: Felix von Leitner; +Cc: linux-kernel, Ram Pai
Felix von Leitner <felix-kernel@fefe.de> wrote:
>
> I'm getting a huge sendfile regression in 2.6.1; when serving a large
> file on an IPv4 TCP socket via sendfile64, the transfer starts at about
> 4 MB (2.6.0: >7 MB) and the hard disk like stays on 100% (which normally
> only happens if the file is badly fragmented, fragmentation of this file
> is 0%). Then, suddenly, the network performance drops dramatically,
> getting worse and worse. strace shows that the process is hanging
> inside sendfile64 (which should not happen since the socket is
> non-blocking). The process then stays inside sendfile for up to a
> minute or so and can't be interrupted or killed in that time.
>
Probably it is the ill-advised readahead tweak. Does the below patch fix
it up?
diff -puN mm/filemap.c~a mm/filemap.c
--- 25/mm/filemap.c~a Fri Jan 9 16:20:32 2004
+++ 25-akpm/mm/filemap.c Fri Jan 9 16:20:46 2004
@@ -596,12 +596,6 @@ void do_generic_mapping_read(struct addr
offset = *ppos & ~PAGE_CACHE_MASK;
last = (*ppos + desc->count) >> PAGE_CACHE_SHIFT;
- /*
- * Let the readahead logic know upfront about all
- * the pages we'll need to satisfy this request
- */
- for (; index < last; index++)
- page_cache_readahead(mapping, ra, filp, index);
index = *ppos >> PAGE_CACHE_SHIFT;
for (;;) {
_
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-10 0:21 ` Andrew Morton
@ 2004-01-10 0:52 ` Ram Pai
2004-01-10 1:42 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Ram Pai @ 2004-01-10 0:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: Felix von Leitner, linux-kernel
On Fri, 2004-01-09 at 16:21, Andrew Morton wrote:
> Probably it is the ill-advised readahead tweak. Does the below patch fix
> it up?
>
> diff -puN mm/filemap.c~a mm/filemap.c
> --- 25/mm/filemap.c~a Fri Jan 9 16:20:32 2004
> +++ 25-akpm/mm/filemap.c Fri Jan 9 16:20:46 2004
> @@ -596,12 +596,6 @@ void do_generic_mapping_read(struct addr
> offset = *ppos & ~PAGE_CACHE_MASK;
> last = (*ppos + desc->count) >> PAGE_CACHE_SHIFT;
>
> - /*
> - * Let the readahead logic know upfront about all
> - * the pages we'll need to satisfy this request
> - */
> - for (; index < last; index++)
> - page_cache_readahead(mapping, ra, filp, index);
> index = *ppos >> PAGE_CACHE_SHIFT;
>
> for (;;) {
>
There is a small mistake in Andrew's patch. The call to
page_cache_readahead() is missing.
Try this one.
*** linux-2.6.1-rc2/mm/filemap.c Fri Jan 9 00:47:04 2004
--- linux-2.6.1-rc2/mm/filemap.c.1 Fri Jan 9 00:46:43 2004
***************
*** 587,608 ****
read_actor_t actor)
{
struct inode *inode = mapping->host;
! unsigned long index, offset, last;
struct page *cached_page;
int error;
cached_page = NULL;
index = *ppos >> PAGE_CACHE_SHIFT;
offset = *ppos & ~PAGE_CACHE_MASK;
- last = (*ppos + desc->count) >> PAGE_CACHE_SHIFT;
-
- /*
- * Let the readahead logic know upfront about all
- * the pages we'll need to satisfy this request
- */
- for (; index < last; index++)
- page_cache_readahead(mapping, ra, filp, index);
- index = *ppos >> PAGE_CACHE_SHIFT;
for (;;) {
struct page *page;
--- 587,599 ----
read_actor_t actor)
{
struct inode *inode = mapping->host;
! unsigned long index, offset;
struct page *cached_page;
int error;
cached_page = NULL;
index = *ppos >> PAGE_CACHE_SHIFT;
offset = *ppos & ~PAGE_CACHE_MASK;
for (;;) {
struct page *page;
***************
*** 621,626 ****
--- 612,618 ----
}
cond_resched();
+ page_cache_readahead(mapping, ra, filp, index);
nr = nr - offset;
find_page:
> _
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-10 0:52 ` Ram Pai
@ 2004-01-10 1:42 ` Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2004-01-10 1:42 UTC (permalink / raw)
To: Ram Pai; +Cc: felix-kernel, linux-kernel
Ram Pai <linuxram@us.ibm.com> wrote:
>
> There is a small mistake in Andrew's patch. The call to
> page_cache_readahead() is missing.
>
> Try this one.
Third time lucky.
diff -puN mm/filemap.c~readahead-partial-backout mm/filemap.c
--- 25/mm/filemap.c~readahead-partial-backout 2004-01-09 17:41:14.000000000 -0800
+++ 25-akpm/mm/filemap.c 2004-01-09 17:41:14.000000000 -0800
@@ -587,22 +587,13 @@ void do_generic_mapping_read(struct addr
read_actor_t actor)
{
struct inode *inode = mapping->host;
- unsigned long index, offset, last;
+ unsigned long index, offset;
struct page *cached_page;
int error;
cached_page = NULL;
index = *ppos >> PAGE_CACHE_SHIFT;
offset = *ppos & ~PAGE_CACHE_MASK;
- last = (*ppos + desc->count) >> PAGE_CACHE_SHIFT;
-
- /*
- * Let the readahead logic know upfront about all
- * the pages we'll need to satisfy this request
- */
- for (; index < last; index++)
- page_cache_readahead(mapping, ra, filp, index);
- index = *ppos >> PAGE_CACHE_SHIFT;
for (;;) {
struct page *page;
@@ -621,6 +612,7 @@ void do_generic_mapping_read(struct addr
}
cond_resched();
+ page_cache_readahead(mapping, ra, filp, index);
nr = nr - offset;
find_page:
_
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-10 0:01 2.6.1 sendfile regression Felix von Leitner
2004-01-10 0:21 ` Andrew Morton
@ 2004-01-10 5:23 ` Lennert Buytenhek
2004-01-12 2:58 ` Linus Torvalds
1 sibling, 1 reply; 7+ messages in thread
From: Lennert Buytenhek @ 2004-01-10 5:23 UTC (permalink / raw)
To: Felix von Leitner; +Cc: linux-kernel
On Sat, Jan 10, 2004 at 01:01:28AM +0100, Felix von Leitner wrote:
> strace shows that the process is hanging
> inside sendfile64 (which should not happen since the socket is
> non-blocking).
What if the data you're sending is not in the page cache?
--L
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-10 5:23 ` Lennert Buytenhek
@ 2004-01-12 2:58 ` Linus Torvalds
2004-01-12 11:31 ` Lennert Buytenhek
0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2004-01-12 2:58 UTC (permalink / raw)
To: Lennert Buytenhek; +Cc: Felix von Leitner, linux-kernel
On Sat, 10 Jan 2004, Lennert Buytenhek wrote:
> On Sat, Jan 10, 2004 at 01:01:28AM +0100, Felix von Leitner wrote:
>
> > strace shows that the process is hanging
> > inside sendfile64 (which should not happen since the socket is
> > non-blocking).
>
> What if the data you're sending is not in the page cache?
It will always block on the actual page cache, although we could try to
change that. However, even if it blocks, it should only block at one page
at a time (or "incidental" blockage due to memory allocations etc).
Blocking for long times implies a bug.
Linus
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: 2.6.1 sendfile regression
2004-01-12 2:58 ` Linus Torvalds
@ 2004-01-12 11:31 ` Lennert Buytenhek
0 siblings, 0 replies; 7+ messages in thread
From: Lennert Buytenhek @ 2004-01-12 11:31 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Felix von Leitner, linux-kernel
On Sun, Jan 11, 2004 at 06:58:55PM -0800, Linus Torvalds wrote:
> > > strace shows that the process is hanging
> > > inside sendfile64 (which should not happen since the socket is
> > > non-blocking).
> >
> > What if the data you're sending is not in the page cache?
>
> It will always block on the actual page cache, although we could try to
> change that.
My impression is that this is the reason why AIO sendfile was attempted.
Although my wild guess is that that probably would only work on raw block
devices and still not on regular filesystem files.
> However, even if it blocks, it should only block at one page
> at a time (or "incidental" blockage due to memory allocations etc).
>
> Blocking for long times implies a bug.
(Even if it only blocks a page at a time, it will happily block on the
next page and the one after that as long as there is write space in the
destination socket?)
--L
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2004-01-12 11:32 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-10 0:01 2.6.1 sendfile regression Felix von Leitner
2004-01-10 0:21 ` Andrew Morton
2004-01-10 0:52 ` Ram Pai
2004-01-10 1:42 ` Andrew Morton
2004-01-10 5:23 ` Lennert Buytenhek
2004-01-12 2:58 ` Linus Torvalds
2004-01-12 11:31 ` Lennert Buytenhek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).