linux-kernel.vger.kernel.org archive mirror
* ext3-0.9.15 against linux-2.4.14
@ 2001-11-06  9:20 Andrew Morton
  2001-11-06  9:42 ` Alan Cox
  2001-11-06 18:09 ` Steven N. Hirsch
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2001-11-06  9:20 UTC (permalink / raw)
  To: lkml, ext3-users

Download details and documentation are at

	http://www.uow.edu.au/~andrewm/linux/ext3/

Changes since ext3-0.9.13 (which was against linux-2.4.13):

- Fixed a null-pointer dereference oops which could hit on
  SMP machines.  This fix was applied to 2.4.12-ac6, but the
  oops has never been reported against -ac kernels.

- Large amounts of developer debug code have been removed.  This
  code will now be maintained separately.

- There is an interaction failure between ext3 and the current
  Extended Attributes and Access Control Lists patch which leads
  to crashes under heavy load on SMP.  This is possibly due to
  a subtle API change between ext3 in 2.2 and 2.4 kernels (i.e., I
  broke it).  It is on the to-do list.

- For a long time, the ext3 patch has used a semaphore in the core
  kernel to prevent concurrent pagein and truncate of the same
  file.  This was to prevent a race wherein the paging-in task
  would wake up after the truncate and would instantiate a page
  in the process's page tables which had attached buffers.  This
  leads to a BUG() if the swapout code tries to swap the page out.

  This semaphore has been removed.  The swapout code has been altered
  to simply detect and ignore these pages.

  This is an incredibly obscure and hard-to-hit situation.  The testcase
  which used to trigger it can no longer do so.  So if anyone sees the
  message "try_to_swap_out: page has buffers!", please shout out.

  There are no plans to remove this semaphore from -ac kernels,
  unless Alan wants it that way.
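
As a rough illustration of the detect-and-ignore behaviour described in
the last item above, here is a minimal userspace sketch.  It is not the
code from the patch: struct sketch_page, enum swap_result and
sketch_try_to_swap_out() are hypothetical names, and only the "page
still has buffers attached" condition is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page; only the state
 * relevant to this check is modelled. */
struct sketch_page {
	void *buffers;	/* non-NULL while buffers are still attached */
	bool swapped;	/* set once the sketch "swaps out" the page */
};

enum swap_result { SWAP_DONE, SWAP_SKIPPED };

/* Detect-and-ignore: a page that still has buffers attached is left
 * alone rather than treated as a fatal condition. */
enum swap_result sketch_try_to_swap_out(struct sketch_page *page)
{
	assert(page != NULL);
	if (page->buffers != NULL)
		return SWAP_SKIPPED;	/* leftover from truncate: skip it */
	page->swapped = true;
	return SWAP_DONE;
}
```

In this sketch a page with buffers attached is simply reported as
skipped, where previously the equivalent situation would have ended in
a BUG().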


* Re: ext3-0.9.15 against linux-2.4.14
  2001-11-06  9:20 ext3-0.9.15 against linux-2.4.14 Andrew Morton
@ 2001-11-06  9:42 ` Alan Cox
  2001-11-06 18:09 ` Steven N. Hirsch
  1 sibling, 0 replies; 6+ messages in thread
From: Alan Cox @ 2001-11-06  9:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: lkml, ext3-users

>   There are no plans to remove this semaphore from -ac kernels,
>   unless Alan wants it that way.

That should just come out by magic as the VM and other stuff converge.


* Re: ext3-0.9.15 against linux-2.4.14
  2001-11-06  9:20 ext3-0.9.15 against linux-2.4.14 Andrew Morton
  2001-11-06  9:42 ` Alan Cox
@ 2001-11-06 18:09 ` Steven N. Hirsch
  2001-11-06 18:49   ` Andrew Morton
  2001-11-07  0:31   ` Stephen Tweedie
  1 sibling, 2 replies; 6+ messages in thread
From: Steven N. Hirsch @ 2001-11-06 18:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: lkml, ext3-users

On Tue, 6 Nov 2001, Andrew Morton wrote:

> Download details and documentation are at
> 
> 	http://www.uow.edu.au/~andrewm/linux/ext3/
> 
> Changes since ext3-0.9.13 (which was against linux-2.4.13):
> 
> - For a long time, the ext3 patch has used a semaphore in the core
>   kernel to prevent concurrent pagein and truncate of the same
>   file.  This was to prevent a race wherein the paging-in task
>   would wake up after the truncate and would instantiate a page
>   in the process's page tables which had attached buffers.  This
>   leads to a BUG() if the swapout code tries to swap the page out.
> 
>   This semaphore has been removed.  The swapout code has been altered
>   to simply detect and ignore these pages.
> 
>   This is an incredibly obscure and hard-to-hit situation.  The testcase
>   which used to trigger it can no longer do so.  So if anyone sees the
>   message "try_to_swap_out: page has buffers!", please shout out.

Andrew,

I have been getting thousands of these when the system was under heavy 
load, but didn't realize it was from the ext3 code!  I'm using Linus's 
2.4.14-pre7 + ext3 patch from Neil Brown's site (the latter is identified 
as "ZeroNineFourteen".)  Would you like me to upgrade kernel and patch?

Steve




* Re: ext3-0.9.15 against linux-2.4.14
  2001-11-06 18:09 ` Steven N. Hirsch
@ 2001-11-06 18:49   ` Andrew Morton
  2001-11-07  0:31   ` Stephen Tweedie
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2001-11-06 18:49 UTC (permalink / raw)
  To: Steven N. Hirsch; +Cc: lkml, ext3-users

"Steven N. Hirsch" wrote:
> 
> On Tue, 6 Nov 2001, Andrew Morton wrote:
> 
> > Download details and documentation are at
> >
> >       http://www.uow.edu.au/~andrewm/linux/ext3/
> >
> > Changes since ext3-0.9.13 (which was against linux-2.4.13):
> >
> > - For a long time, the ext3 patch has used a semaphore in the core
> >   kernel to prevent concurrent pagein and truncate of the same
> >   file.  This was to prevent a race wherein the paging-in task
> >   would wake up after the truncate and would instantiate a page
> >   in the process's page tables which had attached buffers.  This
> >   leads to a BUG() if the swapout code tries to swap the page out.
> >
> >   This semaphore has been removed.  The swapout code has been altered
> >   to simply detect and ignore these pages.
> >
> >   This is an incredibly obscure and hard-to-hit situation.  The testcase
> >   which used to trigger it can no longer do so.  So if anyone sees the
> >   message "try_to_swap_out: page has buffers!", please shout out.
> 
> Andrew,
> 
> I have been getting thousands of these when the system was under heavy
> load, but didn't realize it was from the ext3 code!  I'm using Linus's
> 2.4.14-pre7 + ext3 patch from Neil Brown's site (the latter is identified
> as "ZeroNineFourteen".)  Would you like me to upgrade kernel and patch?
> 

Now that's interesting.  The printk is in there so I can ensure
that the codepath gets tested and is known to work.

Could you please send me details of the hardware setup, URL
for Neil's patch and a description of the workload?  Whatever
I need to make it happen locally.

If the message bothers you, please just remove the printk from 
vmscan.c.


* Re: ext3-0.9.15 against linux-2.4.14
  2001-11-06 18:09 ` Steven N. Hirsch
  2001-11-06 18:49   ` Andrew Morton
@ 2001-11-07  0:31   ` Stephen Tweedie
  2001-11-07 17:59     ` Andrew Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Stephen Tweedie @ 2001-11-07  0:31 UTC (permalink / raw)
  To: Steven N. Hirsch; +Cc: Andrew Morton, lkml, ext3-users, Stephen Tweedie

Hi,

On Tue, Nov 06, 2001 at 01:09:42PM -0500, Steven N. Hirsch wrote:

> >   This is an incredibly obscure and hard-to-hit situation.  The testcase
> >   which used to trigger it can no longer do so.  So if anyone sees the
> >   message "try_to_swap_out: page has buffers!", please shout out.
 
> I have been getting thousands of these when the system was under heavy 
> load, but didn't realize it was from the ext3 code!  I'm using Linus's 
> 2.4.14-pre7 + ext3 patch from Neil Brown's site (the latter is identified 
> as "ZeroNineFourteen".)  Would you like me to upgrade kernel and patch?

Andrew, the code

	if (page->buffers) {
		/*
		 * Anonymous buffercache page left behind by
		 * truncate.
		 */
		printk(__FUNCTION__ ": page has buffers!\n");
		goto preserve;
	}

is going to end up preserving the pte forever and shouting to syslog
every time the VM walks over the pte in question.  I'd be just as
happy dropping these ptes on the floor when we find them, as they are
clearly of no use to anybody at this point.
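
To make the difference between the two policies concrete, here is a
small userspace sketch that counts how often the warning would fire
over repeated scans of one stale pte.  All names (struct sketch_pte,
enum stale_policy, sketch_count_warnings()) are hypothetical, and the
model assumes the buffers never go away on their own and that the drop
policy logs nothing.

```c
#include <assert.h>
#include <stdbool.h>

enum stale_policy { POLICY_PRESERVE, POLICY_DROP };

/* Hypothetical model of one pte that maps a page left behind by
 * truncate. */
struct sketch_pte {
	bool present;		/* mapping is still installed */
	bool has_buffers;	/* mapped page still has buffers attached */
};

/* One scan over the pte; returns 1 if the warning would be logged. */
static int sketch_scan_once(struct sketch_pte *pte, enum stale_policy policy)
{
	if (!pte->present || !pte->has_buffers)
		return 0;
	if (policy == POLICY_DROP) {
		pte->present = false;	/* unmap; later scans never see it */
		return 0;
	}
	return 1;	/* preserve: the pte stays and the message repeats */
}

/* Count the warnings produced by a given number of scans. */
int sketch_count_warnings(enum stale_policy policy, int scans)
{
	struct sketch_pte pte = { true, true };
	int warnings = 0;

	assert(scans >= 0);
	for (int i = 0; i < scans; i++)
		warnings += sketch_scan_once(&pte, policy);
	return warnings;
}
```

Under these assumptions the preserve policy logs once per scan for as
long as the pte exists, while the drop policy never revisits it.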

Cheers,
 Stephen


* Re: ext3-0.9.15 against linux-2.4.14
  2001-11-07  0:31   ` Stephen Tweedie
@ 2001-11-07 17:59     ` Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2001-11-07 17:59 UTC (permalink / raw)
  To: Stephen Tweedie; +Cc: Steven N. Hirsch, lkml, ext3-users

Stephen Tweedie wrote:
> 
> Andrew, the code
> 
>         if (page->buffers) {
>                 /*
>                  * Anonymous buffercache page left behind by
>                  * truncate.
>                  */
>                 printk(__FUNCTION__ ": page has buffers!\n");
>                 goto preserve;
>         }
> 
> is going to end up preserving the pte forever and shouting to syslog
> every time the VM walks over the pte in question.  I'd be just as
> happy dropping these ptes on the floor when we find them, as they are
> clearly of no use to anybody at this point.
> 

Yes, perhaps we could do something smarter - I wasn't even sure it 
was possible to hit any more (still waiting to hear back from
Steve Hirsch!)

The idea is that in this rare case, shrink_cache() will at
some later time revisit the page and again try to remove its
buffers, and will succeed.   It's still on the LRU.
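
The intended sequence can be sketched in userspace as follows.  This is
only a model under stated assumptions: struct lru_sketch_page, the
buffers_busy flag, sketch_shrink_cache() and sketch_swap_out() are
hypothetical and do not mirror the real kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page that is still on the LRU. */
struct lru_sketch_page {
	bool has_buffers;	/* buffers still attached */
	bool buffers_busy;	/* assumption: buffers cannot be freed yet */
	bool mapped;		/* still mapped by some pte */
};

/* LRU pass: strip the buffers if they can be freed.  Returns true once
 * the page has no buffers attached. */
bool sketch_shrink_cache(struct lru_sketch_page *page)
{
	assert(page != NULL);
	if (page->has_buffers && !page->buffers_busy)
		page->has_buffers = false;
	return !page->has_buffers;
}

/* Swap-out pass: preserve the mapping while buffers are attached,
 * otherwise unmap the page.  Returns true if the page was unmapped. */
bool sketch_swap_out(struct lru_sketch_page *page)
{
	assert(page != NULL);
	if (page->has_buffers)
		return false;	/* preserved for now; retried later */
	page->mapped = false;
	return true;
}
```

In this model a swap-out attempt fails while the buffers are attached,
a later LRU pass removes them, and the next swap-out attempt succeeds.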

We definitely need to kill the printk(), but I really want
to get to test this code path locally.
