linux-kernel.vger.kernel.org archive mirror
* mmap-related questions
@ 2003-03-31 14:41 Kenny Simpson
  2003-03-31 17:55 ` Benjamin LaHaise
  0 siblings, 1 reply; 8+ messages in thread
From: Kenny Simpson @ 2003-03-31 14:41 UTC (permalink / raw)
  To: linux-kernel

Greetings!  I hate to ask this type of question here,
but having searched the list and Google, I have
found no good answers, so here goes.

If I use mmap to give me a sliding window view onto a
file (mmap/munmap/mmap or mremap), how can I sync all
unmapped memory associated with the file?

I read from Stevens that "the call to munmap does not
cause the contents of the mapped region to be written
to the disk file.", but I don't want to pay the
penalty of doing many msync()'s each time I move my
window.
I tested that fsync() does not seem to sync pages that
were mapped with mmap.  Is there some way to sync all
data associated with the file?  Is there a way which
is also portable to Solaris 2.6?

Thanks,
-Kenny

BTW: I'm using 2.4.7 (RH enterprise)



* Re: mmap-related questions
  2003-03-31 14:41 mmap-related questions Kenny Simpson
@ 2003-03-31 17:55 ` Benjamin LaHaise
  2003-04-01  3:25   ` Kenny Simpson
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin LaHaise @ 2003-03-31 17:55 UTC (permalink / raw)
  To: Kenny Simpson; +Cc: linux-kernel

On Mon, Mar 31, 2003 at 06:41:10AM -0800, Kenny Simpson wrote:
> I tested that fsync() does not seem to sync pages that
> were mapped with mmap.  Is there some way to sync all
> data associated with the file?  Is there a way which
> is also portable to Solaris 2.6?

No.  You must use msync().  Note that fsync() after munmap() will flush the 
pages to disk under Linux.
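
A minimal sketch of the portable variant of this advice, assuming a POSIX
system: each window is flushed with msync(MS_SYNC) before it is unmapped.
The helper name fill_with_msync, its parameters, and the fill pattern are
illustrative assumptions and do not come from the thread.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): fill the first `total` bytes of
 * the file at `path` with `byte`, one `win`-byte MAP_SHARED window at a time.
 * `win` must be a non-zero multiple of the page size and must divide `total`.
 * Returns 0 on success, -1 on error.
 */
int fill_with_msync(const char *path, size_t total, size_t win,
                    unsigned char byte)
{
    assert(win > 0 && total % win == 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the region we map and touch. */
    int rc = ftruncate(fd, (off_t)total);

    for (size_t off = 0; rc == 0 && off < total; off += win) {
        void *p = mmap(NULL, win, PROT_READ | PROT_WRITE, MAP_SHARED,
                       fd, (off_t)off);
        if (p == MAP_FAILED) {
            rc = -1;
            break;
        }

        memset(p, byte, win);            /* dirty the window */

        /* Flush this window before sliding on; MS_SYNC blocks until done. */
        if (msync(p, win, MS_SYNC) != 0)
            rc = -1;
        if (munmap(p, win) != 0)
            rc = -1;
    }

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

The cost is one blocking msync() per window move, which is exactly the
penalty the original question hoped to avoid.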

> BTW: I'm using 2.4.7 (RH enterprise)

2.4.7 is way out of date and should be updated for the numerous bugfixes and 
security errata.

		-ben


* Re: mmap-related questions
  2003-03-31 17:55 ` Benjamin LaHaise
@ 2003-04-01  3:25   ` Kenny Simpson
  2003-04-01 17:50     ` Benjamin LaHaise
  0 siblings, 1 reply; 8+ messages in thread
From: Kenny Simpson @ 2003-04-01  3:25 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: linux-kernel

--- Benjamin LaHaise <bcrl@redhat.com> wrote:
> No.  You must use msync().

> Note that fsync() after
> munmap() will flush the 
> pages to disk under Linux.
Sweet!  Paydirt!  Is this documented/guaranteed to
continue to work for a while?
Is this true for all non-mmap()ed dirty buffers for a
given file?

Just to restate what you said:
- if part of a file is mmap()ed, msync() MUST be used
to sync it.
- any non-mmap()ed portions are synched with fsync().

I'm assuming this is a per-process thing.  i.e. The
above is true regardless of what other processes are
doing (e.g. even if another process has the same file
mmap()'d, I don't care).

> 2.4.7 is way out of date and should be updated for
> the numerous bugfixes and 
> security errata.
I know.  Unfortunately not my call.  Desperately
trying to beat people with clue sticks....

Thanks!,
-Kenny


* Re: mmap-related questions
  2003-04-01  3:25   ` Kenny Simpson
@ 2003-04-01 17:50     ` Benjamin LaHaise
  2003-04-02  3:18       ` Kenny Simpson
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin LaHaise @ 2003-04-01 17:50 UTC (permalink / raw)
  To: Kenny Simpson; +Cc: linux-kernel

On Mon, Mar 31, 2003 at 07:25:46PM -0800, Kenny Simpson wrote:
> --- Benjamin LaHaise <bcrl@redhat.com> wrote:
> > No.  You must use msync().
> 
> > Note that fsync() after
> > munmap() will flush the 
> > pages to disk under Linux.
> Sweet!  Paydirt!  Is this documented/guaranteed to
> continue to work for a while?
> Is this true for all non-mmap()ed dirty buffers for a
> given file?

It's only true for the pages the munmap() removes from the process' page 
tables: the act of unmapping them transfers the dirty bit from the page 
tables into the page cache where fsync() acts on them.
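
The behaviour described above could be used roughly as follows: slide the
window with plain munmap() calls and issue a single fsync() once all windows
have been unmapped.  This is only a sketch, and it relies on the Linux
behaviour described in this message rather than on anything POSIX promises.
The name fill_then_fsync and its parameters are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): fill the first `total` bytes of
 * the file at `path` with `byte` through a sliding MAP_SHARED window, with no
 * msync() per window and one fsync() at the end.
 * `win` must be a non-zero multiple of the page size and must divide `total`.
 * Returns 0 on success, -1 on error.
 */
int fill_then_fsync(const char *path, size_t total, size_t win,
                    unsigned char byte)
{
    assert(win > 0 && total % win == 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    int rc = ftruncate(fd, (off_t)total);

    for (size_t off = 0; rc == 0 && off < total; off += win) {
        void *p = mmap(NULL, win, PROT_READ | PROT_WRITE, MAP_SHARED,
                       fd, (off_t)off);
        if (p == MAP_FAILED) {
            rc = -1;
            break;
        }

        memset(p, byte, win);            /* dirty the window */

        /* No msync() here: the pages are simply unmapped. */
        if (munmap(p, win) != 0)
            rc = -1;
    }

    /* Assumption from the thread: on Linux this flushes the unmapped pages. */
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Any region still mapped when fsync() is called is not covered by this
reasoning and would still need msync().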

> Just to restate what you said:
> - if part of a file is mmap()ed, msync() MUST be used
> to sync it.
> - any non-mmap()ed portions are synched with fsync().

Pretty much.

> I'm assuming this is a per-process thing.  i.e. The
> above is true regardless of what other processes are
> doing (e.g. even if another process has the same file
> mmap()'d, I don't care).

Right.  Other processes are responsible for managing their own syncing of 
dirty bits to disk at the appropriate times.  The one case this breaks down 
on is when the mmap()'d file is on NFS -- the reordering there can result in 
writebacks from mmap()s occurring in unexpected ways.  But then, nobody trusts 
their data to NFS, right?  ;-)

		-ben
-- 
Junk email?  <aart@kvack.org>


* Re: mmap-related questions
  2003-04-01 17:50     ` Benjamin LaHaise
@ 2003-04-02  3:18       ` Kenny Simpson
  2003-04-02  9:30         ` Jakob Oestergaard
  0 siblings, 1 reply; 8+ messages in thread
From: Kenny Simpson @ 2003-04-02  3:18 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: linux-kernel

--- Benjamin LaHaise <bcrl@redhat.com> wrote:
> the act of unmapping them transfers the
> dirty bit from the page 
> tables into the page cache where fsync() acts on
> them.
>
Should this info be included with Mel Gorman's
excellent doc:
http://www.csn.ul.ie/~mel/projects/vm/guide/html/understand/node31.html#SECTION009411000000000000000
Or is it there, but I missed it?

> The
> one case this breaks down 
> on is when the mmap()'d file is on NFS -- the
> reordering there can result in 
> writebacks from mmap()s occurring in unexpected ways.
I sometimes wish mmap was not supported on NFS, or at
least require a special MAP_NFS flag be used.  It has
caused lots of pain over the years.

Thanks again for this info, it has helped greatly!

-Kenny



* Re: mmap-related questions
  2003-04-02  3:18       ` Kenny Simpson
@ 2003-04-02  9:30         ` Jakob Oestergaard
  2003-04-02 15:10           ` Benjamin LaHaise
  0 siblings, 1 reply; 8+ messages in thread
From: Jakob Oestergaard @ 2003-04-02  9:30 UTC (permalink / raw)
  To: Kenny Simpson; +Cc: Benjamin LaHaise, linux-kernel

On Tue, Apr 01, 2003 at 07:18:40PM -0800, Kenny Simpson wrote:
> --- Benjamin LaHaise <bcrl@redhat.com> wrote:
> > the act of unmapping them transfers the
> > dirty bit from the page 
> > tables into the page cache where fsync() acts on
> > them.
> >
> Should this info be included with Mel Gorman's
> excellent doc:
> http://www.csn.ul.ie/~mel/projects/vm/guide/html/understand/node31.html#SECTION009411000000000000000
> Or is it there, but I missed it?
> 
> > The
> > one case this breaks down 
> > on is when the mmap()'d file is on NFS -- the
> > reordering there can result in 
> > writebacks from mmap()s occurring in unexpected ways.
> I sometimes wish mmap was not supported on NFS, or at
> least require a special MAP_NFS flag be used.  It has
> caused lots of pain over the years.

Could someone elaborate on this please?

If my client does
  big_map = mmap(... some file ...)
  make_dirty(big_map)
  msync(first half of big_map)
  msync(second half of big_map)    { crash during this }

Then I am guaranteed that (unless the server crashes), the first half of
big_map *will* have reached the server, but not that all of the second
half has.   Right?

Like any local-disk backed file.

Ignoring the case where the NFS *server* crashes, where could the write
ordering differ, compared to local disk files ?

In other words, what does Benjamin's "unexpected ways" refer to ?

Thanks,

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


* Re: mmap-related questions
  2003-04-02  9:30         ` Jakob Oestergaard
@ 2003-04-02 15:10           ` Benjamin LaHaise
  2003-04-02 23:43             ` Jakob Oestergaard
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin LaHaise @ 2003-04-02 15:10 UTC (permalink / raw)
  To: Jakob Oestergaard, Kenny Simpson, linux-kernel

On Wed, Apr 02, 2003 at 11:30:50AM +0200, Jakob Oestergaard wrote:
>   make_dirty(big_map)
>   msync(first half of big_map)
>   msync(second half of big_map)    { crash during this }
> 
> Then I am guaranteed that (unless the server crashes), the first half of
> big_map *will* have reached the server, but not that all of the second
> half has.   Right?

Assuming you used MS_SYNC for the msync() flags.  MS_ASYNC could still be 
proceeding to flush the pages out in the background.  And the kernel may 
have triggered writeback of the second half -- it is free to do so as it 
sees fit.

> Like any local-disk backed file.
> 
> Ignoring the case where the NFS *server* crashes, where could the write
> ordering differ, compared to local disk files ?

> In other words, what does Benjamin's "unexpected ways" refer to ?

All local clients will see the mmap() being updated from the time it is 
dirtied, but there is no ordering of write()s with respect to the mmap 
unless you explicitly msync(..MS_SYNC..) as in your example.
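
The two-step msync() example under discussion might be written out as the
sketch below.  The name write_halves, its parameters, and the 0xFF fill are
hypothetical.  With flags set to MS_SYNC the first call returns only after
the first half has been written back; with MS_ASYNC it only schedules the
writeback, so no such ordering can be assumed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of the file at
 * `path`, dirty all of them, then msync() the mapping in two steps using
 * `flags` (MS_SYNC or MS_ASYNC). Returns 0 on success, -1 on error.
 */
int write_halves(const char *path, size_t len, int flags)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(map, 0xFF, len);              /* make_dirty(big_map) */

    /* msync() needs a page-aligned start, so round the split point down. */
    size_t half = (len / 2) / (size_t)page * (size_t)page;

    int rc = 0;
    if (half > 0 && msync(map, half, flags) != 0)
        rc = -1;
    /* With MS_SYNC, the first half has been written back by this point. */
    if (rc == 0 && msync(map + half, len - half, flags) != 0)
        rc = -1;

    if (munmap(map, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Even with MS_SYNC the kernel remains free to write back pages of the second
half early, as noted above; the flag only bounds when the first half is
known to be written.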

		-ben
-- 
Junk email?  <aart@kvack.org>


* Re: mmap-related questions
  2003-04-02 15:10           ` Benjamin LaHaise
@ 2003-04-02 23:43             ` Jakob Oestergaard
  0 siblings, 0 replies; 8+ messages in thread
From: Jakob Oestergaard @ 2003-04-02 23:43 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Kenny Simpson, linux-kernel

On Wed, Apr 02, 2003 at 10:10:06AM -0500, Benjamin LaHaise wrote:
> On Wed, Apr 02, 2003 at 11:30:50AM +0200, Jakob Oestergaard wrote:
> >   make_dirty(big_map)
> >   msync(first half of big_map)
> >   msync(second half of big_map)    { crash during this }
> > 
> > Then I am guaranteed that (unless the server crashes), the first half of
> > big_map *will* have reached the server, but not that all of the second
> > half has.   Right?
> 
> Assuming you used MS_SYNC for the msync() flags.  MS_ASYNC could still be 
> proceeding to flush the pages out in the background.  And the kernel may 
> have triggered writeback of the second half -- it is free to do so as it 
> sees fit.

Yes. MS_ASYNC is "advisory" only, as I understand it.  (too bad it isn't
select()'able actually, I could use that to work wonders with a database
engine here...)

> 
> > Like any local-disk backed file.
> > 
> > Ignoring the case where the NFS *server* crashes, where could the write
> > ordering differ, compared to local disk files ?
> 
> > In other words, what does Benjamin's "unexpected ways" refer to ?
> 
> All local clients will see the mmap() being updated from the time it is 
> dirtied, but there is no ordering of write()s with respect to the mmap 
> unless you explicitly msync(..MS_SYNC..) as in your example.

Ok, so we're talking multiple processes reading/writing.

Now it makes a lot more sense - I was thinking one process only.  Silly
simple-minded me  ;)

Thanks,

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

