linux-kernel.vger.kernel.org archive mirror
* RE: Memory mapped files question
@ 2003-04-14 19:31 Perez-Gonzalez, Inaky
  2003-04-14 19:42 ` Bryan Shumsky
  0 siblings, 1 reply; 18+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-04-14 19:31 UTC (permalink / raw)
  To: 'Bryan Shumsky', 'linux-kernel@vger.kernel.org'


> From: Bryan Shumsky [mailto:bzs@via.com]

> Hi, everyone.  I'm running into a problem that I hope someone else has seen,
> and maybe can help solve.  We're using the mmap system function for memory
> mapped files, but our updates never get flushed until we munmap or msync.

I thought that was the way it was supposed to work.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* Re: Memory mapped files question
  2003-04-14 19:31 Memory mapped files question Perez-Gonzalez, Inaky
@ 2003-04-14 19:42 ` Bryan Shumsky
  2003-04-14 19:53   ` Alan Cox
                     ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Bryan Shumsky @ 2003-04-14 19:42 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky, linux-kernel

Hi, everyone.  Thanks for all your responses.  Our confusion is that in Unix
environments, when we modify memory in memory-mapped files the underlying
system flusher manages to flush the files for us before the files are
munmap'ed or msync'ed.

Rewriting all of our code to manually handle the flushing is a MAJOR
undertaking, so I was hoping there might be some sneaky solution you could
come up with.  Any ideas?

Thanks again,

-- Bryan Shumsky
Director of Engineering
Via Systems, Inc.

----- Original Message -----
From: "Perez-Gonzalez, Inaky" <inaky.perez-gonzalez@intel.com>
To: "'Bryan Shumsky'" <bzs@via.com>; <linux-kernel@vger.kernel.org>
Sent: Monday, April 14, 2003 12:31 PM
Subject: RE: Memory mapped files question


>
> > From: Bryan Shumsky [mailto:bzs@via.com]
>
> > Hi, everyone.  I'm running into a problem that I hope someone else has seen,
> > and maybe can help solve.  We're using the mmap system function for memory
> > mapped files, but our updates never get flushed until we munmap or msync.
>
> I thought that was the way it was supposed to work.
>
> Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
> (and my fault)
>
>




* Re: Memory mapped files question
  2003-04-14 19:42 ` Bryan Shumsky
@ 2003-04-14 19:53   ` Alan Cox
  2003-04-14 21:24     ` Chris Friesen
  2003-04-14 20:13   ` Richard B. Johnson
  2003-04-14 22:08   ` H. Peter Anvin
  2 siblings, 1 reply; 18+ messages in thread
From: Alan Cox @ 2003-04-14 19:53 UTC (permalink / raw)
  To: Bryan Shumsky; +Cc: Perez-Gonzalez, Inaky, Linux Kernel Mailing List

On Llu, 2003-04-14 at 20:42, Bryan Shumsky wrote:
> Rewriting all of our code to manually handle the flushing is a MAJOR
> undertaking, so I was hoping there might be some sneaky solution you could
> come up with.  Any ideas?

Create a thread that does msync's every so often. It's that simple.
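
A minimal sketch of the periodic-flush thread suggested above, assuming POSIX threads and C11 atomics. The names `struct flusher`, `flusher_start` and `flusher_stop` are hypothetical and do not come from this thread. It uses MS_SYNC so that only the flusher thread blocks while pages are written; as noted elsewhere in the thread, this bounds how stale the on-disk copy can get but says nothing about the order in which pages reach the disk.

```c
#define _DEFAULT_SOURCE /* glibc: expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/* Hypothetical bookkeeping for one mapping that is flushed periodically. */
struct flusher {
    void *addr;           /* start of the mapping, as returned by mmap() */
    size_t len;           /* length of the mapping in bytes */
    unsigned interval_ms; /* pause between two flushes */
    atomic_int stop;      /* set to 1 to ask the thread to exit */
    atomic_int flushes;   /* number of successful msync() calls so far */
    pthread_t tid;
};

static void *flusher_main(void *arg)
{
    struct flusher *f = arg;
    struct timespec pause = {
        .tv_sec = f->interval_ms / 1000,
        .tv_nsec = (long)(f->interval_ms % 1000) * 1000000L,
    };

    while (!atomic_load(&f->stop)) {
        /* MS_SYNC blocks this thread, not the writers, until the
         * dirty pages of the mapping have been written back. */
        if (msync(f->addr, f->len, MS_SYNC) == 0)
            atomic_fetch_add(&f->flushes, 1);
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Starts the background flusher; returns 0 on success (pthread_create result). */
int flusher_start(struct flusher *f, void *addr, size_t len, unsigned interval_ms)
{
    assert(addr != NULL && len > 0);
    f->addr = addr;
    f->len = len;
    f->interval_ms = interval_ms;
    atomic_init(&f->stop, 0);
    atomic_init(&f->flushes, 0);
    return pthread_create(&f->tid, NULL, flusher_main, f);
}

/* Asks the flusher to exit and waits for it; returns 0 on success. */
int flusher_stop(struct flusher *f)
{
    atomic_store(&f->stop, 1);
    return pthread_join(f->tid, NULL);
}
```

The application would call `flusher_start()` once after mmap() and `flusher_stop()` before munmap(), leaving the existing code that writes through the mapping untouched.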



* Re: Memory mapped files question
  2003-04-14 19:42 ` Bryan Shumsky
  2003-04-14 19:53   ` Alan Cox
@ 2003-04-14 20:13   ` Richard B. Johnson
  2003-04-14 20:27     ` Frank van Maarseveen
  2003-04-14 22:08   ` H. Peter Anvin
  2 siblings, 1 reply; 18+ messages in thread
From: Richard B. Johnson @ 2003-04-14 20:13 UTC (permalink / raw)
  To: Bryan Shumsky; +Cc: Perez-Gonzalez, Inaky, linux-kernel

On Mon, 14 Apr 2003, Bryan Shumsky wrote:

> Hi, everyone.  Thanks for all your responses.  Our confusion is that in Unix
> environments, when we modify memory in memory-mapped files the underlying
> system flusher manages to flush the files for us before the files are
> munmap'ed or msync'ed.
>

So? Why should you care?

> Rewriting all of our code to manually handle the flushing is a MAJOR
> undertaking, so I was hoping there might be some sneaky solution you could
> come up with.  Any ideas?
>
> Thanks again,
>

Memory mapped files are supposed to be accessed through memory!
Any program that needs to know what's on the physical disk is
broken. If you need to write to files and know when they are
written to the physical media, you use a journaled file-system.

Anytime you write to a memory mapped file, anybody else who
also mapped that file will see the changes as long as the file
was mapped MAP_SHARED. If your program mapped it MAP_PRIVATE, it's
probably broken.
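
A small sketch of the MAP_SHARED versus MAP_PRIVATE difference described above. It assumes Linux, where file mappings and pread() go through the same page cache, so a store through a shared mapping is visible to pread() without an msync(). The helper names `open_scratch` and `byte_seen_through_fd` are hypothetical.

```c
#define _DEFAULT_SOURCE /* glibc: expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_LEN 4096

/* Creates an unlinked temporary file of DEMO_LEN zero bytes; returns fd or -1. */
static int open_scratch(void)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until the descriptor is closed */
    if (ftruncate(fd, (off_t)DEMO_LEN) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Stores `value` at offset 0 through a mapping created with `map_flags`
 * (MAP_SHARED or MAP_PRIVATE), then reads offset 0 back through the file
 * descriptor. Returns the byte the file holds, or -1 on error. */
int byte_seen_through_fd(int fd, int map_flags, char value)
{
    assert(map_flags == MAP_SHARED || map_flags == MAP_PRIVATE);

    char *p = mmap(NULL, DEMO_LEN, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = value; /* MAP_PRIVATE: this store lands in a private copy of the page */

    char seen = 0;
    ssize_t n = pread(fd, &seen, 1, 0);
    munmap(p, DEMO_LEN);
    return n == 1 ? (unsigned char)seen : -1;
}
```

With a freshly zeroed file, a MAP_PRIVATE store leaves the file byte at 0, while a MAP_SHARED store makes the new byte visible to every other reader of the file. Neither result says anything about what is on the physical disk.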

There is nothing to work-around. All working databases work
this way.

Memory mapped files are used to speed access to large files. Only
the portion of the file actually being accessed needs to be resident
in memory. Multiple readers and writers can access these memory-
mapped portions without enduring the time for long seeks to various
pieces of the file. The snippets of the file get written to the
media using an LRU (least recently used) mechanism so that stuff that
isn't being actively updated is the stuff that gets written to the
device (actually replaced with something that needs to be accessed).
Therefore, there is absolutely no possible way for this interchange
to be somehow handled by the user. It doesn't happen in any memory-
mapped file --ever.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.



* Re: Memory mapped files question
  2003-04-14 20:13   ` Richard B. Johnson
@ 2003-04-14 20:27     ` Frank van Maarseveen
  2003-04-14 20:52       ` Richard B. Johnson
  0 siblings, 1 reply; 18+ messages in thread
From: Frank van Maarseveen @ 2003-04-14 20:27 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: linux-kernel

On Mon, Apr 14, 2003 at 04:13:52PM -0400, Richard B. Johnson wrote:
> 
> Memory mapped files are supposed to be accessed through memory!
> Any program that needs to know what's on the physical disk is
> broken. If you need to write to files and know when they are
> written to the physical media, you use a journaled file-system.

It is not that simple.
Shared mmapped files are _never_ flushed, at least in 2.4.x. So,
without an explicit msync() a process (innd comes to mind) may lose
years of updates upon a system crash or power outage.

I have learned to live with it but I still find this a bit awkward.

-- 
Frank


* Re: Memory mapped files question
  2003-04-14 20:27     ` Frank van Maarseveen
@ 2003-04-14 20:52       ` Richard B. Johnson
  0 siblings, 0 replies; 18+ messages in thread
From: Richard B. Johnson @ 2003-04-14 20:52 UTC (permalink / raw)
  To: Frank van Maarseveen; +Cc: linux-kernel

On Mon, 14 Apr 2003, Frank van Maarseveen wrote:

> On Mon, Apr 14, 2003 at 04:13:52PM -0400, Richard B. Johnson wrote:
> >
> > Memory mapped files are supposed to be accessed through memory!
> > Any program that needs to know what's on the physical disk is
> > broken. If you need to write to files and know when they are
> > written to the physical media, you use a journaled file-system.
>
> It is not that simple.
> Shared mmapped files are _never_ flushed, at least in 2.4.x. So,
> without an explicit msync() a process (innd comes to mind) may lose
> years of updates upon a system crash or power outage.
>
> I have learned to live with it but I still find this a bit awkward.
>
> --
> Frank
>

But it is that simple. If you need to update the file with the
memory copy, an explicit msync() must be used. Also note that
msync() takes some parameters that may be different in different
processes that access that mmapped file. Therefore, even msync()
doesn't guarantee that everything gets updated for the next
power outage. So, if you absolutely-posix-a-tively need to get
that data onto a disk, then you need to use some kind of signaling
mechanism (to pause all writers), explicitly write everything from
the lowest to the highest, update your check-point file and close
it, then restart all writers. Upon restart, you can unwind to
the check-point file and restart.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.



* Re: Memory mapped files question
  2003-04-14 19:53   ` Alan Cox
@ 2003-04-14 21:24     ` Chris Friesen
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Friesen @ 2003-04-14 21:24 UTC (permalink / raw)
  To: Alan Cox; +Cc: Bryan Shumsky, Perez-Gonzalez, Inaky, Linux Kernel Mailing List

Alan Cox wrote:
> On Llu, 2003-04-14 at 20:42, Bryan Shumsky wrote:
> 
>>Rewriting all of our code to manually handle the flushing is a MAJOR
>>undertaking, so I was hoping there might be some sneaky solution you could
>>come up with.  Any ideas?
>>
> 
> Create a thread that does msync's every so often. It's that simple.

How do you deal with ensuring (or even trying to ensure) that the stuff *on 
disk* is sane?

If I understand correctly, msync() doesn't guarantee order of writes, so 
randomly firing off msync() calls doesn't help.

If I want to update an entry and then set a flag saying that the entry is 
correct, I need to have two msyncs, one for the entry data, and one for the 
flag.  I had hoped that I could avoid this by opening the file with O_SYNC, but 
hpa just disabused me of that notion...
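
The two-msync pattern described above could be sketched as follows. This is only an illustration under stated assumptions: the record layout, `map_record_file`, `sync_range` and `commit_record` are hypothetical names, the slot is assumed to start out invalid on disk, and msync(MS_SYNC) is assumed to return only after the pages have been handed to the storage device (disk write caches can still weaken that).

```c
#define _DEFAULT_SOURCE /* glibc: expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical on-disk record: payload plus a "payload is complete" flag. */
struct record {
    char payload[60];
    uint32_t valid; /* 0 = ignore this slot, 1 = payload is complete */
};

/* Maps a one-record file MAP_SHARED, creating it if needed; NULL on error. */
struct record *map_record_file(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)sizeof(struct record)) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct record), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* msync() wants a page-aligned start address, so round down to the page. */
static int sync_range(void *start, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t begin = (uintptr_t)start & ~(page - 1);
    size_t span = (size_t)((uintptr_t)start - begin) + len;

    return msync((void *)begin, span, MS_SYNC);
}

/* Writes the payload, flushes it, and only then sets and flushes the flag. */
int commit_record(struct record *rec, const char *data)
{
    size_t n = strlen(data);
    assert(n < sizeof rec->payload);

    rec->valid = 0;
    memset(rec->payload, 0, sizeof rec->payload);
    memcpy(rec->payload, data, n);
    if (sync_range(rec, sizeof *rec) != 0) /* first msync: entry data */
        return -1;

    rec->valid = 1;
    return sync_range(&rec->valid, sizeof rec->valid); /* second msync: flag */
}
```

A crash between the two msync() calls leaves a slot whose flag is still 0, so a reader that honours the flag never sees a half-written payload.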

Are the mmap semantics different for devices?

Chris



-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com



* Re: Memory mapped files question
  2003-04-14 19:42 ` Bryan Shumsky
  2003-04-14 19:53   ` Alan Cox
  2003-04-14 20:13   ` Richard B. Johnson
@ 2003-04-14 22:08   ` H. Peter Anvin
  2003-04-15  4:56     ` Jamie Lokier
  2 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2003-04-14 22:08 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <004301c302bd$ed548680$fe64a8c0@webserver>
By author:    "Bryan Shumsky" <bzs@via.com>
In newsgroup: linux.dev.kernel
>
> Hi, everyone.  Thanks for all your responses.  Our confusion is that in Unix
> environments, when we modify memory in memory-mapped files the underlying
> system flusher manages to flush the files for us before the files are
> munmap'ed or msync'ed.
> 

Bullshit.  It might work on one particular Unix implementation, but
the definition of Unix, the Single Unix Standard, does explicitly
*not* require this behavior.

> Rewriting all of our code to manually handle the flushing is a MAJOR
> undertaking, so I was hoping there might be some sneaky solution you could
> come up with.  Any ideas?

Your code is fundamentally broken.  You need to fix it.

	-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64


* Re: Memory mapped files question
  2003-04-14 22:08   ` H. Peter Anvin
@ 2003-04-15  4:56     ` Jamie Lokier
  2003-04-15  5:30       ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2003-04-15  4:56 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

H. Peter Anvin wrote:
> > Hi, everyone.  Thanks for all your responses.  Our confusion is
> > that in Unix environments, when we modify memory in memory-mapped
> > files the underlying system flusher manages to flush the files for
> > us before the files are munmap'ed or msync'ed.
> 
> Bullshit.  It might work on one particular Unix implementation, but
> the definition of Unix, the Single Unix Standard, does explicitly
> *not* require this behavior.

I presume that if you do write(), the Single Unix Standard allows the
data to remain dirty in RAM for an arbitrary duration too.

If I write() a file I expect it to be automatically written to disk
within a few minutes at most, where that is plausible.

Frank van Maarseveen wrote:
> Shared mmapped files are _never_ flushed, at least in 2.4.x. So,
> without an explicit msync() a process (innd comes to mind) may lose
> years of updates upon a system crash or power outage.

It's a quality of implementation issue if data can remain dirty in RAM
forever without ever being flushed.

Can this really happen with normal open/mmap/munmap/close usage, or
does it only occur with long-lived processes like innd which mmap a
file, dirty the pages but never munmap them?

If the former case does happen, I'd say we're failing on quality of
implementation.  If it's only the latter case, though, fair enough: the
application writer will have to use msync().

-- Jamie


* Re: Memory mapped files question
  2003-04-15  4:56     ` Jamie Lokier
@ 2003-04-15  5:30       ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2003-04-15  5:30 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: linux-kernel

Jamie Lokier wrote:
> 
> It's a quality of implementation issue if data can remain dirty in RAM
> forever without ever being flushed.
> 
> Can this really happen with normal open/mmap/munmap/close usage, or
> does it only occur with long-lived processes like innd which mmap a
> file, dirty the pages but never munmap them?
> 
> If the former case does happen, I'd say we're failing on quality of
> implementation.  If it's only the latter case, though, fair enough: the
> application writer will have to use msync().
> 

The latter, I'm pretty sure.  After all, that's what pgflush/bdflush is 
all about.



* Re: Memory mapped files question
  2003-04-14 20:39         ` Chris Friesen
@ 2003-04-14 21:12           ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2003-04-14 21:12 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-kernel

Chris Friesen wrote:
> 
>> munmap() and fsync() or msync() will flush it to disk; there is no reason
>> munmap() should unless perhaps the file was opened O_SYNC.
> 
> Wait a minute.  Shouldn't a file opened O_SYNC flush the writes as they
> happen,
> removing the requirement for any explicit syncing?  If it doesn't there
> are some very broken apps around.
> 

Not for mmap().

	-hpa




* Re: Memory mapped files question
  2003-04-14 18:07       ` H. Peter Anvin
@ 2003-04-14 20:39         ` Chris Friesen
  2003-04-14 21:12           ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Friesen @ 2003-04-14 20:39 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

H. Peter Anvin wrote:
 > Followup to:  <11640000.1050332688@[10.10.2.4]> By author:    "Martin J.
 > Bligh" <mbligh@aracnet.com> In newsgroup: linux.dev.kernel
 >
 >>> Martin, something which was not mentioned last week (I've just checked).
 >>>
 >>> It's OK if we never write to disk unless explicitly told, but will we
 >>> writeback when we munmap?

 > munmap() and fsync() or msync() will flush it to disk; there is no reason
 > munmap() should unless perhaps the file was opened O_SYNC.

Wait a minute.  Shouldn't a file opened O_SYNC flush the writes as they happen,
removing the requirement for any explicit syncing?  If it doesn't there are some 
very broken apps around.

Chris


-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com



* RE: Memory mapped files question
@ 2003-04-14 19:50 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 18+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-04-14 19:50 UTC (permalink / raw)
  To: 'Bryan Shumsky', 'linux-kernel@vger.kernel.org'


> From: Bryan Shumsky [mailto:bzs@via.com]
> 
> Hi, everyone.  Thanks for all your responses.  Our confusion is that in Unix
> environments, when we modify memory in memory-mapped files the underlying
> system flusher manages to flush the files for us before the files are
> munmap'ed or msync'ed.
> 
> Rewriting all of our code to manually handle the flushing is a MAJOR
> undertaking, so I was hoping there might be some sneaky solution you could
> come up with.  Any ideas?

Have a high prio thread msync()ing every now and then? Oops, forget it,
I never said that, it is really a lame solution :) you can run into all
kinds of trouble.

Or maybe it is not that lame if you know *exactly* when you need
the flush (for example, some other program is going to access the data);
this way you can signal the process that did the modification for it
to issue an msync(). However, this is going to require some delicate
synchronization between the two apps.
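
One way the "signal the writer so it issues an msync()" idea might look, as a rough sketch only. The choice of SIGUSR1 and the names `install_flush_handler` and `flush_if_requested` are assumptions made for illustration. The handler only sets a flag, and the msync() itself runs from the writer's normal control flow.

```c
#define _DEFAULT_SOURCE /* glibc: expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Set from the signal handler, consumed by the writer's main loop. */
static volatile sig_atomic_t flush_requested = 0;

static void on_flush_request(int signo)
{
    (void)signo;
    flush_requested = 1; /* keep the handler async-signal-safe */
}

/* Installs the SIGUSR1 handler; returns 0 on success, -1 on error. */
int install_flush_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_flush_request;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called by the writer at a point where its data is consistent.
 * Returns 1 if a flush was requested and done, 0 if none was pending,
 * and -1 if msync() failed. */
int flush_if_requested(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    if (!flush_requested)
        return 0;
    flush_requested = 0;
    return msync(addr, len, MS_SYNC) == 0 ? 1 : -1;
}
```

The other program would send the signal with kill(writer_pid, SIGUSR1). It still needs some way to learn that the flush has completed, which is the delicate part of the synchronization and is not covered here.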

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* Re: Memory mapped files question
  2003-04-14 15:04     ` Martin J. Bligh
@ 2003-04-14 18:07       ` H. Peter Anvin
  2003-04-14 20:39         ` Chris Friesen
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2003-04-14 18:07 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <11640000.1050332688@[10.10.2.4]>
By author:    "Martin J. Bligh" <mbligh@aracnet.com>
In newsgroup: linux.dev.kernel
>
> > Martin, something which was not mentioned last week (I've just checked).
> > 
> > It's OK if we never write to disk unless explicitly told, but will we writeback
> > when we munmap?
> 
> Don't know for sure - you'd have to read the code (do_munmap) ... I couldn't
> see anything there at a quick glance. However, I'd guess we don't write it, 
> as multiple people could have the file mapped, or we could remap it
> again from somewhere. Presumably the standard LRU will just flush it out.
> 

munmap() and fsync() or msync() will flush it to disk; there is no
reason munmap() should unless perhaps the file was opened O_SYNC.

	-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64


* Re: Memory mapped files question
  2003-04-14 14:36 ` Martin J. Bligh
@ 2003-04-14 15:07   ` Antonio Vargas
  2003-04-14 15:04     ` Martin J. Bligh
  0 siblings, 1 reply; 18+ messages in thread
From: Antonio Vargas @ 2003-04-14 15:07 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Bryan Shumsky, linux-kernel

On Mon, Apr 14, 2003 at 07:36:39AM -0700, Martin J. Bligh wrote:
> > Hi, everyone.  I'm running into a problem that I hope someone else has seen,
> > and maybe can help solve.  We're using the mmap system function for memory
> > mapped files, but our updates never get flushed until we munmap or msync.
> > Are we missing something?  Is there a tunable parameter in the kernel or
> > filing system that can be set to flag these updates as 'write required'?
> 
> This was discussed about a week ago on either linux-kernel or linux-mm.
> The short answer is "yes, that's deliberate", but an archive search would 
> probably be fruitful.
> 
> M.

Martin, something which was not mentioned last week (I've just checked).

It's OK if we never write to disk unless explicitly told, but will we writeback
when we munmap?

Greets, Antonio.


* Re: Memory mapped files question
  2003-04-14 15:07   ` Antonio Vargas
@ 2003-04-14 15:04     ` Martin J. Bligh
  2003-04-14 18:07       ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Martin J. Bligh @ 2003-04-14 15:04 UTC (permalink / raw)
  To: Antonio Vargas; +Cc: Bryan Shumsky, linux-kernel

> Martin, something which was not mentioned last week (I've just checked).
> 
> It's OK if we never write to disk unless explicitly told, but will we writeback
> when we munmap?

Don't know for sure - you'd have to read the code (do_munmap) ... I couldn't
see anything there at a quick glance. However, I'd guess we don't write it, 
as multiple people could have the file mapped, or we could remap it
again from somewhere. Presumably the standard LRU will just flush it out.

M.



* Re: Memory mapped files question
  2003-04-14  3:57 Bryan Shumsky
@ 2003-04-14 14:36 ` Martin J. Bligh
  2003-04-14 15:07   ` Antonio Vargas
  0 siblings, 1 reply; 18+ messages in thread
From: Martin J. Bligh @ 2003-04-14 14:36 UTC (permalink / raw)
  To: Bryan Shumsky, linux-kernel

> Hi, everyone.  I'm running into a problem that I hope someone else has seen,
> and maybe can help solve.  We're using the mmap system function for memory
> mapped files, but our updates never get flushed until we munmap or msync.
> Are we missing something?  Is there a tunable parameter in the kernel or
> filing system that can be set to flag these updates as 'write required'?

This was discussed about a week ago on either linux-kernel or linux-mm.
The short answer is "yes, that's deliberate", but an archive search would 
probably be fruitful.

M.



* Memory mapped files question
@ 2003-04-14  3:57 Bryan Shumsky
  2003-04-14 14:36 ` Martin J. Bligh
  0 siblings, 1 reply; 18+ messages in thread
From: Bryan Shumsky @ 2003-04-14  3:57 UTC (permalink / raw)
  To: linux-kernel

Hi, everyone.  I'm running into a problem that I hope someone else has seen,
and maybe can help solve.  We're using the mmap system function for memory
mapped files, but our updates never get flushed until we munmap or msync.
Are we missing something?  Is there a tunable parameter in the kernel or
filing system that can be set to flag these updates as 'write required'?

Any advice or suggestions would be appreciated.

Regards,

-- Bryan Shumsky
Director of Engineering
Via Systems, Inc.



