On Wed, Apr 12 2017, Jeff Layton wrote:

> On Wed, 2017-04-12 at 07:38 -0700, Matthew Wilcox wrote:
>> On Wed, Apr 12, 2017 at 09:01:34AM -0400, Jeff Layton wrote:
>> > On Wed, 2017-04-12 at 08:06 -0400, Jeff Layton wrote:
>> > > Not sure what to do here just yet.
>> > >
>> > > Signed-off-by: Jeff Layton
>> > > ---
>> > >  mm/page-writeback.c | 6 ++++++
>> > >  1 file changed, 6 insertions(+)
>> > >
>> > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> > > index de0dbf12e2c1..3ac8399dc984 100644
>> > > --- a/mm/page-writeback.c
>> > > +++ b/mm/page-writeback.c
>> > > @@ -2388,6 +2388,12 @@ int write_one_page(struct page *page)
>> > >  		ret = mapping->a_ops->writepage(page, &wbc);
>> > >  		if (ret == 0) {
>> > >  			wait_on_page_writeback(page);
>> > > +			/*
>> > > +			 * FIXME: is this racy? What guarantees that PG_error
>> > > +			 * will still be set once we get around to checking it?
>> > > +			 * What if writeback fails, but then a read is issued
>> > > +			 * before we check this, and that calls ClearPageError?
>> > > +			 */
>> > >  			if (PageError(page))
>> > >  				ret = -EIO;
>> > >  		}
>> >
>> > Ahh, we are always under the page lock here, and this is generally used
>> > for writing out directory pages anyway. I'm fine with dropping this
>> > patch unless someone else sees a problem here.
>>
>> ->writepage drops the page lock. We're still holding a refcount on this
>> page, but that's not going to prevent read being called. But maybe the
>> filesystem won't call read on a page that's marked as PageError?
>
> Hard to be sure there. I really wonder if that check is needed at all,
> the more I look at it. After all, we are calling writepage with
> WB_SYNC_ALL so we should get an error there.

WB_SYNC_ALL doesn't cause writepage to wait. It might cause it to ask
for REQ_SYNC, so the write request gets priority in the block layer.
WB_SYNC_ALL does cause writepages (with an 's') to wait.
(At least, that is how I read the code).

>
> Is it also possible these pages could be written back before that point
> (due to memory pressure or something) and that fail?

Probably, in which case clear_page_dirty_for_io() will fail and
write_one_page() will just unlock the page.

>
> Maybe we should just have a call to filemap_check_errors on exiting
> this function? I'm leaning in that direction.
>
> With the wb_err_t based stuff, we could change it to sample the
> wb_err early, and then use that to see if an error has occurred since
> then. Maybe we should even allow callers to pass a wb_err_t in here, so
> we can report errors that have occurred since a known point?

That feels to me like over-engineering. We would need to
unconditionally call writepage() for that to work.
We seem to be agreed that write errors for buffered writes are
reported per-address-space. To get per-page errors you have to use
direct IO. Let's focus on that policy and make it work.

Thanks,
NeilBrown
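
For reference, the difference between checking a per-page flag and
sampling a per-address-space error position can be modelled in
userspace roughly as below. This is only a sketch under stated
assumptions: every name in it (toy_mapping, toy_sample,
toy_check_since, and so on) is hypothetical and is not taken from
mm/page-writeback.c or from the wb_err_t patches, and it ignores
locking and counter details entirely. It just replays the interleaving
from the FIXME comment: writeback fails, a read clears the page flag,
and only then does the caller check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef unsigned int toy_wb_err_t;

struct toy_mapping {
	toy_wb_err_t seq;	/* bumped each time a writeback error is recorded */
	int last_err;		/* most recent error as a negative errno */
	int page_error;		/* stand-in for PG_error on a single page */
};

/* Writeback failed: mark the page and record the error on the mapping. */
void toy_writeback_failed(struct toy_mapping *m, int err)
{
	m->page_error = 1;
	m->last_err = err;
	m->seq++;
}

/* A read of the page clears the per-page flag, as ClearPageError would. */
void toy_read_page(struct toy_mapping *m)
{
	m->page_error = 0;
}

/* Remember the current position so that later errors can be detected. */
toy_wb_err_t toy_sample(const struct toy_mapping *m)
{
	return m->seq;
}

/* Report the latest error if one was recorded since the sample. */
int toy_check_since(const struct toy_mapping *m, toy_wb_err_t since)
{
	return m->seq != since ? m->last_err : 0;
}

/* Per-page check: trust the page flag alone. */
int toy_check_page_flag(const struct toy_mapping *m)
{
	return m->page_error ? -EIO : 0;
}

/* Replays the interleaving described in the FIXME comment. */
int toy_demo(void)
{
	struct toy_mapping m = { 0, 0, 0 };
	toy_wb_err_t since = toy_sample(&m);

	toy_writeback_failed(&m, -EIO);
	toy_read_page(&m);	/* the read gets in before the check */

	assert(toy_check_page_flag(&m) == 0);		/* flag-based check misses it */
	assert(toy_check_since(&m, since) == -EIO);	/* per-mapping check sees it */
	return toy_check_since(&m, since);
}
```

In this model the per-mapping check still reports -EIO after the page
flag has been cleared, which is the per-address-space behaviour
discussed above. It says nothing about whether write_one_page() would
need to call writepage() unconditionally for this to be useful there.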