* nfs home directory and google chrome.
@ 2020-10-04 11:53 Kenneth Johansson
  2020-10-05 16:46 ` Patrick Goetz
  2020-10-27 23:01 ` Kenneth Johansson
  0 siblings, 2 replies; 15+ messages in thread
From: Kenneth Johansson @ 2020-10-04 11:53 UTC (permalink / raw)
  To: linux-nfs

For a long time I have had problems with Google Chrome mangling its
sqlite databases across suspend/resume.

It seems to happen only when my home directory is NFS-mounted. I'm not
sure exactly what is happening, but let's first see if this happens to
anybody else.

How to reproduce the error:

1. Start Chrome from a terminal with "google-chrome".

2. Suspend the computer.

3. Wait a while. There seems to be some minimum time; I do not know
what it is, but I get the error basically every time I suspend in the
evening and resume in the morning.

4. Look for output like this:

[16789:18181:1004/125852.529750:ERROR:database.cc(1692)] Passwords 
sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
[16789:16829:1004/125852.529744:ERROR:database.cc(1692)] Web sqlite 
error 1034, errno 5: disk I/O error, sql: COMMIT
[16789:16829:1004/125852.530261:ERROR:database.cc(1692)] Web sqlite 
error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE INTO 
autofill_model_type_state (model_type, value) VALUES(?,?)
[16789:16789:1004/125852.563571:ERROR:sync_metadata_store_change_list.cc(34)] 
Autofill datatype error was encountered: Failed to update ModelTypeState.
[16789:19002:1004/125902.534103:ERROR:database.cc(1692)] History sqlite 
error 1034, errno 5: disk I/O error, sql: COMMIT
[16789:19002:1004/125902.536903:ERROR:database.cc(1692)] Thumbnail 
sqlite error 778, errno 5: disk I/O error, sql: COMMIT


[16789:19002:1004/130044.120379:ERROR:database.cc(1692)] Passwords 
sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE INTO 
sync_model_metadata (id, model_metadata) VALUES(1, ?)
[16789:16829:1004/130044.120388:ERROR:database.cc(1692)] Web sqlite 
error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE INTO 
autofill_model_type_state (model_type, value) VALUES(?,?)


And so on. If you use Google sync you can also check
"chrome://sync-internals" to see if something is wrong with the database.





* Re: nfs home directory and google chrome.
  2020-10-04 11:53 nfs home directory and google chrome Kenneth Johansson
@ 2020-10-05 16:46 ` Patrick Goetz
  2020-10-05 20:07   ` Kenneth Johansson
  2020-10-27 23:01 ` Kenneth Johansson
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Goetz @ 2020-10-05 16:46 UTC (permalink / raw)
  To: Kenneth Johansson, linux-nfs

We had a similar problem with Firefox, most notably with Mac OSX users 
who have NFS-mounted home directories. There's an about:config solution 
for Firefox; namely set

    storage.nfs_filesystem: true

This forces a specific network file locking mechanism which makes sqlite 
behave better. I'm guessing google chrome has something similar.
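
As a rough illustration of how an application can pick a different
SQLite locking mechanism: SQLite chooses its locking strategy through
the VFS named when the database is opened. The sketch below opens a
database with the built-in "unix-dotfile" VFS, which uses a lock
directory next to the database instead of POSIX byte-range locks. Which
VFS the Firefox preference actually selects is not stated in this
message, and the helper name open_with_vfs is made up for the example.

```python
import os
import sqlite3
import tempfile
from urllib.parse import quote


def open_with_vfs(path: str, vfs: str) -> sqlite3.Connection:
    """Open a SQLite database through a named VFS (hypothetical helper).

    The VFS decides how SQLite locks the file. "unix-dotfile" is one of
    the VFS names SQLite registers on Unix builds.
    """
    # SQLite URI filenames accept a "vfs" query parameter.
    uri = f"file:{quote(path)}?vfs={vfs}"
    return sqlite3.connect(uri, uri=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        conn = open_with_vfs(os.path.join(tmpdir, "demo.db"), "unix-dotfile")
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        print(conn.execute("SELECT x FROM t").fetchone())
        conn.close()
```

Every process that opens the same database must use the same locking
scheme, otherwise the locks do not protect anything.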


* Re: nfs home directory and google chrome.
  2020-10-05 16:46 ` Patrick Goetz
@ 2020-10-05 20:07   ` Kenneth Johansson
  2020-10-06 18:14     ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Kenneth Johansson @ 2020-10-05 20:07 UTC (permalink / raw)
  To: Patrick Goetz, linux-nfs

On 2020-10-05 18:46, Patrick Goetz wrote:
> We had a similar problem with Firefox, most notably with Mac OSX users 
> who have NFS-mounted home directories. There's an about:config 
> solution for Firefox; namely set
>
>    storage.nfs_filesystem: true
>
> This forces a specific network file locking mechanism which makes 
> sqlite behave better. I'm guessing google chrome has something similar.
>
Since I have used Chrome for years without any problems, my guess is
that something changed with NFS in my setup.

I ran strace, and the first EIO I get looks like this:

fdatasync(94</home/kenjo/.config/google-chrome/Default/Login Data>) = -1 
EIO (Input/output error)

then the same thing happens for other files like

fdatasync(83</home/kenjo/.config/google-chrome/Default/Web Data>) = -1 
EIO (Input/output error)

fdatasync(74</home/kenjo/.config/google-chrome/Default/History>) = -1 
EIO (Input/output error)
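
For anyone who wants to check for the same failure outside Chrome, here
is a minimal sketch that calls fdatasync() on an open file and reports
whether it failed with EIO, as in the strace output above. The function
name sync_or_report and its "sync" parameter are made up for the
example; the parameter exists only so the EIO path can be exercised
without a broken mount.

```python
import errno
import os
import tempfile


def sync_or_report(fd: int, sync=os.fdatasync) -> str:
    """Flush file data and classify the result instead of raising on EIO."""
    try:
        sync(fd)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return "EIO"
        raise  # any other error is unexpected here
    return "ok"


if __name__ == "__main__":
    # On a healthy file system this prints "ok".
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        print(sync_or_report(tmp.fileno()))
```

To test the NFS case, open a file under the NFS-mounted home directory
instead of a temporary file and keep the descriptor open across the
suspend.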





* Re: nfs home directory and google chrome.
  2020-10-05 20:07   ` Kenneth Johansson
@ 2020-10-06 18:14     ` J. Bruce Fields
  2020-10-07 10:54       ` Kenneth Johansson
  0 siblings, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2020-10-06 18:14 UTC (permalink / raw)
  To: Kenneth Johansson; +Cc: Patrick Goetz, linux-nfs

On Mon, Oct 05, 2020 at 10:07:56PM +0200, Kenneth Johansson wrote:
> On 2020-10-05 18:46, Patrick Goetz wrote:
> >We had a similar problem with Firefox, most notably with Mac OSX
> >users who have NFS-mounted home directories. There's an
> >about:config solution for Firefox; namely set
> >
> >   storage.nfs_filesystem: true
> >
> >This forces a specific network file locking mechanism which makes
> >sqlite behave better. I'm guessing google chrome has something
> >similar.
> >
> Since I have used chrome for years without any problems my guess it
> that its something that changed with nfs in my setup.
> 
> I did a strace and the first -EIO I get look like this
> 
> fdatasync(94</home/kenjo/.config/google-chrome/Default/Login Data>)
> = -1 EIO (Input/output error)
> 
> then the same thing happens for other files like
> 
> fdatasync(83</home/kenjo/.config/google-chrome/Default/Web Data>) =
> -1 EIO (Input/output error)
> 
> fdatasync(74</home/kenjo/.config/google-chrome/Default/History>) =
> -1 EIO (Input/output error)

Are you using soft mounts?

(What are your mount options?)

--b.


* Re: nfs home directory and google chrome.
  2020-10-06 18:14     ` J. Bruce Fields
@ 2020-10-07 10:54       ` Kenneth Johansson
  2020-10-07 13:10         ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Kenneth Johansson @ 2020-10-07 10:54 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Patrick Goetz, linux-nfs

On 2020-10-06 20:14, J. Bruce Fields wrote:
> On Mon, Oct 05, 2020 at 10:07:56PM +0200, Kenneth Johansson wrote:
>> On 2020-10-05 18:46, Patrick Goetz wrote:
>>> We had a similar problem with Firefox, most notably with Mac OSX
>>> users who have NFS-mounted home directories. There's an
>>> about:config solution for Firefox; namely set
>>>
>>>     storage.nfs_filesystem: true
>>>
>>> This forces a specific network file locking mechanism which makes
>>> sqlite behave better. I'm guessing google chrome has something
>>> similar.
>>>
>> Since I have used chrome for years without any problems my guess it
>> that its something that changed with nfs in my setup.
>>
>> I did a strace and the first -EIO I get look like this
>>
>> fdatasync(94</home/kenjo/.config/google-chrome/Default/Login Data>)
>> = -1 EIO (Input/output error)
>>
>> then the same thing happens for other files like
>>
>> fdatasync(83</home/kenjo/.config/google-chrome/Default/Web Data>) =
>> -1 EIO (Input/output error)
>>
>> fdatasync(74</home/kenjo/.config/google-chrome/Default/History>) =
>> -1 EIO (Input/output error)
> Are you using soft mounts?
>
> (What are your mount options?)

auto.home /home autofs 
rw,relatime,fd=18,pgrp=2682,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=67621 
0 0

/home/kenjo nfs4 
rw,noatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=120,acregmax=120,acdirmin=120,acdirmax=120,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.16,fsc,local_lock=none,addr=172.16.2.6 
0 0

What I actually set manually in auto.home is:

-tcp,fsc,noatime,ac,actimeo=120
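
To answer the soft-mount question mechanically, the option string from
the nfs4 line above can be split into a dictionary. This is only a
sketch; parse_mount_options is a made-up helper, and the option string
is copied from the mount output in this message.

```python
def parse_mount_options(opts: str) -> dict:
    """Split a comma-separated mount option string into a dict.

    Flags without a value (for example "hard") map to True.
    """
    parsed = {}
    for item in opts.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string of the /home/kenjo nfs4 mount quoted above.
NFS_OPTS = (
    "rw,noatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,"
    "acregmin=120,acregmax=120,acdirmin=120,acdirmax=120,hard,proto=tcp,"
    "timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.16,fsc,"
    "local_lock=none,addr=172.16.2.6"
)

opts = parse_mount_options(NFS_OPTS)
print("soft" in opts, opts.get("hard"), opts["vers"])
```

For this mount the result shows a hard mount over NFSv4.2, so soft-mount
timeouts are not the source of the EIO.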



* Re: nfs home directory and google chrome.
  2020-10-07 10:54       ` Kenneth Johansson
@ 2020-10-07 13:10         ` J. Bruce Fields
  2020-10-07 14:34           ` Frank Filz
  2020-10-07 21:10           ` Kenneth Johansson
  0 siblings, 2 replies; 15+ messages in thread
From: J. Bruce Fields @ 2020-10-07 13:10 UTC (permalink / raw)
  To: Kenneth Johansson; +Cc: Patrick Goetz, linux-nfs

On Wed, Oct 07, 2020 at 12:54:50PM +0200, Kenneth Johansson wrote:
> On 2020-10-06 20:14, J. Bruce Fields wrote:
> >On Mon, Oct 05, 2020 at 10:07:56PM +0200, Kenneth Johansson wrote:
> >>On 2020-10-05 18:46, Patrick Goetz wrote:
> >>>We had a similar problem with Firefox, most notably with Mac OSX
> >>>users who have NFS-mounted home directories. There's an
> >>>about:config solution for Firefox; namely set
> >>>
> >>>    storage.nfs_filesystem: true
> >>>
> >>>This forces a specific network file locking mechanism which makes
> >>>sqlite behave better. I'm guessing google chrome has something
> >>>similar.
> >>>
> >>Since I have used chrome for years without any problems my guess it
> >>that its something that changed with nfs in my setup.
> >>
> >>I did a strace and the first -EIO I get look like this
> >>
> >>fdatasync(94</home/kenjo/.config/google-chrome/Default/Login Data>)
> >>= -1 EIO (Input/output error)
> >>
> >>then the same thing happens for other files like
> >>
> >>fdatasync(83</home/kenjo/.config/google-chrome/Default/Web Data>) =
> >>-1 EIO (Input/output error)
> >>
> >>fdatasync(74</home/kenjo/.config/google-chrome/Default/History>) =
> >>-1 EIO (Input/output error)
> >Are you using soft mounts?
> >
> >(What are your mount options?)
> 
> auto.home /home autofs rw,relatime,fd=18,pgrp=2682,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=67621
> 0 0
> 
> /home/kenjo nfs4 rw,noatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=120,acregmax=120,acdirmin=120,acdirmax=120,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.16,fsc,local_lock=none,addr=172.16.2.6
> 0 0
> 
> what I actualy set manually in auto.home is
> 
> -tcp,fsc,noatime,ac,actimeo=120

OK, that looks fine.

Maybe I overlooked the obvious: if Chrome holds a lock on that file when
you suspend, and if you stay in suspend for longer than the NFSv4 lease
time (default 90 seconds), then the client will lose its lease, hence
any file locks.  I think these days the client then returns EIO on any
further IO to that file descriptor.
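
The timing argument can be written down as a one-line check. This is a
simplification: it assumes the lease was renewed right before suspend
and that the server drops state as soon as the lease period has passed.
The 90-second default is the figure quoted above; lease_lost is a
made-up name.

```python
DEFAULT_LEASE_SECONDS = 90  # default lease time quoted in this thread


def lease_lost(suspend_seconds: float,
               lease_seconds: float = DEFAULT_LEASE_SECONDS) -> bool:
    """Return True if a suspend of this length outlasts the lease.

    Assumes the last renewal happened at the moment of suspend.
    """
    return suspend_seconds > lease_seconds


print(lease_lost(8 * 3600))  # suspend overnight
print(lease_lost(30))        # brief suspend
```

Under these assumptions an overnight suspend always outlasts the lease,
while a suspend shorter than the lease time does not, which would
explain the "minimum time" seen in the original report.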

Maybe there's some way to turn off that locking as a workaround.

The simplest thing we can do to help might be implementing "courteous
server" behavior: instead of automatically removing locks after a
client's lease expires, it can wait until there's an actual lock
conflict.  That might be enough for your case.

There's been a little planning done and it's not a big project, but I
don't think it's actually at the top of anyone's todo list right now, so
I'm not sure when that will get done.

--b.


* RE: nfs home directory and google chrome.
  2020-10-07 13:10         ` J. Bruce Fields
@ 2020-10-07 14:34           ` Frank Filz
  2020-10-07 15:17             ` 'J. Bruce Fields'
  2020-10-07 15:39             ` Chuck Lever
  2020-10-07 21:10           ` Kenneth Johansson
  1 sibling, 2 replies; 15+ messages in thread
From: Frank Filz @ 2020-10-07 14:34 UTC (permalink / raw)
  To: 'J. Bruce Fields', 'Kenneth Johansson'
  Cc: 'Patrick Goetz', linux-nfs

> -----Original Message-----
> From: J. Bruce Fields [mailto:bfields@fieldses.org]
> Maybe I overlooked the obvious: if Chrome holds a lock on that file when you
> suspend, and if you stay in suspend for longer than the NFSv4 lease time (default
> 90 seconds), then the client will lose its lease, hence any file locks.  I think these
> days the client then returns EIO on any further IO to that file descriptor.
> 
> Maybe there's some way to turn off that locking as a workaround.
> 
> The simplest thing we can do to help might be implementing "courteous server"
> behavior: instead of automatically removing locks after a client's lease expires,
> it can wait until there's an actual lock conflict.  That might be enough for your
> case.
> 
> There's been a little planning done and it's not a big project, but I don't think it's
> actually at the top of anyone's todo list right now, so I'm not sure when that will
> get done.

I've had courtesy locks on my back burner for Ganesha, though I hadn't considered that there might actually be an important practical issue. Does any other server implement them? If we suggest this as a solution to the Chrome suspend issue, it would be good to ensure that the major server vendors implement it.

There is a problem with the courtesy locks for this solution though... The clientid is still going to be expired, and the locks are associated with the clientid, so unless we allow courtesy re-instatement of expired clientids, courtesy locks don't actually solve the problem...

Option - use NFSv3 instead :-) The lack of lock expiry due to AWOL client would work in a suspended client's favor... Note also that a suspended client could be a VM, for example, VirtualBox allows saving and suspending a VM in running state.

Interesting problem...

Frank



* Re: nfs home directory and google chrome.
  2020-10-07 14:34           ` Frank Filz
@ 2020-10-07 15:17             ` 'J. Bruce Fields'
  2020-10-07 15:39             ` Chuck Lever
  1 sibling, 0 replies; 15+ messages in thread
From: 'J. Bruce Fields' @ 2020-10-07 15:17 UTC (permalink / raw)
  To: Frank Filz
  Cc: 'Kenneth Johansson', 'Patrick Goetz', linux-nfs

On Wed, Oct 07, 2020 at 07:34:27AM -0700, Frank Filz wrote:
> > -----Original Message----- From: J. Bruce Fields
> > [mailto:bfields@fieldses.org] Maybe I overlooked the obvious: if
> > Chrome holds a lock on that file when you suspend, and if you stay
> > in suspend for longer than the NFSv4 lease time (default 90
> > seconds), then the client will lose its lease, hence any file locks.
> > I think these days the client then returns EIO on any further IO to
> > that file descriptor.
> > 
> > Maybe there's some way to turn off that locking as a workaround.
> > 
> > The simplest thing we can do to help might be implementing
> > "courteous server" behavior: instead of automatically removing locks
> > after a client's lease expires, it can wait until there's an actual
> > lock conflict.  That might be enough for your case.
> > 
> > There's been a little planning done and it's not a big project, but
> > I don't think it's actually at the top of anyone's todo list right
> > now, so I'm not sure when that will get done.
> 
> I've had courtesy locks on my back burner for Ganesha though I hadn't
> thought about that there might actually be an important practical
> issue. Does any other server implement them? If we suggest this as a
> solution to the Chrome suspend issue, it might be good to assure that
> the major server vendors implement this.
> 
> There is a problem with the courtesy locks for this solution though...
> The clientid is still going to be expired, and the locks are
> associated with the clientid, so unless we allow courtesy
> re-instatement of expired clientids, courtesy locks don't actually
> solve the problem...

The server's not required to expire the clientid when the lease expires.
A server that chooses to be "courteous" can let it hang around.

As a first implementation our server would probably wait until there's a
lock conflict, then destroy all the client's state.  But we could also
choose to revoke only those locks we have to.  The client uses
TEST_STATEID, I think, to sort out what's happened in that case.

I believe the Linux client implements all of this.  I'm not sure about
the status of other servers.

--b.


* Re: nfs home directory and google chrome.
  2020-10-07 14:34           ` Frank Filz
  2020-10-07 15:17             ` 'J. Bruce Fields'
@ 2020-10-07 15:39             ` Chuck Lever
  2020-10-07 18:11               ` Frank Filz
  1 sibling, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2020-10-07 15:39 UTC (permalink / raw)
  To: Frank Filz
  Cc: Bruce Fields, Kenneth Johansson, Patrick Goetz, Linux NFS Mailing List



> On Oct 7, 2020, at 10:34 AM, Frank Filz <ffilzlnx@mindspring.com> wrote:
> 
>> -----Original Message-----
>> From: J. Bruce Fields [mailto:bfields@fieldses.org]
>> Maybe I overlooked the obvious: if Chrome holds a lock on that file when you
>> suspend, and if you stay in suspend for longer than the NFSv4 lease time (default
>> 90 seconds), then the client will lose its lease, hence any file locks.  I think these
>> days the client then returns EIO on any further IO to that file descriptor.
>> 
>> Maybe there's some way to turn off that locking as a workaround.
>> 
>> The simplest thing we can do to help might be implementing "courteous server"
>> behavior: instead of automatically removing locks after a client's lease expires,
>> it can wait until there's an actual lock conflict.  That might be enough for your
>> case.
>> 
>> There's been a little planning done and it's not a big project, but I don't think it's
>> actually at the top of anyone's todo list right now, so I'm not sure when that will
>> get done.
> 
> I've had courtesy locks on my back burner for Ganesha though I hadn't thought about that there might actually be an important practical issue.

We've found that instantly bringing the hammer down on NFSv4 leases has
negative operational consequences in environments where minutes-long
network partitions are part of life.

Extending the lease period impacts the length an NFS server is in grace
after a reboot, so it's not always a good solution.


> Does any other server implement them? If we suggest this as a solution to the Chrome suspend issue, it might be good to assure that the major server vendors implement this.

We think OnTAP does, at least.


> There is a problem with the courtesy locks for this solution though... The clientid is still going to be expired, and the locks are associated with the clientid, so unless we allow courtesy re-instatement of expired clientids, courtesy locks don't actually solve the problem...

An NFSv4 server is not required to expire a lease after the lease period
expires.

A courteous server would simply allow a conflicting lock request to take
an expired lock after a client's lease expired. If no conflicting lock
operations occur, then the missing client could come back and find its
lease state intact (unless of course the server has restarted or purged
the lease for other reasons).
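
The difference between a strict and a courteous server can be shown
with a toy model. This is not how any real NFS server is implemented:
the class ToyLockServer, its methods, and the "EIO" return string are
all made up for illustration, and the model folds client and server
behavior together. It follows the simplest policy described above,
dropping all of an expired client's locks once a conflicting request
arrives.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLockServer:
    lease_seconds: float = 90.0
    courteous: bool = True
    last_renewal: dict = field(default_factory=dict)  # client -> time
    locks: dict = field(default_factory=dict)         # path -> client

    def expired(self, client: str, now: float) -> bool:
        return now - self.last_renewal[client] > self.lease_seconds

    def _drop_client(self, client: str) -> None:
        self.locks = {p: c for p, c in self.locks.items() if c != client}

    def _purge(self, now: float) -> None:
        # A strict server discards expired clients' locks eagerly;
        # a courteous server keeps them until a conflict shows up.
        if self.courteous:
            return
        for client in list(self.last_renewal):
            if self.expired(client, now):
                self._drop_client(client)

    def lock(self, client: str, path: str, now: float) -> bool:
        self._purge(now)
        self.last_renewal[client] = now
        holder = self.locks.get(path)
        if holder is not None and holder != client:
            if not self.expired(holder, now):
                return False           # conflict with a live client
            self._drop_client(holder)  # expired holder loses its state
        self.locks[path] = client
        return True

    def io(self, client: str, path: str, now: float) -> str:
        # What a returning client sees when it uses its old lock.
        self._purge(now)
        if self.locks.get(path) == client:
            self.last_renewal[client] = now
            return "ok"
        return "EIO"


strict = ToyLockServer(courteous=False)
strict.lock("laptop", "History", now=0)
print(strict.io("laptop", "History", now=8 * 3600))

polite = ToyLockServer(courteous=True)
polite.lock("laptop", "History", now=0)
print(polite.io("laptop", "History", now=8 * 3600))
```

In the model, the strict server gives the resumed laptop an error, while
the courteous server lets it continue as long as no other client asked
for a conflicting lock in the meantime.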

Oracle has an open design document that can be posted here for more
comment and review. We agree that this is much better server behavior
and would like more server implementations to adopt it.


> Option - use NFSv3 instead :-) The lack of lock expiry due to AWOL client would work in a suspended client's favor... Note also that a suspended client could be a VM, for example, VirtualBox allows saving and suspending a VM in running state.
> 
> Interesting problem...
> 
> Frank
> 

--
Chuck Lever





* RE: nfs home directory and google chrome.
  2020-10-07 15:39             ` Chuck Lever
@ 2020-10-07 18:11               ` Frank Filz
  2020-10-07 18:36                 ` Chuck Lever
  2020-10-07 23:58                 ` Rick Macklem
  0 siblings, 2 replies; 15+ messages in thread
From: Frank Filz @ 2020-10-07 18:11 UTC (permalink / raw)
  To: 'Chuck Lever'
  Cc: 'Bruce Fields', 'Kenneth Johansson',
	'Patrick Goetz', 'Linux NFS Mailing List'



> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Wednesday, October 7, 2020 8:40 AM
> To: Frank Filz <ffilzlnx@mindspring.com>
> Cc: Bruce Fields <bfields@fieldses.org>; Kenneth Johansson <ken@kenjo.org>;
> Patrick Goetz <pgoetz@math.utexas.edu>; Linux NFS Mailing List
> <linux-nfs@vger.kernel.org>
> Subject: Re: nfs home directory and google chrome.
>
> > On Oct 7, 2020, at 10:34 AM, Frank Filz <ffilzlnx@mindspring.com> wrote:
> >
> >> -----Original Message-----
> >> From: J. Bruce Fields [mailto:bfields@fieldses.org]
> >> Maybe I overlooked the obvious: if Chrome holds a lock on that file when
> >> you suspend, and if you stay in suspend for longer than the NFSv4 lease
> >> time (default 90 seconds), then the client will lose its lease, hence any
> >> file locks.  I think these days the client then returns EIO on any further
> >> IO to that file descriptor.
> >>
> >> Maybe there's some way to turn off that locking as a workaround.
> >>
> >> The simplest thing we can do to help might be implementing "courteous
> >> server" behavior: instead of automatically removing locks after a client's
> >> lease expires, it can wait until there's an actual lock conflict.
> >> That might be enough for your case.
> >>
> >> There's been a little planning done and it's not a big project, but I
> >> don't think it's actually at the top of anyone's todo list right now,
> >> so I'm not sure when that will get done.
> >
> > I've had courtesy locks on my back burner for Ganesha though I hadn't
> > thought about that there might actually be an important practical issue.
>
> We've found that instantly bringing the hammer down on NFSv4 leases has
> negative operational consequences in environments where minutes-long
> network partitions are part of life.
>
> Extending the lease period impacts the length an NFS server is in grace
> after a reboot, so it's not always a good solution.
>
> > Does any other server implement them? If we suggest this as a solution
> > to the Chrome suspend issue, it might be good to assure that the major
> > server vendors implement this.
>
> We think OnTAP does, at least.
>
> > There is a problem with the courtesy locks for this solution though...
> > The clientid is still going to be expired, and the locks are associated
> > with the clientid, so unless we allow courtesy re-instatement of expired
> > clientids, courtesy locks don't actually solve the problem...
>
> An NFSv4 server is not required to expire a lease after the lease period
> expires.
> 
> A courteous server would simply allow a conflicting lock request to take
an
> expired lock after a client's lease expired. If no conflicting lock
operations occur,
> then the missing client could come back and find its lease state intact
(unless of
> course the server has restarted or purged the lease for other reasons).
> 
> Oracle has an open design document that can be posted here for more
> comment and review. We agree that this is much better server behavior and
> would like more server implementations to adopt it.

Ah that document would be helpful. Does the document discuss conditions
where a server might abandon a courtesy hold on a client id and expire it
out anyway? For example, to conserve resources.

Thanks

Frank



* Re: nfs home directory and google chrome.
  2020-10-07 18:11               ` Frank Filz
@ 2020-10-07 18:36                 ` Chuck Lever
  2020-10-07 23:58                 ` Rick Macklem
  1 sibling, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2020-10-07 18:36 UTC (permalink / raw)
  To: Frank Filz
  Cc: Bruce Fields, Kenneth Johansson, Patrick Goetz, Linux NFS Mailing List



> On Oct 7, 2020, at 2:11 PM, Frank Filz <ffilzlnx@mindspring.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Chuck Lever [mailto:chuck.lever@oracle.com]
>> Sent: Wednesday, October 7, 2020 8:40 AM
>> To: Frank Filz <ffilzlnx@mindspring.com>
>> Cc: Bruce Fields <bfields@fieldses.org>; Kenneth Johansson <ken@kenjo.org>;
>> Patrick Goetz <pgoetz@math.utexas.edu>; Linux NFS Mailing List
>> <linux-nfs@vger.kernel.org>
>> Subject: Re: nfs home directory and google chrome.
>>
>>> On Oct 7, 2020, at 10:34 AM, Frank Filz <ffilzlnx@mindspring.com> wrote:
>>>
>>>> -----Original Message-----
>>>> From: J. Bruce Fields [mailto:bfields@fieldses.org]
>>>> Maybe I overlooked the obvious: if Chrome holds a lock on that file when
>>>> you suspend, and if you stay in suspend for longer than the NFSv4 lease
>>>> time (default 90 seconds), then the client will lose its lease, hence
>>>> any file locks.  I think these days the client then returns EIO on any
>>>> further IO to that file descriptor.
>>>>
>>>> Maybe there's some way to turn off that locking as a workaround.
>>>>
>>>> The simplest thing we can do to help might be implementing "courteous
>>>> server" behavior: instead of automatically removing locks after a
>>>> client's lease expires, it can wait until there's an actual lock
>>>> conflict.  That might be enough for your case.
>>>>
>>>> There's been a little planning done and it's not a big project, but I
>>>> don't think it's actually at the top of anyone's todo list right now,
>>>> so I'm not sure when that will get done.
>>>
>>> I've had courtesy locks on my back burner for Ganesha though I hadn't
>>> thought about that there might actually be an important practical issue.
>>
>> We've found that instantly bringing the hammer down on NFSv4 leases has
>> negative operational consequences in environments where minutes-long
>> network partitions are part of life.
>>
>> Extending the lease period impacts the length an NFS server is in grace
>> after a reboot, so it's not always a good solution.
>>
>>> Does any other server implement them? If we suggest this as a solution
>>> to the Chrome suspend issue, it might be good to assure that the major
>>> server vendors implement this.
>>
>> We think OnTAP does, at least.
>>
>>> There is a problem with the courtesy locks for this solution though...
>>> The clientid is still going to be expired, and the locks are associated
>>> with the clientid, so unless we allow courtesy re-instatement of expired
>>> clientids, courtesy locks don't actually solve the problem...
>>
>> An NFSv4 server is not required to expire a lease after the lease period
>> expires.
>>
>> A courteous server would simply allow a conflicting lock request to take
>> an expired lock after a client's lease expired. If no conflicting lock
>> operations occur, then the missing client could come back and find its
>> lease state intact (unless of course the server has restarted or purged
>> the lease for other reasons).
>>
>> Oracle has an open design document that can be posted here for more
>> comment and review. We agree that this is much better server behavior and
>> would like more server implementations to adopt it.
> 
> Ah that document would be helpful. Does the document discuss conditions
> where a server might abandon a courtesy hold on a client id and expire it
> out anyway? For example, to conserve resources.

Yes. It covers appropriate server responses to a client to report that
it has done this.

Bill will post the document soon in a separate thread.

--
Chuck Lever





* Re: nfs home directory and google chrome.
  2020-10-07 13:10         ` J. Bruce Fields
  2020-10-07 14:34           ` Frank Filz
@ 2020-10-07 21:10           ` Kenneth Johansson
  1 sibling, 0 replies; 15+ messages in thread
From: Kenneth Johansson @ 2020-10-07 21:10 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Patrick Goetz, linux-nfs

On 2020-10-07 15:10, J. Bruce Fields wrote:
> On Wed, Oct 07, 2020 at 12:54:50PM +0200, Kenneth Johansson wrote:
>> On 2020-10-06 20:14, J. Bruce Fields wrote:
>>> On Mon, Oct 05, 2020 at 10:07:56PM +0200, Kenneth Johansson wrote:
>>>> On 2020-10-05 18:46, Patrick Goetz wrote:
>>>>> We had a similar problem with Firefox, most notably with Mac OSX
>>>>> users who have NFS-mounted home directories. There's an
>>>>> about:config solution for Firefox; namely set
>>>>>
>>>>>     storage.nfs_filesystem: true
>>>>>
>>>>> This forces a specific network file locking mechanism which makes
>>>>> sqlite behave better. I'm guessing google chrome has something
>>>>> similar.
>>>>>
>>>> Since I have used chrome for years without any problems my guess it
>>>> that its something that changed with nfs in my setup.
>>>>
>>>> I did a strace and the first -EIO I get look like this
>>>>
>>>> fdatasync(94</home/kenjo/.config/google-chrome/Default/Login Data>)
>>>> = -1 EIO (Input/output error)
>>>>
>>>> then the same thing happens for other files like
>>>>
>>>> fdatasync(83</home/kenjo/.config/google-chrome/Default/Web Data>) =
>>>> -1 EIO (Input/output error)
>>>>
>>>> fdatasync(74</home/kenjo/.config/google-chrome/Default/History>) =
>>>> -1 EIO (Input/output error)
>>> Are you using soft mounts?
>>>
>>> (What are your mount options?)
>> auto.home /home autofs rw,relatime,fd=18,pgrp=2682,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=67621
>> 0 0
>>
>> /home/kenjo nfs4 rw,noatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=120,acregmax=120,acdirmin=120,acdirmax=120,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.16,fsc,local_lock=none,addr=172.16.2.6
>> 0 0
>>
>> what I actualy set manually in auto.home is
>>
>> -tcp,fsc,noatime,ac,actimeo=120
> OK, that looks fine.
>
> Maybe I overlooked the obvious: if Chrome holds a lock on that file when
> you suspend, and if you stay in suspend for longer than the NFSv4 lease
> time (default 90 seconds), then the client will lose its lease, hence
> any file locks.  I think these days the client then returns EIO on any
> further IO to that file descriptor.

So I tested by just pulling the network cable for a few minutes, and the
effect is the same. This time the password database survived, but the
history went bye-bye.

History sqlite error 1034, errno 5: disk I/O error, sql: COMMIT

The real problem here is that Chrome destroys the database file and does
not tell the end user that anything is wrong unless it was started
from a terminal, and it can never recover. If it had just removed the
file and recreated a new one from the server, it would not be a big
problem. What the user notices is that password handling and form filling
work very poorly, and the only solution is really "rm -rf
~/.config/google-chrome/" to force it to recreate the files.
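
For anyone who wants to try the cable-pull test without risking a Chrome profile, here is a minimal Python sketch of a probe. It is an illustration only and has not been run against an NFS mount. It assumes a scratch file on the NFSv4 mount and assumes that an fcntl lock plus fdatasync() is close enough to what Chrome does with its database files; the helper names are made up.

```python
import errno
import fcntl
import os


def take_lock(fd):
    """Take an exclusive fcntl byte-range lock on the whole file."""
    fcntl.lockf(fd, fcntl.LOCK_EX)


def write_and_sync(fd, payload=b"probe\n"):
    """Write at offset 0 and flush; return None on success, else the errno."""
    try:
        os.pwrite(fd, payload, 0)
        os.fdatasync(fd)  # the call that failed with EIO in the strace output
        return None
    except OSError as exc:
        return exc.errno


def describe(result):
    """Turn the return value of write_and_sync() into a short message."""
    if result is None:
        return "ok"
    return errno.errorcode.get(result, str(result))
```

The idea would be to open a scratch file with os.open(), call take_lock() once, check that write_and_sync() succeeds, pull the cable for a few minutes, plug it back in, and call write_and_sync() again. If the lease-expiry explanation is right, the second call should report EIO, but that is an expectation and not something measured here.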


> Maybe there's some way to turn off that locking as a workaround.
>
> The simplest thing we can do to help might be implementing "courteous
> server" behavior: instead of automatically removing locks after a
> client's lease expires, it can wait until there's an actual lock
> conflict.  That might be enough for your case.
>
> There's been a little planning done and it's not a big project, but I
> don't think it's actually at the top of anyone's todo list right now, so
> I'm not sure when that will get done.
>
> --b.
>
>>
>>> --b.
>>>
>>>>
>>>>
>>>>> On 10/4/20 6:53 AM, Kenneth Johansson wrote:
>>>>>> So I have had for a long time problems with google chrome and
>>>>>> suspend resume causing it to mangle its sqlite database.
>>>>>>
>>>>>> it looks to only happen if I use nfs mounted home directory. I'm
>>>>>> not sure exactly what is happening but lets first see if this
>>>>>> happens to anybody else.
>>>>>>
>>>>>> How to get the error.
>>>>>>
>>>>>> 1. start google from a terminal with "google-chrome"
>>>>>>
>>>>>> 2. suspend the computer
>>>>>>
>>>>>> 3. wait a while. There is some type of minimum time here I do
>>>>>> not know what its is but I basically get the error every time of
>>>>>> I suspend in evening and resume in morning
>>>>>>
>>>>>> 4. look for printout that looks like something like this
>>>>>>
>>>>>> [16789:18181:1004/125852.529750:ERROR:database.cc(1692)]
>>>>>> Passwords sqlite error 1034, errno 5: disk I/O error, sql:
>>>>>> COMMIT
>>>>>> [16789:16829:1004/125852.529744:ERROR:database.cc(1692)] Web
>>>>>> sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
>>>>>> [16789:16829:1004/125852.530261:ERROR:database.cc(1692)] Web
>>>>>> sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR
>>>>>> REPLACE INTO autofill_model_type_state (model_type, value)
>>>>>> VALUES(?,?)
>>>>>> [16789:16789:1004/125852.563571:ERROR:sync_metadata_store_change_list.cc(34)]
>>>>>> Autofill datatype error was encountered: Failed to update
>>>>>> ModelTypeState.
>>>>>> [16789:19002:1004/125902.534103:ERROR:database.cc(1692)] History
>>>>>> sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
>>>>>> [16789:19002:1004/125902.536903:ERROR:database.cc(1692)]
>>>>>> Thumbnail sqlite error 778, errno 5: disk I/O error, sql: COMMIT
>>>>>>
>>>>>>
>>>>>> [16789:19002:1004/130044.120379:ERROR:database.cc(1692)]
>>>>>> Passwords sqlite error 1034, errno 5: disk I/O error, sql:
>>>>>> INSERT OR REPLACE INTO sync_model_metadata (id, model_metadata)
>>>>>> VALUES(1, ?)
>>>>>> [16789:16829:1004/130044.120388:ERROR:database.cc(1692)] Web
>>>>>> sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR
>>>>>> REPLACE INTO autofill_model_type_state (model_type, value)
>>>>>> VALUES(?,?)
>>>>>>
>>>>>>
>>>>>> and so on.  if you use google sync you can also check
>>>>>> "chrome://sync-internals" to see if something is wrong with the
>>>>>> database.
>>>>>>
>>>>>>
>>>>>>




* Re: nfs home directory and google chrome.
  2020-10-07 18:11               ` Frank Filz
  2020-10-07 18:36                 ` Chuck Lever
@ 2020-10-07 23:58                 ` Rick Macklem
  1 sibling, 0 replies; 15+ messages in thread
From: Rick Macklem @ 2020-10-07 23:58 UTC (permalink / raw)
  To: Frank Filz, 'Chuck Lever'
  Cc: 'Bruce Fields', 'Kenneth Johansson',
	'Patrick Goetz', 'Linux NFS Mailing List'

Frank Filz wrote:
>> -----Original Message-----
>> From: Chuck Lever [mailto:chuck.lever@oracle.com]
>> Sent: Wednesday, October 7, 2020 8:40 AM
>> To: Frank Filz <ffilzlnx@mindspring.com>
>> Cc: Bruce Fields <bfields@fieldses.org>; Kenneth Johansson <ken@kenjo.org>;
>> Patrick Goetz <pgoetz@math.utexas.edu>; Linux NFS Mailing List
>> <linux-nfs@vger.kernel.org>
>> Subject: Re: nfs home directory and google chrome.
>>
>> > On Oct 7, 2020, at 10:34 AM, Frank Filz <ffilzlnx@mindspring.com> wrote:
>> >
>> >> -----Original Message-----
>> >> From: J. Bruce Fields [mailto:bfields@fieldses.org]
>> >> Maybe I overlooked the obvious: if Chrome holds a lock on that file when
>> >> you suspend, and if you stay in suspend for longer than the NFSv4 lease
>> >> time (default 90 seconds), then the client will lose its lease, hence
>> >> any file locks.  I think these days the client then returns EIO on any
>> >> further IO to that file descriptor.
>> >>
>> >> Maybe there's some way to turn off that locking as a workaround.
>> >>
>> >> The simplest thing we can do to help might be implementing "courteous
>> >> server" behavior: instead of automatically removing locks after a
>> >> client's lease expires, it can wait until there's an actual lock
>> >> conflict.  That might be enough for your case.
>> >>
>> >> There's been a little planning done and it's not a big project, but I
>> >> don't think it's actually at the top of anyone's todo list right now,
>> >> so I'm not sure when that will get done.
>> >
>> > I've had courtesy locks on my back burner for Ganesha though I hadn't
>> > thought about that there might actually be an important practical issue.
>>
>> We've found that instantly bringing the hammer down on NFSv4 leases has
>> negative operational consequences in environments where minutes-long
>> network partitions are part of life.
>>
>> Extending the lease period impacts the length an NFS server is in grace
>> after a reboot, so it's not always a good solution.
>>
>> > Does any other server implement them? If we suggest this as a solution
>> > to the Chrome suspend issue, it might be good to assure that the major
>> > server vendors implement this.
>>
>> We think OnTAP does, at least.
>>
>> > There is a problem with the courtesy locks for this solution though...
>> > The clientid is still going to be expired, and the locks are associated
>> > with the clientid, so unless we allow courtesy re-instatement of expired
>> > clientids, courtesy locks don't actually solve the problem...
>>
>> An NFSv4 server is not required to expire a lease after the lease period
>> expires.
The way the FreeBSD server is implemented is that it does not expire a ClientID
(and all associated open/byte range lock state) when a lease expires.
However, a conflicting Open/Lock request results in the ClientID and all
associated opens/byte range locks being expired at that point in time.

I was never sure if an NFSv4 client would expect NFS4ERR_EXPIRED to be
returned for some locks but not for all state-related operations, so I chose
to expire the ClientID + all Opens and Locks at the same time.
--> However, this is only triggered by a conflicting Open/Lock request or
      server resource depletion and not simply a client failing to renew a lease.

rick
ps: It has been like this for almost 20 years and I have not heard of it
      causing problems.
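
A toy model of the policy described above, in Python. This is only a sketch under my own assumptions, not FreeBSD code: the class and method names are hypothetical, time is a plain number of seconds, and resource depletion is left out.

```python
class ToyServer:
    """Hypothetical model: an expired lease alone does not remove state."""

    def __init__(self, lease_seconds=90):
        self.lease_seconds = lease_seconds
        self.last_renewal = {}  # client id -> time of last renewal
        self.opens = {}         # client id -> set of open file names
        self.locks = {}         # file name -> client id holding the lock

    def renew(self, client, now):
        self.last_renewal[client] = now
        self.opens.setdefault(client, set())

    def lease_expired(self, client, now):
        return now - self.last_renewal[client] > self.lease_seconds

    def expire_client(self, client):
        # Drop the client id together with all of its opens and locks.
        self.last_renewal.pop(client, None)
        self.opens.pop(client, None)
        self.locks = {f: c for f, c in self.locks.items() if c != client}

    def lock(self, client, name, now):
        self.renew(client, now)
        holder = self.locks.get(name)
        if holder is not None and holder != client:
            if not self.lease_expired(holder, now):
                return False  # real conflict: the holder's lease is valid
            self.expire_client(holder)  # conflict with an expired client
        self.locks[name] = client
        self.opens[client].add(name)
        return True
```

In this model a client that stays silent for hours keeps every lock as long as nobody else asks for one of them. The first conflicting request removes all of that client's state at once, not just the one contested lock.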

>
>> A courteous server would simply allow a conflicting lock request to take
>> an expired lock after a client's lease expired. If no conflicting lock
>> operations occur, then the missing client could come back and find its
>> lease state intact (unless of course the server has restarted or purged
>> the lease for other reasons).
>>
>> Oracle has an open design document that can be posted here for more
>> comment and review. We agree that this is much better server behavior and
>> would like more server implementations to adopt it.
>
> Ah that document would be helpful. Does the document discuss conditions
> where a server might abandon a courtesy hold on a client id and expire it
> out anyway? For example, to conserve resources.
>
> Thanks
>
> Frank




* Re: nfs home directory and google chrome.
  2020-10-04 11:53 nfs home directory and google chrome Kenneth Johansson
  2020-10-05 16:46 ` Patrick Goetz
@ 2020-10-27 23:01 ` Kenneth Johansson
  2020-10-29 17:36   ` J. Bruce Fields
  1 sibling, 1 reply; 15+ messages in thread
From: Kenneth Johansson @ 2020-10-27 23:01 UTC (permalink / raw)
  To: linux-nfs

So this is just an update on how to avoid this problem.

I switched to NFSv3 and have had no more issues. Since the switch, Chrome has
not stopped syncing with the Google server even once. Suspend/resume
causes no issues and everything looks OK.  So it's clear that
google-chrome currently does not like NFSv4, and I need Chrome to work
more than I need to run NFSv4.


On 2020-10-04 13:53, Kenneth Johansson wrote:
> So I have had for a long time problems with google chrome and suspend 
> resume causing it to mangle its sqlite database.
>
> it looks to only happen if I use nfs mounted home directory. I'm not 
> sure exactly what is happening but lets first see if this happens to 
> anybody else.
>
> How to get the error.
>
> 1. start google from a terminal with "google-chrome"
>
> 2. suspend the computer
>
> 3. wait a while. There is some type of minimum time here I do not know 
> what its is but I basically get the error every time of I suspend in 
> evening and resume in morning
>
> 4. look for printout that looks like something like this
>
> [16789:18181:1004/125852.529750:ERROR:database.cc(1692)] Passwords 
> sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
> [16789:16829:1004/125852.529744:ERROR:database.cc(1692)] Web sqlite 
> error 1034, errno 5: disk I/O error, sql: COMMIT
> [16789:16829:1004/125852.530261:ERROR:database.cc(1692)] Web sqlite 
> error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE INTO 
> autofill_model_type_state (model_type, value) VALUES(?,?)
> [16789:16789:1004/125852.563571:ERROR:sync_metadata_store_change_list.cc(34)] 
> Autofill datatype error was encountered: Failed to update ModelTypeState.
> [16789:19002:1004/125902.534103:ERROR:database.cc(1692)] History 
> sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
> [16789:19002:1004/125902.536903:ERROR:database.cc(1692)] Thumbnail 
> sqlite error 778, errno 5: disk I/O error, sql: COMMIT
>
>
> [16789:19002:1004/130044.120379:ERROR:database.cc(1692)] Passwords 
> sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE 
> INTO sync_model_metadata (id, model_metadata) VALUES(1, ?)
> [16789:16829:1004/130044.120388:ERROR:database.cc(1692)] Web sqlite 
> error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE INTO 
> autofill_model_type_state (model_type, value) VALUES(?,?)
>
>
> and so on.  if you use google sync you can also check 
> "chrome://sync-internals" to see if something is wrong with the database.
>
>
>



* Re: nfs home directory and google chrome.
  2020-10-27 23:01 ` Kenneth Johansson
@ 2020-10-29 17:36   ` J. Bruce Fields
  0 siblings, 0 replies; 15+ messages in thread
From: J. Bruce Fields @ 2020-10-29 17:36 UTC (permalink / raw)
  To: Kenneth Johansson; +Cc: linux-nfs

On Wed, Oct 28, 2020 at 12:01:28AM +0100, Kenneth Johansson wrote:
> So this is just an update to how to avoid this problem.
> 
> I switched to nfs v3 and no more issues.

Yes, that's also consistent with the explanation that the problem is
client lease expiry.

NFSv4 locks are lease-based--the client loses all its locks if it
doesn't contact the server regularly (by default, about every 90
seconds).  So, if you suspend or lose contact with the server for too
long, then you lose your locks.

NFSv3 (NLM) locks are not.  The client keeps them until it unlocks them
or explicitly tells the server to remove them all (such as if it comes
back up after crashing).  So, there's no risk of losing locks when you
suspend, but there's more of a risk of stuck locks that get in other
clients' way when one client dies.

Once we implement "courteous server", locks will only be removed once
the client loses contact for more than 90 seconds *and* either another
client requests a conflicting lock, or the server just runs out of
memory for client state.  I think that'll be a better compromise.
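
Roughly, the difference between the two rules looks like this in Python. This is purely illustrative: the function names are made up, and it is not how nfsd is or will be structured.

```python
LEASE_SECONDS = 90  # default lease time mentioned above


def strict_server_drops_locks(seconds_since_contact):
    """Lease-based rule: locks go as soon as the lease has run out."""
    return seconds_since_contact > LEASE_SECONDS


def courteous_server_drops_locks(seconds_since_contact,
                                 conflicting_request,
                                 out_of_memory):
    """Courteous rule: an expired lease alone is not enough."""
    lease_expired = seconds_since_contact > LEASE_SECONDS
    return lease_expired and (conflicting_request or out_of_memory)
```

For an overnight suspend with nobody else touching the files, strict_server_drops_locks(8 * 3600) is true while courteous_server_drops_locks(8 * 3600, False, False) is false, which is the case that matters for the Chrome profile.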

--b.

> Since the switch chrome
> have not stopped syncing with the google server even once. suspend
> resume causes no issues and everything looks ok.  So it's clear that
> google-chrome currently does not like nfs v4 and I need chrome to
> work more than I need to run nfs v4.
> 
> 
> On 2020-10-04 13:53, Kenneth Johansson wrote:
> >So I have had for a long time problems with google chrome and
> >suspend resume causing it to mangle its sqlite database.
> >
> >it looks to only happen if I use nfs mounted home directory. I'm
> >not sure exactly what is happening but lets first see if this
> >happens to anybody else.
> >
> >How to get the error.
> >
> >1. start google from a terminal with "google-chrome"
> >
> >2. suspend the computer
> >
> >3. wait a while. There is some type of minimum time here I do not
> >know what its is but I basically get the error every time of I
> >suspend in evening and resume in morning
> >
> >4. look for printout that looks like something like this
> >
> >[16789:18181:1004/125852.529750:ERROR:database.cc(1692)] Passwords
> >sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
> >[16789:16829:1004/125852.529744:ERROR:database.cc(1692)] Web
> >sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
> >[16789:16829:1004/125852.530261:ERROR:database.cc(1692)] Web
> >sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE
> >INTO autofill_model_type_state (model_type, value) VALUES(?,?)
> >[16789:16789:1004/125852.563571:ERROR:sync_metadata_store_change_list.cc(34)]
> >Autofill datatype error was encountered: Failed to update
> >ModelTypeState.
> >[16789:19002:1004/125902.534103:ERROR:database.cc(1692)] History
> >sqlite error 1034, errno 5: disk I/O error, sql: COMMIT
> >[16789:19002:1004/125902.536903:ERROR:database.cc(1692)] Thumbnail
> >sqlite error 778, errno 5: disk I/O error, sql: COMMIT
> >
> >
> >[16789:19002:1004/130044.120379:ERROR:database.cc(1692)] Passwords
> >sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE
> >INTO sync_model_metadata (id, model_metadata) VALUES(1, ?)
> >[16789:16829:1004/130044.120388:ERROR:database.cc(1692)] Web
> >sqlite error 1034, errno 5: disk I/O error, sql: INSERT OR REPLACE
> >INTO autofill_model_type_state (model_type, value) VALUES(?,?)
> >
> >
> >and so on.  if you use google sync you can also check
> >"chrome://sync-internals" to see if something is wrong with the
> >database.
> >
> >
> >


end of thread, other threads:[~2020-10-29 17:36 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-04 11:53 nfs home directory and google chrome Kenneth Johansson
2020-10-05 16:46 ` Patrick Goetz
2020-10-05 20:07   ` Kenneth Johansson
2020-10-06 18:14     ` J. Bruce Fields
2020-10-07 10:54       ` Kenneth Johansson
2020-10-07 13:10         ` J. Bruce Fields
2020-10-07 14:34           ` Frank Filz
2020-10-07 15:17             ` 'J. Bruce Fields'
2020-10-07 15:39             ` Chuck Lever
2020-10-07 18:11               ` Frank Filz
2020-10-07 18:36                 ` Chuck Lever
2020-10-07 23:58                 ` Rick Macklem
2020-10-07 21:10           ` Kenneth Johansson
2020-10-27 23:01 ` Kenneth Johansson
2020-10-29 17:36   ` J. Bruce Fields
