* NFS hang after concurrent writes
@ 2015-03-09 22:36 Peng Yu
  2015-03-10  8:26 ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-09 22:36 UTC (permalink / raw)
  To: linux-nfs

Hi,

I have a home directory on an Ubuntu server that is shared via NFS to
several other Ubuntu servers as their home directory (the default login
shell on these servers is bash). The NFS server's /etc/exports contains
the following entry:

/mnt/home 172.17.0.0/16(fsid=2,rw,insecure,no_subtree_check,sync,no_root_squash)

autofs mounts the NFS share on the client servers, configured as
follows:

~$ cat /etc/auto.master
+auto.master
/home   /etc/auto.home --timeout=90
~$ cat /etc/auto.home
*  -fstype=nfs4,rw,intr,fsc    nsfserver:/mnt/home/&

But I frequently end up in a situation where ~/.bash_history is not
readable, which blocks logins to these client servers. I suspect this
is related to concurrent writes to ~/.bash_history from these servers,
which somehow break NFS.

Has anybody seen this before? If so, is there a solution to this
problem? Thanks.

-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-09 22:36 NFS hang after concurrent writes Peng Yu
@ 2015-03-10  8:26 ` Benjamin Coddington
  2015-03-10 11:07   ` Peng Yu
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-10  8:26 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

On Mon, 9 Mar 2015, Peng Yu wrote:

> Hi,
>
> I have a home directory from an ubuntu server shared to multiple other
> ubuntu servers as the home directory (the default login shell is bash
> on these servers) via NFS (see the following configuration in
> /etc/exports from the NFS server).
>
> /mnt/home 172.17.0.0/16(fsid=2,rw,insecure,no_subtree_check,sync,no_root_squash)
>
> autofs is used to mount the NFS on the client servers (see the
> following configuration).
>
> ~$ cat /etc/auto.master
> +auto.master
> /home   /etc/auto.home --timeout=90
> ~$ cat /etc/auto.home
> *  -fstype=nfs4,rw,intr,fsc    nsfserver:/mnt/home/&
>
> But I frequently end up with a situation when ~/.bash_history is not
> readable which blocks the logins to these client servers. I feel that
> this may be related with concurrent write to ~/.bash_history from
> these servers, which somehow screw up NFS.

Hi Peng, what do you have that supports your feeling that the problem is
conflicts on the ~/.bash_history file?  It's hard to help without more
information.

You should expect NFS home directories to work and work well - it is a very
common usage.

Ben

* Re: NFS hang after concurrent writes
  2015-03-10  8:26 ` Benjamin Coddington
@ 2015-03-10 11:07   ` Peng Yu
  2015-03-10 11:36     ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-10 11:07 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

On Tuesday, March 10, 2015, Benjamin Coddington <bcodding@redhat.com> wrote:

> Hi Peng, what do you have that supports your feeling that the problem is
> conflicts on the ~/.bash_history file?  It's hard to help without more
> information.


The login process hangs while a line in ~/.bashrc that deals with
~/.bash_history is being executed. At that point, cat ~/.bash_history also
hangs, but stat on the file works fine. After I delete ~/.bash_history, the
login process stops hanging.

The real difficulty is reproducing this problem reliably in order to
pinpoint its exact cause in NFS. So far I have only seen the problem with
~/.bash_history, but it is hard to believe that it depends on the actual
filename. A logical conclusion is that it must somehow be related to how
this file is written.


> You should expect NFS home directories to work and work well - it is a very
> common usage.
>
> Ben
>


-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-10 11:07   ` Peng Yu
@ 2015-03-10 11:36     ` Benjamin Coddington
  2015-03-10 14:53       ` Peng Yu
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-10 11:36 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

On Tue, 10 Mar 2015, Peng Yu wrote:

> The login process hangs when a line related to ~/.bash_history in ~/.bashrc is being loaded. At that time, cat ~/.bash_history
> also hangs, but stat the file works fine. After I delete ~/.bash_history, the login process stops hanging.
>
> The real difficulty is to reliably repoduce this problem in order to pinpoint the exact cause of the problem with nfs. So far I
> only see the problem with ~/.bash_history, it is hard to believe this problem is related with the actual filename. A logic
> conclusion is that it must be somehow related with how this file is written.

A network capture would be a good next step toward finding the problem.

Ben

* NFS hang after concurrent writes
  2015-03-10 11:36     ` Benjamin Coddington
@ 2015-03-10 14:53       ` Peng Yu
  2015-03-11 14:14         ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-10 14:53 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

On Tuesday, March 10, 2015, Benjamin Coddington <bcodding@redhat.com> wrote:

> A network capture would be a good next step toward finding the problem.


I found these instructions:

http://wiki.linux-nfs.org/wiki/index.php/General_troubleshooting_recommendations#Capturing_a_Network_Trace

Could you show the detailed procedure for capturing network traffic to
debug this specific issue, if there is anything better than the
instructions above?

>
> Ben



-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-10 14:53       ` Peng Yu
@ 2015-03-11 14:14         ` Benjamin Coddington
  2015-03-18 15:11           ` Peng Yu
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-11 14:14 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

On Tue, 10 Mar 2015, Peng Yu wrote:

> I see something like this.
>
> http://wiki.linux-nfs.org/wiki/index.php/General_troubleshooting_recommendations#Capturing_a_Network_Trace
>
> Could you help show the detailed proceedure on how to do network capture to debug this specific issue if there is anything better
> than that in the above instruction ?
>
Those instructions are fine.

Ben

* Re: NFS hang after concurrent writes
  2015-03-11 14:14         ` Benjamin Coddington
@ 2015-03-18 15:11           ` Peng Yu
  2015-03-18 15:18             ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-18 15:11 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

>> http://wiki.linux-nfs.org/wiki/index.php/General_troubleshooting_recommendations#Capturing_a_Network_Trace
>>
>> Could you help show the detailed proceedure on how to do network capture to debug this specific issue if there is anything better
>> than that in the above instruction ?
>>
> Those instructions are fine.

Just to be sure: I should follow this procedure when the problem occurs, right?

-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-18 15:11           ` Peng Yu
@ 2015-03-18 15:18             ` Benjamin Coddington
  2015-03-18 15:22               ` Peng Yu
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-18 15:18 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

On Wed, 18 Mar 2015, Peng Yu wrote:

> >> http://wiki.linux-nfs.org/wiki/index.php/General_troubleshooting_recommendations#Capturing_a_Network_Trace
> >>
> >> Could you help show the detailed proceedure on how to do network capture to debug this specific issue if there is anything better
> >> than that in the above instruction ?
> >>
> > Those instructions are fine.
>
> Just to be sure. I following this procedure when this problem occurs, right?

Yes.  If you can reliably reproduce the problem, a network capture of the
NFS traffic between the client and server as you produce the problem would
be very helpful.
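
As a rough illustration of what such a capture could look like, here is a
minimal sketch that only builds a tcpdump command line. It is not taken from
the wiki page: the function name, the output path /tmp/nfs.pcap and the
restriction to port 2049 (the standard NFS port, which should be enough for
an nfs4 mount) are assumptions. The server name is the one from the autofs
map quoted earlier in the thread.

```python
import shlex


def capture_command(server, outfile="/tmp/nfs.pcap", interface=None):
    """Build a tcpdump argv that records full packets to and from `server`.

    Hypothetical helper for illustration; it only assembles the command,
    it does not run it.
    """
    argv = ["tcpdump", "-s", "0", "-w", outfile]   # -s 0: do not truncate packets
    if interface is not None:
        argv += ["-i", interface]                  # optional capture interface
    argv += ["host", server, "and", "port", "2049"]  # 2049: standard NFS port
    return argv


# Print the command so it can be copied into a root shell on the client.
print(shlex.join(capture_command("nsfserver")))
```

The capture would need to be started as root on the client before
reproducing the hang and stopped (Ctrl-C) once the hang has been observed.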

Ben

* Re: NFS hang after concurrent writes
  2015-03-18 15:18             ` Benjamin Coddington
@ 2015-03-18 15:22               ` Peng Yu
  2015-03-18 15:30                 ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-18 15:22 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

> Yes.  If you can reliably reproduce the problem, a network capture of the
> NFS traffic between the client and server as you produce the problem would
> be very helpful.

It sounds like the solution may be simpler than this.

Here is the current problem.

Login hangs when I log in to the server using an account whose home
directory is mounted via NFS.

I log in to another server on which the same NFS directory is mounted,
and from there I can delete the file corresponding to ~/.bash_history on
the first server.

When I try to log in to the first server, it still hangs. When I log in to
this server using another account whose home is not on NFS, I can see
a file ~/.nfs000000160000351200000001 in the first account's
home (but ~/.bash_history is gone). After I delete
~/.nfs000000160000351200000001 from the first account's home, I am able to
log in using the first account.

Can anybody pinpoint the cause of the problem given the above description?

-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-18 15:22               ` Peng Yu
@ 2015-03-18 15:30                 ` Benjamin Coddington
  2015-03-18 15:50                   ` Peng Yu
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-18 15:30 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

On Wed, 18 Mar 2015, Peng Yu wrote:

> > Yes.  If you can reliably reproduce the problem, a network capture of the
> > NFS traffic between the client and server as you produce the problem would
> > be very helpful.
>
> It sounds the solution can be simpler than this.
>
> Here is the current problem.
>
> It hangs when I login to the server using an account with the home
> directory mounted via NFS.
>
> I login to another server on which the same NFS directory is mounted
> and I can delete the file corresponding to ~/.bash_history on the
> first server.
>
> When try to login the first server, it still hangs. When I login this
> server using another account without NFS mounted as home, I can see
> there is a file ~/.nfs000000160000351200000001 in the first account's
> home (but not ~/.bash_history anymore). After I delete
> ~/.nfs000000160000351200000001 in the first account home, I am able to
> login using the first account.
>
> Can anybody pinpoint the cause of the problem given the above description?

That's expected behavior, see: http://nfs.sourceforge.net/#faq_d2
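
To check whether such leftover files are present in a directory, something
like the following sketch may help. It assumes the names always look like
".nfs" followed by hexadecimal digits, as in the example quoted above; the
function name and the pattern are illustrative, not part of any NFS tool.

```python
import os
import re

# Assumed pattern: ".nfs" followed by hex digits, matching the name
# ~/.nfs000000160000351200000001 reported in this thread.
SILLY_NAME = re.compile(r"^\.nfs[0-9a-fA-F]+$")


def find_silly_renamed(directory):
    """Return the sorted names in `directory` that look like .nfsXXXX files."""
    return sorted(name for name in os.listdir(directory) if SILLY_NAME.match(name))
```

For example, find_silly_renamed(os.path.expanduser("~")) would list the
candidates in the current user's home directory. A file listed there is
normally still held open by some process on a client.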

Ben

* Re: NFS hang after concurrent writes
  2015-03-18 15:30                 ` Benjamin Coddington
@ 2015-03-18 15:50                   ` Peng Yu
  2015-03-18 16:34                     ` Benjamin Coddington
  0 siblings, 1 reply; 12+ messages in thread
From: Peng Yu @ 2015-03-18 15:50 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

>> ~/.nfs000000160000351200000001 in the first account home, I am able to
>> login using the first account.

OK. I understand that the content of ~/.nfs000000160000351200000001
should be basically the same as the original ~/.bash_history, which was
deleted on the NFS server via the second server. Right?

The strange thing is: when I try to log in to the first server in a
different login session, why does that session still hang on this
nonexistent ~/.bash_history? Is it that NFS remembers ~/.bash_history
as ~/.nfs000000160000351200000001, so that until the last program
using ~/.bash_history quits, all subsequent programs trying to access
~/.bash_history actually end up at ~/.nfs000000160000351200000001?

>> Can anybody pinpoint the cause of the problem given the above description?
>
> That's expected behavior, see: http://nfs.sourceforge.net/#faq_d2

-- 
Regards,
Peng

* Re: NFS hang after concurrent writes
  2015-03-18 15:50                   ` Peng Yu
@ 2015-03-18 16:34                     ` Benjamin Coddington
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Coddington @ 2015-03-18 16:34 UTC (permalink / raw)
  To: Peng Yu; +Cc: linux-nfs

> >> ~/.nfs000000160000351200000001 in the first account home, I am able to
> >> login using the first account.
>
> OK. I understand that  the content of ~/.nfs000000160000351200000001
> should be basically the same as the original ~/.bash_history which has
> been deleted in the NFS server via the second server. Right?
>
> The strange thing is:  when I try to login the first sever with a
> different login session, why the login session still hangs on this
> nonexistent ~/.bash_history. Is it that NFS remembers ~/.bash_history
> as just ~/.nfs000000160000351200000001. Unless the last program that
> uses ~/.bash_history quit, all subsequent programs trying to access
> ~/.bash_history will actually get to ~/.nfs000000160000351200000001?

No, after the file has been moved, if you open ~/.bash_history, that open
will be for a new separate file.
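
The distinction can be illustrated with a small sketch. Note the
assumptions: it runs in a local temporary directory on a POSIX system, not
on an NFS mount, so the old file simply loses its name instead of showing up
as .nfsXXXX; the function name and file contents are made up for
illustration.

```python
import os
import tempfile


def reopen_after_delete():
    """Show that a path re-created after deletion is a new, separate file.

    Uses a local temporary directory (an assumption for illustration);
    an NFS client would keep the old file visible as .nfsXXXX until the
    last process holding it open closes it.
    """
    with tempfile.TemporaryDirectory() as home:
        path = os.path.join(home, ".bash_history")
        with open(path, "w") as f:
            f.write("old history\n")

        old = open(path)                  # some process still holds the file open
        old_inode = os.fstat(old.fileno()).st_ino

        os.unlink(path)                   # the name is removed elsewhere
        with open(path, "w") as f:        # a later open creates a new file
            f.write("new history\n")

        new_inode = os.stat(path).st_ino
        old_data = old.read()             # the old handle still sees the old data
        old.close()
        with open(path) as f:
            new_data = f.read()
        return old_inode != new_inode, old_data, new_data
```

The first returned value reports whether the two files have different inode
numbers; the other two are the contents seen through the old handle and
through the freshly opened path.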
