Discussion:
Stale NFS file handle
Garry Trethewey
2010-05-10 05:38:05 UTC
Hello everyone.

Attempting
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1

I'm getting a "stale file handle" error as I try to copy from one box to
another.

cp: cannot stat `/media/nfs_saa (long_path) /181.JPG': Stale NFS file handle

http://www.cyberciti.biz/tips/nfs-stale-file-handle-error-and-solution.html
says
"A filehandle becomes stale whenever the file or directory referenced by
the handle is removed by another host, while your client still holds an
active reference to the object. A typical example occurs when the
current directory of a process, running on your client, is removed on
the server (either by a process running on the server or on another
client)."

No files are getting removed, I can't see why anything about the network
suddenly changes. I only have 2 boxes, not anything big & complicated.

I reboot both boxes, continue the copy for a while, having success with
files that failed before, and then the same error occurs.

Just to check nomenclature,
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1
         the_other_box                this_box
         server                       client
         host
         jaunty                       hardy


I can't see why "another host" is relevant. As I said, I only have 2
boxes, and static IP addresses, eg 10.0.0.9

Unless my bigpond router or some stray wireless setting (which I don't
understand and so don't use) is interfering somehow?

There's plenty of advice on the web about how to fix once this has
happened, but I can't see why it happens in the first place.


Any ideas?


regards

------------------------------------
Garry Trethewey
------------------------------------
Thomas Sprinkmeier
2010-05-11 05:47:38 UTC
Post by Garry Trethewey
Hello everyone.
Attempting
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1
Something like
tar --create --verbose --file - --directory /media/nfs_saami_sda8/data/ . \
  | tar --extract --file - --directory /media/wdtb1
should do the same and may be more resilient.
Post by Garry Trethewey
I'm getting a "stale file handle" error as I try to copy from one box to
another.
cp: cannot stat `/media/nfs_saa (long_path) /181.JPG': Stale NFS file handle
how long is (long_path)?
overflowing a buffer somewhere? exceeding MAXPATHLEN?

Depending on where you look the max path seems to be one of:

/usr/include/linux/nfs.h
13:#define NFS_MAXPATHLEN 1024

/usr/include/bits/posix1_lim.h
97:#define _POSIX_PATH_MAX 256

/usr/include/linux/limits.h
12:#define PATH_MAX 4096
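
One way to answer "how long is (long_path)?" directly is to measure it. The following is only a minimal sketch: the function name `longest_path` is made up for illustration, it assumes no file name contains a newline, and it runs awk under `LC_ALL=C` so that lengths are counted in bytes, which is how the limits above are expressed.

```shell
# Hypothetical helper (not from the thread): print the byte length and the
# pathname of the longest path found under the directory given as $1.
# Assumes no pathname contains a newline.
longest_path() {
    find "$1" -print | LC_ALL=C awk '
        length($0) > max { max = length($0); path = $0 }
        END { print max, path }'
}

# Example use against the NFS mount from the original post:
# longest_path /media/nfs_saami_sda8/data
```

If the number printed is well below 1024, the path-length theory can probably be set aside.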
Martin Ebourne
2010-05-11 06:31:56 UTC
Post by Thomas Sprinkmeier
Post by Garry Trethewey
Hello everyone.
Attempting
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1
Something like
tar --create --verbose --file - --directory /media/nfs_saami_sda8/data/ . \
  | tar --extract --file - --directory /media/wdtb1
should do the same and may be more resilient.
Post by Garry Trethewey
I'm getting a "stale file handle" error as I try to copy from one box to
another.
cp: cannot stat `/media/nfs_saa (long_path) /181.JPG': Stale NFS file handle
how long is (long_path)?
overflowing a buffer somewhere? exceeding MAXPATHLEN?
/usr/include/linux/nfs.h
13:#define NFS_MAXPATHLEN 1024
/usr/include/bits/posix1_lim.h
97:#define _POSIX_PATH_MAX 256
/usr/include/linux/limits.h
12:#define PATH_MAX 4096
On Fedora 13 client and server using NFS 4 I get the following:

Path name length
4082: works fine over nfs and locally
4094: gives filename too long error for nfs but works locally
4106: zsh crashes even if you echo the filename which is a builtin, let
alone ls or cd

I'm able to cd into a directory 5906 long (using relative paths in
chunks, didn't try any deeper) and ls/pwd/tab-complete even over nfs,
seems to work fine.

Zsh clearly has an exponential algorithm for completing pathnames; it
slows down dramatically after a few dozen nested dirs.

Back on the original issue I've seen NFS connections die when using tcp
rather than udp on Linux giving stale filehandles and other nasties,
usually when they are heavily stressed transferring lots of data (as
here). Certainly seem to be some bugs in there that need ironing out
(I've seen this on & off over the last 4-5 years including recently). I
switched to cifs for a couple of years for my media drives because it
occurred too frequently, only switched back to nfs again this year.
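
The thread does not say which transport the mount in question uses, so a first step might be to check. This is a rough sketch, assuming a Linux client where `/proc/mounts` lists a `proto=` option for NFS mounts; the function name `nfs_transport` is invented for this example.

```shell
# Hypothetical helper (not from the thread): list each NFS mount point with
# the transport named in its "proto=" mount option. Reads a file in
# /proc/mounts format; defaults to /proc/mounts itself (Linux only).
nfs_transport() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        proto = "unknown"
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^proto=/)
                proto = substr(opt[i], 7)
        print $2, proto
    }' "${1:-/proc/mounts}"
}

# Example use:
# nfs_transport
```

To try the other transport, the nfs(5) mount options `proto=udp` and `proto=tcp` can be passed with `mount -o` or in the options field of /etc/fstab; whether that makes the stale handles go away on this setup is untested.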

Cheers,
Martin
Thomas Sprinkmeier
2010-05-11 07:00:41 UTC
Post by Thomas Sprinkmeier
Post by Garry Trethewey
Hello everyone.
Attempting
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1
Something like
tar --create --verbose --file - --directory /media/nfs_saami_sda8/data/ . \
  | tar --extract --file - --directory /media/wdtb1
should do the same and may be more resilient.
Post by Garry Trethewey
I don't think the prob lies with cp versus tar. They both have to access
files that the os can't deal with.
yes, but tar might be able to keep going and process the rest of the
files (with the usual 'continued after errors' disclaimer at the end)
Post by Garry Trethewey
I've always wondered what a maximum path length was.
...
But no, mine aren't that long.
ssh victim tar -cf - ... | tar -xf -

take NFS out of the loop?

(does not address the error, but might solve your problem)
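
Spelled out a little more, the receiving half of that pipe could look like the sketch below. The helper name `untar_into` and the directory `/path/to/data` are placeholders, not anything from the posts above, and the ssh line assumes key or password login to the other box already works.

```shell
# Hypothetical helper (not from the thread): unpack a tar stream arriving on
# standard input into the directory given as $1, preserving permissions.
untar_into() {
    tar -x -p -f - -C "$1"
}

# Sketch of use; "victim" is the placeholder host name from the post above
# and /path/to/data stands in for the real directory on that machine:
# ssh victim 'tar -c -f - -C /path/to/data .' | untar_into /media/wdtb1
```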


Thomas
Garry Trethewey
2010-05-11 07:11:32 UTC
Post by Thomas Sprinkmeier
ssh victim tar -cf - ... | tar -xf -
take NFS out of the loop?
Oh I see, another kind of network. I'll read a bit about it. Similarly
with tcp, udp, cifs. Thanks all.
------------------------------------
Garry Trethewey
------------------------------------
Martin Ebourne
2010-05-11 07:45:46 UTC
Post by Garry Trethewey
Post by Thomas Sprinkmeier
ssh victim tar -cf - ... | tar -xf -
take NFS out of the loop?
Oh I see, another kind of network. I'll read a bit about it. Similarly
with tcp, udp, cifs. Thanks all.
If you're going to be cutting nfs out of the loop you probably want to
be using rsync. "man rsync" or google for lots of info.

rsync is a program (and its own protocol) and is usually the best way of
getting directory structures between hosts over a network whether you do
it once or incrementally over many times.
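
As a starting point, something like the following might do. It is a sketch only: the wrapper name `pull_tree`, the login `user`, and `/path/to/data` are placeholders, and 10.0.0.9 is just the example address mentioned earlier in the thread.

```shell
# Hypothetical wrapper (not from the thread): copy the tree at $1 into $2.
# -a recurses and preserves permissions, times and symlinks; -v lists files;
# --partial keeps partly transferred files, which a later run can reuse.
pull_tree() {
    rsync -a -v --partial "$1" "$2"
}

# Sketch of use, run on the receiving box. The trailing slash on the source
# means "the contents of this directory" rather than the directory itself:
# pull_tree user@10.0.0.9:/path/to/data/ /media/wdtb1/
```

Running the same command again after an interruption only transfers what is missing or changed, which is the main practical difference from the cp and tar approaches.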

Regarding the others they are different kinds of things:

cifs is an alternative network filesystem to nfs (it's actually the MS
windows share protocol but can be used on unix with samba even including
unix acls using HP extensions).

tcp and udp are the two main fundamental tcp/ip protocols, almost
everything (DNS, HTTP, SSH, NFS, CIFS, and the rest) is built on top of
these two basic layers. NFS is unusual in that these days it can work on
top of either tcp or udp and you get to choose. udp is traditional, tcp
is more modern (mainly because it's more firewall friendly).

Cheers,
Martin

Garry Trethewey
2010-05-11 06:18:48 UTC
Post by Thomas Sprinkmeier
Post by Garry Trethewey
Hello everyone.
Attempting
cp -pruv /media/nfs_saami_sda8/data/* /media/wdtb1
Something like
tar --create --verbose --file - --directory /media/nfs_saami_sda8/data/ . \
  | tar --extract --file - --directory /media/wdtb1
should do the same and may be more resilient.
I don't think the prob lies with cp versus tar. They both have to access
files that the os can't deal with.
Post by Thomas Sprinkmeier
Post by Garry Trethewey
I'm getting a "stale file handle" error as I try to copy from one box to
another.
cp: cannot stat `/media/nfs_saa (long_path) /181.JPG': Stale NFS file handle
how long is (long_path)?
overflowing a buffer somewhere? exceeding MAXPATHLEN?
/usr/include/linux/nfs.h
13:#define NFS_MAXPATHLEN 1024
/usr/include/bits/posix1_lim.h
97:#define _POSIX_PATH_MAX 256
/usr/include/linux/limits.h
12:#define PATH_MAX 4096
I've always wondered what a maximum path length was. And I just checked
max filename length is just over 250, (don't bother reading the
following lines too carefully :-)

garry@toybox-hardy:~$ touch
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
touch: cannot touch
`12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890':
File name too long

But no, mine aren't that long.
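
For anyone who would rather ask the system than count digits, `getconf` can report the limits for a given directory. A small sketch follows; the helper names `show_limits` and `make_name` are invented here, and the values printed depend on the filesystem behind the directory.

```shell
# Hypothetical helper (not from the thread): report the limits that the
# filesystem holding directory $1 advertises.
show_limits() {
    echo "NAME_MAX: $(getconf NAME_MAX "$1")"
    echo "PATH_MAX: $(getconf PATH_MAX "$1")"
}

# Hypothetical helper: print a name made of $1 copies of the letter "x",
# handy for probing the limit with touch.
make_name() {
    awk -v n="$1" 'BEGIN { while (n-- > 0) printf "x"; print "" }'
}

# Example use:
# show_limits /media/wdtb1
# touch "/media/wdtb1/$(make_name 255)"
```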

regards
------------------------------------
Garry Trethewey
------------------------------------