On 06/02/18 18:55, J. Bruce Fields wrote:
On Tue, Feb 06, 2018 at 06:49:28PM +0000, Terry Barnaby wrote:
> On 05/02/18 14:52, J. Bruce Fields wrote:
>>> Yet another poor NFSv3 performance issue. If I do a "ls -lR" of a certain
>>> NFS mounted directory over a slow link (NFS over Openvpn over FTTP
>>> 80/20Mbps), just after mounting the file system (default NFSv4 mount with
>>> async), it takes about 9 seconds. If I run the same "ls -lR" again, just
>>> after, it takes about 60 seconds.
>> A wireshark trace might help.
>>
>> Also, is it possible some process is writing while this is happening?
>>
>> --b.
>>
> Ok, I have made some wireshark traces and put these at:
>
> https://www.beam.ltd.uk/files/files//nfs/
>
> There are other processes running obviously, but nothing that should be
> doing anything that should really affect this.
>
> As a naive input, it looks like the client is using a cache but checking the
> update times of each file individually using GETATTR. As it is using a
> simple GETATTR per file in each directory the latency of these RPC calls is
> mounting up. I guess it would be possible to check the cache status of all
> files in a dir at once with one call that would allow this to be faster when
> a full readdir is in progress, like a "GETATTR_DIR <dir>" RPC call. The
> overhead of the extra data would probably not affect a single file check
> cache time as latency rather than amount of data is the killer.
Yeah, that's effectively what READDIR is--it can request attributes
along with the directory entries. (In NFSv4--in NFSv3 there's a
separate call called READDIRPLUS that gets attributes.)
So the client needs some heuristics to decide when to do a lot of
GETATTRs and when to instead do READDIR. Those heuristics have gotten
some tweaking over time.
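
To make the latency argument concrete, here is a rough back-of-envelope
model of the two strategies. It is only a sketch: the function names, the
file count, the round-trip time and the entries-per-reply figure are all
made-up assumptions for illustration, not numbers taken from the traces,
and it assumes the RPCs are issued strictly one after another.

```python
import math


def getattr_walk_ms(n_files, rtt_ms):
    """Cost of revalidating each file with its own GETATTR round trip."""
    return n_files * rtt_ms


def readdir_walk_ms(n_files, rtt_ms, entries_per_reply):
    """Cost when each READDIR reply carries attributes for a batch of entries."""
    replies = math.ceil(n_files / entries_per_reply)
    return replies * rtt_ms


# Hypothetical inputs: 1000 files, 30 ms round trip, 100 entries per reply.
per_file = getattr_walk_ms(1000, 30)
batched = readdir_walk_ms(1000, 30, 100)
print(f"per-file GETATTR: {per_file} ms, READDIR with attributes: {batched} ms")
```

With serial requests the per-file cost grows with the number of files times
the round-trip time, while the batched cost grows with the number of
replies, which is why latency rather than bandwidth dominates on a slow link.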
What kernel version is your client on again?
--b.
System is Fedora27, Kernel is: 4.14.16-300.fc27.x86_64 on both client
and server.
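
For anyone wanting to reproduce the first-run versus second-run timings
without relying on "ls -lR" itself, something along these lines might do.
It is a minimal sketch, assuming that a recursive walk plus an lstat() of
every entry is close enough to what "ls -lR" does; the helper names
stat_walk and time_walks are made up for this example.

```python
import os
import time


def stat_walk(root):
    """Walk the tree under root and lstat every entry; return the entry count."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                # Entry vanished or is unreadable; skip it.
                continue
            count += 1
    return count


def time_walks(root, runs=2):
    """Run stat_walk several times back to back; return (count, seconds) per run."""
    results = []
    for _ in range(runs):
        start = time.monotonic()
        count = stat_walk(root)
        results.append((count, time.monotonic() - start))
    return results
```

Calling time_walks() on the mounted directory straight after mounting and
comparing the two elapsed times should show whether the second pass is the
slow one here too.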