RFC Errata


RFC 5661, "Network File System (NFS) Version 4 Minor Version 1 Protocol", January 2010

Source of RFC: nfsv4 (tsv)

Errata ID: 2751

Status: Reported
Type: Technical

Reported By: Ricardo Labiaga
Date Reported: 2011-03-21

Throughout the document, when it says:


It should say:

12.5.4.1.  LAYOUTCOMMIT and change/time_modify
becomes
12.5.4.2.  LAYOUTCOMMIT and change/time_modify


12.5.4.2.  LAYOUTCOMMIT and size
becomes
12.5.4.3.  LAYOUTCOMMIT and size


12.5.4.3.  LAYOUTCOMMIT and layoutupdate
becomes
12.5.4.4.  LAYOUTCOMMIT and layoutupdate


Add new Section 
12.5.4.1 Implications of LAYOUTCOMMIT on file layouts
For file layouts, WRITEs to a Data Server that return a stable_how4 value of 
FILE_SYNC4 guarantee that data and file system metadata are on stable 
storage.  This means that a LAYOUTCOMMIT is not needed in order to make the 
data and metadata visible to the metadata server and other clients.

For file layouts, when a WRITE to the data server returns UNSTABLE4 or 
DATA_SYNC4 and the NFL4_UFLG_COMMIT_THRU_MDS flag is set, the client MUST 
send the COMMIT to the metadata server.  A successful COMMIT to the metadata 
server guarantees that data and file system metadata are on stable storage.  
Therefore, any time that NFL4_UFLG_COMMIT_THRU_MDS is set, a LAYOUTCOMMIT (of 
the byte range specified by the layout) is not needed.

For file layouts, when the NFL4_UFLG_COMMIT_THRU_MDS flag is not set and a 
WRITE or COMMIT to the data server returns DATA_SYNC4, the client MUST send a 
LAYOUTCOMMIT to the metadata server in order to synchronize file metadata.

The following table summarizes when a LAYOUTCOMMIT is needed, and the 
effects of a COMMIT to a data server and to the metadata server.

+------------+------------+------------+------------+----------+
| NFL4_UFLG_ | WRITE to   | Meaning of | Meaning    | LAYOUT   |
| COMMIT_    | DS returns | COMMIT to  | of COMMIT  | COMMIT   |
| THRU_MDS   |            | DS         | to MDS     | required |
+------------+------------+------------+------------+----------+
| Not Set    | UNSTABLE4  | DATA_SYNC4 | Nothing    | Yes      |
| Not Set    | DATA_SYNC4 | Nothing    | Nothing    | Yes      |
| Not Set    | FILE_SYNC4 | Nothing    | Nothing    | No       |
| Set        | UNSTABLE4  | Nothing    | FILE_SYNC4 | No       |
| Set        | DATA_SYNC4 | Nothing    | FILE_SYNC4 | No       |
| Set        | FILE_SYNC4 | Nothing    | Nothing    | No       |
+------------+------------+------------+------------+----------+
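
As an illustration only (not part of the proposed RFC text), the table can be 
read as a small decision function.  The sketch below is a minimal Python 
rendering under stated assumptions: the names Stable and flush_plan and the 
"DS"/"MDS" strings are hypothetical and do not come from any NFS 
implementation.

```python
from enum import Enum


class Stable(Enum):
    """stable_how4 values a data server can return from WRITE."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def flush_plan(commit_thru_mds: bool, write_result: Stable) -> dict:
    """Return the follow-up operations the table implies (hypothetical helper).

    commit_thru_mds: whether NFL4_UFLG_COMMIT_THRU_MDS is set on the layout.
    write_result:    stability level returned by the WRITE to the data server.
    """
    if write_result is Stable.FILE_SYNC4:
        # Data and metadata are already on stable storage.
        return {"commit_to": None, "layoutcommit": False}
    if commit_thru_mds:
        # COMMIT goes to the metadata server and acts like FILE_SYNC4.
        return {"commit_to": "MDS", "layoutcommit": False}
    # Flag not set: only UNSTABLE4 data still needs a COMMIT to the data
    # server, and metadata is synchronized by LAYOUTCOMMIT in both cases.
    commit_to = "DS" if write_result is Stable.UNSTABLE4 else None
    return {"commit_to": commit_to, "layoutcommit": True}
```

For example, flush_plan(False, Stable.UNSTABLE4) corresponds to the first row 
of the table: COMMIT to the data server, then LAYOUTCOMMIT.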

Note that a client can always demand FILE_SYNC4 or DATA_SYNC4 in WRITE's 
arguments.  Also note that specifying these stability levels may adversely 
impact performance.

If a LAYOUTCOMMIT is required, it should be sent before CLOSE to maintain 
close-to-open semantics.  It should also be sent before LOCKU, 
OPEN_DOWNGRADE, and LAYOUTRETURN, and when the application issues fsync() [25].  
In addition, a required LAYOUTCOMMIT should be sent periodically to keep the 
file size and modification time synchronized.  This allows use cases like 
tail -f [56], which copies its input file to the standard output and updates 
the output as new lines become available in the input file.  It is up to the 
client implementation to determine how frequently LAYOUTCOMMIT is issued.  
Possible policies include every Nth COMMIT to a data server, every N units 
of time, or after writing a stripe to a set of data servers.
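
As an illustration only (not part of the proposed RFC text), two of the 
policies named above, every Nth COMMIT and a time interval, could be combined 
as in the following Python sketch.  The class name LayoutCommitPolicy, its 
method names, and the default thresholds are assumptions made for this 
example; the erratum leaves the policy to the client implementation.

```python
import time


class LayoutCommitPolicy:
    """Decide when a periodic LAYOUTCOMMIT is due (hypothetical helper)."""

    def __init__(self, every_n_commits=8, max_interval_s=5.0,
                 clock=time.monotonic):
        self.every_n = every_n_commits      # assumed threshold, not from the RFC
        self.max_interval = max_interval_s  # assumed threshold, not from the RFC
        self.clock = clock
        self._commits = 0                   # COMMITs to data servers since last LAYOUTCOMMIT
        self._last = clock()                # time of the last LAYOUTCOMMIT

    def on_ds_commit(self) -> bool:
        """Record one COMMIT to a data server; return True if LAYOUTCOMMIT is due."""
        self._commits += 1
        return self.due()

    def due(self) -> bool:
        """True once either the COMMIT count or the time interval is reached."""
        return (self._commits >= self.every_n
                or (self.clock() - self._last) >= self.max_interval)

    def mark_sent(self) -> None:
        """Reset the counters after a LAYOUTCOMMIT has been sent."""
        self._commits = 0
        self._last = self.clock()
```

Independent of any such policy, a required LAYOUTCOMMIT would still be sent 
before CLOSE, LOCKU, OPEN_DOWNGRADE, and LAYOUTRETURN, and on fsync().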

Even if a required LAYOUTCOMMIT is not issued by the client, the data servers 
and the metadata server have a set of responsibilities to fulfill in order to 
guarantee data consistency:
1) Data servers MUST commit data and synchronize modification and size 
attributes with the metadata server before a layout is revoked as described in 
section 12.5.4.
2) Data servers SHOULD commit data and synchronize modification and size 
attributes with the metadata server after the metadata server reboots.  In 
theory the client should commit the data, but this avoids the problem where 
both the client and metadata server crash at the same time.
3) The metadata server MAY periodically poll data servers to synchronize 
modification and size attributes.


Section 13.9.2.3 says:
   For the NFSv4.1-based data storage protocol, it is necessary to 
re-synchronize state such as the size attribute, and the setting of 
mtime/change/atime.

Should say:
   For the NFSv4.1-based data storage protocol, it may be necessary to re-
synchronize state such as the size attribute, and the setting of 
mtime/change/atime.


Section 13.10 says:
   For the case above, this means that a LAYOUTCOMMIT will be done at close 
(along with the data WRITEs) and will update the file's size and change 
attribute.

Should say:
   For the case above, this means that, if necessary, a LAYOUTCOMMIT will be 
done at close (along with the data WRITEs) and will update the file's size and 
change attribute.


Section 18.3.4 says:
   The COMMIT operation is similar in operation and semantics to the POSIX 
fsync() [25] system interface that synchronizes a file's state with the disk 
(file data and metadata is flushed to disk or stable storage).  COMMIT 
performs the same operation for a client, flushing any unsynchronized data and 
metadata on the server to the server's disk or stable storage for the 
specified file.

Should say:
   The COMMIT operation is similar in operation and semantics to the POSIX 
fsync() [25] system interface that synchronizes a file's state with the disk 
(file data and metadata is flushed to disk or stable storage).  COMMIT 
performs the same operation for a client, flushing any unsynchronized data and 
metadata on the server to the server's disk or stable storage for the 
specified file.  When using pNFS, if a WRITE returned UNSTABLE4 and 
NFL4_UFLG_COMMIT_THRU_MDS is not set, then the client MUST COMMIT to the data 
server.  The COMMIT may result in flushing the data but not the metadata.  In 
this case, the metadata MUST be flushed with a subsequent LAYOUTCOMMIT to the 
metadata server.  A complete set of pNFS rules for flushing data and metadata 
is described in section 12.5.4.1.


Section 18.3.4 says:
   The above description applies to page-cache-based systems as well as buffer-
cache-based systems.  In the former systems, the virtual memory system will 
need to be modified instead of the buffer cache.

Should say:
   The above description applies to page-cache-based systems as well as buffer-
cache-based systems.  In the former systems, the virtual memory system will 
need to be modified instead of the buffer cache.

   Refer to Section 12.5.4.1 for a discussion of the effects of data stability 
levels on data servers or metadata servers.


Section 18.32.4 says:
   However, since it is possible for a WRITE to be done with a special 
stateid, the server needs to check for this case even though the client should 
have done an OPEN previously.

Should say:
   However, since it is possible for a WRITE to be done with a special 
stateid, the server needs to check for this case even though the client should 
have done an OPEN previously.

   Refer to Section 12.5.4.1 for a discussion of the effects of data stability 
levels on data servers or metadata servers.


Section 20.3.4 says:
   In the case of modified data being written while the layout is held, the 
client must use LAYOUTCOMMIT operations at the appropriate time; as required 
LAYOUTCOMMIT must be done before the LAYOUTRETURN.

Should say:
   In the case of modified data being written while the layout is held, the 
client may be required to use LAYOUTCOMMIT operations at the appropriate time; 
if LAYOUTCOMMIT is required, it must be done before the LAYOUTRETURN.


Add new informative reference to Section 23.2
[56] The Open Group, "Section 'tail' of The Open Group Base Specifications 
Issue 6, IEEE Std 1003.1, 2004 Edition", HTML Version (www.opengroup.org), 
ISBN 1931624453, 2004.


Notes:

A new section describing the implications of LAYOUTCOMMIT on file layouts is
defined in this erratum, along with updates to existing sections of the
specification.  The technical details in this erratum were agreed upon at the
IETF Interim Meeting in Sunnyvale, CA on Feb 18-19, 2011.
