Configuring name normalization in CephFS

Learn how to set directory entry name normalization and the ceph.dir.charmap virtual attribute in CephFS.

How normalization works

CephFS uses Unicode normalization so that directory entry names that look identical but are encoded differently are treated as the same name. It supports the following Unicode normalization forms:
  • nfd: Canonical decomposition (default)
  • nfc: Canonical decomposition followed by canonical composition
  • nfkd: Compatibility decomposition
  • nfkc: Compatibility decomposition followed by canonical composition
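
To make the difference between the four forms concrete, the following sketch uses Python's standard unicodedata module, not CephFS itself, to normalize a short sample string. The sample string and the codepoints helper are illustrative assumptions. Python expects the form names in uppercase, while the CephFS values listed above are lowercase.

```python
import unicodedata

def codepoints(s):
    """Render a string as a space-separated list of code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)

composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent
ligature = "\ufb01"     # the "fi" ligature, a compatibility character

# Normalize the same sample under each form.
forms = ("NFD", "NFC", "NFKD", "NFKC")
results = {form: unicodedata.normalize(form, composed + ligature) for form in forms}

for form, value in results.items():
    print(f"{form.lower():5} {codepoints(value)}")
```

The canonical forms (nfd, nfc) leave the ligature untouched and differ only in whether "é" is stored as one code point or two. The compatibility forms (nfkd, nfkc) additionally replace the ligature with the plain letters "f" and "i".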

When a client performs a path traversal or lookup, it applies the configured normalization form to each directory entry name before sending the request to the Ceph Metadata Server (MDS). The MDS stores only the normalized names and uses them for comparisons and lookups.

Note: If you set the normalization value to an empty string, the MDS restores the default, nfd.
Note: The MDS maintains alternate_name metadata (also used for encryption) for directory entries, which allows the client to persist the original, un-normalized name used by the application. The MDS does not interpret this metadata; clients use it only to reconstruct the original name of the directory entry.
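
The behavior described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: ToyDirectory and its methods are hypothetical names, the dictionary stands in for the MDS, and none of it reflects the actual CephFS client or MDS code. It shows the two ideas from this section: entries are keyed by their normalized name, and the original name is kept alongside as opaque data.

```python
import unicodedata

class ToyDirectory:
    """Toy model of a directory with name normalization (not Ceph code)."""

    def __init__(self, form="nfd"):
        # unicodedata expects uppercase form names such as "NFD".
        self.form = form.upper()
        # Maps normalized name -> original name (plays the role of alternate_name).
        self.entries = {}

    def _normalize(self, name):
        return unicodedata.normalize(self.form, name)

    def create(self, name):
        """Store the entry under its normalized name and keep the original."""
        key = self._normalize(name)
        if key in self.entries:
            raise FileExistsError(name)
        self.entries[key] = name
        return key

    def lookup(self, name):
        """Normalize before comparing, then return the original name or None."""
        return self.entries.get(self._normalize(name))

d = ToyDirectory("nfd")
d.create("caf\u00e9")            # created with a precomposed "é"
found = d.lookup("cafe\u0301")   # looked up with "e" + combining accent
print(found == "caf\u00e9")
```

Because both spellings normalize to the same key, the lookup succeeds, and a second create call with the other spelling is rejected as a duplicate in this model.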

Before you begin

Make sure that the following prerequisites are in place:
  • CephFS must be configured and running.
  • The target directory must be empty and must not be part of a snapshot.
  • You must have appropriate authorization to modify extended attributes on the directory.

Procedure

  1. Run the setfattr command to configure the desired normalization form for a directory.
    setfattr -n ceph.dir.normalization -v <normalization> <directory>/ 
    For example:
    setfattr -n ceph.dir.normalization -v nfc foo/
  2. Verify the normalization setting.
    getfattr -n ceph.dir.normalization foo/
    Example output:
    # file: foo/
    ceph.dir.normalization="nfc"
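
If you prefer to script the procedure, the same two steps can be sketched in Python with os.setxattr and os.getxattr, which are available on Linux. The helper name set_normalization and the ALLOWED tuple are assumptions made for this example. The helper accepts only the lowercase values shown in this document, plus the empty string that restores the default, and it has not been verified against a live CephFS mount.

```python
import os

# Attribute name taken from the procedure above.
ATTR = "ceph.dir.normalization"
# Values listed in this document; "" asks the MDS to restore the default.
ALLOWED = ("nfd", "nfc", "nfkd", "nfkc", "")

def set_normalization(directory, form):
    """Set the normalization form on a directory and read the value back."""
    if form not in ALLOWED:
        raise ValueError(f"unsupported normalization form: {form!r}")
    # Equivalent of: setfattr -n ceph.dir.normalization -v <form> <directory>
    os.setxattr(directory, ATTR, form.encode("ascii"))
    # Equivalent of: getfattr -n ceph.dir.normalization <directory>
    return os.getxattr(directory, ATTR).decode("ascii")
```

On a mounted CephFS file system, calling set_normalization("foo/", "nfc") corresponds to the setfattr and getfattr commands in the steps above. The same prerequisites apply to the target directory.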