Considerations beyond those of Enhanced ASCII

Because POSIX standards apply, z/OS® UNIX still treats text that is to be (or has been) converted as a series of bytes. Read and write request sizes, return values, file offsets, file sizes, and so on, remain byte-designated values. However, because one byte does not always equal one character, complications can result when the logical file system (LFS) is requested to perform text conversion. For example, on a write operation, a program might receive a return value equal to the number of bytes that it requested, even though z/OS UNIX wrote a different number of bytes because the data was converted. Typically, a program is not aware of this difference, but the file size grows by the converted length rather than by the requested length. As a result, a large write operation might fail because the maximum number of bytes that are allowed (2 GB by default for a regular file) was exceeded, even though the program specified less than 2 GB.
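The difference between the requested byte count and the stored byte count can be illustrated with a minimal sketch. It uses Python's built-in codecs as a stand-in for the conversion that z/OS UNIX performs; the function name converted_size and the choice of UTF-8 and ISO 8859-1 (latin-1) are assumptions made for illustration only, not the code pages or interfaces that z/OS UNIX uses.

```python
def converted_size(data: bytes, source: str, target: str) -> int:
    """Return the byte length of data after converting it from source to target.

    Hypothetical helper: it only models the size change caused by conversion.
    """
    return len(data.decode(source).encode(target))


text = "café"
utf8_bytes = text.encode("utf-8")  # the character 'é' occupies two bytes in UTF-8

# Number of bytes the program asks to write.
requested = len(utf8_bytes)

# Number of bytes that would be stored after conversion to a single-byte code page.
stored = converted_size(utf8_bytes, "utf-8", "latin-1")

# Conversion can also enlarge the data: UTF-16 uses two bytes for each of these characters.
enlarged = converted_size(b"abc", "utf-8", "utf-16-le")
```

In this sketch, requested and stored differ by one byte, and enlarged is twice the size of its input. A program that tracks file sizes or offsets by summing its own request sizes would therefore drift from the real values whenever conversion is active.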

Incomplete characters at the end of an I/O stream might cause problems. This situation occurs, for example, when z/OS UNIX receives data that does not end on a character boundary. z/OS UNIX tries to resolve this problem by caching the trailing bytes of the data until the next I/O operation. However, unconverted data is not hardened. Consequently, an fsync operation does not cause a partial character to be written to the file. Likewise, closing a file while z/OS UNIX is holding a partial character causes that partial character to be lost. Partial characters occur only when multibyte character sets are being converted.
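The caching behavior described above resembles what an incremental decoder does. The following sketch uses Python's codecs.getincrementaldecoder to model it; the class CachingConverter and its method names are hypothetical, and the code illustrates the general technique rather than the z/OS UNIX implementation.

```python
import codecs


class CachingConverter:
    """Hypothetical model of a converter that holds back incomplete characters."""

    def __init__(self, source: str = "utf-8") -> None:
        self._decoder = codecs.getincrementaldecoder(source)()

    def feed(self, chunk: bytes) -> str:
        # Convert every complete character; trailing partial bytes stay cached.
        return self._decoder.decode(chunk)

    def pending(self) -> bytes:
        # Bytes received but not yet converted (and so not yet written anywhere).
        return self._decoder.getstate()[0]

    def close(self) -> bytes:
        # Closing discards whatever is still cached; return it to show what is lost.
        lost = self.pending()
        self._decoder.reset()
        return lost


data = "naïve".encode("utf-8")   # 'ï' is the two-byte sequence C3 AF
first, second = data[:3], data[3:]  # the split falls in the middle of 'ï'

conv = CachingConverter()
part1 = conv.feed(first)     # only "na" is converted; the byte C3 is held back
held = conv.pending()
part2 = conv.feed(second)    # the cached byte is combined with AF to form 'ï'
```

If close were called immediately after the first feed, the cached byte would be discarded, which corresponds to the lost partial character described above. A program can avoid this outcome by ensuring that the final write before a close ends on a character boundary.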

Additionally, when certain multibyte character sets are being converted, an lseek operation can cause the file offset to jump to a location in the file that is not on a character boundary or that contains a different character set. Such a jump can cause a subsequent read or write operation to fail with an I/O error or a conversion error. Sequential reading and writing is the preferred I/O method when different character sets exist in the file.
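The effect of seeking to an offset that is not on a character boundary can be sketched as follows. The example uses an in-memory byte stream and Python's UTF-8 codec as stand-ins; the function read_text_at is hypothetical, and the exception shown is Python's analogue of a conversion error, not the error that z/OS UNIX returns.

```python
import io


def read_text_at(stream: io.BytesIO, offset: int, encoding: str = "utf-8") -> str:
    """Seek to a byte offset, read to the end, and convert the bytes to text.

    Hypothetical helper: raises UnicodeDecodeError if the offset falls
    inside a multibyte character.
    """
    stream.seek(offset)
    return stream.read().decode(encoding)


# Three bytes: 'a' (one byte) followed by 'é' (two bytes, C3 A9).
stream = io.BytesIO("aé".encode("utf-8"))

# Offset 1 is a character boundary, so conversion succeeds.
on_boundary = read_text_at(stream, 1)

# Offset 2 lands on the second byte of 'é', so conversion fails.
try:
    read_text_at(stream, 2)
    mid_character_failed = False
except UnicodeDecodeError:
    mid_character_failed = True
```

Reading the same stream sequentially from offset 0 never encounters this problem, because every read continues from the end of the previously converted character.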