Understanding libiconv
This section covers code set conversion with the iconv application programming interface (API).
The iconv API consists of the following subroutines that accomplish conversion:
- iconv_open
- Performs the initialization required to convert characters from the code set specified by the FromCode parameter to the code set specified by the ToCode parameter. The strings that are valid for these parameters depend on the converters installed on the system. If initialization is successful, a conversion descriptor of type iconv_t is returned in its initial state.
- iconv
- Invokes the converter function using the descriptor obtained from the iconv_open subroutine. The inbuf parameter points to the first character in the input buffer, and the inbytesleft parameter indicates the number of bytes to the end of the buffer being converted. The outbuf parameter points to the first available byte in the output buffer, and the outbytesleft parameter indicates the number of available bytes to the end of the buffer.
For state-dependent encoding, the subroutine is placed in its initial state by a call for which the inbuf value is a null pointer. Subsequent calls with the inbuf parameter as something other than a null pointer cause the internal state of the function to be altered as necessary.
- iconv_close
- Closes the conversion descriptor specified by the cd variable and releases the resources associated with it so that they can be reused.
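The three subroutines are typically used together in an open, convert, close sequence. The following is a minimal sketch of that sequence, not a definitive implementation. The helper name `convert_buffer` is hypothetical, and the code set names used in the usage note (`ISO-8859-1`, `UTF-8`) are assumptions: the names that are accepted depend on the converters installed on the system. The sketch also assumes the POSIX prototype in which inbuf has type `char **`.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of the iconv API): converts `inlen` bytes
 * of `in` from code set `from` to code set `to`, writing into `out`.
 * Returns the number of bytes written, or (size_t)-1 on any failure. */
size_t convert_buffer(const char *to, const char *from,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    /* Note the argument order: ToCode first, then FromCode. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* for example, converter not installed */

    char *inbuf = (char *)in;       /* iconv takes char **, input is not modified */
    size_t inbytesleft = inlen;
    char *outbuf = out;
    size_t outbytesleft = outcap;
    size_t result = (size_t)-1;

    /* iconv advances both pointers and decrements both byte counts. */
    size_t rc = iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft);

    /* A null inbuf returns the descriptor to its initial state; with a
     * non-null outbuf, any pending shift sequence is written as well. */
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outbuf, &outbytesleft);

    if (rc != (size_t)-1)
        result = outcap - outbytesleft;

    iconv_close(cd);                /* always release the descriptor */
    return result;
}
```

For example, on a system where both converters are available, calling `convert_buffer("UTF-8", "ISO-8859-1", "caf\xe9", 4, out, sizeof out)` with a 16-byte `out` buffer writes the 5 bytes `caf\xc3\xa9`. If the output buffer is too small, the iconv call fails and the helper returns `(size_t)-1`; a production caller would inspect errno (E2BIG, EILSEQ, EINVAL) and retry or report accordingly.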
In a network environment, the following factors determine how data should be converted:
- Code sets of the sender and the receiver
- Communication protocol (8-bit or 7-bit data)
The following tables outline the conversion methods and recommend how to convert data in different situations. The first table applies when communicating with a system that uses the same code set.
Method | 7-bit only protocol | 8-bit protocol |
---|---|---|
as is | not valid | best choice |
fold7 | OK | OK |
fold8 | not valid | OK |
uucode | best choice | OK |
The following table applies when communicating with a system that uses a different code set, or when the receiver's code set is unknown.
Method | 7-bit only protocol | 8-bit protocol |
---|---|---|
as is | not valid | not valid if remote code set is unknown |
fold7 | best choice | OK |
fold8 | not valid | best choice |
uucode | not valid | not valid |
If the sender uses the same code set as the receiver, the following possibilities exist:
- When the protocol allows 8-bit data, the data can be sent without conversion.
- When the protocol allows only 7-bit data, the 8-bit code points must be mapped to 7-bit values. Use the iconv interface and one of the following methods:
- uucode
- Provides the same mapping as the uuencode and uudecode commands. This is the recommended method.
- 7-bit
- Converts internal code sets using 7-bit data. This method passes ASCII without any change.
If the sender uses a code set different from the receiver, there are two possibilities:
- When the protocol allows only 7-bit data, use the fold7 method.
- When the protocol allows 8-bit data and you know the receiver's code set, use the iconv interface to convert the data. If you do not know the receiver's code set, use the following method:
- 8-bit
- Converts internal code sets to standard interchange formats. The 8-bit data is transmitted and the information is preserved so that the receiver can reconstruct the data in its code set.
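The selection rules above can be summarized as a small decision function. This is an illustrative sketch only: the enum, the function `choose_method`, and its parameters are hypothetical names and not part of the iconv API. It also assumes that the "7-bit" and "8-bit" methods described in the text correspond to the fold7 and fold8 rows of the tables.

```c
#include <stdbool.h>

/* Hypothetical labels for the choices described in this section. */
enum transfer_method {
    METHOD_AS_IS,        /* send the data unchanged */
    METHOD_UUCODE,       /* same mapping as uuencode/uudecode */
    METHOD_FOLD7,        /* 7-bit interchange format */
    METHOD_FOLD8,        /* 8-bit interchange format */
    METHOD_ICONV_DIRECT  /* convert straight to the receiver's code set */
};

/* Returns the recommended method for a given situation.
 * same_code_set:  sender and receiver use the same code set
 * eight_bit:      the protocol carries 8-bit data (false means 7-bit only)
 * receiver_known: the receiver's code set is known to the sender */
enum transfer_method choose_method(bool same_code_set, bool eight_bit,
                                   bool receiver_known)
{
    if (same_code_set)
        return eight_bit ? METHOD_AS_IS : METHOD_UUCODE;

    if (!eight_bit)
        return METHOD_FOLD7;

    /* Different code sets over an 8-bit protocol. */
    return receiver_known ? METHOD_ICONV_DIRECT : METHOD_FOLD8;
}
```

A sender might call `choose_method(false, true, false)` when it talks over an 8-bit protocol to a peer whose code set it does not know; under the assumptions above, the function returns `METHOD_FOLD8`.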