ENCODING Subcommand (SAVE TRANSLATE command)

The ENCODING subcommand specifies the character encoding for SAS, Stata, tab-delimited text, and CSV data files.

Example

SAVE TRANSLATE
  /OUTFILE='/data/sasdata.sas7bdat'
  /VALFILE='/data/saslabels.sas'
  /TYPE=SAS /VERSION=7 /PLATFORM=WINDOWS
  /ENCODING='Windows-1252'.

BOM Keyword

By default, files encoded in any of the UTF formats include a byte order mark (BOM). Some applications cannot interpret the byte order mark. You can use the BOM keyword to suppress the byte order mark.

BOM=YES
Include the byte order mark in UTF files. This option is the default.
BOM=NO
Do not include the byte order mark in UTF files.
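
The effect of a byte order mark on a consuming application can be illustrated outside of SPSS Statistics. The following Python sketch is not part of the SAVE TRANSLATE command. It assumes UTF-8 data and uses a hypothetical helper, strip_utf8_bom, together with a made-up CSV header line, to show the three bytes that the mark adds and how a reader that does not expect them sees an extra character.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present.

    Hypothetical helper for illustration only.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Simulated file contents: the same CSV header with and without a BOM.
# The 'utf-8-sig' codec prepends the UTF-8 byte order mark when encoding.
with_bom = "id,name\n".encode("utf-8-sig")
without_bom = "id,name\n".encode("utf-8")

# The mark is the three bytes EF BB BF at the start of the data.
print(with_bom[:3].hex())

# A reader that decodes as plain UTF-8 keeps the mark as U+FEFF,
# so the first field name no longer compares equal to "id".
first_field = with_bom.decode("utf-8").split(",")[0]
print(first_field == "id")

# Removing the mark (or decoding with 'utf-8-sig') recovers the plain text.
print(strip_utf8_bom(with_bom) == without_bom)
print(with_bom.decode("utf-8-sig") == "id,name\n")
```

If a downstream application behaves like the plain UTF-8 reader in this sketch, writing the file with BOM=NO avoids the problem at the source.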

Character encoding values for SAS and Stata

Table 1. Character Encoding
Character Set                   Encoding
IBM® SPSS® Statistics Locale    Locale
Operating System Locale         System
Western                         ISO-8859-1
Western                         ISO-8859-15
Western                         IBM850
Western                         Windows-1252
Celtic                          ISO-8859-14
Greek                           ISO-8859-7
Greek                           Windows-1253
Nordic                          ISO-8859-10
Baltic                          Windows-1257
Central European                IBM852
Central European                ISO-8859-2
Cyrillic                        IBM855
Cyrillic                        ISO-8859-5
Cyrillic                        Windows-1251
Cyrillic/Russian                CP-866
Chinese Simplified              GBK
Chinese Simplified              ISO-2022-CN
Chinese Traditional             Big5
Chinese Traditional             EUC-TW
Japanese                        EUC-JP
Japanese                        ISO-2022-JP
Japanese                        Shift-JIS
Korean                          EUC-KR
Thai                            Windows-874
Turkish                         IBM857
Turkish                         ISO-8859-9
Arabic                          Windows-1256
Arabic                          IBM864
Hebrew                          ISO-8859-8
Hebrew                          Windows-1255
Hebrew                          IBM862