ENCODING Subcommand (GET STATA command)

The ENCODING subcommand specifies the character encoding of the Stata data file.

  • The encoding must be correctly identified or the file cannot be read.
  • The subcommand is followed by an optional equals sign and a quoted encoding value.
  • The quoted value can be any value listed in the Encoding column of Table 1 (Character Encoding).
  • The default encoding is "Locale", which is the encoding of the current IBM® SPSS® Statistics locale. See the topic LOCALE Subcommand (SET command) for more information.

Example

GET STATA FILE='/data/empl.dta'
  /ENCODING='Windows-1252'.
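
The variations below are a minimal sketch of the rules listed above, reusing the illustrative file path from the example. The first omits the optional equals sign. The second names the default value explicitly, which, given that "Locale" is the default, should behave the same as omitting the subcommand. ISO-8859-2 is chosen here only for illustration.

* Equals sign omitted.
GET STATA FILE='/data/empl.dta'
  /ENCODING 'ISO-8859-2'.

* Default encoding stated explicitly.
GET STATA FILE='/data/empl.dta'
  /ENCODING='Locale'.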

Table 1. Character Encoding

Character Set               Encoding
IBM SPSS Statistics Locale  Locale
Operating System Locale     System
Western                     ISO-8859-1
Western                     ISO-8859-15
Western                     IBM850
Western                     Windows-1252
Celtic                      ISO-8859-14
Greek                       ISO-8859-7
Greek                       Windows-1253
Nordic                      ISO-8859-10
Baltic                      Windows-1257
Central European            IBM852
Central European            ISO-8859-2
Cyrillic                    IBM855
Cyrillic                    ISO-8859-5
Cyrillic                    Windows-1251
Cyrillic/Russian            CP-866
Chinese Simplified          GBK
Chinese Simplified          ISO-2022-CN
Chinese Traditional         Big5
Chinese Traditional         EUC-TW
Japanese                    EUC-JP
Japanese                    ISO-2022-JP
Japanese                    Shift-JIS
Korean                      EUC-KR
Thai                        Windows-874
Turkish                     IBM857
Turkish                     ISO-8859-9
Arabic                      Windows-1256
Arabic                      IBM864
Hebrew                      ISO-8859-8
Hebrew                      Windows-1255
Hebrew                      IBM862