Reading in source text

You can use the Language Identifier node to identify the natural language of a text field within your source data. The output of this node is a derived field that contains the detected language code.

[Figure: Language Identifier node]

Data for text mining can be in any of the standard formats that are used by SPSS Modeler flows, including databases or other "rectangular" formats that represent data in rows and columns.

  • To read in text from one of these formats, such as a database with one or more text fields for customer comments, use an Import node.
  • When you're processing large amounts of data, which might include text in several different languages, use the Language Identifier node to identify the language used in a specific field.
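The idea of deriving a language-code field from a text field in row-and-column data can be sketched outside SPSS Modeler. The Python example below is only a conceptual illustration: the names `identify_language` and `add_language_field`, the sample rows, and the stopword-counting heuristic are all assumptions made for this sketch. It is not the Language Identifier node's API, and it does not reflect how the node actually detects languages.

```python
# Hypothetical illustration only: this is NOT SPSS Modeler's API, and the
# stopword heuristic is an assumption, not the node's detection method.

# Tiny stopword lists keyed by language code; real identifiers use richer models.
STOPWORDS = {
    "en": {"the", "and", "is", "was", "very", "with", "this"},
    "fr": {"le", "la", "et", "est", "très", "avec", "ce"},
    "de": {"der", "die", "und", "ist", "sehr", "mit", "das"},
}


def identify_language(text, default="unknown"):
    """Return the code of the language whose stopwords match the most tokens."""
    tokens = text.lower().split()
    scores = {
        code: sum(1 for token in tokens if token in words)
        for code, words in STOPWORDS.items()
    }
    code, best = max(scores.items(), key=lambda item: item[1])
    # If no stopword matched at all, report the default instead of guessing.
    return code if best > 0 else default


def add_language_field(rows, text_field, out_field="language"):
    """Return copies of the rows with a derived field holding the language code."""
    return [{**row, out_field: identify_language(row[text_field])} for row in rows]


# "Rectangular" sample data: each dict is a row, each key is a column.
rows = [
    {"id": 1, "comment": "The service was very good and the staff is friendly"},
    {"id": 2, "comment": "Le service est très bon et la chambre est propre"},
    {"id": 3, "comment": "Der Service ist sehr gut und das Zimmer ist sauber"},
]

for row in add_language_field(rows, "comment"):
    print(row["id"], row["language"])
```

Each output row keeps its original columns and gains one derived column with the detected code, which mirrors the input and output shape described above. The input rows themselves are left unchanged.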