For the complete API reference for Watson NLP, see:
Watson NLP API reference
Semantics of Span offsets
In Watson NLP, a Span is a contiguous region of a Text object, which is identified by the beginning and ending offsets in that Text object. Assume that your input text is:
Amelia Earhart is a pilot.
The text across Span [0-6] is Amelia. This Span can be visualized as:
     A m e l i a
    ^ ^ ^ ^ ^ ^ ^
    0 1 2 3 4 5 6
Likewise, the text across Span [20-25] is pilot.
A Span of [x-x] is empty: it marks the position between the end of one character and the beginning of the next. In the previous example, [0-0] is an empty string before the character A. Likewise, a Span of [3-3] is an empty string between the characters e and l.
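The offset semantics described above match ordinary half-open string slicing. The following is a minimal sketch in plain Python, not a call into the Watson NLP API: the helper name span_text is hypothetical and stands in for whatever accessor the library provides.

```python
text = "Amelia Earhart is a pilot."

def span_text(text: str, begin: int, end: int) -> str:
    """Return the text covered by a span; `end` is exclusive (hypothetical helper)."""
    return text[begin:end]

first = span_text(text, 0, 6)    # covers the characters at offsets 0..5
last = span_text(text, 20, 25)   # stops before the trailing period at offset 25
empty = span_text(text, 3, 3)    # begin == end, so nothing is covered

print(repr(first), repr(last), repr(empty))  # 'Amelia' 'pilot' ''
```

Because the ending offset is exclusive, the length of a Span is always end minus begin, and an [x-x] Span has length zero.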
Span offsets in Watson NLP APIs follow the semantics of String objects in the respective programming language, to ensure interoperability with other libraries written in that programming language. Specifically, the APIs use code points (UTF-32) to represent Span offsets.
For a detailed description of the relationship between code units and code points, see Character.
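The distinction matters when offsets are exchanged with a system that counts UTF-16 code units instead of code points. The sketch below uses only the Python standard library and assumes that Python strings are indexed by code point; the helper to_utf16_offset and the sample text are illustrative and not part of Watson NLP.

```python
# The emoji is a single code point but occupies two UTF-16 code units.
text = "😀 Amelia"

def to_utf16_offset(text: str, cp_offset: int) -> int:
    """Count the UTF-16 code units that precede a code point offset (hypothetical helper)."""
    return len(text[:cp_offset].encode("utf-16-le")) // 2

cp_span = (2, 8)  # code point offsets of "Amelia"
utf16_span = tuple(to_utf16_offset(text, offset) for offset in cp_span)

print(text[cp_span[0]:cp_span[1]])  # Amelia
print(cp_span, utf16_span)          # (2, 8) (3, 9)
```

For text that contains only characters from the Basic Multilingual Plane, the two kinds of offsets are identical; they diverge only after a supplementary character such as an emoji.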