wx-ai deployment
text-generate
Infers the next tokens for a given deployed model with a set of parameters. If a serving_name is used, it must match the serving_name that is returned in the inference section when the deployment was created.
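For example, the same deployment might be called either by its deployment_id or by its serving_name; both identifiers below are hypothetical:
cpd-cli wx-ai deployment text-generate \
--id-or-name 0a1b2c3d-4e5f-6789-abcd-ef0123456789 \
--input 'how far is paris from bangalore:\n'
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n'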
Syntax
cpd-cli wx-ai deployment text-generate \
--id-or-name=<id-or-name> \
[--input=<input>] \
[--moderations=<moderations>] \
[--parameters=<parameters> | --parameters-decoding-method=<parameters-decoding-method> \
    --parameters-include-stop-sequence=<parameters-include-stop-sequence> \
    --parameters-length-penalty=<parameters-length-penalty> \
    --parameters-max-new-tokens=<parameters-max-new-tokens> \
    --parameters-min-new-tokens=<parameters-min-new-tokens> \
    --parameters-prompt-variables=<parameters-prompt-variables> \
    --parameters-random-seed=<parameters-random-seed> \
    --parameters-repetition-penalty=<parameters-repetition-penalty> \
    --parameters-return-options=<parameters-return-options> \
    --parameters-stop-sequences=<parameters-stop-sequences> \
    --parameters-temperature=<parameters-temperature> \
    --parameters-time-limit=<parameters-time-limit> \
    --parameters-top-k=<parameters-top-k> \
    --parameters-top-p=<parameters-top-p> \
    --parameters-truncate-input-tokens=<parameters-truncate-input-tokens> \
    --parameters-typical-p=<parameters-typical-p>]
Options
Table 1: Command options
| Option | Description |
|---|---|
| --id-or-name | The id_or_name can be either the deployment_id that identifies the deployment, or a serving_name that allows a predefined URL to be used to post a prediction. |
| --input | The prompt to generate completions. Note: The method tokenizes the input internally. |
| --moderations | Properties that control the moderations, for uses such as hate and profanity (HAP) and personally identifiable information (PII) filtering. This list can be extended with new types of moderations. |
| --parameters | The template properties if this request refers to a prompt template. This JSON option can instead be provided by setting individual fields with the other parameters options; it is mutually exclusive with those options. See the sketch after this table for one way to build the JSON value. |
| --parameters-decoding-method | Represents the strategy that is used for picking the tokens during generation of the output text. |
| --parameters-include-stop-sequence | Pass false to omit matched stop sequences from the end of the output text. The default is true, meaning that the output ends with the stop sequence text when matched. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-length-penalty | Can be used to exponentially increase the likelihood of the text generation terminating once a specified number of tokens have been generated. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-max-new-tokens | The maximum number of new tokens to be generated. The maximum supported value for this field depends on the model that is used. |
| --parameters-min-new-tokens | The minimum number of new tokens to be generated. If stop sequences are given, they are ignored until the minimum number of tokens is generated. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. The minimum value is 0. |
| --parameters-prompt-variables | The prompt variables. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-random-seed | The random number generator seed to use in sampling mode, for experimental repeatability. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. The minimum value is 1. |
| --parameters-repetition-penalty | The penalty applied to tokens that have already been generated or that belong to the context. The value 1.0 means that there is no penalty. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-return-options | Properties that control what is returned. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-stop-sequences | One or more strings that cause the text generation to stop when they are produced as part of the output. Stop sequences that are encountered before the minimum number of tokens has been generated are ignored. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. The maximum length is 6 items. The minimum length is 0 items. |
| --parameters-temperature | A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-time-limit | The time limit in milliseconds. If generation is not completed within this time, it stops and the text generated so far is returned along with the TIME_LIMIT stop reason. |
| --parameters-top-k | The number of highest-probability vocabulary tokens to keep for top-k filtering. Applies only to sampling mode: when decoding_method is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-top-p | Similar to top_k, except that the candidates for the next token are the most likely tokens with probabilities that add up to at least top_p (also known as nucleus sampling). A value of 1.0 is equivalent to disabled. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-truncate-input-tokens | The maximum number of input tokens that are accepted. Use this option to avoid requests failing because the input is longer than the configured limits. If the input must be shortened, it is truncated from the start (on the left), so the end of the input remains the same. If this value exceeds the model's maximum sequence length (refer to the model documentation for this value), the call fails if the total number of tokens exceeds the maximum sequence length. Zero means don't truncate. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
| --parameters-typical-p | Local typicality measures how similar the conditional probability of predicting a target token next is to the expected conditional probability of predicting a random token next, given the partial text already generated. If less than 1, the smallest set of the most locally typical tokens with probabilities that add up to typical_p or higher is kept for generation. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option. |
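Because --parameters takes a single JSON string, it can be easier to build the value in a shell variable first. A minimal sketch, assuming a deployment with the hypothetical serving_name classification; all parameter fields shown are taken from the examples below:
PARAMS='{
  "decoding_method": "greedy",
  "max_new_tokens": 100,
  "min_new_tokens": 5,
  "stop_sequences": ["fail"]
}'
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n' \
--parameters "$PARAMS"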
Examples
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n' \
--parameters '{"decoding_method": "greedy", "length_penalty": {"decay_factor": 2.5, "start_index": 5}, "max_new_tokens": 100, "min_new_tokens": 5, "random_seed": 1, "stop_sequences": ["fail"], "temperature": 1.5, "time_limit": 600000, "top_k": 50, "top_p": 0.5, "repetition_penalty": 1.5, "truncate_input_tokens": 0, "return_options": {"input_text": true, "generated_tokens": true, "input_tokens": true, "token_logprobs": true, "token_ranks": true, "top_n_tokens": 2}, "include_stop_sequence": true, "typical_p": 0.5, "prompt_variables": {}}' \
--moderations '{"hap": {"input": {"enabled": true, "threshold": 0}, "output": {"enabled": true, "threshold": 0}, "mask": {"remove_entity_value": false}}, "pii": {"input": {"enabled": true, "threshold": 0}, "output": {"enabled": true, "threshold": 0}, "mask": {"remove_entity_value": false}}, "input_ranges": [{"start": 0, "end": 0}]}'
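If the command writes the service's JSON response to stdout, the generated text can be pulled out with a tool such as jq. This is a sketch, not a documented output contract: the field path results[0].generated_text follows the watsonx.ai text generation response schema and should be verified against the actual output:
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n' \
--parameters '{"decoding_method": "greedy", "max_new_tokens": 100}' \
| jq -r '.results[0].generated_text'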
Alternatively, instead of passing a single --parameters JSON string, granular options are available for its individual sub-fields:
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n' \
--moderations '{"hap": moderationHapPropertiesModel, "pii": moderationPiiPropertiesModel, "input_ranges": [moderationTextRangeModel]}' \
--parameters-decoding-method greedy \
--parameters-include-stop-sequence true \
--parameters-length-penalty '{"decay_factor": 2.5, "start_index": 5}' \
--parameters-max-new-tokens 30 \
--parameters-min-new-tokens 5 \
--parameters-prompt-variables '{}' \
--parameters-random-seed 1 \
--parameters-repetition-penalty 1.5 \
--parameters-return-options '{"input_text": true, "generated_tokens": true, "input_tokens": true, "token_logprobs": true, "token_ranks": true, "top_n_tokens": 2}' \
--parameters-stop-sequences fail \
--parameters-temperature 1.5 \
--parameters-time-limit 600000 \
--parameters-top-k 50 \
--parameters-top-p 0.5 \
--parameters-truncate-input-tokens 0 \
--parameters-typical-p 0.5
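Note that sampling parameters such as temperature, top-k, top-p, typical-p, and random-seed take effect only when the decoding method is sample; with greedy decoding, as above, they have no influence on the output. A sketch of a sampling call, again with a hypothetical deployment name:
cpd-cli wx-ai deployment text-generate \
--id-or-name classification \
--input 'how far is paris from bangalore:\n' \
--parameters-decoding-method sample \
--parameters-max-new-tokens 80 \
--parameters-random-seed 1 \
--parameters-temperature 0.7 \
--parameters-top-k 50 \
--parameters-top-p 0.9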