wx-ai text generate

Infers the next tokens for a given deployed model with a set of parameters.

Syntax

cpdctl wx-ai text generate \
--input INPUT \
--model-id MODEL-ID \
[--parameters-length-penalty PARAMETERS-LENGTH-PENALTY] \
[--parameters-max-new-tokens PARAMETERS-MAX-NEW-TOKENS] \
[--parameters-min-new-tokens PARAMETERS-MIN-NEW-TOKENS] \
[--parameters-random-seed PARAMETERS-RANDOM-SEED] \
[--parameters-repetition-penalty PARAMETERS-REPETITION-PENALTY] \
[--parameters-return-options PARAMETERS-RETURN-OPTIONS] \
[--parameters-stop-sequences PARAMETERS-STOP-SEQUENCES] \
[--parameters-temperature PARAMETERS-TEMPERATURE] \
[--parameters-time-limit PARAMETERS-TIME-LIMIT] \
[--parameters-top-k PARAMETERS-TOP-K] \
[--parameters-top-p PARAMETERS-TOP-P] \
[--parameters-truncate-input-tokens PARAMETERS-TRUNCATE-INPUT-TOKENS] \
[--cpd-scope CPD-SCOPE] \
[--moderations MODERATIONS] \
[--parameters PARAMETERS | --parameters-decoding-method PARAMETERS-DECODING-METHOD] \
[--parameters-include-stop-sequence PARAMETERS-INCLUDE-STOP-SEQUENCE] \
[--project-id PROJECT-ID] \
[--space-id SPACE-ID]

Options

Table 1: Command options
--cpd-scope (string)

The IBM Software Hub space, project, or catalog scope. For example, cpd://default-context/spaces/7bccdda4-9752-4f37-868e-891de6c48135. Optional; no default.
--input (string)

The prompt from which to generate completions. Note: The method tokenizes the input internally, so do not leave any trailing spaces. Required.

--model-id (string)

The ID of the model to use for this request. For more information, see the list of supported models. Required.

--moderations

Properties that control moderations, such as hate and profanity (HAP) filtering and personally identifiable information (PII) filtering. This list can be extended with new types of moderations.

Provide a JSON string, or specify a JSON file to read from by providing a file path that begins with @, for example --moderations=@path/to/file.json.

The following example shows the format of the Moderations object.

{
  "hap" : {
    "input" : {
      "enabled" : true,
      "threshold" : 0
    },
    "output" : {
      "enabled" : true,
      "threshold" : 0
    },
    "mask" : {
      "remove_entity_value" : false
    }
  },
  "pii" : {
    "input" : {
      "enabled" : true,
      "threshold" : 0
    },
    "output" : {
      "enabled" : true,
      "threshold" : 0
    },
    "mask" : {
      "remove_entity_value" : false
    }
  },
  "input_ranges" : [ {
    "start" : 0,
    "end" : 0
  } ]
}
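For instance, the moderations object can be saved to a file and passed with the @ prefix. This is a minimal sketch: the file name and the threshold value are illustrative, and only the HAP section is shown.

```shell
# Write a minimal moderations config to a file (file name and threshold are illustrative).
cat > moderations.json <<'EOF'
{
  "hap" : {
    "input" : { "enabled" : true, "threshold" : 0.5 },
    "output" : { "enabled" : true, "threshold" : 0.5 },
    "mask" : { "remove_entity_value" : false }
  }
}
EOF

# Reference the file with the @ prefix instead of passing inline JSON:
# cpdctl wx-ai text generate ... --moderations=@moderations.json
```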
--parameters

Properties that control the model and response. This JSON option can instead be provided by setting individual fields with other options. It is mutually exclusive with those options.

Provide a JSON string, or specify a JSON file to read from by providing a file path that begins with @, for example --parameters=@path/to/file.json.
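For example, a subset of the parameter fields shown in the Examples section can be saved to a file and passed with the @ prefix. The file name and the specific values below are illustrative.

```shell
# Save generation parameters to a file (file name and values are illustrative).
cat > parameters.json <<'EOF'
{
  "decoding_method" : "greedy",
  "max_new_tokens" : 30,
  "min_new_tokens" : 5,
  "stop_sequences" : [ "fail" ],
  "repetition_penalty" : 1.5
}
EOF

# Equivalent to setting the corresponding individual --parameters-* options:
# cpdctl wx-ai text generate ... --parameters=@parameters.json
```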

--parameters-decoding-method (string)

Represents the strategy that is used for picking the tokens during generation of the output text.

During text generation, when this parameter value is set to greedy, each successive token corresponds to the highest probability token given the text that has already been generated. This strategy can lead to repetitive results especially for longer output sequences. The alternative sample strategy generates text by picking subsequent tokens based on the probability distribution of possible next tokens that are defined by (that is, conditioned on) the already-generated text and the top_k and top_p parameters. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is `sample`. Allowable values are `sample` and `greedy`.

--parameters-include-stop-sequence (Boolean)

Pass false to omit matched stop sequences from the end of the output text. The default is true, meaning that the output ends with the stop sequence text when matched. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is true.

--parameters-length-penalty

Use this option to exponentially increase the likelihood of the text generation terminating when a specified number of tokens are generated. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

Provide a JSON string, or specify a JSON file to read from by providing a file path that begins with @, for example --parameters-length-penalty=@path/to/file.json.
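The length penalty object takes a decay_factor and a start_index, as shown in the full example in the Examples section. A file-based sketch (file name and values are illustrative):

```shell
# decay_factor > 1.0 increasingly penalizes generation continuing past start_index tokens.
cat > length-penalty.json <<'EOF'
{ "decay_factor" : 2.5, "start_index" : 5 }
EOF

# cpdctl wx-ai text generate ... --parameters-length-penalty=@length-penalty.json
```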

--parameters-max-new-tokens (int64)

The maximum number of new tokens to be generated. The maximum supported value for this field depends on the model being used.

How the "token" is defined depends on the tokenizer and vocabulary size, which in turn depends on the model. Often the tokens are a mix of fullwords and subwords. To learn more about tokenization, see here.

Depending on the user's plan and on the model being used, a lower maximum number of new tokens might be enforced. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is `20`. The minimum value is `0`.

--parameters-min-new-tokens (int64)

The minimum number of new tokens to be generated. If stop sequences are given, they are ignored until the minimum number of tokens is generated. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is 0. The minimum value is 0.

--parameters-random-seed (int64)

Random number generator seed to use in sampling mode for experimental repeatability. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The minimum value is 1.

--parameters-repetition-penalty (float64)

The penalty applied to tokens that have already been generated or belong to the context. The value 1.0 means that there is no penalty. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is 1. The maximum value is 2. The minimum value is 1.

--parameters-return-options

Properties that control what is returned. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

Provide a JSON string, or specify a JSON file to read from by providing a file path that begins with @, for example --parameters-return-options=@path/to/file.json.
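The return options object uses the fields shown in the full example in the Examples section. A file-based sketch (file name and values are illustrative):

```shell
# Request extra detail in the response, such as input text and per-token log probabilities.
cat > return-options.json <<'EOF'
{
  "input_text" : true,
  "generated_tokens" : true,
  "input_tokens" : true,
  "token_logprobs" : true,
  "token_ranks" : true,
  "top_n_tokens" : 2
}
EOF

# cpdctl wx-ai text generate ... --parameters-return-options=@return-options.json
```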

--parameters-stop-sequences (string)

Stop sequences are one or more strings that cause the text generation to stop when they are produced as part of the output. Stop sequences encountered before the minimum number of tokens has been generated are ignored. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The maximum length is 6 items. The minimum length is 0 items.

--parameters-temperature (float64)

A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is 1. The maximum value is 2. The minimum value is 0.

--parameters-time-limit (int64)

Time limit in milliseconds. If generation is not completed within this time, it stops and the text generated so far is returned along with the TIME_LIMIT stop reason.

The value must be greater than `0`.

Depending on the user's plan and on the model being used, a lower maximum time limit might be enforced. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

--parameters-top-k (int64)

The number of highest probability vocabulary tokens to keep for top-k filtering. This option applies only in sampling mode: when decoding_method is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The maximum value is 100. The minimum value is 1.

--parameters-top-p (float64)

Similar to top_k, except the candidates for the next token are the most likely tokens with probabilities that add up to at least top_p (also known as nucleus sampling). A value of 1.0 is equivalent to disabled. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The default value is 1. The maximum value is 1. The value must be greater than 0.

--parameters-truncate-input-tokens (int64)

The maximum number of input tokens accepted. Use this option to avoid requests failing because the input is longer than the configured limits. If the input is truncated, it is truncated at the start (on the left), so the end of the input remains unchanged. If this value exceeds the model's maximum sequence length (see the model documentation for this value), the call fails when the total number of tokens exceeds the maximum sequence length. Zero means do not truncate. This option provides a value for a sub-field of the JSON option 'parameters'. It is mutually exclusive with that option.

The minimum value is 0.

--project-id (string)

The project that contains the resource. Either space_id or project_id must be given.

The maximum length is 36 characters. The minimum length is 36 characters. The value must match the regular expression /[a-zA-Z0-9-]*/.

--space-id (string)

The space that contains the resource. Either space_id or project_id must be given.

The maximum length is 36 characters. The minimum length is 36 characters. The value must match the regular expression /[a-zA-Z0-9-]*/.

Examples

cpdctl wx-ai text generate \
    --input 'Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n' \
    --model-id google/flan-ul2 \
    --project-id 12ac4cf1-252f-424b-b52d-5cdd9814987f \
    --parameters '{"decoding_method": "greedy", "length_penalty": {"decay_factor": 2.5, "start_index": 5}, "max_new_tokens": 30, "min_new_tokens": 5, "random_seed": 1, "stop_sequences": ["fail"], "temperature": 0.8, "time_limit": 600000, "top_k": 50, "top_p": 0.5, "repetition_penalty": 1.5, "truncate_input_tokens": 0, "return_options": {"input_text": true, "generated_tokens": true, "input_tokens": true, "token_logprobs": true, "token_ranks": true, "top_n_tokens": 2}, "include_stop_sequence": true}' \
    --moderations '{"hap": {"input": {"enabled": true, "threshold": 0}, "output": {"enabled": true, "threshold": 0}, "mask": {"remove_entity_value": false}}, "pii": {"input": {"enabled": true, "threshold": 0}, "output": {"enabled": true, "threshold": 0}, "mask": {"remove_entity_value": false}}, "input_ranges": [{"start": 0, "end": 0}]}'

Alternatively, granular options are available for the sub-fields of JSON string options:

cpdctl wx-ai text generate \
    --input 'Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n' \
    --model-id google/flan-ul2 \
    --project-id 12ac4cf1-252f-424b-b52d-5cdd9814987f \
    --moderations '{"hap": moderationHapPropertiesModel, "pii": moderationPiiPropertiesModel, "input_ranges": [moderationTextRangeModel]}' \
    --parameters-decoding-method greedy \
    --parameters-length-penalty '{"decay_factor": 2.5, "start_index": 5}' \
    --parameters-max-new-tokens 30 \
    --parameters-min-new-tokens 5 \
    --parameters-random-seed 1 \
    --parameters-stop-sequences fail \
    --parameters-temperature 0.8 \
    --parameters-time-limit 600000 \
    --parameters-top-k 50 \
    --parameters-top-p 0.5 \
    --parameters-repetition-penalty 1.5 \
    --parameters-truncate-input-tokens 0 \
    --parameters-return-options '{"input_text": true, "generated_tokens": true, "input_tokens": true, "token_logprobs": true, "token_ranks": true, "top_n_tokens": 2}' \
    --parameters-include-stop-sequence true

Example output

A response without moderations: the generated text from the model, along with other details.

{
  "model_id" : "google/flan-ul2",
  "created_at" : "2023-07-21T16:52:32.190Z",
  "results" : [ {
    "generated_text" : "4,000 km",
    "generated_token_count" : 4,
    "input_token_count" : 12,
    "stop_reason" : "eos_token"
  } ]
}