Data Generation Rules

This feature is available only in API Connect Enterprise as a Service.

Data generation rules guide the auto-tester in choosing values to populate the parameters and payloads of its requests.

Default rules are derived from the API definition and extensions. You can override them, either to improve the general accuracy of the auto-tester or to tailor its requests to a specific use case, by adding custom rules to the Datatypes section of the profile configuration.

A data generation rule binds a data generator to a schema or type in the API definition such that, when the auto-tester assembles the data for a request and encounters that schema or type, it uses the generator to provide an appropriate value at that point.

Data generators

A data generator describes a never-ending sequence of values, and whenever the auto-tester calls on a generator it takes the next value from the sequence.

The syntax for generation rules reuses terms from JSON Schema where the implied constraints on a schema are sufficient to guide the auto-tester in generating conformant values. Such constraints occurring in the API definition help inform the default rules.

Constant

The constant generator always returns the same value, which can be of any simple type.

const: 0

# ==> 0, 0, 0, ...

const: Thursday

# ==> "Thursday", "Thursday", "Thursday", ...

Enumeration

The enumeration generator returns values selected at random from a given list, which can be of any simple type.

enum:
  - 0
  - 1
  - 2

# ==> 1, 0, 2, 0, ...

enum:
  - Thursday
  - Friday
  - Saturday

# ==> "Friday", "Saturday", "Saturday", ...

Pattern

The pattern generator returns string values that match a given regular expression.

pattern: '[0-9]{3}'

# ==> "884", "924", "121", ...

pattern: '[a-z][a-z_]{0,15}'

# ==> "dbe", "kg_cfsv", "dhbfgbjxavmsaxq", ...

Range

The range generator returns numeric values uniformly distributed over a given range (inclusive).

minimum: 0
maximum: 10

# ==> 10, 6, 8, ...

If either endpoint is omitted, it defaults to the most negative (for minimum) or most positive (for maximum) value of the type:

minimum: 0

# ==> 10029625, 3162517758, 841054690, ...

Semantic

The semantic generator returns values that conform to some pre-defined semantic category, covering names, addresses, credit card numbers, etc. The set of categories is the same as that allowed in the API extensions.

semantic: email

# ==> "herbertschaden@fahey.net", "emmyfarrell@cruickshank.com", "vilmaprohaska@zboncak.org", ...

semantic: first_name

# ==> "Antwan", "Connie", "Modesto", ...

Resource

The resource generator returns values that are unique identifiers of resource instances of a given type, selected at random from the pool of existing instances. The identifier of an instance is taken from the id_name property of the resource type defined in the API extensions and its type and format will vary accordingly.

resource: Customer

# ==> "e5537298-9041-47fc-b97e-c910d8f9d971", "f60521c8-2ccb-4036-a742-9f350271675f", "e5537298-9041-47fc-b97e-c910d8f9d971", ...

Array

The array generator returns array values whose contents are populated from a data generator defined recursively. The length of the array is chosen at random between minItems and maxItems inclusive.

items:
  pattern: '[0-9]{3}'
minItems: 3
maxItems: 3

# ==> ["770", "496", "219"], ["431", "895", "331"], ["864", "130", "204"], ...

items:
  enum:
    - 0
    - 1
minItems: 0
maxItems: 5

# ==> [], [1, 0, 1, 1], [0], ...
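
Because the items generator is defined recursively, it should be possible to nest an object generator inside an array generator. The following is a minimal sketch under that assumption; the property names code and language are illustrative only.

```yaml
# Array of one or two objects, each built from nested generators
items:
  properties:
    code:
      pattern: '[0-9]{3}'
    language:
      enum:
        - en
        - fr
minItems: 1
maxItems: 2
```

Each generated value would then be an array containing one or two objects, each with a three-digit code and a language drawn from the list.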

Object

The object generator returns object values whose properties are populated from named data generators, defined recursively.

properties:
  x:
    minimum: 0
    maximum: 100
  y:
    minimum: 0
    maximum: 100

# ==> {"x": 59, "y": 63}, {"x": 22, "y": 8}, {"x": 95, "y": 57}, ...

properties:
  code:
    pattern: '[0-9]{3}'
  language:
    enum:
      - en
      - fr
      - de

# ==> {"code": "452", "language": "fr"}, {"code": "504", "language": "en"}, {"code": "846", "language": "fr"}, ...

If a property is not required by the target schema then it may be marked with an optional attribute which specifies a frequency (between 0.0 and 1.0) that determines how often the property is included in the generated object (where 0 means never, and 1 means always).

properties:
  code:
    pattern: '[0-9]{3}'
  language:
    optional: 0.5
    enum:
      - en
      - fr
      - de
  tag:
    optional: 0.0

# ==> {"code": "452", "language": "fr"}, {"code": "504", "language": "en"}, {"code": "846"}, ...

Choice

The choice generator returns values selected at random from a given list of data generators, using optional weights to bias the outcome. The alternatives must be simple generators, which excludes array and object generators.

choice:
  # bias the selection towards English
  - const: en
    weight: 5
  - const: fr
  - const: de

# ==> "en", "en", "en", ..., "de", "en", ...

choice:
  # normal values in the range 1-10
  - minimum: 1
    maximum: 10
    weight: 98
  # with a small percentage of outliers
  - const: 999
    weight: 2

# ==> 6, 3, 9, 1, ..., 999, 6, ...
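
Since the resource generator is a simple generator, it can presumably appear as an alternative in a choice. The sketch below assumes a resource type named Customer with string identifiers, as in the resource example earlier; the all-zero identifier is a made-up value intended to match no existing instance.

```yaml
choice:
  # mostly identifiers of existing customers
  - resource: Customer
    weight: 9
  # occasionally an identifier that should not match any instance
  - const: '00000000-0000-0000-0000-000000000000'
    weight: 1
```

A rule like this could be used to exercise the not-found handling of an operation while keeping most requests valid.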

Data generation rules

A data generation rule binds a data generator to a schema occurring in the API definition so that the auto-tester knows how to generate values for that schema in an API call. Rules are specified in the Datatypes section of the profile configuration, and override the default rules derived from the API definition and extensions.

Schemas occur throughout an OpenAPI definition, but the profile allows override rules at the three locations most relevant to the auto-tester:

  • Named schemas
  • Parameters
  • Inline request bodies

Named schemas

A schema rule assigns a data generator to a named schema from the API definition, found either under definitions (OpenAPI 2.0) or components/schemas (OpenAPI 3.0). The auto-tester uses the specified generator wherever the named schema occurs in a request.

Schema rules appear in the schemas section of the profile configuration, under Datatypes.

Because this is an override, the generator may be incomplete for the original schema, and the auto-tester will fall back to using the default rules to cover any gaps. In particular, if the schema describes an object type then the generator need only name a subset of the object properties, and the auto-tester will apply the override for those properties but use default rules for the rest.

For example, given this named schema in the API definition:

Movie:
  type: object
  properties:
    title:
      type: string
    release_date:
      type: string
      format: date
    language:
      description: Two-letter ISO 639-1 language code
      type: string
      pattern: '[a-z]{2}'

The default rule derived from the schema returns values such as:

{
  "title": "mkLHM22Ohabb6YG4-wdZFUwxQa7dn",
  "release_date": "2018-09-14",
  "language": "jq"
}

You can refine this by overriding the default rules for the title and language properties:

Datatypes:
  schemas:
    Movie:
      properties:
        title:
          semantic: sentence
        language:
          enum:
            - en
            - fr
            - de

This rule binds an object generator (introduced by the keyword properties) to the schema named Movie. Combined with the default rule for release_date, it returns values such as:

{
  "title": "What dog everybody am myself hourly meeting how group over example her",
  "release_date": "2094-06-30",
  "language": "de"
}
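
The optional attribute described for the object generator can plausibly be used in a schema rule as well. The sketch below assumes that language is not listed as required in the Movie schema (the definition shown above has no required list) and that a schema rule accepts optional in the same way as a standalone object generator.

```yaml
Datatypes:
  schemas:
    Movie:
      properties:
        language:
          # include the property in roughly half of the generated objects
          optional: 0.5
          enum:
            - en
            - fr
            - de
```

Under these assumptions, about half of the generated Movie objects would omit language, while title and release_date would still be covered by the default rules.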

Parameters

A parameter rule binds a data generator to a named parameter within an operation found under paths in the API definition.

Parameter rules appear in the operations section of the profile configuration, under Datatypes.

For example, given this parameter definition on GET /movies in the API definition that restricts the number of movies returned in a single call:

paths:
  /movies:
    get:
      parameters:
        - name: limit
          in: query
          description: Sets an upper bound on the number of movies returned
          schema:
            type: integer

The default rule derived from this definition will return values from the full integer range, but you can override this with a rule in the profile configuration that specifies a more realistic range, with occasional outliers to check the validation logic:

Datatypes:
  operations:
    /movies:
      get:
        parameters:
          - name: limit
            data:
              choice:
                - minimum: 1
                  maximum: 500
                  weight: 95
                - const: 0
                  weight: 3
                - const: 8000000
                  weight: 2

The format of the rule follows the format of the original definition but with the keyword data in place of schema to introduce the data generator.

In OpenAPI, a parameter is uniquely identified by the combination of its name and location, so the same name could be used in, say, both query and header. Parameter rules therefore accept an optional in property to disambiguate where needed. It is omitted in this example because the query parameter is the only one named limit.
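
As an illustration, suppose GET /movies also declared a second parameter named limit in another location (a hypothetical variation on the definition above). A rule targeting the query parameter might then look like the sketch below, assuming that in sits alongside name just as it does in the API definition.

```yaml
Datatypes:
  operations:
    /movies:
      get:
        parameters:
          - name: limit
            # disambiguates from a same-named parameter in another location
            in: query
            data:
              minimum: 1
              maximum: 500
```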

At present the auto-tester supports only parameters located in query and path (and body in OpenAPI 2.0). Rules bound to parameters in other locations are ignored.
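
A path parameter rule follows the same format. The sketch below is hypothetical: it assumes an operation GET /movies/{movie_id} and a resource type named Movie defined in the API extensions, neither of which appears in the definitions shown here.

```yaml
Datatypes:
  operations:
    /movies/{movie_id}:
      get:
        parameters:
          - name: movie_id
            in: path
            data:
              # pick the identifier of an existing Movie instance
              resource: Movie
```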

Request body

A request body rule binds a data generator to the body of a request where the corresponding schema is specified directly as part of the operation in the API definition, rather than via a reference to a named schema.

Request body rules appear in the operations section of the profile configuration, under Datatypes.

For example, given this definition of the payload to a POST request:

paths:
  /movies:
    post:
      requestBody:
        content:
          application/json:
            schema:
              type: object
              allOf:
                - $ref: '#/components/schemas/Movie'
                - properties:
                    keywords:
                      type: array
                      items:
                        type: string

The following rule generates some appropriate keywords to attach to a new movie (where the movie itself is covered by the rules discussed earlier):

Datatypes:
  operations:
    /movies:
      post:
        requestBody:
          content:
            application/json:
              data:
                properties:
                  keywords:
                    # pick up to 3 items from the following list
                    items:
                      enum:
                        - avant-garde
                        - dystopia
                        - epic
                        - postmodern
                        - remake
                        - shootout
                    minItems: 0
                    maxItems: 3

As with parameter rules, the format follows that of the original definition but with data in place of schema. The definition above is in the style of OpenAPI 3.0, but the rule also works for OpenAPI 2.0 provided the content type is consistent. Alternatively, starting from an OpenAPI 2.0 definition, you can use a body parameter rule to specify the payload, but that approach does not work for OpenAPI 3.0.
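
For an OpenAPI 2.0 definition, the body parameter alternative might be sketched as follows. This is an assumption based on the parameter rule format described earlier, not a confirmed syntax; the parameter name body is hypothetical and would need to match the name of the body parameter in the API definition.

```yaml
Datatypes:
  operations:
    /movies:
      post:
        parameters:
          - name: body   # hypothetical; use the name from the API definition
            in: body
            data:
              properties:
                keywords:
                  items:
                    enum:
                      - avant-garde
                      - dystopia
                      - epic
                  minItems: 0
                  maxItems: 3
```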