Data Generation Rules
This feature is available only in API Connect Enterprise as a Service
Data generation rules guide the auto-tester in choosing values to populate the parameters and payloads of its requests.
The auto-tester derives default rules from the API definition and extensions. You can override these, either to improve the general accuracy of the auto-tester or to tailor its requests to a specific use case, by adding custom rules to the Datatypes section of the profile configuration.
A data generation rule binds a data generator to a schema or type in the API definition such that, when the auto-tester assembles the data for a request and encounters that schema or type, it uses the generator to provide an appropriate value at that point.
Data generators
A data generator describes a never-ending sequence of values, and whenever the auto-tester calls on a generator it takes the next value from the sequence.
The syntax for generation rules reuses terms from JSON Schema where the implied constraints on a schema are sufficient to guide the auto-tester in generating conformant values. Such constraints occurring in the API definition help inform the default rules.
Constant
The constant generator always returns the same value, which can be of any simple type.
const: 0
# ==> 0, 0, 0, ...
const: Thursday
# ==> "Thursday", "Thursday", "Thursday", ...
Enumeration
The enumeration generator returns values selected at random from a given list, which can be of any simple type.
enum:
  - 0
  - 1
  - 2
# ==> 1, 0, 2, 0, ...

enum:
  - Thursday
  - Friday
  - Saturday
# ==> "Friday", "Saturday", "Saturday", ...
Pattern
The pattern generator returns string values that match a given regular expression.
pattern: '[0-9]{3}'
# ==> "884", "924", "121", ...
pattern: '[a-z][a-z_]{0,15}'
# ==> "dbe", "kg_cfsv", "dhbfgbjxavmsaxq", ...
Range
The range generator returns numeric values uniformly distributed over a given range (inclusive).
minimum: 0
maximum: 10
# ==> 10, 6, 8, ...
If either endpoint is omitted, it defaults to the most negative or most positive value of the type, respectively:
minimum: 0
# ==> 10029625, 3162517758, 841054690, ...
Semantic
The semantic generator returns values that conform to some pre-defined semantic category, covering names, addresses, credit card numbers, etc. The set of categories is the same as that allowed in the API extensions.
semantic: email
# ==> "herbertschaden@fahey.net", "emmyfarrell@cruickshank.com", "vilmaprohaska@zboncak.org", ...
semantic: first_name
# ==> "Antwan", "Connie", "Modesto", ...
Resource
The resource generator returns values that are unique identifiers of resource instances of a given type, selected at random from the pool of existing instances. The identifier of an instance is taken from the id_name property of the resource type defined in the API extensions, and its type and format will vary accordingly.
resource: Customer
# ==> "e5537298-9041-47fc-b97e-c910d8f9d971", "f60521c8-2ccb-4036-a742-9f350271675f", "e5537298-9041-47fc-b97e-c910d8f9d971", ...
Array
The array generator returns array values whose contents are populated from a data generator defined recursively. The length of the array is chosen at random between minItems and maxItems inclusive.
items:
  pattern: '[0-9]{3}'
minItems: 3
maxItems: 3
# ==> ["770", "496", "219"], ["431", "895", "331"], ["864", "130", "204"], ...

items:
  enum:
    - 0
    - 1
minItems: 0
maxItems: 5
# ==> [], [1, 0, 1, 1], [0], ...
Object
The object generator returns object values whose properties are populated from named data generators, defined recursively.
properties:
  x:
    minimum: 0
    maximum: 100
  y:
    minimum: 0
    maximum: 100
# ==> {"x": 59, "y": 63}, {"x": 22, "y": 8}, {"x": 95, "y": 57}, ...

properties:
  code:
    pattern: '[0-9]{3}'
  language:
    enum:
      - en
      - fr
      - de
# ==> {"code": "452", "language": "fr"}, {"code": "504", "language": "en"}, {"code": "846", "language": "fr"}, ...
If a property is not required by the target schema then it may be marked with an optional attribute, which specifies a frequency (between 0.0 and 1.0) that determines how often the property is included in the generated object (where 0 means never, and 1 means always).
properties:
  code:
    pattern: '[0-9]{3}'
  language:
    optional: 0.5
    enum:
      - en
      - fr
      - de
  tag:
    optional: 0.0
# ==> {"code": "452", "language": "fr"}, {"code": "504", "language": "en"}, {"code": "846"}, ...
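Because the array and object generators are both defined recursively, they can presumably be nested. The following is a minimal sketch of an array of objects, assuming that an object generator is accepted under items in the same way as the simple generators shown earlier. The property names name and role and the enum values are hypothetical and serve only as illustration.

```yaml
# Sketch only: an array of 1 to 3 objects, each built by a nested object generator.
items:
  properties:
    name:
      semantic: first_name
    role:
      # hypothetical values, not taken from any real API definition
      enum:
        - lead
        - supporting
minItems: 1
maxItems: 3
```

Each generated array would then hold between one and three objects, with every object drawing its name and role from the nested generators.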
Choice
The choice generator returns values selected at random from a given list of data generators, using optional weights to bias the outcome. The alternatives must be simple generators, which excludes array and object generators.
choice:
  # bias the selection towards English
  - const: en
    weight: 5
  - const: fr
  - const: de
# ==> "en", "en", "en", ..., "de", "en", ...

choice:
  # normal values in the range 1-10
  - minimum: 1
    maximum: 10
    weight: 98
  # with a small percentage of outliers
  - const: 999
    weight: 2
# ==> 6, 3, 9, 1, ..., 999, 6, ...
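A choice generator can plausibly be used wherever a simple generator is expected, for instance as the generator for one property of an object. The sketch below assumes this, and also assumes that weights are relative to one another (as the 98/2 example suggests), so that weights of 8 and 2 would select the first alternative roughly 80% of the time. The property name language is reused from the earlier examples.

```yaml
# Sketch only: a weighted choice nested inside an object generator.
properties:
  language:
    choice:
      # mostly English (assuming weights are relative: 8 out of 10)
      - const: en
        weight: 8
      # otherwise one of the remaining languages, picked at random
      - enum:
          - fr
          - de
        weight: 2
```

Both alternatives here are simple generators (const and enum), which keeps the rule within the stated restriction on choice.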
Data generation rules
A data generation rule binds a data generator to a schema occurring in the API definition so that the auto-tester knows how to generate values for that schema in an API call. Rules are specified in the Datatypes section of the profile configuration, and override the default rules derived from the API definition and extensions.
Schemas occur widely in the OpenAPI specification, but there are three key locations which have most relevance to the auto-tester where the profile allows override rules:
- Named schemas
- Parameters
- Inline request bodies
Named schemas
A schema rule assigns a data generator to a named schema from the API definition, found either under definitions (OpenAPI 2.0) or components/schemas (OpenAPI 3.0). The auto-tester will use the specified generator wherever the named schema occurs in a request.
Schema rules appear in the schemas section of the profile configuration, under Datatypes.
Because this is an override, the generator may be incomplete for the original schema, and the auto-tester will fall back to using the default rules to cover any gaps. In particular, if the schema describes an object type then the generator need only name a subset of the object properties, and the auto-tester will apply the override for those properties but use default rules for the rest.
For example, given this named schema in the API definition:
Movie:
  type: object
  properties:
    title:
      type: string
    release_date:
      type: string
      format: date
    language:
      description: Two-letter ISO 639-1 language code
      type: string
      pattern: '[a-z]{2}'
The default rule derived from the schema returns values such as:
{
  "title": "mkLHM22Ohabb6YG4-wdZFUwxQa7dn",
  "release_date": "2018-09-14",
  "language": "jq"
}
You can refine this by overriding the default rules for the title and language properties:
Datatypes:
  schemas:
    Movie:
      properties:
        title:
          semantic: sentence
        language:
          enum:
            - en
            - fr
            - de
This rule binds an object generator (introduced by the keyword properties) to the schema named Movie which, combined with the default rule for release_date, returns values such as:
{
  "title": "What dog everybody am myself hourly meeting how group over example her",
  "release_date": "2094-06-31",
  "language": "de"
}
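The default rule for release_date can be overridden in the same way if more realistic dates are wanted. The following sketch uses the pattern generator to keep dates between 1900 and 2099 and to cap the day at 28 so that every value is a valid calendar date. It assumes the pattern generator supports grouping and alternation, which the earlier examples do not demonstrate.

```yaml
# Sketch only: override a single property; title and language are not named here,
# so they would be covered by whatever rules otherwise apply to them.
Datatypes:
  schemas:
    Movie:
      properties:
        release_date:
          # years 1900-2099, months 01-12, days 01-28
          pattern: '(19|20)[0-9]{2}-(0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-8])'
```

Capping the day at 28 trades a little coverage for simplicity, since a single regular expression cannot easily express month lengths or leap years.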
Parameters
A parameter rule binds a data generator to a named parameter within an operation found under paths in the API definition.
Parameter rules appear in the operations section of the profile configuration, under Datatypes.
For example, given this parameter definition on GET /movies in the API definition, which restricts the number of movies returned in a single call:
paths:
  /movies:
    get:
      parameters:
        - name: limit
          in: query
          description: Sets an upper bound on the number of movies returned
          schema:
            type: integer
The default rule derived from this definition will return values from the full integer range, but you can override this with a rule in the profile configuration that specifies a more realistic range, with occasional outliers to check the validation logic:
Datatypes:
  operations:
    /movies:
      get:
        parameters:
          - name: limit
            data:
              choice:
                - minimum: 1
                  maximum: 500
                  weight: 95
                - const: 0
                  weight: 3
                - const: 8000000
                  weight: 2
The format of the rule follows the format of the original definition, but with the keyword data in place of schema to introduce the data generator.
In OpenAPI, a parameter is uniquely identified by its name and its location, so the same name could be used in, say, both query and header. To disambiguate such cases, a parameter rule may optionally include the in property.
It is not used in this example because the query parameter is the only one named limit.
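If another parameter named limit did exist in a different location, the rule could name the location explicitly. The sketch below assumes that the in property sits alongside name, mirroring the layout of the OpenAPI parameter definition; the exact placement is not confirmed by the text above.

```yaml
# Sketch only: the same rule shape as before, with the location made explicit.
Datatypes:
  operations:
    /movies:
      get:
        parameters:
          - name: limit
            in: query        # assumed placement, following the OpenAPI parameter layout
            data:
              minimum: 1
              maximum: 500
```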
The auto-tester currently supports only query and path parameters (and body parameters in OpenAPI 2.0). Rules bound to other parameter types are ignored.
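A path parameter rule would presumably take the same shape as a query parameter rule. As an illustration, the sketch below binds the resource generator to a path parameter so that requests target existing instances. The path /customers/{customer_id} and the parameter name customer_id are hypothetical, and the sketch assumes that a Customer resource type is defined in the API extensions.

```yaml
# Sketch only: hypothetical path and parameter name.
Datatypes:
  operations:
    /customers/{customer_id}:
      get:
        parameters:
          - name: customer_id
            in: path
            data:
              # draws identifiers from the pool of existing Customer instances
              resource: Customer
```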
Request body
A request body rule binds a data generator to the body of a request where the corresponding schema is specified directly as part of the operation in the API definition, rather than via a reference to a named schema.
Request body rules appear in the operations section of the profile configuration, under Datatypes.
For example, given this definition of the payload to a POST request:
paths:
  /movies:
    post:
      requestBody:
        content:
          application/json:
            schema:
              type: object
              allOf:
                - $ref: '#/components/schemas/Movie'
                - properties:
                    keywords:
                      type: array
                      items:
                        type: string
The following rule generates some appropriate keywords to attach to a new movie (where the movie itself is covered by the rules discussed earlier):
Datatypes:
  operations:
    /movies:
      post:
        requestBody:
          content:
            application/json:
              data:
                properties:
                  keywords:
                    # pick up to 3 items from the following list
                    items:
                      enum:
                        - avant-garde
                        - dystopia
                        - epic
                        - postmodern
                        - remake
                        - shootout
                    minItems: 0
                    maxItems: 3
As with the parameter rules, the format follows that of the original definition but with data in place of schema. The definition is in the style of OpenAPI 3.0, but the rule also works for OpenAPI 2.0 as long as the content type is consistent. Alternatively, starting from an OpenAPI 2.0 definition, you can use a body parameter rule to specify the payload, but that does not work for OpenAPI 3.0.
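For an OpenAPI 2.0 definition, the body parameter alternative might look like the sketch below. It assumes the body parameter rule uses the same name, in, and data layout as other parameter rules; the parameter name body is hypothetical and would need to match the name used in the API definition.

```yaml
# Sketch only: OpenAPI 2.0 style, with the payload described by a body parameter rule.
Datatypes:
  operations:
    /movies:
      post:
        parameters:
          - name: body       # hypothetical; use the name from the API definition
            in: body
            data:
              properties:
                keywords:
                  items:
                    enum:
                      - epic
                      - remake
                  minItems: 0
                  maxItems: 3
```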