Generating a taxonomy

Generate an AI-optimized taxonomy from your source documents using Knowledge Transformer.

Before you begin

To generate a taxonomy, you will need:

About this task

The first generation creates the taxonomy from scratch. You can then refine the taxonomy iteratively by re-running the transform process and specifying your existing taxonomy.

Processing time varies based on document size, complexity, and quantity.

Generate a new taxonomy when starting fresh with a document set. Use the refinement process (see Refining an existing taxonomy) when you want to improve or expand an existing taxonomy with additional documents.
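The generate-then-refine cycle described above can be sketched as a small shell function. This is an illustration under stated assumptions, not part of the product: the function name refine_in_passes, the pass-N directory naming, and the ZASSIST_CMD override (handy for dry runs) are all hypothetical. It also assumes that the output directory of one run can be passed as [knowledge-dir] to the next run.

```shell
# Sketch: create a taxonomy from the first file, then refine it with each
# remaining file. Each run writes to its own directory (pass-1, pass-2, ...),
# and the previous pass is handed to the next run as [knowledge-dir].
# ZASSIST_CMD is a hypothetical override; it defaults to the real zassist CLI.
refine_in_passes() {
  local base_dir="$1"
  shift
  local pass=0
  local knowledge=""
  local input out_dir
  for input in "$@"; do
    pass=$((pass + 1))
    out_dir="$base_dir/pass-$pass"
    if [ -z "$knowledge" ]; then
      # First run: no existing taxonomy, so only input and output are given.
      "${ZASSIST_CMD:-zassist}" transform "$input" -o "$out_dir" || return 1
    else
      # Later runs: refine the taxonomy produced by the previous pass.
      "${ZASSIST_CMD:-zassist}" transform "$input" "$knowledge" -o "$out_dir" || return 1
    fi
    knowledge="$out_dir"
  done
}

# Example (dry run that only prints the commands):
#   ZASSIST_CMD=echo refine_in_passes ./taxonomy intro.vtt slides.pptx
```

Writing each pass to a separate directory keeps earlier results available for comparison. Whether a run can safely write back into the directory it reads from is not stated here, so the sketch avoids relying on it.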

Procedure

  1. From your command-line interface, prepare your workspace.
    1. Navigate to your working directory.
    2. Ensure that your input documents are accessible.
    3. If Knowledge Transformer is container-based, verify that it is running.
  2. Construct the transform command with optional configurations.
    zassist transform [input-file] [knowledge-dir] -o [output-dir]
    1. Specify the input document ([input-file]).
      When generating a taxonomy, specify your input file for transformation. The supported file formats are:
      • Video transcript files (.vtt)
      • PowerPoint files (.pptx)
      • Portable Document Format (.pdf)
      • Microsoft Word documents (.docx)
      • Video files (.mp4)
      • Plain text files (.txt)
      • Markdown documents (.md)
      For more information on these supported formats, see Supported file formats.

      To transform multiple files, run the zassist transform command once for each file (see Refining an existing taxonomy).

    2. Optional: Specify your existing taxonomy ([knowledge-dir]).

      If you are refining an existing taxonomy, specify the directory where it is located. If you did not specify a different output directory or name when you generated it, this is typically ./output.

    3. Optional: Specify the output directory ([output-dir]).

      If you do not specify an output directory, a directory named output is created in the current working directory. To specify the output directory, use either the --output flag or the -o flag.

  3. Run the transform command.
    Start the taxonomy generation by running the transform command. For example:
    zassist transform huge-file.vtt ./knowledge -o ./results
    If you are generating a taxonomy from scratch and passing only the input file, the command looks like this:
    zassist transform document.vtt
    After you run the command, wait for processing to complete. The taxonomy is generated when the command finishes.
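The procedure above can be wrapped in a shell helper that checks the input before it builds the command. This is a minimal sketch, not a product feature: the function names is_supported_input and run_transform, the return codes, and the ZASSIST_CMD override are hypothetical. It matches lowercase extensions only, because how the CLI treats other casing is not specified here.

```shell
# Returns 0 when the file name ends in one of the supported extensions.
is_supported_input() {
  case "$1" in
    *.vtt|*.pptx|*.pdf|*.docx|*.mp4|*.txt|*.md) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: run_transform INPUT_FILE [KNOWLEDGE_DIR] [OUTPUT_DIR]
# Pass an empty string for KNOWLEDGE_DIR to generate a taxonomy from scratch.
run_transform() {
  local input="$1"
  local knowledge="${2:-}"
  local out_dir="${3:-}"

  if ! is_supported_input "$input"; then
    echo "unsupported input format: $input" >&2
    return 2
  fi
  if [ ! -r "$input" ]; then
    echo "input file is not readable: $input" >&2
    return 3
  fi

  # Build the arguments in the documented order:
  #   zassist transform [input-file] [knowledge-dir] -o [output-dir]
  set -- transform "$input"
  if [ -n "$knowledge" ]; then
    set -- "$@" "$knowledge"
  fi
  if [ -n "$out_dir" ]; then
    set -- "$@" -o "$out_dir"
  fi

  # ZASSIST_CMD is a hypothetical override (for example, echo for a dry run).
  "${ZASSIST_CMD:-zassist}" "$@"
}

# Example (dry run that only prints the command):
#   ZASSIST_CMD=echo run_transform huge-file.vtt ./knowledge ./results
```

Checking the extension and readability first means a typo in the file name fails immediately instead of after the transform has started.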

Results

You should have:
  • A generated taxonomy file in the default output directory or in the output directory that you specified
  • A processing log displayed in the CLI
  • Confirmation of success
For more information on understanding the output from taxonomy generation, see Understanding the output.
For more information on troubleshooting common problems, see Troubleshooting Knowledge Transformer.
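As a quick follow-up to the results listed above, a script can check whether the output directory contains anything. This is a rough sketch under stated assumptions: the function name has_output_files is hypothetical, and the presence of at least one file is only a proxy for success. The names and layout of the generated files are not assumed here, so rely on the processing log and the confirmation message as the authoritative signal.

```shell
# Returns 0 when the directory exists and contains at least one regular file.
# Defaults to ./output, the directory used when no output directory is given.
has_output_files() {
  local out_dir="${1:-./output}"
  [ -d "$out_dir" ] || return 1
  [ -n "$(find "$out_dir" -type f | head -n 1)" ]
}

# Example:
#   if has_output_files ./results; then echo "output found"; fi
```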

What to do next

After generating the taxonomy, you can:
  • Refine the existing taxonomy with any additional input files (see Refining an existing taxonomy).
  • Manually validate the quality of your taxonomy. Check for concept coverage and identify missing concepts to determine what additional input files are needed.
  • Ingest the taxonomy to make the knowledge available to AI agents (see [LINK TO WXA4Z INGESTING CONTENT THROUGH CLI PAGE]).