
IBM Watson Knowledge Studio release notes

News


Abstract

Learn about new features and product compatibility issues, limitations, and known problems.

Content

New features

February 2017

  • The free trial plan no longer has a 30-day limit. You can sign up for the Free plan subscription to experiment with the powerful tools provided by Watson Knowledge Studio for as long as you want.
    Note: The same restrictions that applied to the 30-day trial apply to the free plan, including the 1 user, 5 projects, and 5 GB storage limits.
  • You can now deploy a machine-learning model that can be used by the IBM Watson Natural Language Understanding service. You can also deploy a rule-based model to the service for experimental purposes.

December 2016
  • The service was re-architected to improve its reliability and stability.
  • You can now build rule-based models in addition to machine learning models. The application includes a rule editor that simplifies the process of defining rules. You can use the resulting rule-based model to pre-annotate documents.
  • Models that you build (machine-learning and, on an experimental basis, rule-based models) can be deployed for use in the IBM Watson Discovery service in addition to AlchemyLanguage.
  • A new Standard plan subscription enables you to use the tool at a lower cost by giving you access to an instance that is hosted on systems that are shared across multiple organizations.

November 2016
  • The SIRE model that is used for English tokenization was updated to improve the performance of the default tokenizer.
  • Watson Knowledge Studio instances now support IBM ID single sign-on.

September 2016
  • The AlchemyLanguage pre-annotator is available. Run the pre-annotator to use the AlchemyLanguage entity extraction service to find mentions of entities in your documents automatically. Pre-annotation helps you to train custom models faster and with less effort from human annotators.
  • Support is available for working with documents in Brazilian Portuguese, French, and Korean. In addition, Italian language support is being made available early for you to experiment with.
  • A new statistical tokenizer is introduced as the default. The previous dictionary-based tokenizer is available as a choice for advanced users during project creation.

Known issues

General issues:
    • If you delete a user account, all annotations that the user added to documents and that have not yet been promoted to ground truth are deleted, too.

Documents and dictionaries issues:
    • If you import a large ZIP file that contains UIMA CAS XMI files and encounter problems that are caused by network performance, split the ZIP file into smaller ZIP files and upload them one at a time.
    • If you use an AlchemyLanguage, dictionary, or machine-learning annotator to pre-annotate documents, the annotations appear only in tasks that you create after you run pre-annotation.
    • If compound words that include a hyphen are not being pre-annotated, add a surface form that includes spaces around the hyphen. For example, ensure that "pre-jurassic" and "pre - jurassic" are defined in the dictionary before you use the dictionary to pre-annotate the documents.
    • If you choose to use the dictionary-based tokenizer with a project, and you notice that some punctuation is causing incorrect sentence breaks, then you can add a dictionary to address the issue. For example, the punctuation in the abbreviation of the word Figure (Fig.) or the company name, Yahoo! might be misinterpreted by the tokenizer as indicating the end of a sentence. As a workaround, you can create a dictionary that includes potentially problematic terms like these, and then use it to pre-annotate the documents. While you cannot adjust sentence breaks using dictionary pre-annotation with the default machine learning-based tokenizer, it is also less likely to misinterpret the punctuation to begin with.
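The ZIP-splitting workaround above can be scripted. The following sketch is an illustration only (the `split_zip` helper and the `files_per_part` default are assumptions, not part of Watson Knowledge Studio); it divides a large archive of XMI files into smaller ZIP files that you can then upload one at a time:

```python
import zipfile

def split_zip(src_path, files_per_part=100, prefix="part"):
    """Split a ZIP archive into smaller ZIP files that each contain
    at most files_per_part entries. Returns the part file names."""
    parts = []
    with zipfile.ZipFile(src_path) as src:
        # Skip directory entries; copy only actual files.
        names = [n for n in src.namelist() if not n.endswith("/")]
        for i in range(0, len(names), files_per_part):
            part_name = f"{prefix}_{i // files_per_part + 1}.zip"
            with zipfile.ZipFile(part_name, "w", zipfile.ZIP_DEFLATED) as out:
                for name in names[i:i + files_per_part]:
                    out.writestr(name, src.read(name))
            parts.append(part_name)
    return parts
```

For example, `split_zip("corpus.zip", files_per_part=500)` would produce `part_1.zip`, `part_2.zip`, and so on, each small enough to upload reliably.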
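When you prepare dictionary surface forms, the hyphen workaround described above can also be automated. This is a hypothetical helper, not a product API; it generates the spaced variant of a hyphenated term so that both forms can be added to the dictionary:

```python
import re

def hyphen_variants(term):
    """Return the term plus a variant with spaces around each hyphen,
    so that, for example, both 'pre-jurassic' and 'pre - jurassic'
    are available as dictionary surface forms."""
    spaced = re.sub(r"\s*-\s*", " - ", term)
    # A set removes the duplicate when the term has no hyphen.
    return sorted({term, spaced})
```

A term without a hyphen is returned unchanged, so the helper can be run safely over an entire term list.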

Human annotation and adjudication issues:
    • Google Chrome and Mozilla Firefox are the supported web browsers. For best performance in the Ground Truth Editor, use Chrome.
    • Human annotators cannot withdraw submitted documents, although an annotation process manager can reject a submitted document set. Annotation process managers cannot reverse the approval of document sets after they have been approved.
    • When Arabic documents are imported, some words are split into separate tokens that should not be. As a result, some Arabic characters, such as Ta Marbuta (which appears at the end of a word and, if the preceding letter allows, should be connected to it), are always displayed as separate text spans in the Ground Truth Editor.

Annotator component issues:
    • A project can have only one annotator of each type: a machine-learning annotator, a dictionary pre-annotator, and an AlchemyLanguage pre-annotator. You can create multiple versions of the machine-learning annotator, but not of the dictionary or AlchemyLanguage pre-annotators.
    • Only one machine-learning annotator component can be trained at a time within the same Watson Knowledge Studio instance. If training requests are submitted while the system is already processing a request, the later requests fail.
    • The Statistics page does not provide information about how to interpret the scores. For high-level annotator performance guidelines, and descriptions of the metrics, see the Analyzing performance statistics topic in the product documentation.
    • If you plan to export a model for use in IBM Watson Explorer, you must annotate at least one entity type. Otherwise, running the model with IBM Watson Explorer, release 11.0.1.0, can fail. If you plan to annotate relation types, you must define at least two relation types and annotate instances of those relationships in the ground truth.
    • If you have a trial subscription, you cannot export the machine-learning model that you build for use with Watson Explorer. However, you can deploy the model to try it out with the AlchemyLanguage service.
    • You cannot use AlchemyLanguage to pre-annotate documents that are written in Arabic, Japanese, or Korean, because those languages are not currently supported by the AlchemyLanguage entity extraction service. Another restriction is that documents that you pre-annotate are obscured into a non-readable format when they are exported; all annotations are obscured including any that were added to the documents by human annotators.
    • If there are too few relation annotations in a training set, training might result in an error. A possible indication of this issue is the following error message in the train_relation_date.log file:


      [FutureSpace::Open] Future space too small. Problem reading files, perhaps?
       Also check SetFuture() method in all Future classes.
       computePrior: FutureSpace.C:188: void __com_ibm_ykt_ikm_nls::FutureSpace::ReadFutureSpaceFromTraining(const string&): Assertion `0' failed.


      The safest way to avoid having too little data for relation training is to require that any project that uses relations annotates at least two occurrences each of at least two relation types in the training data. When there are zero valid relations in the training data, Watson Knowledge Studio does not invoke relation training, so this problem does not occur in that case.
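The guideline above can be checked before you start training. This is an illustrative sketch only (the function name and the input format are assumptions, not a Watson Knowledge Studio API); it counts annotated relation instances by type and verifies that enough types are well covered:

```python
from collections import Counter

def has_enough_relation_data(relation_annotations, min_types=2, min_each=2):
    """Check the guideline that the training data should contain at
    least min_each occurrences of at least min_types relation types.
    relation_annotations is a list of relation type names, one entry
    per annotated relation instance in the ground truth."""
    counts = Counter(relation_annotations)
    well_covered = [t for t, n in counts.items() if n >= min_each]
    return len(well_covered) >= min_types
```

Running such a check on an export of your ground truth can flag a training set that is likely to fail relation training before you submit the training request.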


Document Information

Modified date:
17 June 2018

UID

swg21986001