Handling exceptions
Because Archive Service is deployed outside Sterling Order Management System Software, you might encounter problems over time. For example, Order Service might be down or unreachable. Archive Service validation errors or Cassandra-specific problems can also occur, such as database implementation errors (for example, an issue in query syntax) or database runtime errors (for example, an unavailable connection pool). Any of these problems can cause archival operations to fail, which leads to the following considerations:
- Orders for which archival has failed need to be tracked and re-attempted.
- Orders for which archival has failed must be reported so that you can take corrective actions, if needed.
Handling exceptions due to configuration issues
As explained in the previous section, Order Service provides a few configurations with which you can further slice an order's direct subordinate entities so that they are archived as separate parts. For each part, you can define the data size to be stored in Archive Service as needed for your business.
The following validations are performed when records in the OSI_AWAITING_ARCHIVE table are processed for archival:
- Properties validation: Archive Service validates the configured properties before using them for archival processing. All properties that are non-modifiable at runtime are validated before any records are fetched for archival. If a configuration is incorrect, you are notified so that you can take corrective action. Properties that are modifiable at runtime are validated during Archive Service processing, and archival processing fails if the configuration is incorrect. Any issues caused by incorrect configuration are reported so that you can take the appropriate action:
  - For properties that are non-modifiable at runtime, correct the property value and restart the agent server.
  - For properties that are modifiable at runtime, correct the property value. The agent picks up the updated value in the next trigger, so an agent restart is not required.
- Connection validation: The archival process validates the connection with Order Service before it attempts to archive the records staged in the OSI_AWAITING_ARCHIVE table. Any failure caused by an invalid Order Service URL, or by Order Service being down or unreachable, is reported so that you can take corrective action.
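The ordering of these checks can be illustrated with a minimal sketch. It is not the product's implementation: the property name `archive.service.url`, the `ValidationResult` type, and the `ping` callback are all hypothetical and serve only to show that static properties are checked first and the connection is checked before any records are fetched.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ValidationResult:
    """Collects validation errors; empty means the checks passed."""
    errors: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_static_properties(props: Dict[str, str], required: List[str]) -> ValidationResult:
    """Check properties that cannot change at runtime (names are hypothetical)."""
    result = ValidationResult()
    for name in required:
        if not props.get(name, "").strip():
            result.errors.append(f"missing or empty property: {name}")
    return result


def preflight(props: Dict[str, str], required: List[str],
              ping: Callable[[], bool]) -> ValidationResult:
    """Run property validation, then the connection check, before fetching records."""
    result = validate_static_properties(props, required)
    # Only attempt the connection when the configuration itself is valid.
    if result.ok and not ping():
        result.errors.append("Order Service is down or unreachable")
    return result


# Hypothetical usage: records are fetched only when preflight(...).ok is True.
outcome = preflight({"archive.service.url": "https://example.invalid"},
                    ["archive.service.url"], ping=lambda: True)
```

In this sketch a failed check returns errors instead of raising, so that the caller can report them in one place before deciding whether to proceed.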
Handling runtime exceptions during Archive Service
Any runtime exception that occurs during Archive Service processing for a history order immediately stops further processing for that order and rolls back the transaction in Sterling Order Management System Software. Archival of the failed orders is reattempted in the next trigger of archival processing. This ensures that no data is lost in Sterling Order Management System Software if archival of any part of the history order fails. The error details are logged and reported for the specific exception.
- For any intermittent failures received from Archive Service, Archive Service stops processing the orders. Such orders are picked up in the next trigger of the archival process.
- For failures received from Archive Service that are not expected to occur for every order and that might relate to a specific type of order, which may require either a product fix or a correction to the order data, the Archive Service agent updates the LAST_FAILED_DATE column to the current date in the OSI_AWAITING_ARCHIVE table for the failed orders. This ensures that such records are not picked up again for processing for the next 30 days, which reduces the chance of continuous failures in the archival process.
- If order archival fails because the order is not present in Order Search, Archive Service processing inserts a record for that order in the YFS_AWAITING_INDEX table. This record is picked up by the SSI_DELAYED_SYNC agent, which indexes the order in Order Search. Archive Service does not remove the record for that order from the OSI_AWAITING_ARCHIVE table, so the order is picked up for archival in the next trigger of the agent.
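The 30-day skip based on LAST_FAILED_DATE can be sketched as a simple eligibility check. This is an illustration only: the function name, the treatment of a record with no failure date, and the choice to make a record eligible again on exactly the 30th day are assumptions, not documented product behavior.

```python
from datetime import date, timedelta
from typing import Optional

# Per the text above, a failed record is skipped for the next 30 days.
COOLDOWN_DAYS = 30


def is_eligible(last_failed_date: Optional[date], today: date,
                cooldown_days: int = COOLDOWN_DAYS) -> bool:
    """Return True if a staged record may be picked up for archival.

    A record that has never failed (no LAST_FAILED_DATE) is always eligible.
    Assumption: the record becomes eligible again once a full
    `cooldown_days` days have elapsed since the recorded failure.
    """
    if last_failed_date is None:
        return True
    return today - last_failed_date >= timedelta(days=cooldown_days)
```

For example, a record that failed on 1 January would be skipped on 15 January and considered again on 31 January under these assumptions.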
Exception notification
You are notified of any exception that occurs due to validation failures and of any failure during an archival operation.
The exceptions from the Archive Service agent are published by raising an alert or raising an event.
Raising alerts
Alerts are raised for any failure in order archival processing. The alerts raised by the Archive Service processing agent are like any other alerts raised for operational exceptions. All exceptions are logged with ExceptionType='AGENTEXCEPTION' and are created in the YFS_Inbox table.
Alerts raised due to a property or connection validation failure, which is checked before the jobs are fetched for archival, are consolidated.
Alerts raised when archival fails for a history order are not consolidated and contain order-specific details.
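The difference between consolidated and non-consolidated alerts can be pictured with a small conceptual sketch. It does not describe how alerts are stored in the YFS_Inbox table: the dictionary keys `description`, `order_no`, and `count` are hypothetical and exist only to show that repeated validation alerts collapse into one entry while order-specific alerts stay separate.

```python
from typing import Dict, List, Optional


def consolidate(alerts: List[Dict[str, Optional[str]]]) -> List[dict]:
    """Merge alerts that carry no order details; keep order-specific alerts as-is.

    Each input alert is a dict with a 'description' and an optional 'order_no'
    (both keys are assumptions made for this illustration).
    """
    merged: Dict[str, dict] = {}
    output: List[dict] = []
    for alert in alerts:
        if alert.get("order_no") is None:
            key = alert["description"]
            if key in merged:
                # Same validation failure seen again: bump the counter only.
                merged[key]["count"] += 1
            else:
                entry = {**alert, "count": 1}
                merged[key] = entry
                output.append(entry)
        else:
            # Order-specific alerts are never merged.
            output.append({**alert, "count": 1})
    return output
```

Under these assumptions, three identical connection-validation alerts would produce one entry with a count of 3, whereas two failures for different orders would produce two entries.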
Raising events
The Archive Service agent raises the ON_FAILURE event for any failure that requires manual intervention. This event, if enabled, can publish the information as illustrated in the following template:
<OrderArchive FailureType="" FailureStatus="" FailureCode="" ErrorCode="" ErrorDescription="" ErrorRelatedMoreInfo="" OccuredOn="" Comments="">
<Order DocumentType="" EnterpriseCode="" OrderHeaderKey="" OrderNo="" Id="">
<Part Name="" />
</Order>
<ErrorReferences>
<ErrorReference Name="" Value="" />
</ErrorReferences>
<StackTrace/>
</OrderArchive>
This is configurable, and the event template OSI_ORDER_ARCHIVE.ON_FAILURE.xml is present in the <INSTALL_DIR>/repository/xapi/template/merged/event directory.
Sterling Order Management System Software does not provide any default event handlers such as conditions, actions, or services to trigger a process when this event is raised.
You can define actions to publish the event output to the database, create an alert, send an email, and so forth, and define event handlers by providing conditions that determine the types of actions that are performed when this event is raised.
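As a rough illustration of what a downstream consumer of the published event output might do, the following sketch parses XML shaped like the template and chooses actions from it. Everything beyond the element and attribute names in the template is an assumption: the sample values, the function names, and the routing rule (always create an alert, and send an email when an order number is present) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample payload shaped like the event template; all values are invented.
SAMPLE_EVENT = """\
<OrderArchive FailureType="" FailureStatus="503" FailureCode="" ErrorCode="ERR001"
              ErrorDescription="Service unavailable" ErrorRelatedMoreInfo=""
              OccuredOn="" Comments="">
  <Order DocumentType="0001" EnterpriseCode="DEFAULT" OrderHeaderKey="12345"
         OrderNo="Y100001" Id="">
    <Part Name="OrderLines" />
  </Order>
  <ErrorReferences>
    <ErrorReference Name="Attempt" Value="1" />
  </ErrorReferences>
</OrderArchive>
"""


def parse_failure_event(xml_text: str) -> dict:
    """Extract the attributes, order details, parts, and error references."""
    root = ET.fromstring(xml_text)
    order = root.find("Order")
    return {
        "attributes": dict(root.attrib),
        "order": dict(order.attrib) if order is not None else {},
        "parts": [p.get("Name") for p in root.findall("Order/Part")],
        "references": {
            r.get("Name"): r.get("Value")
            for r in root.findall("ErrorReferences/ErrorReference")
        },
    }


def choose_actions(event: dict) -> list:
    """Hypothetical condition: email only when the failure names an order."""
    actions = ["create_alert"]
    if event["order"].get("OrderNo"):
        actions.append("send_email")
    return actions


event = parse_failure_event(SAMPLE_EVENT)
actions = choose_actions(event)
```

Keeping parsing separate from the routing rule means the condition can change without touching how the event is read.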
The following table describes the attributes published for the exceptions from the Archive Service agent.
Attributes | Description |
---|---|
FailureType | The type of failure. Three types of failures can occur in the Archive Service agent. |
FailureStatus | HTTP REST status received in the failure response. This is not published for configuration failures. |
FailureCode | A standard failure code received from Archive Service. This is published only if the agent is able to connect to Archive Service and receive a failure response. |
ErrorCode | The error code for the error. You can view the description and cause of the errors, as well as the actions to troubleshoot them in Sterling Order Management System Software. |
ErrorDescription | Provides the error description of the error code. |
ErrorRelatedMoreInfo | Provides the innermost cause of the failure. |
OccuredOn | Date on which the failure occurred. |
Comments | Provides additional information about the scenario. |
Order | This element contains attribute details of the history order for which the archival attempt failed. |
ErrorReference | Contains name-value pairs that provide additional information about the error that occurred. It includes any context-specific information, if available. |
StackTrace | The entire stack trace of the exception. This is not present in the application-provided template for the |