Data payload corruption
Data that is transferred to complete an HTTP request can be modified unintentionally and silently. Some applications can tolerate such corruption, but for sensitive data it is unacceptable. AS4 Microservice provides features that minimize undetected corruption and prevent the acceptance of data that is known to be corrupted.
As data passes through multiple components across the Internet, it can be corrupted in various ways. Proxy servers can introduce corruption, and corruption can also result from errors on the disks that hold the data. Both downloaded data and uploaded data can be affected.
The MD5 algorithm converts any sequence of bytes (zero or more) into a single 16-byte value that is called a hash. The algorithm is defined in IETF RFC 1321. An implementation in any programming language, on any operating system or hardware, computes the same value for the same input sequence. MD5 was designed as a cryptographic hash: it is very unlikely that two different inputs produce the same 16-byte output by chance, so accidental corruption almost always changes the hash.
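
The properties described above can be checked with a few lines of Python. This is a minimal sketch that uses the standard `hashlib` module; the helper name `md5_digest` and the sample payloads are illustrative only and are not part of AS4 Microservice.

```python
import hashlib

def md5_digest(data: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of data (RFC 1321)."""
    return hashlib.md5(data).digest()

# Any input, including an empty one, maps to exactly 16 bytes.
empty = md5_digest(b"")

# The same input always produces the same digest.
first = md5_digest(b"example payload")
second = md5_digest(b"example payload")

# A small accidental change (two bytes swapped) produces a different digest.
corrupted = md5_digest(b"example paylaod")
```

Comparing `first` with `corrupted` is the whole idea behind the checks in this topic: the receiver does not need the original bytes, only the digest that the sender computed.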
POST request
If a POST request includes a Content-MD5 header, the server independently computes the MD5 value from the bytes of the body. If any bytes changed in transit, the MD5 value that the server computes differs from the value that the client specified in the header.
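
The POST check can be sketched as follows. This is an illustration under stated assumptions, not the AS4 Microservice implementation: the function names `content_md5` and `body_matches_header` are hypothetical, and the header value is assumed to be the Base64 encoding of the binary MD5 digest, as IETF RFC 1864 describes.

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Client side: Base64-encode the 16-byte MD5 digest of the body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def body_matches_header(body: bytes, header_value: str) -> bool:
    """Server side: recompute the digest from the received bytes and
    compare it with the value that the client sent in Content-MD5."""
    return content_md5(body) == header_value.strip()

# The client computes the header value before it sends the request.
sent_body = b"order=42&amount=100"
header_value = content_md5(sent_body)

# One byte is changed in transit; the server-side comparison fails.
received_body = b"order=42&amount=900"
intact = body_matches_header(sent_body, header_value)
damaged = body_matches_header(received_body, header_value)
```

In this sketch `intact` is true and `damaged` is false, which is the signal the server needs to refuse a body that no longer matches what the client sent.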
Calculate the hash by using commands and provide it in the following format:
Content-MD5: 5MTASjcWUgmtLbAi8AZ0jQ==
where Content-MD5 is the header name and 5MTASjcWUgmtLbAi8AZ0jQ== is the value.
If Content-MD5 is not included in the request headers, the storage server can generate the hash for each blob based on the following property:
StoreWithMD5Always="true"
If StoreWithMD5Always="true", the hash is stored in the metadata that is associated with the payload. An example of the hash value that is stored in the metadata is:
<entry key="md5Digest">5MTASjcWUgmtLbAi8AZ0jQ==</entry>
where 5MTASjcWUgmtLbAi8AZ0jQ== is the MD5 hash value that is represented as a string according to the IETF RFC 1864 specification.
GET request
If the blob metadata does not include a hash, the GET response does not contain a Content-MD5 header. When the blob metadata does include the MD5 digest, the server returns that value in a Content-MD5 header in the response.
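
The relationship between the upload options, the md5Digest metadata entry, and the GET response header can be sketched as below. The function names and the dictionary representation of metadata are hypothetical. The sketch also assumes that a hash supplied by the client is kept in the metadata once it has been verified, which this topic implies but does not state outright.

```python
import base64
import hashlib
from typing import Dict, Optional

def metadata_for_upload(body: bytes,
                        header_md5: Optional[str],
                        store_with_md5_always: bool) -> Dict[str, str]:
    """Decide which digest, if any, is kept in the blob metadata."""
    metadata: Dict[str, str] = {}
    if header_md5 is not None:
        # Assumption: the value was already verified against the body.
        metadata["md5Digest"] = header_md5
    elif store_with_md5_always:
        # No Content-MD5 header was sent, so the server computes the hash.
        digest = hashlib.md5(body).digest()
        metadata["md5Digest"] = base64.b64encode(digest).decode("ascii")
    return metadata

def get_response_headers(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return Content-MD5 only when the metadata holds a digest."""
    headers: Dict[str, str] = {}
    if "md5Digest" in metadata:
        headers["Content-MD5"] = metadata["md5Digest"]
    return headers

with_hash = get_response_headers(metadata_for_upload(b"blob", None, True))
without_hash = get_response_headers(metadata_for_upload(b"blob", None, False))
```

Here `with_hash` contains a Content-MD5 entry and `without_hash` is empty, which mirrors the two GET cases described above.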
If any component retrieves a blob from storage and finds the Content-MD5 header, the component can calculate the MD5 value as the data is being received. After the last byte of the GET response is read, the calculated digest value is compared to the value in the header. If there is a mismatch, the component rejects the response. In this scenario, the client computes and verifies the hash.
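
A receiving component could perform this check incrementally, without a second pass over the data. The following sketch assumes that the response body is available as a file-like object and that the Content-MD5 value is Base64-encoded; `read_verified` and `DigestMismatch` are hypothetical names.

```python
import base64
import hashlib
import io
from typing import BinaryIO, Optional

class DigestMismatch(Exception):
    """Raised when the received bytes do not match the Content-MD5 header."""

def read_verified(stream: BinaryIO,
                  content_md5: Optional[str],
                  chunk_size: int = 64 * 1024) -> bytes:
    """Read the whole body, updating the MD5 state as each chunk arrives."""
    hasher = hashlib.md5()
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        chunks.append(chunk)
    # The comparison happens only after the last byte has been read.
    if content_md5 is not None:
        actual = base64.b64encode(hasher.digest()).decode("ascii")
        if actual != content_md5:
            raise DigestMismatch(f"expected {content_md5}, computed {actual}")
    return b"".join(chunks)

# Usage: an in-memory stream stands in for the GET response body.
body = b"retrieved blob contents"
header = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
data = read_verified(io.BytesIO(body), header, chunk_size=8)
```

When the header is absent (`content_md5` is `None`), the sketch returns the data unchecked, because there is nothing to compare against.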
Additionally, the storage component computes the hash of the data that it reads from the blob and compares it with the hash in the blob metadata. If there is a mismatch, the storage component withholds the last byte from the client, which forces the client to reject the response.
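
One way to picture the storage-side behavior is a generator that always keeps the most recent byte unsent until the digest is confirmed. This is only a sketch of the idea, not the actual storage component: `send_blob` and `StoredBlobCorrupted` are hypothetical names, and how the client detects the incomplete body is outside the sketch.

```python
import base64
import hashlib
import io
from typing import BinaryIO, Iterator

class StoredBlobCorrupted(Exception):
    """Raised when the blob bytes no longer match the stored digest."""

def send_blob(blob: BinaryIO, stored_md5: str,
              chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the blob bytes, holding back the final byte until verified."""
    hasher = hashlib.md5()
    held = b""  # the most recent byte, not yet sent to the client
    while True:
        chunk = blob.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        data = held + chunk
        if len(data) > 1:
            yield data[:-1]  # send everything except the newest byte
        held = data[-1:]
    actual = base64.b64encode(hasher.digest()).decode("ascii")
    if actual != stored_md5:
        # The last byte is never sent, so the client receives a short body.
        raise StoredBlobCorrupted(f"expected {stored_md5}, computed {actual}")
    if held:
        yield held

# Usage: with a matching digest, the client receives every byte.
blob_bytes = b"abcdef"
good_md5 = base64.b64encode(hashlib.md5(blob_bytes).digest()).decode("ascii")
delivered = b"".join(send_blob(io.BytesIO(blob_bytes), good_md5, chunk_size=4))
```

Holding back a single byte lets the storage component stream large blobs without buffering them, while still keeping the ability to leave the response incomplete if the final comparison fails.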