JMS client message conversion and encoding

This topic lists the methods that perform JMS client message conversion and encoding, with code examples of each type of conversion.

Conversion and encoding occur when Java primitives or objects are read from or written to JMS messages. The conversion is called JMS client data conversion to distinguish it from queue manager data conversion and application data conversion. The conversion takes place strictly when data is read from or written to a JMS message. Text is converted between the internal 16-bit Unicode representation (see footnote 1) and the character set used for text in messages. Numeric data is converted between Java primitive numeric types and the encoding defined for the message. Whether conversion is performed, and what type of conversion is performed, depends on the JMS message type and the read or write operation.
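
Footnote 1 points out that some Unicode characters need more than 16 bits. The following minimal sketch uses only standard Java SE string methods to show the difference between 16-bit code units and code points. The class and method names are illustrative and are not part of WebSphere MQ.

```java
// Sketch: count 16-bit code units versus Unicode code points in a String.
class Utf16Units {
    // Number of 16-bit char values the string occupies.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points the string represents.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String basic = "A";                     // U+0041, fits in one char
        String supplementary = "\uD83D\uDE00";  // U+1F600, stored as a surrogate pair
        System.out.println(codeUnits(basic) + " code unit(s), "
                + codePoints(basic) + " code point(s)");
        System.out.println(codeUnits(supplementary) + " code unit(s), "
                + codePoints(supplementary) + " code point(s)");
    }
}
```

A character outside the Basic Multilingual Plane therefore counts as two char values, which matters when you size buffers by character count.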

Table 1 categorizes the read and write methods for different JMS message types by the type of conversion performed. The conversion types are described in the text following the table.
Table 1. Message types and conversion types
The columns after Message type are the conversion types.

| Message type | Text | Numeric | Other | None |
| --- | --- | --- | --- | --- |
| JMSObjectMessage | | | | getObject, setObject |
| JMSTextMessage | getText, setText | | | |
| JMSBytesMessage | readUTF, writeUTF | readDouble, readFloat, readInt, readLong, readShort, readUnsignedShort, writeDouble, writeFloat, writeInt, writeLong, writeShort | readBoolean, readObject, writeBoolean, writeObject | readByte, readUnsignedByte, readBytes, readChar, writeByte, writeBytes, writeChar |
| JMSStreamMessage | readString, writeString | readDouble, readFloat, readInt, readLong, readShort, writeDouble, writeFloat, writeInt, writeLong, writeShort | readBoolean, readObject, writeBoolean, writeObject | readByte, readBytes, readChar, writeByte, writeBytes, writeChar |
| JMSMapMessage | getString, setString | getDouble, getFloat, getInt, getLong, getShort, setDouble, setFloat, setInt, setLong, setShort | getBoolean, getObject, setBoolean, setObject | getByte, getBytes, getChar, setByte, setBytes, setChar |

Text

The default CodedCharacterSetId for a destination is 1208, UTF-8. By default, text is converted from Unicode and sent as a UTF-8 text string. On receive, the text is converted from the coded character set in the message received by the client, into Unicode.

The setText and writeString methods convert text from Unicode into the character set defined for the destination. An application can override the destination character set by setting the message property JMS_IBM_CHARACTER_SET. When sending a message, JMS_IBM_CHARACTER_SET must be a numeric coded character set identifier (see footnote 2).

The code snippets in Sending and receiving a JMSTextMessage send two messages. One is sent in the character set defined for the destination and the other in character set 37, defined by the application.

The getText and readString methods convert the text in the message from the character set defined in the message into Unicode. The methods use the code page defined in the message property, JMS_IBM_CHARACTER_SET. The code page is mapped from MQRFH2.CodedCharacterSetId unless the message is an MQ-type message and has no MQRFH2. If the message is an MQ-type message with no MQRFH2, the code page is mapped from MQMD.CodedCharacterSetId.
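
The selection rule in the preceding paragraph can be sketched as a small function. This is an illustration only: the class, the method, and the use of OptionalInt to model an absent MQRFH2 are hypothetical and do not correspond to any WebSphere MQ API.

```java
import java.util.OptionalInt;

// Hypothetical model of the rule: prefer the MQRFH2 value, otherwise use the MQMD value.
class CcsidResolution {
    // rfh2Ccsid is empty when the message has no MQRFH2 (an MQ-type message).
    static int effectiveCcsid(OptionalInt rfh2Ccsid, int mqmdCcsid) {
        return rfh2Ccsid.orElse(mqmdCcsid);
    }

    public static void main(String[] args) {
        // Message with an MQRFH2: its CodedCharacterSetId is used.
        System.out.println(effectiveCcsid(OptionalInt.of(1208), 37));
        // MQ-type message without an MQRFH2: the MQMD value is used.
        System.out.println(effectiveCcsid(OptionalInt.empty(), 37));
    }
}
```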

The code snippet in Figure 5 receives the message that was sent to the destination. The text in the message is converted from code page IBM037 back into Unicode.
Note: A simple way to check that the text is converted to coded character set 37 is to use WebSphere® MQ Explorer. Browse the queue and show the properties of the message before it is retrieved.

Contrast the code snippet in Figure 4 with the incorrect code snippet in Figure 1. In the incorrect snippet the text string is converted twice, once by the application, and again by WebSphere MQ.

Figure 1. Incorrect code page conversion
TextMessage tmo = session.createTextMessage();
tmo.setIntProperty(WMQConstants.JMS_IBM_CHARACTER_SET, 37);
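// Incorrect: the application encodes the text itself, and the text is converted again when it is sent.
tmo.setText(new String("Sent in EBCDIC character set 37".getBytes(CCSID.getCodepage(37))));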
producer.send(tmo);
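
To see why the application-side conversion in Figure 1 damages the text, the following sketch reproduces the first half of the mistake with standard Java SE classes only. It assumes that decoding bytes with a charset other than the one that produced them is a fair stand-in for new String(bytes) picking up the JVM default charset. The class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch: encode text with one charset, then decode the bytes with another.
class DoubleConversionDemo {
    static String misdecode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // IBM037 is provided by an optional JDK module; fall back to UTF-16BE if it is absent.
        Charset app = Charset.isSupported("IBM037")
                ? Charset.forName("IBM037")
                : StandardCharsets.UTF_16BE;
        String text = "Sent in EBCDIC character set 37";

        // Same charset in both directions: the text survives.
        System.out.println("round trip intact: " + text.equals(misdecode(text, app, app)));
        // Different charsets: the String no longer holds the original characters.
        System.out.println("mismatch garbles:  "
                + !text.equals(misdecode(text, app, StandardCharsets.ISO_8859_1)));
    }
}
```

Once the String holds the wrong characters, any later conversion on send converts the damaged text, not the original.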

The writeUTF method converts text from Unicode to 1208, UTF-8. The text string is prefaced with a 2-byte length. The maximum length of the text string is 65534 bytes. The readUTF method reads an item in a message written by the writeUTF method. It reads exactly the number of bytes written by the writeUTF method.
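
The length-prefixed layout can be inspected without a queue manager. The following sketch uses java.io.DataOutputStream as a stand-in, on the assumption that its writeUTF layout (a 2-byte length followed by the encoded bytes) is analogous to the one described here; it is not the WebSphere MQ implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch: show the bytes produced by a writeUTF-style method.
class WriteUtfLayout {
    static byte[] writeUtf(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : writeUtf("abc")) {
            hex.append(String.format("%02x ", b));
        }
        // 2-byte length 0x0003, then the three UTF-8 bytes of "abc".
        System.out.println(hex.toString().trim()); // prints "00 03 61 62 63"
    }
}
```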

Numeric

The default numeric encoding for a destination is Native. The Native encoding constant for Java has the value 273, x'00000111', which is the same for all platforms. On receive, the numbers in the message are correctly transformed into numeric Java primitives. The transformation uses the encoding defined in the message and the type returned by the read method.
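
The value 273 is a combination of three fields. The sketch below splits it with bit masks; the mask values are assumptions based on the MQENC_* mask definitions and are declared locally here, so check them against the constants shipped with your WebSphere MQ installation.

```java
// Sketch: split an encoding value into its integer, decimal, and float fields.
class EncodingFields {
    // Assumed mask values, mirroring MQENC_INTEGER_MASK, MQENC_DECIMAL_MASK, MQENC_FLOAT_MASK.
    static final int INTEGER_MASK = 0x00F;
    static final int DECIMAL_MASK = 0x0F0;
    static final int FLOAT_MASK = 0xF00;

    static int integerPart(int encoding) {
        return encoding & INTEGER_MASK;
    }

    static int decimalPart(int encoding) {
        return encoding & DECIMAL_MASK;
    }

    static int floatPart(int encoding) {
        return encoding & FLOAT_MASK;
    }

    public static void main(String[] args) {
        int nativeEncoding = 273; // x'00000111'
        System.out.printf("0x%08X: integer=0x%X decimal=0x%X float=0x%X%n",
                nativeEncoding, integerPart(nativeEncoding),
                decimalPart(nativeEncoding), floatPart(nativeEncoding));
        // prints "0x00000111: integer=0x1 decimal=0x10 float=0x100"
    }
}
```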

The send method converts numbers that are added to a message by the set and write methods into the numeric encoding defined for the destination. The destination encoding can be overridden for a message by an application setting the message property, JMS_IBM_ENCODING; for example:
message.setIntProperty(WMQConstants.JMS_IBM_ENCODING, WMQConstants.WMQ_ENCODING_INTEGER_REVERSED);
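
The effect of a reversed integer encoding can be pictured with java.nio.ByteBuffer. This sketch assumes that the normal integer encoding corresponds to big-endian byte order and the reversed encoding to little-endian; it illustrates byte order only and does not call WebSphere MQ.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: the same int laid out in normal (big-endian) and reversed (little-endian) order.
class IntegerByteOrder {
    static byte[] encodeInt(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    static int decodeInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] normal = encodeInt(256, ByteOrder.BIG_ENDIAN);
        byte[] reversed = encodeInt(256, ByteOrder.LITTLE_ENDIAN);
        System.out.println(hex(normal));   // prints "00 00 01 00"
        System.out.println(hex(reversed)); // prints "00 01 00 00"
        // Reading with the wrong byte order yields a different number.
        System.out.println(decodeInt(normal, ByteOrder.LITTLE_ENDIAN)); // prints "65536"
    }
}
```

This is why the receiver needs the encoding recorded in the message: the bytes alone do not say which order was used.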

The get and read numeric methods convert numbers in the message from the numeric encoding defined in the message. They convert the numbers to the type that is specified by the read or get method; see The ENCODING property. The methods use the encoding defined in JMS_IBM_ENCODING. The encoding is mapped from MQRFH2.Encoding unless the message is an MQ-type message and has no MQRFH2. If the message is an MQ-type message with no MQRFH2, the methods use the encoding defined in MQMD.Encoding.

The example in Figure 6 shows an application encoding a number in the destination format and sending it in a JMSStreamMessage. Compare the example in Figure 6 to the example in Figure 7. The difference is that JMS_IBM_ENCODING must be set in a JMSBytesMessage.
Note: A simple way to check that the number is encoded correctly is to use WebSphere MQ Explorer. Browse the queue and show the properties of the message before it is consumed.

Other

The boolean methods encode true and false as x'01' and x'00' in a JMSBytesMessage, JMSStreamMessage, and JMSMapMessage.

The UTF methods encode and decode Unicode into UTF-8 text strings. The strings are limited to less than 65536 characters, and are preceded by a 2-byte length field.

The Object methods wrap primitive types as objects. Numeric and text types are encoded or converted as if the primitive types had been read or written using the numeric and text methods.

None

The readByte, readBytes, readUnsignedByte, writeByte, and writeBytes methods get or put single bytes, or arrays of bytes, between the application and the message without conversion. The readChar and writeChar methods get and put 2 byte Unicode characters between the application and the message without conversion.
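
The fixed representations described in the Other and None sections can be seen in a short sketch. It uses java.io.DataOutputStream as a stand-in, assuming its boolean and char layouts match the ones described here (one byte for a boolean, two bytes for a char); it does not use a JMS message.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch: byte layout of a boolean followed by a char.
class FixedRepresentations {
    static byte[] encode(boolean flag, char c) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBoolean(flag); // one byte: 0x01 for true, 0x00 for false
            out.writeChar(c);       // two bytes, high-order byte first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encode(true, 'A')) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "01 00 41"
    }
}
```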

Using the readBytes and writeBytes methods, the application can perform its own code point conversion, as in Sending and receiving text in a JMSBytesMessage.

WebSphere MQ does not perform any code page conversion in the client because the message is a JMSBytesMessage, and because the readBytes and writeBytes methods are used. Nonetheless, if the bytes represent text, make sure that the code page used by the application matches the coded character set of the destination. The message might be converted again by a queue manager conversion exit. Another possibility is that the receiving JMS client program might follow the convention of converting any byte arrays representing text in the message into strings or characters using the JMS_IBM_CHARACTER_SET property in the message.

In this example the client uses the destination coded character set for its conversion:

bytes.writeBytes("In the destination code page".getBytes(
      CCSID.getCodepage(((MQDestination) destination)
           .getIntProperty(WMQConstants.WMQ_CCSID))));

Alternatively, the client might have chosen a code page and then set the corresponding coded character set in the JMS_IBM_CHARACTER_SET property of the message. The WebSphere MQ classes for Java use JMS_IBM_CHARACTER_SET to set the CodedCharacterSetId field in the JMS properties in the MQRFH2, or in the message descriptor, MQMD (see footnote 3):

int ccsid = 37;
String codePage = CCSID.getCodepage(ccsid);  // Java charset name, used to encode the bytes
message.setIntProperty(WMQConstants.JMS_IBM_CHARACTER_SET, ccsid);  // numeric identifier

If a byte array is written into a JMSStreamMessage or JMSMapMessage, WebSphere MQ classes for JMS do not perform data conversion, because the bytes are typed as hexadecimal data, not as text, in the JMSStreamMessage and JMSMapMessage.

If the bytes represent characters in your application, you must take into account what code points to read and write to the message. The code in Figure 2 follows the convention of using the destination coded character set. If you create the string using the default character set for the JVM, the byte contents depend on the platform. A JVM on Windows typically has a default Charset of windows-1252, and UNIX, UTF-8. Interchange between Windows and UNIX does require that you select an explicit code page for exchanging text as bytes.

Figure 2. Writing bytes representing a string in a JMSStreamMessage using the destination character set
StreamMessage smo = session.createStreamMessage();
// Encode the string in the code page that corresponds to the destination CCSID.
smo.writeBytes("123".getBytes(CCSID.getCodepage(((MQDestination) destination)
           .getIntProperty(WMQConstants.WMQ_CCSID))));
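
The platform dependence described above can be demonstrated with standard Java SE charsets. In this sketch ISO-8859-1 stands in for a single-byte Windows code page, which is an assumption made so that the example runs on any JVM; the class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch: the same text yields different bytes under different charsets.
class ExplicitCharsetBytes {
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String ascii = "123";
        String accented = "caf\u00E9"; // ends with e-acute

        // ASCII digits are encoded identically in both charsets.
        System.out.println(Arrays.equals(
                encode(ascii, StandardCharsets.ISO_8859_1),
                encode(ascii, StandardCharsets.UTF_8)));
        // The accented character takes one byte in ISO-8859-1 and two in UTF-8.
        System.out.println(encode(accented, StandardCharsets.ISO_8859_1).length);
        System.out.println(encode(accented, StandardCharsets.UTF_8).length);
        // What getBytes() with no argument would use on this JVM.
        System.out.println("default charset: " + Charset.defaultCharset());
    }
}
```

Text limited to ASCII hides the problem, which is why naming the charset explicitly is the safer habit.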

Examples

Sending and receiving a JMSTextMessage

A text message cannot contain text in different character sets. The example shows text in different character sets, sent in two different messages.

Figure 3. Send text message in the character set defined by the destination
TextMessage tmo = session.createTextMessage();
tmo.setText("Sent in the character set defined for the destination");
producer.send(tmo);

Figure 4. Send text message in CCSID 37
TextMessage tmo = session.createTextMessage();
tmo.setIntProperty(WMQConstants.JMS_IBM_CHARACTER_SET, 37);
tmo.setText("Sent in EBCDIC character set 37");
producer.send(tmo);

Figure 5. Receive text message
TextMessage tmi = (TextMessage)consumer.receive();
System.out.println(tmi.getText());
...
Sent in the character set defined for the destination

Encoding examples

The examples show a number being sent in the encoding defined for a destination. Notice that you must set the JMS_IBM_ENCODING property of a JMSBytesMessage to the value specified for the destination.

Figure 6. Sending a number using the destination encoding in a JMSStreamMessage
StreamMessage smo = session.createStreamMessage();
smo.writeInt(256);
producer.send(smo);
...
StreamMessage smi = (StreamMessage)consumer.receive();
System.out.println(smi.readInt());
...
256

Figure 7. Sending a number using the destination encoding in a JMSBytesMessage
BytesMessage bmo = session.createBytesMessage();
bmo.writeInt(256);
// Copy the destination encoding into the message property.
int encoding = ((MQDestination) (destination)).getIntProperty
    (WMQConstants.WMQ_ENCODING);
bmo.setIntProperty(WMQConstants.JMS_IBM_ENCODING, encoding);
producer.send(bmo);
...
BytesMessage bmi = (BytesMessage)consumer.receive();
System.out.println(bmi.readInt());
...
256

Sending and receiving text in a JMSBytesMessage

The code in Figure 8 sends a string in a BytesMessage. For simplicity, the example sends a single string, for which a JMSTextMessage is more appropriate. To receive a text string in a bytes message containing a mixture of types, you must know the length of the string in bytes, called TEXT_LENGTH in Figure 9. Even for a string with a fixed number of characters, the length of the byte representation might be longer than the number of characters.

Figure 8. Sending a String in a JMSBytesMessage
BytesMessage bytes = session.createBytesMessage();
String codePage = CCSID.getCodepage(((MQDestination) destination)
                  .getIntProperty(WMQConstants.WMQ_CCSID));
bytes.writeBytes("In the destination code page".getBytes(codePage));
producer.send(bytes);

Figure 9. Receiving a String from a JMSBytesMessage
BytesMessage message = (BytesMessage)consumer.receive();
// The body holds only the text, so its length is the text length in bytes.
int TEXT_LENGTH = (int) message.getBodyLength();
byte[] textBytes = new byte[TEXT_LENGTH];
message.readBytes(textBytes, TEXT_LENGTH);
String codePage = message.getStringProperty(WMQConstants.JMS_IBM_CHARACTER_SET);
String textString = new String(textBytes, codePage);
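
The point about byte length differing from character count can be checked with a standalone sketch. It packs a string and an int into one byte array and reads the text back by byte length. The layout, the class, and the method names are hypothetical and use java.nio only; they are not the JMSBytesMessage format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch: text followed by an int in one body; the text is recovered by byte length.
class MixedBodyText {
    static byte[] pack(String text, int number, Charset charset) {
        byte[] textBytes = text.getBytes(charset);
        return ByteBuffer.allocate(textBytes.length + Integer.BYTES)
                .put(textBytes)
                .putInt(number)
                .array();
    }

    // textLength is the length of the text in bytes, not in characters.
    static String unpackText(byte[] body, int textLength, Charset charset) {
        return new String(body, 0, textLength, charset);
    }

    public static void main(String[] args) {
        String text = "\u20AC5"; // euro sign followed by the digit 5: two characters
        int byteLength = text.getBytes(StandardCharsets.UTF_8).length;
        System.out.println(text.length() + " characters, " + byteLength + " bytes");
        // prints "2 characters, 4 bytes"

        byte[] body = pack(text, 7, StandardCharsets.UTF_8);
        System.out.println(text.equals(unpackText(body, byteLength, StandardCharsets.UTF_8)));
    }
}
```

Using the character count (2) instead of the byte count (4) as the length would truncate the text in the middle of the euro sign.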

1 Some Unicode characters require more than 16 bits. See a Java SE reference.
2 When receiving a message, JMS_IBM_CHARACTER_SET is a Java Charset code page name.
3 setStringProperty(WMQConstants.JMS_IBM_CHARACTER_SET, codePage) currently accepts only numeric character set identifiers.