SMPP Submit Long Message and message split - java

We are using the Cloudhopper SMPP library to send long SMS messages to the SMS gateway Innovativetxt.com. When we split a long message into parts of 140 bytes each, each part ends up holding only 134 characters.
However, the industry standard seems to be 153 characters per part for a GSM-encoded long message. Are we doing something wrong by getting only 134 characters when we split at 140 bytes? If we try to submit a message larger than 140 bytes, the gateway provider rejects it with an "oversized message body" error.
Should we split the message into parts of 153 characters each before submitting to the SMSC, instead of splitting at 140 bytes?
What is the best way to split a long message: by message size (i.e. 140 bytes) or by character count?
Has anyone faced the same issue with Cloudhopper or another Java-based library? What should we do?

It's a common confusion. You are doing everything right. A single message can hold 160 chars (7-bit GSM 03.38), 140 chars (8-bit Latin), or 70 chars (16-bit UCS-2). Notice: 160 * 7 == 140 * 8 == 70 * 16.
When you split a long message, additional information such as the total number of parts and the part index is stored in the message body, in the so-called User Data Header (UDH). This header also takes up space. So with the UDH you are left with a payload of 153 GSM chars (7-bit), 134 chars/bytes (8-bit), or 67 two-byte Unicode chars (16-bit).
See also http://www.nowsms.com/long-sms-text-messages-and-the-160-character-limit
The UDH is 6 bytes long for a concatenated message with an 8-bit reference number, as in your case.
UDH structure
0x05: Length of UDH (5 bytes to follow)
0x00: Concatenated message Information Element (8-bit reference number)
0x03: Length of Information Element data (3 bytes to follow)
0xXX: Reference number for this concatenated message
0xYY: Number of fragments in the concatenated message
0xZZ: Fragment number/index within the concatenated message
Total message length, bits: 160*7 = 140*8 = 1120
UDH length, bits: 6*8 = 48
Left payload, bits: 1120-48 = 1072
For GSM 03.38 you get 1072/7 = 153 GSM (7-bit) chars + 1 filling unused bit.
For Latin you get 1072/8 = 134 (8-bit) chars.
For UCS-2 you get 1072/16 = 67 (16-bit) chars.
As you can see, 153 GSM chars equal 134 bytes minus 1 bit. These 134 chars are probably what Java reports to you. But once you split your long text message, you end up with binary messages containing both text and UDH, and you should treat each of them as binary. I suggest making binary dumps of the resulting parts and inspecting them.
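The 8-bit arithmetic above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `ConcatSmsSplitter` and the method `split` are hypothetical and not part of Cloudhopper, the body is assumed to be already encoded as 8-bit data, and the reference number is supplied by the caller.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Cloudhopper API: splits an 8-bit body into
// parts that each start with the 6-byte concatenation UDH described above.
final class ConcatSmsSplitter {
    static final int MAX_PART_BYTES = 140;                        // 1120 bits per SMS
    static final int UDH_BYTES = 6;                               // 05 00 03 ref total index
    static final int PAYLOAD_BYTES = MAX_PART_BYTES - UDH_BYTES;  // 134

    static List<byte[]> split(byte[] body, byte reference) {
        int total = (body.length + PAYLOAD_BYTES - 1) / PAYLOAD_BYTES; // ceiling division
        if (total > 255) {
            throw new IllegalArgumentException("more than 255 parts");
        }
        List<byte[]> parts = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int offset = i * PAYLOAD_BYTES;
            int len = Math.min(PAYLOAD_BYTES, body.length - offset);
            byte[] part = new byte[UDH_BYTES + len];
            part[0] = 0x05;            // length of UDH (5 bytes follow)
            part[1] = 0x00;            // concatenation IE, 8-bit reference
            part[2] = 0x03;            // length of IE data
            part[3] = reference;       // same value in every part
            part[4] = (byte) total;    // number of fragments
            part[5] = (byte) (i + 1);  // fragment index, starting at 1
            System.arraycopy(body, offset, part, UDH_BYTES, len);
            parts.add(part);
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] body = "x".repeat(300).getBytes(StandardCharsets.ISO_8859_1);
        for (byte[] part : split(body, (byte) 0x2A)) {
            System.out.println(part.length + " bytes, part " + part[5] + " of " + part[4]);
        }
    }
}
```

Each resulting array would go out as the short message of its own submit, with the UDH indicator bit (0x40) set in esm_class so that the receiver interprets the first six bytes as a header rather than text.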

Hello. Here is a sample method for sending both short and long SMS messages:
public synchronized String sendSMSMessage(String aMessage,
String aSentFromNumber, String aSendToNumber,
boolean requestDeliveryReceipt) {
byte[] textBytes = CharsetUtil.encode(aMessage,
CharsetUtil.CHARSET_ISO_8859_1);
try {
SubmitSm submitMsg = new SubmitSm();
// add delivery receipt if enabled.
if (requestDeliveryReceipt) {
submitMsg
.setRegisteredDelivery(SmppConstants.REGISTERED_DELIVERY_SMSC_RECEIPT_REQUESTED);
}
submitMsg.setSourceAddress(new Address((byte) 0x03, (byte) 0x00,
aSentFromNumber));
submitMsg.setDestAddress(new Address((byte) 0x01, (byte) 0x01,
aSendToNumber));
if (textBytes != null && textBytes.length > 255) {
submitMsg.addOptionalParameter(new Tlv(SmppConstants.TAG_MESSAGE_PAYLOAD, textBytes, "message_payload"));
}else{
submitMsg.setShortMessage(textBytes);
}
logger.debug("About to send message to " + aSendToNumber
+ ", Msg is :: " + aMessage + ", from :: "
+ aSentFromNumber);
SubmitSmResp submitResp = smppSession.submit(submitMsg, 15000);
logger.debug("Message sent to " + aSendToNumber
+ " with message id " + submitResp.getMessageId());
return submitResp.getMessageId();
} catch (Exception ex) {
logger.error("Exception sending message [Msg, From, To] :: ["
+ aMessage + ", " + aSentFromNumber + ", " + aSendToNumber + "]",
ex);
}
logger.debug("Message **NOT** sent to " + aSendToNumber);
return "Message Not Submitted to " + aSendToNumber;
}

Related

How to properly parse a tcp packet in java?

I currently have a simple TCP server that calls a function on every new incoming packet and passes it the binary TCP payload as a byte array. What is the proper way to parse it?
I tried slicing it into separate byte arrays and processing them individually, but my packets include a variable-length datatype, which I am unable to split from the other data. This doesn't seem like the right approach, and I think there must be a better way to do it.
readVarInt() is a function that parses the variable-length datatype and returns the result and the length of the unparsed datatype in an int[] array.
PacketHeader is a class that has size, id and body fields. The id and size are already parsed, and the body field contains everything after the size and id bytes.
int[] parsedProtocolVersion = Main.readVarInt(packetHeader.body);
System.out.println(parsedProtocolVersion[0] + " " + parsedProtocolVersion[1]); // " " rather than ' ', which would be added to the ints as 32
int[] parsedServerAddressSize = Main.readVarInt( packetHeader.body.subList(parsedProtocolVersion[1], packetHeader.body.size()));
System.out.println("parsed 1 " + parsedServerAddressSize[1] + " " + parsedServerAddressSize[0]);
String parsedServerAddress = String.valueOf(packetHeader.body.subList(parsedServerAddressSize[1], parsedServerAddressSize[0]));
System.out.println("parsed 2");
int parsedServerPort = Integer.parseUnsignedInt((packetHeader.body.subList(parsedServerAddressSize[1]+parsedProtocolVersion[1], parsedServerAddressSize[1]+parsedProtocolVersion[1]+2).toString()));
System.out.println("parsed 3");
int[] parsedNextState = Main.readVarInt(packetHeader.body.subList(parsedServerAddressSize[1]+parsedProtocolVersion[1]+2, packetHeader.body.size()));
System.out.println("parsed 4");
return String.format("%d %s %d %d",parsedProtocolVersion[0], parsedServerAddress, parsedServerPort, parsedNextState[0]);
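One way to avoid the manual offset bookkeeping in the snippet above is to let a java.nio.ByteBuffer track the read position. The sketch below rests on assumptions that the question does not confirm: the variable-length integer is assumed to carry 7 data bits per byte with the high bit as a continuation flag, the address is assumed to be a UTF-8 string prefixed with its length as such an integer, and the port is assumed to be a big-endian unsigned 16-bit value. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical parser: the buffer's position acts as the cursor, so no
// manual subList/offset arithmetic is needed.
final class HandshakeParser {
    // Assumed encoding: 7 data bits per byte, high bit set while more bytes follow.
    static int readVarInt(ByteBuffer buf) {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = buf.get();
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarInt longer than 5 bytes");
    }

    // Assumed encoding: VarInt byte length followed by UTF-8 bytes.
    static String readString(ByteBuffer buf) {
        byte[] raw = new byte[readVarInt(buf)];
        buf.get(raw);
        return new String(raw, StandardCharsets.UTF_8);
    }

    static String parse(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body);   // big-endian by default
        int protocolVersion = readVarInt(buf);
        String serverAddress = readString(buf);
        int serverPort = buf.getShort() & 0xFFFF; // widen to an unsigned 16-bit value
        int nextState = readVarInt(buf);
        return String.format("%d %s %d %d",
                protocolVersion, serverAddress, serverPort, nextState);
    }

    public static void main(String[] args) {
        byte[] body = {
            (byte) 0xF4, 0x05,                                  // VarInt 756
            0x09, 'l', 'o', 'c', 'a', 'l', 'h', 'o', 's', 't',  // length-prefixed string
            0x63, (byte) 0xDD,                                  // port 25565
            0x01                                                // next state
        };
        System.out.println(parse(body));
    }
}
```

A truncated packet makes `get` throw BufferUnderflowException, which gives a single place to handle malformed input. Note also that TCP is a byte stream, so one read does not necessarily contain exactly one packet; the size field is what delimits them.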

About Java Android BASE64 decoding to ASCII String

I have Base64 string data that I have received from a service.
I am able to decode this data and get a byte array.
But when I create a new string from that byte array, my server is not able to read the data properly.
The same process in C on a Linux-based device works fine on my server side. That is to say, if I Base64-decode that same string (using OpenSSL, getting a char array) on that device and send it to my server, the server is able to read it properly.
Now, I tried some sample code in Eclipse to understand the problem. Below is the sample:
String base1 =
"sUqVKrgErEId6j3rH8BMMpzvXuTf05rj0PlO/eLOoJwQb3rXrsplAl28unkZP0WvrXRTlpAmT3Y
ohtPFl2+zyUaCSrYfug5JtVHLoVsJ9++Afpx6A5dupn3KJQ9L9ItfWvatIlamQyMo2S5nDypCw79
B2HNAR/PG1wfgYG5OPMNjNSC801kQSE9ljMg3hH6nrRJhXvEVFlllKIOXOYuR/NORAH9k5W+rQeQ
7ONsnao2zvYjfiKO6eGleL6/DF3MKCnGx1sbci9488EQhEBBOG5FGJ7KjTPEQzn/rq3m1Yj9Le/r
KsmzbRNcJN2p/wy1xz9oHy8jWDm81iwRYndJYAQ==";
byte[] b3 = Base64.getDecoder().decode(base1.getBytes());
System.out.println("B3Len:" + b3.length );
String s2 = new String(b3);
System.out.println("S2Len:" + s2.length() );
System.out.println("B3Hex: " + bytesToHex(b3) );
System.out.println("B3HexLen: " + bytesToHex(b3).length() );
byte[] b2 = s2.getBytes();
System.out.println("B2Len:" + b2.length );
int count = 0;
for(int i = 0; i< b3.length; i++) {
if(b3[i] != b2[i]) {
count++;
System.out.println("Byte: " + i + " >> " + b3[i] + " != " + b2[i]);
}
}
System.out.println("Count: " + count);
System.out.println("B2Hex: " + bytesToHex(b2) );
System.out.println("B2HexLen: " + bytesToHex(b2).length() );
Below is output:
B3Len:256
S2Len:256
B3Hex:
b14a952ab804ac421dea3deb1fc04c329cef5ee4dfd39ae3d0f94efde2cea09c106f7ad7aeca
65025dbcba79193f45afad74539690264f762886d3c5976fb3c946824ab61fba0e49b551cba1
5b09f7ef807e9c7a03976ea67dca250f4bf48b5f5af6ad2256a6432328d92e670f2a42c3bf41
d8734047f3c6d707e0606e4e3cc3633520bcd35910484f658cc837847ea7ad12615ef1151659
65288397398b91fcd391007f64e56fab41e43b38db276a8db3bd88df88a3ba78695e2fafc317
730a0a71b1d6c6dc8bde3cf0442110104e1b914627b2a34cf110ce7febab79b5623f4b7bfaca
b26cdb44d709376a7fc32d71cfda07cbc8d60e6f358b04589dd25801
B3HexLen: 512
B2Len:256
Byte: 52 >> -112 != 63
Byte: 175 >> -115 != 63
Byte: 252 >> -99 != 63
Count: 3
B2Hex:
b14a952ab804ac421dea3deb1fc04c329cef5ee4dfd39ae3d0f94efde2cea09c106f7ad7aeca
65025dbcba79193f45afad7453963f264f762886d3c5976fb3c946824ab61fba0e49b551cba1
5b09f7ef807e9c7a03976ea67dca250f4bf48b5f5af6ad2256a6432328d92e670f2a42c3bf41
d8734047f3c6d707e0606e4e3cc3633520bcd35910484f658cc837847ea7ad12615ef1151659
65288397398b91fcd391007f64e56fab41e43b38db276a3fb3bd88df88a3ba78695e2fafc317
730a0a71b1d6c6dc8bde3cf0442110104e1b914627b2a34cf110ce7febab79b5623f4b7bfaca
b26cdb44d709376a7fc32d71cfda07cbc8d60e6f358b04583fd25801
B2HexLen: 512
I understand that there are extended characters in this string.
So here we can see that converting the bytes to a string and back does not work properly, because the resulting byte arrays differ.
I actually need this to work because I have a much larger Base64 string than the one in this sample, and I need to send it to my server, which is trying to read an ASCII string.
Or,
can anyone give me a solution that produces an ASCII string identical to the char array output from C (OpenSSL decoding) on the Linux device?
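The three mismatching bytes in the output above (-112, -115 and -99, i.e. 0x90, 0x8D and 0x9D) all came back as 63, which is '?'. That pattern is consistent with `new String(b3)` using a platform default charset in which those byte values have no character assigned, windows-1252 being one such charset. A minimal sketch of a lossless round trip, assuming the goal is only to carry each byte in one char: ISO-8859-1 maps every byte value 0x00 to 0xFF to the char with the same value. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo of a byte-preserving String round trip.
final class Base64RoundTrip {
    // ISO-8859-1 assigns a char to every byte value, so nothing is replaced by '?'.
    static String toLatin1String(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The three byte values that were lost in the question, plus an ASCII 'A'.
        byte[] raw = {(byte) 0x90, (byte) 0x8D, (byte) 0x9D, 0x41};
        String encoded = Base64.getEncoder().encodeToString(raw);

        byte[] decoded = Base64.getDecoder().decode(encoded);
        String s = toLatin1String(decoded);
        byte[] back = s.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("round trip intact: " + Arrays.equals(decoded, back));
    }
}
```

The decoded data is not ASCII in any case, since ASCII only covers 0x00 to 0x7F. Where the transport allows it, sending the `byte[]` itself avoids the charset question altogether; the string form only survives if both ends agree on ISO-8859-1.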

How to cut a String into 1 megabyte subString with Java?

I have come up with the following:
public static void cutString(String s) {
List<String> strings = new ArrayList<>();
int index = 0;
while (index < s.length()) {
strings.add(s.substring(index, Math.min(index + 1048576, s.length())));
index += 1048576;
}
}
But my problem is that in UTF-8 some characters take more than 1 byte, so using 1048576 to decide where to cut the String does not work. I was thinking about maybe using an Iterator, but that doesn't seem efficient. What would be the most efficient solution for this? A chunk can be smaller than 1 MB to avoid slicing a character, just not bigger than that!
Quick, unsafe hack
You can use s.getBytes("UTF-8") to get an array with the actual bytes used by each UTF-8 character. Like this:
System.out.println("¡Adiós!".getBytes("UTF-8").length);
// Prints: 9
Once you have that, it's just a matter of splitting the byte array in chunks of length 1048576, and then turn the chunks back into UTF-8 strings with new String(chunk, "UTF-8").
However, by doing it like that you can break multi-byte characters at the beginning or end of the chunks. Say the 1048576th byte is the first byte of a 3-byte character: that byte would go into the first chunk and the other two bytes would be put into the second chunk, thus breaking the encoding.
Proper approach
If you can relax the "1 MB" requirement, you can take a safer approach: split the string into chunks of 1048576 characters (not bytes), and then test each chunk's real length with getBytes, removing chars from the end as needed until the real size is less than or equal to 1 MB.
Here's an implementation that won't break characters, at the expense of having some chunks smaller than the given size:
public static List<String> cutString(String original, int chunkSize, String encoding) throws UnsupportedEncodingException {
List<String> strings = new ArrayList<>();
final int end = original.length();
int from = 0, to = 0;
do {
to = (to + chunkSize > end) ? end : to + chunkSize; // next chunk, watch out for small strings
String chunk = original.substring(from, to); // get chunk
while (chunk.getBytes(encoding).length > chunkSize) { // adjust chunk to proper byte size if necessary
chunk = original.substring(from, --to);
}
strings.add(chunk); // add chunk to collection
from = to; // next chunk
} while (to < end);
return strings;
}
I tested it with chunkSize = 24 so you could see the effect. It should work as well with any other size:
String test = "En la fase de maquetación de un documento o una página web o para probar un tipo de letra es necesario visualizar el aspecto del diseño. ٩(-̮̮̃-̃)۶ ٩(●̮̮̃•̃)۶ ٩(͡๏̯͡๏)۶ ٩(-̮̮̃•̃).";
for (String chunk : cutString(test, 24, "UTF-8")) {
System.out.println(String.format(
"Chunk [%s] - Chars: %d - Bytes: %d",
chunk, chunk.length(), chunk.getBytes("UTF-8").length));
}
/*
Prints:
Chunk [En la fase de maquetaci] - Chars: 23 - Bytes: 23
Chunk [ón de un documento o un] - Chars: 23 - Bytes: 24
Chunk [a página web o para pro] - Chars: 23 - Bytes: 24
Chunk [bar un tipo de letra es ] - Chars: 24 - Bytes: 24
Chunk [necesario visualizar el ] - Chars: 24 - Bytes: 24
Chunk [aspecto del diseño. ٩(] - Chars: 22 - Bytes: 24
Chunk [-̮̮̃-̃)۶ ٩(●̮̮] - Chars: 14 - Bytes: 24
Chunk [̃•̃)۶ ٩(͡๏̯͡] - Chars: 12 - Bytes: 23
Chunk [๏)۶ ٩(-̮̮̃•̃).] - Chars: 14 - Bytes: 24
*/
Another test with a 3 MB string like the one you mention in your comments:
String string = "0123456789ABCDEF";
StringBuilder bigAssString = new StringBuilder(1024*1024*3);
for (int i = 0; i < ((1024*1024*3)/16); i++) {
bigAssString.append(string);
}
System.out.println("bigAssString.length = " + bigAssString.toString().length());
bigAssString.replace((1024*1024*3)/4, ((1024*1024*3)/4)+1, "á");
for (String chunk : cutString(bigAssString.toString(), 1024*1024, "UTF-8")) {
System.out.println(String.format(
"Chunk [...] - Chars: %d - Bytes: %d",
chunk.length(), chunk.getBytes("UTF-8").length));
}
/*
Prints:
bigAssString.length = 3145728
Chunk [...] - Chars: 1048575 - Bytes: 1048576
Chunk [...] - Chars: 1048576 - Bytes: 1048576
Chunk [...] - Chars: 1048576 - Bytes: 1048576
Chunk [...] - Chars: 1 - Bytes: 1
*/
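The trimming loop above re-encodes a chunk every time it drops a char, and `substring` on char indexes can still cut a surrogate pair in half. A single-pass variant is sketched below as an alternative, not as the answer's method: it walks the string by code point and adds up the UTF-8 length of each one, so no re-encoding is needed. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-pass splitter: counts UTF-8 bytes per code point.
final class Utf8Chunker {
    // Number of bytes UTF-8 uses for one code point.
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    static List<String> split(String s, int maxBytes) {
        if (maxBytes < 4) {
            throw new IllegalArgumentException("maxBytes must be at least 4");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;  // char index where the current chunk begins
        int bytes = 0;  // UTF-8 size of the current chunk so far
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int len = utf8Length(cp);
            if (bytes + len > maxBytes) {
                chunks.add(s.substring(start, i)); // close the chunk before this code point
                start = i;
                bytes = 0;
            }
            bytes += len;
            i += Character.charCount(cp);          // 2 for a surrogate pair, else 1
        }
        if (start < s.length()) {
            chunks.add(s.substring(start));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // "\u00A1Adi\u00F3s!" is the sample string used earlier, written with escapes.
        for (String chunk : split("\u00A1Adi\u00F3s!", 4)) {
            System.out.println("[" + chunk + "]");
        }
    }
}
```

This keeps code points whole but, like the version above, it can still separate a base character from a following combining mark, since those are distinct code points.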
You can use a ByteArrayOutputStream with an OutputStreamWriter:
ByteArrayOutputStream out = new ByteArrayOutputStream();
Writer w = new OutputStreamWriter(out, "utf-8");
// write everything to the writer
w.write(myString);
w.flush(); // push buffered chars through the encoder before reading the bytes
byte[] bytes = out.toByteArray();
// Now you have the actual size of the string in bytes and can parcel it by MB. Be aware that problems may occur if a multi-byte character ends up split across two chunks.

How to find if HL7 Segment has ended or not if Carriage return is not present

I am working on a tool which will construct an HL7 message in the following way:
The message will start with: 0B
Each segment will end with: 0D
And the message will end with: 1C0D
Here is how far I have got: I am able to add 0B at the start and 1C0D at the end of the HL7 message. I am also able to add 0D at the end of each segment. I accomplish this with code that checks whether the character before a segment name is 0D or not.
But the issue is that if text in the message looks something like ...PID|, my code will add 0D before PID|, which is not correct; it should check whether this is really the start of a segment.
Please help if someone has worked on a similar requirement.
The link to my code is:
Arraylist Sublist IndexOutOfBounds Exception
I had some time to look at this problem. As far as I understand, you have some piece of code that generates the HL7v2 segments for you, and you then want to create a message with the following delimiters:
1. Segment delimiter: 0x0D (ASCII 13 - Carriage Return). It's the segment separator, as per the HL7v2 standard;
2. Message start delimiter: 0x0B (ASCII 11 - Vertical Tab);
3. Message finish delimiter: 0x1C0D. My guess is that this value is supposed to be the concatenation of 0x1C (ASCII 28 - File Separator) and 0x0D (ASCII 13 - Carriage Return).
With #1 you get standard-compliant HL7v2 messages. With #2 and #3 you are able to clearly define delimiters for the message so that it can be processed and parsed later by some custom processor.
So I took a shot writing some simple code and here's the result:
public class App
{
public static void main( String[] args ) throws Exception
{
String msg = "MSH|^~\\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.5" +
"PID|0001|00009874|00001122|A00977|SMITH^JOHN^M|MOM|19581119|F|NOTREAL^LINDA^M|C|564 SPRING ST^^NEEDHAM^MA^02494^US" +
"AL1||SEV|001^POLLEN";
String[] segments = msg.split("(?=PID|AL1)");
System.out.println("Initial message:");
for (String s : segments)
System.out.println(s);
byte hexStartMessage = 0x0B;
byte hexFinishMessage1 = 0x1C;
byte hexFinishMessage2 = 0x0D;
byte hexFinishSegment = 0x0D;
String finalMessage = Byte.toString(hexStartMessage) +
intersperse(segments, hexFinishSegment) +
Byte.toString(hexFinishMessage1) +
Byte.toString(hexFinishMessage2);
System.out.println("\nFinal message:\n" + finalMessage);
}
public static String intersperse(String[] segments, byte delimiter) throws UnsupportedEncodingException {
// uncomment this line if you wish to show the delimiter in the output
//System.out.printf("Byte Delimiter: %s", String.format("%04x", (int)delimiter));
StringBuilder sb = new StringBuilder();
String defaultDelimiter = "";
for (String segment : segments) {
sb.append(defaultDelimiter).append(segment);
defaultDelimiter = Byte.toString(delimiter);
}
return sb.toString();
}
}
I picked a simple HL7v2 message and split it into segments, according to the segment names used in the message, with the help of a regex with a lookahead. This means that for your messages you'll need to know which segments are going to be used (you can get that from the standard).
I then interspersed the segment delimiter between the segments and added the message start and end delimiters. In this case, for the message end delimiter, I used the 0x1C and 0x0D values separately, but if you need to use a single value then you only need to change the final appends.
Here's the output:
Initial message:
MSH|^~\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.5
PID|0001|00009874|00001122|A00977|SMITH^JOHN^M|MOM|19581119|F|NOTREAL^LINDA^M|C|564 SPRING ST^^NEEDHAM^MA^02494^US
AL1||SEV|001^POLLEN
Final message:
11MSH|^~\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.5
PID|0001|00009874|00001122|A00977|SMITH^JOHN^M|MOM|19581119|F|NOTREAL^LINDA^M|C|564 SPRING ST^^NEEDHAM^MA^02494^US
AL1||SEV|001^POLLEN2813
As you can see, the final message begins with the value 11 (0x0B) and ends with 28 (0x1C) and 13 (0x0D). The 13 (0x0D) at the end of each segment is not shown because Java's System.out.println() recognizes it as the '\r' character and starts a new line, since I'm running on Mac OS X. If you intersperse the segments with any other character (e.g. 0x25 = '%'), you'll notice that the final message is printed on a single line:
11MSH|^~\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.5%PID|0001|00009874|00001122|A00977|SMITH^JOHN^M|MOM|19581119|F|NOTREAL^LINDA^M|C|564 SPRING ST^^NEEDHAM^MA^02494^US%AL1||SEV|001^POLLEN2813
If I run it on Ubuntu, the message appears on one line with the segment delimiter visible:
11MSH|^~\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.513PID|0001|00009874|00001122|A00977|SMITH^JOHN^M|MOM|19581119|F|NOTREAL^LINDA^M|C|564 SPRING ST^^NEEDHAM^MA^02494^US13AL1||SEV|001^POLLEN2813
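One detail in the code above is worth noting: `Byte.toString(byte)` returns the decimal text of the value, so `Byte.toString((byte) 0x0B)` is the two-character string "11", which is why the outputs start with the digits 11 and end with 2813. If the goal is to put the actual control bytes on the wire (this 0x0B ... 0x1C 0x0D framing is what HL7 calls the Minimal Lower Layer Protocol), the delimiters can be appended as chars. A minimal sketch follows; the class and method names are hypothetical, and it assumes every segment, including the last, is terminated by 0x0D as the question describes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical framer that emits real control characters instead of decimal text.
final class MllpFramer {
    static final char START_BLOCK = 0x0B;     // vertical tab
    static final char END_BLOCK = 0x1C;       // file separator
    static final char CARRIAGE_RETURN = 0x0D; // segment terminator

    static String frame(String[] segments) {
        StringBuilder sb = new StringBuilder();
        sb.append(START_BLOCK);
        for (String segment : segments) {
            sb.append(segment).append(CARRIAGE_RETURN); // every segment ends with 0x0D
        }
        sb.append(END_BLOCK).append(CARRIAGE_RETURN);
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] segments = {
            "MSH|^~\\&|HIS|RIH|EKG|EKG|199904140038||ADT^A01||P|2.5",
            "AL1||SEV|001^POLLEN"
        };
        byte[] wire = frame(segments).getBytes(StandardCharsets.ISO_8859_1);
        // Print the first byte and the last two bytes in hex to check the framing.
        System.out.printf("first=%02X last=%02X %02X%n",
                wire[0], wire[wire.length - 2], wire[wire.length - 1]);
    }
}
```

Checking the result with a hex dump, as here, is more reliable than printing the message, because terminals render control characters differently.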

extract values of ping message

I am working on an Android application that performs ping requests (via the Android shell) and reads the message displayed on the console. A typical message is the following:
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=46 time=186 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=46 time=209 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 186.127/197.891/209.656/11.772 ms
I store the above message in a String. I want to extract the time values, for example 186 and 209, and also the loss percentage, 0 in this case.
I was thinking of going through the string and looking for the values after "time=". However, I don't know how to do it.
How can I manipulate the string I have in order to extract the values?
Start by getting each line of the string:
String[] lines = pingResult.split("\n");
Then, loop and use substring.
for (String line : lines) {
if (!line.contains("time=")) continue;
// Find the index of "time="
int index = line.indexOf("time=");
String time = line.substring(index + "time=".length());
// do what you will
}
If you want to parse to an int, you could additionally do:
int millis = Integer.parseInt(time.replaceAll("[^0-9]", ""));
This will remove all non-digit characters. Note that it also removes a decimal point, so a fractional value such as 18.6 would become 186.
You can do something similar for the percentage:
for (String line : lines) {
if (!line.contains("%")) continue;
// Find the index of "received, "
int index1 = line.indexOf("received, ");
// Find the index of "%"
int index2 = line.indexOf("%");
String percent = line.substring(index1 + "received, ".length(), index2);
// do what you will
}
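The same extraction can also be written with regular expressions, which copes with fractional times such as `time=18.6 ms`. This is a sketch that assumes the output format shown in the question; other ping implementations may word these lines differently. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the ping output format shown in the question.
final class PingOutputParser {
    // "time=186 ms" or "time=18.6 ms"; the summary line's "time 1000ms" has no '=' and is skipped.
    private static final Pattern TIME =
            Pattern.compile("time=([0-9]+(?:\\.[0-9]+)?) ?ms");
    // "0% packet loss"
    private static final Pattern LOSS =
            Pattern.compile("([0-9]+(?:\\.[0-9]+)?)% packet loss");

    static List<Double> roundTripTimes(String pingResult) {
        List<Double> times = new ArrayList<>();
        Matcher m = TIME.matcher(pingResult);
        while (m.find()) {
            times.add(Double.parseDouble(m.group(1)));
        }
        return times;
    }

    // Returns NaN when the output contains no packet-loss line.
    static double packetLossPercent(String pingResult) {
        Matcher m = LOSS.matcher(pingResult);
        return m.find() ? Double.parseDouble(m.group(1)) : Double.NaN;
    }

    public static void main(String[] args) {
        String pingResult =
                "PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.\n"
              + "64 bytes from 8.8.8.8: icmp_seq=1 ttl=46 time=186 ms\n"
              + "64 bytes from 8.8.8.8: icmp_seq=2 ttl=46 time=209 ms\n"
              + "--- 8.8.8.8 ping statistics ---\n"
              + "2 packets transmitted, 2 received, 0% packet loss, time 1000ms\n";
        System.out.println(roundTripTimes(pingResult));
        System.out.println(packetLossPercent(pingResult));
    }
}
```

Because the matcher scans the whole string, there is no need to split it into lines first.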
