I have a project in which I need to extract some images, stored as BLOBs in an Oracle database, as strings so that I can send them in a JSON payload. I'm using Eclipse Java EE IDE.
Version: Mars Release (4.5.0)
Build id: 20150621-1200
Would this be the proper way to extract the BLOB data as a string?
String query = "SELECT operation, c_book, x_book, x_text1, x_text2, x_text3, x_text4,"
+ "UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(img_logo,32670,1))FROM "
+ dataBaseConnectionData.getDB_SHCHEMA() + "."+ dataBaseConnectionData.getDB_TABLE_COLA()
+ " WHERE status = 'P' OR status = 'N' OR status = 'E'"
+ " ORDER BY c_book";
There are 3 ways to get BLOB data using JDBC:
Blob blob = rs.getBlob("img_logo");
InputStream stream = rs.getBinaryStream("img_logo");
byte[] bytes = rs.getBytes("img_logo");
If the blob is of limited size, which a logo would be, the third version is the easiest to use.
You will then need to convert to a string, which means you need to know which encoding was used to convert the original text to binary in the first place. The most conservative choice would be US-ASCII, so:
String text = new String(bytes, StandardCharsets.US_ASCII);
Of course, this assumes the binary data is text, but it's a logo, and that doesn't sound like a text value, so you might have meant that you want to encode the binary data for embedding in a JSON structure, to be decoded back to binary at the other end.
For that, you'd need to decide on an encoding. The two top choices are HEX and BASE64, where HEX will double the size (two hex digits per byte), and BASE64 will add 33% (4 chars per 3 bytes).
For HEX, see How to convert a byte array to a hex string in Java?
For BASE64, see How do I convert a byte array to Base64 in Java?, or use the new Base64 class (Java 8).
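As a minimal sketch of the two encodings mentioned above, assuming Java 8 or later: the class name BlobToJsonText and its method names are made up for illustration, and the byte array stands in for whatever rs.getBytes("img_logo") returns.

```java
import java.util.Base64;

// Hypothetical helper: turns binary data into JSON-safe text and back.
class BlobToJsonText {

    // Base64: 4 output characters for every 3 input bytes.
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Reverses toBase64 on the receiving side.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Hex: two lowercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes are not sign-extended.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {1, 2, 3, (byte) 0xFF}; // stand-in for the BLOB bytes
        System.out.println(toBase64(sample));
        System.out.println(toHex(sample));
    }
}
```

The receiver has to apply the matching decoder before writing the image anywhere, so both sides need to agree on which of the two encodings is used.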
Is it possible to use ZPL and binary data for an Aztec barcode?
I tried using BluetoothConnection.write to send a joined array made of UTF-8 encoded Strings and byte data:
String zplStart;
byte[] aztecData;
String endZpl;
// Pseudocode: byte arrays cannot actually be joined with + in Java
new BluetoothConnection(MAC).write(zplStart.getBytes + aztecData + endZpl);
I expect an Aztec code to be printed with the byte data.
The problem was that I had switched to UTF-8 encoding (^CI28) because the texts on my label use Central European characters. I don't know why the encoding affects the Aztec code, since bytes of data are always just bytes of data. So I change the encoding back to the default before the Aztec code and switch back to UTF-8 right after it, and everything is OK.
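For the joining step itself, here is a minimal sketch of concatenating text and raw bytes into one array without passing the binary part through a String. The class name ZplPayload and the "^XA"/"^XZ" strings are placeholders for illustration only, and the sketch says nothing about how the printer interprets the bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds one byte array from text, raw bytes and text.
class ZplPayload {

    static byte[] join(String prefix, byte[] binary, String suffix) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Only the text parts are encoded; the binary part is copied untouched.
        byte[] head = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] tail = suffix.getBytes(StandardCharsets.UTF_8);
        out.write(head, 0, head.length);
        out.write(binary, 0, binary.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0x80, 0x00, (byte) 0xFF}; // stand-in for aztecData
        byte[] payload = join("^XA", data, "^XZ");      // placeholder ZPL text
        System.out.println(payload.length);
    }
}
```

The resulting array could then be handed to a single write call that accepts a byte[].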
I am encountering issues with displaying names in reports. My application uses several technologies: PHP, Perl and, for BI, Pentaho.
We are using MySQL as the database, and my table is defined with CHARSET=utf8.
My table has values stored in its rows as below, which is wrong:
Row1 = Ãx—350
Row2 = Ñz–401
PHP and Perl use different built-in functions to convert the above values stored in the DB, and they display them in the UI as below, which is correct:
Expected Row1 = Áx—350
Expected Row2 = Ñz–401
For the reports, which use Pentaho, I use an ETL step to transform the data before showing it. To convert the values stored in the DB, I am trying to convert the data in a Java step as below:
new java.lang.String(new java.lang.String(CODE).getBytes("Windows-1252"), "UTF-8")
But it does not convert the values properly. Of the two wrong values above, only the Row2 value is converted properly; Row1 is converted wrongly, as below:
Converted Row1 = �?x—350
Converted Row2 = Ñz–401
Please suggest how I can convert the values properly so that, for example, the Row1 value is converted to Áx—350.
I wrote a small Java program as below to convert the Ãx—350 string to Áx—350
String input = "Ãx—350";
byte[] b1 = input.getBytes("Windows-1252");
System.out.println("Input Get Bytes = "+b1.toString());
String szUT8 = new String(b1, "UTF-8");
System.out.println("Input Encoded = " + szUT8);
The output from the above code is as below
Input Get Bytes = [B@157ee3e5
Input Encoded = �?x—350-350—É1
As the output shows, the string is wrong; the expected output is Áx—350.
To double-check the encoding/decoding schemes, I tested the string Ãx—350 with an online tool, and the output was Áx—350 as expected.
So can anyone please point out why the Java code is not able to convert properly although I am using the proper encoding/decoding schemes? Am I missing something, or is my approach wrong?
The CHARSET setting in your DB being utf-8 doesn't necessarily mean that the data there is properly encoded in UTF-8 (or even in UTF-8 at all), as we can see. It looks like you are dealing with mojibake: characters that were at one time decoded using the wrong encoding scheme and then encoded again from that wrong result. Fixing that is usually a tedious process of figuring out the past decode/encode errors and then undoing them.
Long story short: if you have mojibake, there aren't any automatic conversions you can do unless you know (or can figure out) what conversions were made in the past.
Converting is a matter of first decoding, then encoding. To convert in Perl:
my $string = "some windows-1252 string";
use Encode;
my $raw = decode('windows-1252',$string);
my $encoded = encode('utf-8',$raw);
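The same decode-then-encode idea can be sketched in Java. The example below assumes the past error was "UTF-8 bytes read as windows-1252", which is only one possible history; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of one specific mojibake and its reversal.
class MojibakeDemo {

    static final Charset CP1252 = Charset.forName("windows-1252");

    // Simulates the past error: UTF-8 bytes were decoded as windows-1252.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Undoes exactly that error: back to the bytes, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00D1z\u2013401"; // "Ñz–401" written with escapes
        String garbled = garble(original);
        System.out.println(repair(garbled).equals(original));
    }
}
```

The repair only works when every byte survived the wrong decoding step. Windows-1252 leaves a few byte values (0x81 among them) unassigned, and the UTF-8 form of Á is C3 81, so this may be why a value containing Á cannot be recovered this way while one containing Ñ can; that is a plausible explanation, not something the data above proves.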
In my application I have an image upload method that needs to send an image and a string to my server.
The problem is that the server receives the content (image and string), but when it saves the image to disk, the file is corrupted and can't be opened.
This is the relevant part of the script.
HttpPost httpPost = new HttpPost(url);
Bitmap bmp = ((BitmapDrawable) imageView.getDrawable()).getBitmap();
ByteArrayOutputStream stream = new ByteArrayOutputStream();
bmp.compress(Bitmap.CompressFormat.PNG, 100, stream);
byte[] byteArray = stream.toByteArray();
String byteStr = new String(byteArray);
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append("--"+boundary+"\r\n");
stringBuilder.append("Content-Disposition: form-data; name=\"content\"\r\n\r\n");
stringBuilder.append(message+"\r\n");
stringBuilder.append("--"+boundary+"\r\n");
stringBuilder.append("Content-Disposition: form-data; name=\"image\"; filename=\"image.jpg\"\r\n");
stringBuilder.append("Content-Type: image/jpeg\r\n\r\n");
stringBuilder.append(byteStr);
stringBuilder.append("\r\n");
stringBuilder.append("--"+boundary+"--\r\n");
StringEntity entity = new StringEntity(stringBuilder.toString());
httpPost.setEntity(entity);
I can't change the server because other clients use it and it works for them. I just need to understand why the image is being corrupted.
When you do new String(byteArray), it decodes the binary data as text in the default character set (which is typically UTF-8). Most character sets aren't a suitable encoding for binary data. In other words, if you decode certain byte sequences as UTF-8 and then encode the resulting string back to bytes, you do not get the same bytes.
Since you're using multipart encoding, you need to write directly to the stream of the entity. Apache HTTP Client has helpers for doing this. See this guide, or this Android guide to uploading with multipart.
If you NEED to use strings only, you can safely convert your byte array to a string with
String byteStr = android.util.Base64.encodeToString(byteArray, android.util.Base64.DEFAULT);
But it's important to note that your server will need to Base64 decode the string back to a byte array and save it to an image. Further, the transfer size will be greater because Base64 encoding isn't as space efficient as raw binary.
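A minimal sketch of the difference, using java.util.Base64 (Java 8+) instead of android.util.Base64 so that it runs outside Android; the class name and the four sample bytes are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo: binary data through a UTF-8 String versus through Base64.
class BinaryRoundTrip {

    // Lossy: bytes that are not valid UTF-8 are replaced while decoding.
    static byte[] viaUtf8String(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless: Base64 text decodes back to exactly the same bytes.
    static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] png = {(byte) 0x89, 'P', 'N', 'G'}; // first bytes of a PNG file
        System.out.println(Arrays.equals(viaUtf8String(png), png)); // false
        System.out.println(Arrays.equals(viaBase64(png), png));     // true
    }
}
```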
Your solution above does not work because you are using new String(byteArray). The constructor decodes the byte array using the default encoding (see What is the default encoding), and it is very likely that your data contains byte sequences that cannot be decoded into a character.
To be more precise, a charset defines how characters are represented as bytes.
Most charsets have more than 256 characters, which is why more than one byte may be needed to represent a character. UTF-8 and UTF-16 use up to four bytes.
So you have a mapping between the number space and the character space, and this mapping is not bijective a priori. So it is very likely that there exists a number in the number space that has no character mapped to it.
The solution @Samuel suggested is foolproof because Base64 uses only A–Z, a–z, 0–9, + and / to represent the bytes, with = as padding at the end. I would prefer this solution!
If you don't want to or cannot use Base64, then you can try appending every byte as it is to the StringBuilder, hoping that the server does not do any re-encoding before it reads the data.
for (byte b : byteArray) {
// Mask to 0..255 so that negative bytes are not sign-extended
stringBuilder.append((char) (b & 0xFF));
}
I do not recommend that solution in general, but it may help you to get your stuff done.
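The byte-per-character approach only survives if the string is later turned back into bytes with a charset that maps each of those 256 characters to one byte, which ISO-8859-1 does. A minimal sketch under that assumption (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: one char per byte, round-tripped through ISO-8859-1.
class Latin1Bytes {

    // Maps each byte to the char with the same value (0..255).
    static String bytesToChars(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            sb.append((char) (b & 0xFF)); // mask avoids sign extension
        }
        return sb.toString();
    }

    // ISO-8859-1 encodes chars 0..255 back to the single byte of that value.
    static byte[] charsToBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0x89, 'P', 'N', 'G', (byte) 0xFF};
        byte[] back = charsToBytes(bytesToChars(data));
        System.out.println(java.util.Arrays.equals(data, back));
    }
}
```

Whether the entity that finally sends the string really encodes it as ISO-8859-1 depends on the HTTP client and its configuration, so that part remains an assumption to verify.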
I have trouble converting an email attachment (a simple text file in windows-1251 encoding with Latin and Cyrillic symbols) to a String, i.e. I have a problem with converting the Cyrillic text.
I get the attachment file as a Base64-encoded String like this:
Base64Encoded email Attachment
Original file
So when I try to decode it, I get "?" instead of the Cyrillic symbols.
How can I get the right Cyrillic (Russian) symbols instead of "?"?
I've already tried this code with all encodings, but nothing helps to get the correct Russian symbols.
BASE64Decoder dec = new BASE64Decoder();
for (String key : Charset.availableCharsets().keySet()) {
System.out.println("K=" + key + " Value:" +
Charset.availableCharsets().get(key));
try {
System.out.println(new String(dec.decodeBuffer(encoded), key));
} catch (Exception e) {
continue;
}
}
Thank you in advance.
I am not very familiar with BPEL and the protocols it uses. If you communicate between nodes using a binary protocol, then you must 1) ensure that client and receiver use the same charset and 2) convert the Java string into the proper bytes in this encoding. Java stores strings internally in UTF-16. So when you execute String correct = new String(commonName.getBytes("ISO-8859-1"), "ISO-8859-5") you will get the correct string in UTF-16. Then you need to export it to bytes in the requested encoding, e.g. byte[] buff = correct.getBytes("UTF-8"), assuming the encoding you use between nodes is UTF-8. If the encoding happens to be different, then you must make sure it actually supports Cyrillic characters (ISO-8859-1, for example, does not).
If you use XML for data exchange, make sure it declares a suitable encoding, as in <?xml version="1.0" encoding="UTF-8"?>. Then you don't need to play with bytes; you just need to correctly "import" the string (see the correct variable). Writing to XML converts characters automatically, but the encoding must support the characters you want to write. So if you set encoding="ISO-8859-1", you will get those question marks again.
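For the attachment case, a minimal sketch of the two steps (Base64 text to bytes, then bytes to String with the file's charset) might look like this. It assumes Java 8+ and that the file really is windows-1251; the class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: decodes a Base64 attachment holding windows-1251 text.
class AttachmentDecoder {

    static final Charset CP1251 = Charset.forName("windows-1251");

    // Step 1: Base64 text to raw bytes. Step 2: bytes to String via the file's charset.
    static String decodeAttachment(String base64) {
        // The MIME decoder tolerates the line breaks found in email bodies.
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        return new String(raw, CP1251);
    }

    // Exports the text in another encoding, for example UTF-8.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "\u041F\u0440\u0438\u0432\u0435\u0442"; // Cyrillic sample text
        String base64 = Base64.getEncoder().encodeToString(word.getBytes(CP1251));
        System.out.println(decodeAttachment(base64).equals(word));
    }
}
```

If the decoded String is correct but still prints as "?", the console or log encoding is the next thing to check, since the String itself may already be fine.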
I have a problem when reading special characters from an Oracle database (using the JDBC driver and GlassFish TopLink).
I store the name "GRØNLÅEN KJÆTIL" in the database through a web service, and the data is stored correctly in the database.
But when I read this String, print it to the log file and convert it to a byte array with this code:
int pos = 0;
byte[] msg=new byte[1024];
String F = "F" + passenger.getName();
logger.debug("Add " + F + " " + F.length());
msg = addStringToArrayBytePlusSeparator(msg, F,pos);
..............
private byte[] addStringToArrayBytePlusSeparator(byte[] arrDest,String strToAdd,int destPosition)
{
System.arraycopy(strToAdd.getBytes(Charset.forName("ISO-8859-1")), 0, arrDest, destPosition, strToAdd.getBytes().length);
arrDest = addSeparator(arrDest,destPosition+strToAdd.getBytes().length,1);
return arrDest;
}
1) In the log file there is: "Add FGRÃNLÃ " (the name isn't correct and F.length() is not printed).
2) The code throws:
java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at it.edea.ebooking.business.chi.control.VingCardImpl.addStringToArrayBytePlusSeparator(Test.java:225).
Any solution?
Thanks
You're calling strToAdd.getBytes() without specifying the character encoding, within the System.arraycopy call - that will be using the system default encoding, which may well not be ISO-8859-1. You should be consistent in which encoding you use. Frankly I'd also suggest that you use UTF-8 rather than ISO-8859-1 if you have the choice, but that's a different matter.
Why are you dealing with byte arrays anyway at this point? Why not just use strings?
Also note that your addStringToArrayBytePlusSeparator method doesn't give any indication of how many bytes it's copied, which means the caller won't have any idea what to do with it afterwards. If you must use byte arrays like this, I'd suggest making addStringToArrayBytePlusSeparator return either the new "end of logical array" or the number of bytes copied. For example:
private static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
/**
* (Insert fuller description here.)
* Returns the number of bytes written to the array
*/
private static int addStringToArrayBytePlusSeparator(byte[] arrDest,
String strToAdd,
int destPosition)
{
byte[] encodedText = strToAdd.getBytes(ISO_8859_1);
// TODO: Verify that there's enough space in the array
System.arraycopy(encodedText, 0, arrDest, destPosition, encodedText.length);
return encodedText.length;
}
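A self-contained variant of the method above with the space check filled in could look like the sketch below. The class name ByteArrayAppender, the shortened method name and the exception type are choices made for this example, and the separator handling from the question is left out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical variant of the method above, with the space check added.
class ByteArrayAppender {

    // Copies the ISO-8859-1 bytes of text into dest at destPos.
    // Returns the number of bytes written.
    static int addString(byte[] dest, String text, int destPos) {
        byte[] encoded = text.getBytes(StandardCharsets.ISO_8859_1);
        if (destPos < 0 || encoded.length > dest.length - destPos) {
            throw new IllegalArgumentException("Not enough space in destination array");
        }
        // The length comes from the same array that is copied.
        System.arraycopy(encoded, 0, dest, destPos, encoded.length);
        return encoded.length;
    }

    public static void main(String[] args) {
        String name = "FGR\u00D8NL\u00C5EN KJ\u00C6TIL"; // name from the question, escaped
        byte[] msg = new byte[1024];
        System.out.println(addString(msg, name, 0));                         // 16
        System.out.println(name.getBytes(StandardCharsets.UTF_8).length);    // 19
    }
}
```

The two printed lengths show how the original code could run past the end of the source array: if the default encoding is UTF-8, getBytes() without a charset reports 19 bytes, while the ISO-8859-1 array being copied holds only 16.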
Encoding/decoding problems are hard. In every processing step you have to do the correct encoding/decoding. So:
Familiarize yourself with the difference between bytes (InputStreams) and characters (Readers, Strings).
Choose the character encoding in which you want to store your data in the database, and the one in which you want to expose your web service. Make sure that when you load initial data into the database, it's in the right encoding.
Connect with the right database properties. MySQL requires an addition to the connection URL: ?useUnicode=true&characterEncoding=UTF-8 when using UTF-8; I don't know about Oracle.
If you print/debug at a certain step and it looks OK, you can't be sure you did it right. The logger can write with the wrong encoding (sometimes making something look OK while in fact it's broken). Your terminal might not handle unusual byte encodings correctly. The same holds for command-line database clients. Your data might be stored wrongly, but your wrongly configured terminal interprets/shows the data as correct.
In XML, it's not only the stream encoding that matters, but also the encoding attribute in the XML declaration.
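The first point, bytes versus characters, can be sketched as follows: the same bytes give different text depending on the charset handed to the Reader. The class and method names are hypothetical, and the sample word is only there to include one non-ASCII letter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: an InputStream carries bytes; a Reader turns them into characters.
class BytesVsChars {

    // Decodes the whole stream with an explicitly chosen charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = "Gr\u00F8nn".getBytes(StandardCharsets.UTF_8);
        // Right charset: the original text comes back.
        System.out.println(readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8));
        // Wrong charset: the two UTF-8 bytes of the letter become two characters.
        System.out.println(readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1));
    }
}
```

What the second line looks like on screen also depends on the terminal's own encoding, which is the caveat about printing and debugging made above.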