Store image in HBase - loss of metadata and Exif - java

I am uploading an image to HBase using a Java program. After retrieving the image, I found that the file size has increased and that most of the Exif and metadata (GPS location data, camera details, etc.) are lost.
Code:
public ArrayList<Object> uploadImagesToHbase(MultipartFile uploadedFileRef) {
    byte[] bytes = uploadedFileRef.getBytes();
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    ImageIO.write(image, "jpg", outputStream);
    HBaseAdmin admin = new HBaseAdmin(configuration);
    HTable table = new HTable(configuration, "sample");
    Put image = new Put(Bytes.toBytes("1"));
    image.add(Bytes.toBytes("DataColumn"), Bytes.toBytes(DataQualifier), bytes);
    table.put(image);
}
How can I store and retrieve an image without any change or loss?

Please try using SerializationUtils from Apache Commons Lang. It provides the following methods:
static Object clone(Serializable object) //Deep clone an Object using serialization.
static Object deserialize(byte[] objectData) //Deserializes a single Object from an array of bytes.
static Object deserialize(InputStream inputStream) //Deserializes an Object from the specified stream.
static byte[] serialize(Serializable obj) //Serializes an Object to a byte array for storage/serialization.
static void serialize(Serializable obj, OutputStream outputStream) //Serializes an Object to the specified stream.
When storing into HBase, you can store the byte[] returned by serialize.
When retrieving, you can cast the deserialized Object back to its original type (for example, a File object) and get it back.
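For illustration, a minimal sketch of that round trip (assuming Commons Lang 3; note that a plain byte[] is itself Serializable in Java):
import org.apache.commons.lang3.SerializationUtils;

// Round-trips any Serializable value byte-for-byte.
public static byte[] roundTrip(byte[] imageBytes) {
    byte[] stored = SerializationUtils.serialize(imageBytes); // store this in HBase
    return SerializationUtils.deserialize(stored);            // identical to imageBytes
}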

Most likely you are just over-complicating things. :-)
The reason why you are losing the Exif and other metadata is that the ImageIO convenience methods ImageIO.read(...) and ImageIO.write(...) do not preserve metadata. The good news is, they are not needed.
As you already have the image data from the MultipartFile, you should simply store that data (the byte array) in the database, and you will store exactly what the user uploaded. No difference in file size, and the metadata will be untouched.
Your code above doesn't compile for me, and I'm no HBase expert, so I'll leave that part out (as you have already been able to store an image and see the size difference and metadata loss, I assume you know how to do that :-) ). But here are the basics:
public ArrayList<Object> uploadImagesToHbase(MultipartFile uploadedFileRef) throws IOException {
    byte[] bytes = uploadedFileRef.getBytes();
    // Store the above "bytes" byte array in HBase *as is* (no ImageIO)
}
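For completeness, a hedged sketch of what that raw-bytes write might look like with the current HBase client API (the table, family, and qualifier names simply mirror the question's placeholders):
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public void storeRawBytes(byte[] bytes) throws IOException {
    Configuration configuration = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(configuration);
         Table table = connection.getTable(TableName.valueOf("sample"))) {
        Put put = new Put(Bytes.toBytes("1"));
        // The raw upload bytes go in untouched, so Exif and other metadata survive
        put.addColumn(Bytes.toBytes("DataColumn"), Bytes.toBytes("DataQualifier"), bytes);
        table.put(put);
    }
}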

Related

Reading a binary file from the file system as a BLOB to use in rhino with javascript

I'm planning to use SheetJS with Rhino. SheetJS takes a binary object (a BLOB, if I'm correct) as its input. So I need to read a file from the file system using standard Java I/O methods and store it in a blob before passing it to SheetJS, e.g.:
var XLDataWorkBook = XLSX.read(blobInput, {type : "binary"});
So how can I create a BLOB (or an appropriate type) from a binary file in Java in order to pass it in?
I guess I can't pass streams, because XLSX presumably needs a completely created object to process.
I found the answer to this myself; I was able to get it done this way.
Read the file with an InputStream and then write it to a ByteArrayOutputStream, like below.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] bytes = new byte[4096];
int len;
while ((len = inputStream.read(bytes)) != -1) {
    buffer.write(bytes, 0, len);
}
Then create a byte array from it.
byte[] byteArray = buffer.toByteArray();
Finally, I converted it to a Base64 String (which is also applicable in my case) using the Base64.encodeBase64String() method from the org.apache.commons.codec.binary package, so I can pass the Base64 String as a method parameter.
If you need more, there are lots of libraries (third-party and built-in) available for Base64-to-Blob conversion as well.
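For reference, a minimal sketch of that final encoding step with Commons Codec:
import org.apache.commons.codec.binary.Base64;

// Encode the buffered file contents so they can be passed around as a String
String base64 = Base64.encodeBase64String(byteArray);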

Read/write blob data in chunks with Hibernate

Is there a way to read and write a blob in chunks using Hibernate?
Right now I am getting an OutOfMemoryError because the whole blob is loaded into memory as a single byte[].
To be more specific, let's say I want to save a large file into a database table called File.
public class File {
    private byte[] data;
}
I open the file in a FileInputStream and then what?
How do I tell Hibernate that I need to stream the content and will not give the whole byte[] array at once?
Should I use Blob instead of byte[]? Either way, how can I stream the content?
Regarding reading, is there a way to tell Hibernate that (besides the lazy loading it does) I need the blob loaded in chunks, so that retrieving my File does not give me an OutOfMemoryError?
I am using:
Oracle 11.2.0.3.0
Hibernate 4.2.3 Final
Oracle Driver 11.2
If going the Blob route, have you tried Hibernate's LobHelper createBlob method, which takes an InputStream? To create a Blob and persist it to the database, you would supply the FileInputStream object and the number of bytes.
Your File bean/entity class could map the Blob like this (using JPA annotations):
@Lob
@Column(name = "DATA")
private Blob data;
// Getter and setter
And the business logic/data access class could create the Blob for your bean/entity object like this, taking care not to close the input stream before persisting to the database:
FileInputStream fis = new FileInputStream(file);
Blob data = getSession().getLobHelper().createBlob(fis, file.length());
fileEntity.setData(data);
getSession().save(fileEntity); // persist the entity; keep fis open until this completes
To go the other way and read the Blob from the database as a stream in chunks, you could call the Blob's getBinaryStream method, giving you the InputStream and allowing you to set the buffer size later if needed:
InputStream is = fileEntity.getData().getBinaryStream();
Struts 2 has a convenient configuration available that can set the InputStream result's buffer size.
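For example, a minimal sketch of draining that stream in fixed-size chunks (the buffer size here is arbitrary):
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Copy the blob to any destination without materializing it all in memory
public static void copyInChunks(InputStream is, OutputStream os) throws IOException {
    byte[] chunk = new byte[8192]; // 8 KB at a time; tune as needed
    int len;
    while ((len = is.read(chunk)) != -1) {
        os.write(chunk, 0, len);
    }
}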

send int array from ec2 servlet to s3 without creating local file

My idea is to upload an int array from a Java servlet running on an AWS EC2 micro instance. As I understand it, I would have to convert my int array to a Java object file first and then upload the file to my bucket, but is there a way to do this "on the fly" without first creating a local file?
If I had to create a local file first, which pathname would it have?
It will look like this:
public void arrayToS3(String bucket, String pathInS3, JSONArray array) {
    ObjectMetadata metadata = new ObjectMetadata();
    byte[] dataInMemory = array.toString().getBytes();
    s3client.putObject(bucket, pathInS3, new ByteArrayInputStream(dataInMemory), metadata);
}
Just convert anything into an InputStream. For example, the method arrayToS3 converts the JSONArray to a String, and the String to a byte[]. Finally, it wraps the byte[] in an InputStream.
Everything stays in memory. If your data is not very large, this is a simple way to do it. If your data is bigger than the memory given to the JVM, an out-of-memory error will hit you.
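If you literally have an int[] rather than JSON, a hedged sketch of the same in-memory approach (method and parameter names are illustrative, assuming the AWS SDK for Java v1):
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;

public void intArrayToS3(AmazonS3 s3client, String bucket, String key, int[] values) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
        oos.writeObject(values); // arrays are Serializable, so an int[] can be written directly
    }
    byte[] data = bos.toByteArray();
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(data.length); // lets the SDK skip buffering the stream
    s3client.putObject(bucket, key, new ByteArrayInputStream(data), metadata);
}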

Write/Read a series of different serialized objects to/from a file

I have a collection of objects:
Map<BufferedImage, Map<ImageTransform, Set<Point>>> map
I want to write those to a file, and then be able to read them back into the same structure.
I can't just write the collection as it is, because BufferedImage doesn't implement the Serializable (nor the Externalizable) interface, so I need to use the methods of the ImageIO class to write the image.
ImageTransform is a custom object that implements Serializable, so I believe the value part of my map collection should be writable as it is.
Here is what I do to write to the file:
ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
for (BufferedImage image : map.keySet()) {
    ImageIO.write(image, "PNG", out); // write the image to the stream
    out.writeObject(map.get(image)); // write the 'value' part of the map
}
Here is what I do to read back from the file:
ObjectInputStream in = new ObjectInputStream(new FileInputStream(file));
while (true) {
    try {
        BufferedImage image = ImageIO.read(in);
        Map<ImageTransform, Set<Point>> value =
                (Map<ImageTransform, Set<Point>>) in.readObject(); // marker
        map.put(image, value);
    } catch (IOException ioe) {
        break;
    }
}
However, this doesn't work. I get a java.io.OptionalDataException at marker.
java.io.OptionalDataException
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1300)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
My question is, firstly, is the writing concept correct? Is ImageIO#write good for this case, or should I think about storing the BufferedImage#getRGB int[] array instead? Is the array more compact (as in, does it take up less space in the file)?
Secondly, how should I be reading the objects back from the file? How do I know when EOF is reached? Why doesn't the above work?
I hope the info provided is enough, if you need more info on something, please tell me.
Thanks in advance.
It's not working because ObjectOutputStream and ObjectInputStream write/expect a certain stream format, which is violated when you write an image into the stream out of band. To use Object streams successfully, you need to observe the contract they specify.
To do this you will need to create a holding class, and use this class as the key of your map instead of BufferedImage. This holding class should implement Serializable and three methods (not in any actual interface) that mark the class as needing special handling during reading and writing. The method signatures must be exactly as specified or serialization won't work.
For more information have a look at the documentation on ObjectOutputStream.
public class ImageHolder implements Serializable {
    BufferedImage image;

    public ImageHolder(BufferedImage image) {
        this.image = image;
    }

    private void readObject(ObjectInputStream stream)
            throws IOException, ClassNotFoundException {
        image = ImageIO.read(stream);
    }

    private void writeObject(ObjectOutputStream stream)
            throws IOException {
        ImageIO.write(image, "PNG", stream);
    }

    private void readObjectNoData() throws ObjectStreamException {
        // leave image as null
    }
}
And then serialisation should be as simple as outputStream.writeObject(map). Though you will need to check that the implementing class of ImageTransform is serialisable too.
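For illustration, a hedged sketch of the round trip with the holder as key (variable names follow the question's code):
// Writing: wrap each BufferedImage key in an ImageHolder, then write one object
Map<ImageHolder, Map<ImageTransform, Set<Point>>> holderMap = new HashMap<>();
for (Map.Entry<BufferedImage, Map<ImageTransform, Set<Point>>> e : map.entrySet()) {
    holderMap.put(new ImageHolder(e.getKey()), e.getValue());
}
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
    out.writeObject(holderMap);
}

// Reading: a single readObject call restores the whole structure
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
    @SuppressWarnings("unchecked")
    Map<ImageHolder, Map<ImageTransform, Set<Point>>> restored =
            (Map<ImageHolder, Map<ImageTransform, Set<Point>>>) in.readObject();
}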
One way to 'cheat' and only have a single object to serialize is to add the group of objects to an expandable, serializable list. Then serialize the list.
BTW - I would tend to use XMLEncoder over serialized Objects because they can be restored in later JVMs. There is no such guarantee for serialized Objects.
@Ivan c00kiemon5ter V Kanak: "I'm trying to keep the file as small in size as possible, .."
That is often wasted effort, given disk space is so cheap.
".. so I guess Serialization is better for that."
Don't guess. Measure.
".. I'll try using a List and see how that goes."
Cool. Note that if using the XMLEncoder, I'd recommend Zipping it in most cases. That would reduce the file-size cost of the XML cruft. The situation is different when storing images, though.
Image formats typically incorporate compression of a type that is not conducive to further compression by Zip. That can be side-stepped by storing the XML compressed and the images 'raw', as separate entries in the Zip. OTOH, I think you'll find the number of bytes saved by compressing the XML alone is not worth the effort, given the final file size of the image entries.

How do I retrieve images within Postgres into Matlab using Java?

I have been given a bit of a strange task: there are around 1500-2000 JPEG images, each around 1-50 KB in size, currently stored in a simple database I made with Postgres. It's been a long time since I used Matlab and Postgres heavily, so any help or suggestions are really appreciated!
I need to get the images that are stored in the database out of the database and into Java. The last step is to retrieve the image from Java into Matlab, so that the image is stored the same way the imread function would store it. The imread function reads an image in and creates an n-by-m-by-3 matrix of uint8 values which denote the pixel intensities of RGB.
At the moment I can get the image in and out of the database in Java, currently storing the image in a bytea column. Is that the best data type to use?
How can I get the data back out of the database, so that it is either the original JPEG image that I put in, or in the requested matrix format?
Currently I do not understand the retrieved data. It is a byte array of around 70,000 elements containing values between -128 and 127. Help!?!
Note: The database toolkit is unavailable to me.
ANOTHER UPDATE: I have solved the problem related to the post regarding the 'UTF-8' encoding error.
If anyone stumbles upon this page, any answer posted will be tried as soon as I can! I really do appreciate your thoughts and answers. Thanks again.
When you say you have the image in a bytea column, how exactly is it stored? Is it storing the bytes of the JPEG file's contents, or an array of RGB pixel values, or something else? "Bytea" is just a binary string, which can store data in pretty much any format.
I'm assuming it's the JPEG contents. In that case, what you can do is retrieve the JPEG contents via Java, save them to a temporary file, and call imread() on the temp file.
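As a hedged sketch of that retrieval (the table and column names here are hypothetical):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Pull one image's bytea contents and dump them to a temp .jpg for imread()
public static Path imageToTempFile(Connection conn, int imageId) throws SQLException, IOException {
    try (PreparedStatement ps = conn.prepareStatement("SELECT data FROM images WHERE id = ?")) {
        ps.setInt(1, imageId);
        try (ResultSet rs = ps.executeQuery()) {
            rs.next();
            byte[] jpegBytes = rs.getBytes(1); // raw JPEG file contents from the bytea column
            Path tmp = Files.createTempFile("pgimage", ".jpg");
            Files.write(tmp, jpegBytes);
            return tmp;
        }
    }
}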
Those [-128,127] values are signed bytes in Java. Even without the Database Toolbox, you can call regular JDBC or other Java code that uses it. Call the Java method you used to get those values from Matlab (with your JAR on the classpath), and it should return the array as an int8 array, or something you can convert to one.
Given the data in a Matlab variable named "bytes", you can write it out to a temp file with something like this:
file = [tempname() '.jpg'];
fid = fopen(file, 'wb');
fwrite(fid, bytes, 'int8');
fclose(fid);
By specifying 'int8' precision, I think you can skip the step of converting them to unsigned bytes, which is a more common convention. Writing int8s as 'int8' or uint8s as 'uint8' will produce the same file. If you do need to convert them to unsigned, use Matlab's typecast() function.
unsigned_bytes = typecast(bytes, 'uint8');
At that point you can call imread on the temp file and then delete it.
img = imread(file);
delete(file);
Problem solved :-)
I managed to get the bytes stored in the database's bytea column into a byte array. Then, via a temporary file (using a ByteArrayInputStream and an ImageReader to form a BufferedImage object, which I write to a file), I send the data back into Matlab as an array.
Matlab then processes the data retrieved and read from the temporary file. Once the data is within Matlab, all the temp files are deleted.
The code that processes the ResultSet and creates a temp image from the byte array received from the database's bytea column is shown below:
private static void processImageResultSet(ResultSet rs) throws SQLException, FileNotFoundException, IOException {
    int i = 0; // used as a count and to name the various temp files
    while (rs.next()) { // loop through the result set
        byte[] b = rs.getBytes(1); // the bytea column result
        String location = getFileName(rs.getString(2)); // the name of the jpg file
        ByteArrayInputStream bis = new ByteArrayInputStream(b); // stream holding the bytes

        // To make the names of the temporary files unique, the current date and time are included
        SimpleDateFormat df = new SimpleDateFormat("'Date 'yyyy-MM-dd HH'H'-mm'M'-ss'secs'-SS'ms'"); // formats the date string
        Calendar cal = Calendar.getInstance(); // gets an instance of calendar time
        String fileDate = df.format(cal.getTime()); // gets the time and date as a String

        Iterator<?> readers = ImageIO.getImageReadersByFormatName("jpg"); // reader for the jpg codec/compression format
        ImageReader reader = (ImageReader) readers.next();
        ImageInputStream iis = ImageIO.createImageInputStream(bis); // creates an image input stream from the byte stream
        reader.setInput(iis, true); // sets the reader to read the image input stream
        ImageReadParam param = reader.getDefaultReadParam();
        Image image = reader.read(0, param);

        BufferedImage bufferedImage = new BufferedImage(image.getWidth(null), image.getHeight(null), BufferedImage.TYPE_INT_RGB); // creates the buffered image
        Graphics2D g2 = bufferedImage.createGraphics();
        g2.drawImage(image, null, null);
        File imageFile = new File(location + " " + fileDate + " " + i + ".jpg"); // creates the image file
        ImageIO.write(bufferedImage, "jpg", imageFile); // writes the buffered image to the created file
        i++; // counts the number of results from the query within the ResultSet
    }
}
Do you have access to the Database Toolbox in MATLAB? If so, you should be able to directly connect to a PostgreSQL database using the DATABASE function and then import and export data using the FETCH function or the QUERYBUILDER GUI. This may be easier than first going through Java.
