how can I write many serializable objects to a single file and then read a few of the objects as and when needed?
You'd have to implement the indexing aspect yourself, but otherwise this could be done. When you serialize an object you write it to an OutputStream, which you can point wherever you want. Storing multiple objects in a file this way is straightforward.
The tough part comes when you want to read "a few" objects back. How are you going to know how to seek to the position in the file that contains the specific object you want? If you're always reading objects back in the same order you wrote them, from the start of the file onwards, this will not be a problem. But if you want to have random access to objects in the "middle" of the stream, you're going to have to come up with some way to determine the byte offset of the specific object you're interested in.
(This method would have nothing to do with synchronization or even Java per se; you've got to design a scheme that will fit with your requirements and environment.)
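For illustration, here is a minimal sketch of one such scheme (the class, file layout, and names are invented for the example): each object is serialized into its own byte array and written as a length-prefixed record, so any record can be read back independently once you know its offset.

import java.io.*;
import java.util.*;

public class IndexedObjectFile {
    // byte offset of each record; a real scheme would persist this index too
    private final List<Long> offsets = new ArrayList<>();

    public void writeAll(File file, List<? extends Serializable> objects) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            for (Serializable obj : objects) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                    oos.writeObject(obj);          // each record is a complete stream
                }
                offsets.add(raf.getFilePointer()); // remember where this record starts
                byte[] bytes = buf.toByteArray();
                raf.writeInt(bytes.length);        // length prefix
                raf.write(bytes);
            }
        }
    }

    public Object read(File file, int index) throws IOException, ClassNotFoundException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offsets.get(index));          // jump straight to the record
            byte[] bytes = new byte[raf.readInt()];
            raf.readFully(bytes);
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return ois.readObject();
            }
        }
    }
}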
The writing part is easy. You just have to remember to write all the objects 'at once'. You can't create a file of serialized objects, close it, and then open it again to append more objects. If you try, you'll get a StreamCorruptedException on reading, because every new ObjectOutputStream writes its own stream header into the file.
For deserializing, I think you have to process the complete file and keep only the objects you're interested in. The others will still be created, but will be garbage-collected at the next opportunity.
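For example, a minimal read loop along those lines (the file name and the keep(...) filter are placeholders for your own code):

static List<Object> readWanted(File file) throws IOException, ClassNotFoundException {
    List<Object> wanted = new ArrayList<>();
    try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
        while (true) {
            Object obj = ois.readObject();  // throws EOFException at end of stream
            if (keep(obj)) {                // keep(...) is a hypothetical filter
                wanted.add(obj);
            }
        }
    } catch (EOFException endOfStream) {
        // normal termination: every object in the file has been read
    }
    return wanted;
}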
Store your objects in an Object[] and serialize the whole array as a single object. That worked for me.
I'd use a flat-file database (e.g. Berkeley DB Java Edition). Just write your nodes as rows in a table like:
Node
----
id
value
parent_id
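For instance, a rough sketch of that table using Berkeley DB Java Edition (this assumes com.sleepycat.je is on the classpath; the directory, database name, and value encoding are invented for the example):

import com.sleepycat.je.*;
import java.io.File;
import java.nio.charset.StandardCharsets;

public class NodeStore {
    public static void main(String[] args) {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(new File("node-db"), envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        Database db = env.openDatabase(null, "node", dbConfig);

        // key = id, value = "value|parent_id" (a deliberately naive encoding)
        DatabaseEntry key = new DatabaseEntry("42".getBytes(StandardCharsets.UTF_8));
        DatabaseEntry data = new DatabaseEntry("leaf|7".getBytes(StandardCharsets.UTF_8));
        db.put(null, key, data);

        // random access by id, without reading the rest of the table
        DatabaseEntry found = new DatabaseEntry();
        if (db.get(null, key, found, LockMode.DEFAULT) == OperationStatus.SUCCESS) {
            System.out.println(new String(found.getData(), StandardCharsets.UTF_8));
        }

        db.close();
        env.close();
    }
}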
To read multiple objects back from a file:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadObjectFromFile {

    public static Object[] readObject() {
        Object[] list = null;
        try {
            byte[] bytes = Files.readAllBytes(Paths.get("src/objectFile.txt"));
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                list = (Object[]) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
        return list;
    }
}
I have a file called "objects.txt" which contains some serializable objects.
I want to write some objects to the file.
Is there a way to check if the objects I want to write to the file already exist in the file before writing? Would it be better to not check even if the objects already exist in the file?
Below is an example of writing an object to a file:
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import javax.swing.JFrame;

public class WriteObjectsDemo {

    public static void main(String[] args)
    {
        try (FileOutputStream f = new FileOutputStream("objects.txt");
             ObjectOutputStream o = new ObjectOutputStream(f))
        {
            // Write objects to file
            JFrame j = new JFrame();
            o.writeObject(j);
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}
I'd say the answer is maybe. It might be pushing the serialization machinery beyond its comfort zone, but if it's going to work at all it'll go something like this:
First, read through the file once using a FileInputStream wrapped in an ObjectInputStream in order to determine whether or not the file already contains your object. Close the stream when you're done.
Then, if you decide you want to write your object, open the file for appending with new FileOutputStream(file, true), wrap that stream in an ObjectOutputStream and write away.
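One caveat: a second ObjectOutputStream on the same file writes a second stream header, which corrupts the stream for a single reader. A widely circulated workaround, sketched below, suppresses the header when appending (file and myObject are placeholders; verify this against your serialization setup before relying on it):

class AppendingObjectOutputStream extends ObjectOutputStream {
    AppendingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
    }
    @Override
    protected void writeStreamHeader() throws IOException {
        reset(); // emit a TC_RESET marker instead of a second stream header
    }
}

// usage: plain stream for a fresh file, appending stream otherwise
boolean hasContent = file.exists() && file.length() > 0;
try (FileOutputStream fos = new FileOutputStream(file, true);
     ObjectOutputStream oos = hasContent
             ? new AppendingObjectOutputStream(fos)
             : new ObjectOutputStream(fos)) {
    oos.writeObject(myObject);
}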
P.S.: I'd suggest reconsidering the .txt extension on your filename. The serialized object data is most definitely not text.
Is there a way to check if the objects I want to write to the file already exist in the file before writing?
Yes.
Read the entire file, deserialize every object in it, and see if the object you're about to write is already there.
Not very efficient, is it?
So here's a better way:
When your process starts, read all the objects in the file into a Set<>.
While you're processing, add objects to that Set<>. Since a Set<> never holds two equal objects, duplicates will simply be dropped.
When you're done processing, rewrite the entire file from your Set<>, serializing every object in it to the file.
Note that to implement this, your objects need to properly override the equals() method and the hashCode() method so equivalent objects compare as equals. See Compare two objects with .equals() and == operator to start - and read the accepted answer - all of it. Then read the links. Then think hard about what equals() means for your objects. Then implement equals() and hashCode() methods in your Java code that work.
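A sketch of that flow, assuming a hypothetical Node class that implements Serializable and overrides equals() and hashCode() properly:

static Set<Node> mergeAndRewrite(File file, Node newNode)
        throws IOException, ClassNotFoundException {
    Set<Node> nodes = new HashSet<>();
    // 1. On startup: load whatever the file already holds.
    if (file.exists() && file.length() > 0) {
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
            nodes.addAll((Set<Node>) ois.readObject());
        }
    }
    // 2. While processing: just add; equal objects are silently dropped.
    nodes.add(newNode);
    // 3. On shutdown: rewrite the whole file; writing the Set as one object
    //    also sidesteps the multiple-stream-header problem.
    try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file))) {
        oos.writeObject(nodes);
    }
    return nodes;
}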
I know similar questions have been asked many times, but I didn't find a question that was actually the same - apologies if I missed one.
I am trying to make a little game. The game stores my characters (spirits) in an ArrayList. The spirits are created with different characteristics by creating different instances of my Spirits class, and I store each instance in the ArrayList that holds my party. The ArrayList holds 3 elements.
I want to serialize all of the fields of these instances, save them to a file, and then load them again and store them back in their proper positions in the ArrayList (essentially save/load in my game).
I have been advised to use Kryo for serialization, so I am using that. I am pretty sure the code is very similar though, and I have to imagine the way this problem is solved is basically the same.
This is my code right now
public void usekryo() {
    Kryo kryo = new Kryo();
    kryo.register(Spirits.class);
    Spirits writespirit;
    Spirits readspirit;
    for (int i = 0; i < 3; i++) {
        try (Output output = new Output(new FileOutputStream("spirits.ser"))) {
            System.out.println("loop is going");
            writespirit = OwnedSpirits.myspirits.get(i);
            kryo.writeClassAndObject(output, writespirit);
        } catch (FileNotFoundException ex) {
            System.out.println("fail");
        }
    }
    Spirits blankarray = new Spirits();
    OwnedSpirits.myspirits.set(0, blankarray);
    OwnedSpirits.myspirits.set(1, blankarray);
    OwnedSpirits.myspirits.set(2, blankarray);
    System.out.println(OwnedSpirits.myspirits.get(2).species);
    for (int i = 0; i < 3; i++) {
        try (Input input = new Input(new FileInputStream("spirits.ser"))) {
            System.out.println("now it's loading");
            readspirit = (Spirits) kryo.readClassAndObject(input);
            OwnedSpirits.myspirits.set(i, readspirit);
        } catch (FileNotFoundException ex) {
            System.out.println("fail");
        }
    }
    System.out.println(OwnedSpirits.myspirits.get(0).species);
    System.out.println(OwnedSpirits.myspirits.get(2).species);
}
This ends up doing exactly what I thought it would do. I mean, I knew this wouldn't solve my problem, but I was just making sure I was using Kryo correctly.
Anyways, it makes an instance of my Spirits class, loads the value from the first index of my ArrayList, then serializes this new instance. So far so good.
But then it does it again and overwrites what I wrote the first time. Then it does it a third time. So the only values that actually get saved are what is in index [2] of my ArrayList.
Then, when I read that file back and write it again, all 3 indexes of my ArrayList are filled with what was initially just index [2].
So how in the world would I write this so that all 3 indexes are output to one file, and then input back into their proper indexes?
I am hoping this isn't super complicated, but I am definitely worried that it is.
You're making it more complicated than it needs to be. You should just be able to serialize the container (OwnedSpirits.myspirits).
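For example, a sketch reusing the names from the question (Spirits, OwnedSpirits.myspirits): the whole list is written as one object graph, so one file holds all three spirits and they come back in order.

public void saveAndLoadParty() {
    Kryo kryo = new Kryo();
    kryo.register(ArrayList.class);
    kryo.register(Spirits.class);

    // save: a single write of the entire container
    try (Output output = new Output(new FileOutputStream("spirits.ser"))) {
        kryo.writeClassAndObject(output, OwnedSpirits.myspirits);
    } catch (FileNotFoundException ex) {
        System.out.println("fail");
    }

    // load: read the list back and restore every spirit to its original index
    try (Input input = new Input(new FileInputStream("spirits.ser"))) {
        @SuppressWarnings("unchecked")
        ArrayList<Spirits> loaded = (ArrayList<Spirits>) kryo.readClassAndObject(input);
        for (int i = 0; i < loaded.size(); i++) {
            OwnedSpirits.myspirits.set(i, loaded.get(i));
        }
    } catch (FileNotFoundException ex) {
        System.out.println("fail");
    }
}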
I am new to Java. I want to learn to use GZIP streams. I have already tried this:
ArrayList<SubImage> myObject = new ArrayList<SubImage>(); // SubImage is a Serializable class
ObjectOutputStream compressedOutput = new ObjectOutputStream(
new BufferedOutputStream(new GZIPOutputStream(new FileOutputStream(
new File("....")))));
compressedOutput.writeObject(myObject);
and
ObjectInputStream compressedInput = new ObjectInputStream(
new BufferedInputStream(new GZIPInputStream(new FileInputStream(
new File("....")))));
myObject=(ArrayList<SubImage>)compressedInput.readObject();
The program writes myObject to the file without throwing any exception, but when it reaches the line
myObject=(ArrayList<SubImage>)compressedInput.readObject();
it throws this exception:
Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
How can I solve this problem?
You have to flush and close your output stream. Otherwise, at the very least, the BufferedOutputStream will not write everything to the file (it writes in big chunks to avoid hurting performance).
If you call compressedOutput.flush() and compressedOutput.close() it will suffice.
You can try writing a simple String object and checking whether the file is written completely.
How? If you write an xxx.txt.gz file you can open it with your preferred zip app and look at the xxx.txt inside. If the app complains, the content was not fully written.
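For instance, the same write restructured with try-with-resources, so close() is always called and propagates down the chain, flushing the buffer and finishing the GZIP trailer (the file name is invented; SubImage is the class from the question):

static void write(ArrayList<SubImage> myObject) throws IOException {
    try (ObjectOutputStream compressedOutput = new ObjectOutputStream(
            new BufferedOutputStream(new GZIPOutputStream(
                new FileOutputStream("subimages.bin.gz"))))) {
        compressedOutput.writeObject(myObject);
    } // closing the outermost stream closes (and flushes) the whole chain
}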
Extended answer to a comment: compressing the data even more
Changing serialization
You could change the standard serialization of the SubImage object if it's a class of your own. Check the java.io.Serializable javadoc to learn how; it's pretty straightforward.
Writing just what you need
Serialization has the drawback that it needs to write "it's a SubImage" just before every instance. That's not necessary if you know what's going to be there beforehand, so you could try to serialize it more manually.
To write your list, instead of writing an object, write directly the values that make up your list. You will need just a DataOutputStream (and since ObjectOutputStream implements DataOutput, you can use it anyway).
dos.writeInt(yourList.size()); // tell how many items
for (SubImage si : yourList) {
    // write every field, in order (this should be a method called writeSubImage :)
    dos.writeInt(si.getWidth());  // getWidth()/getHeight() are hypothetical fields
    dos.writeInt(si.getHeight());
    ...
}

// to read the thing back:
int size = dis.readInt();
for (int i = 0; i < size; i++) {
    // read every field, in the same order (this should be a method called readSubImage :)
    int width = dis.readInt();
    int height = dis.readInt();
    ...
    // create the subimage and add it to the list you are recreating
}
This method is more manual, but if:
you know what's going to be written, and
you won't need this kind of serialization for many types,
then it's pretty affordable and definitely more compact than the Serializable counterpart.
Keep in mind that there are alternative frameworks for serializing objects or creating string messages (XStream for XML, Google Protocol Buffers for binary messages, and so on). These frameworks can write binary directly or produce a string that can then be written out.
If your app needs more of this, or you're just curious, you should take a look at them.
Alternative serialization frameworks
Just looked in SO and found several questions (and answers) addressing this issue:
https://stackoverflow.com/search?q=alternative+serialization+frameworks+java
I've found that XStream is pretty easy and straightforward to use, and JSON is a pretty readable and succinct format (and JavaScript-compatible, which could be a plus :).
I would go for:
Object -> JSON -> OutputStreamWriter(UTF-8) -> GZIPOutputStream -> FileOutputStream
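As a sketch of that chain (toJson(...) stands in for whichever JSON library you pick, it is not a real API; the file name is invented):

static void writeGzippedJson(Object myObject) throws IOException {
    try (Writer out = new OutputStreamWriter(
            new GZIPOutputStream(new FileOutputStream("data.json.gz")),
            StandardCharsets.UTF_8)) {
        out.write(toJson(myObject)); // hypothetical: Object -> JSON string
    }
}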
I am having trouble figuring out an implementation problem. I have a class that behaves like a list, but instead of holding its elements in a collection it saves them to disk.
The problem occurs when I want to add an element to my list. At the start of the file I have one int that tells me how many objects there are in the list, but I can't figure out an elegant way to update this value. I have something like this:
public boolean add(T element)
{
    try
    {
        out.writeObject(element);
        out.flush();
        // and here we need to update the int in my file
    } catch (IOException e)
    {
        e.printStackTrace();
    }
    return true;
}
I tried to use something like this:
ObjectOutputStream upd = new ObjectOutputStream(new FileOutputStream(data.getAbsolutePath(), true));
but as I observed it writes some data at the start of the file (some serialization header or something). How can I update a single entry in my file, or move the ObjectOutputStream "pointer" so it writes at the beginning of the file?
Typically with stream-based classes (especially higher-order streams like ObjectOutputStream), you should rewrite the whole file any time you update it.
If you really insist on only updating part of the file, then you should think of the file as made up of N streams, where each 'stream' represents one object that you are writing. I would use a RandomAccessFile for the base file; then, when I want to write an object, I would wrap an ObjectOutputStream around a ByteArrayOutputStream, write the object into that, take those bytes, and write them into the RandomAccessFile at the position I want.
This probably won't be particularly efficient, as you will write N OOS headers and N class descriptions for the objects you are writing.
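A sketch of that idea (class and field names invented): the first four bytes hold the element count, each add appends a self-contained serialized record, and the count is then patched in place.

import java.io.*;

public class DiskList<T extends Serializable> {
    private final RandomAccessFile raf;
    private int size;

    public DiskList(File file) throws IOException {
        raf = new RandomAccessFile(file, "rw");
        if (raf.length() == 0) {
            raf.writeInt(0);          // fresh file: count = 0
        } else {
            size = raf.readInt();     // existing file: read the stored count
        }
    }

    public boolean add(T element) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
            oos.writeObject(element); // each record carries its own OOS header
        }
        raf.seek(raf.length());       // append the record at the end
        raf.write(buf.toByteArray());
        raf.seek(0);                  // jump back and patch the count in place
        raf.writeInt(++size);
        return true;
    }
}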
I have a very large object which I wish to serialize. During the process of serialization, it comes to occupy some 130MB of heap as a weblogic.utils.io.UnsyncByteArrayOutputStream. I am using a BufferedOutputStream to speed up writing the data to disk, which reduces the amount of time this object is held in memory.
Is it possible to use a buffer to reduce the size of the object in memory though? It would be good if there was a way to serialize it x bytes at a time and write those bytes to disk.
Sample code follows if it is of any use. There's not much to go on though I don't think. If it's the case that there needs to be a complete in-memory copy of the object to be serialised (and therefore no concept of a serialization buffer) then I suppose I am stuck.
ObjectOutputStream tmpSerFileObjectStream = null;
OutputStream tmpSerFileStream = null;
BufferedOutputStream bufferedStream = null;
try {
tmpSerFileStream = new FileOutputStream(tmpSerFile);
bufferedStream = new BufferedOutputStream(tmpSerFileStream);
tmpSerFileObjectStream = new ObjectOutputStream(bufferedStream);
tmpSerFileObjectStream.writeObject(siteGroup);
tmpSerFileObjectStream.flush();
} catch (InvalidClassException invalidClassEx) {
throw new SiteGroupRepositoryException(
"Problem encountered with class being serialised", invalidClassEx);
} catch (NotSerializableException notSerializableEx) {
throw new SiteGroupRepositoryException(
"Object to be serialized does not implement " + Serializable.class,
notSerializableEx);
} catch (IOException ioEx) {
throw new SiteGroupRepositoryException(
"Problem encountered while writing ser file", ioEx);
} catch (Exception ex) {
throw new SiteGroupRepositoryException(
"Unexpected exception encountered while writing ser file", ex);
} finally {
if (tmpSerFileObjectStream != null) {
try {
tmpSerFileObjectStream.close();
if (tmpSerFileStream != null) tmpSerFileStream.close();
if (bufferedStream != null) bufferedStream.close();
} catch (IOException ioEx) {
logger.warn("Exception caught on trying to close ser file stream", ioEx);
}
}
}
This is wrong on so many levels. This is a massive abuse of serialization. Serialization is mostly intended for temporarily storing an object. For example,
session objects between tomcat server restarts.
transferring objects between jvms ( load balancing at website )
Java's serialization makes no effort to handle long-term storage of objects (No versioning support) and may not handle large objects well.
For something so big, I would suggest some investigation first:
Ensure that you are not trying to persist the entire JVM Heap.
Look for member variables that can be marked 'transient' to keep them out of the serialization (perhaps you have references to service objects).
Consider possibility that there is a memory leak and the object is excessively large.
If everything is indeed correct, you will have to research alternatives to java.io serialization. Taking more control via java.io.Externalizable might work. But I would suggest something like a JSON or XML representation.
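To illustrate the 'transient' and Externalizable points (class and field names invented for the example):

import java.io.*;

public class SiteGroup implements Externalizable {
    private String name;
    private transient Object serviceRef; // e.g. a service reference: never persisted

    public SiteGroup() {} // Externalizable requires a public no-arg constructor

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name); // write only the state you actually need back
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        name = in.readUTF();
    }
}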
Update:
Investigate:
Google's Protocol Buffers
Facebook's Thrift
Avro
Cisco's Etch
Take a look at these benchmarks as well.
What is the "siteGroup" object that you're trying to save? I ask, because it's unlikely that any one object is 130MB in size, unless it has a ginormous list/array/map/whatever in it -- and if that's the case, the answer would be to persist that data in a database.
But if there's no monster collection in the object, then the problem is likely that the object tree contains references to a bagillion objects, and the serialization of course does a deep copy (this fact has been used as a shortcut to implement clone() a lot of times), so everything gets cataloged all at once in a top-down fashion.
If that's the problem, then the solution would be to implement your own serialization scheme where each object gets serialized in a bottom-up fashion, possibly in multiple files, and only references are maintained to other objects, instead of the whole thing. This would allow you to write each object out individually, which would have the effect you're looking for: smaller memory footprint due to writing the data out in chunks.
However, implementing your own serialization, like implementing a clone() method, is not all that easy. So it's a cost/benefit thing.
It sounds like whatever runtime you are using has a less-than-ideal implementation of object serialization that you likely don't have any control over.
A similar complaint is mentioned here, although it is quite old.
http://objectmix.com/weblogic/523772-outofmemoryerror-adapter.html
Can you use a newer version of weblogic? Can you reproduce this in a unit test? If so, try running it under a different JVM and see what happens.
I don't know about weblogic (that is - JRockit I suppose) serialization in particular: honestly I see no reason for using ByteArrayOutputStreams...
You may want to implement java.io.Externalizable if you need more control over how your objects are serialized, or switch to an entirely different serialization system (e.g. Terracotta) if you don't want to write the read/write methods yourself (if you have many big classes).
Why does it occupy all those bytes as an unsync byte array output stream?
That's not how default serialization works. You must have some special code in there to make it do that. Solution: don't.