Java: Serializing String[] Array to store in a MySQL Database?

Yes, I know it's bad practice and I should instead normalize my tables. That put aside, is it possible to serialize a String [] array and store it in the database?
I am from the lenient and forgiving world of PHP, where invoking the serialize() function would convert the array into a string.
Is there an equivalent of doing such heresy in Java?
Apart from normalization, are there more elegant ways of storing String Arrays in the database?
In case it's applicable, I am using the jdbc driver for my MySQL connections.

Yes. You can serialize any Java objects and store the serialized data into MySQL.
If you use the regular serialization (ObjectOutputStream), the output is always binary. Even String is serialized into binary data. So you have to Base64 encode the stream or use a binary column like BLOB.
This is different from PHP, whose serialize() converts everything into text.
You can also use the XML serialization in Java (XMLEncoder) but it's very verbose.
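As a rough sketch of the first option (ObjectOutputStream plus Base64), something like the following should work. The class name StringArrayCodec and its method names are made up for illustration; only the java.io and java.util.Base64 calls are standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper; the class and method names are illustrative only.
final class StringArrayCodec {

    // Serialize the array with standard Java serialization, then Base64-encode
    // the binary result so it can be stored in a text column.
    static String toBase64(String[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse the process: Base64-decode, then deserialize back to String[].
    static String[] fromBase64(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (String[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] original = {"red", "green", "blue"};
        String stored = toBase64(original);          // text, safe for a TEXT/VARCHAR column
        String[] restored = fromBase64(stored);
        System.out.println(Arrays.toString(restored)); // prints [red, green, blue]
    }
}
```

With a BLOB column you could skip the Base64 step and pass the raw bytes to PreparedStatement.setBytes instead.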

If you're thinking in terms of raw arrays, you're still writing PHP in Java.
Java's an object-oriented language. An array of Strings really isn't much of an abstraction.
You'll get perfectly good advice here telling you that it's possible to serialize that array of Strings into a BLOB that you can readily store in MySQL, and you can tell yourself that leniency is a virtue.
But I'm going to remind you that you're losing something by not thinking in terms of objects. They're really about abstraction and encapsulation and dealing with things at a higher level than bare metal ints, Strings, and arrays.
It'd be a good exercise to try and design an object that might encapsulate an array or another more sophisticated data structure of child objects that were more than Strings. There'd be a 1:m relationship between parent and child that would better reflect the problem you were really trying to solve. That would be a far more object-oriented design than the one you're proposing here.

There are various good serialization/deserialization libraries that automatically convert JavaBean objects to/from XML and JSON strings. One I've had good experience with is XStream.
Java's built-in support for serialization can do the same thing, and you can write custom serialization/deserialization methods for Java to call.
You can roll your own serialization methods too, e.g. converting to and from a comma-separated value (CSV) format.
I'd opt for a library like XStream first, assuming there's a very compelling reason not to normalize the data.

You don't want to serialize the array. I'm not sure why you'd serialize it in PHP either, because implode() and explode() would be more appropriate. You really should normalize your data, but aside from that, you could very easily Google a solution for converting an array to a string.

But surely the more logical thing to do would be to save each string as its own record with a suitable identifier. That would probably be less coding than serializing -- a simple loop through the elements of the array -- and would result in a clean database design, rather than some gooey mess.

If you really don't want to normalize these values into a separate table where each string would be in its own row, then just convert your array to a list of comma-separated values (escaping any embedded commas somehow), perhaps quoting each string so that the result looks like "str1","str2".
See the CSV RFC (RFC 4180) for the spec on how this should be properly escaped.
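For what it's worth, here is a minimal sketch of the quoting approach, loosely following the RFC 4180 convention (every field wrapped in double quotes, embedded quotes doubled). CsvLine, join and split are hypothetical names, and the sketch assumes no null elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; names are illustrative. Null elements are not supported.
final class CsvLine {

    // Quote every field and double any embedded quote character.
    static String join(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(fields[i].replace("\"", "\"\"")).append('"');
        }
        return sb.toString();
    }

    // Parse a line produced by join() back into its fields.
    static String[] split(String line) {
        if (line.isEmpty()) {
            return new String[0];           // join() of an empty array
        }
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"');    // doubled quote means a literal quote
                        i++;
                    } else {
                        inQuotes = false;   // closing quote
                    }
                } else {
                    cur.append(c);          // commas and newlines are literal here
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = join(new String[] {"str1", "str2"});
        System.out.println(line);                         // prints "str1","str2"
        System.out.println(Arrays.toString(split(line))); // prints [str1, str2]
    }
}
```

The resulting string fits in an ordinary VARCHAR or TEXT column, at the cost of not being queryable per element.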

Related

Can InfluxDB store serialized objects?

Currently evaluating InfluxDB and want to find out if serialized objects (e.g. using Java) can be stored / retrieved from InfluxDB and what is the process for it?

According to wikipedia, this database supports the following types of values:
Values can be 64-bit integers, 64-bit floating points, strings, and booleans.
You can serialize Java objects into byte streams; and byte streams can be represented as hex strings.
So, theoretically the answer is yes - it should be possible to store serialized Java objects in this database. To read back, you just reverse that process.
Whether that is a good idea is a completely different question. It sounds rather inefficient, and storing serialized objects is by itself not a great idea. First of all, it is a big detour: turn an object into a byte stream, then into a hex string (and reverse that). Then, Java object serialization is a beast of its own: you have to be careful, for example, not to introduce version incompatibilities. It is really annoying when you release a new version of your Java code and that code throws an exception when you try to deserialize previously stored objects.
Therefore more modern approaches prefer to serialize into different formats (JSON for example), or use tools to translate fields directly to different table columns.
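The detour described in this answer (object to byte stream to hex string, and back) could be sketched roughly as follows. HexSerialization is a hypothetical name, and nothing here is specific to InfluxDB; the resulting string would simply be written as a string value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper; names are illustrative, not part of any database API.
final class HexSerialization {

    // Object -> serialized bytes -> lowercase hex string.
    static String toHex(Serializable obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : buf.toByteArray()) {
            hex.append(Character.forDigit((b >> 4) & 0xF, 16));
            hex.append(Character.forDigit(b & 0xF, 16));
        }
        return hex.toString();
    }

    // Hex string -> bytes -> deserialized object. Only use on data you trust.
    static Object fromHex(String hex) {
        byte[] raw = new byte[hex.length() / 2];
        for (int i = 0; i < raw.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            raw[i] = (byte) ((hi << 4) | lo);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hex = toHex("hello");
        // Every Java serialization stream starts with the magic number and version.
        System.out.println(hex.startsWith("aced0005")); // prints true
        System.out.println(fromHex(hex));               // prints hello
    }
}
```

Note that the hex form doubles the size of the already bulky serialized bytes, which is part of why the answer calls it inefficient.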

Parsing data from untrusted Java serialized object

I need to parse untrusted Java serialized objects. The data is given to me as a byte array (written at some point by ObjectOutputStream).
I do not want to simply call ObjectInputStream.readObject() and/or load the actual object. I am looking for a way to safely parse the bytes and grab field names & values.
--
Here's a little summary of my attempt so far, after taking a look at the ObjectInputStream procedure for deserializing objects.
I have tried to extract field types/names (as unicode strings) recursively based on expected stream constants. I end up with a list of field names whose values should appear in the byte array in order. I am uneasy about this approach because it is probably buggy, especially in accommodating what seem to be individual serialization protocols followed by HashMap, ArrayList, etc. But it might work, if I can figure out a way to read the bytes that represent field values:
I can try to read and store primitives based on size/offset, but when I encounter my first object, it gets a bit more complicated -- there is no clear way to distinguish between which bytes are associated with which values anymore (without actually loading the object in the way that ObjectInputStream probably does?).
--
Can anyone suggest either a potential solution that I'm obviously looking past, or a trusted library that can help parse the serialized data without loading objects?
Thank you for reading, and for all comments/suggestions!!! I apologize if something is unclear and I would be happy to clarify if you bear with me.

You can't do this in principle. Any Java class can take over its own Serialization and write arbitrary data to the stream that only it knows how to parse and reconstruct, via code that is only invoked during deserialization.
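To illustrate the point, here is a small made-up class (OpaquePayload, not from the question) that takes over its own serialization through the standard private writeObject/readObject hooks. The value it writes is stored in a private format (here just XOR-masked, purely as an example), so a generic parser can see the bytes but has no way of knowing what they mean without running the class's own code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical example class; the masking scheme is arbitrary.
final class OpaquePayload implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int MASK = 0x5A5A5A5A;

    // transient: default serialization records no field name or value for it
    private transient int secret;

    OpaquePayload(int secret) {
        this.secret = secret;
    }

    int secret() {
        return secret;
    }

    // Called by ObjectOutputStream; writes data only this class understands.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(secret ^ MASK);
    }

    // Called by ObjectInputStream; the only code that knows how to decode it.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        secret = in.readInt() ^ MASK;
    }

    // Serialize and deserialize in memory, for demonstration.
    static OpaquePayload roundTrip(OpaquePayload payload) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(payload);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (OpaquePayload) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new OpaquePayload(1234)).secret()); // prints 1234
    }
}
```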

alternative to JSONObject in java

I've been using JSONObject as the return type on most of my classes and methods for Android to aid in debugging and informing the user of problems. But now that I'm trying to build an AsyncTask, JSONObject has been getting quirky. Is there any multi-type array that can be used to transport primitive data types in one object?

Perhaps using a Bundle would be helpful?
http://developer.android.com/reference/android/os/Bundle.html

If the type doesn't need to be preserved, you could always convert your data to Strings and transport that instead. Then you could use whatever you want: an array, an ArrayList, etc.
If you need to preserve the type, you can use a second value to denote the type. Or you could still use whatever data structure fits your needs performance-wise and store Object instances (Character for chars, Integer for ints, etc.) and then, on retrieval, use reflection to get their type info.
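A minimal sketch of the last idea, storing boxed values as Object and recovering the type on retrieval. MixedValues and describe are hypothetical names; instanceof and getClass() are the only type checks used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
final class MixedValues {

    // Recover the original primitive kind from the boxed wrapper type.
    static String describe(Object value) {
        if (value instanceof Integer) {
            return "int:" + value;
        }
        if (value instanceof Character) {
            return "char:" + value;
        }
        if (value instanceof Boolean) {
            return "boolean:" + value;
        }
        if (value instanceof Double) {
            return "double:" + value;
        }
        // Fall back to the runtime class for anything else.
        return value.getClass().getSimpleName() + ":" + value;
    }

    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        values.add(42);       // autoboxed to Integer
        values.add('x');      // autoboxed to Character
        values.add(true);     // autoboxed to Boolean
        values.add("text");   // already an object
        for (Object v : values) {
            System.out.println(describe(v));
        }
    }
}
```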

Without knowing a little more info, I'd possibly look into using Gson since you say that JSONObject has been quirky for you. I've been using it and haven't had any problems so far:
http://code.google.com/p/google-gson/

The best way to provide a JSON InputStream

In different languages I need to provide users with a stream of JSON objects with an interface similar to the following:
JSONObject json = stream.nextJSON();
Since it is a stream, each call will block until a full object has been retrieved. This means it makes no sense to try and encapsulate each JSON object inside a big array. An extra layer of structure and processing has to be added to the stream.
I have thought of two options:
Segmenting the stream with the null-termination character.
Writing a primitive parser that understands JSON scope, so it can detect the end of an object.
Each of the above has a number of potential issues to discuss: How will null-termination interact with the file system, socket or underlying streams in C++, Java and other languages? What edge cases would we need to take into account when parsing? (different types of quote symbol might confuse a parser, for example). Furthermore, there might be alternatives to the two above.
So the question is: What is the best way to provide a JSON InputStream?

Well Google already thought about it apparently:
http://sites.google.com/site/gson/streaming
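If you would rather not pull in a library, the second option from the question (a primitive parser that tracks JSON scope) can be sketched in plain Java as below. JsonObjectSplitter is a made-up name, and the sketch assumes the stream contains only top-level objects (not bare arrays or scalars). Strict JSON only allows double-quoted strings, so tracking that one quote character and the backslash escape is enough to ignore braces inside strings.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper; splits a character stream into top-level JSON objects.
final class JsonObjectSplitter {
    private final Reader in;

    JsonObjectSplitter(Reader in) {
        this.in = in;
    }

    // Blocks until one complete {...} object has been read and returns its text.
    // Returns null at end of stream (a truncated trailing object is dropped).
    String next() {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        try {
            int ch;
            while ((ch = in.read()) != -1) {
                char c = (char) ch;
                if (depth == 0 && c != '{') {
                    continue;               // skip whitespace/separators between objects
                }
                sb.append(c);
                if (inString) {
                    if (escaped) {
                        escaped = false;    // character after a backslash is literal
                    } else if (c == '\\') {
                        escaped = true;
                    } else if (c == '"') {
                        inString = false;
                    }
                } else if (c == '"') {
                    inString = true;
                } else if (c == '{') {
                    depth++;
                } else if (c == '}' && --depth == 0) {
                    return sb.toString();   // matching close of the top-level object
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        JsonObjectSplitter stream =
            new JsonObjectSplitter(new StringReader("{\"a\":1} {\"b\":{\"c\":2}}"));
        System.out.println(stream.next()); // prints {"a":1}
        System.out.println(stream.next()); // prints {"b":{"c":2}}
        System.out.println(stream.next()); // prints null
    }
}
```

The returned text still has to be handed to a real JSON parser; the splitter only finds object boundaries and does not validate anything.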

Accepted practice for converting an Object to and from a String in Java?

What is the commonly accepted method for converting arbitrary objects to and from their String representations, assuming that the exact class of the object is known ? In other words, I need to implement some methods similar to the following:
public interface Converter {
    /**
     * Convert this object to its String representation.
     */
    public String asString(Object obj);

    /**
     * Take the String representation of an object produced by asString,
     * and convert it back to an object of the appropriate class.
     */
    public Object asObject(String stringRepresentation, Class clazz);
}
Ideally, the solution should:
Use the object's built-in toString() functionality, if possible. Thus, converter.asString(new Integer(5)) should return "5", and converter.asObject("5", Integer.class) should return an Integer with the value of 5.
Produce output that is human-readable whenever possible.
Deal with all common Java data types, including java.util.Date .
Allow me to plug in conversion functionality for my own, custom classes.
Be reasonably light-weight and efficient.
I understand that there are any number of ready-made solutions that do this (such as Google's protocol buffers, for example), and that I could easily implement a one-off solution myself. My question is not, "how do I solve this problem", but rather, "which one of the many ready-made solutions is the current industry standard ?".

My question is not, "how do I solve this problem", but rather, "which one of the many ready-made solutions is the current industry standard ?".
None of them has emerged as a de facto standard.
The closest you can get is the "default" XML serialization mechanism, which BTW sucks if you intend to write the XML by hand (though it is good enough when it is generated and read automatically).
The next closest thing to a standard, and one suited to daily usage, would be JSON to Java, but, well, you know, it is not Java Java.

I would vote for Json as well and then particularly Gson. It handles generic/parameterized objects very well.
Alternatively, you can also write a generic object converter which does all of the needed conversions with a little help of reflection, such as this example. But if your "API" requires that this converter be published as an interface to the end user, then I would only suggest to replace
public Object asObject(String stringRepresentation, Class clazz);
by for example
public <T extends Object> T asObject(String stringRepresentation, Class<T> clazz);
so that one doesn't need to cast it afterwards.
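A rough sketch of what a converter with that generic signature might look like, using a registry of parser functions instead of reflection. SimpleConverter and register are hypothetical names, and only a handful of types are wired in; java.util.Date is deliberately left out because its toString() output is not meant to be parsed back.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical converter; names and the set of built-in types are illustrative.
final class SimpleConverter {
    private final Map<Class<?>, Function<String, ?>> parsers = new HashMap<>();

    SimpleConverter() {
        register(String.class, s -> s);
        register(Integer.class, Integer::valueOf);
        register(Long.class, Long::valueOf);
        register(Double.class, Double::valueOf);
        register(Boolean.class, Boolean::valueOf);
    }

    // Plug in conversion for additional (e.g. custom) classes.
    <T> void register(Class<T> type, Function<String, T> parser) {
        parsers.put(type, parser);
    }

    // Relies on the object's own toString(), as the question asks.
    String asString(Object obj) {
        return String.valueOf(obj);
    }

    // The generic return type means callers do not need to cast.
    <T> T asObject(String stringRepresentation, Class<T> clazz) {
        Function<String, ?> parser = parsers.get(clazz);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for " + clazz.getName());
        }
        return clazz.cast(parser.apply(stringRepresentation));
    }

    public static void main(String[] args) {
        SimpleConverter converter = new SimpleConverter();
        converter.register(UUID.class, UUID::fromString);  // custom type plugged in

        Integer five = converter.asObject("5", Integer.class);
        System.out.println(five + 1);                             // prints 6
        System.out.println(converter.asString(Integer.valueOf(5))); // prints 5
    }
}
```

This only round-trips types whose toString() output is accepted by the registered parser, which is exactly the limitation other answers raise about relying on toString().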

You can look at the svenson library; it converts Java objects to JSON. It's pretty quick and uses annotations to introduce custom converters.
http://code.google.com/p/svenson/
Not long ago I would have proposed an xml serializer, but after playing with couchdb for a couple of days, I serve a new master... json.

Although it is tempting to use or attempt to implement "toString()" as a reversible operation, the purpose of "toString()" is to generate a user-friendly and easily understandable representation of an object, and this goal is often at odds with including enough state information to truly restore the original object.
If you are looking to persist an object, using XML, JSON, or binary serialization is probably the best way to go. The "toString()" function should report a human-friendly representation of an object (e.g. "5", "(3,0,2)", "5+6i", "{1, 2, 3, 4, 5, 6}", "{x => y, z => 3}", etc.). Even in cases where it is possible to completely restore the object from the generated string, the time it would take to write a parser for each type of (potentially unstructured) text is better saved by using automated XML persistence and spent on writing the actual application.

I agree with Oscar that XML might be the preferable form here, if you can tolerate large uncompressed file sizes. To elaborate on his answer, in my experience if you write a fairly straightforward utility class you can serialize your objects into XML with not too much work. To read them back, I would recommend Apache Digester which does a great job of rule-based interpretation.
I would only opt for other file formats if I cared about performance or file sizes, though I personally prefer the flexibility of XML in most cases.
