I have a class with dozens of parameters. I want to encode this class using only a small number of those parameters (say 3) and decode it back. In other words, I don't care if the other parameters change. Think of those few parameters as the primary keys of a table; they are the only ones we are concerned with.
I can obviously use Base64 encoding/decoding for the final step, but handling the different object types among those parameters was turning into many lines of code (with type checks).
A better option I can think of is to convert the object to JSON first and then encode that, but that again requires explicitly choosing the parameters. What would be the best way to do this?
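One possible way to avoid the type checks, offered only as a minimal sketch: write the key fields with DataOutputStream, which has a typed write method per primitive, and Base64-encode the resulting bytes. The Customer class, its three key fields, and the KeyCodec helper are hypothetical names invented for illustration; the key fields still have to be listed explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical class: id, region and version play the role of the "primary key".
class Customer {
    final long id;
    final String region;
    final int version;
    final String name; // stands in for the many non-key fields; ignored by the codec

    Customer(long id, String region, int version, String name) {
        this.id = id;
        this.region = region;
        this.version = version;
        this.name = name;
    }
}

class KeyCodec {
    // Writes only the key fields, each with its typed writer, then Base64-encodes the bytes.
    static String encode(Customer c) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(c.id);
            out.writeUTF(c.region);
            out.writeInt(c.version);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes.toByteArray());
    }

    // Reads the key fields back in the same order; non-key fields are left null.
    static Customer decode(String token) {
        byte[] raw = Base64.getUrlDecoder().decode(token);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            long id = in.readLong();
            String region = in.readUTF();
            int version = in.readInt();
            return new Customer(id, region, version, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling KeyCodec.decode(KeyCodec.encode(customer)) yields an object carrying the same three key values, regardless of what the other fields held.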
How can I access state under the same id across multiple transformations? For example, the following stores an Order object via ValueState in the OrderMapper class:
env.addSource(source1()).keyBy(Order::getId).flatMap(new OrderMapper()).addSink(sink1());
Now I would like to access the same Order object via a SubOrderMapper class:
env.addSource(source2()).keyBy(SubOrder::getOrderId).flatMap(new SubOrderMapper()).addSink(sink2());
Edit: It looks like state cannot be maintained across multiple operators. Is there a way to have one operator accept multiple inputs, let's say 5 sources?
Take a look at CoProcessFunction
To realize low-level operations on two inputs, applications can use
CoProcessFunction or KeyedCoProcessFunction. This function is bound to
two different inputs and gets individual calls to processElement1(...)
and processElement2(...) for records from the two different inputs.
Side outputs might also be useful for you.
Edit:
The union operator may be an option.
You can create a custom EitherOfFive class that contains one of your five different stream values (I'm assuming they are all different). See Flink's Either class for the one of two case.
Each input stream would use a Map function that converts the input class type to an EitherOfFive type.
There would be a getKey() method that would figure out (based on which of the five values is actually set) what key to return. And then you can have a single KeyedProcessFunction that takes as input this EitherOfFive type.
If the output is always the same, then you're all set. Otherwise you'll want side outputs, one per type, that feed the five different sinks.
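The wrapper idea above can be sketched in plain Java, reduced to two variants for brevity. This does not use any Flink API: the Order and SubOrder fields, the OrderEvent wrapper, and OrderEventProcessor are hypothetical names, and a HashMap stands in for the per-key state that a single keyed operator would hold.

```java
import java.util.HashMap;
import java.util.Map;

class Order {
    private final String id;
    Order(String id) { this.id = id; }
    String getId() { return id; }
}

class SubOrder {
    private final String orderId;
    private final String item;
    SubOrder(String orderId, String item) { this.orderId = orderId; this.item = item; }
    String getOrderId() { return orderId; }
    String getItem() { return item; }
}

// Tagged wrapper: exactly one of the two fields is non-null.
class OrderEvent {
    final Order order;
    final SubOrder subOrder;

    private OrderEvent(Order order, SubOrder subOrder) {
        this.order = order;
        this.subOrder = subOrder;
    }

    static OrderEvent ofOrder(Order order) { return new OrderEvent(order, null); }
    static OrderEvent ofSubOrder(SubOrder subOrder) { return new OrderEvent(null, subOrder); }

    // Picks the key from whichever variant is set, so both kinds share one key space.
    String getKey() {
        return order != null ? order.getId() : subOrder.getOrderId();
    }
}

// Stand-in for the single keyed operator; the map plays the role of keyed state.
class OrderEventProcessor {
    private final Map<String, Order> orderByKey = new HashMap<>();

    // Returns "orderId:item" when a sub-order matches a stored order, otherwise null.
    String process(OrderEvent event) {
        if (event.order != null) {
            orderByKey.put(event.getKey(), event.order);
            return null;
        }
        Order known = orderByKey.get(event.getKey());
        return known == null ? null : known.getId() + ":" + event.subOrder.getItem();
    }
}
```

In an actual job, each source would be mapped to the wrapper type, the streams unioned, and the result keyed by getKey() before reaching the one stateful function.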
I need to parse untrusted Java serialized objects. The data is given to me as a byte array (written at some point by ObjectOutputStream).
I do not want to simply call ObjectInputStream.readObject() and/or load the actual object. I am looking for a way to safely parse the bytes and grab field names & values.
--
Here's a little summary of my attempt so far, after taking a look at the ObjectInputStream procedure for deserializing objects.
I have tried to extract field types/names (as Unicode strings) recursively based on expected stream constants. I end up with a list of field names whose values should appear in the byte array in order. I am uneasy about this approach because it is probably buggy, especially when accommodating what seem to be individual serialization protocols followed by HashMap, ArrayList, etc. But it might work, if I can figure out a way to read the bytes that represent field values:
I can try to read and store primitives based on size/offset, but when I encounter my first object, it gets a bit more complicated -- there is no clear way to distinguish between which bytes are associated with which values anymore (without actually loading the object in the way that ObjectInputStream probably does?).
--
Can anyone suggest either a potential solution that I'm obviously looking past, or a trusted library that can help parse the serialized data without loading objects?
Thank you for reading, and for all comments/suggestions!!! I apologize if something is unclear and I would be happy to clarify if you bear with me.
You can't do this in principle. Any Java class can take over its own Serialization and write arbitrary data to the stream that only it knows how to parse and reconstruct, via code that is only invoked during deserialization.
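As a small illustration of that point, here is a hypothetical class (Opaque and OpaqueDemo are invented names) that takes over its own serialization. Its only field is transient, so no field name or type for it appears in the class descriptor; the value is written in a private layout that only its own readObject knows how to undo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class Opaque implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int MASK = 0x5A5A5A5A;

    // transient: excluded from default serialization and from the described fields
    private transient int secret;

    Opaque(int secret) { this.secret = secret; }
    int getSecret() { return secret; }

    // Called by ObjectOutputStream; writes the value in a class-private format.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(secret ^ MASK);
    }

    // Called by ObjectInputStream; the only code that knows how to reverse the format.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        secret = in.readInt() ^ MASK;
    }
}

class OpaqueDemo {
    // Serializes and deserializes a trusted, locally created instance.
    static Opaque roundTrip(Opaque value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Opaque) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A parser that only walks field descriptors would find no named field for secret here, and the extra bytes would be meaningless without running the class's own readObject.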
I am starting a refactoring to integrate four small projects into one. The four projects are very similar.
The general logic of each project is this: an HTTP server receives a request in JSON format. For example, request A (which has two objects, a_key and b_key):
{a_key : a_value, b_key : {bb_key: {b_key1 : b_value1}}}
But the four types of request are not the same; there are small differences between them. For example, the second project receives request B (which has three objects: a_key, c_key and d_key):
{a_key : a_value, c_key : {cc_key: {c_key1 : c_value1}},d_key: [dd1, dd2, dd3]}
As shown above, all requests are in JSON format. The differences are that some keys may have different names (e.g. bb_key vs. cc_key) and the number of parameters may differ (e.g. B has the d_key parameter). I can't know the concrete parameter names and counts in advance.
All responses are also in JSON format but, as with the requests, they differ slightly. The processing is similar: based on the parameters, and after some filters, the response is returned.
I think the difficulty of this refactoring lies in generalizing Request and Response. In our four projects most of the code is similar, with small differences in Request and Response. Our code is in Java and we use Jackson to parse every request; we define a concrete class for every object in the request (e.g. a_key, b_key, c_key, d_key). I don't want to define so many object classes for the 4 request types, since only one or two parameters differ. I have no idea how to generalize this Request. Does anyone have ideas? Thank you!
In a generic class, convert your JSON parameters into a HashMap. Implement the Factory design pattern so that a factory class returns the appropriate filter based on whether a required key is present in the HashMap. Use the returned filter to process the request and return the JSON response.
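A minimal sketch of that suggestion follows. The RequestFilter interface, the FilterFactory class, and the shape of the response map are hypothetical; the request map is built by hand here to keep the example dependency-free, whereas in the real project it would come from parsing the JSON body with Jackson.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical filter abstraction: turns a parsed request into a response map.
interface RequestFilter {
    Map<String, Object> apply(Map<String, Object> request);
}

class FilterFactory {
    // Chooses a filter by checking which distinguishing key the request contains.
    static RequestFilter forRequest(Map<String, Object> request) {
        if (request.containsKey("d_key")) {
            return req -> response("B", req);
        }
        if (request.containsKey("b_key")) {
            return req -> response("A", req);
        }
        throw new IllegalArgumentException("Unrecognized request shape: " + request.keySet());
    }

    // Placeholder processing: echoes the shared a_key and records the detected type.
    private static Map<String, Object> response(String type, Map<String, Object> req) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("type", type);
        out.put("a_key", req.get("a_key"));
        return out;
    }
}
```

The shared pipeline then needs only FilterFactory.forRequest(request).apply(request); supporting another request variant means adding one branch and one filter instead of a new set of request classes.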
I am working in Java. I have a class called Command. This class stores a variable List of parameters that are primitives (mostly int and double). The type, number, and order of parameters are specific to each command, so the List is of type Object. I won't ever query the table based on these parameter values, so I figured I would concatenate them into a single String or serialize them in some way. I think this may be a better approach than normalizing the table, because I would have to join every time and that table would grow huge pretty quickly. (Edit: The Command object also stores some other members that won't be serialized, such as a String to identify the type of command and a Timestamp for when it was issued.)
So I have 2 questions:
Should I turn them into a delimited String? If so, how do I get each object as a String without knowing which type to cast them to? I attempted to loop through and use the .toString method, but that is not working. It seems to be returning null.
Or is there some way to just serialize that data of the array into a column of the DB? I read about serialization and it seems to be for the context of serializing whole classes.
I would use a JSON serializer and deserializer like Jackson to store and retrieve those command objects in the DB without losing the specific type information. On a side note, I would have these commands implement a common interface and store them in a list of commands rather than a list of objects.
Yes, I know it's bad practice and I should instead normalize my tables. That put aside, is it possible to serialize a String[] array and store it in the database?
I am from the lenient and forgiving world of PHP, where invoking the serialize() function would convert the array into a string.
Is there an equivalent of doing such heresy in Java?
Apart from normalization, are there more elegant ways of storing String Arrays in the database?
In case it's applicable, I am using the jdbc driver for my MySQL connections.
Yes. You can serialize any Java objects and store the serialized data into MySQL.
If you use the regular serialization (ObjectOutputStream), the output is always binary. Even String is serialized into binary data. So you have to Base64 encode the stream or use a binary column like BLOB.
This is different from PHP, whose serialize() converts everything into text.
You can also use the XML serialization in Java (XMLEncoder) but it's very verbose.
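The ObjectOutputStream route described above could look roughly like this. StringArrayCodec is a hypothetical helper name; the Base64 step is only needed for a text column, since the raw bytes can go straight into a BLOB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

class StringArrayCodec {
    // Serializes the array to binary, then Base64-encodes it for storage in a text column.
    static String toBase64(String[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverses toBase64. Only use this on data your own application wrote.
    static String[] fromBase64(String text) {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (String[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The encoded string can be bound to a PreparedStatement with setString, or the unencoded byte array with setBytes for a BLOB column.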
If you're thinking in terms of raw arrays, you're still writing PHP in Java.
Java's an object-oriented language. An array of Strings really isn't much of an abstraction.
You'll get perfectly good advice here telling you that it's possible to serialize that array of Strings into a BLOB that you can readily store in MySQL, and you can tell yourself that leniency is a virtue.
But I'm going to remind you that you're losing something by not thinking in terms of objects. They're really about abstraction and encapsulation and dealing with things at a higher level than bare-metal ints, Strings, and arrays.
It'd be a good exercise to try and design an object that might encapsulate an array or another more sophisticated data structure of child objects that were more than Strings. There'd be a 1:m relationship between parent and child that would better reflect the problem you were really trying to solve. That would be a far more object-oriented design than the one you're proposing here.
There are various good serialization/deserialization libraries that automatically convert JavaBean objects to/from XML and JSON strings. One I've had good experience with is XStream.
Java's built-in support for serialization can do the same thing, and you can write custom serialization/deserialization methods for Java to call.
You can roll your own serialization methods too, eg converting to and from a comma-separated value (CSV) format.
I'd opt for a library like XStream first, assuming there's a very compelling reason not to normalize the data.
You don't want to serialize the array. I'm not sure why you'd serialize it in PHP either, because implode() and explode() would be more appropriate. You really should normalize your data, but aside from that, you could very easily Google a solution for converting an array to a string.
But surely the more logical thing to do would be to save each string as its own record with a suitable identifier. That would probably be less coding than serializing -- a simple loop through the elements of the array -- and would result in a clean database design, rather than some gooey mess.
If you really don't want to normalize these values into a separate table where each string would be in its own row, then just convert your array to a list of comma-separated values (escaping commas somehow), for instance by quoting each string so that the result looks like "str1","str2".
Google for CSV RFC for spec on how this should be properly escaped.
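A minimal sketch of the quoting approach, following the usual CSV convention (as in RFC 4180) of wrapping each field in double quotes and doubling any embedded quote. CsvLine is a hypothetical helper, and split assumes its input was produced by join; it does not validate arbitrary CSV.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLine {
    // Quotes every field and doubles embedded quotes, so commas inside values are safe.
    static String join(String[] values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(values[i].replace("\"", "\"\"")).append('"');
        }
        return sb.toString();
    }

    // Parses a line written by join back into its fields.
    static String[] split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        int i = 0;
        while (i < line.length()) {
            i++; // skip the opening quote of this field
            field.setLength(0);
            while (true) {
                char c = line.charAt(i);
                if (c != '"') {
                    field.append(c);
                    i++;
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // doubled quote is a literal quote
                    i += 2;
                } else {
                    i++; // closing quote
                    break;
                }
            }
            fields.add(field.toString());
            i++; // skip the comma separator, or step past the end of the line
        }
        return fields.toArray(new String[0]);
    }
}
```

For example, join applied to the two strings str1 and a,b gives "str1","a,b", which split turns back into the original two elements.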