I am relatively new to MongoDB and I have an idea in mind but am not sure how to go about it.
What I would like to do is essentially hash a MongoDB document (preferably its JSON form, so it is not database specific) and store that hash somewhere with a reference to that specific document. This needs to allow me to retrieve the document in the future via a query and compare it against the stored hash.
My idea is to get the JSON representation of the DBObject, hash it, and then add the hash as a field to that specific document before persisting it. Then, when querying for the object, exclude the hash field from the result so the returned DBObject hashes to the same value.
1 - Does MongoDB always return a consistent DBObject format that will always convert to the same JSON, so that the hash would always be the same?
2 - Would such an implementation even be viable? That is, storing the hash with the object itself essentially changes the object (thus making the hash invalid), but getting around that by not retrieving that field in the response.
3 - If that implementation would not work, what would be the simplest way to store the hash? Another object with a reference to the original document?
1 - Does MongoDB always return a consistent DBObject format that will always convert to the same JSON, so that the hash would always be the same? - No. MongoDB does not guarantee field order, so the JSON can differ depending on what kind of updates were performed on the document. There is no guarantee that the field order will be consistent, or the same, after an update. If no order-changing updates were performed, the order should be preserved (see MongoDB update on Field Order).
But when you deserialize the JSON into an object using Jackson or something else, it will deserialize to the same object and should have the same hash.
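For example, a minimal sketch of producing an order-independent hash, assuming the document has been read into a Map and hashed with SHA-256 (the variable names here are made up for illustration):

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;

// Sort map keys so the serialized JSON (and therefore the hash) does not depend on field order.
ObjectMapper mapper = new ObjectMapper();
mapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);

String canonicalJson = mapper.writeValueAsString(documentAsMap); // documentAsMap: the document as a Map<String, Object>
byte[] hash = MessageDigest.getInstance("SHA-256")
        .digest(canonicalJson.getBytes(StandardCharsets.UTF_8));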
2 - Would such an implementation even be viable? That is, storing the hash with the object itself essentially changes the object (thus making the hash invalid), but getting around that by not retrieving that field in the response.
It looks like, from this answer, you can use Jackson or Gson to hash the JSON object even though it is not ordered.
Excluding a field should not be a problem.
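With the MongoDB Java driver, excluding the stored hash field when reading the document back could look like this (a sketch; the field name "hash" and the collection variable are assumptions):

import com.mongodb.client.model.Projections;
import org.bson.Document;

// Return the document without the stored hash field, so hashing the result
// reproduces the value that was computed before the field was added.
Document doc = collection.find(new Document("_id", id))
        .projection(Projections.exclude("hash"))
        .first();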
If you store the hash as a field in the object itself, every write query that saves the document (which overwrites the entire document) will have to write the hash into it. If any of them fail to do so, the hash will be lost.
An update query has another problem: along with changing the data, it also has to update the hash of the document. So this will involve reading the object, modifying it, computing the hash and storing it back; you will not be able to use the primitive update queries.
If you make the hash the primary key (the _id field), that would mitigate this problem, although you probably need _id for something else.
3 - The simplest way would be to store the _id of the document being hashed in another collection, with the hash as the _id of that new collection:
{
"_id":<hash code of docuemnt>,
"refer":<_id of the document to be hashed>
}
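With the Java driver, writing that mapping could look roughly like this (a sketch; "hashes" is an assumed collection name and the variables are placeholders):

import com.mongodb.client.MongoCollection;
import org.bson.Document;

// The hash becomes the _id of the new collection; "refer" points back to the hashed document.
MongoCollection<Document> hashes = database.getCollection("hashes");
hashes.insertOne(new Document("_id", hashOfDocument).append("refer", originalDocumentId));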
This would involve multiple reads and writes, which will hurt performance; whether that is acceptable depends on your use case.
MongoDB, in my view, is a simple database designed to store and retrieve objects. If you need to do something more complicated with it than fast reads and writes, it is probably not fit for the task.
Idea:
Convert the POJO into a flat map using ObjectToMapTransformer, build a SolrInputDocument from the map, and store it in Solr.
When retrieving, get the document from Solr, convert it back into a map, and then into the POJO using MapToObjectTransformer.
Problem:
While saving the SolrInputDocument to Solr, a flattened key of the POJO like A.B[0].c gets converted to A.B_0_.c in Solr.
This altered form of storage in Solr makes it difficult to deserialize the SolrDocument back into the POJO.
How can I solve this problem? Or what is an alternate way of storing a queryable document in Solr that can be serialized and deserialized easily?
You usually annotate the fields in your POJO with the appropriate Solr fields that you're indexing each field into. See Mapping a POJO for Solr.
If you really want to serialize the complete object into Solr, serialize it into a single field, and if possible, use a string field (as that will store your object directly). If you want to actually search string values inside the object as well, you can use a text field - but since everything is imported into a single field, that'll have a few limitations (like if you want to score different fields or search for values in a single property from the objects).
So: use the @Field annotation from SolrJ's beans package to do specific POJO handling, or mangle the object into a single field and search for strings in that field.
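A minimal sketch of the @Field approach (the class, core name and field names are made up for illustration):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.beans.Field;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class Item {
    @Field("id")
    public String id;

    @Field("name")
    public String name;

    // The whole object serialized into a single string field, as described above.
    @Field("payload")
    public String payload;

    public static void main(String[] args) throws Exception {
        SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
        Item item = new Item();
        item.id = "1";
        item.name = "example";
        item.payload = "{\"name\":\"example\"}";
        client.addBean(item);   // the @Field annotations map the POJO fields to Solr fields
        client.commit();
        client.close();
    }
}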
I am trying to build an in-memory cache (using a ConcurrentHashMap in Java 8). The key-value pair would be a JSON string and the result of a complex operation on that string.
The objective is to not perform the complex operation every time, and to do it only when the JSON string changes.
Is there a way I can uniquely represent this string, given that the value of any of the JSON keys can change within the application at any time?
I have looked at the hashCode() method but saw its shortcomings.
Right now I am trying to see whether the MD5 representation of the string would serve as a good key for the JSON string.
If anyone has already faced such a situation, can you please provide your inputs?
As I understand it, a Java String instance is final (immutable), so even if the JSON object is a very long string, the String class only calculates the hashCode of the String once (at construction time or on first use, I can't remember which) and keeps it as an instance attribute for the lifetime of the String. So there is no performance penalty in using the JSON string both as the key and the value in a ConcurrentHashMap. This is the same way a Java Set works, being backed by a Map.
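A minimal sketch of the MD5-key idea with a ConcurrentHashMap (Result and expensiveOperation are placeholders for your own types and logic, not from the question):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;

// Key the cache by the MD5 of the JSON string; computeIfAbsent runs the expensive
// operation only when no entry exists for that exact JSON content.
ConcurrentHashMap<String, Result> cache = new ConcurrentHashMap<>();
String key = md5Hex(jsonString);
Result result = cache.computeIfAbsent(key, k -> expensiveOperation(jsonString));

static String md5Hex(String s) {
    try {
        byte[] digest = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    } catch (NoSuchAlgorithmException e) {
        throw new IllegalStateException(e);
    }
}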
This should be an easy task for Cacheonix, and you will save time on building your own caching solution:
Cache<String, ResultOfCalculation> cache = Cacheonix.getInstance().getCache("my.cache");
cache.put(myJsonString, myResultOfCalculation);
...
ResultOfCalculation result = cache.get(myJsonString);
Is there any way to tell Glassfish that the hash value for a certain data member of an entity class should be calculated and stored in the database instead of the original value?
If you modify the getter of a field to produce its hash instead of the original value, you might end up with the hash being stored.
If your database has a hash function, another option is to issue a native query using the entity manager.
Give it a try
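A rough sketch of the getter idea with JPA property access (the entity and field names are made up; this is standard JPA rather than anything Glassfish-specific, and note that re-reading the entity will hand the already-hashed value back to the setter):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import javax.persistence.Access;
import javax.persistence.AccessType;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
@Access(AccessType.PROPERTY)   // the provider reads values through the getters when persisting
public class Credential {

    private Long id;
    private String secret;

    @Id
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    // The getter returns the hash, so the hash rather than the raw value ends up in the column.
    public String getSecret() {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(secret.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
    public void setSecret(String secret) { this.secret = secret; }
}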
My case is the following. I have a JSF form with three outputTexts and the corresponding inputTexts, let's say:
outputtext1 - inputtext1
outputtext2 - inputtext2
outputtext3 - inputtext3
Currently I use a backing-bean method 'Save' in order to store them in the database (a Hibernate entity, let's say table1 with primary key table1.id) in the fields table1.field1, table1.field2, table1.field3.
So each record in the table has the values entered in the inputTexts.
My question is how I am going to store the form data in the database in a form like the following:
{"outputtext1:inputtext1","outputtext2:inputtext2","outputtext3:inputtext3"}
and then fetch it again, parse it and rebuild the form. I am thinking of manipulating the form data as a JSON object, but I am new to both Java and JSON, so some guidelines would be really useful for me!
This is an indicative example; the forms are going to be dynamic and created on the fly.
Why would you want to serialize/deserialize a JSON to send it directly to the database? Deserialization has its own issues and multiple deserializations could (not will) be a source of issues.
You should save the fields as attributes of a given entity and then, with the help of libraries like Gson, generate the JSON from the entity.
Update
Since your form is dynamic, you can use some adaptable entity structure to hold your data.
Your entities can either have a Map<String,String> attribute or a collection of, say, FieldRecord entities that contain key-value pairs, as sketched below.
I suggest this because a JSON in the database can lead to complex issues, in particular if you have to query that data later. You'll have to process the JSONs in order to either report on them or find which records have a particular field. This is just an example; things can get more complex.
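A rough sketch of the Map-based entity with JPA annotations (the entity, table and column names are illustrative):

import java.util.HashMap;
import java.util.Map;
import javax.persistence.CollectionTable;
import javax.persistence.Column;
import javax.persistence.ElementCollection;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.MapKeyColumn;

@Entity
public class FormSubmission {

    @Id
    @GeneratedValue
    private Long id;

    // One row per form field: the label (outputtext) as the key, the entered value (inputtext) as the value.
    @ElementCollection
    @CollectionTable(name = "form_field")
    @MapKeyColumn(name = "field_label")
    @Column(name = "field_value")
    private Map<String, String> fields = new HashMap<>();

    public Map<String, String> getFields() { return fields; }
}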
Simple: you need a BLOB-type column in your table to store the JSON, and when you retrieve it in Java you just need to decode the JSON. I recommend using https://code.google.com/p/json-simple/ - it's very simple.
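Decoding with json-simple could look like this (a sketch; the key name is just an example):

import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;

// Parse the JSON string read from the BLOB column back into an object.
JSONParser parser = new JSONParser();
JSONObject obj = (JSONObject) parser.parse(jsonFromDatabase);
String value = (String) obj.get("outputtext1");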
Convert the JSONObject into its String form and then store that.
And when you read it back, convert it back into a JSONObject like below:
JSONObject obj = new JSONObject(stringRepresentationOfJSON);
I am working in Java. I have a class called Command. This class stores a variable list of parameters that are primitives (mostly int and double). The type, number, and order of parameters is specific to each command, so the List is of type Object. I won't ever query the table based on what these parameter values are, so I figured I would concatenate them into a single String or serialize them in some way. I think this may be a better approach than normalizing the table, because I would have to join every time and that table would grow huge pretty quickly. (Edit: The Command object also stores some other members that won't be serialized, such as a String to identify the type of command and a Timestamp for when it was issued.)
So I have 2 questions:
Should I turn them into a delimited String? If so, how do I get each object as a String without knowing which type to cast it to? I attempted to loop through and use the .toString method, but that is not working; it seems to be returning null.
Or is there some way to just serialize the data of the list into a column of the DB? I read about serialization, and it seems to be aimed at serializing whole classes.
I would use a JSON serializer and deserializer like Jackson to store and retrieve those command objects in the DB without losing the specific type information. On a side note, I would have these commands implement a common interface and store them in a list of commands, not in a list of objects.
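For the parameter list itself, a minimal sketch with Jackson (variable names are made up; ints and doubles survive the round trip as JSON numbers):

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

// Serialize the parameter list to a JSON string for the DB column,
// and read it back into a List<Object> later.
ObjectMapper mapper = new ObjectMapper();
String json = mapper.writeValueAsString(parameters);              // e.g. [3, 2.5, 7]
List<Object> restored = mapper.readValue(json, new TypeReference<List<Object>>() {});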