Converting JSON to XML generated invalid XML - java

Please have a look at the following.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONML;
import org.json.JSONTokener;
import org.json.XML;
import com.amazonaws.auth.ClasspathPropertiesFileCredentialsProvider;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
public class JsonToXML
{
    private AmazonS3Client s3;
    public JsonToXML(String inputBucket, String inputFile) throws IOException, JSONException
    {
        //Connection to S3
        s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
        Region usWest2 = Region.getRegion(Regions.US_EAST_1);
        s3.setRegion(usWest2);
        //Downloading the Object
        System.out.println("Downloading Object");
        S3Object s3Object = s3.getObject(new GetObjectRequest(inputBucket, inputFile));
        System.out.println("Content-Type: " + s3Object.getObjectMetadata().getContentType());
        //Read the JSON File
        BufferedReader reader = new BufferedReader(new InputStreamReader(s3Object.getObjectContent()));
        StringBuffer strBuffer = new StringBuffer("");
        int i=0;
        while (true) {
            String line = reader.readLine();
            if (line == null) break;
            System.out.println("Running: "+i);
            strBuffer.append(line);
            i++;
        }
        JSONTokener jTokener = new JSONTokener(strBuffer.toString());
        JSONArray jsonArray = new JSONArray(jTokener);
        //Convert to XML
        String xml = XML.toString(jsonArray);
        File f = new File("XML.xml");
        FileWriter fw = new FileWriter(f);
        fw.write(xml);
    }
}
This is what the JSON files look like:
[
    {
        "_type": "ArticleItem",
        "body": "Who's signing",
        "source": "money.cnn.com",
        "last_crawl_date": "2014-01-14",
        "url": "http: //money.cnn.com/"
    },
    {
        "_type": "ArticleItem",
        "body": "GMreveals",
        "title": "GMreveals625-horsepowerCorvetteZ06-Jan.13",
        "source": "money.cnn.com",
        "last_crawl_date": "2014-01-14",
        "url": "http: //money.cnn.com"
    }
]
This code generates invalid XML, or files without any text. By invalid I mean that some text is still generated after the last closing tag, so the entire file is invalid. What is wrong here?
UPDATE
Following jtahlborn's answer, I managed to generate an XML file with the following output:
<array><body>Who&apos;s signing</body><_type>ArticleItem</_type><source>money.cnn.com</source><last_crawl_date>2014-01-14</last_crawl_date><url>http: //money.cnn.com/</url></array><array><body>GMreveals</body><_type>ArticleItem</_type><title>GMreveals625-horsepowerCorvetteZ06-Jan.13</title><source>money.cnn.com</source><last_crawl_date>2014-01-14</last_crawl_date><url>http: //money.cnn.com</url></array>
But the XML validator reports:
XML Parsing Error: junk after document element
Location: http://www.w3schools.com/xml/xml_validator.asp
Line Number 1, Column 181:

You need to flush()/close() the FileWriter to ensure all the data is written to the file.
The problem is that you have two "top-level" elements in your XML result (two "array" elements). An XML document can have only one top-level (root) element.
UPDATE:
Try this for converting the json to xml:
String xml = XML.toString(jsonArray, "doc");
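Both points can be sketched with the JDK alone. This is an illustration under assumptions, not the original poster's code: the class XmlRootWrapper, its methods, and the root name "doc" are hypothetical names chosen for the example. It checks a fragment with two top-level elements using the JDK parser, wraps it in a single root element (one option if a converter's output still has several top-level elements), and writes the file in a try-with-resources block so the writer is flushed and closed.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
final class XmlRootWrapper {

    // Wraps an XML fragment in one root element so the result has a single top-level node.
    static String wrapInRoot(String fragment, String rootName) {
        return "<" + rootName + ">" + fragment + "</" + rootName + ">";
    }

    // Returns true if the JDK's XML parser accepts the text as a well-formed document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // try-with-resources closes (and therefore flushes) the writer even if write() fails.
    static void writeTo(String path, String xml) throws IOException {
        try (FileWriter writer = new FileWriter(path)) {
            writer.write(xml);
        }
    }

    public static void main(String[] args) throws IOException {
        String fragment = "<array><body>a</body></array><array><body>b</body></array>";
        System.out.println(isWellFormed(fragment)); // false: two top-level elements
        String xml = wrapInRoot(fragment, "doc");
        System.out.println(isWellFormed(xml));      // true: single root element
        writeTo("XML.xml", xml);
    }
}
```

Wrapping by string concatenation is only safe when the fragment itself is already well-formed XML, as it is here.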

Related

How to parse nested JSON in Spring

I have nested JSON with a bunch of child objects, but I only need response_time and the question and subquestions of survey_data. What is the best way to parse nested JSON into an object in a Spring REST controller?
{
"is_test_data":false,
"language":"English",
"url_variables":{
"requestId":{
"key":"requestId",
"value":"1"
}
},
"response_time":1114,
"survey_data":{
"2":{
"id":2,
"type":"parent",
"question":"For each of the following factors, please rate your recent project",
"subquestions":{
"10":{
"10001":{
"id":10001,
"type":"MULTI_TEXTBOX",
"question":"Overall Quality : Rating",
"answer":null
}
},
"11":{
"10001":{
"id":10001,
"type":"MULTI_TEXTBOX",
"question":"Achievement of Intended Objectives : Rating",
"answer":null
}
}
}
},
"33":{
"id":33,
"type":"HIDDEN",
"question":"Submitted",
"answer_id":0
}
}
}
Thank you.
What you should do is parse the complete JSON into a JSONObject using the json-simple jar, which creates a map-like structure for the JSON. You can then simply get the desired value from it by key, as in the example below:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
public class JsonDeserializer {
    public static void main(String[] args) throws Exception {
        File file = new File("test.json");
        InputStream is = new FileInputStream(file);
        StringBuilder textBuilder = new StringBuilder();
        // Read the whole file as UTF-8 text; closing the reader also closes the stream
        try (Reader reader = new BufferedReader(
                new InputStreamReader(is, StandardCharsets.UTF_8))) {
            int c = 0;
            while ((c = reader.read()) != -1) {
                textBuilder.append((char) c);
            }
        }
        String jsonTxt = textBuilder.toString();
        // Parse the text into a map-like JSONObject
        Object obj = new JSONParser().parse(jsonTxt);
        JSONObject jo = (JSONObject) obj;
        System.out.println(jo.get("response_time"));
        // Nested objects are reached the same way, one key at a time
        JSONObject surveyData = (JSONObject) jo.get("survey_data");
        JSONObject firstQuestion = (JSONObject) surveyData.get("2");
        System.out.println(firstQuestion.get("question"));
    }
}
JSON is a lightweight, text-based data-interchange format. It can represent two structured types: objects and arrays. With the org.json library, a JSONArray can be built from a JSON string and then behaves like an ordered list. To reach a nested JSON object, use getJSONObject(index) on a JSONArray (or getJSONObject(key) on a JSONObject); getString(index) returns the string value stored at that position.
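Whichever parser is used, the parsed result is essentially a tree of nested maps, so the wanted fields can be collected with a short recursive walk. The sketch below is an assumption-laden illustration: SurveyQuestions, collectQuestions, and sampleResponse are hypothetical names, and hand-built maps stand in for the parsed document. Since json-simple's JSONObject is itself a java.util.Map, the same method should accept its parse result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
final class SurveyQuestions {

    // Collects every String stored under a "question" key anywhere below the given node.
    static List<String> collectQuestions(Object node) {
        List<String> found = new ArrayList<>();
        collect(node, found);
        return found;
    }

    private static void collect(Object node, List<String> found) {
        if (!(node instanceof Map)) {
            return; // scalars carry no nested questions
        }
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
            Object value = entry.getValue();
            if ("question".equals(entry.getKey()) && value instanceof String) {
                found.add((String) value);
            } else {
                collect(value, found); // descend into nested objects such as "subquestions"
            }
        }
    }

    // Builds a small stand-in for the parsed JSON from the question.
    static Map<String, Object> sampleResponse() {
        Map<String, Object> leaf = new LinkedHashMap<>();
        leaf.put("id", 10001);
        leaf.put("question", "Overall Quality : Rating");
        Map<String, Object> group = new LinkedHashMap<>();
        group.put("10001", leaf);
        Map<String, Object> subquestions = new LinkedHashMap<>();
        subquestions.put("10", group);
        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("id", 2);
        parent.put("question", "For each of the following factors, please rate your recent project");
        parent.put("subquestions", subquestions);
        Map<String, Object> surveyData = new LinkedHashMap<>();
        surveyData.put("2", parent);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("response_time", 1114);
        root.put("survey_data", surveyData);
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> root = sampleResponse();
        System.out.println(root.get("response_time"));
        System.out.println(collectQuestions(root.get("survey_data")));
    }
}
```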

XML extract data with Jackson

I have written a Groovy script that takes an XML file as input, and I want to extract the values of two tags from that file. I converted the XML to JSON and then the JSON to a Map. This Java code works for me:
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.commons.io.IOUtils;
import org.json.JSONObject;
import org.json.XML;
import com.bfi.digi.chk.CheckRemittance;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
CheckRemittance remise =data;
InputStream inputStream = new FileInputStream(new File(
"C:/xml/full_rejection_notification.xml"));
String xml = IOUtils.toString(inputStream);
JSONObject jObject = XML.toJSONObject(xml);
ObjectMapper mapper = new ObjectMapper();
mapper.enable(SerializationFeature.INDENT_OUTPUT);
Object json = mapper.readValue(jObject.toString(), Object.class);
String output = mapper.writeValueAsString(json);
Map<String, List<String>> response = new ObjectMapper().readValue(output, HashMap.class);
String state= response.get("Document").get("FIToFIPmtStsRpt").get("TxInfAndSts").get("TxSts").toString();
String errorCode=response.get("Document").get("FIToFIPmtStsRpt").get("TxInfAndSts").get("StsRsnInf").get("Rsn").get("Cd").toString();
return state.concat(" ").concat(errorCode);
The problem is that I'm now getting an XML file containing repeated blocks, and since a Map doesn't allow duplicate keys, I have to find another solution. I'm looking for suggestions.
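One possible direction, sketched here under assumptions rather than offered as a definitive answer, is to skip the Map and read the repeated blocks straight from the XML with the JDK's DOM API. The element names come from the code above; the class RepeatedBlockExtractor, the sample document, and the codes in it are made up for the example, and the sketch assumes the element names carry no namespace prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class RepeatedBlockExtractor {

    // Made-up sample with two TxInfAndSts blocks.
    static final String SAMPLE =
        "<Document><FIToFIPmtStsRpt>"
        + "<TxInfAndSts><TxSts>RJCT</TxSts><StsRsnInf><Rsn><Cd>AC01</Cd></Rsn></StsRsnInf></TxInfAndSts>"
        + "<TxInfAndSts><TxSts>RJCT</TxSts><StsRsnInf><Rsn><Cd>AM04</Cd></Rsn></StsRsnInf></TxInfAndSts>"
        + "</FIToFIPmtStsRpt></Document>";

    // Returns "state errorCode" for every TxInfAndSts block, in document order.
    static List<String> extract(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList blocks = doc.getElementsByTagName("TxInfAndSts");
            List<String> results = new ArrayList<>();
            for (int i = 0; i < blocks.getLength(); i++) {
                Element block = (Element) blocks.item(i);
                String state = textAt(block, "TxSts");
                String errorCode = textAt(block, "StsRsnInf", "Rsn", "Cd");
                results.add(state + " " + errorCode);
            }
            return results;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Follows the first matching descendant for each tag in the path; "" if any step is missing.
    private static String textAt(Element start, String... path) {
        Element current = start;
        for (String tag : path) {
            NodeList nodes = current.getElementsByTagName(tag);
            if (nodes.getLength() == 0) {
                return "";
            }
            current = (Element) nodes.item(0);
        }
        return current.getTextContent().trim();
    }

    public static void main(String[] args) {
        for (String line : extract(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Because each block is visited separately, repeated elements no longer collide the way duplicate keys do in a Map.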

How to convert arbitrary JSON to XML using BaseX?

How is arbitrary JSON converted to arbitrary XML using BaseX?
I'm looking at JsonParser from BaseX for this specific solution.
In this case, I have tweets using Twitter4J:
package twitterBaseX;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.logging.Logger;
import main.LoadProps;
import org.basex.core.BaseXException;
import twitter4j.JSONException;
import twitter4j.JSONObject;
import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.TwitterObjectFactory;
import twitter4j.conf.ConfigurationBuilder;
public class TwitterOps {
private static final Logger log = Logger.getLogger(TwitterOps.class.getName());
public TwitterOps() {
}
private TwitterFactory configTwitterFactory() throws IOException {
LoadProps loadTwitterProps = new LoadProps("twitter");
Properties properties = loadTwitterProps.loadProperties();
log.fine(properties.toString());
ConfigurationBuilder configurationBuilder = new ConfigurationBuilder();
configurationBuilder.setDebugEnabled(true)
.setJSONStoreEnabled(true)
.setOAuthConsumerKey(properties.getProperty("oAuthConsumerKey"))
.setOAuthConsumerSecret(properties.getProperty("oAuthConsumerSecret"))
.setOAuthAccessToken(properties.getProperty("oAuthAccessToken"))
.setOAuthAccessTokenSecret(properties.getProperty("oAuthAccessTokenSecret"));
return new TwitterFactory(configurationBuilder.build());
}
public List<JSONObject> getTweets() throws TwitterException, IOException, JSONException {
Twitter twitter = configTwitterFactory().getInstance();
Query query = new Query("lizardbill");
QueryResult result = twitter.search(query);
String string = null;
JSONObject tweet = null;
List<JSONObject> tweets = new ArrayList<>();
for (Status status : result.getTweets()) {
tweet = jsonOps(status);
tweets.add(tweet);
}
return tweets;
}
private JSONObject jsonOps(Status status) throws JSONException, BaseXException {
String string = TwitterObjectFactory.getRawJSON(status);
JSONObject json = new JSONObject(string);
String language = json.getString("lang");
log.fine(language);
return json;
}
}
Can't the JSONObject from Twitter4J simply be converted to XML?
There are a number of online converters that purport to do this and that, at least at first glance, seem quite adequate.
see also:
Converting JSON to XML in Java
Java implementation of JSON to XML conversion
Use the (excellent) JSON-Java library from json.org, then:
JSONObject json = new JSONObject(str);
String xml = XML.toString(json);
toString can take a second argument to provide the name of the XML root node.
This library can also convert XML to JSON using XML.toJSONObject(java.lang.String string).
Check the Javadoc for more information.
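To make the idea concrete, here is a rough JDK-only sketch of what such a JSON-to-XML conversion does conceptually. It is not the library's implementation: MapToXmlSketch and its methods are hypothetical, nested Map and List values stand in for parsed JSON, and keys are assumed to be valid XML element names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical converter for illustration only.
final class MapToXmlSketch {

    // Objects become an element with child elements, arrays repeat the tag, scalars become text.
    static String toXml(Object value, String tag) {
        StringBuilder sb = new StringBuilder();
        if (value instanceof Map) {
            sb.append('<').append(tag).append('>');
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                sb.append(toXml(entry.getValue(), String.valueOf(entry.getKey())));
            }
            sb.append("</").append(tag).append('>');
        } else if (value instanceof List) {
            for (Object item : (List<?>) value) {
                sb.append(toXml(item, tag));
            }
        } else {
            sb.append('<').append(tag).append('>')
              .append(escape(String.valueOf(value)))
              .append("</").append(tag).append('>');
        }
        return sb.toString();
    }

    // Escapes the characters that would otherwise be read as markup in element text.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("lang", "en");
        tweet.put("text", "a < b");
        tweet.put("tags", Arrays.asList("x", "y"));
        System.out.println(toXml(tweet, "tweet"));
    }
}
```

Passing a Map together with a root tag name always yields a single top-level element, which is what the second argument mentioned above is for.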

How to dynamically create items in DynamoDB reading objects from a json file using java

My requirement is to read data from a JSON file and create items in DynamoDB for the objects that are present in the file. For example, consider the following file:
{
"ISA": {
"isa01_name": "00",
"isa02": " ",
"isa03": "00",
"isa04": " ",
"isa05": "ZZ",
"isa06": "CLEARCUT ",
"isa07": "ZZ",
"isa08": "CMSENCOUNTERCTR",
"isa09": "120109",
"isa10": "1530",
"isa11": "U",
"isa12": "00501",
"isa13": "012412627",
"isa14": "0",
"isa15": "T",
"isa16": ":"
},
"GS": {
"gs02": "352091331",
"gs04": "20170109",
"gs06": "146823",
"gs03": "00580",
"gs05": "1530",
"gs01": "HC",
"gs08": "005010X222A1",
"gs07": "X"
},
"ST": {
"ST03_1705_Implementation Convention Reference": "005010X222A1",
"ST01_143_Transaction Set Identifier Code": "837",
"ST02_329_Transaction Set Control Number": "50138"
}
}
When I read this file, it has to create ISA, GS, and ST items in the DynamoDB database. If I then read another file containing different objects, items have to be created for those as well.
The following is the code I have right now:
import java.io.File;
import java.util.Iterator;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
public class LoadData {
    public static void main(String ards[]) throws Exception {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
            .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("http://localhost:8000", "us-west-2"))
            .build();
        DynamoDB dynamoDB = new DynamoDB(client);
        Table table = dynamoDB.getTable("NewTable");
        JsonParser parser = new JsonFactory().createParser(new File("C:\\Users\\Nikhil yadav\\Downloads\\healthclaims\\healthclaims\\src\\main\\resources\\output\\ValidJson.json"));
        JsonNode rootNode = new ObjectMapper().readTree(parser);
        Iterator<JsonNode> iter = rootNode.iterator();
        ObjectNode currentNode;
        while (iter.hasNext()) {
            currentNode = (ObjectNode) iter.next();
            try {
                table.putItem(new Item().withPrimaryKey("ISA", currentNode.path("ISA").toString().replaceAll("[^:a-zA-Z0-9_-|]", " "))
                    .withString("GS",currentNode.path("GS").toString().replaceAll("[^:a-zA-Z0-9_-|]", " "))
                    .withString("ST", currentNode.path("ST").toString().replaceAll("[^:a-zA-Z0-9_-|]", " "))
                );
                System.out.println("PutItem succeeded: ");
            }
            catch (Exception e) {
                System.err.println("Unable to add : ");
                System.err.println(e.getMessage());
                break;
            }
        }
        parser.close();
    }
}
It only accepts files that contain the ISA, GS, and ST objects, but I want a program that accepts JSON files with any kind of object.
I hope my question is clear. I am new to posting questions, so please bear with me if it is not.

Error indexing text from Apache Tika in Solr

I am trying to integrate Apache Tika with Solr so that text extracted by Tika could be indexed in Solr.
I tried the following code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.UUID;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.DublinCore;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MimeTypes;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
public class Main {
private static SolrServer solr;
public static void main(String[] args) throws IOException, SAXException, TikaException {
try {
solr = new HttpSolrServer("http://localhost:8983/solr/#/");
String path = "C:\\content\\";
String file_html = path + "mobydick.htm";
String file_txt = path + "/home/ben/abc.warc";
String file_pdf = path + "callofthewild.pdf";
processDocument(file_html);
processDocument(file_txt);
processDocument(file_pdf);
solr.commit();
}
catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
private static void processDocument(String pathfilename) {
try {
InputStream input = new FileInputStream(new File(pathfilename));
//use Apache Tika to convert documents in different formats to plain text
ContentHandler textHandler = new BodyContentHandler(10*1024*1024);
Metadata meta = new Metadata();
Parser parser = new AutoDetectParser(); //handles documents in different formats:
ParseContext context = new ParseContext();
parser.parse(input, textHandler, meta, context); //convert to plain text
//collect metadata and content from Tika and other sources
//document id must be unique, use GUID
UUID guid = java.util.UUID.randomUUID();
String docid = guid.toString();
//Dublin Core metadata (partial set)
String doctitle = meta.get(DublinCore.TITLE);
String doccreator = meta.get(DublinCore.CREATOR);
//other metadata
String docurl = pathfilename; //document url
//content
String doccontent = textHandler.toString();
//call to index
indexDocument(docid, doctitle, doccreator, docurl, doccontent);
}
catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
private static void indexDocument(String docid, String doctitle, String
doccreator, String docurl, String doccontent) {
try {
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", docid);
//map metadata fields to default schema
//location: path\solr-4.7.2\example\solr\collection1\conf\schema.xml
//Dublin Core
//thought: schema could be modified to use Dublin Core
doc.addField("title", doctitle);
doc.addField("author", doccreator);
//other metadata
doc.addField("url", docurl);
//content (and text)
//per schema, the content field is not indexed by default, used for returning and highlighting document content
//the schema "copyField" command automatically copies this to the "text" field which is indexed
doc.addField("content", doccontent);
//indexing
//when a field is indexed, like "text", Solr will handle tokenization, stemming, removal of stopwords etc, per the schema defn
//add to index
solr.add(doc);
}
catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
}
Unfortunately, I am hitting the error below:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/NoHttpResponseException
    at Main.main(Main.java:28)
Caused by: java.lang.ClassNotFoundException: org.apache.http.NoHttpResponseException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 1 more
Could you please help me with the resolution of this issue?
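The trace says the JVM could not find org.apache.http.NoHttpResponseException at runtime. As far as I know that class ships in the Apache HttpComponents httpcore jar, so a missing HttpComponents dependency on the classpath is a likely cause, though that is an assumption about this particular setup. A small probe like the sketch below (ClasspathProbe and isPresent are hypothetical names) can confirm whether a class is visible before and after the classpath is changed.

```java
// Hypothetical helper for illustration only.
final class ClasspathProbe {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above
        String name = "org.apache.http.NoHttpResponseException";
        System.out.println(name + " on classpath: " + isPresent(name));
    }
}
```

If the probe prints false when run with the same classpath as Main, the jar that provides the class is not being picked up at runtime.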
