After some quick research, I found the following three libraries for parsing TOML files in Java:
toml4j
tomlj
jackson-dataformats-text
What I am looking for is a library that can parse TOML files without a corresponding POJO class. While both toml4j and tomlj can do that, they do not seem to be maintained.
jackson-dataformats-text, on the other hand, is actively maintained, but I cannot parse a TOML file without the corresponding POJO class.
Is there a way to create a dynamic class in Java that I can use to parse any TOML file?
If you just need to read a TOML file without a POJO, FasterXML Jackson libraries are a great choice. The simplest method is to just read the content as a java.util.Map:
final var tomlMapper = new TomlMapper();
final var data = tomlMapper.readValue(new File("config.toml"), Map.class);
After that, the content of the file will be available in data.
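Nested TOML tables come back as nested Maps. As a rough illustration, assuming a hypothetical config.toml that contains a [server] table with host and port keys, access could look like this:
// hypothetical config.toml contents:
//   [server]
//   host = "localhost"
//   port = 8080
final var server = (Map<String, Object>) data.get("server"); // nested table -> nested Map
final var host = (String) server.get("host");
final var port = (Number) server.get("port"); // the numeric type may vary, so Number is safest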
If you need even lower-level parsing, all formats supported by FasterXML Jackson can be read using the streaming API. In case you need that, read about the streaming API in the core module, FasterXML/jackson-core; just make sure you use the right factory class (TomlFactory instead of JsonFactory).
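As a rough sketch of that lower-level route (assuming the jackson-dataformat-toml module is on the classpath and config.toml is the same hypothetical file as above):
import java.io.File;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.dataformat.toml.TomlFactory;

public class TomlStreamDemo {
    public static void main(String[] args) throws Exception {
        // TomlFactory plugs into the same streaming API that JsonFactory uses
        TomlFactory factory = new TomlFactory();
        try (JsonParser parser = factory.createParser(new File("config.toml"))) {
            JsonToken token;
            while ((token = parser.nextToken()) != null) {
                if (token == JsonToken.FIELD_NAME) {
                    System.out.println("key:   " + parser.currentName());
                } else if (token.isScalarValue()) {
                    System.out.println("value: " + parser.getText());
                }
            }
        }
    }
}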
I am new to Java and I am struggling with one program; I don't know how to write it.
I need some code that will read in the tasks and insert the appropriate task-findings markup into the val.xml file.
For example:
A task in val.xml:
<task name="12-19" additionalIntervalInformationNeeded="No">
  Converter (Cleaning)
</task>
The matching task-findings markup in the findings.xml:
<tf taskid="olive-12-19">
  <task-findings val="28">
    <task-finding>
      <title>Left Converter</title>
    </task-finding>
  </task-findings>
</tf>
So the goal is to use the name attribute value from the task element to locate the correct task-findings markup.
Then incorporate the tf element and all of its child elements into the task markup (just inside the ending </task> tag).
The result for the above examples would be as follows:
<task name="12-19" additionalIntervalInformationNeeded="No">
  Converter (Cleaning)
  <tf taskid="olive-12-19">
    <task-findings val="28">
      <task-finding>
        <title>Left Converter</title>
      </task-finding>
    </task-findings>
  </tf>
</task>
Please suggest how I should write this code.
From your use case, it appears that you can write a program to read in the two XML files and then edit and write them out as an output file. XML files can be read and written just like TXT files in Java; you'll just need to use the correct file extension while reading and saving the files. This approach requires you to write your own parser or use regex-based methods.
Another way to go is to use JAXP, the Java API for XML Processing, provided by Oracle. This will help you read, process, and edit XML files via Java.
There are other parser APIs, the DOM parser API and SAX (Simple API for XML), that can be used to read and alter XML files. These come from older Java versions and are useful for small XML files; currently, StAX, the Streaming API for XML, is often used instead.
The tutorial blog linked here will help you get an idea of parsing XML files via Java with the StAX library.
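To give an idea of the DOM route mentioned above, here is a minimal sketch; the file names and the output name are placeholders, and it assumes the taskid in findings.xml always ends with the task name (as in olive-12-19):
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MergeTaskFindings {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document val = builder.parse(new File("val.xml"));
        Document findings = builder.parse(new File("findings.xml"));

        NodeList tasks = val.getElementsByTagName("task");
        NodeList tfs = findings.getElementsByTagName("tf");

        for (int i = 0; i < tasks.getLength(); i++) {
            Element task = (Element) tasks.item(i);
            String name = task.getAttribute("name"); // e.g. "12-19"
            for (int j = 0; j < tfs.getLength(); j++) {
                Element tf = (Element) tfs.item(j);
                // match taskid="olive-12-19" against name="12-19"
                if (tf.getAttribute("taskid").endsWith("-" + name)) {
                    Node imported = val.importNode(tf, true); // deep copy into val.xml
                    task.appendChild(imported);               // lands just before </task>
                }
            }
        }

        // write the merged document to a new file
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(new DOMSource(val), new StreamResult(new File("val-merged.xml")));
    }
}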
I want to use Weka in order to parse an existing JSON file in Java (Eclipse). I believe this can be done using the JSONLoader class. After reading the class's specification (http://weka.sourceforge.net/doc.dev/weka/core/converters/JSONLoader.html#JSONLoader--) I thought that this could easily be done like this:
JSONLoader jsonLoader = new JSONLoader(jsonFile);
Then I thought that just calling jsonLoader.getFileDescription() or jsonLoader.getSource() would give me results. This is not how it's done, though, and I can't find anywhere how to use the JSONLoader class in my Java code. So, in order not to make this question too broad: how can I create a JSONLoader object that reads a source that is in JSON format?
First of all, this has nothing to do with Eclipse, so you should edit your question.
A brief look at the documentation of JSONLoader (in the link you provided) shows that you need to set the data source you want to parse using setSource (the constructor takes no arguments):
JSONLoader jsonLoader = new JSONLoader();
File f = new File("PATH_TO_YOUR_JSON_FILE");
jsonLoader.setSource(f); //you can also use InputStream instead of a File
After doing that you can use other methods that parse your JSON:
Instances dataset = jsonLoader.getDataSet();
jsonLoader.getFileDescription();
...
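Putting it together, a minimal self-contained sketch (the file name data.json is just a placeholder) might look like this:
import java.io.File;

import weka.core.Instances;
import weka.core.converters.JSONLoader;

public class JsonLoaderDemo {
    public static void main(String[] args) throws Exception {
        JSONLoader jsonLoader = new JSONLoader();
        jsonLoader.setSource(new File("data.json")); // or an InputStream
        Instances dataset = jsonLoader.getDataSet(); // parses the JSON into Instances
        System.out.println(jsonLoader.getFileDescription());
        System.out.println("Loaded " + dataset.numInstances() + " instances");
    }
}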
I have a scenario where I need to convert messages present as JSON objects to Apache Parquet format using Java. Any sample code or examples would be helpful. From what I have found, converting the messages to Parquet is done with Hive, Pig, or Spark. I need to convert to Parquet without involving these, using only Java.
To convert JSON data files to Parquet, you need some in-memory representation. Parquet doesn't have its own set of Java objects; instead, it reuses the objects from other formats, like Avro and Thrift. The idea is that Parquet works natively with the objects your applications probably already use.
To convert your JSON, you need to convert the records to Avro in-memory objects and pass those to Parquet, but you don't need to convert a file to Avro and then to Parquet.
Conversion to Avro objects is already done for you (see Kite's JsonUtil) and is ready to use as a file reader. The conversion method needs an Avro schema, but you can use that same library to infer an Avro schema from JSON data.
To write those records, you just need to use AvroParquetWriter. The whole setup looks like this:
// fs is a Hadoop FileSystem; source and outputPath are Hadoop Path objects
Schema jsonSchema = JsonUtil.inferSchema(fs.open(source), "RecordName", 20);

try (JSONFileReader<Record> reader = new JSONFileReader<>(
        fs.open(source), jsonSchema, Record.class)) {
    reader.initialize();

    try (ParquetWriter<Record> writer = AvroParquetWriter
            .<Record>builder(outputPath)
            .withConf(new Configuration())
            .withCompressionCodec(CompressionCodecName.SNAPPY)
            .withSchema(jsonSchema)
            .build()) {
        for (Record record : reader) {
            writer.write(record);
        }
    }
}
I had the same problem, and what I understood is that there are not many samples available for writing Parquet without using Avro or other frameworks. Finally, I went with Avro. :)
Have a look at this; it may help you.
Is there a way to validate values in a YAML file while loading it in the code? The requirement is that some elements in the YAML file must have values. If the validation fails, then the YAML should not be loaded.
I'm using the snakeyaml library and have heard there is a way to do this via Representer.
This is the code I'm currently using to load the YAML:
Reader in = new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8);
Yaml yaml = new Yaml();
yaml.setBeanAccess(BeanAccess.FIELD);
return yaml.loadAs(in, School.class);
Since you can have any value in a YAML file, you should load the file in a function, test the values and raise an error if the values are not what you want. Return the loaded data if they are.
This may have side effects if your YAML has tags that create arbitrary objects, but checking during loading will not prevent that, as such an object might have been created before you get to the value you want to check.
If you do have tags in your YAML and that is a real problem, then you would have to write a safe-loading variant for the YAML file that can handle the tags (by creating normal mapping objects), then check the values and reload with full tag support.
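For example, a minimal sketch along those lines, assuming School is your existing bean and that it has a name field with a getter (the getter name is an assumption), could be:
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.introspector.BeanAccess;

public class SchoolLoader {
    public static School loadValidated(Path file) throws Exception {
        try (Reader in = new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8)) {
            Yaml yaml = new Yaml();
            yaml.setBeanAccess(BeanAccess.FIELD);
            School school = yaml.loadAs(in, School.class);
            // check the values you care about and refuse to return the object if they are missing
            if (school == null || school.getName() == null || school.getName().isEmpty()) {
                throw new IllegalArgumentException("YAML validation failed: 'name' is required");
            }
            return school;
        }
    }
}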