How to use Weka JSONLoader in Java?

I want to use Weka to parse an existing JSON file in Java (Eclipse). I believe this can be done using the JSONLoader class. After I read the class's specification (http://weka.sourceforge.net/doc.dev/weka/core/converters/JSONLoader.html#JSONLoader--) I thought this could easily be done like so:
JSONLoader jsonLoader = new JSONLoader(jsonFile);
Then I thought that just calling jsonLoader.getFileDescription() or jsonLoader.getSource() would give me results. This is not how it's done, though, and I can't find anywhere how to use the JSONLoader class in my Java code. So, in order not to make this question too broad: how can I create a JSONLoader object that reads a source that is in JSON format?

First of all, this has nothing to do with Eclipse, so you should edit your question.
A brief look at the documentation of JSONLoader (in the link you provided) shows that you need to set the data source you want to parse using setSource (the constructor takes no arguments):
JSONLoader jsonLoader = new JSONLoader();
File f = new File("PATH_TO_YOUR_JSON_FILE");
jsonLoader.setSource(f); // you can also use an InputStream instead of a File
After doing that you can use other methods that parse your JSON:
Instances dataset = jsonLoader.getDataSet();
jsonLoader.getFileDescription();
...
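Putting the answer together, a minimal end-to-end sketch might look like this (assuming weka.jar is on the classpath and that data.json is in Weka's own JSON instance format, i.e. the format written by JSONSaver — JSONLoader does not parse arbitrary JSON):

```java
import java.io.File;

import weka.core.Instances;
import weka.core.converters.JSONLoader;

// Use the no-argument constructor, then point the loader at the file
JSONLoader loader = new JSONLoader();
loader.setSource(new File("data.json"));

// Parse the whole file into a Weka dataset
Instances dataset = loader.getDataSet();
System.out.println("Loaded " + dataset.numInstances() + " instances");
```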

Related

Read a TOML file in Java

After some quick research, I found the following three libraries for parsing TOML files in Java:
toml4j
tomlj
jackson-dataformats-text
What I am looking for is a library that can parse TOML files without a corresponding POJO class. While both toml4j and tomlj can achieve that, they do not seem to be maintained.
jackson-dataformats-text, on the other hand, is actively maintained, but I cannot parse a TOML file without the corresponding POJO class.
Is there a way to create a dynamic class in java that I can use to parse any toml file?
If you just need to read a TOML file without a POJO, the FasterXML Jackson libraries are a great choice. The simplest method is to read the content into a java.util.Map:
final var tomlMapper = new TomlMapper();
final var data = tomlMapper.readValue(new File("config.toml"), Map.class);
After that, the content of the file will be available in data.
If you need even lower-level parsing, all formats supported by FasterXML Jackson can be read using the streaming API. In case you need that, read about the streaming API in the core module (FasterXML/jackson-core); just make sure you use the right factory class (TomlFactory instead of JsonFactory).
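As a concrete sketch of the Map approach (the TOML content here is made up for illustration), reading a document with a nested table:

```java
import java.util.Map;

import com.fasterxml.jackson.dataformat.toml.TomlMapper;

// A small TOML document with one top-level key and one table
String toml = "title = \"demo\"\n"
        + "[server]\n"
        + "port = 8080\n";

TomlMapper tomlMapper = new TomlMapper();
Map<?, ?> data = tomlMapper.readValue(toml, Map.class);

// TOML tables become nested maps, so no POJO is needed
Map<?, ?> server = (Map<?, ?>) data.get("server");
System.out.println(data.get("title"));  // demo
System.out.println(server.get("port"));
```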

How to run an analytic in Spark?

I'm new to Spark and still learning it. I have some questions I'd like opinions on.
I have to prepare a jar file for the analytic method, and it should be suitable to run as a Spark job.
Is it necessary for the jar to be executable/runnable?
Can I prepare the jar as a library with a few methods?
In my case, I have an input and an output for the analytic.
Can I pass input JSON and get output JSON in Spark?
What are the steps?
Any help or links to read would be helpful.
Your first question basically asks how to run Spark with the Java API. Here is some code I think you'll find useful:
SparkLauncher launcher = new SparkLauncher()
.setAppName(config.getString("appName"))
.setSparkHome(sparkHomePath)
.setAppResource(pathToYourJar)
.setMaster(masterUrl)
.setMainClass(fullNameOfMainClass);
You might need to add launcher.addJar(...) as well.
Then create an instance of SparkAppHandle.Listener and start the application:
SparkAppHandle handle = launcher.startApplication(sparkJobListener);
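The sparkJobListener passed above is something you write yourself; a minimal sketch (the logging here is purely illustrative) could be:

```java
import org.apache.spark.launcher.SparkAppHandle;

// Minimal listener that just logs state transitions of the launched job
SparkAppHandle.Listener sparkJobListener = new SparkAppHandle.Listener() {
    @Override
    public void stateChanged(SparkAppHandle handle) {
        System.out.println("Spark job state: " + handle.getState());
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {
        // invoked when other info (e.g. the application id) changes
    }
};
```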
"can I pass input json and get output json in the spark?"
If you wish to read a JSON file as the input, you can follow the instructions in this link.
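As a rough sketch of that (the file names are illustrative), the DataFrame API can read and write JSON directly:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .appName("json-analytic")
        .master("local[*]")
        .getOrCreate();

// Reads a file with one JSON object per line into a DataFrame
Dataset<Row> input = spark.read().json("input.json");

// ... apply the analytic to `input` here ...

// Writes the (transformed) data back out as JSON
input.write().json("output");

spark.stop();
```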

S3 Implementation for org.apache.parquet.io.InputFile?

I am trying to write a Scala-based AWS Lambda to read Snappy-compressed Parquet files in S3. The process will write them back out in partitioned JSON files.
I have been trying to use the org.apache.parquet.hadoop.ParquetFileReader class to read the files... the non-deprecated way to do this appears to be to pass it an implementation of the org.apache.parquet.io.InputFile interface. There is one for Hadoop (HadoopInputFile)... but I cannot find one for S3. I also tried some of the deprecated ways for this class, but could not get them to work with S3 either.
Any solution to this dilemma?
Just in case anyone is interested... why am I doing this in Scala? Well... I cannot figure out another way to do it. The Python implementations for Parquet (pyarrow and fastparquet) both seem to struggle with complicated list/struct-based schemas.
Also, I have seen some AvroParquetReader-based code (Read parquet data from AWS s3 bucket) that might be a different solution, but I could not get it to work without a known schema. But maybe I am missing something there.
I'd really like to get the ParquetFileReader class to work, as it seems clean.
Appreciate any ideas.
Hadoop uses its own filesystem abstraction layer, which has an implementation for s3 (https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#S3A).
The setup should look something like the following (Java, but the same should work with Scala):
Configuration conf = new Configuration();
conf.set(Constants.ENDPOINT, "https://s3.eu-central-1.amazonaws.com/");
conf.set(Constants.AWS_CREDENTIALS_PROVIDER,
DefaultAWSCredentialsProviderChain.class.getName());
// maybe additional configuration properties depending on the credential provider
URI uri = URI.create("s3a://bucketname/path");
org.apache.hadoop.fs.Path path = new Path(uri);
ParquetFileReader pfr = ParquetFileReader.open(HadoopInputFile.fromPath(path, conf));
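Once the reader is open, the file's footer gives you the schema without knowing it in advance, which was the sticking point with the AvroParquetReader approach. A self-contained sketch (the bucket and path are placeholders, and the s3a configuration from above still applies):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;
import org.apache.parquet.hadoop.util.HadoopInputFile;
import org.apache.parquet.schema.MessageType;

Configuration conf = new Configuration();
// ... same s3a endpoint/credentials configuration as above ...
Path path = new Path("s3a://bucketname/path");

try (ParquetFileReader pfr =
        ParquetFileReader.open(HadoopInputFile.fromPath(path, conf))) {
    // The footer carries the file-level metadata, including the schema
    ParquetMetadata footer = pfr.getFooter();
    MessageType schema = footer.getFileMetaData().getSchema();
    System.out.println(schema);
}
```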

Converting Java code to ColdFusion

I'm not skilled at Java at all so I could really use your help. I'm trying to read the duration and bit rate from an mp3 file. I'm using a java library called "mp3spi" from http://www.javazoom.net/mp3spi/documents.html.
So far I've been able to determine that these objects exist:
<cfset AudioFormat = createObject("java", "org.tritonus.share.sampled.TAudioFormat")>
<cfset AudioFileFormat = createObject("java", "org.tritonus.share.sampled.file.TAudioFileFormat")>
<cfset AudioFileReader = createObject("java", "javax.sound.sampled.spi.AudioFileReader")>
I'm having trouble with the following code and converting it to ColdFusion:
File file = new File("filename.mp3");
AudioFileFormat baseFileFormat = new MpegAudioFileReader().getAudioFileFormat(file);
Map properties = baseFileFormat.properties();
Long duration = (Long) properties.get("duration");
I've tried several ways of setting the above variables, but I keep getting an error that either MpegAudioFileReader or getAudioFileFormat doesn't exist. However, when I dump the variables I used to create the Java objects, they do exist.
Here is what I have:
<cfscript>
mp3file = FileOpen(ExpandPath("./") & originalfile, "readBinary");
baseFileFormat = AudioFileReader.getAudioFileFormat(mp3file);
properties = baseFileFormat.properties();
duration = properties.get("duration");
</cfscript>
I'm not going to write your code for you, Simone, but here are a coupla general tips.
File file = new File("filename.mp3");
Well, as you probably know, CFML is loosely typed, so you can dispense with the typing on the LHS. Then you need to use the createObject() function to create Java objects, which you already have a handle on. CF can't import Java libraries, so you'll need to give a fully-qualified path to the File class. You also need to explicitly call the constructor:
mp3File = createObject("java", "java.io.File").init("filename.mp3");
(as #Leigh points out below, file is a kinda reserved word in CFML, so best not to use it as a variable name! So I'm using mp3File here)
From there... you should be able to do the work for the other three statements easily enough. Basic method calls and assignments can pretty much be ported straight from the Java source; just lose the static-typing bits as per above, and the type-casting ((Long) etc.).
If you cannot sort everything out from here, update your question with your experimentation, and we can improve this answer (or someone can post a different one). But you need to give us your specific problems, not just a general "write my code please". People won't do that, and you shouldn't be asking people to here (it's against the rules, and people are very big on rules on StackOverflow).
Adam's answer is solid. Since you'll need to invoke the constructor of a Java class in order to create an instance, rather than being limited to using static methods, the init() method must be called. As follows...
mp3file = createObject("java", "java.io.File").init("filename.mp3");
baseFileFormat = createObject("java", "path.to.MpegAudioFileReader").init().getAudioFileFormat(mp3file);
properties = baseFileFormat.properties();
duration = properties.get("duration");
Adam's guidance is right on in that typing your variables when you initialize them won't fly. I don't have a ColdFusion environment set up to try this, but in the past we've used approaches like the one above to expand on ColdFusion's Hibernate integration by creating instances of the Java classes and invoking their methods. So long as the external libs that you're dependent on are in ColdFusion server's class path, you shouldn't have any trouble with this.

Create Download Link But Source is Java String

I want to create a download link, but the part I'm having trouble with is that the source is a Java String. The String I have is JSON data. I want people to be able to download that data.
I am using the Play! framework, so I can pass the String data using a Scala template. But I'm not sure how to let users download the String with an appended file type (.txt, .json) so that they actually download a file.
How do I go about to doing this?
I can't believe how simple the solution is. This is what did it for me: basically, take the string and convert it into an InputStream.
String data = "someBigOrSmallData";
InputStream dataStream = new ByteArrayInputStream(data.getBytes());
response().setHeader("Content-Disposition", "attachment; filename=anyFileName.txt");
return ok(dataStream);
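The String-to-stream step on its own is plain Java SE and easy to verify: reading the stream back yields exactly the original text (the JSON here is a made-up example):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

String data = "{\"name\": \"example\"}";

// Wrap the String's bytes in a stream; Play's ok(InputStream) can serve this
InputStream dataStream = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));

// Reading the stream back gives the original text unchanged
String roundTrip = new String(dataStream.readAllBytes(), StandardCharsets.UTF_8);
System.out.println(roundTrip); // {"name": "example"}
```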
