can't save clustering result to an arff file - java

I'm using Java code to save a clustering result to an ARFF file.
I've followed the instructions on this site:
http://weka.wikispaces.com/Visualizing+cluster+assignments
but I get an error in the line:
PlotData2D predData = ClustererPanel.setUpVisualizableInstances(train, eval);
saying that:
The method setUpVisualizableInstances(Instances, ClusterEvaluation) is undefined for the type ClustererPanel
I've tried to Google it, but I couldn't find a solution.

Judging from the current code:
http://grepcode.com/file/repo1.maven.org/maven2/nz.ac.waikato.cms.weka/weka-dev/3.7.12/weka/gui/explorer/ClustererPanel.java#ClustererPanel
I assume you have to call setInstances instead of setUpVisualizableInstances now.
But: Why do you use a visualization tutorial?
Visualization won't produce an .arff file.
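If the goal is an .arff file that contains the cluster assignments, a more direct route is Weka's AddCluster filter followed by ArffSaver. A minimal sketch, assuming train is your Instances and using SimpleKMeans as a stand-in clusterer:
// uses weka.filters.unsupervised.attribute.AddCluster, weka.filters.Filter,
// weka.core.converters.ArffSaver and weka.clusterers.SimpleKMeans
AddCluster addCluster = new AddCluster();
addCluster.setClusterer(new SimpleKMeans());                // the filter trains the clusterer on the data
addCluster.setInputFormat(train);
Instances clustered = Filter.useFilter(train, addCluster);  // adds a "cluster" attribute to each instance

ArffSaver saver = new ArffSaver();
saver.setInstances(clustered);
saver.setFile(new java.io.File("clustered.arff"));
saver.writeBatch();                                         // writes the assignments to clustered.arff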

Related

How to run an analytic in Spark?

I'm new to Spark and still learning it. I have a few questions I'd like opinions on.
I have to prepare a jar file for the analytic method, suitable to run as a Spark job.
Is it necessary for the jar to be executable/runnable?
Can I prepare the jar as a library with a few methods?
In my case, I have the input and output of the analytic.
Can I pass input JSON and get output JSON in Spark?
What are the steps?
Any help or links to read would be appreciated.
Your first question basically asks how to run Spark with the Java API. Here is some code I think you'll find useful:
SparkLauncher launcher = new SparkLauncher()
.setAppName(config.getString("appName"))
.setSparkHome(sparkHomePath)
.setAppResource(pathToYourJar)
.setMaster(masterUrl)
.setMainClass(fullNameOfMainClass);
You might need to add launcher.addJar(...).
Then create an instance of SparkAppHandle.Listener and pass it to startApplication:
SparkAppHandle handle = launcher.startApplication(sparkJobListener);
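A minimal sketch of that listener (sparkJobListener is just the hypothetical name used in the line above):
SparkAppHandle.Listener sparkJobListener = new SparkAppHandle.Listener() {
    @Override
    public void stateChanged(SparkAppHandle handle) {
        // called on every state transition (SUBMITTED, RUNNING, FINISHED, FAILED, ...)
        System.out.println("Spark job state: " + handle.getState());
    }
    @Override
    public void infoChanged(SparkAppHandle handle) {
        // called when other application info changes, e.g. the application id
    }
};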
"can I pass input json and get output json in the spark?"
If you wish to read a JSON as the input you can follow the instructions in this link
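For the JSON part, a minimal sketch using Spark SQL, assuming a SparkSession named spark and hypothetical input/output paths:
// read newline-delimited JSON into a Dataset, run your analytic, write JSON back out
Dataset<Row> input = spark.read().json("/path/to/input.json");   // hypothetical input path
Dataset<Row> result = input;                                     // apply your analytic transformations here
result.write().json("/path/to/output");                          // writes a directory of JSON part files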

How to update the content of a file in Google Drive?

I am trying to update the content of a Google Doc file with the content of another Google Doc file. The reason I don't use the copy method of the API is because that creates another file with another ID. My goal is to keep the current ID of the file. This is a code snippet which unfortunately does nothing:
com.google.api.services.drive.Drive.Files.Get getDraft = service.files().get(draftID);
File draft = driveManager.getFileBackoffExponential(getDraft);
com.google.api.services.drive.Drive.Files.Update updatePublished = service.files().update(publishedID, draft);
driveManager.updateFileBackoffExponential(updatePublished);
The two backoffExponential functions just call execute on the request object.
Googling around, I found out that the update method offers another overload:
public Update update(java.lang.String fileId, com.google.api.services.drive.model.File content, com.google.api.client.http.AbstractInputStreamContent mediaContent)
Thing is, I have no idea how to retrieve the mediaContent of a Google file such as a Google Doc.
The last resort could be a Google Apps Script but I'd rather avoid that since it's awfully slow and unreliable.
Thank you.
EDIT: I am using Drive API v3.
Try the Google Drive REST update.
Updates a file's metadata and/or content with patch semantics.
This method supports an /upload URI and accepts uploaded media with
the following characteristics:
Maximum file size: 5120GB. Accepted Media MIME types: */*
To download a Google file in a usable format, you need to specify the export MIME type. Since you're working with Google Docs, you can try application/vnd.openxmlformats-officedocument.wordprocessingml.document (for a Spreadsheet it would be application/vnd.openxmlformats-officedocument.spreadsheetml.sheet). See the Download files documentation for more info.
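A minimal sketch of that flow with the Drive v3 Java client, assuming draftID and publishedID as in the question and that both files are Google Docs (the export MIME type is an assumption):
// uses com.google.api.client.http.ByteArrayContent and com.google.api.services.drive.model.File
String docxMime = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// export the draft Doc's content as .docx bytes
ByteArrayOutputStream out = new ByteArrayOutputStream();
service.files().export(draftID, docxMime).executeMediaAndDownloadTo(out);

// upload those bytes as the new content of the published file; its ID is preserved
ByteArrayContent media = new ByteArrayContent(docxMime, out.toByteArray());
service.files().update(publishedID, new File(), media).execute();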

Reading JSON file with BigQuery to make table

I'm new to Google Dataflow and can't get this thing to work with JSON. I've been reading through the documentation, but I can't solve my problem.
So, following the WordCount example, I figured out how data is loaded from a .csv file with the following line:
PCollection<String> input = p.apply(TextIO.Read.from(options.getInputFile()));
where inputFile is a .csv file in my Cloud Storage bucket. I can transform the lines read from the .csv with:
PCollection<TableRow> table = input.apply(ParDo.of(new ExtractParametersFn()));
(ExtractParametersFn is defined by me). So far so good!
But then I realized my .csv file is too big and had to convert it to JSON (https://cloud.google.com/bigquery/preparing-data-for-bigquery).
Since BigQueryIO is supposedly better for reading JSON, I tried with the following code:
PCollection<TableRow> table = p.apply(BigQueryIO.Read.from(options.getInputFile()));
(inputFile is then a JSON file, and reading with BigQueryIO should give a PCollection of TableRows.) I tried TextIO too (which returns a PCollection of Strings), and neither of the two IO options works.
What am I missing? The documentation is really not detailed enough to find an answer there, but perhaps some of you have already dealt with this problem before?
Any suggestions would be very appreciated. :)
I believe there are two options to consider:
Use TextIO with TableRowJsonCoder to ingest the JSON files (e.g., like it is done in the TopWikipediaSessions example; see the sketch after this list);
Import the JSON files into a bigquery table (https://cloud.google.com/bigquery/loading-data-into-bigquery), and then use BigQueryIO.Read to read from the table.
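A minimal sketch of both options, in the Dataflow 1.x SDK style used in the question (the table reference in the second option is hypothetical):
// Option 1: TextIO + TableRowJsonCoder parses each line of newline-delimited JSON into a TableRow
PCollection<TableRow> fromFile = p.apply(
    TextIO.Read
        .from(options.getInputFile())
        .withCoder(TableRowJsonCoder.of()));

// Option 2: BigQueryIO reads from a table reference, not from a file
PCollection<TableRow> fromTable = p.apply(
    BigQueryIO.Read.from("my-project:my_dataset.my_table"));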

Using the created document through FPDF with PHP/Java

I created a PDF document with PHP using FPDF. The next thing I want to do is silently print the document without downloading the PDF file to the computer.
I've made the following code:
$pdfprintable = $pdf->Output(''.'.pdf','S');
$printcmd = "java -classpath jPDFPrint.jar;pdfprintcli.jar cli.PDFPrintCLI $pdfprintable";
exec($printcmd);
And it returns the following error message:
Warning: exec(): NULL byte detected. Possible attack in C:\Users\Jordy\Desktop\XAMPP\htdocs\php\stickers\pdf.php on line 392
If I echo the $pdfprintable in PHP it shows a lot of weird characters.
Are you sure the Java command is supposed to be passed the raw PDF contents as a string?
Use the 'F' option to write the PDF to a file instead:
$pdfprintable = $pdf->Output('USEAFULLPATHTOFILE.pdf','F');
With the above, the PDF is written to a file, and you can then try to print it with the Java application, if that application works.
Also, if you are loading the PDF correctly in FPDF, you should be able to use the 'D' option in ->Output:
$pdfprintable = $pdf->Output('USEAFULLPATHTOFILE.pdf','D');
Use this to verify that the PDF is generated and handled correctly by FPDF.
Also note that your example code is very limited.
If you need more troubleshooting, please show the Java code and the full PHP source relevant to the printing operation and to loading or creating the PDF in FPDF.

Reading N-Quads in Jena

I'm trying to read an N-Quads file with Jena, but all I get is an empty model. The file I'm trying to read is taken from the example in the N-Quads documentation:
<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> <http://example.org/graphs/spiderman> .
(I saved it as a file named file.nq).
I'm loading the model using RDFDataMgr, but it didn't work with Model.read either:
RDFDataMgr.loadModel("file.nq", Lang.NQUADS)
yields an empty model.
What am I missing? Doesn't Jena support N-Quads out-of-the-box?
Yes, Jena supports N-Quads. Try loadDataset.
N-Quads is for multiple graphs, and you have read it into a single graph (a Model). What you get is just the default graph's triples, which in this case is none.
There is a warning emitted:
WARN riot :: Only triples or default graph data expected : named graph data ignored
If you didn't get that warning, then (1) you are running an old copy, (2) you have turned logging off, or (3) the file is empty.
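A minimal sketch of the Dataset route (the graph name matches the example quad above):
// load the whole file as a Dataset instead of a single Model
Dataset dataset = RDFDataMgr.loadDataset("file.nq", Lang.NQUADS);

// the quad's triple lives in a named graph, not in the default graph
Model spiderman = dataset.getNamedModel("http://example.org/graphs/spiderman");
spiderman.write(System.out, "N-TRIPLES");   // prints the enemyOf triple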
