Add comment to an ARFF file

This is my first question on this forum.
I'm building a data-mining application in Java with the WEKA API.
I first run a pre-processing stage, and when I save the ARFF file I would like to add a couple of lines (as comments) specifying the preprocessing tasks that I have applied to the file.
The problem is that I don't know how to add comments to an ARFF file from the Java WEKA API.
To save the file I use the class ArffSaver like this:
try {
ArffSaver saver = new ArffSaver();
saver.setInstances(dataPost);
saver.setFile(arffFile);
saver.writeBatch();
return true;
} catch (IOException ex) {
Logger.getLogger(Preprocesamiento.class.getName()).log(Level.SEVERE, null, ex);
return false;
}
I would be really grateful if someone could give me some ideas.
thanks!

You should AVOID writing comments to an .arff file, even more so when writing it from Java. These files are very "parser-sensitive". The Weka API for creating these files is restrictive for this particular reason.
Even so, you can always add your comments manually with the % symbol. That said, I wouldn't recommend writing anything more than instances, attributes and values into an .arff file. ;-)

I don't see a reason to not write comments into the header of an ARFF file. The specification clearly says:
Lines that begin with a % are comments.
So while it is technically valid, it can be difficult if you want to use the ArffSaver#setFile method. This method does a lot of (convenient, but somewhat arbitrary and unspecified) work internally, until it finally calls
setDestination(new FileOutputStream(m_outputFile));
If this is not required, the easiest option is to write directly to an OutputStream, which then can simply be set as the destination for the ArffSaver. This can be wrapped in a small helper method, for example, like this:
static void writeArff(
Instances instances,
List<String> commentLines,
OutputStream outputStream) throws IOException
{
ArffSaver saver = new ArffSaver();
saver.setInstances(instances);
if (commentLines != null && !commentLines.isEmpty())
{
BufferedWriter bw = new BufferedWriter(
new OutputStreamWriter(outputStream));
for (String commentLine : commentLines)
{
bw.write("% " + commentLine + "\n");
}
bw.write("\n");
bw.flush();
}
saver.setDestination(outputStream);
saver.writeBatch();
}
When calling it like this
List<String> comments = Arrays.asList("A comment", "Another one");
writeArff(instances, comments, outputStream);
then the given comments will be inserted at the top of the ARFF file.
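One case the helper above does not cover is a comment string that itself contains a line break: the continuation line would end up without a leading % and would no longer be a comment. The following is a minimal sketch of a header writer that splits such strings first. The names ArffCommentHeader and writeCommentHeader are made up for illustration, UTF-8 is an assumed encoding, and the Weka step is left out so that the sketch runs with the JDK alone. Afterwards the caller would pass the same stream to saver.setDestination(...) as in the helper above.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class ArffCommentHeader {

    // Writes each comment as one or more "% ..." lines, followed by a blank line.
    // The stream is flushed but left open so the caller can keep writing to it.
    static void writeCommentHeader(List<String> comments, OutputStream out)
            throws IOException {
        BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8));
        for (String comment : comments) {
            // Split on any line break so that no continuation line lacks the "%".
            for (String line : comment.split("\\R")) {
                writer.write("% " + line + "\n");
            }
        }
        writer.write("\n");
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        // Demo: an in-memory stream stands in for the real file stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeCommentHeader(Arrays.asList(
                "Normalized all numeric attributes",
                "Removed attribute 'id'\nReplaced missing values"), out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```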

Related

Reading and writing files using Java 7 nio

I have files which consist of json elements in an array.
(There are several files; each file has a JSON array of elements.)
I have a process that knows to take each json element as a line from file and process it.
So I created a small program that reads the JSON array, and then writes the elements to another file.
The output of this utility will be the input of the other process.
I used Java 7 NIO (and gson).
I tried to use as much Java 7 NIO as possible.
Is there any improvement I can make?
What about the filter? Which approach is better?
Thanks,
public class TransformJsonsUsers {
public TransformJsonsUsers() {
}
public static void main(String[] args) throws IOException {
final Gson gson = new Gson();
Path path = Paths.get("C:\\work\\data\\resources\\files");
final Path outputDirectory = Paths
.get("C:\\work\\data\\resources\\files\\output");
DirectoryStream.Filter<Path> filter = new DirectoryStream.Filter<Path>() {
@Override
public boolean accept(Path entry) throws IOException {
// which is better?
// BasicFileAttributeView attView = Files.getFileAttributeView(entry, BasicFileAttributeView.class);
// return attView.readAttributes().isRegularFile();
return !Files.isDirectory(entry);
}
};
DirectoryStream<Path> directoryStream = Files.newDirectoryStream(path, filter);
directoryStream.forEach(new Consumer<Path>() {
@Override
public void accept(Path filePath) {
String fileOutput = outputDirectory.toString() + File.separator + filePath.getFileName();
Path fileOutputPath = Paths.get(fileOutput);
try {
BufferedReader br = Files.newBufferedReader(filePath);
User[] users = gson.fromJson(br, User[].class);
BufferedWriter writer = Files.newBufferedWriter(fileOutputPath, Charset.defaultCharset());
for (User user : users) {
writer.append(gson.toJson(user));
writer.newLine();
}
writer.flush();
} catch (IOException e) {
throw new RuntimeException(filePath.toString(), e);
}
}
});
}
}

There is no point in using a Filter if you want to read all the files in the directory. A Filter is primarily designed to apply some criteria and read a subset of the files. The two variants are unlikely to differ noticeably in overall performance.
If you are looking to improve performance, you can try a couple of different approaches.
Multi-threading
Depending on how many files exist in the directory and how powerful your CPU is, you can use multiple threads to process more than one file at a time.
Queuing
Right now you read each file and write its output synchronously. You could put the content of each file on a Queue and have an asynchronous writer consume it.
You can combine both of these approaches as well to improve performance further.
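The multi-threading suggestion could look roughly like the sketch below, which submits one task per file to a fixed-size thread pool and waits for all of them. The class and method names are hypothetical, and processFile is only a placeholder that copies the file; the real version would parse the JSON array and write one element per line.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFileProcessor {

    // Placeholder for the real per-file work. Here it simply copies the
    // input file into the output directory under the same file name.
    static void processFile(Path input, Path outputDir) throws IOException {
        Files.copy(input, outputDir.resolve(input.getFileName()));
    }

    // Processes every input file on a fixed-size pool and waits for all of them.
    static void processAll(List<Path> inputs, Path outputDir, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path input : inputs) {
                pending.add(pool.submit(() -> {
                    try {
                        processFile(input, outputDir);
                    } catch (IOException e) {
                        throw new UncheckedIOException(input.toString(), e);
                    }
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // rethrows any failure from the worker thread
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this helps depends on whether the work is limited by the disk or by the CPU, so it is worth measuring before and after.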

Don't put the I/O into the filter. That's not what it's for. You should get the complete list of files and then process it. For example if the I/O creates another file in the directory, the behaviour is undefined. You might miss a file, or see the new file in the accept() method.
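A minimal sketch of "get the complete list first, then process it" is shown below. It takes a snapshot of the regular files in the directory and closes the directory stream before any output is written. RegularFileLister and listRegularFiles are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class RegularFileLister {

    // Returns the regular files directly inside dir, as a snapshot taken
    // before any processing starts.
    static List<Path> listRegularFiles(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        // try-with-resources closes the directory handle even if iteration fails.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        }
        return files;
    }
}
```

Because the list is complete before processing begins, files created later in the same directory (such as an output subdirectory's contents) cannot affect the iteration.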

Saving to "ExternalStorage" - Processing library

Stackoverflowers,
I am doing a simple project using Android smartphones to create 3D forms. I am using Android Processing to make a simple App.
My code makes a 3D shape and saves it as an .STL file. It works on my laptop and saves the .STL file, but in the App. version, I need it to save to the External storage/SD Card of my phone (HTC Sensation). It does not, because of the way the “save” function (writeSTL) in the Processing library I am using has been written.
I have posted for help here (my more complete code is there too):
http://forum.processing.org/two/discussion/4809/exporting-geometry-stl-obj-dfx-modelbuilder-and-android
...and Marius Watz who wrote the library says that the writeSTL() code is pretty much standalone and the only thing missing is (or should be) replacing the code creating the output stream, which needs to be modified to work with Android. Basically, this line:
FileOutputStream out=(FileOutputStream)UIO.getOutputStream(p.sketchPath(filename));
I am not a great programmer in that I can usually get Processing to do what I need to do but no more; this problem has me beaten. I am looking for ideas for the correct code to replace the line:...
FileOutputStream out=(FileOutputStream)UIO.getOutputStream(p.sketchPath(filename));
...with something “Android-friendly”. Calling getExternalStorageDirectory() should work but I am at a loss to find the correct structure.
The code for the writeSTL function is below.
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
/**
* Output binary STL file of mesh geometry.
* @param p Reference to PApplet instance
* @param filename Name of file to save to
*/
public void customWriteSTL(UGeometry geo, PApplet p, String filename) {
byte [] header;
ByteBuffer buf;
UFace f;
try {
if (!filename.toLowerCase().endsWith("stl")) filename+=".stl";
FileOutputStream out=(FileOutputStream)UIO.getOutputStream(p.sketchPath(filename));
buf = ByteBuffer.allocate(200);
header=new byte[80];
buf.get(header, 0, 80);
out.write(header);
buf.rewind();
buf.order(ByteOrder.LITTLE_ENDIAN);
buf.putInt(geo.faceNum);
buf.rewind();
buf.get(header, 0, 4);
out.write(header, 0, 4);
buf.rewind();
UUtil.logDivider("Writing STL '"+filename+"' "+geo.faceNum);
buf.clear();
header=new byte[50];
if (geo.bb!=null) UUtil.log(geo.bb.toString());
for (int i=0; i<geo.faceNum; i++) {
f=geo.face[i];
if (f.n==null) f.calcNormal();
buf.rewind();
buf.putFloat(f.n.x);
buf.putFloat(f.n.y);
buf.putFloat(f.n.z);
for (int j=0; j<3; j++) {
buf.putFloat(f.v[j].x);
buf.putFloat(f.v[j].y);
buf.putFloat(f.v[j].z);
}
buf.rewind();
buf.get(header);
out.write(header);
}
out.flush();
out.close();
UUtil.log("Closing '"+filename+"'. "+geo.faceNum+" triangles written.\n");
}
catch (Exception e) {
e.printStackTrace();
}
}
Any suggestions are gratefully received.
Thank you in advance.

There are a few ways of doing this: some that will just work and some that are proper, as with all things Processing/Java. It's really not that different from regular Java though. The only quirk is the root SD path, and checking whether it exists or not (note that some phones have "internal" rather than "external" storage, i.e. not removable/swappable, but Android should interpret these the same AFAIK).
In classic Java fashion, you should really be checking IF the SD card is present beforehand. I use the following structure, taken from this answer by @kaolick:
String state = Environment.getExternalStorageState();
if (state.equals(Environment.MEDIA_MOUNTED)) {
// Storage is available and writeable - ALL GOOD
} else if (state.equals(Environment.MEDIA_MOUNTED_READ_ONLY)) {
// Storage is only readable - RUH ROH
} else {
// Storage is neither readable nor writeable - ABORT
}
Note that he provides a full class for you to use, which is great, and has a few convenience functions.
The second thing you might want to look at is creating a custom directory on the SD Card of the device, probably in setup() - something like this:
try{
String dirName = "//sdcard//MyAppName";
File newFile = new File(dirName);
if(newFile.exists() && newFile.isDirectory()) {
println("Directory Exists... All Good");
}
else {
println("Directory Doesn't Exist... We're Making It");
newFile.mkdirs();
}
}
catch(Exception e) {
e.printStackTrace();
}
Of course, instead of hard-coding the path name, you should do something like
String dirName = Environment.getExternalStorageDirectory().getAbsolutePath() + "/MyAppName";
instead.
Also, note that the above try/catch should go INSIDE the "if (state.equals(Environment.MEDIA_MOUNTED))" branch, or should be wrapped in a separate function and called from there.
Then, finally, saving it. If you wanted to use a BufferedWriter, it would look like this:
BufferedWriter writer = new BufferedWriter(new FileWriter(dirName, true));
writer.write(STL_STUFF);
writer.flush();
writer.close();
I've only used a FileOutputStream within a BufferedOutputStream, and it looked like this:
try {
String fileName = "SOME_UNIQUE_NAME_PER_FILE";
String localFile = dirName + "/" + fileName;
OutputStream output = new BufferedOutputStream(new FileOutputStream(localFile));
}
catch(Exception e) {
e.printStackTrace();
}
Finally, give my regards to Marius if you talk to him! ;-)
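The directory check and the buffered stream from this answer could be combined into one helper, sketched below. To keep it runnable on a plain JVM it takes the base directory as a parameter; on Android that argument would be Environment.getExternalStorageDirectory(), after the MEDIA_MOUNTED check shown earlier. StorageOutput and openOutput are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class StorageOutput {

    // Opens a buffered stream to baseDir/appDirName/fileName, creating the
    // directory first if it does not exist yet.
    static OutputStream openOutput(File baseDir, String appDirName, String fileName)
            throws IOException {
        File dir = new File(baseDir, appDirName);
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Could not create directory " + dir);
        }
        return new BufferedOutputStream(new FileOutputStream(new File(dir, fileName)));
    }
}
```

In customWriteSTL the variable out would then have to be declared as an OutputStream, since a BufferedOutputStream cannot be cast to FileOutputStream. The app will typically also need permission to write to external storage.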

Java: CSV File Easy Read/Write

I'm working on a program that requires quick access to a CSV comma-delimited spreadsheet file.
So far I've been able to read from it easily using a BufferedReader.
However, now I want to be able to edit the data it reads, then export it BACK to the CSV.
The spreadsheet contains names, phone numbers, email addresses, etc. And the program lists everyone's data, and when you click on them it brings up a page with more detailed information, also pulled from the CSV. On that page you can edit the data, and I want to be able to click a "Save Changes" button, then export the data back to its appropriate line in the CSV--or delete the old one, and append the new.
I'm not very familiar with using a BufferedWriter, or whatever it is I should be using.
What I started to do is create a custom class called FileIO. It contains both a BufferedReader and a BufferedWriter. So far it has a method that returns bufferedReader.readLine(), called read(). Now I want a function called write(String line).
public static class FileIO {
BufferedReader read;
BufferedWriter write;
public FileIO (String file) throws MalformedURLException, IOException {
read = new BufferedReader(new InputStreamReader (getUrl(file).openStream()));
write = new BufferedWriter (new FileWriter (file));
}
public static URL getUrl (String file) throws IOException {
return //new URL (fileServer + file).openStream()));
FileIO.class.getResource(file);
}
public String read () throws IOException {
return read.readLine();
}
public void write (String line) {
String [] data = line.split("\\|");
String firstName = data[0];
// int lineNum = findLineThatStartsWith(firstName);
// write.writeLine(lineNum, line);
}
};
I'm hoping somebody has an idea as to how I can do this?

Rather than reinventing the wheel you could have a look at OpenCSV which supports reading and writing of CSV files. Here are examples of reading & writing

Please consider Apache commons csv.
To understand the API quickly, there are four important classes:
CSVFormat
Specifies the format of a CSV file and parses input.
CSVParser
Parses CSV files according to the specified format.
CSVPrinter
Prints values in a CSV format.
CSVRecord
A CSV record parsed from a CSV file.

The spreadsheet contains names, phone numbers, email addresses, etc. And the program lists everyone's data, and when you click on them it brings up a page with more detailed information, also pulled from the CSV. On that page you can edit the data, and I want to be able to click a "Save Changes" button, then export the data back to its appropriate line in the CSV--or delete the old one, and append the new.
The content of a file is a sequence of bytes. CSV is a text-based file format, i.e. the sequence of bytes is interpreted as a sequence of characters, where lines are separated by special newline characters.
Consequently, if the length of a line increases, the characters of all following lines need to be moved to make room for the new characters. Likewise, to delete a line you must move the later characters to fill the gap. That is, you cannot update a line in a CSV (at least not when changing its length) without rewriting all following lines in the file. For simplicity, I'd rewrite the entire file.
Since you already have code to write and read the CSV file, adapting it should be straightforward. But before you do that, it might be worth asking yourself if you're using the right tool for the job. If the goal is to keep a list of records, and edit individual records in a form, programs such as Microsoft Access or whatever the Open Office equivalent is called might be a more natural fit. If your UI needs go beyond what these programs provide, using a relational database to keep your data is probably a better fit (more efficient and flexible than a CSV).
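As a rough sketch of the "rewrite the entire file" approach: read all lines, replace the one record (or append it), and write everything back. The names CsvLineUpdater and upsertLine are hypothetical, and the code assumes one record per line, UTF-8 text, and a first field that acts as a unique key and contains no commas or quotes. Real CSV with quoted fields needs a proper parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class CsvLineUpdater {

    // Replaces the first record whose first field equals key, or appends
    // newLine if no such record exists.
    static void upsertLine(Path csv, String key, String newLine) throws IOException {
        List<String> lines = Files.readAllLines(csv, StandardCharsets.UTF_8);
        List<String> updated = new ArrayList<>();
        boolean replaced = false;
        for (String line : lines) {
            if (!replaced && (line.equals(key) || line.startsWith(key + ","))) {
                updated.add(newLine);
                replaced = true;
            } else {
                updated.add(line);
            }
        }
        if (!replaced) {
            updated.add(newLine);
        }
        // Write the whole file to a temporary sibling first, so the original
        // is only replaced once the new content is fully written.
        Path tmp = Files.createTempFile(csv.toAbsolutePath().getParent(), "csv", ".tmp");
        Files.write(tmp, updated, StandardCharsets.UTF_8);
        Files.move(tmp, csv, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A "Save Changes" handler could then call something like upsertLine(csvPath, firstName, editedLine), where those three variables come from the form.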

Add Dependencies
implementation 'com.opencsv:opencsv:4.6'
Add Below Code in onCreate()
InputStreamReader is = null;
try {
String path= "storage/emulated/0/Android/media/in.bioenabletech.imageProcessing/MLkit/countries_image_crop.csv";
CSVReader reader = new CSVReader(new FileReader(path));
String[] nextLine;
int lineNumber = 0;
while ((nextLine = reader.readNext()) != null) {
lineNumber++;
// Log the column you need. Indexes are zero-based: nextLine[0] is the first column, nextLine[2] the third.
Log.e(TAG, "onCreate: "+nextLine[2] );
}
}
catch (Exception e)
{
Log.e(TAG, "onCreate: "+e );
}

I solved it using
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-csv</artifactId>
<version>2.8.6</version>
</dependency>
and
private static final CsvMapper mapper = new CsvMapper();
public static <T> List<T> readCsvFile(MultipartFile file, Class<T> clazz) throws IOException {
InputStream inputStream = file.getInputStream();
CsvSchema schema = mapper.schemaFor(clazz).withHeader().withColumnReordering(true);
ObjectReader reader = mapper.readerFor(clazz).with(schema);
return reader.<T>readValues(inputStream).readAll();
}

Update only the new content in a file every five minutes

I have a file personHashMap.ser with a HashMap in it. Here's the code I use to create it:
String file_path = ("//releasearea/ToolReleaseArea/user/personHashMap.ser");
public void createFile(Map<String, String> newContent) {
try{
File file = new File(file_path);
FileOutputStream fos=new FileOutputStream(file);
ObjectOutputStream oos=new ObjectOutputStream(fos);
oos.writeObject(newContent);
oos.flush();
oos.close();
fos.close();
}catch (Exception e){
System.err.println("Error in FileWrite: " + e.getMessage());
}
}
Now, while the program is running, I want to update the file personHashMap.ser every five minutes, writing only the content that has changed. This is the method I call:
public void updateFile(Map<String, String> newContent) {
Map<String, String> oldLdapContent = readFile();
if(!oldLdapContent.equals(ldapContent)){ // they arent the same,
// so i must update the file
}
}
But I have no idea how to implement that.
And is it better for performance to update only the new content, or should I clear the whole file and insert the new list again?
Hope you can help me.
EDIT:
The HashMap includes, e.g., street=Example Street.
But now the street is called New Example Street, so I must update the HashMap in the file. That means I can't just append the new content.

Firstly HashMap isn't really an appropriate choice. It's designed for in-memory usage, not serialization (though of course it can be serialized in the standard way). But if it's just 2kb, then go ahead and write the whole thing rather than the updated data.
Second, you seem to be overly worried about performance of this rather trivial method (for 2kb the write will take mere milliseconds). I would be worried more about consistency and concurrency issues. I suggest you look into using a lightweight database such as JavaDB or h2.
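For a map this small, "write the whole thing" might look like the sketch below: compare the stored map with the new one and rewrite the complete file only when they differ. MapFileStore, read and updateIfChanged are hypothetical names. Writing to a temporary file and then moving it into place is an assumption added here to address the consistency concern; it is not something the question's code does.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

class MapFileStore {

    // Reads the serialized map, or returns an empty map if the file does not exist yet.
    @SuppressWarnings("unchecked")
    static Map<String, String> read(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new HashMap<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Map<String, String>) in.readObject();
        }
    }

    // Rewrites the whole file only when the content differs; returns true if it wrote.
    static boolean updateIfChanged(Path file, Map<String, String> newContent)
            throws IOException, ClassNotFoundException {
        if (read(file).equals(newContent)) {
            return false;
        }
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "map", ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(new HashMap<>(newContent));
        }
        // Replace the old file only after the new one is completely written.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```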

Use the constructor FileOutputStream(File file, boolean append), set the boolean append to true. It will append the text in the existing file.

You can call the updateFile method in a loop and then call sleep for 5 minutes (5*60*1000 ms).
Thread.sleep(300000); // sleep for 5 minutes
To append to your already existing file you can use :
FileOutputStream fooStream = new FileOutputStream(file, true);

How do I verify string content using Mockito in Java

I am new to the Mockito test framework. I need to unit test a method that returns string content. The same content is also stored in a .js file (e.g. "8.js").
How do I verify that the string content returned from the method is what I expect?
Please find below the code for generating the .js file:
public String generateJavaScriptContents(Project project)
{
try
{
// Creating projectId.js file
FileUtils.mkdir(outputDir);
fileOutputStream = new FileOutputStream(outputDir + project.getId() + ".js");
streamWriter = new OutputStreamWriter(fileOutputStream, "UTF-8");
StringTemplateGroup templateGroup =
new StringTemplateGroup("viTemplates", "/var/vi-xml/template/", DefaultTemplateLexer.class);
stringTemplate = templateGroup.getInstanceOf("StandardJSTemplate");
stringTemplate.setAttribute("projectIdVal", project.getId());
stringTemplate.setAttribute("widthVal", project.getDimension().getWidth());
stringTemplate.setAttribute("heightVal", project.getDimension().getHeight());
stringTemplate.setAttribute("playerVersionVal", project.getPlayerType().getId());
stringTemplate.setAttribute("finalTagPath", finalPathBuilder.toString());
streamWriter.append(stringTemplate.toString());
return stringTemplate.toString();
}
catch (Exception e)
{
logger.error("Exception occurred while generating Standard Tag Type Content", e);
return "";
}
}
The above method writes the .js file, and the contents of that file look something like this:
var projectid = 8; var playerwidth = 300; var playerheight = 250; var player_version = 1; .....
I have written testMethod() using Mockito to test this. The test method writes the .js file successfully, but how do I verify its contents?
Can anyone help me to sort out this problem?

As @ŁukaszBachman mentions, you can read the contents from the js file. There are a couple of things to consider when using this approach:
The test will be slow, as you will have to wait for the js content to be written to the disk, read the content back from the disk and assert the content.
The test could theoretically be flaky because the entire js content may not be written to the disk by the time the code reads from the file. (On that note, you should probably consider calling flush() and close() on your OutputStreamWriter, if you aren't already.)
Another approach is to mock your OutputStreamWriter and inject it into the method. This would allow you to write test code similar to the following:
OutputStreamWriter mockStreamWriter = mock(OutputStreamWriter.class);
generateJavaScriptContents(mockStreamWriter, project);
verify(mockStreamWriter).append("var projectid = 8;\nvar playerwidth = 300;...");
http://mockito.googlecode.com/svn/branches/1.5/javadoc/org/mockito/Mockito.html#verify%28T%29
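Once the writer is injected, a mock is not the only option: because the parameter can be declared as a plain java.io.Writer, a test can also pass a StringWriter and compare the captured text directly. The sketch below illustrates that variation without Mockito. JsContentGenerator and generate are hypothetical, heavily simplified stand-ins for the method in the question, and the template is replaced by string concatenation.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

class JsContentGenerator {

    // Hypothetical, simplified stand-in for generateJavaScriptContents: the
    // destination is passed in as a Writer rather than opened inside the method.
    static String generate(Writer destination, int projectId, int width, int height)
            throws IOException {
        String content = "var projectid = " + projectId + ";\n"
                + "var playerwidth = " + width + ";\n"
                + "var playerheight = " + height + ";\n";
        destination.append(content);
        destination.flush();
        return content;
    }

    public static void main(String[] args) throws IOException {
        // In production the Writer would wrap a FileOutputStream; in a test a
        // StringWriter captures the same output in memory.
        StringWriter captured = new StringWriter();
        String returned = generate(captured, 8, 300, 250);
        System.out.println(returned.equals(captured.toString())); // prints "true"
    }
}
```

This keeps the test off the disk entirely, which avoids both the slowness and the flakiness mentioned above.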

If you persist this *.js file on the file system, then simply create a utility method which reads its contents and then use some sort of assertEquals to compare it with your fixed data.
Here is code for reading file contents into String.
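One minimal way to do this with the JDK alone is sketched here. FileContents and readAsString are hypothetical names, and UTF-8 is assumed because the question writes the file with that encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileContents {

    // Reads the whole file into memory and decodes it as UTF-8.
    static String readAsString(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```

A test could then call something like assertEquals(expectedJs, FileContents.readAsString(Paths.get(outputDir + "8.js"))), with expectedJs holding the fixed data.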
