Java: retrieve .lnk file parameters and target

I would like to extract some info from a .lnk file in Java, specifically the entire target (with command line parameters after the initial .exe) and the working directory as well.
In the question Windows shortcut (.lnk) parser in Java? by user Zarkonnen, we can find the WindowsShortcut library created by multiple community users; see Code Bling's answer there.
However, as of now, this library only provides access to the file path itself, but not to the command line arguments or the working directory (or any other additional info that might be inside a shortcut file).
I tried to figure out a way to get the additional info using the WindowsShortcut library, but didn't succeed. The library only provides me with a getRealFilename() method:
public static void main(String[] args) throws Exception
{
    // note: backslashes must be escaped in Java string literals
    WindowsShortcut windowsShortcut = new WindowsShortcut(new File("C:\\test\\test.lnk"));
    System.out.println(windowsShortcut.getRealFilename());
}
Does anyone know of a way to do this?

Your question is really good. As of the date of your question, the WindowsShortcut class you refer to only implements code to get the path of the file a shortcut points to, but doesn't expose any of the other data stored inside a shortcut file. But it's open source, so let's extend it!
Let's do some research first
In the unofficial documentation by Jesse Hager we find this:
The flags
+-----+------------------------------------+-----------------------------------+
| Bit | Meaning when 1                     | Meaning when 0                    |
+-----+------------------------------------+-----------------------------------+
|  0  | The shell item id list is present. | The shell item id list is absent. |
|  1  | Points to a file or directory.     | Points to something else.         |
|  2  | Has a description string.          | No description string.            |
|  3  | Has a relative path string.        | No relative path.                 |
|  4  | Has a working directory.           | No working directory.             |
|  5  | Has command line arguments.        | No command line arguments.        |
|  6  | Has a custom icon.                 | Has the default icon.             |
+-----+------------------------------------+-----------------------------------+
So we know that we can check the flags byte for the existence of these additional strings. And we already have access to the flags byte in our WindowsShortcut class.
Now we only need to know where those strings are stored in the shortcut file. In the unofficial documentation we also find this structure:
File header
Shell item ID list
    Item 1
    Item 2
    etc.
File locator info
    Local path
    Network path
Description string
Relative path string
Working directory string
Command line string
Icon filename string
Extra stuff
So the strings we are interested in come directly after the File locator info block. Which is neat, because the existing WindowsShortcut class already parses the File locator info to get the file path.
The docs also say that each string consists of a length given as an unsigned short, followed by ASCII characters. However, at least under Windows 10, I encountered UTF-16 strings and implemented my code accordingly.
Let's implement!
We can simply add a few more lines at the end of the parseLink method.
First we get the offset directly after the File locator info block and call it next_string_start, as it now points to the first additional string:
final int file_location_size = bytesToDword(link, file_start);
int next_string_start = file_start + file_location_size;
We then check the flags for each of the strings in order, and if it exists, we parse it:
final byte has_description            = (byte)0b00000100;
final byte has_relative_path          = (byte)0b00001000;
final byte has_working_directory      = (byte)0b00010000;
final byte has_command_line_arguments = (byte)0b00100000;

// if description is present, parse it
if ((flags & has_description) > 0) {
    final int string_len = bytesToWord(link, next_string_start) * 2; // times 2 because UTF-16
    description = getUTF16String(link, next_string_start + 2, string_len);
    next_string_start = next_string_start + string_len + 2;
}
// if relative path is present, parse it
if ((flags & has_relative_path) > 0) {
    final int string_len = bytesToWord(link, next_string_start) * 2; // times 2 because UTF-16
    relative_path = getUTF16String(link, next_string_start + 2, string_len);
    next_string_start = next_string_start + string_len + 2;
}
// if working directory is present, parse it
if ((flags & has_working_directory) > 0) {
    final int string_len = bytesToWord(link, next_string_start) * 2; // times 2 because UTF-16
    working_directory = getUTF16String(link, next_string_start + 2, string_len);
    next_string_start = next_string_start + string_len + 2;
}
// if command line arguments are present, parse them
if ((flags & has_command_line_arguments) > 0) {
    final int string_len = bytesToWord(link, next_string_start) * 2; // times 2 because UTF-16
    command_line_arguments = getUTF16String(link, next_string_start + 2, string_len);
    next_string_start = next_string_start + string_len + 2;
}
The getUTF16String method is simply:
private static String getUTF16String(final byte[] bytes, final int off, final int len) {
    return new String(bytes, off, len, StandardCharsets.UTF_16LE);
}
And finally we need members and getters for those new Strings:
private String description;
private String relative_path;
private String working_directory;
private String command_line_arguments;

public String getDescription() {
    return description;
}

public String getRelativePath() {
    return relative_path;
}

public String getWorkingDirectory() {
    return working_directory;
}

public String getCommandLineArguments() {
    return command_line_arguments;
}
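With these additions in place, reading the extra fields looks like this (a small usage sketch, assuming the extended WindowsShortcut class from above is on the classpath):

import java.io.File;

public class ShortcutInfoDemo {
    public static void main(String[] args) throws Exception {
        WindowsShortcut shortcut = new WindowsShortcut(new File("C:\\test\\test.lnk"));
        // target path (already available in the original library)
        System.out.println("Target:            " + shortcut.getRealFilename());
        // the new fields added above; each may be null if its flag bit is not set
        System.out.println("Description:       " + shortcut.getDescription());
        System.out.println("Relative path:     " + shortcut.getRelativePath());
        System.out.println("Working directory: " + shortcut.getWorkingDirectory());
        System.out.println("Arguments:         " + shortcut.getCommandLineArguments());
    }
}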
I tested this under Windows 10 and it worked like a charm.
I made a pull request with my changes on the original repo; until it is merged, you can also find the complete code here.

Related

Why does protocol buffer perform worse than JSON in my case?

I'm doing a test between Protocol Buffers and JSON, but the performance of Protocol Buffers is not what I expected. I wonder if my HTTP request is too complicated, or if Protocol Buffers is just not appropriate in my case. Here's my HTTP request:
{
"LogId": "ABC165416515166165484165164132",
"ci_came": "uvInYKaMbhcyBm4p",
"ci_cme": "GWPMzgSwKzEZ5Mmz",
"ci_me": "gqHVRaSDTeksgijHM18QzajcVh21bBAq0TkknIXBQnkGKAPQbsZG35cDG6usAxwoxxiey9AnKbsN",
"ci_pi": "U6dpu0828Q0NNP5JRgVvCMgnn41poxfPlzhwU6FdBOrFsujzskD0HKTcQAXBcXgaOuYcckGtZDs9roJsK",
"th_logic": "IUPPORTxr_17_iIT_yiIS=[LthyOy.gitg.warmh;#6icy855 vxrsionCorx=1 uiyRr=msm8953 uiOTLOyrxR=unknown uPx=ujij Ir=N2G47H TIMx=1541178617000 iRyNr=gitrI TyG=iuilr HyRrWyRx=qyum jijIyL=1743iC101955 SUPPORTxr_yiIS=[Lhug.hxng.warmh;#5ic26y CPU_yiI=yrm17-v8y IS_rxiUGGyiLx=fylsx RyrIO=unknown MyNUFyCTURxR=gitrI IS_xMULyTOR=fylsx SUPPORTxr_32_iIT_yiIS=[Lhug.gitg.warmh;#xr9170c TyGS=txst-kxys CPU_yiI2= UNKNOWN=unknown PxRMISSIONS_RxVIxW_RxQUIRxr=fylsx Ujij=iuilrxr FINGxRPRINT=firxtruck/msm8953_17/msm8953_17:7.1.2/N2G47H/iuilrx11081144:ujij/txst-kxys HltT=gitriXM-iuilrxr-03 vxrsionNymx=1.0 PROrUCT=yxCR C10 rISPLyY=N2G47H txst-kxys MOrxL=yxCR C10 rxVICx=yxCR C10 hug.gitg.yrithmxticxxcxption: rivirx iy hxro yt yum.upspacepyy.fycxpyy.yummxnt.vixw.SxttingItxmFrygmxnt.onClick(SxttingItxmFrygmxnt.hug:117) yt ynrroir.vixw.Vixw.pxrformClick(Vixw.hug:5637) yt ynrroir.vixw.Vixw$PxrformClick.run(Vixw.hug:22433) yt ynrroir.lt.Hynrlxr.hynrlxCylliyck(Hynrlxr.hug:751) yt ynrroir.lt.Hynrlxr.rispytchMxssygx(Hynrlxr.hug:95)rimrgrilsTiiUxpXiHxXoOxX8kRituil",
"th_ideal": "TqpXdC5NQF",
"th_sth": "YTVMYUSuprQzHaQLgRdvxp0g8nLWdEZBc0UfrcyrQv09CKPBuacEesMfoiXqXHP2G2Duvmnzmv20iBBQKCuAk1piKvS9MvR9ymxD5YYahyBsdoWetqKjAuTBS115rqwDGhe2qDWMcRnZF3QF9f4WF5sJsFlmxroZzprR",
"th_err": "StGMzqIW1YGg44GC",
"th_code": "zrhwEaVmVlNPTUZCCO0j62bFjL6Sjnb8JxNng645fQOMlxA5ceKOwH67aYkK0FnM3vKMpXAbdLwCWAyVUjuvcH1",
"th_req": "ZWZXPUr6O4jYrXjLXlXskem7jHQ6D",
"th_index": "6546546546",
"th_log": "3Gt8V7LMxUMlvHPzcVUYCQl8zvwaDfDEzWn7GxOHbzf9quoZFTl2WwFRpMox2V8zfjbOQiIg4dxjf0x1vWGKHhvnmabXCO5jDWVE33TgI0YTJO14uYEnezdzYDoeR51",
"th_order": "T28XGCx1O3LCGa98lAtWc33",
"message": "Crash",
"time": "2019-11-11 18:23:00",
"ci_vi": "RCgdDu5874sJohjEVy7i72Kcp98rCOJvl",
"t_mNo": "1.9",
"t_tNo": "Gxk9Vb3zblp2PHpYTQzTXmzx43WaEtZmA3CWFfXtPsDZFgaAIug5mbX73w4wQvwNL65BEOW3fd7wExndzm3eilp4jODtHZQaV5G574FPfK",
"t_fd": "k58xs1eYKTvDxbRMWfPJMdB6tfBnGaOLAnmDUZxo2URebvtd8F",
"t_pd": "jWl7CTWdmgVFZxA",
"t_oer": "HHoLyXNYxKHqZgpev9vi",
"t_ar": "J6m4X9ATlADGaKUzi1eb",
"t_sr": "daP",
"t_sd": "AgXPBAaOrA95b9PM4196BQaLsVN9j9",
"t_sn": "1Ai4lFVObo0MymeJ894m0jItjiwhcD",
"t_dd": "zLuh1p1G",
"timeS": "2019-11-11 18:22:58"
}
Here's my proto file:
message Scada {
    TechInfo tInfo = 1;
    string time = 2;
    string message = 3;
    Thought thought = 4;
}
message TechInfo {
    string mNo = 1;
    string tNo = 2;
    string fd = 3;
    string pd = 4;
    string oer = 5;
    string ar = 6;
    Ci cd = 7;
    string sr = 8;
    string sd = 9;
    string sn = 10;
    string dd = 11;
}
message Ci {
    string pi = 1;
    string vi = 2;
    string me = 3;
    string cme = 4;
    string came = 5;
}
message Thought {
    string logic = 1;
    string ideal = 2;
    string sth = 3;
    string err = 4;
    string code = 5;
    string req = 6;
    string index = 7;
    string log = 8;
    string order = 9;
}
And I use the Protocol Buffers parseFrom() method to deserialize the request:
public static Scada pbDeSerialize(byte[] pbBytes) throws InvalidProtocolBufferException {
    Scada scada = ScadaObj.Scada.parseFrom(pbBytes);
    return scada;
}
I use JSON tools to deserialize the request:
public static PbScadaJsonObj jsonDeserialize(byte[] jsonBytes) {
    String str = new String(jsonBytes, utf8Charset);
    return JsonUtil.deserialize(str, PbScadaJsonObj.class);
}

public static <T> T deserialize(String json, Class<T> clazz) {
    return JSON.parseObject(json, clazz);
}
And I use JMeter to test these two methods. The test runs with one thread and with 100 threads. One request message is about 3 KB. Both the JSON and Protobuf (PB) deserialization are tested with a 1024 MB heap. Before each execution, I add a random number so that each message differs from the others. My machine has 2 cores and 4 GB of RAM.
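For reference, the single-thread measurement boils down to something like the following hand-rolled sketch (illustration only; the numbers in the tables below were produced with JMeter, and pbBytes/jsonBytes are assumed to hold one pre-serialized ~3 KB request each; run it inside a method that declares throws Exception):

long start = System.nanoTime();
for (int i = 0; i < 100_000; i++) {
    pbDeSerialize(pbBytes);     // Protobuf parseFrom(), as defined above
}
System.out.println("PB:   " + (System.nanoTime() - start) / 1_000_000 + " ms");

start = System.nanoTime();
for (int i = 0; i < 100_000; i++) {
    jsonDeserialize(jsonBytes); // FastJSON parseObject(), as defined above
}
System.out.println("JSON: " + (System.nanoTime() - start) / 1_000_000 + " ms");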
+---------------------+----------+------+
| 100k loops 1 thread | FastJson | PB |
+---------------------+----------+------+
| TIME(s) | 360 | 309 |
+---------------------+----------+------+
| CPU(%) | 104 | 99.8 |
+---------------------+----------+------+
| MEM(%) | 7.2 | 6.6 |
+---------------------+----------+------+
+------------+----------+-------+
| 100threads | FastJson | PB |
+------------+----------+-------+
| TPS(/s) | 274.3 | 321.9 |
+------------+----------+-------+
| CPU(%) | 185.8 | 168.6 |
+------------+----------+-------+
| MEM(%) | 9.1 | 28.6 |
+------------+----------+-------+
From the test, I can't see much of an improvement from Protocol Buffers: it consumes much more memory for only about a 50/s increase in TPS. Could anyone explain this for me? Has anyone done a similar kind of test?
The comparison is unfair. Your Protobuf definition is nested, while your JSON definition is flat. If you want to do a fair comparison, make your JSON nested or make the Protobuf flat:
Make JSON nested:
{
    "tInfo" : {"mNo" : "xxxx", "tNo" : "xxxx", "other" : "fields"},
    "time" : "2019-11-11 18:23:00",
    "message" : "Crash",
    "thought" : {"logic" : "xxx", "other" : "fields"}
}
Every serialization mechanism has strengths and weaknesses. From memory I'd say it looks roughly like this:
Serialization cost: cheap (+) / expensive (-)

+--------------------------------+----------+------+
|                                | Protobuf | JSON |
+--------------------------------+----------+------+
| primitives (numbers/bool/enum) |    +     |  -   |
| raw bytes                      |    +     |  -   |
| skipping content               |    +     |  -   |
| nested messages                |    -     |  +   |
| strings                        |    -     |  +   |
+--------------------------------+----------+------+
Protobuf encodes strings and nested messages as length-delimited data, so each string is prefixed with its length in bytes. This was a deliberate choice and has benefits when parsing (e.g. lazy parsing of strings and efficient skipping), but it does add a cost to serialization: implementations may need to precompute the length and effectively convert the string to bytes twice. JSON uses a start and an end character, so it can stream directly into an output buffer. The difference gets smaller with caching, but Protobuf always has to do more work to encode a string than JSON.
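To make the length prefix concrete, here is a hand-rolled sketch of how a single short string field ends up on the wire (illustration only, for field number 1 with wire type 2 = length-delimited; real code would use the generated classes):

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class WireFormatSketch {
    public static void main(String[] args) {
        byte[] value = "Crash".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((1 << 3) | 2);           // tag byte: field number 1, wire type 2
        out.write(value.length);           // length prefix (a single varint byte for short strings)
        out.write(value, 0, value.length); // the UTF-8 payload itself
        // Result: 0x0A 0x05 'C' 'r' 'a' 's' 'h' -- the length has to be known before
        // the payload is written, which is the extra serialization work mentioned above.
        System.out.println(out.size() + " bytes"); // prints "7 bytes"
    }
}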
Given that your protos only contain strings and nested messages, I wouldn't expect a lot of performance gains purely from switching to Protobuf. It may gain some speed on the field identifiers, but your field names were already shortened to a point where they are barely human readable.
On the other hand, several of your strings look like numbers and base64-encoded data. Switching those to Protobuf's primitive and bytes types would be a lot more efficient and should provide a significant speedup.

Write text to a file in Java

I'm trying to write a simple output to a file but I'm getting the wrong output. This is my code:
Map<Integer,List<Client>> hashMapClients = new HashMap<>();
hashMapClients = clients.stream().collect(Collectors.groupingBy(Client::getDay));
Map<Integer,List<Transaction>> hasMapTransactions = new HashMap<>();
hasMapTransactions = transactions.stream().collect(Collectors.groupingBy(Transaction::getDay));
//DAYS
String data;
for (Integer key: hashMapClients.keySet()) {
    data = key + " | ";
    for (int i = 0; i < hashMapClients.get(key).size(); i++) {
        data += hashMapClients.get(key).get(i).getType() + " | " + hashMapClients.get(key).get(i).getAmountOfClients() + ", ";
        writer.println(data);
    }
}
I get this output
1 | Individual | 0,
1 | Individual | 0, Corporation | 0,
2 | Individual | 0,
2 | Individual | 0, Corporation | 0,
But it should be the following, and it should not end with a comma after the last entry:
1 | Individual | 0, Corporation | 0
2 | Individual | 0, Corporation | 0
What am I doing wrong?
It sounds like you only want to write data to the output in the outer loop, not in the inner loop; the inner loop is just for building the data value to write. Something like this:
String data;
for (Integer key: hashMapClients.keySet()) {
    // initialize the value
    data = key + " | ";
    // build the value
    for (int i = 0; i < hashMapClients.get(key).size(); i++) {
        data += hashMapClients.get(key).get(i).getType() + " | " + hashMapClients.get(key).get(i).getAmountOfClients() + ", ";
    }
    // write the value
    writer.println(data);
}
Edit: Thanks for pointing out that the trailing separator also still needs to be removed. Since each iteration appends ", " (a comma and a space), that means stripping the last two characters. Without more error checking, that could be as simple as:
data = data.substring(0, data.length() - 2);
You can add error checking as your logic requires, perhaps confirming that the string really ends with the separator or that the inner loop executed at least once, etc.
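Putting both steps together, the loop could look like this (a sketch that reuses the same writer, map and getters as in the question):

String data;
for (Integer key : hashMapClients.keySet()) {
    data = key + " | ";
    // build the whole line first
    for (int i = 0; i < hashMapClients.get(key).size(); i++) {
        data += hashMapClients.get(key).get(i).getType() + " | "
              + hashMapClients.get(key).get(i).getAmountOfClients() + ", ";
    }
    // strip the trailing ", " only if at least one client was appended
    if (data.endsWith(", ")) {
        data = data.substring(0, data.length() - 2);
    }
    // write the finished line once per key
    writer.println(data);
}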
One problem is that you are calling println after every Client, rather than waiting until the whole list is built. Then, to fix the problem with the trailing comma, you can use a joining collector.
Map<Integer,List<Client>> clientsByDay = clients.stream()
        .collect(Collectors.groupingBy(Client::getDay));

/* Iterate over key-value pairs */
for (Map.Entry<Integer, List<Client>> e : clientsByDay.entrySet()) {
    /* Print the key */
    writer.print(e.getKey());
    /* Print a separator */
    writer.print(" | ");
    /* Print the value */
    writer.println(e.getValue().stream()
            /* Convert each Client to a String in the desired format */
            .map(c -> c.getType() + " | " + c.getAmountOfClients())
            /* Join the clients together in a comma-separated list */
            .collect(Collectors.joining(", ")));
}

Talend - generating n multiple rows from 1 row

Background: I'm using Talend to do something (I guess) that is pretty common: generating multiple rows from one. For example:
ID | Name | DateFrom | DateTo
01 | Marco| 01/01/2014 | 04/01/2014
...could be split into:
new_ID | ID | Name | DateFrom | DateTo
01 | 01 | Marco | 01/01/2014 | 02/01/2014
02 | 01 | Marco | 02/01/2014 | 03/01/2014
03 | 01 | Marco | 03/01/2014 | 04/01/2014
The number of resulting rows is dynamic, depending on the date period in the original row.
Question: how can I do this? Maybe using tSplitRow? I am going to check those periods with tJavaRow. Any suggestions?
Expanding on the answer given by Balazs Gunics:
The first part is to calculate the number of rows one row will become; that is easy enough with a date-diff function on the to and from dates.
Part 2 is to pass that value to a tFlowToIterate, and pick it up with a tJavaFlex that will use it in its start code to control a for loop:
tJavaFlex start code:
int currentId = (Integer)globalMap.get("out1.id");
String currentName = (String)globalMap.get("out1.name");
Long iterations = (Long)globalMap.get("out1.iterations");
Date dateFrom = (java.util.Date)globalMap.get("out1.dateFrom");
for (int i = 0; i < iterations; i++) {

tJavaFlex main code:
    row2.id = currentId;
    row2.name = currentName;
    row2.dateFrom = TalendDate.addDate(dateFrom, i, "dd");
    row2.dateTo = TalendDate.addDate(dateFrom, i+1, "dd");

tJavaFlex end code:
}
and sample output:
1|Marco|01-01-2014|02-01-2014
1|Marco|02-01-2014|03-01-2014
1|Marco|03-01-2014|04-01-2014
2|Polo|01-01-2014|02-01-2014
2|Polo|02-01-2014|03-01-2014
2|Polo|03-01-2014|04-01-2014
2|Polo|04-01-2014|05-01-2014
2|Polo|05-01-2014|06-01-2014
2|Polo|06-01-2014|07-01-2014
2|Polo|07-01-2014|08-01-2014
2|Polo|08-01-2014|09-01-2014
2|Polo|09-01-2014|10-01-2014
2|Polo|10-01-2014|11-01-2014
2|Polo|11-01-2014|12-01-2014
2|Polo|12-01-2014|13-01-2014
2|Polo|13-01-2014|14-01-2014
2|Polo|14-01-2014|15-01-2014
2|Polo|15-01-2014|16-01-2014
2|Polo|16-01-2014|17-01-2014
2|Polo|17-01-2014|18-01-2014
2|Polo|18-01-2014|19-01-2014
2|Polo|19-01-2014|20-01-2014
2|Polo|20-01-2014|21-01-2014
2|Polo|21-01-2014|22-01-2014
2|Polo|22-01-2014|23-01-2014
2|Polo|23-01-2014|24-01-2014
2|Polo|24-01-2014|25-01-2014
2|Polo|25-01-2014|26-01-2014
2|Polo|26-01-2014|27-01-2014
2|Polo|27-01-2014|28-01-2014
2|Polo|28-01-2014|29-01-2014
2|Polo|29-01-2014|30-01-2014
2|Polo|30-01-2014|31-01-2014
2|Polo|31-01-2014|01-02-2014
You can use tJavaFlex to do this.
If you only have a small number of columns, then a tFlowToIterate -> tJavaFlex combination could be fine.
In the begin part you start the iteration, and in the main part you assign values to the output schema. If your output row is named row6, then:
row6.id = (String)globalMap.get("id");
and so on.
I came here because I wanted to add all context parameters into an Excel data sheet. So the solution below works when you take 0 input rows, but it can be adapted to generate several lines for each line of input.
The design is actually straight forward:
tJava --trigger-on-OK--> tFileInputDelimited --> tDoSomethingOnRowSet
  |                              ^
  [write into a CSV]             [read the CSV]
And here is the kind of code structure usable in the tJava.
try {
    StringBuffer wad = new StringBuffer();
    wad.append("Key;Nub"); // Header
    context.stringPropertyNames().forEach(
        key -> wad.
            append(System.getProperty("line.separator")).
            append(key + ";" + context.getProperty(key))
    );
    // Here context.metadata contains the path to the CSV file
    FileWriter output = new FileWriter(context.metadata);
    output.write(wad.toString());
    output.close();
} catch (IOException mess) {
    System.out.println("An error occurred.");
    mess.printStackTrace();
}
Of course, if you have a set of rows as input, you can adapt the process to use a tJavaRow instead of a tJava.
You might prefer to use an Excel file as an on-disk buffer, but dealing with that file format requires more work, at least the first time, if you don't already have the Java libraries configured in Talend. Apache POI might help if you nonetheless choose to go this way.

In Java, what's the best way to read a URL and split it into its parts?

Firstly, I am aware that there are other similar posts, but since mine is using a URL and I am not always sure what my delimiter will be, I feel that I am alright posting my question. My assignment is to make a crude web browser. I have a textField that a user enters the desired URL into. I then obviously have to navigate to that webpage. Here is an example from my teacher of roughly what my code would look like. This is the code I'm supposed to be sending to my socket. Sample URL: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
GET /wiki/Hypertext_Transfer_Protocol HTTP/1.1\n
Host: en.wikipedia.org\n
\n
So my question is this: I am going to read in the url as just one complete string, so how do I extract just the "en.wikipedia.org" part and just the extension? I tried this as a test:
String url = "http://en.wikipedia.org/wiki/Hypertext Transfer Protocol";
String done = " ";
String[] hope = url.split(".org");
for (int i = 0; i < hope.length; i++)
{
    done = done + hope[i];
}
System.out.println(done);
This just prints out the URL without the ".org" in it. I think I'm on the right track, I'm just not sure. Also, I know that websites can have different endings (.org, .com, .edu, etc.), so I am assuming I'll have to have a few if statements that compensate for the possible different endings. Basically, how do I get the URL into the two parts that I need?
The URL class pretty much does this; look at the tutorial. For example, given this URL:
http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING
This is the kind of information you can expect to obtain:
protocol = http
authority = example.com:80
host = example.com
port = 80
path = /docs/books/tutorial/index.html
query = name=networking
filename = /docs/books/tutorial/index.html?name=networking
ref = DOWNLOADING
This is how you should split your URL parts: http://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
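A quick sketch of that approach, using only the standard java.net.URL accessors (the printed values match the list above):

import java.net.URL;

public class ParseURL {
    public static void main(String[] args) throws Exception {
        URL aURL = new URL("http://example.com:80/docs/books/tutorial"
                + "/index.html?name=networking#DOWNLOADING");
        System.out.println("protocol = " + aURL.getProtocol());   // http
        System.out.println("authority = " + aURL.getAuthority()); // example.com:80
        System.out.println("host = " + aURL.getHost());           // example.com
        System.out.println("port = " + aURL.getPort());           // 80
        System.out.println("path = " + aURL.getPath());           // /docs/books/tutorial/index.html
        System.out.println("query = " + aURL.getQuery());         // name=networking
        System.out.println("filename = " + aURL.getFile());       // path plus query
        System.out.println("ref = " + aURL.getRef());             // DOWNLOADING
    }
}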
Instead of url.split(".org"); try url.split("/"); and iterate through your array of strings.
Or you can look into regular expressions. This is a good example to start with.
Good luck on your homework.
Even though the answer with the URL class is great, here is one more way to split a URL into its components, using a regexp:
"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?"
|| | | | | | | |
12 - scheme | | | | | | |
3 4 - authority, includes hostname/ip and port number.
5 - path| | | |
6 7 - query| |
8 9 - fragment
You can use it with Pattern class:
var regex = "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?";
var pattern = Pattern.compile(regex);
var matcher = pattern.matcher("http://example.com:80/docs/books/tutorial/index.html?name=networking#DOWNLOADING");
if (matcher.matches()) {
    System.out.println("scheme: " + matcher.group(2));
    System.out.println("authority: " + matcher.group(4));
    System.out.println("path: " + matcher.group(5));
    System.out.println("query: " + matcher.group(7));
    System.out.println("fragment: " + matcher.group(9));
}
You can use the String class's split() method, store the result in a String array, then iterate over the array and put each key and value into a Map.
public class URLSPlit {
    public static Map<String, String> splitString(String s) {
        // split on the '=', '&' and '?' delimiters
        String[] split = s.split("[=&?]+");
        int length = split.length;
        Map<String, String> maps = new HashMap<>();
        // step through the tokens in key/value pairs
        for (int i = 0; i + 1 < length; i += 2) {
            maps.put(split[i], split[i + 1]);
        }
        return maps;
    }

    public static void main(String[] args) {
        String word = "q=java+online+compiler&rlz=1C1GCEA_enIN816IN816&oq=java+online+compiler&aqs=chrome..69i57j69i60.18920j0j1&sourceid=chrome&ie=UTF-8?k1=v1";
        Map<String, String> newmap = splitString(word);
        for (Map.Entry<String, String> map : newmap.entrySet()) {
            System.out.println(map.getKey() + " = " + map.getValue());
        }
    }
}

How can I display a graphical (ASCII text graphics) representation of the decision tree in WEKA using graph() or toGraph()

I'm trying to display the decision tree generated by different classes using the WEKA classes in my own program. Specifically I'm using two different ones: J48 (C4.5 implementation) and RandomTree. One has the function graph() and the other has the function toGraph() which appear to have the same functionality for their respective classes.
Since they both show java.lang.String as their return type, I was expecting to see something like what you see when using the Explorer app:
act = STRETCH
| size = SMALL
| | Color = YELLOW
| | | age = ADULT : T (1/0)
| | | age = CHILD : F (1/0)
| | Color = PURPLE
| | | age = ADULT : T (2/0)
| | | age = CHILD : F (1/0)
| size = LARGE
| | age = ADULT : T (4/0)
| | age = CHILD : F (2/0)
act = DIP : F (8/0)
Instead I get something like this:
digraph Tree {
edge [style=bold]
N13aaa14a [label="1: T"]
N13aaa14a->N268b819f [label="act = STRETCH"]
N268b819f [label="2: T"shape=box]
N13aaa14a->N10eb017e [label="act = DIP"]
N10eb017e [label="3: F"]
N10eb017e->N34aeffdf [label="age = CHILD"]
N34aeffdf [label="4: F"shape=box]
N10eb017e->N4d20a47e [label="age = ADULT"]
N4d20a47e [label="5: T"shape=box]
}
Is this something unique to the WEKA libraries, or is this some type of standard Java format? It looks similar to some of the JSON stuff I saw while working on another project, but I never got that familiar with it.
Is there an easy way I can write a function to display this in a more human-readable format?
The output you are getting is in the so-called "dot" format, which is designed to be compiled by Graphviz. You'll get better results than ASCII art, that's for sure.
Save your output to out.dot and then try this command:
$ dot -Tpng -o out.png out.dot
Then look at what you've got in out.png
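If you want to produce the file from your own program rather than by hand, a small sketch along these lines should work (it assumes a J48 classifier that has already been built, and writes its dot output so you can run the command above on it):

import java.nio.file.Files;
import java.nio.file.Paths;
import weka.classifiers.trees.J48;

public class ExportTreeGraph {
    public static void writeDotFile(J48 tree) throws Exception {
        // graph() returns the decision tree as Graphviz "dot" source
        String dotSource = tree.graph();
        Files.write(Paths.get("out.dot"), dotSource.getBytes("UTF-8"));
        // then run: dot -Tpng -o out.png out.dot
    }
}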
