I am new to Java and am working on a desktop application, learning Java standards and concepts as I go. A large portion of the application does XML parsing: it parses a huge XML file and generates readable text depending on the nodes and their values. At the moment, everything in the parsing logic is hard-coded.
As the application grows and new requirements and change requests come in, I would prefer to get rid of all the hard-coded strings and maintain them in a config file (like a lookup table). I would like to revise the existing behavior, which looks like this:
public String getReadable() {
    action = action.getFirstChild();
    switch (action.getNodeName()) {
        case "Drive": {
            String name = XMLUtility.getChildFrom(action, 0).getTextContent();
            String model = XMLUtility.getChildFrom(action, 1).getTextContent();
            String speed = XMLUtility.getChildFrom(action, 2).getTextContent();
            return name + " is driving the " + model + " vehicle at a speed of " + speed + " mph";
        }
        case "Stop": {
            String model = action.getFirstChild().getTextContent();
            return model + " is now stopped";
        }
    }
    return "";
}
There are hundreds of similar functions in my application's parsing mechanism. I would like to maintain all the hard-coded strings centrally, which is a more flexible way of maintaining the code.
But as I do not have any prior experience in Java, I am looking for a standard way of doing this. Please recommend a clean approach to improve my code.
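For illustration, the kind of centralized lookup I have in mind might look roughly like the sketch below: the message templates live in a properties file and are filled in with java.text.MessageFormat. (This is only a sketch of what I'm imagining; the file name, keys, and helper class are made up.)

import java.io.InputStream;
import java.text.MessageFormat;
import java.util.Properties;

// Sketch of a central lookup for the hard-coded message templates.
// A (made-up) readable-messages.properties on the classpath would contain, e.g.:
//   Drive = {0} is driving the {1} vehicle at a speed of {2} mph
//   Stop  = {0} is now stopped
public class ReadableMessages {

    private final Properties templates = new Properties();

    public ReadableMessages() {
        try (InputStream in = ReadableMessages.class
                .getResourceAsStream("/readable-messages.properties")) {
            templates.load(in);
        } catch (Exception e) {
            throw new IllegalStateException("Could not load message templates", e);
        }
    }

    // Looks up the template for the node name and fills in the child values.
    public String format(String nodeName, Object... values) {
        String template = templates.getProperty(nodeName, "");
        return MessageFormat.format(template, values);
    }
}

With something like this, the switch above would shrink to collecting the child text contents and calling format(action.getNodeName(), name, model, speed).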
I'm using common.graph from Google Guava, version 21.0. It suits my use case very well except for one aspect: persistence. The graph appears to be in-memory only. The graph classes do not implement Serializable, as was explained in this issue.
Google describes three models to store the topology. The third option is:
a separate data repository (for example, a database) stores the topology
But that's all. I didn't find any methods in the package for using a separate data repository. Is there a way to do this? Or is the only way to use the nodes() and edges() methods to get a Set of my nodes and a Set of my edges? I could persist them in a database if I implement Serializable in these classes, and restore the graph by calling addNode(Node) and addEdge(Source, Target, Edge) (there are no addAll methods). But this seems like a workaround.
Thanks for your support!
To briefly recap the reason why Guava's common.graph classes aren't Serializable: Java serialization is fragile because it depends on the details of the implementation, and that can change at any time, so we don't support it for the graph types.
In the short term, your proposed workaround is probably your best bet, although you'll need to be careful to store the endpoints (source and target) of the edges alongside the edge objects so that you'll be able to rebuild the graph as you describe. And in fact this may work for you in the longer term, too, if you've got a database that you're happy with and you don't need to worry about interoperation with anyone else.
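A minimal sketch of that rebuild step, assuming a directed MutableGraph<String> whose node identifiers and edge endpoints have already been loaded back from the store (for a Network with explicit edge objects you would store the endpoints alongside each edge, as described above):

import com.google.common.graph.EndpointPair;
import com.google.common.graph.GraphBuilder;
import com.google.common.graph.MutableGraph;
import java.util.List;

public final class GraphRestorer {

    // Rebuilds the graph from previously persisted nodes and edge endpoints.
    public static MutableGraph<String> restore(List<String> storedNodes,
                                               List<EndpointPair<String>> storedEdges) {
        MutableGraph<String> graph = GraphBuilder.directed().build();
        for (String node : storedNodes) {
            graph.addNode(node);  // isolated nodes need an explicit addNode
        }
        for (EndpointPair<String> edge : storedEdges) {
            graph.putEdge(edge.source(), edge.target());  // putEdge also adds missing endpoints
        }
        return graph;
    }
}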
As I mentioned in that GitHub issue, another option is to persist your graph to some kind of file format. (Guava itself does not provide a mechanism for doing this, but JUNG will for common.graph graphs once I can get 3.0 out the door, which I'm still working on.) Note that most graph file formats (at least the ones I'm familiar with) have fairly limited support for storing node and edge metadata, so you might want your own file format (say, something based on protocol buffers).
One way I found of storing the graph was through the DOT format, like so:
import com.google.common.graph.Network;

public class DOTWriter<INode, IEdge> {
    // INode and IEdge are the application's own node and edge types,
    // assumed to expose getId(), getSource() and getTarget().
    public String write(final Network<INode, IEdge> graph) {
        StringBuilder sb = new StringBuilder();
        sb.append("strict digraph G {\n");
        for (INode n : graph.nodes()) {
            sb.append("  \"").append(n.getId()).append("\";\n");
        }
        for (IEdge e : graph.edges()) {
            sb.append("  \"").append(e.getSource().getId()).append("\" -> \"")
              .append(e.getTarget().getId()).append("\";\n");
        }
        sb.append("}");
        return sb.toString();
    }
}
This will produce something like
strict digraph G {
  "node_A";
  "node_B";
  "node_A" -> "node_B";
}
It's very easy to read this and build the graph in memory again.
If your nodes are complex objects you should store them separately though.
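Reading that format back in can be equally simple. Here is a hedged sketch that only understands the minimal output produced above (quoted node ids and -> edges, nothing else from the DOT grammar):

import com.google.common.graph.GraphBuilder;
import com.google.common.graph.MutableGraph;
import java.util.List;

public final class DOTReader {

    // Rebuilds a graph of node ids from the minimal DOT text written above.
    public static MutableGraph<String> read(List<String> lines) {
        MutableGraph<String> graph = GraphBuilder.directed().build();
        for (String line : lines) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("strict") || line.equals("}")) {
                continue;  // skip header and footer
            }
            line = line.replace("\"", "").replace(";", "");
            if (line.contains("->")) {
                String[] parts = line.split("->");
                graph.putEdge(parts[0].trim(), parts[1].trim());
            } else {
                graph.addNode(line);
            }
        }
        return graph;
    }
}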
Based on @Maria Ines Parnisari's amazing answer, I modified it a little 😊. Then, rendering the edges with Mermaid (a Markdown plugin), I get a clear picture like this in IntelliJ IDEA (>= 2021.2, which supports Markdown better).
Code:
//noinspection UnstableApiUsage
MutableGraph<String> graph = GraphBuilder.directed()
.allowsSelfLoops(false)
.build();
//noinspection UnstableApiUsage
graph.addNode("root");
graph.putEdge("root", "s1_1");
graph.putEdge("root", "s1_2");
graph.putEdge("root", "s1_3");
graph.putEdge("s1_2", "s2");
graph.putEdge("s2", "s3");
graph.putEdge("s3", "s4");
graph.putEdge("s3", "s5");
graph.putEdge("s4", "s6");
graph.putEdge("s5", "s6");
graph.putEdge("s1_1", "s6");
graph.putEdge("s1_1", "s2");
// print the Mermaid text, then copy it
StringBuilder sb = new StringBuilder();
for (EndpointPair<String> edge : graph.edges()) {
    // must be `-->` for Mermaid to draw the edge
    sb.append(edge.nodeU() + " --> " + edge.nodeV() + "\n");
}
System.out.println(sb);
Sorry, as I have less than 10 reputation I can't post the image directly; it would be hidden.
I want to print out Java method calls with the names and values of their parameters, as well as the values they return.
I don't want to manually add the trace statements, especially when the code is in a 3rd party library. I need to understand the interactions with the library, especially when callbacks are used.
I have tried to use a wrapper but ran into problems, so subclassing seems better (i.e. either wrappedObject.methodA() or super.methodA() calls).
It's a pain to write this code especially when there are lots of methods.
I wish Java could do this automatically, since it has everything needed to make this possible.
What is the best way to do this? Substituting objects with the wrapper or subclass is a compromise.
So, the next step is to add the tracing code to the wrapper or subclass. I thought of writing a parser to generate the code.
I have used yacc & lex before, and just found out about Antlr.
Is Antlr the right tool to do this? How would I do it please? I haven't used Antlr before, but have seen it around.
Thanks.
Here's what I want to do -
// 3rdParty library under investigation, with evolving versions
import com.3rdParty.lib.Service;
import com.3rdParty.lib.Callback;
public class MyExistingClass {
Service service = new Service();
// Need to understand 3rd party library service and callback interactions
// Also need to write my own callbacks using 3rd party interface
if (traceMethod1) {
service.doSomething(param1, new CallbackWrapper(), param3);
}
else if (traceMethod2) {
service.doSomething(param1, new CallbackSubclass(), param3);
}
else {
// Original code
// Service calls Callback methods
service.doSomething(param1, new Callback(), param3);
}
}
--------------------------------------------------------------------------------
// 3rd Party code - Service calls Callback methods
package com.3rdParty.lib;
public class Callback extends SomeBaseClass {
public void methodA(int code, String action, SomeClass data) {
// do method A stuff
}
public String methodB(String name, boolean flag) {
// do method B stuff
return result;
}
...etc.
}
--------------------------------------------------------------------------------
// Wrapper class - traceMethod1
package com.my.package;
import com.3rdParty.lib.Callback;
public class CallbackWrapper implements SomeCallbackInterface {
Callback cb = new Callback();
public void methodA(int code, String action, SomeClass data) {
logger.debug("CallbackWrapper.methodA() called");
logger.debug(" code = " + code);
logger.debug(" action = " + action);
logger.debug(" data = " + data);
cb.methodA(code, action, data);
logger.debug("CallbackWrapper.methodA() returns");
}
public String methodB(String name, boolean flag) {
logger.debug("CallbackWrapper.methodB() called");
logger.debug(" name = " + name);
logger.debug(" flag = " + flag);
String result = cb.methodB(name, flag);
logger.debug("CallbackWrapper.methodB() returns result = " + result);
return result;
}
...etc.
}
--------------------------------------------------------------------------------
// Subclass - traceMethod2
package com.my.package;
import com.3rdParty.lib.Callback;
public class CallbackSubclass extends Callback {
public void methodA(int code, String action, SomeClass data) {
logger.debug("CallbackSubclass.methodA() called");
logger.debug(" code = " + code);
logger.debug(" action = " + action);
logger.debug(" data = " + data);
super.methodA(code, action, data);
logger.debug("CallbackSubclass.methodA() returns");
}
public String methodB(String name, boolean flag) {
logger.debug("CallbackSubclass.methodB() called");
logger.debug(" name = " + name);
logger.debug(" flag = " + flag);
String result = super.methodB(name, flag);
logger.debug("CallbackSubclass.methodB() returns result = " + result);
return result;
}
...etc.
}
The easiest way to do this sort of thing in Java is to work with byte code rather than source code. Using BCEL (https://commons.apache.org/proper/commons-bcel/) or ASM (http://asm.ow2.org/), you can dynamically create and load modified versions of existing classes, or even entirely new classes generated from scratch.
This is still not easy, but it's much easier than trying to do source code translation.
For your particular problem of tracing method calls, you can make a custom ClassLoader that automatically instruments every method in every class it loads with custom tracing code.
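To make that concrete, here is a rough sketch using ASM. It is only a sketch, not a production instrumenter: it prints a line on entry to each method of a chosen package, ignores constructors and other edge cases, and the package prefix and class names used below are made up.

import java.io.InputStream;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class TracingClassLoader extends ClassLoader {

    private final String tracedPackage;  // e.g. "com.thirdparty." (hypothetical)

    public TracingClassLoader(String tracedPackage, ClassLoader parent) {
        super(parent);
        this.tracedPackage = tracedPackage;
    }

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        if (!name.startsWith(tracedPackage)) {
            return super.loadClass(name);  // everything else: normal parent delegation
        }
        Class<?> already = findLoadedClass(name);
        if (already != null) {
            return already;
        }
        try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
            ClassReader reader = new ClassReader(in);
            ClassWriter writer = new ClassWriter(reader, ClassWriter.COMPUTE_MAXS);
            reader.accept(new ClassVisitor(Opcodes.ASM9, writer) {
                @Override
                public MethodVisitor visitMethod(int access, String method, String desc,
                                                 String sig, String[] exceptions) {
                    MethodVisitor mv = super.visitMethod(access, method, desc, sig, exceptions);
                    return new MethodVisitor(Opcodes.ASM9, mv) {
                        @Override
                        public void visitCode() {
                            // inject: System.out.println("TRACE: enter <class>.<method><descriptor>")
                            mv.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
                                    "Ljava/io/PrintStream;");
                            mv.visitLdcInsn("TRACE: enter " + name + "." + method + desc);
                            mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream",
                                    "println", "(Ljava/lang/String;)V", false);
                            super.visitCode();
                        }
                    };
                }
            }, 0);
            byte[] bytes = writer.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (Exception e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

You would then load the library's entry point through this loader, for example Class.forName("com.thirdparty.Service", true, new TracingClassLoader("com.thirdparty.", ClassLoader.getSystemClassLoader())), and every method of the traced classes would announce itself when entered.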
ANTLR is not the right tool. While you can get Java grammars for ANTLR that will build ASTs, ANTLR won't help you much with modifying those ASTs or regenerating compilable source text.
What you need is a Program Transformation System (PTS). These are tools that parse source code, build ASTs, provide you with means to modify these ASTs generally with source to source transformations, and can regenerate compilable source text from the modified tree.
The source-to-source transformations for a PTS are usually written in terms of the language-to-be-transformed syntax (in this case, java):
if you see *this*, replace it by *that* if some *condition*
Our DMS Software Reengineering Toolkit is such a PTS with an available Java front end.
What you want to do is very much like instrumenting code; just modify the victim methods to make them do your desired tracing. See Branch Coverage Made Easy for examples of how we implemented an instrumentation tool for Java using such source-to-source rewrites.
Alternatively, you could write rules that replicated the victim classes and methods as subclasses with the tracing wired in as you have suggested. The instrumentation is probably easier than copying everything. [Of course, when you are writing transformation rules, you really don't care how much code changes since in your case you are going to throw it away after you are done with it; you care how much effort it is to write the rules]
See DMS Rewrite Rules for a detailed discussion of what such rules really look like, and a worked example of rewrite rules that make changes to source code.
Others suggest doing the transformation on the compiled byte code. Yes, that works for the same reason that a PTS works, but you get to do the code hacking by hand, and the transforms, written as a pile of procedural goo operating on JVM instructions, are not readable by mere humans. In the absence of an alternative, I understand why people take this approach. My main point is that there are alternatives.
I'm trying to get a lot of data from multiple pages, but it's not always consistent. Here is an example of the HTML I am working with:
Example HTML
I need to get something like Team | Team | Result, all into different variables or lists.
I just need some help on where to start, because the main table I'm working with isn't the same on every page.
Here's my Java so far:
try {
Document team_page = Jsoup.connect("http://www.soccerstats.com/team.asp?league=" + league + "&teamid=" + teamNumber).get();
Element home_team = team_page.select("[class=homeTitle]").first();
String teamName = home_team.text();
System.out.println(teamName + "'s Latest Results: ");
Elements main_page = team_page.select("[class=stat]");
System.out.println(main_page);
} catch (IOException e) {
System.out.println("unable to parse content");
}
I am getting the league and teamid from different methods of my program.
Thanks!
Yes. This is one of the problems with webpage scraping.
You have to figure out one or more heuristics that will extract the information that you need across all of the pages that you need to access. There's no magic bullet. Just hard work. (And you'll have to do it all over again if the site changes its page layout.)
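For example, one simple heuristic is to try several CSS selectors in order and take the first one that matches. A hedged sketch (the fallback selectors you would pass in are made up and need tuning per page layout):

import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public final class ScrapeHeuristics {

    // Tries the candidate selectors in order and returns the first matching element, or null.
    public static Element selectFirst(Document page, String... candidateSelectors) {
        for (String selector : candidateSelectors) {
            Element match = page.select(selector).first();
            if (match != null) {
                return match;
            }
        }
        return null; // none of the heuristics matched this page layout
    }
}

You would call it with page-specific fallbacks, for example selectFirst(team_page, "[class=homeTitle]", "h1.title", "table.stat caption"), and handle the case where nothing matches.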
A better idea is to request the information as XML or JSON using the site or sites' RESTful APIs ... assuming they exist and are available to you.
(And if you continue with the web-scraping approach, check the site's Terms of Service to make sure that your activity is acceptable.)
I'm using Hibernate Search / Lucene to maintain a really simple index to find objects by name - no fancy stuff.
My model classes all extend a class NamedModel which looks basically as follows:
@MappedSuperclass
public abstract class NamedModel {

    @Column(unique = true)
    @Field(store = Store.YES, index = Index.UN_TOKENIZED)
    protected String name;
}
My problem is that I get a BooleanQuery$TooManyClauses exception when querying the index for objects with names starting with a specific letter, e.g. "name:l*".
A query like "name:lin*" will work without problems, in fact any query using more than one letter before the wildcard will work.
While searching the net for similar problems, I only found people using pretty complex queries and that always seemed to cause the exception. I don't want to increase maxClauseCount because I don't think it's a good practice to change limits just because you reach them.
What's the problem here?
Lucene tries to rewrite your query from the simple name:l* to a query listing all terms that start with l (something like name:lou OR name:la OR name:...), I believe because this is meant to be faster.
As a workaround, you may use a constant-score prefix query (a ConstantScoreQuery wrapping a PrefixFilter) instead of a plain PrefixQuery:
// instead of new PrefixQuery(prefix)
new ConstantScoreQuery(new PrefixFilter(prefix));
However, this changes the scoring of documents (and hence the sorting, if you rely on score for sorting). As we needed score (and boost), we went for a solution that uses PrefixQuery when possible and falls back to the constant-score query where needed:
new PrefixQuery(prefix) {
public Query rewrite(final IndexReader reader) throws IOException {
try {
return super.rewrite(reader);
} catch (final TooManyClauses e) {
log.debug("falling back to ConstantScoreQuery for prefix " + prefix + " (" + e + ")");
final Query q = new ConstantScoreQuery(new PrefixFilter(prefix));
q.setBoost(getBoost());
return q;
}
}
};
(As an enhancement, one could use some kind of LRUMap to cache terms that failed before to avoid going through a costly rewrite again)
I can't help you with integrating this into Hibernate Search though. You might ask after you've switched to Compass ;)
I would like to query the Windows Vista Search service directly (or indirectly) from Java.
I know it is possible to query using the search-ms: protocol, but I would like to consume the result within the app.
I have found good information in the Windows Search API but none related to Java.
I would mark as accepted the answer that provides useful and definitive information on how to achieve this.
Thanks in advance.
EDIT
Does anyone have a JACOB sample, before I can mark this as accepted?
:)
You may want to look at one of the Java-COM integration technologies. I have personally worked with JACOB (JAva COm Bridge):
http://danadler.com/jacob/
Which was rather cumbersome (think working exclusively with reflection), but got the job done for me (quick proof of concept, accessing MapPoint from within Java).
The only other such technology I'm aware of is Jawin, but I don't have any personal experience with it:
http://jawinproject.sourceforge.net/
Update 04/26/2009:
Just for the heck of it, I did more research into Microsoft Windows Search, and found an easy way to integrate with it using OLE DB. Here's some code I wrote as a proof of concept:
public static void main(String[] args) {
DispatchPtr connection = null;
DispatchPtr results = null;
try {
Ole32.CoInitialize();
connection = new DispatchPtr("ADODB.Connection");
connection.invoke("Open",
"Provider=Search.CollatorDSO;" +
"Extended Properties='Application=Windows';");
results = (DispatchPtr)connection.invoke("Execute",
"select System.Title, System.Comment, System.ItemName, System.ItemUrl, System.FileExtension, System.ItemDate, System.MimeType " +
"from SystemIndex " +
"where contains('Foo')");
int count = 0;
while(!((Boolean)results.get("EOF")).booleanValue()) {
++ count;
DispatchPtr fields = (DispatchPtr)results.get("Fields");
int numFields = ((Integer)fields.get("Count")).intValue();
for (int i = 0; i < numFields; ++ i) {
DispatchPtr item =
(DispatchPtr)fields.get("Item", new Integer(i));
System.out.println(
item.get("Name") + ": " + item.get("Value"));
}
System.out.println();
results.invoke("MoveNext");
}
System.out.println("\nCount:" + count);
} catch (COMException e) {
e.printStackTrace();
} finally {
try {
results.invoke("Close");
} catch (COMException e) {
e.printStackTrace();
}
try {
connection.invoke("Close");
} catch (COMException e) {
e.printStackTrace();
}
try {
Ole32.CoUninitialize();
} catch (COMException e) {
e.printStackTrace();
}
}
}
To compile this, you'll need to make sure that the Jawin JAR is in your classpath, and that jawin.dll is in your path (or in the java.library.path system property). This code simply opens an ADO connection to the local Windows Desktop Search index, queries for documents with the keyword "Foo", and prints out a few key properties of the resulting documents.
Let me know if you have any questions, or need me to clarify anything.
Update 04/27/2009:
I tried implementing the same thing in JACOB as well, and will be doing some benchmarks to compare performance differences between the two. I may be doing something wrong in JACOB, but it seems to consistently be using 10x more memory. I'll be working on a jcom and com4j implementation as well, if I have some time, and try to figure out some quirks that I believe are due to the lack of thread safety somewhere. I may even try a JNI based solution. I expect to be done with everything in 6-8 weeks.
Update 04/28/2009:
This is just an update for those who've been following and are curious. It turns out there are no threading issues; I just needed to explicitly close my database resources, since the OLE DB connections are presumably pooled at the OS level (I probably should have closed the connections anyway...). I don't think there will be any further updates to this. Let me know if anyone runs into any problems with it.
Update 05/01/2009:
Added JACOB example per Oscar's request. This goes through the exact same sequence of calls from a COM perspective, just using JACOB. While it's true JACOB has been much more actively worked on in recent times, I also notice that it's quite a memory hog (uses 10x as much memory as the Jawin version)
public static void main(String[] args) {
Dispatch connection = null;
Dispatch results = null;
try {
connection = new Dispatch("ADODB.Connection");
Dispatch.call(connection, "Open",
"Provider=Search.CollatorDSO;Extended Properties='Application=Windows';");
results = Dispatch.call(connection, "Execute",
"select System.Title, System.Comment, System.ItemName, System.ItemUrl, System.FileExtension, System.ItemDate, System.MimeType " +
"from SystemIndex " +
"where contains('Foo')").toDispatch();
int count = 0;
while(!Dispatch.get(results, "EOF").getBoolean()) {
++ count;
Dispatch fields = Dispatch.get(results, "Fields").toDispatch();
int numFields = Dispatch.get(fields, "Count").getInt();
for (int i = 0; i < numFields; ++ i) {
Dispatch item =
Dispatch.call(fields, "Item", new Integer(i)).
toDispatch();
System.out.println(
Dispatch.get(item, "Name") + ": " +
Dispatch.get(item, "Value"));
}
System.out.println();
Dispatch.call(results, "MoveNext");
}
} finally {
try {
Dispatch.call(results, "Close");
} catch (JacobException e) {
e.printStackTrace();
}
try {
Dispatch.call(connection, "Close");
} catch (JacobException e) {
e.printStackTrace();
}
}
}
As a few posts here suggest, you can bridge between Java and .NET or COM using commercial or free frameworks like JACOB, JNBridge, J-Integra, etc.
Actually, I have had experience with one of these third-party frameworks (an expensive one :-) ) and I must say I will do my best to avoid repeating that mistake in the future. The reason is that it involves a lot of "voodoo" you can't really debug; it's very complicated to understand what the problem is when things go wrong.
The solution I would suggest is to create a simple .NET application that makes the actual calls to the Windows Search API. After doing so, you need to establish a communication channel between this component and your Java code. This can be done in various ways, for example by writing messages to a small DB that your application periodically polls, or by registering this component with the machine's IIS (if present) and exposing a simple web service API to communicate with it.
I know that it may sound cumbersome, but the clear advantages are: a) you communicate with the Windows Search API using a language it understands (.NET or COM), and b) you control all the application paths.
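For what it's worth, a hedged sketch of the Java side of such a channel, assuming the .NET component exposes a plain local HTTP endpoint that returns the search results as text (the port, path, and query parameter below are made up):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public final class SearchBridgeClient {

    // Sends the query to the local .NET search component and returns the raw response body.
    public static String search(String query) throws Exception {
        URL url = new URL("http://localhost:8765/search?q="
                + URLEncoder.encode(query, "UTF-8"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        StringBuilder response = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                response.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return response.toString();
    }
}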
Any reason why you couldn't just use Runtime.exec() to query via search-ms and read the BufferedReader with the result of the command? For example:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ExecTest {
    public static void main(String[] args) throws IOException {
        Process process = Runtime.getRuntime().exec("search-ms:query=microsoft&");
        BufferedReader output = new BufferedReader(new InputStreamReader(process.getInputStream()));
        StringBuffer outputSB = new StringBuffer(40000);
        String s = null;
        while ((s = output.readLine()) != null) {
            outputSB.append(s + "\n");
            System.out.println(s);
        }
        String result = outputSB.toString();
    }
}
There are several libraries out there for calling COM objects from Java; some are open source (but their learning curve is higher), some are closed source and have a quicker learning curve. A closed-source example is EZCom. The commercial ones tend to focus on calling Java from Windows as well, something I've never seen in the open-source ones.
In your case, what I would suggest is to front the call with your own .NET class (I would use C#, as that is closest to Java without getting into the controversial J#), and focus on the interoperability with that .NET DLL. That way the Windows programming gets easier, and the interface between Windows and Java is simpler.
If you are looking for how to use a Java COM library, MSDN is the wrong place. But MSDN will help you write what you need from within .NET; then look at the COM library's tutorials for invoking the one or two methods you need on your .NET objects.
EDIT:
Given the discussion in the answers about using a web service, you could (and probably will have better luck if you) build a small .NET app that calls an embedded Java web server, rather than trying to make .NET host the embedded web service and having Java be the consumer of the call. For an embedded web server, my research showed Winstone to be good: not the smallest, but much more flexible.
The way to get this to work is to launch the .NET app from Java, and have the .NET app call the web service on a timer or in a loop to see if there is a request; if there is, it processes it and sends the response.