How to parse a malformed xml (ofx) with ofx4j?

I am desperately trying to use the following library: ofx4j. But the documentation on parsing an OFX file is a bit light. It says: If you've got a file or other stream resource, you can read it using an instance of net.sf.ofx4j.io.OFXReader
OK, but how do I do that?
It also states the following: if you want to unmarshal the OFX directly to a Java object, use the net.sf.ofx4j.io.AggregateUnmarshaller.
Fine, but that's a bit complicated for me. Is there something obvious that I missed? When I try to use the unmarshaller, it asks me to implement an interface.
Could someone point me to an online resource explaining the bits that I am missing? Or, best of all, what do you understand from the previous statements about the OFXReader and the unmarshaller?
Please don't bash me; I am learning Java with the Play Framework and I would really appreciate being able to parse those OFX files.
Thanks in advance.

I don't see a plain old tutorial, but there's sample code in the test directory that illustrates OFXReader and AggregateUnmarshaller.
The phrase "an instance of net.sf.ofx4j.io.OFXReader" means one of the known implementing classes, such as NanoXMLOFXReader, which is tested here. A test for AggregateUnmarshaller is here.
The API and mail archives are good resources, too. It looks like a lot of institutions participate.

For those that stumble on this like I did when I couldn't get the expected results from the AggregateUnmarshaller... Here is an example.
// Using a multipart file, but using a regular file is similar.
public void parse(MultipartFile file) throws IOException {
    // Use ResponseEnvelope to start.
    AggregateUnmarshaller<ResponseEnvelope> unmarshaller = new AggregateUnmarshaller<ResponseEnvelope>(
            ResponseEnvelope.class);
    try {
        ResponseEnvelope envelope = unmarshaller.unmarshal(file.getInputStream());
        // Assume we are just interested in the credit card info. Make sure to cast.
        CreditCardResponseMessageSet messageSet = (CreditCardResponseMessageSet) envelope
                .getMessageSet(MessageSetType.creditcard);
        List<CreditCardStatementResponseTransaction> responses = messageSet.getStatementResponses();
        for (CreditCardStatementResponseTransaction response : responses) {
            CreditCardStatementResponse message = response.getMessage();
            String currencyCode = message.getCurrencyCode();
            List<Transaction> transactions = message.getTransactionList().getTransactions();
            for (Transaction transaction : transactions) {
                System.out.println(transaction.getName() + " " + transaction.getAmount() + " "
                        + currencyCode);
            }
        }
    }
    catch (OFXParseException e) {
        e.printStackTrace();
    }
}

Related

Issues With JAX-RS

I'm new to JAX-RS and having a number of issues (which oddly make me miss SOAP). Here is a snippet of my code. The getMergedPDFReport method should take a file and return a file after some processing. After which I would worry about the client
@GET
@Produces("application/pdf")
@Path("merge-service")
public Response getMergedPDFReport(@QueryParam(ApiParameters.WORD_DOCUMENT) File wordDocument,
        @QueryParam(ApiParameters.MERGE_FIELDS) Object[] fieldNames,
        @QueryParam(ApiParameters.MERGE_VALUES) Object[] fieldValues) {
    ResponseBuilder builder = null;
    try {
        File product = DocumentUtil.generatePDF(wordDocument, fieldNames, fieldValues);
        builder = Response.ok(product);
        builder.header("Content-Disposition", "attachment; filename=\"report.pdf\"");
    } catch (Exception e) {
        e.printStackTrace();
    }
    return builder.build();
}
1. I get a warning in my server log that says "No injection source found for a parameter of type public javax.ws.rs.core.Response". I can't figure out why.
2. Am I using the @QueryParam annotation right? Should I be using it for File types and arrays? I saw a lot of debates online over @BeanParam, @MatrixParam and @QueryParam. Since I didn't know what the first two do, I decided to Keep It Simple.
Any help would be appreciated.

I don't think you can use @QueryParam for files. You must use @Consumes with a multipart form.
Check this :
http://www.javatpoint.com/jax-rs-file-upload-example

Parsing a string with multiple variations

I am curious whether there is a better way to deal with this. I wanted to challenge myself and see if I could break up, into a HashMap of key, value (or String, String), a string that could come back in almost any format.
The string in question is:
/user/2/update?updates=success
That's right, a URL request for a server. The issue, as we all know, is that this could be anything; it could come back in any form. I wanted to break it up so that it would look like:
Controller => user
action => update
params => ??? (there's a 2, an updates=success ... )
Obviously the above is not a real Java object.
But you get the idea.
What do you need? What have you done? What are you trying to do?
What I want to do is map this to a controller and action while passing in the parameters along the way. But I need to separate this out, making sure to specify at each step what is what.
What I have done is:
private Filter parseRoute(String route){
    String[] parsedRoute = route.split("[?:/=]");
    Filter filter = new Filter(parsedRoute);
    return filter;
}
This splits on any delimiter that can appear in the URL (note: a : would be something like /user:id/update,
so: user/2/update ... )
I then attempted to do:
public class Filter {
    // Initialized here; otherwise put() below throws a NullPointerException.
    private final HashMap<String, String> filterInfo = new HashMap<String, String>();
    public Filter(String[] filteredRoute){
        if(filteredRoute.length > 0){
            filterInfo.put("Controller", filteredRoute[0]);
        }else{
            throw new RoutingException("routes must not be empty.");
        }
    }
}
But this is not going to work as I expected it to, as there are too many variables at play,
including parameters before the action (those would just be used to search for that user); there could also be nested routes, so multiple controller/action/controller/action ...
How would you deal with this? What would you suggest? How could you get around this? Should you just do something like:
route(controller, action, params, template); ? (template lets you render a JSP). If so, how do you deal with the ?updates=success
I am using HttpServer to set up the basics. But I am now lost. I am trying to keep routing as generic as possible ("do whatever you want, we will map it to the right controller and action and pass in the parameters"), but I think I bit off more than I can chew.
I have looked at both Spark and the Spring Framework, and decided that we will map the route you pass to an XML file to find the controller and action; I just need the data structure in place to do that ...
So I am looking to back up and still go with "pass me something, I'll map it out."

I would probably use the URL class from Apache Tomcat:
org.apache.tomcat.util.net.URL url = null;
try {
    url = new org.apache.tomcat.util.net.URL("/user/2/update?updates=success");
    // ... do some stuff with it...
} catch (Exception e) {
    e.printStackTrace();
}

java.net.URI may help you.
You can get the path with getPath()
and the whole query string with getQuery(); then you can split the query on & into pairs, and each pair on = into a name and a value.
URI uri = new URI("/user/2/update?updates=success");
// /user/2/update
System.out.println("path is " + uri.getPath());
// updates=success
System.out.println("query is " + uri.getQuery());
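Building on getPath() and getQuery(), here is a minimal sketch of how the controller / action / params breakdown from the question could look. The RouteSketch and Route names are hypothetical, and the sketch assumes a convention that is not stated in the question: the first path segment is the controller, the last one is the action, and anything in between is a positional parameter. Nested routes would need a different rule.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RouteSketch {

    /** Hypothetical value holder; not the Filter class from the question. */
    public static final class Route {
        public final String controller;
        public final String action;
        public final List<String> pathParams;
        public final Map<String, String> queryParams;

        Route(String controller, String action,
              List<String> pathParams, Map<String, String> queryParams) {
            this.controller = controller;
            this.action = action;
            this.pathParams = pathParams;
            this.queryParams = queryParams;
        }
    }

    public static Route parse(String route) {
        URI uri = URI.create(route);
        String path = uri.getPath() == null ? "" : uri.getPath();

        // Collect the non-empty path segments, e.g. [user, 2, update].
        List<String> segments = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                segments.add(segment);
            }
        }
        if (segments.isEmpty()) {
            throw new IllegalArgumentException("routes must not be empty.");
        }

        // Assumed convention: first segment = controller, last = action,
        // everything in between = positional parameters.
        int last = segments.size() - 1;
        String controller = segments.get(0);
        String action = last > 0 ? segments.get(last) : "";
        List<String> pathParams = new ArrayList<>();
        if (last > 1) {
            pathParams.addAll(segments.subList(1, last));
        }

        // Split the query on & into pairs, and each pair on the first =.
        Map<String, String> queryParams = new LinkedHashMap<>();
        String query = uri.getQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    queryParams.put(pair, "");
                } else {
                    queryParams.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return new Route(controller, action, pathParams, queryParams);
    }
}
```

Keeping path parameters and query parameters in separate collections avoids guessing which bare value belongs to which key, which is where the single split("[?:/=]") approach loses information.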

Create a database / execute a bunch of mysql statements from Java

I have a library that needs to create a schema in MySQL from Java. Currently, I have a dump of the schema that I just pipe into the mysql command. This works okay, but it is not ideal because:
It's brittle: the mysql command needs to be on the path: usually doesn't work on OSX or Windows without additional configuration.
Also brittle because the schema is stored as statements, not descriptively
Java already can access the mysql database, so it seems silly to depend on an external program to do this.
Does anyone know of a better way to do this? Perhaps...
I can read the statements in from the file and execute them directly from Java? Is there a way to do this that doesn't involve parsing semicolons and dividing up the statements manually?
I can store the schema in some other way - either as a config file or directly in Java, not as statements (in the style of rails' db:schema or database.yml) and there is a library that will create the schema from this description?
Here is a snippet of the existing code, which works (when mysql is on the command line):
if( db == null ) throw new Exception ("Need database name!");
String userStr = user == null ? "" : String.format("-u %s ", user);
String hostStr = host == null ? "" : String.format("-h %s ", host);
String pwStr = pw == null ? "" : String.format("-p%s ", pw);
String cmd = String.format("mysql %s %s %s %s", hostStr, userStr, pwStr, db);
System.out.println(cmd + " < schema.sql");
final Process pr = Runtime.getRuntime().exec(cmd);
new Thread() {
    public void run() {
        try (OutputStream stdin = pr.getOutputStream()) {
            Files.copy(f, stdin);
        }
        catch (IOException e) { e.printStackTrace(); }
    }
}.start();
new Thread() {
    public void run() {
        try (InputStream stdout = pr.getInputStream() ) {
            ByteStreams.copy(stdout, System.out);
        }
        catch (IOException e) { e.printStackTrace(); }
    }
}.start();
int exitVal = pr.waitFor();
if( exitVal == 0 )
    System.out.println("Create db succeeded!");
else
    System.out.println("Exited with error code " + exitVal);

The short answer (as far as I know) is no.
You will have to do some parsing of the file into separate statements.
I have faced the same situation, and you can find many questions on this topic here on SO.
Some, like here, show a parser. Others point to tools, like this post from Apache, that can convert the schema to an XML format and then read it back.
My main intention in writing this answer is to say that I chose to use the command line in the end.
Extra configuration: it may be additional work, but you can do it by config or at runtime based on the system you are running on. You make the effort once and you are done.
Depending on an external tool: it is not as bad as it seems. You get some benefits too.
1- You don't need to write extra code or introduce additional libraries just for parsing the schema commands.
2- The tool is provided by the vendor. It is probably more debugged and tested than any other code that will do the parsing.
3- It is safer in the long run. Any additions or changes in the format of the dump that "might" break the parser will most probably be supported by the tool that comes with the database release. You won't need to make any change in your code.
4- The nature of the action where you are going to use the tool (creating a schema) does not suggest frequent usage, minimizing the risk of it becoming a performance bottleneck.
I hope you can find the best solution for your needs.
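For anyone who still wants to try the "parse it yourself" route, here is a rough sketch of what splitting a script into statements involves. SqlScriptSplitter is a hypothetical name, and this is a deliberately naive splitter: it only understands quoted strings and backtick identifiers, and it assumes the dump contains no DELIMITER directives, stored routines, or comments with semicolons in them. That list of caveats is exactly why the vendor tool is attractive.

```java
import java.util.ArrayList;
import java.util.List;

public class SqlScriptSplitter {

    /** Splits a script on semicolons that are not inside quotes or backticks. */
    public static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside a quoted region"

        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (quote != 0) {
                current.append(c);
                if (c == '\\' && quote != '`' && i + 1 < script.length()) {
                    // Keep a backslash-escaped character as-is (e.g. \' inside a string).
                    current.append(script.charAt(++i));
                } else if (c == quote) {
                    quote = 0;
                }
            } else if (c == '\'' || c == '"' || c == '`') {
                quote = c;
                current.append(c);
            } else if (c == ';') {
                addIfNotBlank(statements, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // A trailing statement without a final semicolon still counts.
        addIfNotBlank(statements, current);
        return statements;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String statement = sb.toString().trim();
        if (!statement.isEmpty()) {
            out.add(statement);
        }
    }
}
```

Each returned string could then be passed to java.sql.Statement.execute(String) on a connection you already hold, one statement at a time.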

Check out Yank, and more specifically the code examples linked to on that page. It's a lightweight persistence layer built on top of DBUtils, and hides all the nitty-gritty details of handling connections and result sets. You can also easily load a config file like you mentioned. You can also store and load SQL statements from a properties file and/or hard code the SQL statements in your code.

PyYaml to SnakeYaml --- AWT-EventQueue-0" Can't construct a java object for tag:yaml.org,2002:java/object:

I am passing Yaml created with PyYaml to SnakeYaml and Snakeyaml does not seem to recognize anything beyond the first line where !! exists and python/object is declared. I already have identical objects setup in Java. Is there an example out there that shows a loadAll into an object array where the object type is asserted or assigned?
Good call... was away from the computer when I originally posted.
Here is the data from PyYaml that I am trying to use SnakeYaml to get into a Java application:
--- !!python/object:dbmethods.Project.Project {dblogin: kirtstrim7900, dbname: 92218kirtstrim_wfrogls,dbpw: 1234567895#froggy, preference1: '', preference2: '', preference3: '', projName: CheckPoint Firewall Audit - imp, projNo: 1295789430544+CheckPoint Firewall Audit - imp, projectowner: kirtcathey#sysrisk.com,result1label: Evidence, result2label: Recommend, result3label: Report, resultlabel: Response,role: owner, workstep1label: Objective, workstep2label: Policy, workstep3label: Guidance,worksteplabel: Procedure}
Not just a single instance of the above, but several objects, so need to use loadAll in SnakeYaml.... unless somebody knows better.
As for the code, this is all I have from SnakeYaml docs:
for (Object data : yaml.loadAll(sb.toString())) {
System.out.println(data.toString());
}
Then, this error is thrown:
Exception in thread "AWT-EventQueue-0" Can't construct a java object for tag:yaml.org,2002:java/object: ......
Caused by: org.yaml.snakeyaml.error.YAMLException: Class not found: ......
As you can see from the small code snippet, EVEN without all this information supplied, anybody who knows the answer about how to cast an object arbitrarily could PROBABLY answer the question.
Thx.
Parsed off the two exclamation points (!!) at the beginning of each entry and now I get:
mapping values are not allowed here
in "", line 1, column 73:
as an error. The whole point of using YAML was to reduce coding related to parsing. If I have to turn around and parse incoming and outgoing code for whatever reason, then YAML sucks!! And I will gladly revert back to XML or anything else that will allow a Python middleware to talk to a Java application.

To achieve the same result you may:
configure PyYAML to skip the tag (exactly as you did with the comment "Convert objects to a dictionary of their representation")
configure SnakeYAML to create the object you expect (exactly as you did with "projectData = gson.fromJson(mystr, ProjectData[].class); ")
If you are lost (before you say "it sucks") you may ask a question in the corresponding mailing lists. It may help you to find a proper solution in the future.

Fixed. YAML sucks, so don't use it. All kinds of Google results about how SnakeYAML is derived from PyYaml and what-not, but nobody clearly states exactly what dumps format from PyYaml works with what loadAll routines with SnakeYAML.
Also, performance with YAML is horrid; JSON is far simpler and easier to implement. In Python, where our middleware resides (and most crunching occurs), YAML takes almost twice as long to process as JSON!!
If you are using Python 2.6 or greater, just
import json
def convert_to_builtin_type(obj):
    print('default(%r)' % (obj,))
    # Convert objects to a dictionary of their representation
    d = {'__class__': obj.__class__.__name__,
         '__module__': obj.__module__,
         }
    d.update(obj.__dict__)
    return d
# projects is the list of Project objects built elsewhere in the middleware
json_doc = json.dumps(projects, default=convert_to_builtin_type)
print(json_doc)
Then on the Java client (loading) side, use GSon -- this took a lot of head-scratching and searches to figure out because ALL examples on the 'net are virtually useless. Every blogger with 500 ads per page shows you how to convert one single, stupid object and last time I created an app, I used lists, arrays, or anything that held more than one object!!
try {
    serverAddress = new URL("http://127.0.0.1:5000/projects/" + ruser.getUserEmail()+"+++++"+ruser.getUserHash());
    //set up out communications stuff
    connection = null;
    //Set up the initial connection
    connection = (HttpURLConnection)serverAddress.openConnection();
    connection.setRequestMethod("GET");
    connection.setDoOutput(true);
    connection.setReadTimeout(100000);
    connection.connect();
    //get the output stream writer and write the output to the server
    //not needed in this example
    rd = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    sb = new StringBuilder();
    while ((line = rd.readLine()) != null)
    {
        sb.append(line + '\n');
    }
    String mystr = sb.toString();
    // Now do the magic.
    Gson gson = new Gson();
    projectData = gson.fromJson(mystr, ProjectData[].class);
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (ProtocolException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
finally
{
    //close the connection, set all objects to null
    //(connection is still null if openConnection() failed)
    if (connection != null) {
        connection.disconnect();
    }
    rd = null;
    sb = null;
    connection = null;
}
return projectData;
Done! In a nutshell - YAML sucks and use JSON!! Also, the http connection code is mostly snipped off of this site...now I need to figure out https.

Is Scala/Java not respecting w3 "excess dtd traffic" specs?

I'm new to Scala, so I may be off base on this; I want to know if the problem is my code. Given the Scala file httpparse, simplified to:
object Http {
  import java.io.InputStream;
  import java.net.URL;
  def request(urlString:String): (Boolean, InputStream) =
    try {
      val url = new URL(urlString)
      val body = url.openStream
      (true, body)
    }
    catch {
      case ex:Exception => (false, null)
    }
}
object HTTPParse extends Application {
  import scala.xml._;
  import java.net._;
  def fetchAndParseURL(URL:String) = {
    val (true, body) = Http request(URL)
    val xml = XML.load(body) // <-- Error happens here in .load() method
    "True"
  }
}
Which is run with (URL doesn't matter, this is a joke example):
scala> HTTPParse.fetchAndParseURL("http://stackoverflow.com")
The result is invariably:
java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/html4/strict.dtd
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1187)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:973)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEnti...
I've seen the Stack Overflow thread on this with respect to Java, as well as the W3C's System Team Blog entry about not trying to access this DTD via the web. I've also isolated the error to the XML.load() method, which is a Scala library method as far as I can tell.
My Question: How can I fix this? Is this something that is a by product of my code (cribbed from Raphael Ferreira's post), a by product of something Java specific that I need to address as in the previous thread, or something that is Scala specific? Where is this call happening, and is it a bug or a feature? ("Is it me? It's her, right?")

I've bumped into the SAME issue, and I haven't found an elegant solution (I'm thinking of posting the question to the Scala mailing list). Meanwhile, I found a workaround: implement your own SAXParserFactoryImpl so you can set the parser features that turn off validation and external DTD loading (for example "http://apache.org/xml/features/nonvalidating/load-external-dtd"). The good thing is it doesn't require any code change to the Scala code base (I agree that it should be fixed, though).
First I'm extending the default parser factory:
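package mypackage;
import com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
public class MyXMLParserFactory extends SAXParserFactoryImpl {
    public MyXMLParserFactory() throws SAXNotRecognizedException, SAXNotSupportedException, ParserConfigurationException {
        super();
        // Turn off validation and keep the parser from fetching DTDs.
        super.setFeature("http://xml.org/sax/features/validation", false);
        super.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false);
        super.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        super.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    }
}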
Nothing special, I just want the chance to set the features.
(Note that this is plain Java code; most probably you can write the same in Scala too.)
And in your Scala code, you need to configure the JVM to use your new factory:
System.setProperty("javax.xml.parsers.SAXParserFactory", "mypackage.MyXMLParserFactory");
Then you can call XML.load without validation.

Without addressing the problem for now, what do you expect to happen if the function request returns false below?
def fetchAndParseURL(URL:String) = {
  val (true, body) = Http request(URL)
What will happen is that an exception will be thrown. You could rewrite it this way, though:
def fetchAndParseURL(URL:String) = (Http request(URL)) match {
  case (true, body) =>
    val xml = XML.load(body)
    "True"
  case _ => "False"
}
Now, to fix the XML parsing problem, we'll disable DTD loading in the parser, as suggested by others:
def fetchAndParseURL(URL:String) = (Http request(URL)) match {
  case (true, body) =>
    val f = javax.xml.parsers.SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    val MyXML = XML.withSAXParser(f.newSAXParser())
    val xml = MyXML.load(body)
    "True"
  case _ => "False"
}
Now, I put that MyXML stuff inside fetchAndParseURL just to keep the structure of the example as unchanged as possible. For actual use, I'd separate it in a top-level object, and make "parser" into a def instead of val, to avoid problems with mutable parsers:
import scala.xml.Elem
import scala.xml.factory.XMLLoader
import javax.xml.parsers.SAXParser
object MyXML extends XMLLoader[Elem] {
  override def parser: SAXParser = {
    val f = javax.xml.parsers.SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    f.newSAXParser()
  }
}
Import the package it is defined in, and you are good to go.

This is a Scala problem. Native Java has an option to disable loading the DTD:
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
There is no equivalent in Scala.
If you want to fix it yourself, check scala/xml/parsing/FactoryAdapter.scala and put the line in
def loadXML(source: InputSource): Node = {
  // create parser
  val parser: SAXParser = try {
    val f = SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    // <-- insert here
    f.newSAXParser()
  } catch {
    case e: Exception =>
      Console.err.println("error: Unable to instantiate parser")
      throw e
  }

GClaramunt's solution worked wonders for me. My Scala conversion is as follows:
package mypackage
import org.xml.sax.{SAXNotRecognizedException, SAXNotSupportedException}
import com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
import javax.xml.parsers.ParserConfigurationException
@throws(classOf[SAXNotRecognizedException])
@throws(classOf[SAXNotSupportedException])
@throws(classOf[ParserConfigurationException])
class MyXMLParserFactory extends SAXParserFactoryImpl() {
  super.setFeature("http://xml.org/sax/features/validation", false)
  super.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
  super.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false)
  super.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
}
As mentioned in the original post, it is necessary to place the following line in your code somewhere:
System.setProperty("javax.xml.parsers.SAXParserFactory", "mypackage.MyXMLParserFactory")

It works. After some detective work, the details as best I can figure them:
Trying to parse a developmental RESTful interface, I build the parser and get the above (rather, a similar) error. I try various parameters to change the XML output, but get the same error. I try to connect to an XML document I quickly whip up (cribbed stupidly from the interface itself) and get the same error. Then I try to connect to anything, just for kicks, and get the same (again, likely only similar) error.
I started questioning whether it was an error with the sources or the program, so I started searching around, and it looks like an ongoing issue- with many Google and SO hits on the same topic. This, unfortunately, made me focus on the upstream (language) aspects of the error, rather than troubleshoot more downstream at the sources themselves.
Fast forward and the parser suddenly works on the original XML output. I confirmed that some additional work had been done server side (just a crazy coincidence?). I don't have either earlier XML but suspect that it is related to the document identifiers being changed.
Now, the parser works fine on the RESTful interface, as well as any well-formatted XML I can throw at it. It also fails on all XHTML DTDs I've tried (e.g. www.w3.org). This is contrary to what @SeanReilly expects, but seems to jibe with what the W3C states.
I'm still new to Scala, so I can't determine if I have a special or typical case. Nor can I be assured that this problem won't recur for me in another form down the line. It does seem that pulling XHTML will continue to cause this error unless one uses a solution similar to those suggested by @GClaramunt and @J-16 SDiZ. I'm not really qualified to know if this is a problem with the language or with my implementation of a solution (likely the latter).
For the immediate timeframe, I suspect that the best solution would've been for me to ensure that it was possible to parse that XML source, rather than see that others have had the same error and assume there was a functional problem with the language.
Hope this helps others.

There are two problems with what you are trying to do:
Scala's xml parser is trying to physically retrieve the DTD when it shouldn't. J-16 SDiZ seems to have some advice for this problem.
The Stack Overflow page you are trying to parse isn't XML. It's HTML 4 Strict.
The second problem isn't really possible to fix in your scala code. Even once you get around the dtd problem, you'll find that the source just isn't valid XML (empty tags aren't closed properly, for example).
You have to either parse the page with something besides an XML parser, or investigate using a utility like tidy to convert the html to xml.

My knowledge of Scala is pretty poor, but couldn't you use ConstructingParser instead?
val xml = new java.io.File("xmlWithDtd.xml")
val parser = scala.xml.parsing.ConstructingParser.fromFile(xml, true)
val doc = parser.document()
println(doc.docElem)

For scala 2.7.7 I managed to do this with scala.xml.parsing.XhtmlParser

Setting Xerces switches only works if you are using Xerces. An entity resolver works for any JAXP parser.
There are more generalized entity resolvers out there, but this implementation does the trick when all I'm trying to do is parse valid XHTML.
http://code.google.com/p/java-xhtml-cache-dtds-entityresolver/
Shows how trivial it is to cache the DTDs and forgo the network traffic.
In any case, this is how I fix it. I always forget. I always get the error. I always go fetch this entity resolver. Then I'm back in business.
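For reference, the entity resolver idea in its most stripped-down form looks roughly like this. It is a sketch using only standard JAXP/SAX classes; NoDtdFetch and rootElementOf are hypothetical names, and instead of caching the real DTDs (as the linked project does) it resolves every external entity to empty content. That is enough to stop the network request, but a document that relies on DTD-defined entities such as &nbsp; will then fail to parse, so treat it as an assumption-laden shortcut.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NoDtdFetch {

    /** Resolves every external entity locally and remembers the root element name. */
    static final class Handler extends DefaultHandler {
        String root;

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Hand the parser an empty DTD instead of letting it open systemId.
            return new InputSource(new StringReader(""));
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (root == null) {
                root = qName;
            }
        }
    }

    /** Parses the given XML text and returns the name of its root element. */
    public static String rootElementOf(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            Handler handler = new Handler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```

Because the resolver is part of the SAX contract rather than a Xerces feature switch, the same handler should work with whichever JAXP parser the JVM picks up.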
