Namespace prefix rewriting for XML canonicalization in Java? - java

I'm trying to 1) compute the digital signature for an XML string, 2) unmarshal the XML string to a Java object, 3) marshal the object back to an XML string, and 4) recompute the signature and verify it against the signature from step 1.
The problem is that the namespace prefixes usually get changed during the round trip (steps 2-3), so I need a way to standardize them before and after the round trip. Otherwise, the digital signatures (steps 1 and 4) obviously won't match.
I figured I need something like PrefixRewrite="sequential" in section 2.5.4 of https://www.w3.org/TR/xml-c14n2/Overview_diff.html#sec-Example-PrefixRewriteSeq. I found a Python library that supposedly does that (https://github.com/dept2/c14n2py), but I can't seem to find a Java library with that option (org.apache.xml.security.c14n.Canonicalizer doesn't have it). I've also been able to hard-code the namespace prefixes in my marshaller, but that's not an acceptable solution for me.
Can anybody recommend a Java library for XML canonicalization with the PrefixRewrite="sequential" option?
Thanks!!

The Python library c14n2py was ported from the sources of a Java library that lives right next to it on GitHub: https://github.com/dept2/c14n2
You could try using that.
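To illustrate what prefix rewriting buys you in the round trip described in the question, here is a minimal sketch. It is in Python, not Java, and uses the standard library's C14N 2.0 implementation (xml.etree.ElementTree.canonicalize, Python 3.8+) with its rewrite_prefixes option; it is not the API of the dept2/c14n2 Java library. The two sample documents and the urn:example:orders namespace are made up, and a plain SHA-256 digest stands in for the real signature.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Hypothetical documents: same content, different prefixes, as a
# marshaller might produce after an unmarshal/marshal round trip.
before = '<a:order xmlns:a="urn:example:orders"><a:id>42</a:id></a:order>'
after = '<ns2:order xmlns:ns2="urn:example:orders"><ns2:id>42</ns2:id></ns2:order>'

def digest(xml_text, rewrite):
    """SHA-256 over the C14N 2.0 form, optionally with rewritten prefixes."""
    canonical = canonicalize(xml_text, rewrite_prefixes=rewrite)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Without prefix rewriting the canonical forms still differ in their prefixes.
plain_match = digest(before, False) == digest(after, False)
# With prefix rewriting both documents get the same generated prefixes.
rewritten_match = digest(before, True) == digest(after, True)
print(plain_match, rewritten_match)
```

The point is only that the digests agree once both sides are canonicalized with prefix rewriting; whether a given Java library behaves identically has to be checked against that library.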

Related

Load a Perl Hash into Java

I have a big .pm file, which consists only of a very big Perl hash with lots of subhashes. I have to load this hash into a Java program, do some work and make changes to the data in it, and save it back into a .pm file, which should look similar to the one I started with.
So far, I have tried to convert it line by line using regexes and string matching, turning it into an XML document and later parsing it element by element back into a Perl hash.
This somehow works, but seems quite dodgy. Is there a more reliable way to parse the Perl hash without having a Perl runtime installed?
You're quite right, it's utterly filthy. Using regexes and string matching to produce XML is a horrible idea in the first place, and honestly XML is probably not a good fit for this anyway.
I would suggest that you consider JSON. I would be stunned if Java couldn't handle JSON, and it's inherently a hash-and-array oriented data structure.
So you can quite literally:
use JSON;
print to_json ( $data_structure, { pretty => 1 } );
Note: it won't work for serialising objects, but for Perl hash/array/scalar structures it'll work just fine.
You can then import it back into Perl using:
use Data::Dumper;
my $new_data = from_json $string;
print Dumper $new_data;
You could Dumper it to a file, but given that your requirement is multi-language going forward, just using native JSON as your 'at rest' data is probably a more sensible choice.
But if you're looking at parsing Perl code within Java, without a Perl interpreter? No, that's just insanity.
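Once the data has been exported as JSON, the non-Perl side of the round trip is a plain load, modify, dump cycle. A minimal sketch, written in Python for brevity (a Java JSON library would follow the same three steps); the keys and values in the sample are invented for illustration:

```python
import json

# Hypothetical export, roughly what to_json($data_structure, { pretty => 1 })
# might print for a hash of subhashes.
exported = """
{
   "servers" : {
      "alpha" : { "port" : 8080, "tags" : ["web", "prod"] },
      "beta" : { "port" : 8081, "tags" : [] }
   }
}
"""

# Nested dicts and lists mirror the Perl hashes and arrays.
data = json.loads(exported)

# Do some work on the data.
data["servers"]["beta"]["tags"].append("staging")
data["servers"]["gamma"] = {"port": 8082, "tags": []}

# sort_keys keeps the output stable, so diffs against the old file stay readable.
updated = json.dumps(data, indent=3, sort_keys=True)
print(updated)
```

The resulting string can be read back on the Perl side with from_json.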

How to parse freedict files (*.dict and *.index)

I was searching for free translation dictionaries. Freedict (freedict.org) provides the ones I need, but I don't know how to parse the *.index and *.dict files. I also don't really know what to google to find useful information about these formats.
The *.index files look like this:
00databasealphabet QdGI l
00databasedictfmt1121 B b
00databaseinfo c 5o
00databaseshort 6E u
00databaseurl 6y c
00databaseutf8 A B
a BHO M
a bad risc BHa u
a bag of nerves BII 2
[...]
and the *.dict files:
[Lot of info stuff]
German-English FreeDict Dictionary ver. 0.3.4
Pipi machen /piːpiːmaxən/
to pee; to piss
(Aktien) zusammenlegen /aktsiːəntsuːzamənleːgən/
to merge (with)
[...]
I would be glad to see some example projects (preferably in Python, but Java, C and C++ are also OK) to understand how to handle these files.
This answer comes late, but I hope it can be useful for others like me.
jgoerzen wrote a library called dictdlib. Its source shows in more detail how the .index and .dict files are parsed:
https://github.com/jgoerzen/dictdlib/blob/master/dictdlib.py
dictd considers the format of its .index and .dict[.dz] files private, reserving the right to change it in the future.
If you want to process them directly anyway: the index contains the headwords and the .dict[.dz] file contains the definitions. The .dict file is optionally compressed with dictzip, a gzip-compatible variant that provides almost random access, which plain gzip does not. The index contains 3 tab-separated columns per line:
The headword for looking up the definition.
The absolute byte position of the definition in the (uncompressed) .dict file, base64 encoded.
The length of the definition in bytes, base64 encoded.
Here "base64 encoded" means the number is written in base 64, one digit per character over the alphabet A-Z, a-z, 0-9, +, /; it is not standard Base64 of a byte string. For example, the index line "00databaseinfo c 5o" from the question points at byte 28 with a length of 3688 bytes.
For more details see the dictd(8) man page (section Database Format), which you should have found in your research before asking your question. To process the headwords correctly, you'd also have to consider encoding and character collation.
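The three-column index layout described above can be sketched as follows. This is a minimal illustration, assuming an uncompressed, UTF-8 encoded .dict file and the usual A-Z, a-z, 0-9, +, / digit alphabet; the function names are my own, and the in-memory file with a leading header byte is a made-up stand-in for a real .dict file.

```python
import io

# Digit alphabet for dictd's base-64 numbers: 'A' is 0, 'B' is 1, ... '/' is 63.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
DIGIT = {ch: value for value, ch in enumerate(ALPHABET)}

def decode_number(text):
    """Decode a base-64 number, most significant digit first."""
    value = 0
    for ch in text:
        value = value * 64 + DIGIT[ch]
    return value

def parse_index_line(line):
    """Split one index line into (headword, byte offset, byte length)."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, decode_number(offset), decode_number(length)

def read_definition(dict_file, offset, length):
    """Read one definition from a binary file object over the .dict data."""
    dict_file.seek(offset)
    return dict_file.read(length).decode("utf-8")

# Stand-in for a .dict file: one header byte, then a single 28-byte entry.
fake_dict = io.BytesIO(b"\nPipi machen\nto pee; to piss\n")
entry = parse_index_line("Pipi machen\tB\tc\n")
definition = read_definition(fake_dict, entry[1], entry[2])
print(entry, repr(definition))
```

For a real database, open the .dict file in binary mode ("rb") and pass the file object to read_definition; a .dict.dz file has to be decompressed first.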
Ultimately it would probably be better to use an existing library to read dictd databases. But that really depends on whether the library is good (no experience here).
Finally, as you noted yourself, XML is made exactly for easy processing. You could extract the headwords and translations using XPath, leaving out all the grammatical stuff, with no need to bother parsing anything.
After getting this far, the next problem would be that there is no one-to-one mapping between words in different languages...

Whirlpool hash in java and in python give different results

I have two projects, panager and panager-android. Both use the Whirlpool hash algorithm, but with the same data panager gives different results than panager-android.
panager is written in Python and panager-android (guess) in Java.
I'm ultra-new to Java so take it easy :P
In Python I use a module that I found on the net (whirlpool.py) and in Java I use the Jacksum library.
There are different versions of the Whirlpool spec which generate different output for the same input. It looks like whirlpool.py might be implementing the original Whirlpool (referred to as "Whirlpool-0"), whereas in panager-android you use Whirlpool-2:
AbstractChecksum encode = JacksumAPI.getChecksumInstance("whirlpool2");
Try changing that to "whirlpool0" and see if it matches your Python implementation now. Failing that, try "whirlpool1".
Wikipedia lists known Whirlpool hashes from each version for a given test input, which you can use to identify which version an implementation follows, or to find out whether it's just entirely wrong and broken.

XSD: Index of sequence in Element name

I'm building an XSD to generate JAXB objects in Java. Then I ran into this:
<TotalBugs>
<Bug1>...</Bug1>
<Bug2>...</Bug2>
...
<BugN>...</BugN>
</TotalBugs>
How do I build a sequence of elements where the index of the sequence is in the element name? Specifically, how do I get the 1 in Bug1?
You don't want to do it this way. XML elements are ordered top to bottom by nature, so you don't have to number them yourself:
<totalBugs>
<bug><!-- Here comes 1st bug --></bug>
<bug><!-- Here comes 2nd bug --></bug>
...
<bug><!-- Here comes last bug --></bug>
</totalBugs>
You can access the 1st bug node in the list by the XPath expression:
/totalBugs/bug[1]
Note that, per the W3C XPath standard, indexes start at 1. See w3schools for further reading.
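A small sketch of the positional lookup, using Python's xml.etree.ElementTree, which supports a limited XPath subset including [position] and [last()] predicates. The bug texts are placeholders invented for the example, and the paths are relative to the totalBugs root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document following the repeated-element layout above.
doc = """
<totalBugs>
  <bug>first</bug>
  <bug>second</bug>
  <bug>third</bug>
</totalBugs>
""".strip()

root = ET.fromstring(doc)

# Positions in the predicate are 1-based, so bug[1] is the first bug.
first = root.find("bug[1]").text
last = root.find("bug[last()]").text

# If a number is needed for display, derive it from document order.
numbered = [(index, bug.text) for index, bug in enumerate(root.findall("bug"), start=1)]
print(first, last, numbered)
```

The number is derived from the element's position rather than stored in its name, so the schema only needs a single repeating bug element.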
I'm pretty sure XSD won't support what you need. However, you can use <xsd:any> for that bit of the schema, then use something lower-level than JAXB to generate the XML for that particular part. (I think your generated classes will have a field like protected List<Element> any; which you can fill in using DOM.)

Porting XML parsing from Java to Objective C

I am trying to port code written in Java to Objective-C (for iPhone), but I'm confused about a few lines of my code (shown below). How should I port this efficiently?
Namespace nmgrhistory=Namespace.getNamespace("history", "http://www.mywebsite.com/History.xsd");
pEventEl.addContent(new Element("History",nmgrhistory));
Namespace nmgrState=Namespace.getNamespace("state", "http://www.mywebsite.com/State.xsd");
pEventEl.addContent(new Element("State",nmgrState));
Iterator<Element> eld=(Iterator<Element>) pEventEl.getChild(
pEventEl.getName() == "event"? "./history:history/state:state" : "./state:state",pEventEl.getNamespace());
I'm not sure what to use in place of the classes Namespace, Iterator and Element.
If anybody has an idea or has done this before, please enlighten me.
OK, these are not exact replacements, but basically what you need for parsing XML in Objective-C is NSXMLParser.
So you can say that NSXMLParser is the replacement for Namespace.
And for Iterator, NSXMLParserDelegate has these methods:
- parser:didStartElement:namespaceURI:qualifiedName:attributes:
OR
- parser:foundCharacters:
I don't know Java, but the URLs you are pointing at are .xsd files, which are XML schema definition files. XML parsing on iOS is somewhat limited out of the box: NSXMLParser.
I strongly recommend one of the bazillion open source XML parsers. They're much more user friendly.
Well, thanks to all for making the effort to answer, but I found a nice library, TouchXML, that serves the purpose.
