Can anyone recommend a good binary XML format? It's for a JavaME application, so it needs to be a) Easy to implement on the server, and b) Easy to write a low-footprint parser for on a low-end JavaME client device.
And it goes without saying that it needs to be smaller than XML, and faster to parse.
The data would be something akin to SVG.
You might want to take a look at WBXML (Wireless Binary XML). It is optimized for size and often used on mobile phones, but it is not optimized for parsing speed.
Hessian might be an alternative worth looking at. It is a small protocol, well-suited for Java ME applications.
"Hessian is a binary web service protocol that makes web services usable without requiring a large framework, and without learning a new set of protocols. Because it is a binary protocol, it is well-suited to sending binary data without any need to extend the protocol with attachments."
What kind of data are you planning to send? If the server is also written in Java, I would say the easiest small-footprint approach is to send and receive binary data in a predefined format: just write every field in a known order into a DataOutputStream.
But it really depends on what kind of data you are working with and whether you can define the format yourself.
You should also evaluate whether this kind of optimization is even needed. Maybe your target devices are not so limited.
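To illustrate the "write everything in a known order into a DataOutputStream" idea, here is a minimal sketch. The record layout (a version byte, a colour, and a list of points, loosely SVG-like) and the class name PolylineCodec are assumptions made up for this example, not an existing format. It is written against Java SE so it is easy to run; the DataInputStream/DataOutputStream calls it relies on should also be available on CLDC devices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical wire layout (an assumption, not a standard):
//   byte  format version
//   int   stroke colour (0xRRGGBB)
//   short number of points
//   then for each point: short x, short y
class PolylineCodec {
    static final int VERSION = 1;

    // Server side: write the fields in a fixed, agreed order.
    static byte[] encode(int color, short[] xs, short[] ys) throws IOException {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("xs and ys must have the same length");
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(VERSION);
        out.writeInt(color);
        out.writeShort(xs.length);
        for (int i = 0; i < xs.length; i++) {
            out.writeShort(xs[i]);
            out.writeShort(ys[i]);
        }
        out.flush();
        return buf.toByteArray();
    }

    // Client side: read the fields back in exactly the same order.
    // Returns {color, x0, y0, x1, y1, ...}.
    static int[] decode(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int version = in.readByte();
        if (version != VERSION) {
            throw new IOException("Unsupported format version " + version);
        }
        int color = in.readInt();
        int n = in.readUnsignedShort();
        int[] result = new int[1 + 2 * n];
        result[0] = color;
        for (int i = 0; i < n; i++) {
            result[1 + 2 * i] = in.readShort();
            result[2 + 2 * i] = in.readShort();
        }
        return result;
    }
}
```

With this layout a three-point polyline takes 19 bytes on the wire. The version byte is there so the client can reject (or later adapt to) a layout it does not know.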
It very much depends on the target device. If you have JSR 172 available, then you are done with the parsing: the runtime does it for you. And XML is mainly about defining your own format. As was already stated, if your goal is performance, then XML is probably not the best way to go and you will end up doing something binary.
Related
I know that XML can be used so that programs written in different programming languages can communicate.
E.g. a Java server with C and Python clients.
Could JSON be used as an alternative? And if it can, should I go for it, especially in a case where the clients are not controlled by me?
Or would you feel that XML would be preferred for implementing such a client?
Yes, you can. Just use appropriate JSON libraries on both ends (e.g. JsonCpp on the C++ side, or Jansson in C). And learn more about JSON-RPC.
The big advantage of JSON over XML is that it is simpler (to understand, to implement, to use) and probably less verbose (so shorter messages).
You could also consider YAML which seems less used, but is more "powerful".
Don't forget to document your JSON protocol (i.e. the messages) well.
Yes, you should use JSON.
There are JSON libraries for nearly all well-known languages. And a JSON file with the same content as an XML file is about 75% smaller. So you should use it :D
As for whether you should do it, I think it's an appropriate use. In the end you simply need something that both ends of the conversation can handle. You could use XML or some other alternative, but I don't think it's any better or worse from a "should you" perspective.
You can, but you should not. Don't get me wrong, JSON is OK as data-interchange languages go, but the XML serialization packages for just about any language are much more mature than most JSON packages. Yes, XML is larger than JSON, and there's a good reason: it carries a lot more descriptive information than JSON does. And the more diverse your "endpoints" are, the more that information can help in maintaining robust communication.
I need to implement a rather simple network protocol: a device with a microcontroller (programmed in C) and a Java application need to communicate. I need to implement firmware updates, and maybe some other things.
At least, I need to transmit some data structures as headers.
Only an ugly way comes to mind:
I can declare a packed structure on the C side and somehow handle the same data layout on the Java side.
So, if my structure changes, I need to make changes on both sides: C and Java. I strongly dislike this.
Is there a better way to do that? Maybe something like this: I write the protocol structures in some special format, and then a utility generates the code for both the C and Java sides.
Or, maybe, something different.
I would be glad to see suggestions.
You might want to look at using a standardized notation for data transfer such as JSON. Here is some info on parsing JSON in C.
Parsing JSON using C
If it were my project I probably would go with just packed data structures. Hopefully once your project matures changes to the data structures are minimal and only occur during major releases. You can keep a version tag in the data structure to handle legacy data formats if needed.
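For what the packed-structure approach with a version tag could look like on the Java side, here is a rough sketch. The header layout, the field names, and the little-endian byte order are all assumptions for illustration (many microcontrollers are little-endian, but check yours); the matching C struct is shown only in a comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Assumed C-side definition (hypothetical, for illustration only):
//   struct __attribute__((packed)) fw_header {
//       uint8_t  version;   // layout version tag
//       uint8_t  type;      // message type
//       uint16_t length;    // payload length in bytes
//       uint32_t crc32;     // checksum of the payload
//   };
class FirmwareHeader {
    static final int SIZE = 8;          // 1 + 1 + 2 + 4 bytes, no padding
    static final int SUPPORTED_VERSION = 1;

    final int version;
    final int type;
    final int length;
    final long crc32;

    FirmwareHeader(int version, int type, int length, long crc32) {
        this.version = version;
        this.type = type;
        this.length = length;
        this.crc32 = crc32;
    }

    // Decode the raw bytes received from the device.
    static FirmwareHeader parse(byte[] raw) {
        if (raw.length < SIZE) {
            throw new IllegalArgumentException("header too short: " + raw.length);
        }
        ByteBuffer b = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int version = b.get() & 0xFF;               // Java has no unsigned byte
        if (version != SUPPORTED_VERSION) {
            throw new IllegalArgumentException("unknown header version " + version);
        }
        int type = b.get() & 0xFF;
        int length = b.getShort() & 0xFFFF;         // uint16_t
        long crc32 = b.getInt() & 0xFFFFFFFFL;      // uint32_t
        return new FirmwareHeader(version, type, length, crc32);
    }

    // Encode the header in the same layout for sending to the device.
    byte[] toBytes() {
        ByteBuffer b = ByteBuffer.allocate(SIZE).order(ByteOrder.LITTLE_ENDIAN);
        b.put((byte) version).put((byte) type).putShort((short) length).putInt((int) crc32);
        return b.array();
    }
}
```

The masking with 0xFF, 0xFFFF and 0xFFFFFFFFL is needed because Java's integer types are signed. If the layout ever changes, the version check is the place to branch to a legacy parser.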
One common solution to this problem would be to use Google's protobuf. However, as you specified that you need it to work in a microcontroller environment, I think you could look into protobuf-c, which is a pure C version of protobuf.
Could you describe the details of the protocol? Is it stateful or stateless?
If your protocol is stateless, then take a look at web services (especially REST). This is a well-known cross-platform communication practice.
What are the possible options for transferring a lot of data from one computer to another that is not on the same LAN? The amount of data is about 100 MB unzipped and 2 MB zipped. Another requirement is that when I create a server for this (in C#), Java clients should be able to consume it.
Does WCF support something like this? If Java clients won't be able to consume it, I'm not interested.
What could be other strategies here?
I'd just use something common like HTTP or FTP, since there will be plenty of existing libraries to do it and you're pretty much guaranteed not to have compatibility problems. 2MB is not an unreasonably large amount of data for those protocols.
This is an interesting kind of question. The question itself is fairly simple to answer, but the interesting thing is that questions of this kind are new; they didn't exist before. Let me explain, but first I will answer your question:
You should create a server and clients that both use old-fashioned TCP streams. To avoid wasting bandwidth you need to compress the stream somehow; use one of the most common compression algorithms you can find (did anyone say Zip?). Now you have a language-independent protocol. Clients in any language will work; mission accomplished. Also, to keep it cross-platform, do not pick the best compression out there, pick the most common one (it will be good enough).
Now to why questions of this kind are interesting: they show something about OOP on the large scale. People understand and use huge frameworks and ask whether this or that framework can perform this or that simple task for them. Here we have lost our roots, we have lost the inner workings of things; it's hitting the nail not with a hammer but with a nuclear missile. It overshoots the target, and it produces huge applications with a huge footprint and often poor performance.
I believe that these questions have increased in number since OOP was fully adopted. It's as if new programmers only want to learn these new big frameworks, and the frameworks dim their view of the world. There is absolutely nothing wrong with big frameworks, they are great, but I believe it's wrong to start out using them before one has mastered the basics. It's like learning to fly in a NASA space shuttle instead of a flight-school Cessna.
In C# you can serialize your object as XML and transmit it; on the other end you can deserialize the XML back to an object.
In terms of file size, you can transmit it zipped or 7z-compressed and decompress it on the client before parsing the XML.
WCF supports SOAP and includes optional JSON serialization for XHTTP. There are other mechanisms, but they are Microsoft-oriented. You will easily be able to consume the service you create. However, you will have to consider how to encode the data, as it will hit the wire in a format that is not friendly to binary data (XML/JSON).
You may wish to instead create a simple HTTP handler that returns the data directly as a zip, using the appropriate MIME headers etc. You should then be able to just hit that from your Java client.
XMPP is another option. You need another server, but this could be an advantage: the client wouldn't need to know the server's IP address; server and clients would simply connect to the XMPP server to exchange messages and files.
Related links (for Java):
Openfire XMPP server
XMPP library for java (Smack)
You didn't mention what type of data you want to send. So, to keep things simple, I will suppose that you have a data stream which can be converted to a byte array. The content of the stream has to be in a format that both C# and Java understand!
The best choice is to compress your data stream with a GZip stream. Gzip is supported in Java as well. Then you can send that stream, converted to a byte array, as the response from your WCF service operation. You can use the default text encoding, which will convert the byte array into a Base64-encoded string. If your Java client supports MTOM (a standard that is supported in Java), then you can use MTOM encoding, which produces smaller messages.
If you don't have a stream with a well-known content format, you have some sort of custom data. For custom data you have to use an interoperable transport format, which is XML. Using XML will further increase the size of your data. In that case you should consider dividing your data transfer into several calls. You can also try hosting your WCF service in IIS 7.x and taking advantage of its built-in feature: compression of dynamic content. If your Java client calls the service with the HTTP Accept-Encoding header set to compress, gzip, IIS will automatically compress the response. Be aware that only .NET 4.0 WCF clients can work with such a service.
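On the Java side, handling a gzip-compressed byte array needs nothing beyond java.util.zip. The following is a small sketch of both directions; the class and method names are made up for this example, and how the bytes actually arrive (WCF response, plain HTTP, raw socket) is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a payload before sending, gunzip it after receiving.
class GzipPayload {
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer.
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        }
        return buf.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            // Read until end of stream; a single read() may return less than requested.
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

Because gzip is a widely implemented format, data compressed with a GZip stream on the C# side should be readable by decompress() here, provided the bytes are transferred unchanged (for example, Base64-decoded first if the text encoding was used).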
We work on an internal corporate system that has a web front-end as one of its interfaces.
The front-end (Java + Tomcat + Apache) communicates to the back-end (proprietary system written in a COBOL-like language) through SOAP web services.
As a result, we pass large XML files back and forth.
We believe that this architecture has a significant impact on performance due to the large overhead of XML transportation and parsing. Unfortunately, we are stuck with this architecture.
How can we make this XML set-up more efficient?
Any tips or techniques are greatly appreciated.
Profiling!
Do some proper profiling of your system under load - there isn't really enough information to go on here.
You need to work out where the time is going and what the bottlenecks are (network bandwidth, CPU, memory etc.). Only then will you know what to do about it - many optimisations are really just trade-offs (for example, caching sacrifices memory to improve performance elsewhere).
The only thing that I can think of off-hand is making sure that you are using HTTP compression with web services - XML can usually be compacted down to a fraction of its normal size, but again this will only help if you have CPU cycles to spare.
You can compress the transfer if both ends can support that, and you can try different parsers, but since you say SOAP there aren't many choices. SOAP is bloated anyway.
I'm going to go out on a limb here and suggest GZIP Compression if you think it is due to bandwidth issues. (you mentioned XML Transportation) Yes, this would increase your CPU time, but it might speed things up in the transport.
Here's the first Google hit on GZIP Compression as a starting point. It describes how it works on Apache.
First make sure that your parsing methods are efficient for large documents. StAX is a good one for parsing large documents.
Additionally, you can take a look at binary XML approaches. These provide more efficient transport but also attempt to aid in parsing.
Try StAX. It performs well and has a nice, concise syntax.
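As a rough idea of what StAX parsing looks like, here is a small sketch that walks a document with the cursor API and never builds a tree in memory. The element name "item" and the attribute "amount" are invented for the example; a real SOAP payload would of course have its own structure.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example: sum the "amount" attribute of every <item> element.
class StaxOrderTotals {
    static long sumAmounts(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        long total = 0;
        try {
            // Pull events one at a time; only the current event is held in memory.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    total += Long.parseLong(reader.getAttributeValue(null, "amount"));
                }
            }
        } finally {
            reader.close();
        }
        return total;
    }
}
```

For a large response you would pass the HTTP input stream to createXMLStreamReader instead of a StringReader, so the document is processed as it arrives.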
Check if your application reads in the whole XML documents as a DOM tree. Those may get VERY big, and frequently you can do with a simple SAX event inspection or a SAX-based XSLT program (which can be compiled for fast processing).
This is very visible in a profiler like VisualVM in the Sun Java 6 JDK.
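To show what "simple SAX event inspection" can mean in practice, here is a minimal sketch that counts occurrences of one element without ever holding the document as a DOM tree. The class name and the choice of counting elements are just assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts how often a given element name occurs.
class SaxElementCounter extends DefaultHandler {
    private final String wanted;
    private int count;

    SaxElementCounter(String wanted) {
        this.wanted = wanted;
    }

    // Called by the parser for every opening tag; nothing else is retained.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (wanted.equals(qName)) {
            count++;
        }
    }

    static int count(String xml, String elementName) throws Exception {
        SaxElementCounter handler = new SaxElementCounter(elementName);
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.count;
    }
}
```

Memory use stays roughly constant regardless of document size, which is the main difference from loading the same document as a DOM tree.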
I heard that binary communication exists. I'm a beginner in Java, and I use plain-text communication, which I learned from the java.sun.com socket tutorials. So I want to know: what are the benefits of binary sockets? Why should I use them? And what resources can you suggest about binary communication?
I would advise you to look at object streams, which allow simple binary passing of objects over a socket connection from one Java VM to another (across a network if required).
The Java Sun Tutorial is a good resource to start. See the Basic I/O section.
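A minimal sketch of object streams, using in-memory byte arrays in place of a real connection so it can be run on its own. The Greeting class is invented for the example. With a socket you would wrap socket.getOutputStream() and socket.getInputStream() instead of the byte-array streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectStreamDemo {
    // Hypothetical message type; it must implement Serializable.
    static class Greeting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int count;

        Greeting(String text, int count) {
            this.text = text;
            this.count = count;
        }
    }

    // Sender side: turn the object into bytes.
    static byte[] send(Greeting g) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(g);
        }
        return buf.toByteArray();
    }

    // Receiver side: rebuild the object; the class must be on the classpath.
    static Greeting receive(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Greeting) in.readObject();
        }
    }
}
```

Note that this format is Java-specific: both ends need the same classes, so it does not help when a client is written in another language.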
Sending binary data across the socket would obviously be more efficient in terms of bandwidth usage. However, it also increases the complexity of having to marshal and unmarshal data across the socket. Sending textual data (XML, JSON, etc.) is more intuitive and easier to develop in many circumstances if performance is not critical.
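To put a rough number on the bandwidth point, the sketch below encodes the same int values once as comma-separated text and once as fixed-width binary, and reports the byte counts. The comma-separated text format is just an assumption chosen for the comparison.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison helper: bytes needed for the same values as text vs. binary.
class WireSize {
    // Text form: decimal numbers separated by commas, e.g. "1,2,3".
    static int asText(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII).length;
    }

    // Binary form: every int is written as exactly 4 bytes.
    static int asBinary(int[] values) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (int v : values) {
            out.writeInt(v);
        }
        out.flush();
        return buf.size();
    }
}
```

For the values 1234567890, -1234567890 and 1000000000 the text form takes 33 bytes and the binary form 12. For 1, 2 and 3 it is the other way round: 5 bytes as text against 12 as binary. So the saving depends on the data, which is one more reason to measure before committing to a binary format.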
You would only work directly with binary data over sockets if you had a specific reason to do so - such as extreme performance. If you had such a reason you would know it, and using binary data directly over sockets does not guarantee high performance.
There are many abstractions provided in Java and its libraries to facilitate network communication whilst protecting the programmer from the labour-intensive and error-prone work at that low level:
Remote Method Invocation
Web Services
URLConnection
Object Serialization
are just some examples.
As mentioned in the other answers, you should avoid raw socket communication unless absolutely necessary. I would suggest XML as a communication data format instead of objects, as it allows you to easily communicate with clients written in other languages as needed and is much easier to debug. XStream is a great library to facilitate object-to-XML conversion (in both directions) with minimal supporting code.