Conversion of WARC file to JSON or XML or CSV - java

I am working with WARC files and want to load their full contents into a framework such as Elasticsearch or Apache Spark. These frameworks accept JSON and other formats, but not WARC.
For this reason, I tried a parser from GitHub. Here is the repository link: https://github.com/eugeneware/warc
When I ran it, the program did not work at all. I don't know what the problem was: it produced neither an error nor any output.
How can I accomplish this task? If anyone has suggestions, please share them.
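If no existing parser works for you, the format is simple enough to read by hand. Each WARC record is a version line (such as WARC/1.0), a block of "Name: value" header fields, a blank line, and then exactly Content-Length bytes of content. Below is a minimal sketch that turns each record into one JSON object using only the JDK. The class name WarcToJson and the JSON layout ("headers" plus "content") are my own choices for illustration, and the sketch assumes an uncompressed file whose record bodies are text that fits in memory.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WarcToJson {

    /** Reads bytes up to the next LF and drops a trailing CR; returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAny = true;
            if (b == '\n') break;
            buf.write(b);
        }
        if (!sawAny) return null;
        String s = new String(buf.toByteArray(), StandardCharsets.UTF_8);
        return s.endsWith("\r") ? s.substring(0, s.length() - 1) : s;
    }

    /** Reads exactly length bytes of record content. */
    static byte[] readBody(InputStream in, int length) throws IOException {
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n < 0) throw new IOException("Truncated record body");
            off += n;
        }
        return body;
    }

    /** Quotes a string as a JSON string literal. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Converts every record in the stream to one JSON object string. */
    static List<String> convert(InputStream in) throws IOException {
        List<String> out = new ArrayList<>();
        String line;
        while ((line = readLine(in)) != null) {
            if (!line.startsWith("WARC/")) continue; // skip blank lines between records
            Map<String, String> headers = new LinkedHashMap<>();
            while ((line = readLine(in)) != null && !line.isEmpty()) {
                int colon = line.indexOf(':');
                if (colon > 0) {
                    headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
                }
            }
            int length = Integer.parseInt(headers.getOrDefault("Content-Length", "0"));
            // Assumption: the body is text. Binary payloads would need Base64 or similar.
            String content = new String(readBody(in, length), StandardCharsets.UTF_8);
            StringBuilder json = new StringBuilder("{\"headers\":{");
            boolean first = true;
            for (Map.Entry<String, String> e : headers.entrySet()) {
                if (!first) json.append(',');
                json.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
                first = false;
            }
            json.append("},\"content\":").append(quote(content)).append('}');
            out.add(json.toString());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            for (String record : convert(in)) System.out.println(record); // one JSON object per line
        }
    }
}
```

Run it as "java WarcToJson yourfile.warc" to get one JSON object per line, a layout that bulk loaders generally handle well. For a .warc.gz file, wrapping the input in java.util.zip.GZIPInputStream would be the first thing to try.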

Related

How to convert ConfigProperties_server1.props, extracted from WebSphere using wsadmin, to an XML or JSON file?

This is the Jython script I have used to extract the ConfigProperties_server1.props file:
AdminTask.extractConfigProperties('[-propertiesFileName ConfigProperties_server1.props -configData Server=server1]')
Welcome to SO. The AdminTask.extractConfigProperties command has no option to control the format of the file. The option PortablePropertiesFile might sound promising, but it only controls whether internal XMI ids are included in the props file. You will have to parse the properties file and convert it yourself; the syntax of the file is documented in this IBM KnowledgeCenter topic. Given the complexity of this task, you may want to edit your question and add some detail on what you are trying to accomplish by converting to XML or JSON, so that the community can better help you.
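To give an idea of what "parse it and convert it yourself" could look like, here is a rough sketch in plain Java. It skips comment lines starting with # and groups key=value lines into sections. The grouping rule (a new section begins at each ResourceType= line) is my own assumption for illustration, as are the class name PropsToJson and the JSON layout; the sketch does not handle any richer syntax that the IBM documentation describes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PropsToJson {

    /** Groups key=value lines into sections. ASSUMPTION: each section starts with "ResourceType=". */
    static List<Map<String, String>> parse(Reader reader) throws IOException {
        List<Map<String, String>> sections = new ArrayList<>();
        Map<String, String> current = null;
        BufferedReader br = new BufferedReader(reader);
        String line;
        while ((line = br.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue; // skip blanks and comments
            int eq = line.indexOf('=');
            if (eq <= 0) continue; // not a key=value line
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            if (current == null || key.equals("ResourceType")) {
                current = new LinkedHashMap<>();
                sections.add(current);
            }
            current.put(key, value);
        }
        return sections;
    }

    /** Quotes a string as a JSON string literal. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Renders the sections as a JSON array of objects with string values. */
    static String toJson(List<Map<String, String>> sections) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < sections.size(); i++) {
            if (i > 0) json.append(',');
            json.append('{');
            boolean first = true;
            for (Map.Entry<String, String> e : sections.get(i).entrySet()) {
                if (!first) json.append(',');
                json.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
                first = false;
            }
            json.append('}');
        }
        return json.append(']').toString();
    }
}
```

Calling PropsToJson.toJson(PropsToJson.parse(new java.io.FileReader("ConfigProperties_server1.props"))) would then return the whole file as one JSON array. Note that java.util.Properties alone is probably not enough here, because keys repeated across sections would overwrite each other.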

Is there any API in Android for MS Office Word (.doc/.docx) to image/HTML/XML/PDF conversion?

I am trying to make an Android application that takes MS Word files (.doc/.docx) as input and converts them to PDF as the final output.
I searched for PDF conversion on Android, but the output I got was not correct. I tried jWordConvert (a Qoppa Software library). It converts Word to PDF very well in plain Java, but on Android it gives this error:
conversion to dalvik format failed with error 1.
I also tried the Apache POI library, but again it does not produce correct output if my Word file contains images or tables. Combining Apache Tika with Apache POI gave the same results.
iText is also available, but its main use is converting images/HTML to PDF, and again the output differs from what we expect.
So my question is: is there any API for Android applications that supports Word to HTML/PDF/image conversion with correct output when the Word file contains tables, images, etc.?
Also, could I use JNI, or is there any other way to do it?
Please reply. Thanks in advance.
Here is an app that does what you are trying to achieve.
Since an app already exists in the app store, it should be fairly easy to find out what it uses and how it uses it.
I hope this helps a bit!

How to convert SWF file to XML using java?

Is there any way to decompile a SWF file and convert it into an XML file using Java?
I tried to use shark -Flash2XML, but it doesn't produce any output. I tried to analyse the code, but couldn't get any result out of it.
Can anyone share some working code that uses SWF2XML or Flash2XML?

How to convert .mxd file into .pdf file

I want to convert a .mxd file into a .pdf file. I have googled this topic but found nothing. Can I convert .mxd to .pdf directly, or do I need to go through intermediate conversions?
Any help would be appreciated.
Thank you.
Typically, .mxd files are map documents created with ESRI ArcGIS. ArcMap has a tool to export a specific section to a PDF.
If you must do this programmatically (not with a manual tool), I believe you can do it by publishing the MXD as a map service and then using the JavaScript (or other) APIs to make the conversion.
Well, I found this:
http://arcscripts.esri.com/details.asp?dbid=15139

GWT document format converter

I am looking for ways to make a small app using GWT for converting documents from one format to another, mainly .doc, .pdf, .odt, .rtf, and maybe a couple more.
Has anyone tried this before?
I came across the JODConverter library, but it needs OpenOffice to be already installed, and I don't really know how many people have used it with GWT in the past.
Please give me some starting pointers, or if anyone has experience with this kind of app, do share.
Thanks and regards,
Rohit
I was looking into implementing something like this a few months ago.
Since GWT compiles your code to JavaScript, there is no way for you to do that on the client side: JavaScript can't access the file system.
So you would need to upload the file to the server first, do the conversion on the server side, and send the converted file back.
I had never heard of JODConverter before; the library I wanted to use was Apache POI. Unfortunately I can't tell you anything about it, because I haven't tried it yet.
It sounds like JOD Converter is precisely what you need since you're looking at multi format conversions from Java. You would install OpenOffice on your server and link it up with JOD Converter. When a document is uploaded, your application would call JOD Converter to perform the conversion and stream the converted document back to the caller. Alternatively you can put the file somewhere, and send a link (URL) back to the caller so they can fetch the document. You can also look at JOD Reports or Docmosis if you need to manipulate the documents.
GWT is mostly a client side toolkit. Are you trying to make a tool that does all the conversion on the client side, with no help from the server? In that case, you should be looking for JavaScript libraries that can read/convert all those formats. If you are planning to have the user upload their files to the server, then you can use whatever technology you want on the server, and just use GWT for the UI.
