I have written a Python script for predictive analytics using pandas, NumPy, etc. I want to send my result set to a Java application. Is there a simple way to do it? I found that Jython can be used for Java-Python integration, but it doesn't support many data analysis libraries. Any help would be great. Thank you.
Have you tried using XML to transfer the data between the two applications?
My next suggestion would be to output the data in JSON format to a text file and then call the Java application, which will read the JSON from that file.
A better approach here is to pipe the Python output into the Java application, e.g. python pythonApp.py | java read. The output of the Python application can be used as input for the Java application as long as the data format is consistent and known. The solutions above that create a file and then read it also work, but are more error-prone.
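To make the pipe suggestion concrete, here is a minimal sketch of the Java side. It assumes the Python script prints one comma-separated record per line to standard output; that format and the class name ReadFromPipe are my own assumptions, not something stated in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; run as: python pythonApp.py | java ReadFromPipe
public class ReadFromPipe {

    /** Reads every non-blank line and splits it on commas (assumed record format). */
    public static List<String[]> readRows(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines
                }
                rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // System.in is the read end of the pipe when started via "python ... | java ..."
        List<String[]> rows = readRows(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        System.out.println("Received " + rows.size() + " rows");
    }
}
```

Note that a plain split on commas does not handle quoted fields; if the result set contains free text, a proper CSV or JSON library on both sides would be safer.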
Related
I'm trying to load an ISO-8859-1 file into BigQuery using Dataflow. I've built a template with Apache Beam Java. Everything works well, but when I check the content of the BigQuery table I see that some characters like 'ñ' or accented letters such as 'á', 'é', etc. haven't been stored properly; they have been stored as �.
I've tried several charset conversions before writing to BigQuery. I've also created a special ISOCoder and passed it to the pipeline using the setCoder() method, but nothing works.
Does anyone know whether it is possible to load this kind of file into BigQuery using Apache Beam, or is only UTF-8 supported?
Thanks in advance for your help.
This feature is currently not available in the Java SDK of Beam. In Python this seems to be possible by using the additional_bq_parameters when using WriteToBigQuery, see: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L177
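Separately from the load options mentioned above, the � character usually means that ISO-8859-1 bytes were decoded as UTF-8 somewhere along the way. The sketch below only shows the plain-Java decoding step. It assumes you can get at the raw bytes of each line before anything has turned them into a String; whether and where that is possible in your pipeline is something I have not verified. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Beam or BigQuery.
public class Latin1Decode {

    /** Interprets the bytes as ISO-8859-1, which maps every byte to one character. */
    public static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** Re-encodes ISO-8859-1 bytes as UTF-8 bytes. */
    public static byte[] toUtf8(byte[] latin1) {
        return decodeLatin1(latin1).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xF1}; // 'ñ' in ISO-8859-1
        System.out.println(decodeLatin1(latin1).equals("\u00f1")); // true
        System.out.println(toUtf8(latin1).length);                 // 2 (0xC3 0xB1)
    }
}
```

If the bytes have already been decoded as UTF-8, the original characters are lost and no later conversion will bring them back, so the decoding has to happen at the point where the file is read.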
I am working with WARC files and trying to get the complete file into a format that some framework accepts (say Elasticsearch, Apache Spark, or others). But these frameworks accept data as JSON or other formats, not WARC.
For this reason, I tried a GitHub project as a parser for the file. Here is the repository link: https://github.com/eugeneware/warc
When I tried it, the program didn't work at all. I don't know what the problem was, but it produced nothing: neither an error nor any output.
Now I am trying to figure out how I can accomplish my task. If anyone has any suggestions, please share them with me.
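One possible direction, if no ready-made parser works for you, is to read the records yourself. An uncompressed WARC file is a sequence of records, each consisting of a version line (WARC/1.0), named header fields, a blank line, a content block of Content-Length bytes, and two trailing CRLFs. The following is only a rough sketch under those assumptions: it does not handle .warc.gz input, header continuation lines, or blocks larger than 2 GB, and the class name is made up. It is not related to the repository linked above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal reader for uncompressed WARC input.
public class WarcSketch {

    /** One record: its named header fields plus the raw content block. */
    public static final class Record {
        public final Map<String, String> headers;
        public final byte[] content;

        Record(Map<String, String> headers, byte[] content) {
            this.headers = headers;
            this.content = content;
        }
    }

    /** Reads bytes up to the next LF, dropping CR; returns null at end of input. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') buf.write(b);
        }
        if (b == -1 && buf.size() == 0) return null;
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads all records from the stream; pass a buffered stream for large files. */
    public static List<Record> readAll(InputStream raw) {
        List<Record> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(raw)) {
            String line;
            while ((line = readLine(in)) != null) {
                if (line.isEmpty()) continue; // blank separator lines between records
                if (!line.startsWith("WARC/")) {
                    throw new IOException("Expected WARC version line, got: " + line);
                }
                Map<String, String> headers = new LinkedHashMap<>();
                while ((line = readLine(in)) != null && !line.isEmpty()) {
                    int colon = line.indexOf(':');
                    if (colon > 0) {
                        headers.put(line.substring(0, colon).trim(),
                                    line.substring(colon + 1).trim());
                    }
                }
                int length = Integer.parseInt(headers.getOrDefault("Content-Length", "0"));
                byte[] content = new byte[length];
                in.readFully(content); // the block is binary, so read exactly 'length' bytes
                records.add(new Record(headers, content));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

From there, each Record's headers map and content can be serialized to JSON with whatever JSON library you already use and handed to the target framework.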
I need to create a Java API that takes in a CSV file and returns JSON with some parsed data from the CSV. I've currently built a Java program in Eclipse with a bunch of static class methods that, when the file name of the CSV is passed as a command-line argument to the main function, do the parsing and create the JSON.
How do I go about modifying this code so that it will function as an API? Specifically, the API should be able to accept a .csv file in the following way.
POST http://<YOUR HOST HERE>/scrub/:csv_file
Then, how do I go about returning the JSON? Thank you in advance for the help -- first time writing an API as opposed to a standard Java program.
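One way to get started without any framework is the HTTP server that ships with the JDK (com.sun.net.httpserver). The sketch below assumes the CSV is sent as the raw POST body, that port 8080 is free, and that Java 9+ is available (for readAllBytes). The csvToJson method is only a stand-in for your existing static parsing methods; it does a naive split on commas and is not meant as a real CSV parser.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical class; port and path handling are assumptions.
public class ScrubServer {

    /** Stand-in for the existing parsing code: header row + data rows -> JSON array. */
    public static String csvToJson(String csv) {
        String[] lines = csv.trim().split("\\r?\\n");
        String[] header = lines[0].split(",", -1);
        StringBuilder json = new StringBuilder("[");
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",", -1);
            if (i > 1) json.append(',');
            json.append('{');
            for (int c = 0; c < header.length && c < cells.length; c++) {
                if (c > 0) json.append(',');
                json.append('"').append(escape(header[c])).append("\":\"")
                    .append(escape(cells[c])).append('"');
            }
            json.append('}');
        }
        return json.append(']').toString();
    }

    /** Escapes backslashes and double quotes only (enough for this sketch). */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Handles POST /scrub/...: reads the CSV body and answers with JSON. */
    static void handle(HttpExchange exchange) throws IOException {
        if (!"POST".equals(exchange.getRequestMethod())) {
            exchange.sendResponseHeaders(405, -1); // method not allowed, no body
            exchange.close();
            return;
        }
        String csv = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
        byte[] body = csvToJson(csv).getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().set("Content-Type", "application/json");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/scrub", ScrubServer::handle); // also matches /scrub/<name>
        server.start();
    }
}
```

With the server running, something like curl --data-binary @data.csv http://localhost:8080/scrub/data.csv should return the JSON. For anything beyond a small exercise, a framework such as Spring Boot or JAX-RS would take care of routing, file uploads, and JSON serialization for you.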
I have some Pig output files and want to read them on another machine (without a Hadoop installation). I just want to read a tab-separated plain text line and parse it into a Java object. I am guessing we should be able to use pig.jar as a dependency to read it, but I could not find relevant documentation. I think this class could be used? How can we provide the schema as well?
I suggest storing the data in the Avro serialization format. It is Pig-independent and can handle complex data structures like the ones you described (so you don't need to write your own parser). See this article for examples.
Your Pig output files are just text files, right? Then you don't need any Pig or Hadoop jars.
The last time I worked with Pig was on Amazon's EMR platform, and the output files were stashed in an S3 bucket. They were just text files, and standard Java can read them.
That class you referenced is for reading into Pig from some text format.
Are you asking for a library to parse the Pig data model into Java objects, i.e. the text representation of tuples, bags, etc.? If so, then it's probably easier to write it yourself. It's a VERY simple data model with only 3-ish datatypes.
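For the flat, tab-separated case, plain Java is enough. Here is a minimal sketch that assumes each line has exactly three fields (a user name, a URL, and a count); that schema and the class names are invented for illustration, so substitute whatever your Pig script actually stores.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for a three-column, tab-separated Pig output file.
public class PigLineParser {

    /** Hypothetical target object; the fields mirror the assumed schema. */
    public static final class UserVisit {
        public final String user;
        public final String url;
        public final long count;

        UserVisit(String user, String url, long count) {
            this.user = user;
            this.url = url;
            this.count = count;
        }
    }

    /** Splits one line on tabs and converts the fields to the assumed types. */
    public static UserVisit parseLine(String line) {
        String[] fields = line.split("\t", -1); // -1 keeps empty trailing fields
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields but found " + fields.length + ": " + line);
        }
        return new UserVisit(fields[0], fields[1], Long.parseLong(fields[2]));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to one output file, e.g. a part-* file copied from the cluster
        List<UserVisit> visits = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            if (!line.isEmpty()) {
                visits.add(parseLine(line));
            }
        }
        System.out.println("Parsed " + visits.size() + " records");
    }
}
```

Here the "schema" is simply encoded in parseLine. Nested tuples and bags would need a small recursive parser on top of this.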
Our application is a client/server setup, where the client is a standalone Java application that always runs on Windows, and the server is written in C and can run on either a Windows or a Unix machine. Additionally, we use Perl for various reports. Generally, the way the reports work is that we generate either a text file or an XML file on the server in Perl and then send that to the client. The client then uses FOP or similar to convert the XML into a PDF. For both the text file and the eventual PDF, the user selects a printer via the Java client, and the copied-over file then prints to the selected printer.
One of our "reports" is used for creating barcodes. This one is different in that it uses Perl to fetch/format some data from the database and then sends that to a C application that creates some raw print data. This data is then sent directly to the printer (via a simple pipe on Unix or a custom application on Windows).
The problem is that this in no way respects the printer selected by the user in the Java client. Also, we are unable to show a preview in said client. Ideally, I'd like to be able to convert the raw print data into PS/PDF or similar on the server (or even on the client) and then send THAT to the printer from the client. This would allow me to show a preview as well as actually print to the selected printer.
If I can't generate a preview, even just copying over the raw data in a file to the Java client and then sending that to the printer would probably be "good enough." I've been unable to find anything that is quite what I'm trying to accomplish so any help would of course be appreciated.
Edit: The RAW data is in PCL format. I managed to reconcile the source with a PCL language reference guide.
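For the "good enough" fallback of copying the raw data to the Java client and sending it to the chosen printer, the Java Print Service API (javax.print) can pass bytes through without interpreting them. This is a sketch only: it assumes the printer itself understands PCL, it uses the default printer in main where your client would use the PrintService the user selected, and the class name is made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.print.Doc;
import javax.print.DocFlavor;
import javax.print.DocPrintJob;
import javax.print.PrintException;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.SimpleDoc;

// Hypothetical helper for sending already-rendered printer data as-is.
public class RawPrint {

    /** Wraps the bytes as an "autosense" document so Java does not try to render them. */
    public static Doc rawDoc(byte[] data) {
        return new SimpleDoc(data, DocFlavor.BYTE_ARRAY.AUTOSENSE, null);
    }

    /** Sends the raw bytes to the given print service with default job attributes. */
    public static void print(PrintService printer, byte[] data) throws PrintException {
        DocPrintJob job = printer.createPrintJob();
        job.print(rawDoc(data), null);
    }

    public static void main(String[] args) throws IOException, PrintException {
        byte[] data = Files.readAllBytes(Paths.get(args[0])); // file copied from the server
        PrintService printer = PrintServiceLookup.lookupDefaultPrintService();
        if (printer == null) {
            System.err.println("No default printer found");
            return;
        }
        print(printer, data);
    }
}
```

This does not give you a preview, since nothing on the client interprets the PCL; it only solves the "print to the selected printer" half of the problem.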
Have you had a look at iText?
You will need to find some way of interpreting the raw format, which is most likely some printer language like PCL or HPGL, and converting it into a format you can use. This is probably best done on the server side.
A Java-based PCL interpreter can be found at http://openpcl.sourceforge.net/ - I have no experience with it.
I figured out a way to generate the barcodes using XSL-FO directly. This is the "correct" answer based on our architecture and trying to do anything else would have been just a dirty hack.