How to integrate MapReduce programs with a web application - java

I am currently working on a web application. The requirement is that the user uploads Excel or CSV files containing large datasets from a front-end framework.
Once uploaded, the data is processed against many parameters, such as duplicate checks and individual field validations.
The user should then be able to download the results, based on filters, instantly as newly generated CSV files.
The technologies I am using: HBase for storing user information such as name and email. Once the data is uploaded by the user, it is stored and processed in HDFS. I have written the backend in the sparkjava web framework. The data processing engine I have used is MapReduce.
For MapReduce, I have written multiple Mapper, Reducer and Driver classes in Java, all inside the same project directory, but I am not able to integrate MapReduce with my backend. Once the data is uploaded, the MapReduce programs should run, and I cannot get that to happen.
Can anyone suggest ideas for this? I am new to Hadoop, so please tell me if I am doing anything wrong and suggest a better alternative. Any help will be awesome. Thank you.
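One way to picture the missing link between the upload and the MapReduce run is a small background runner inside the backend. The sketch below is not taken from the question: the class name `UploadJobRunner` and its methods are hypothetical, and the `Callable` stands in for whatever the existing Driver class does (for example building a Hadoop `Job` for the uploaded HDFS path and returning `job.waitForCompletion(true)`). It only shows the pattern of starting the job off the request thread and polling its state later.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs long processing jobs off the request thread. */
final class UploadJobRunner {
    enum State { SUBMITTED, SUCCEEDED, FAILED }

    // Small fixed pool so a burst of uploads cannot start unlimited jobs.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, State> states = new ConcurrentHashMap<>();

    /**
     * Queues a job and returns its id immediately.
     * Assumption: the callable wraps the existing driver and returns
     * true on success, e.g. the result of Job.waitForCompletion(true).
     */
    String submit(Callable<Boolean> job) {
        String id = UUID.randomUUID().toString();
        states.put(id, State.SUBMITTED);
        pool.submit(() -> {
            try {
                states.put(id, job.call() ? State.SUCCEEDED : State.FAILED);
            } catch (Exception e) {
                // Any exception from the job counts as a failed run.
                states.put(id, State.FAILED);
            }
        });
        return id;
    }

    /** Returns the last known state, or null for an unknown id. */
    State status(String id) {
        return states.get(id);
    }

    /** Stops accepting jobs and waits up to 30 seconds for queued ones. */
    void shutdownAndWait() {
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a sparkjava backend, the upload route could call `submit(...)` once the file has been written to HDFS and return the id, while a second route reports `status(id)` so the front end knows when the result CSV is ready.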

Related

need help to store many documents from a java web app

I have a little issue. I have to generate thousands of files from a web application. I will then put them into a zip, send it by mail and delete them. I was thinking about storing them on the JBoss server, but I'm not a big fan of this solution.
Any idea of a cleaner solution?
If you send the files as an attachment, the files are stored on the mail server and there's no need to store them anywhere else. On the other hand, you might run into size limitations. If your files are too big to be directly attached to an email, you might consider storage services like AWS S3.
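If the generated files are small enough to fit in memory, they do not have to touch the server's disk at all before being attached. A minimal sketch, assuming the file contents are available as strings; the class `InMemoryZip` and its methods are hypothetical names, and only `java.util.zip` from the standard library is used:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: builds a zip archive without writing to disk. */
final class InMemoryZip {

    /** Packs file name -> content pairs into one archive held in memory. */
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the archive is complete.
        return buffer.toByteArray();
    }

    /** Lists the entry names of an archive, handy for checking the result. */
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = in.getNextEntry(); entry != null; entry = in.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

The resulting byte array can be handed to the mail library as the attachment body. For archives too large for memory, the same `ZipOutputStream` loop can write to a temporary file or to a storage service instead.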

Schedule export of Google Analytics csv reports

I would like to have a copy of one of the Analytics custom reports as CSV on a webserver every day. I wish to update some records in my database depending on this CSV report.
Before I start, will it work if I:
1 - find the Analytics Core Java API code for fetching reports, compile and save it
2 - set a daily cronjob which runs a PHP file
3 - the PHP file executes a bash command that calls Java
4 - the Java application interacts with Analytics, gets and saves the report
5 - the PHP file checks if the new CSV exists, reads the file, gets the information
6 - the PHP file connects to MySQL and updates records
Please correct me if it's bullsh.t or there are easier ways (an Analytics PHP/JS API if one exists, or something else). These six points just popped into my head. I've never done anything like this before, so please help me.
The Core Reporting API is language agnostic and there are libraries for many languages including PHP. So I'd say calling Java via PHP is unnecessarily complicated.

Pulling data from Java application to Salesforce, where should i start?

I'm new to programming, and I'm trying to do something like this.
I have data (objects, fields and records) in a Java-based web application.
I need that data on salesforce.com. How do I achieve this? After digging through Stack for an hour I came across a couple of (partial) solutions:
Using the data export option in Salesforce, which is manual; I don't know if there is an automated process.
Using the SOAP API or Partner API:
To get the objects: describeGlobal()
To get the list of fields: describeSObjects.
Any ideas or suggestions?
Thanks in advance.
You can use the SOAP API to load data into Salesforce if the record count is below 50,000. If you want to load a huge amount of data into Salesforce, you can opt for the Bulk API. For SOAP you need the Enterprise WSDL, and for Bulk it's the Partner WSDL.
1 - Data export functionality provided out of the box by salesforce allows your organization to generate backup files of your data on a weekly or monthly basis depending on your edition. It is mostly used for backup purposes.
2 - Is the upload process triggered from the Java application itself, or do you need a periodic data dump between your webapp and Salesforce?
In the first case you have to use the SOAP API interface directly inside your Java code.
Find below a good recipe from the cookbook:
http://developer.force.com/cookbook/recipe/calling-salesforce-web-services-using-apex
In the second case you can export your data into a CSV file from your Java app and load it into Salesforce using the Data Loader. This process can easily be batched periodically.
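For the CSV route, the part that most often goes wrong is quoting fields that contain commas, quotes or line breaks. A minimal sketch of the export step, assuming records are already available as lists of strings; `CsvExport` and its methods are hypothetical names, and the quoting follows the usual RFC 4180 convention:

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: turns rows of strings into CSV text. */
final class CsvExport {

    /** Quotes a field when it holds a comma, a quote or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        // Embedded quotes are doubled, then the whole field is wrapped.
        String doubled = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + doubled + "\"" : doubled;
    }

    /** Joins one record into a single CSV line. */
    static String toLine(List<String> fields) {
        return fields.stream().map(CsvExport::escape).collect(Collectors.joining(","));
    }

    /** Builds the full file content: header line first, then one line per row. */
    static String toCsv(List<String> header, List<List<String>> rows) {
        StringBuilder csv = new StringBuilder(toLine(header)).append("\n");
        for (List<String> row : rows) {
            csv.append(toLine(row)).append("\n");
        }
        return csv.toString();
    }
}
```

The header names would need to match, or be mapped to, the target fields of the Salesforce object when the file is loaded.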
We have experience with using SOAP API. I would suggest downloading soapUI tester and pulling the Enterprise WSDL from Salesforce, to get a feel for how to insert the data.
Also keep note of the governors and limits that SF imposes, in case you start trying to send data OUT of SF as well.
Hello everyone, I have used the Salesforce Bulk API. With the Bulk API we can fetch and insert up to 10,000 records in one batch, so you can go with the Bulk API.
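Whichever API is used, a per-batch record limit means the Java side has to cut its record list into chunks before sending. A minimal sketch of that step only; `Batches.split` is a hypothetical helper, the batch size of 10,000 in the usage note is the figure quoted in the answer above, and the actual upload call is left out:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: cuts a record list into fixed-size batches. */
final class Batches {

    /** Returns consecutive sublists of at most batchSize elements each. */
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }
}
```

With `Batches.split(records, 10_000)`, a list of 25,000 records yields three batches, the last one holding the remaining 5,000.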

calling swf function from java (without loading swf file into browser)

I searched a lot regarding this topic but did not get any good answer.
Scenario:
We have a REST web service based implementation in our project. Normally the front end (Flex) calls the web service and the backend sends a large set of data points to the front end. The front end then creates a chart from these data points and displays it to the end user.
Our requirement is that the user can export these charts and save them as a PDF file on the server. We are able to create a JPG file from Flex and save it as a PDF file.
The problem occurs when the end user has scheduled that chart report. The report can then run at any time, and the browser may not be open at that time. So how will the backend interact with the front-end (Flex) functions? The problems are:
The browser is not open, so the SWF file is not loaded.
Java/JSP needs to interact with the front end (Flex) in a reverse-Ajax fashion so that the front end sends the JPG file back to the server.
Has anybody faced this issue before?
Is it somehow possible?
Answers or any leads are highly appreciated.
Please provide comments on this
Probably the only way to do this is to run a version of your Flex application (at least the charting part) on your server, and have your Java server interact with it.
I have faced a similar problem and have asked a similar question before. It is not very elegant, but what I mentioned before seems to be the only way to go.

Architecture Java EE? Many resources: database, XML

I have a Java application and now I want to turn it into a web app.
I am thinking about how to design the architecture of this app.
I have many resources: MATLAB, EXE files, XML files and a MySQL database.
So we will have a 3-tier architecture:
Client: Browser
Processing: Java EE server (maybe Servlet and EJB container)
Data: MATLAB, EXE files, XML files and a MySQL database
So, how can I build this application so that it works without problems even when several clients are connected and send many queries at the same time?
Note that the processing involves calling an EXE, reading and writing XML files, and executing MATLAB.
More details
INPUT -RESOURCE-> OUTPUT
image(query) -exe-> XML
XML -JDOM-> Java Objects (List)
Java Objects -JDOM-> n XML files
n XML files -JDOM-> txt files
txt files -matlab-> txt files
txt files -MYSQL-> java objects (List)
txt files --> Images (results)
This is a pretty broad question, so I will keep my answer at a high level and we can dig deeper as you have more questions.
Initially this is how I would structure the application.
Pick an MVC framework. I would pick JSF2 but anything else is fine too. Your view and controller layer will be defined here.
Create 3 DAO classes at a bare minimum: one for reading data from XML, one for reading data from MySQL, and one for reading text files. To parse XML files you can use XPath, and of course SQL to get data from the database.
Create an MDB (message-driven bean) to asynchronously kick off the EXE process via JMS.
Package the application as an EAR file.
Tools you can use:
Eclipse for IDE
JBoss-AS (or any other container that you have access to)
Some sort of build/packaging tool (ANT, Maven, etc)
I am not familiar with image manipulation so I can't comment on that part. However, I think you need to break down your design into various components first. That's why I started listing the ones that I could think of without enough details. So image query will be one of the components. Try to create a black box diagram of the system with each major component in & out. After that start developing each of them and then I bet a lot more folks here can help you with more specific questions. Does this make sense?
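One concrete concern for the concurrent-clients part of the question is that the EXE, JDOM and MATLAB steps all pass files to each other, so two simultaneous requests must not write to the same paths. A minimal sketch of one way to isolate them, assuming each request gets its own temporary directory; `RequestWorkspace` and its methods are hypothetical names, and launching the EXE or MATLAB inside that directory is left out:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: one private working directory per client request. */
final class RequestWorkspace {
    private final Path dir = createDir();

    private static Path createDir() {
        try {
            // Each call yields a fresh, uniquely named directory.
            return Files.createTempDirectory("request-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Directory to use as the working directory of the external tools. */
    Path directory() {
        return dir;
    }

    /** Writes an intermediate file (XML, txt, ...) inside this workspace. */
    Path write(String fileName, String content) {
        try {
            return Files.write(dir.resolve(fileName), content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an intermediate file back from this workspace. */
    String read(String fileName) {
        try {
            return new String(Files.readAllBytes(dir.resolve(fileName)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two requests that both produce a `result.xml` then end up with separate files, and the directory can be deleted once the response has been sent.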
