Java: Upload file and get back string (contents) of file - java

Hi, I have a GWT client with standard server-side servlets.
I can upload a file from the GWT client side and read its contents on the server side.
I can send it back to the client as a String.
BUT
I have a GWT FormPanel with action (myModule+"import"). The FormPanel sends a POST to the servlet. The browser then redirects me to myurl/import, where I can see the contents of the uploaded file.
This is not what I wanted though. I'd simply like to have my String back. I added a
submitCompleteHandler to my FormPanel, but it doesn't log any results.
I noticed that servlets have a method called setContentType, so I tried text/html, text/plain ... I don't know what should be there ...
To say it in one sentence: I want to send a String back to the client from the servlet without having the browser redirect me somewhere else. Is it possible?

Since you are submitting a form you get your browser to change navigation. In order to make it work the way you want you have to send the file with ajax. For GWT there is the GWTUpload library that allows you to do that.

If the browser redirects you, it's because you gave a "target" to the FormPanel. By default, it submits within a hidden iframe (a.k.a. "ajax upload").
As said in the javadoc, you have to setContentType("text/html") in your servlet if you want onSubmitComplete to be reliably called.
onSubmitComplete's result is the innerHTML of the returned HTML's body, so you have to be very careful when sending back values with < or & in them. The only reliable way to get them back is to escape them on the server side and unescape them on the client side. You can either use your own escaping mechanism, or you can use &lt; and &amp;. In the latter case, to unescape on the client side, you'd either use String#replace, or create an HTML element, set its innerHTML to the string you got back, and then get its innerText:
public String htmlUnescape(String htmlEscaped) {
    Element tmp = Document.get().createDivElement();
    tmp.setInnerHTML(htmlEscaped);
    return tmp.getInnerText();
}
On the server side, you'd use:
escaped = content.replace("&", "&amp;").replace("<", "&lt;");
(order matters here if you don't want <s to become &amp;lt;; also, replacing < and & is enough, > and " won't cause any issue here)
In your case, however, first make sure that the file's content is "text" and not "binary": returning binary content as a String wouldn't make sense and could cause issues depending on how you use the value on the client side.
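The escaping round trip described in this answer can be sketched in plain Java. The class and method names below are illustrative assumptions, not part of GWT or the servlet API; the point is only the order of the String#replace calls on each side.

```java
// Illustrative helper; the class and method names are assumptions, not GWT API.
final class IframeResultEscaper {

    // Server side: escape '&' first, otherwise the '&' of "&lt;" would be escaped again.
    static String escape(String content) {
        return content.replace("&", "&amp;").replace("<", "&lt;");
    }

    // Client side: undo the steps in the opposite order ("&lt;" first, "&amp;" last).
    static String unescape(String escaped) {
        return escaped.replace("&lt;", "<").replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String original = "a < b && c &lt; d";
        String wire = escape(original);
        System.out.println(wire);
        System.out.println(unescape(wire).equals(original));
    }
}
```

Unescaping in the reverse order matters for the same reason escaping order does: a literal "&lt;" in the file content travels as "&amp;lt;" and must come back as "&lt;", not as "<".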

Related

Problems transforming special characters to bytes and strings

I'm showing a dropdown on a web page, but when the options contain characters such as ○, the dropdown shows a question mark.
I'm getting the dropdown options from a SQL Server database, where the column that stores the value is of type nvarchar.
Then I create an XML output string with the values and send it as the response to an AJAX call.
When I call xmlWriter.toString() (xmlWriter is a StringWriter), I can see the ○ character in Eclipse's debug mode. However, the string has to be written to a ByteArrayOutputStream so it can be added to the response stream and the client can see the XML, and when I call xmlWriter.toString().getBytes() the ○ character becomes a question mark.
I've tried xmlWriter.toString().getBytes("UTF-8"), but the result is some strange symbols.
What am I missing?
My guess is that you're not specifying the encoding in your response, so the browser has to guess and gets it wrong. Call getBytes("UTF-8") as you did (better: getBytes(StandardCharsets.UTF_8)) and send the encoding information along with your response, either in the HTTP header (Content-Type: application/xml; charset=utf-8), as you're probably using HTTP, or in the XML declaration (<?xml version="1.0" encoding="utf-8"?>). Doing both gives you the best compatibility.
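Both symptoms from the question (the question mark and the "strange symbols") can be reproduced without a servlet. The sketch below is only an illustration; the class name is made up, and US-ASCII and ISO-8859-1 stand in for whatever default or guessed charset was involved in the original setup.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only; the class name is an assumption.
final class EncodingSketch {

    // Encode explicitly as UTF-8 instead of relying on the platform default charset.
    static byte[] toUtf8(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String circle = "\u25CB"; // the WHITE CIRCLE character from the question

        byte[] utf8 = toUtf8(circle);
        System.out.println(utf8.length); // number of bytes in the UTF-8 encoding

        // A charset that cannot represent the character substitutes '?'.
        byte[] ascii = circle.getBytes(StandardCharsets.US_ASCII);
        System.out.println((char) ascii[0]);

        // Decoding UTF-8 bytes with the wrong charset yields the "strange symbols".
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1));

        // Decoding with the matching charset restores the original text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(circle));
    }
}
```

The bytes produced by getBytes("UTF-8") were therefore probably correct; the receiver just needs to be told to decode them as UTF-8.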

Servlet: Image upload with content type as image/jpeg

I am using the SAPUI5 UploadCollection control to upload a set of images, and a servlet to process the POST request.
Problem 1: I don't have the slightest idea how to parse the content to get the images in doPost.
Problem 2: For the UploadCollection, it's not advisable to change the content type by modifying the header parameters. So I need to get those images in the servlet without multipart as the content type.
I have seen dozens of examples, all using multipart as the content type. I need a solution where the content type sent by the browser is image/*. Hints or code snippets would do.
I am not sure which examples you have seen. Normally the UploadCollection never uses multipart. You can check the code of the UploadCollection here and see that the FileUploaders are always built with useMultipart: false.
Also, if you check the examples from the Explored app, you will see that the content type is image/png or whatever type of file you select (on Chrome).
I am not really sure what the behaviour is on IE 8 / 9, where things are a little different (uploads through AJAX are not supported).
The multipart content type is controlled by the useMultipart property of the FileUploader. If you need to play around with this value, you will need to replace the default upload button from the UploadCollection. To do this, simply make the default upload button invisible (using the uploadButtonInvisible property) and add your own FileUploader in the toolbar of the UploadCollection.
Related to the servlet question: it depends on what you want to do with the image. You can get the InputStream from the request and then use it for whatever you need. The input stream will contain the image itself (if the content is not multipart, that is).
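As a rough sketch of "use it for whatever you need", the helper below copies a stream fully into a byte array. In doPost you would pass request.getInputStream(); here a ByteArrayInputStream stands in for it so the example runs on its own. The class and method names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative helper; the class and method names are assumptions.
final class RequestBodyReader {

    // Copies the whole stream into memory; fine for small images, not for huge uploads.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        try {
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for request.getInputStream() inside doPost.
        InputStream fakeBody = new ByteArrayInputStream(new byte[] {(byte) 0xFF, (byte) 0xD8, 0x01});
        System.out.println(readAll(fakeBody).length);
    }
}
```

The resulting bytes can then be written to disk or handed to an image library, depending on what the servlet is supposed to do with the upload.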

Stopping a Servlet from returning a Response

Had a look through SO and couldn't find a question similar to what I'm after. I'll start off by explaining what I'm trying to do, then finish up with a more specific question.
My aim
I have a link that passes a query string parameter to my servlet. That parameter is report. If report = true in the servlet, then I'll generate a PDF document. The PDF document is then returned by setting the response's MIME type to application/pdf. This code is shown below:
String mimeType = "application/pdf";
res.setContentType(mimeType);
res.setHeader("Content-Disposition", "attachment; filename=\"" +
        getEventID(doc) + ".pdf\"");
// Set the response content type and pdf attachment.
out = new ByteArrayOutputStream();
// All PDF Data is pushed onto the output stream.
com.lowagie.text.Document pdfDoc = buildPDF(getEventID(doc));
This code is then written to the response object's output stream.
if(pdfDoc == null)
{
    // Something went wrong in generating the report.
    return false;
}
// Create the PDF document.
out.writeTo(res.getOutputStream());
If all goes well, the class returns true. If not, it returns false. Now, the problem I'm having is if it returns false. Essentially, I want to point blank stop the data from going anywhere. I added the check to make sure things went well, before I write anything to the output stream, so at the moment what I have is a response that is set to PDF type, but contains no data, if something goes wrong that is.
Next, I have a function that will test the output of the class. If it's true, then all is good, but if it is false, then it sets an error parameter:
if(!PdfReportGenerator.generateReport(res, repositoryURI)) {
    req.getSession().setAttribute(SDRestServlet.PDF_ERROR, "error");
    // This will then re-direct back to the current URL, meaning the page
    // looks like it doesn't do anything.
    res.sendRedirect(req.getRequestURI());
}
The problem is, this re-directing is really not helping at all. It's messing up other values that are stored in the request and, while it's making the page appear like it's doing nothing, it doesn't allow me to output an error message to the user.
The issue
While I know how to make it seem like the web response is not returning, it means that I can not output any meaningful information to the user, which is obviously not the ideal outcome.
My question
Is there a way to force the servlet to stop, or return something so that the browser ignores the data?
My second question is, if there is something I can send back to the browser, is there anything I can do on the client side to cause a message to pop up (can be as simple as alert())?
I've been as clear as I possibly can be, so if there's anything you need to know, just ask :)
"Is there a way to force the servlet to stop, or return something so that the browser ignores the data?"
Please try setting a zero-length response using the method ServletResponse.setContentLength(int).
"My second question is, if there is something I can send back to the browser, is there anything I can do on the client side to cause a message to pop up (can be as simple as alert())?"
Yes, you can, but you need to set the content type header back to "text/html" and set all the variables as you would for a normal server request.
SECOND APPROACH:
If I had to build it from scratch, I would follow this approach:
First make an AJAX call to find out whether the PDF can be generated.
If the response is false, show an error message.
If the response is true, send a request to the server to generate the PDF.
Hopefully I was able to help you a bit here.

how can I clean and sanitize a url submitted by a user for redisplay in java?

I want a user to be able to submit a url, and then display that url to other users as a link.
If I naively redisplay what the user submitted, I leave myself open to urls like
http://somesite.com' ><script>[any javascript in here]</script>
that when I redisplay it to other users will do something nasty, or at least something that makes me look unprofessional for not preventing it.
Is there a library, preferably in java, that will clean a url so that it retains all valid urls but weeds out any exploits/tomfoolery?
Thanks!
URLs containing ' are perfectly valid. If you are outputting them to an HTML document without escaping, then the problem lies in your lack of HTML-escaping, not in the input checking. You need to ensure that you are calling an HTML encoding method every time you output any variable text (including URLs) into an HTML document.
Java does not have a built-in HTML encoder (poor show!) but most web libraries do (take your pick, or write it yourself with a few string replaces). If you use JSTL tags, you get escapeXml to do it for free by default:
<a href="<c:out value="${link}"/>">ok</a>
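If no library encoder is at hand, the "few string replaces" mentioned above might look roughly like the following. This is a hand-rolled sketch with a made-up class name, covering the five characters that matter in HTML text and quoted attribute values; a maintained library encoder is usually the safer choice.

```java
// Hand-rolled encoder for illustration; the class name is an assumption.
final class HtmlEscaper {

    // Escapes the characters that are significant in HTML text and quoted attributes.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String hostile = "http://somesite.com' ><script>alert(1)</script>";
        // The quote and angle brackets can no longer close the attribute or open a tag.
        System.out.println("<a href=\"" + escape(hostile) + "\">link</a>");
    }
}
```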
Whilst your main problem is HTML-escaping, it is still potentially beneficial to validate that an input URL is valid to catch mistakes - you can do that by parsing it with new URL(...) and seeing if you get a MalformedURLException.
You should also check that the URL begins with a known-good protocol such as http:// or https://. This will prevent anyone using dangerous URL protocols like javascript: which can lead to cross-site-scripting as easily as HTML-injection can.
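The two checks described here (parse with new URL(...) and allow only known-good protocols) could be combined along these lines. The class and method names are assumptions made for the example, and passing this check does not replace HTML-escaping on output.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Illustrative validator; the class and method names are assumptions.
final class UrlCheckSketch {

    // Accepts only strings that parse as a URL and use http or https.
    @SuppressWarnings("deprecation") // the URL(String) constructor is deprecated on recent JDKs
    static boolean isAcceptable(String candidate) {
        try {
            URL url = new URL(candidate);
            String protocol = url.getProtocol();
            return "http".equals(protocol) || "https".equals(protocol);
        } catch (MalformedURLException e) {
            // Unparseable input, or a protocol the JDK does not know (e.g. javascript:).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("http://somesite.com"));
        System.out.println(isAcceptable("javascript:alert(1)"));
    }
}
```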
I think what you are looking for is output encoding. Have a look at OWASP ESAPI, which is a tried and tested way to perform encoding in Java.
Also, just a suggestion: if you want to check whether a user is submitting a malicious URL, you can check it against Google's malware database. You can use the Safe Browsing API for that.
You can use the UrlValidator from Apache Commons Validator:
// org.apache.commons.validator.routines.UrlValidator
String[] schemes = {"http", "https"};
UrlValidator urlValidator = new UrlValidator(schemes);
if (urlValidator.isValid("http://somesite.com")) {
    // valid
}

Selenium 2: Detect content type of link destinations

I am using the Selenium 2 Java API to interact with web pages. My question is: How can I detect the content type of link destinations?
Basically, this is the background: Before clicking a link, I want to be sure that the response is an HTML file. If not, I need to handle it in another way. So, let's say there is a download link for a PDF file. The application should directly read the contents of that URL instead of opening it in the browser.
The goal is to have an application which automatically knows whether the current location is an HTML, PDF, XML or other document, so it can use the appropriate parser to extract useful information from it.
Update
Added bounty: Will reward it to the best solution which allows me to get the content type of a given URL.
As Jochen suggests, the way to get the Content-Type without also downloading the content is HTTP HEAD, and the Selenium WebDrivers do not seem to offer functionality like that. You'll have to find another library to help you fetch the content type of a URL.
A Java library that can do this is Apache HttpComponents, especially HttpClient.
(The following code is untested)
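import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpHead;
import org.apache.http.impl.client.DefaultHttpClient;
HttpClient httpclient = new DefaultHttpClient();
HttpHead httphead = new HttpHead("http://foo/bar");
HttpResponse response = httpclient.execute(httphead);
Header contenttypeheader = response.getFirstHeader("Content-Type"); // returns Header, not BasicHeader
System.out.println(contenttypeheader);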
The project publishes JavaDoc for HttpClient; the documentation for the HttpClient interface contains a nice example.
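If adding a dependency is not an option, the same HEAD request can be made with the JDK's own HttpURLConnection. The following is a minimal sketch under that assumption; the class and method names are invented for the example, it expects an http or https URL, and treating application/xhtml+xml as HTML is my own choice rather than something from the question.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Illustrative probe; the class and method names are assumptions.
final class ContentTypeProbe {

    // Sends a HEAD request and returns the raw Content-Type header, or null if absent.
    // Assumes an http(s) URL; other schemes would fail the cast below.
    static String fetchContentType(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD");
            return conn.getContentType();
        } finally {
            conn.disconnect();
        }
    }

    // True for header values such as "text/html" or "Text/HTML; charset=UTF-8".
    static boolean isHtml(String contentType) {
        if (contentType == null) {
            return false;
        }
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("text/html") || mediaType.equals("application/xhtml+xml");
    }
}
```

A caller would take the href of a link found through Selenium, pass it to fetchContentType, and click the link in the browser only when isHtml returns true.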
You can figure out the content type while processing the incoming data.
Not sure why you need to figure this out first.
If so, use the HEAD method and look at the Content-Type header.
You can retrieve all the URLs from the DOM, and then parse the last few characters of each URL (using a Java regex) to determine the link type.
You can parse the characters following the last dot. For example, in the URL http://yoursite.com/whatever/test.pdf, extract the pdf, and apply your test logic accordingly.
Am I oversimplifying your problem?
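A sketch of the extension approach, using java.net.URI instead of a regex so that query strings and fragments are ignored. The class name is made up for the example. Keep in mind that the extension is only a hint: the server decides which Content-Type it actually sends.

```java
import java.net.URI;
import java.util.Locale;

// Illustrative helper; the class name is an assumption.
final class LinkExtension {

    // Returns the lower-cased extension of the URL's path, or "" if there is none.
    static String of(String url) {
        String path = URI.create(url).getPath(); // excludes query string and fragment
        if (path == null) {
            return "";
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // The dot must sit inside the last path segment and must not be its final character.
        if (dot <= slash || dot == path.length() - 1) {
            return "";
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(of("http://yoursite.com/whatever/test.pdf"));
    }
}
```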
