Spring request body problem with non-valid xml - java

In our REST application we use @RequestBody StreamSource to upload an XML file. The problem is that if the XML itself is not valid or contains invalid characters, the PUT request fails (with an HTTP 400 Bad Request response) before our logic runs, so we cannot tell the client what exactly went wrong. I know it is possible to use a plain String for the request body, but does that make sense? I guess that if I upload a 100 MB XML file, each request will create a String of the same size, whereas with StreamSource we read the input stream only as we need it.
What are the pros and cons of using String or StreamSource as the request body? If I do it with StreamSource, will it scan the whole XML?

You already know the con of StreamSource: it does not let you pre-process the XML in case it is invalid.
Using String: when the XML is that large, it is a performance killer. Avoid holding such large objects in memory in Java: in the multi-threaded environment of an application server it can easily lead to OutOfMemoryError, and it makes your application vulnerable to DoS attacks.
The best solution is to map @RequestBody as an InputStream and process that InputStream with a SAX parser. Memory consumption stays low (a SAX parser doesn't keep the XML structure in memory), and you can handle the exceptions thrown during processing.
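A minimal sketch of that idea, using only the JDK's SAX classes. The class and method names are hypothetical, and in a Spring controller the InputStream would come from the handler method's request body parameter; here it is passed in directly so the sketch runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Spring or of the original question.
class XmlUploadSketch {

    /** Streams the XML through SAX; returns null on success, else a message for the client. */
    static String findXmlError(InputStream body) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations to guard against XXE and entity-expansion attacks.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // DefaultHandler ignores all content; a real handler would process elements here.
            factory.newSAXParser().parse(body, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // Thrown for malformed input; carries the position of the problem.
            return "Malformed XML at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            return "Could not read XML: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] broken = "<a><b></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println(findXmlError(new ByteArrayInputStream(broken)));
    }
}
```

The returned message can then be placed in the body of the 400 response, which is exactly the information the client was missing.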

Related

Fortify Cross-Site Scripting Persistent on Java Rest API response (JSON string & XML string)

I understand that to fix cross-site scripting, I need to validate the user input and encode the output so that the browser does not execute malicious data.
However, my application is a pure REST API that returns JSON and XML strings. Fortify reported persistent (stored) cross-site scripting because the code queries data from the database and returns it in the response.
#Java Code
@PostMapping(path = "${api.abc.endpoint}")
public ResponseEntity<String> processRequest(@RequestBody String requestStr,
        HttpServletRequest servletRequest) {
    ResponseEntity<String> response = null;
    String responseStr = "";
    responseStr = processRequest(requestStr, servletRequest);
    response = ResponseEntity.ok().body(responseStr);
    return response; //response can be JSON or XML
}
#Original JSON Response
{
"type":"order",
"responseCode":"001",
"responseText":"Success",
"transDesc":"Value from DB"
}
#Original XML Response
<abc:res xmlns:abc="http://sample.com/abc/">
<type>order</type>
<responseCode>001</responseCode>
<responseText>Success</responseText>
<transDesc>Value from DB</transDesc>
</abc:res>
I tried to encode the output string using the OWASP Java Encoder and got the encoded strings below, which changed the response format.
#Encoded JSON Response
{\"type\":\"order\",\"responseCode\":\"001\",\"responseText\":\"Success\",\"transDesc\":\"Value from DB\"}
#Encoded XML Response
<data contentType="application/xml;charset=UTF-8" contentLength="241">
<![CDATA[<abc:res xmlns:abc="http://sample.com/abc/"><type>order</type><responseCode>001</responseCode><responseText>Success</responseText><transDesc>Value from DB</transDesc></abc:res>]]></data>
How can I actually fix the persistent cross-site scripting finding in Fortify for JSON and XML strings?
Thanks.
Fortify may be too eager to detect XSS, as it assumes any data you produce could end up directly interpreted as HTML. Content sent back to the browser with an XML or JSON content type isn't vulnerable to XSS by itself, though. Check that the Content-Type header being sent back isn't text/html.
The issue may be that a client reads part of the response and outputs it as-is onto the page. The encoding would be the client's responsibility in that case, since which encoding to use depends on the output context.
Many client-side frameworks HTML-encode data as necessary by default. If you control the client, you should check whether it's doing its own encoding here.
Input validation can help in general too, either here or in related requests that write to the database. Input can be validated according to what its content should be.
How was the Fortify persistent cross-site scripting issue above solved for the database call and for sending the output as a ResponseEntity?
Leaving my solution in case this helps people in the future.
My app security team needed Fortify to completely resolve the issue.
What worked for me was grabbing all the keys and values in the JSON and running them through the HTML-escaping function from the Apache Commons Lang library (org.apache.commons.lang3; the escaping methods there live in StringEscapeUtils, e.g. escapeHtml4, rather than in StringUtils).
As the user above mentioned, Fortify tries to make sure that user-supplied data is HTML-encoded.
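The difference from encoding the whole response is that only the individual values are escaped, so the JSON or XML structure stays intact. A rough sketch of that approach follows; the class, the method names, and the hand-rolled escaper are illustrative stand-ins for a library encoder, not the code the poster actually used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a library HTML encoder; names are assumptions.
class HtmlEscapeSketch {

    /** Escapes the five characters that are significant in HTML text and attribute values. */
    static String escapeHtml(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Escapes each field value but leaves the keys and the overall structure untouched. */
    static Map<String, String> escapeValues(Map<String, String> fields) {
        Map<String, String> escaped = new LinkedHashMap<>();
        fields.forEach((key, value) -> escaped.put(key, escapeHtml(value)));
        return escaped;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("transDesc", "<script>alert(1)</script>");
        System.out.println(escapeValues(fields));
    }
}
```

The escaped map would then be serialized to JSON or XML as before. Whether this is appropriate depends on the clients: one that already encodes on output will show doubly escaped text.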

REST call in Java

I have a few questions about a specific REST call I'm making in JAVA. I'm quite the novice, so I've cobbled this together from several sources. The call itself looks like this:
String src = AaRestCall.subTrackingNum(trackingNum);
The Rest call class looks like this:
public class AaRestCall {
public static String subTrackingNum (String trackingNum) throws IOException {
URL url = new URL("https://.../rest/" + trackingNum);
String query = "{'TRACKINGNUM': trackingNum}";
//make connection
URLConnection urlc = url.openConnection();
//use post mode
urlc.setDoOutput(true);
urlc.setAllowUserInteraction(false);
//send query
PrintStream ps = new PrintStream(urlc.getOutputStream());
ps.print(query);
ps.close();
//get result
BufferedReader br = new BufferedReader(new InputStreamReader(urlc
.getInputStream()));
StringBuilder sb = new StringBuilder();
String line = null;
while ((line=br.readLine())!=null) {
sb.append(line);
}
br.close();
return sb.toString();
}
}
Now, I have a few questions on top of what is wrong with this in general.
1) If this rest call is returning a JSON object, is that going to get screwed up by going to a String?
2) What's the best way to parse out the JSON that is returning?
3) I'm not really certain how to format the query field. I assume that's supposed to be documented in the REST API?
Thanks in advance.
REST is a pattern applied on top of HTTP. From your questions, it seems to me that you first need to understand how HTTP (and chatty socket protocols in general) works and what the Java API offers for dealing with it.
You can use any JSON library out there to parse the HTTP response body (provided the response is a 200 OK, which you need to check for, and also watch out for HTTP redirects!), but that's not how things are usually built.
If the service exposes a real RESTful interface (as opposed to a simpler HTTP+JSON one) you'll need to use four HTTP verbs, and a plain URLConnection doesn't let you choose them (you would have to cast it to HttpURLConnection and call setRequestMethod). Plus, you'll likely want to add headers for authentication, or maybe cookies (which are in fact just HTTP headers, but are still worth considering separately). So my suggestion is to build the client-side part of the service with the Apache HttpClient, or with a JAX-RS library that has client support (for example Apache CXF). That way you'll have full control of the communication while also getting nicer abstractions to work with, instead of consuming the InputStream provided by your URLConnection and manually serializing/deserializing parameters and responses.
Regarding the bit about how to format the query field, again you first need to grasp the basics of HTTP. Anyway, the definite answer depends on the remote service implementation, but you'll face four options:
1) The query string in the service URL
2) A form-encoded body of your HTTP request
3) A multipart body of your HTTP request (similar to the former, but the different MIME type is enough to give some headache) - this is often used in HTTP+JSON services that also have a website, and the same URL can be used for uploading a form that contains a file input
4) A service-defined (for example application/json, or application/xml) encoding for your HTTP body (again, it's really the same as the previous two points, but the different MIME encoding means that you'll have to use a different API)
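To make three of these options concrete, here is a rough sketch that builds the corresponding requests with the JDK's built-in java.net.http API (Java 11+), used here in place of the Apache HttpClient suggested above. The endpoint URL, the parameter name trackingNum, and the JSON field name are assumptions for illustration; the real ones have to come from the service's documentation.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical request builders; nothing here is taken from a real service.
class TrackingRequestSketch {
    // Placeholder endpoint, an assumption for this sketch.
    static final String BASE = "https://example.com/rest/tracking";

    /** The value travels in the query string of a GET request. */
    static HttpRequest asQueryString(String trackingNum) {
        String encoded = URLEncoder.encode(trackingNum, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE + "?trackingNum=" + encoded))
                .GET()
                .build();
    }

    /** The value travels in a form-encoded POST body. */
    static HttpRequest asFormBody(String trackingNum) {
        String form = "trackingNum=" + URLEncoder.encode(trackingNum, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
    }

    /** The value travels in a JSON POST body (a JSON library would normally build this). */
    static HttpRequest asJsonBody(String trackingNum) {
        String escaped = trackingNum.replace("\\", "\\\\").replace("\"", "\\\"");
        String json = "{\"TRACKINGNUM\": \"" + escaped + "\"}";
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = asQueryString("AB 12");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Any of these requests can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the status code of the returned response should be checked before the body is parsed.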
Oh my. There are a couple of areas where you can improve on this code. I'm not even going to point out the errors, since I'd like you to replace the HTTP calls with an HTTP client library. I'm also unaware of the spec required by your API, so getting you to use the POST or GET methods properly at this level of abstraction would take more work.
1) If this rest call is returning a JSON object, is that going to get
screwed up by going to a String?
No, but unmarshalling that JSON into an object is your job. A library like Google Gson can help.
2) What's the best way to parse out the JSON that is returning?
I like to use Gson, as I mentioned above, but you can use another marshal/unmarshal library.
3) I'm not really certain how to format the query field. I assume
that's supposed to be documented in the REST API?
Yes. Take a look at the documentation and come up with Java objects that mirror the JSON structure. You can then parse the JSON into them with the following code.
gson.fromJson(json, MyStructure.class);
Http client
Please take a look at writing your HTTP client using a library like Apache HttpClient, which will make your job much easier.
Testing
Since you seem to be new to this, I'd also suggest you take a look at a tool like Postman which can help you test your API calls if you suspect that the code you've written is faulty.
I think that you should use a REST client library instead of writing your own, unless it is for educational purposes - then by all means go nuts!
The REST service will respond to your call with an HTTP response, whose payload may or may not be formatted as a JSON string. If it is, I suggest that you use a JSON parsing library to convert that String into a Java representation.
And yes, you will have to consult the particular REST API's documentation for details.
P.S. The Java URL class is broken (for example, its equals and hashCode methods can trigger DNS lookups); use URI instead.

JSoup getting content type then data

Currently I'm retrieving the data from a URL using the following code
Document doc = Jsoup.connect(url).get();
Before I fetch the data I've decided I want to get the content type, so I do that using the following.
Connection.Response res = Jsoup.connect(url).timeout(10*1000).execute();
String contentType = res.contentType();
Now I'm wondering, is this making 2 separate connections? Is this not efficient? Is there a way for me to get the content type and the document data in 1 single connection?
Thanks
Yes, Jsoup.connect(url).get() and Jsoup.connect(url).timeout(10*1000).execute() are two separate connections. Maybe you are looking for something like
Connection.Response resp = Jsoup.connect(url).timeout(10*1000).execute();
String contentType = resp.contentType();
and later parse body of response as a Document
Document doc = resp.parse();
Anyway, by default Jsoup parses only text/*, application/xml, or application/xhtml+xml; if the content type is something else, such as application/pdf, it throws UnsupportedMimeTypeException, so you shouldn't need to worry about it.
Without looking at the Jsoup internals we can't know. Typically, when you want only the headers of a file (the content type in your case) without downloading the actual content, you use the HTTP HEAD method instead of the GET method on the same URL. Perhaps the Jsoup API lets you set the method; that code doesn't seem to do so, so I'd wager it's actually getting the entire file.
The HTTP spec allows clients to reuse a connection for later requests. These are called persistent connections, and they avoid having to create a new connection for each call to the same server. However, it's up to the client (Jsoup in this case, since you aren't handling the connections in your code) to make sure it doesn't close the connection after each request.
I believe the overhead of creating two connections is offset by not downloading the entire file when your code decides it shouldn't, because the file is not of the content type you want.
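A HEAD request returns only the response headers, so the content type can be checked without downloading the body. Below is a minimal sketch of such a probe using the JDK's java.net.http API (Java 11+) rather than Jsoup; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

// Hypothetical helper; not part of Jsoup.
class ContentTypeProbeSketch {

    /** Builds a HEAD request: same URL as the GET, but no body is expected back. */
    static HttpRequest headRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Sends the HEAD request and returns the Content-Type header, if the server sent one. */
    static Optional<String> contentType(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpResponse<Void> response =
                client.send(headRequest(url), HttpResponse.BodyHandlers.discarding());
        return response.headers().firstValue("Content-Type");
    }

    public static void main(String[] args) {
        // Only builds the request; contentType(...) performs the actual network call.
        System.out.println(headRequest("https://example.com/").method());
    }
}
```

Reusing one HttpClient instance for the probe and for the later download lets the client keep the connection open between the two requests where the server allows it.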

How can I pass JSON as well as File to REST API in JAVA?

My main question is: how can I pass JSON as well as a file in a POST request to a REST API? What is needed in the Spring framework to act as the client, send a POST with both JSON and a file, and wait for the response?
Options:
1) Do I need to use FileRepresentation with ClientResource? But then how can I pass the file as well as the JSON?
2) Use RestTemplate to pass both the JSON and the file? How can it be used to post both?
3) Is any other option available?
Sounds like an awful resource you're trying to expose. My suggestion is to separate them into 2 different requests. Maybe the JSON has the URI for the file to then be requested…
From a REST(ish) perspective, it sounds like the resource you are passing has a multipart/mixed content type. One part will be application/json, and one will be whatever type the file is. Either or both could be base64 encoded.
You may need to write specific providers to serialize/deserialize this data. Depending on the particular REST framework, this article may help.
An alternative is to create a single class that encapsulates both the JSON and the file data, and then write a provider specific to that class. You could optionally create a new content type for it, such as "application/x-combo-file-json".
You basically have three choices:
1) Base64 encode the file, at the expense of increasing the data size by around 33%.
2) Send the file first in a multipart/form-data POST, and return an ID to the client. The client then sends the metadata with the ID, and the server re-associates the file and the metadata.
3) Send the metadata first, and return an ID to the client. The client then sends the file with the ID, and the server re-associates the file and the metadata.
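The 33% figure for Base64 follows from the encoding itself: every 3 input bytes become 4 output characters. A small sketch with java.util.Base64 shows the effect (the class name and the 300,000-byte dummy payload are made up for illustration):

```java
import java.util.Base64;

// Hypothetical demo class; the payload stands in for real file contents.
class Base64OverheadSketch {

    /** Length in characters of the Base64 text that would be embedded in the JSON. */
    static int encodedLength(byte[] data) {
        return Base64.getEncoder().encodeToString(data).length();
    }

    public static void main(String[] args) {
        byte[] file = new byte[300_000];
        // 300,000 bytes / 3 * 4 = 400,000 characters
        System.out.println(encodedLength(file)); // prints 400000
    }
}
```

The encoded text also has to be held in memory next to the JSON document, which is another reason the two-request options are often preferred for large files.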

Find if InputStream of DataHandler is empty

In my application I am developing a web service that receives an attached file.
The file is mapped to a DataHandler object via JAXB,
and I have access to the file via DataHandler.getInputStream().
My problem is this:
When the file attribute exists in the web service request but no file is attached,
I still get the DataHandler object, and its getInputStream().available() = 11 bytes
(a header I guess...??).
So how can I know that the InputStream is empty?
Thanks,
Alon
Read it and parse the data as it should be parsed. The answer is in there.
InputStream#available() does not return the length of the stream, as you seem to think. In some cases it may (by coincidence), but you shouldn't rely on that. It only returns an estimate of the number of bytes that can be read without blocking. Just read the stream fully the usual Java IO way, until read() returns -1, and then inspect the data you received.
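The usual read loop can be sketched as follows. The class and method names are hypothetical, and the sketch buffers everything in memory, which is only reasonable for attachments of modest size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; in the web service the stream would come from DataHandler.getInputStream().
class StreamReadSketch {

    /** Reads the stream until read() returns -1 and returns everything that was received. */
    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = readFully(new ByteArrayInputStream(new byte[0]));
        System.out.println(data.length == 0 ? "no content" : data.length + " bytes received");
    }
}
```

Whether a short payload such as the 11 bytes mentioned in the question counts as "no file" can then be decided by looking at the bytes themselves rather than at available().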
