Java regex URL for MockServer expectation

I'm trying to set up some expectations in MockServer where I have a specific expectation for requests matching this path
/api/users/:user_id
using the regex /api/users/.*.
However, I have a few other expectations which I want to match when accessing user-specific resources:
/api/users/:user_id/books
/api/users/:user_id/books/:book_id
/api/users/:user_id/holidays
I'm not sure how to use regexes to match the first path without also capturing the requests for the user-specific resources (currently, all requests match on the first path).
For example, I believe for the /api/users/:user_id/books path, I can use the regex /api/users/.*/books, but this will never match, because the regex /api/users/.* for the first path will always match these deeper URLs.
I've been reading this page, but can't quite figure out how to correctly use the regexes for this particular case

One general approach here might be to use negative lookahead assertions. For the general case, you could use this pattern:
/api/users/\d+/(?!books|holidays)
Then, for the more specific cases, use patterns which you already had in mind, e.g.
/api/users/\d+/books
/api/users/\d+/holidays
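An alternative that avoids lookaheads is to stop the general pattern at the next slash with [^/]+, so it cannot reach into deeper paths. The sketch below assumes the regex is applied to the whole path (as java.util.regex's matches() does); whether MockServer matches this way should be checked against its documentation. The class and constant names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: names are made up, and full-string matching is an assumption.
final class UserPathPatterns {
    // [^/]+ matches one path segment, so it cannot swallow "/books" etc.
    static final Pattern USER = Pattern.compile("/api/users/[^/]+");
    static final Pattern USER_BOOKS = Pattern.compile("/api/users/[^/]+/books");
    static final Pattern USER_BOOK = Pattern.compile("/api/users/[^/]+/books/[^/]+");
    static final Pattern USER_HOLIDAYS = Pattern.compile("/api/users/[^/]+/holidays");

    // True only if the pattern matches the entire path.
    static boolean fullMatch(Pattern pattern, String path) {
        return pattern.matcher(path).matches();
    }
}
```

With these patterns, fullMatch(USER, "/api/users/42") is true while fullMatch(USER, "/api/users/42/books") is false, so each path is claimed by exactly one expectation.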

Related

Java Url Validation With Placeholders

I have built an API where you can register a callback URL.
The URLs are validated using the Apache UrlValidator class.
I now have to add a feature that allows placeholders in the configured URL.
https:/foo.com/${placeholder1}/bar/${placeholder2}
These placeholders will be dynamically replaced using the Apache StrSubstitutor or something similar.
Now my issue: how do I validate URLs that contain placeholders?
I have thought of a solution:
I replace the expected placeholders with an example value.
Then I validate the URL using the Apache UrlValidator.
My issue with this solution is that the Apache UrlValidator only returns a boolean, so the error message will be quite ambiguous.
Is there a solution other than writing my own regex?
Update : following discussions in the comments
There is a finite number of allowed placeholders.
The format of the Strings that will replace the placeholders is also known.
The first objective is to be able to check, at the time it is configured, whether the given URL (which may contain placeholders) is valid.
The second objective is to return an intelligible error message if the URL is not valid.
There are multiple error cases:
A placeholder used in the URL is not in the allowed placeholder list
The URL is not valid independently of the placeholders

For a minimal URL validation, you could use the java.net.URL constructor (it will work with your https:/foo.com/${placeholder1}/bar/${placeholder2} example).
According to the docs, it throws:
MalformedURLException - if no protocol is specified, or an unknown protocol is found, or spec is null.
You can then leverage the URL methods as a bonus, to get parts of it such as path, protocol, etc.
I would definitely advise against re-inventing the wheel with regex for URL validation.
Note that java.net.URI has a much stricter validation and would fail your example with placeholders as is.
Edit
As discussed, since you need to validate placeholders as well, you probably want to actually try to fill them first and fail fast if something's wrong, then proceed and validate the populated URL against java.net.URI, for strict validation.
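A minimal sketch of that fill-then-validate flow, using only the JDK rather than StrSubstitutor. The class name, the samples map (allowed placeholder name to example value) and the error wording are all hypothetical; the extra host check is my own addition, because java.net.URI accepts https:/foo.com/... as a host-less URI.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: not the Apache validator, just a JDK-based sketch.
final class CallbackUrlValidator {
    // Matches ${name} and captures the name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    /** Returns null if the template is valid, otherwise an error message. */
    static String validate(String template, Map<String, String> samples) {
        Set<String> unknown = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer filled = new StringBuffer();
        while (m.find()) {
            String sample = samples.get(m.group(1));
            if (sample == null) {
                unknown.add(m.group(1)); // remember every disallowed placeholder
                sample = "";
            }
            m.appendReplacement(filled, Matcher.quoteReplacement(sample));
        }
        m.appendTail(filled);
        if (!unknown.isEmpty()) {
            return "Unknown placeholder(s): " + unknown;
        }
        try {
            URI uri = new URI(filled.toString()); // strict syntax check
            if (uri.getScheme() == null || uri.getHost() == null) {
                return "URL must have a scheme and a host: " + filled;
            }
            return null;
        } catch (URISyntaxException e) {
            return "Invalid URL after substitution: " + e.getMessage();
        }
    }
}
```

Because the two failure modes are reported separately, the caller can tell a disallowed placeholder apart from a URL that is malformed regardless of its placeholders.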
General caveat
You might also want to make your life easier and leverage an existing framework that would allow you to use annotated path variables in the first place (e.g. Spring, etc.), but that's quite a broad discussion.

regex to disallow access to parent directories - java

I need to create a regex, to be used on my server, that makes sure all the files a user requests access to are under a specific directory. Let's name that directory UserFiles and assume that it is under the path /Server/Users/Bob/UserFiles.
When a client sends a request to read a file, I want to validate that the requested path is under /Bob/UserFiles/.
I thought about making sure that the path always begins with /UserFiles/ and that there is no .. in the path (which would also protect me from restricted access like /UserFiles/../../noAccess.txt).
examples of not allowed inputs:
C:/UserFiles/
../../Alice/txt.txt
/UserFiles/../../noAccess.txt
examples of allowed input:
/UserFiles/UserFiles/Alice/txt.txt
/UserFiles/txt.txt
/UserFiles/Bob/Bob/txt.txt
I cannot think of any case where this wouldn't work. I also tried to build the regex, but it is not quite right, as it allows inputs like /UserFiles//txt.txt (and it might allow other inputs that it shouldn't, which I am not aware of).
So is my idea complete, or are there other cases I haven't thought of? If my idea is complete, could you please help me fix my regex?
(?!\.\.)^\/UserFiles\/[/\w,\s-]+\.[A-Za-z]{3}$

How about resolving the path and checking only afterwards (note, the behaviour is OS-dependent):
new File(input).getCanonicalPath().startsWith("/UserFiles/")
Or, depending on how to interpret your question:
new File(input).getCanonicalPath().startsWith("/Server/Users/Bob/UserFiles/")
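The same resolve-then-check idea can be sketched with java.nio.file.Path, whose startsWith compares whole path elements rather than characters. The class and method names are hypothetical, and the sketch assumes the client sends a path relative to the UserFiles directory. Note that normalize() only removes . and .. segments lexically; it does not follow symbolic links, so toRealPath() on an existing file would be needed if links matter.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Illustrative only: a lexical containment check, not a full access-control layer.
final class PathGuard {
    /** True if 'requested', resolved against baseDir, stays inside baseDir. */
    static boolean isInside(Path baseDir, String requested) {
        try {
            Path base = baseDir.toAbsolutePath().normalize();
            // An absolute 'requested' replaces base entirely and then fails the check.
            Path resolved = base.resolve(requested).normalize();
            return resolved.startsWith(base);
        } catch (InvalidPathException e) {
            return false; // characters the file system does not accept
        }
    }
}
```

For example, with baseDir set to Paths.get("/Server/Users/Bob/UserFiles") on a Unix-like system, "Bob/Bob/txt.txt" is accepted and "../../Alice/txt.txt" is rejected. As with getCanonicalPath, the treatment of inputs such as C:/UserFiles/ depends on the operating system.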

how to mandate com/net/etc in url using regex?

I have the regex below. How can I require the URL to end in .com, .net, etc.?
String regex = "^(((https?|ftp)://|(www|ftp)\\.)[a-z0-9-]+(\\.[a-z0-9-]+)+([/?].*)?)|(http://)$";
Thanks!

I would recommend that you don't use a regex for this.
I'd recommend that you parse the URL using the URL class (or URI class if that is more appropriate), and then check that the hostname part ends with one of the required top-level domains.
I'd also recommend that you avoid hard-wiring a set of top-level domains into your code and/or your regexes.
A whole swathe of new TLDs are going to go live fairly soon. Like thousands of them ...
Even ignoring the new TLDs, the set of 2-letter country TLDs is not fixed. (Does South Sudan have a code yet?)
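A rough sketch of the parse-then-check approach, with the allowed TLDs passed in rather than hard-wired. The class and method names are hypothetical. Unlike the regex in the question, this version requires a scheme, so a bare www.example.com is rejected because java.net.URI reports no host for it.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Illustrative only: the allowed TLD set is supplied by the caller (e.g. from config).
final class TldCheck {
    static boolean hasAllowedTld(String url, Set<String> allowedTlds) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false; // no authority part, e.g. missing scheme
            }
            int dot = host.lastIndexOf('.');
            if (dot < 0) {
                return false; // single-label host such as "localhost"
            }
            String tld = host.substring(dot + 1).toLowerCase(Locale.ROOT);
            return allowedTlds.contains(tld);
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }
}
```

The set is expected to contain lowercase entries such as "com" and "net".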

This regex should do it:
[.](com|net|other)
But place it at the correct position in your big url regex (which is maybe not the best way to go...)

After you validate the URL protocol, you may use something like ^[a-zA-Z0-9.-]+\.(com|org|net|mil|edu|COM|ORG|NET)$

Simple JMeter Test does not work

we are trying to add a simple test using JMeter in a JSF Application. We followed the instructions in:
http://jmeter.apache.org/usermanual/build-adv-web-test-plan.html
It has a simple login page with user name and password and a submit button. You can see from the screenshots that we used a proxy. With the settings in the screenshot we are getting HTTP 500 Error. I am not sure if I placed the question in a right way.. Please ask if you need any clarification.
The error code is:
EDIT:
I think this is going to be the longest question on SO, but images are sometimes better than words. Anyway, what we have done is to send data equivalent to what we see in Firebug, but we are still getting the 500 error. You can see the Tomcat log in the attachments.

HTTP 5xx codes are related to server or application errors. Search log files first.
Your script doesn't need a "User Defined Variables" component, because there is no variable expression that really needs to be evaluated per thread/user.
The "Regular Expression Extractor" component suffices to extract the JSF ViewState value.
I suggest deleting the last part of your expression, " />", and changing the regular expression group from (.+?) to (\w+?), because it will evaluate to only a few matches (probably only 2). Change the value of the "Match No." field to 1 (there is no need to use random if all matched values are identical).
I didn't understand why you used both the "XPath Extractor" and the "Regular Expression Extractor" components to extract the same value. I prefer the latter when dealing with HTML. XPath is better when working with well-formed XML strings or files.
To capture a script from scratch, I suggest adding an "HTTP Proxy Server" inside the Workbench, configuring it, starting it, configuring a browser to use this proxy, and navigating those pages in the browser. This way you'll capture all the requests made and the request headers used by the browser you chose. After this, remove unnecessary requests and replace query parameters, like javax.faces.ViewState, with the corresponding variables.
Consider placing extractors (Post-Processors) inside an HTTP Sampler that runs before the one that will use the variable in its parameter values. For example, if the /EBS request comes first and the /EBS/login.xhtml request has a javax.faces.ViewState parameter, then the /EBS response will probably contain a hidden input with the javax.faces.ViewState value.
This is the usual structure of the JSF application test scripts I use. More information about the cause of the HTTP 500 error would help in finding a better solution.

On the Regular Expression Extractor for jsfViewState, add (?s) to the start of the regular expression. So you have:
(?s)<input type="hidden" name="javax\.faces\.ViewState" id="javax\.faces\.ViewState" value="(.+?)" />
This allows the (.+?) to span line break characters.
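The effect of (?s) can be illustrated in plain Java. JMeter uses its own regex engine, so this is only an analogy under the assumption that the flag behaves the same way there; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: shows how (?s) lets '.' match line breaks in java.util.regex.
final class ViewStateExtractor {
    static final String BODY =
        "<input type=\"hidden\" name=\"javax\\.faces\\.ViewState\""
        + " id=\"javax\\.faces\\.ViewState\" value=\"(.+?)\" />";

    /** Returns the captured value, or null if the expression does not match. */
    static String extract(String html, boolean dotAll) {
        Pattern p = Pattern.compile((dotAll ? "(?s)" : "") + BODY);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }
}
```

If the value attribute contains a line break, extract(html, false) returns null, while extract(html, true) returns the full value.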

Your regular expression extractor is in the wrong place. You cannot extract a value from the response to a request and then send it with the same request. The only way to achieve this is to use a time machine, but these don't exist yet and even if they did, it probably wouldn't work.
Typically you get a viewstate in the response to a GET and then you later need it in the POST of the same page. So, put the regular expression extractor as a child of the GET call where the login.xhtml page is first called (as a GET). If your recording does not include this GET call then either add it manually or examine the responses of previous calls before your login POST to find it, eg. maybe the GET homepage.xhtml (or similar) will include it.

What is the best open source Java package to filter/match URLs?

I have a high-performance application which deals with URLs. For every URL it needs to retrieve the appropriate settings from a predefined pool. Every settings object is associated with a URL pattern which indicates which URLs should use these settings. The matching rules are as follows:
"google.com" match pattern should match all URLs pointing to the google domain (thus, maps.google.com and www.google.com/match are matched).
"*.google.com" should match all URLs pointing to a subdomain of google.com (thus, maps.google.com matches, but google.com and www.google.com don't).
"maps.google.com" should match all URLs pointing to this specific subdomain.
Apart from the above rules, every match rule can contain a path, which means that the path part of the URL should start with the match rule path. So: "*.google.com/maps" matches "maps.google.com/maps" but not "maps.google.com/advanced".
As you can see, the rules above overlap. If two rules match the same URL, the most specific one should apply. The list above is ranked from least specific to most specific.
This seems to be such a standard problem that I was hoping to use a ready-made library rather than write it myself. Google reveals a couple of options, but without a clear way to choose between them. What would you recommend as a good library for this task?
Thanks,
Boaz

I don't think you need a specific library to solve this; the standard Java API has all that you need to write the code without too much work.
Take a look at java.util.regex.Pattern and work out the regular expressions you need to match each of your rules. You might also want to use java.net.URL to parse out the different fields from the URL.
You already said you have a priority scheme to handle scenarios where multiple patterns match the URL, so that should be the last piece for this puzzle.
It looks like a pretty straight-forward task.
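As a rough sketch of what such hand-written matching could look like, here is a version that uses plain string operations instead of regular expressions. All names are hypothetical, URLs are assumed to arrive as host plus optional path with no scheme, and "most specific" is approximated by "longest matching rule", which happens to agree with the ranking in the question but is my own assumption. It also treats www.google.com as an ordinary subdomain, unlike the question's second rule.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative only: a simple rule matcher, not tuned for high throughput.
final class UrlRuleMatcher {
    /** Splits "host/path" into {host, path}; the path is "" when absent. */
    static String[] split(String s) {
        int slash = s.indexOf('/');
        return slash < 0
            ? new String[] {s, ""}
            : new String[] {s.substring(0, slash), s.substring(slash)};
    }

    static boolean matches(String rule, String hostAndPath) {
        String[] r = split(rule);
        String[] u = split(hostAndPath);
        boolean hostOk;
        if (r[0].startsWith("*.")) {
            // "*.google.com": subdomains only, so the host must end with ".google.com".
            hostOk = u[0].endsWith(r[0].substring(1));
        } else {
            // "google.com": the domain itself or any subdomain of it.
            hostOk = u[0].equals(r[0]) || u[0].endsWith("." + r[0]);
        }
        return hostOk && u[1].startsWith(r[1]); // path prefix rule
    }

    /** Picks the longest matching rule as a stand-in for "most specific". */
    static Optional<String> bestRule(List<String> rules, String hostAndPath) {
        return rules.stream()
            .filter(rule -> matches(rule, hostAndPath))
            .max(Comparator.comparingInt(String::length));
    }
}
```

A linear scan like this is fine for a small pool of rules; for a large pool, indexing rules by domain suffix would be the natural next step.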
