Using regex to find chars in a string and replace - java

When returning a string value from an incoming request in my network-based app, I have a string like this:
'post http://a.com\r\nHost: a.com\r\n'
The issue is that the host is always changing, so I need to replace it with my own defined host. To accomplish that I tried using a regex, but I am stuck trying to find the 'Host: a.com' characters in the string and replace them with a defined value.
I tried using this example www.javamex.com/tutorials/regular_expressions/search_replace_loop.shtml#.VUWvt541jqB and changing the compiled pattern to :([\\d]+), but the string still remains unchanged.
My goal is to replace given characters in a string with a defined value and return the new string.
Any pointers?
EDIT:
Sample of a typical incoming request:
Post http://example.com\r\nHost: example.com\r\nConnection: close\r\n
Another incoming request might take this form:
GET http://example2.net\r\nContent-Length: 2\r\nConnection: close\r\nHost: example2.net\r\n
I want to rewrite them into these forms:
Post http://example.com\r\nHost: mycustomhostvalue.com\r\nConnection: close\r\n
GET http://example2.net\r\nContent-Length: 2\r\nConnection: close\r\nHost: mycustomhostvalue.com\r\n

Use a regex to replace it, like this:
content = content.replaceAll("Host:\\s*(\\w)*\\.\\w*", "Host: newhost.com");
This replaces a host of the form name.tld after Host: with newhost.com.
Note: as per the comment by cfqueryparam, you may want to use a regex like this to cover .co.uk and such:
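Host:\\s*.*?(?=\\\\r\\\\n)
Written as a Java string literal, \\\\r\\\\n matches a literal backslash followed by r and a literal backslash followed by n. If the string contains real CR LF characters, the lookahead should be (?=\\r\\n) instead.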
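For reference, here is a minimal sketch of the same idea that replaces the whole Host line regardless of how many dots the host name has. It assumes the request contains real CR LF characters (not the two-character text \r\n); the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HostHeaderRewriter {
    // (?i) ignores case; (?m) lets ^ match at the start of every header line,
    // so a header such as "X-Forwarded-Host:" is left alone.
    // [^\r\n]* consumes the rest of the line but not the CR LF that ends it.
    private static final Pattern HOST_LINE = Pattern.compile("(?im)^Host:[^\\r\\n]*");

    static String replaceHost(String request, String newHost) {
        // quoteReplacement protects against '$' or '\' in the replacement text
        return HOST_LINE.matcher(request)
                .replaceAll(Matcher.quoteReplacement("Host: " + newHost));
    }

    public static void main(String[] args) {
        String request = "Post http://example.com\r\nHost: example.com\r\nConnection: close\r\n";
        System.out.println(replaceHost(request, "mycustomhostvalue.com"));
    }
}
```

Because the pattern stops at the line break, it should not matter whether Host comes right after the request line or last among the headers.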

Related

Unable to get single quotes from Apache camel route XML config to Java method

I am defining an Apache camel route using XML configurations, and I want to call a method while passing parameters with single quotes:
<bean ref="cmdExecutor" method="execute('BatchQA.bat',
'./input/CamelCMDFile/QATestScripts/', 'Analytics,&apos;qa.user&apos;')"/>
The execute method looks like this:
public int execute(String bat, String dir, String arguments, Exchange exchange) {
String[] args = arguments.split(",");
result = ProcessUtils.cmdExecute(bat, dir, args);
.....
I have tried using &apos;, ' and ' to get the required result, but none of them have worked. These characters are simply ignored in the arguments object, and the rest of the string is received as it is in my Java function.
After applying @Screwtape's solution, the argument I am getting is &apos;qa.user&apos;, which is not what I am aiming for.
Thanks. :)
I'm not sure what Camel is doing with these single-quoted strings. It seemed to just strip the apostrophes when you quote with apostrophes, so options I expected to cause errors just seemed to work.
However, I have got it to work as you require. You need to reverse the quotation types. XML allows both single and double quotes around attribute values, even though Eclipse doesn't seem to colourise the single-quoted attributes (but this site does).
Hence when I use
<camel:bean ref="testBean" method='test("BatchQA.bat",
"./input/CamelCMDFile/QATestScripts/", "Analytics,&apos;qa.user&apos;")' />
my test bean does break out the strings as you wanted:
[WARN ]: beans.testBean - Analytics
[WARN ]: beans.testBean - 'qa.user'
Although I don't know whether it would be possible to have a string like this containing both single and double quotes. Let's hope you don't need that.
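If you want to check what the XML layer itself hands over before Camel interprets the method string, a small JDK-only sketch like the following can help. It only parses the attribute with the standard DOM parser and says nothing about how Camel then splits the parameters; the class name and the plain bean element are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AttributeQuoteDemo {
    // Returns the decoded value of the "method" attribute on the root element.
    static String methodAttribute(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("method");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The attribute is delimited by single quotes, so double quotes can be
        // used freely inside it and &apos; carries the inner single quotes.
        String xml = "<bean ref='testBean' "
                + "method='test(\"BatchQA.bat\", \"Analytics,&apos;qa.user&apos;\")'/>";
        System.out.println(methodAttribute(xml));
    }
}
```

The parser decodes &apos; to a real apostrophe, so whatever reads the attribute afterwards sees 'qa.user' inside a double-quoted parameter.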

Match Url path with query string

I'm sending a request from SoapUI to a WireMock server, and I'm attempting to match the URLs.
This is the request that is being sent out: /user/test/?and=query
I've written the following regular expression:
stubFor(post(urlPathMatching("/user/test/\\?(and)\\=([a-z]*)"))
The problem is with matching the "?". When I use a single backslash to escape the literal character, I get an error in Java saying:
"Illegal Escape Character"
What I tried to do to resolve the problem:
I know the solution is to use a second backslash to escape the "?", like this: "\\?", but when I send the request I get an error saying the URLs don't match, because this is the pattern that is matched against the original request sent from SoapUI:
/user/test/\?(and)\=([a-z]*)
Can someone please help me on this?
EDIT: Second attempt
I've tried to use the dot notation to represent the "?" and "=" symbols. I've tested this on a regular expression tester and it checks out, but SoapUI is still saying the URLs don't match.
Regular expression: stubFor(post(urlPathMatching("/user/test/.*(and).*([a-z]*)")).atPriority(1)
mismatched url: /user/test/.*(and).*([a-z]*)
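When you use urlPathMatching(), you shouldn't put the query parameters in the URL pattern, because it is matched against the path only. Including the query string in the pattern only works with matchers that cover the whole URL, such as urlEqualTo().
Instead, you should specify the parameters separately using withQueryParam(), so your stub setup should be: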
stubFor(post(urlPathMatching("/user/test/")).withQueryParam("and", matching("[a-z]*")));
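To see why the original pattern cannot match, it may help to look at the path and the query separately. The sketch below uses only the JDK (java.net.URI and java.util.regex), not WireMock, and makes no claim about WireMock's internals; the host and port and the helper names are made up for illustration.

```java
import java.net.URI;
import java.util.regex.Pattern;

class PathVersusQueryDemo {
    // True if the regex matches the path part of the URL (no query string).
    static boolean pathMatches(String url, String regex) {
        return Pattern.matches(regex, URI.create(url).getPath());
    }

    // True if the query string has a parameter with this name whose value matches the regex.
    static boolean queryParamMatches(String url, String name, String valueRegex) {
        String query = URI.create(url).getQuery();
        if (query == null) {
            return false;
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && kv[0].equals(name) && Pattern.matches(valueRegex, kv[1])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/user/test/?and=query";
        System.out.println(pathMatches(url, "/user/test/"));                   // path only
        System.out.println(pathMatches(url, "/user/test/\\?(and)=([a-z]*)"));  // pattern expects a query
        System.out.println(queryParamMatches(url, "and", "[a-z]*"));           // query checked separately
    }
}
```

Note that "\\?" in the Java source is the regex \?, which is already the right way to match a literal question mark. The escaping was fine; the pattern was simply being compared with a path that has no query string in it.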
\\ only escapes the \ itself; you should add one more \ before the ? to escape the ?.
Just like this:
stubFor(post(urlPathMatching("/user/test/\\\?(and)\\=([a-z]*)"))

Request parameter is modified in my servlet

I send a request to a servlet as a URL with data, but by default the servlet modifies the data it receives. Can you please suggest how to keep the data I pass in the request URL unchanged?
Example: when I pass this data to the servlet
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345 &ABCDEFG
and read it in the servlet from the request, like String abc = request.getParameter("val"), the val parameter is trimmed automatically and assigned as " val=PRCTXT|12345", but it is supposed to be " val = PRCTXT|12345 &ABCDEFG ". Please help me with this.
The servlet interprets each & in the URL as the start of a new parameter. So when it sees &ABCDEFG, it thinks you are sending a new parameter called ABCDEFG with no value (though this is technically a "keyless value" according to the specifications).
There are two things to fix. First, when you actually want to send an &, use %26 instead. This will be skipped by the code that divides up the parameters, but converted to a real & in the parameter's value.
Second, replace spaces with +. Spaces in URLs work sometimes but can be problematic.
So your actual request URL should look like this:
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345+%26ABCDEFG
If you're building these parameters in javascript, you can use encodeURIComponent() to fix all problem characters for you. So you could do something like this:
var userInput = 'PRCTXT|12345 &ABCDEFG'; // example input; in practice, read it from the user
var addr = 'http://www.example.com?param1=' + encodeURIComponent(userInput);
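If the URL is built on the Java side instead, the JDK's URLEncoder does a similar job. This is a minimal sketch, assuming Java 10 or later for the Charset overloads; the class and method names are made up for illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryValueEncoding {
    // Encodes a single parameter value (not a whole URL) for use in a query string.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Reverses encode(); this is roughly what the container does before getParameter().
    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "PRCTXT|12345 &ABCDEFG";
        String encoded = encode(raw);
        System.out.println("val=" + encoded);
        System.out.println(decode(encoded));
    }
}
```

Only the value should be encoded, never the whole URL, otherwise the & characters that separate the parameters get escaped as well.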

java regex matcher results != to notepad++ regex find result

I am trying to extract data out of a website access log as part of a java program. Every entry in the log has a url. I have successfully extracted the url out of each record.
Within the url, there is a parameter that I want to capture so that I can use it to query a database. Unfortunately, it doesn't seem that the web developers used any one standard to write the parameter's name.
The parameter is usually called "course_id", but I have also seen "courseId", "course%3DId", "course%253Did", etc. The format for the parameter name and value is usually course_id=_22222_1, where the number I want is between the "_" and "_1". (The value is always the same, even if the parameter name varies.)
So, my idea was to use the regex /^.*course_id[^_]*_(\d*)_1.*$/i to find and extract the number.
In java, my code is
java.util.regex.Pattern courseIDPattern = java.util.regex.Pattern.compile(".*course[^i]*id[^_]*_(\\d*)_1.*", java.util.regex.Pattern.CASE_INSENSITIVE);
java.util.regex.Matcher courseIDMatcher = courseIDPattern.matcher(_url);
_courseID = "";
if(courseIDMatcher.matches())
{
_courseID = retrieveCourseID(courseIDMatcher.group(1));
return;
}
This works for a lot of the records. However, some records do not record the course_id, even though the parameter is in the url. One such example is the record:
/webapps/contentDetail?course_id=_223629_1&content_id=_3641164_1&rich_content_level=RICH&language=en_US&v=1&ver=4.1.2
However, I used notepad++ to do a regex replace on this (in fact, every) url using the regex above, and the url was successfully replaced by the course ID, implying that the regex is not incorrect.
Am I doing something wrong in the java code, or is the java matcher broken?
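One difference between the two tools that may be worth ruling out: Matcher.matches() has to consume the entire input, and by default . does not match line terminators, whereas an editor's find searches for the pattern anywhere in the text. Whether that is what happens with the failing records is only a guess, but a sketch that uses find() without the leading and trailing .* avoids the dependency either way. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CourseIdExtractor {
    // Same core as the pattern in the question, without the surrounding ".*",
    // and with \d+ so that an empty number is not accepted.
    private static final Pattern COURSE_ID =
            Pattern.compile("course[^i]*id[^_]*_(\\d+)_1", Pattern.CASE_INSENSITIVE);

    // Returns the digits between "_" and "_1", or "" if the parameter is absent.
    static String extract(String url) {
        Matcher m = COURSE_ID.matcher(url);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String url = "/webapps/contentDetail?course_id=_223629_1&content_id=_3641164_1"
                + "&rich_content_level=RICH&language=en_US&v=1&ver=4.1.2";
        System.out.println(extract(url));
        // A stray trailing line break does not affect find()
        System.out.println(extract(url + "\n"));
    }
}
```

If extract() returns the ID for records where matches() fails, printing the raw _url with its length is a quick way to spot hidden characters.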

Regex to Extract First Part of URL

I need a java regex to extract parts of a URL.
For example, take the following URLs:
http://localhost:81/example
https://test.com/test
http://test.com/
I would want my regex to return:
http://localhost:81
https://test.com
http://test.com
I will be using this in a Java patcher.
This is what I have so far; the problem is that it matches the whole URL:
^https?:\/\/(?!.*:\/\/)\S+
import java.net.URL;
//snip
URL url = new URL(urlString);
return url.getProtocol() + "://" + url.getAuthority();
The right tool for the right job.
Building off your attempt, try this:
^https?://[^/]+
I'm assuming that you want to capture everything up to the first / after http://. (That's what I gathered from your examples; if not, please post some more.)
Are these URLs given as one input, or is each one a separate string?
Edit: It was pointed out that there were unnecessary escapes, so I changed it to a more condensed version.
Language independent answer:
For the whitespace: replace /^\s+/ with the empty string.
For removing the path information from the URL, if you can assume there aren't any slashes in the path (i.e. you're not dealing with http://localhost:81/foo/bar/baz), replace /\/[^\/]+$/ with the empty string. If there might be more slashes, you might try something like replacing /(^\s*.*:\/\/[^\/]+)\/.*/ with $1.
A simple one: ^(https?://[^/]+)
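To compare the two approaches from the answers above side by side, here is a small sketch. It swaps in java.net.URI for the java.net.URL used in the earlier answer, since the URL constructors are deprecated in recent JDKs; the class and method names are made up for illustration.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlOrigin {
    // Scheme, "://", then everything up to (not including) the next slash.
    private static final Pattern ORIGIN = Pattern.compile("^https?://[^/]+");

    // Regex approach: returns null when the input does not start with http(s)://
    static String viaRegex(String url) {
        Matcher m = ORIGIN.matcher(url);
        return m.find() ? m.group() : null;
    }

    // Parser approach: the authority includes the port when one is present.
    static String viaUri(String url) {
        URI uri = URI.create(url);
        return uri.getScheme() + "://" + uri.getAuthority();
    }

    public static void main(String[] args) {
        String[] urls = {"http://localhost:81/example", "https://test.com/test", "http://test.com/"};
        for (String url : urls) {
            System.out.println(viaRegex(url) + "  " + viaUri(url));
        }
    }
}
```

For the three URLs in the question both methods give the same result. The parser version is probably the safer choice once user info, IPv6 hosts or other schemes show up.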
