Cucumber: how to define a string with only letters as the Given


I am working with the Cucumber framework. I have written my feature file and run the test runner, which generated the snippets that have to be implemented. I am a bit confused by one of them: the scenario is that a user types a string without digits, e.g. "nonumbers".
@Given("The string contains {string}")
public void the_string_contains(String string) {
}
Since I cannot just say string = "^[a-zA-Z]+$";, I am not sure how I should define the string as a non-digits string. As this is the @Given step, I am not using Pattern to check whether the string is correctly formatted.

According to the documentation you can use {string} to match single-quoted or double-quoted strings, for example "banana split" or 'banana split' (but not banana split). Only the text between the quotes will be extracted. The quotes themselves are discarded.
Note that Cucumber expressions (like {string}) are available as of Cucumber-jvm v3.x
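Since {string} only extracts the quoted text and does not restrict what it contains, one option is to check the value inside the step itself. Below is a minimal sketch of such a check in plain Java, so it runs without Cucumber on the classpath. The class and method names are hypothetical, and requireLettersOnly only stands in for what the body of the_string_contains(String string) might do.

```java
import java.util.regex.Pattern;

public class LettersOnlyCheck {

    // Letters only, mirroring the ^[a-zA-Z]+$ idea from the question.
    // matches() already requires the whole input to match, so no anchors are needed.
    private static final Pattern LETTERS_ONLY = Pattern.compile("[a-zA-Z]+");

    public static boolean isLettersOnly(String value) {
        return value != null && LETTERS_ONLY.matcher(value).matches();
    }

    // Hypothetical stand-in for the step body: {string} hands over the text
    // between the quotes, and the step decides whether that text is acceptable.
    public static String requireLettersOnly(String value) {
        if (!isLettersOnly(value)) {
            throw new IllegalArgumentException("Expected letters only but got: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(isLettersOnly("nonumbers")); // true
        System.out.println(isLettersOnly("abc123"));    // false
    }
}
```

Another possibility, assuming you are willing to use a regular expression instead of a Cucumber expression in the annotation, is to put the character class into the step pattern itself so that a string containing digits simply does not match the step.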

For my feature file I did the following implementation:
[Screenshot: feature file]
[Screenshot: Java file]
With the above implementation everything executed just fine.

Related

Unable to get single quotes from Apache camel route XML config to Java method

I am defining an Apache camel route using XML configurations, and I want to call a method while passing parameters with single quotes:
<bean ref="cmdExecutor" method="execute('BatchQA.bat',
'./input/CamelCMDFile/QATestScripts/', 'Analytics,&apos;qa.user&apos;')"/>
The execute method looks like this:
public int execute(String bat, String dir, String arguments, Exchange exchange) {
String[] args = arguments.split(",");
result = ProcessUtils.cmdExecute(bat, dir, args);
.....
I have tried using &apos;, ' and ' to get the required result, but none of them worked. These characters are simply ignored in the arguments object, and the rest of the string is received as is in my Java function.
After applying @Screwtape's solution, the argument I am getting is &apos;qa.user&apos;, which is not what I am aiming for.
Thanks. :)

I'm not sure what Camel is doing with these single-quoted strings: it seemed to just strip the apostrophes when you quote with apostrophes, so options I expected to cause errors simply worked.
However, I have got it to work as you require. You need to reverse the quotation types. XML allows both single and double quotes around attribute values, even though Eclipse doesn't seem to colourise the single-quoted attributes (but this site does).
Hence when I use
<camel:bean ref="testBean" method='test("BatchQA.bat",
"./input/CamelCMDFile/QATestScripts/", "Analytics,&apos;qa.user&apos;")' />
my test bean does break out the strings as you wanted:
[WARN ]: beans.testBean - Analytics
[WARN ]: beans.testBean - 'qa.user'
although I don't know if it would be possible to have a string like this with both single and double quotes. Let's hope you don't need that.
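To illustrate why the log shows two separate entries: once the XML parser has turned &apos; back into an apostrophe, the third parameter should arrive as the single string Analytics,'qa.user', and the split(",") from the question's execute method breaks it apart. A minimal sketch of just that step, assuming the argument arrives exactly like this (the class name is hypothetical and no Camel is involved):

```java
import java.util.Arrays;

public class ArgSplitDemo {

    // Same split as in the question's execute(...) method.
    public static String[] splitArgs(String arguments) {
        return arguments.split(",");
    }

    public static void main(String[] args) {
        // The value the bean is assumed to receive after XML unescaping.
        String arguments = "Analytics,'qa.user'";
        System.out.println(Arrays.toString(splitArgs(arguments)));
    }
}
```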

How to remove \u200B (Zero Length Whitespace Unicode Character) from String in Java?

My application uses Spring Integration to poll email from an Outlook mailbox.
Since it receives the String (email body) from an external system (Outlook), I have no control over its content.
For Example,
String emailBodyStr= "rejected by sundar14-\u200B.";
Now I am trying to remove the unicode character \u200B from this String.
What I tried already.
Try#1:
emailBodyStr = emailBodyStr.replaceAll("\u200B", "");
Try#2:
emailBodyStr = emailBodyStr.replaceAll("\u200B", "").trim();
Try#3 (using Apache Commons):
StringEscapeUtils.unescapeJava(emailBodyStr);
Try#4:
StringEscapeUtils.unescapeJava(emailBodyStr).trim();
Nothing has worked so far.
I tried to print this String using the code below:
logger.info("Comment BEFORE:{}",emailBodyStr);
logger.info("Comment AFTER :{}",emailBodyStr);
In the Eclipse console, the Unicode character is NOT printed:
Comment BEFORE:rejected by sundar14-.
But the same code prints the Unicode character in the Linux console, as below:
Comment BEFORE:rejected by sundar14-\u200B.
I read some examples where str.replace() is recommended, but please note that those examples use JavaScript and PHP, not Java.

Finally, I was able to remove the 'Zero Width Space' character by using a Unicode regex:
String plainEmailBody = emailBodyStr.replaceAll("[\\p{Cf}]", "");
References for finding the category of a Unicode character:
The Character class in Java lists all of these Unicode categories.
Website: http://www.fileformat.info/
Website: http://www.regular-expressions.info/ => Unicode Regular Expressions
Note 1: As I received this string from an Outlook email body, none of the approaches listed in my question worked.
My application is receiving a String from an external system
(Outlook), So I have no control over it.
Note 2: This SO answer helped me learn about Unicode regular expressions.
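As a self-contained illustration of the \p{Cf} approach, here is a minimal sketch that can be run on its own. The class and method names are hypothetical, and the sample string is the one from the question. Note that \p{Cf} covers the whole Unicode "format" category, so it also strips other invisible format characters, not only U+200B.

```java
public class ZeroWidthCleaner {

    // Removes every character in the Unicode "Cf" (format) category,
    // which includes U+200B ZERO WIDTH SPACE.
    public static String stripFormatChars(String text) {
        return text.replaceAll("\\p{Cf}", "");
    }

    public static void main(String[] args) {
        String emailBodyStr = "rejected by sundar14-\u200B.";
        String cleaned = stripFormatChars(emailBodyStr);
        // The length drops by one because the invisible character is gone.
        System.out.println(emailBodyStr.length() + " -> " + cleaned.length());
        System.out.println(cleaned);
    }
}
```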

non-basic characters in java, how to handle the encoding correctly

When I try to call a method with a parameter in Polish, e.g.
node.call("ąćęasdasdęczć")
I get these characters as input:
Ä?Ä?Ä?asdasdÄ?czÄ
I don't know where to set the correct encoding: in the Maven pom.xml, or in my IDE? I tried changing UTF-8 to ISO_8859-2 in my IDE settings, but it didn't work. I searched for similar questions, but I didn't find an answer.
#Edit 1
Sample code:
public void findAndSendKeys(String vToSet , By vLocator){
WebElement element;
element = webDriverWait.until(ExpectedConditions.presenceOfElementLocated(vLocator));
element.sendKeys(vToSet);
}
By nameLoc = By.id("First_Name");
findAndSendKeys("ąćęasdasdęczć" , nameLoc );
Then in the input field I get Ä?Ä?Ä?asdasdÄ?czÄ. Converting the string to Basic Latin in my IDE helps, but it's not the solution I need.
I also have problems with fields in classes, e.g. I have a class in which I have to convert a String to Basic Latin:
public class Contacts{
private static final By LOC_ADDRESS_BTN = By.xpath("//button[contains(@aria-label,'Wybór adresu')]");
// it doesn't work, I have to use basic latin and replace "ó" with "\u00f3" in my IDE
}
}
#Edit 2 - Changed encoding, but problem still exists
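The garbled output above is consistent with UTF-8 bytes being decoded with a single-byte charset somewhere between the source file and the browser: each Polish letter is two bytes in UTF-8, and the first byte (0xC4) shows up as Ä. The sketch below only reproduces that effect under this assumption; it does not show where the mismatch happens in the asker's setup. The class and method names are hypothetical, and the Polish letters are written as \u escapes so that the sketch itself does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes the text as UTF-8 and then decodes the bytes as ISO-8859-1,
    // which is the kind of mismatch assumed here.
    public static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mismatch; this works because ISO-8859-1 maps every byte
    // to exactly one character and back.
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String polish = "\u0105\u0107\u0119"; // the letters a-ogonek, c-acute, e-ogonek
        String garbled = misdecode(polish);
        System.out.println(garbled.length());               // 6: two chars per letter
        System.out.println(garbled.charAt(0) == '\u00C4');  // true: U+00C4 is the letter A with diaeresis
        System.out.println(repair(garbled).equals(polish)); // true
    }
}
```

If this is what is happening, the usual things to check are that the source files, the compiler and the JVM all agree on UTF-8, for example through the project.build.sourceEncoding property in a Maven build and the file.encoding system property at runtime.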

java regex matcher results != to notepad++ regex find result

I am trying to extract data out of a website access log as part of a java program. Every entry in the log has a url. I have successfully extracted the url out of each record.
Within the url, there is a parameter that I want to capture so that I can use it to query a database. Unfortunately, it doesn't seem that the web developers used any one standard to write the parameter's name.
The parameter is usually called "course_id", but I have also seen "courseId", "course%3DId", "course%253Did", etc. The format for the parameter name and value is usually course_id=_22222_1, where the number I want is between the "_" and "_1". (The value is always the same, even if the parameter name varies.)
So, my idea was to use the regex /^.*course_id[^_]*_(\d*)_1.*$/i to find and extract the number.
In java, my code is
java.util.regex.Pattern courseIDPattern = java.util.regex.Pattern.compile(".*course[^i]*id[^_]*_(\\d*)_1.*", java.util.regex.Pattern.CASE_INSENSITIVE);
java.util.regex.Matcher courseIDMatcher = courseIDPattern.matcher(_url);
_courseID = "";
if(courseIDMatcher.matches())
{
_courseID = retrieveCourseID(courseIDMatcher.group(1));
return;
}
This works for a lot of the records. However, some records do not record the course_id, even though the parameter is in the url. One such example is the record:
/webapps/contentDetail?course_id=_223629_1&content_id=_3641164_1&rich_content_level=RICH&language=en_US&v=1&ver=4.1.2
However, I used Notepad++ to do a regex replace on this (in fact, every) URL using the regex above, and the URL was successfully replaced by the course ID, which suggests that the regex itself is not wrong.
Am I doing something wrong in the Java code, or is the Java matcher broken?
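One difference between the two tools that may be worth ruling out: Matcher.matches() only succeeds if the entire input matches, and by default . does not match line terminators, so a URL that still carries a trailing newline (or other surrounding text on another line) would fail in Java even though a search in an editor finds it. Whether that is the cause here is only a guess. The sketch below uses find() with an unanchored version of the pattern instead; the class and method names are hypothetical, and \d+ replaces \d* so that an empty ID is not accepted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CourseIdExtractor {

    // Unanchored variant of the pattern from the question.
    private static final Pattern COURSE_ID = Pattern.compile(
            "course[^i]*id[^_]*_(\\d+)_1", Pattern.CASE_INSENSITIVE);

    // Returns the numeric course ID, or an empty string if none is found.
    public static String extractCourseId(String url) {
        Matcher matcher = COURSE_ID.matcher(url);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        String url = "/webapps/contentDetail?course_id=_223629_1&content_id=_3641164_1"
                + "&rich_content_level=RICH&language=en_US&v=1&ver=4.1.2";
        System.out.println(extractCourseId(url));        // 223629
        System.out.println(extractCourseId(url + "\n")); // 223629, trailing newline is harmless
    }
}
```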

Matcher.find() only find the last match in JUnit Test

I have this weird problem. I have this Java method that works fine in my program:
/*
* Extract all image urls from the html source code
*/
public void extractImageUrlFromSource(ArrayList<String> imgUrls, String html) {
Pattern pattern = Pattern.compile("\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>");
Matcher matcher = pattern.matcher(html);
while (matcher.find()) {
imgUrls.add(extractImgUrlFromTag(matcher.group()));
}
}
This method works fine in my Java application. But whenever I test it in a JUnit test, it only adds the last URL to the ArrayList:
/**
* Test of extractImageUrlFromSource method, of class ImageDownloaderProc.
*/
@Test
public void testExtractImageUrlFromSource() {
System.out.println("extractImageUrlFromSource");
String html = "<html><title>fdjfakdsd</title><body><img kfjd src=\"http://image1.png\">df<img dsd src=\"http://image2.jpg\"></body><img dsd src=\"http://image3.jpg\"></html>";
ArrayList<String> imgUrls = new ArrayList<String>();
ArrayList<String> expimgUrls = new ArrayList<String>();
expimgUrls.add("http://image1.png");
expimgUrls.add("http://image2.jpg");
expimgUrls.add("http://image3.jpg");
ImageDownloaderProc instance = new ImageDownloaderProc();
instance.extractImageUrlFromSource(imgUrls, html);
imgUrls.stream().forEach((x) -> {
System.out.println(x);
});
assertArrayEquals(expimgUrls.toArray(), imgUrls.toArray());
}
Is JUnit at fault here? Remember, it works fine in my application.

I think there is a problem in the regex:
"\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>"
The problem (or at least one problem) is the first .*. The + and * metacharacters are greedy, which means that they will attempt to match as many characters as possible. In your unit test, I think that what is happening is that the .* is matching everything up to the last 'src' in the input string.
I suspect that the reason that this "works" in your application is that the input data is different. Specifically, I suspect that you are running your application on input files where each img element is on a different line. Why does this make a difference? Well, it turns out that by default, the . metacharacter does not match line breaks.
For what it is worth, using regexes to "parse" HTML is generally thought to be a bad idea. For a start, it is horribly fragile. People who do a lot of this kind of stuff tend to use proper HTML parsers ... like "jsoup".
Reference: RegEx match open tags except XHTML self-contained tags
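To make the greedy behaviour concrete, here is a minimal sketch that runs the original pattern against the single-line HTML from the unit test and compares it with a variant that cannot run past the end of a tag. The variant is my own assumption of one possible fix, not the asker's code: it uses [^>] and [^"] instead of .*, plus a capture group for the URL. The class and method names are hypothetical, and an HTML parser remains the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcExtractor {

    // The pattern from the question, with greedy .* quantifiers.
    private static final Pattern GREEDY = Pattern.compile(
            "\\<[ ]*[iI][mM][gG][\t\n\r\f ]+.*[sS][rR][cC][ ]*=[ ]*\".*\".*>");

    // Assumed alternative: negated character classes keep each match
    // inside one tag and one quoted attribute value.
    private static final Pattern BOUNDED = Pattern.compile(
            "<\\s*img\\s+[^>]*?src\\s*=\\s*\"([^\"]*)\"[^>]*>",
            Pattern.CASE_INSENSITIVE);

    // Counts how many separate matches the original pattern finds.
    public static int countGreedyMatches(String html) {
        Matcher matcher = GREEDY.matcher(html);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Collects the src value of every img tag, in document order.
    public static List<String> extractImageUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = BOUNDED.matcher(html);
        while (matcher.find()) {
            urls.add(matcher.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<html><title>fdjfakdsd</title><body><img kfjd src=\"http://image1.png\">"
                + "df<img dsd src=\"http://image2.jpg\"></body><img dsd src=\"http://image3.jpg\"></html>";
        System.out.println(countGreedyMatches(html)); // 1: one match spans all three tags
        System.out.println(extractImageUrls(html));   // the three URLs, one per tag
    }
}
```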

I wish I could comment as I'm not sure about this, but it might be worth mentioning...
This line looks like it's extracting the URLs from the wrong array...did you mean to extract from expimgUrls instead of imgUrls?
instance.extractImageUrlFromSource(imgUrls, html);
I haven't gotten this far in my Java education so I may be incorrect...I just looked over the code and noticed it. I hope someone else who knows more can actually give you a solid answer!
