How to conditionally select a node in XPath - java

The XSD schema I am working with calls for either an international or a domestic address:
"/mns:PhysicalAddress/mns:DomesticAddress/mns:City"
or
"/mns:PhysicalAddress/mns:InternationalAddress/mns:City"
It is being used as a parameter in a Java method as in XMLUtils.BuildField(Document doc, String xpath).
I know I can go straight to the Java object that created that doc and use the auto-generated beans to query elements, but I prefer remaining within the concise realm of XPath. Is this possible?
If so, how do I write an XPath expression that selects mns:City regardless of whether it is an international or a domestic address?
Note: This is in Java, not JavaScript, HTML or XSLT, so I don't think <xsl:if> is relevant here.

You could go with finding all Cities that have either parent:
//mns:City[(parent::mns:DomesticAddress|parent::mns:InternationalAddress)]
If you need to also ensure that the address is in the physical address:
//mns:City[(parent::mns:DomesticAddress|parent::mns:InternationalAddress)[parent::mns:PhysicalAddress]]
Alternatively, instead of reversing the hierarchy, you can match any child with * and check its name:
/mns:PhysicalAddress/*[name()="mns:DomesticAddress" or name()="mns:InternationalAddress"]/mns:City

Depending on the precise structure of your XML,
/mns:PhysicalAddress/*/mns:City
may be enough. If that pulls in too much, then the clearest option is probably just to use the two alternatives you already have, separated by a |:
/mns:PhysicalAddress/mns:DomesticAddress/mns:City | /mns:PhysicalAddress/mns:InternationalAddress/mns:City
Or slightly more concise but (in my opinion) less clear:
/mns:PhysicalAddress/*[self::mns:DomesticAddress | self::mns:InternationalAddress]/mns:City
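For what it's worth, here is a minimal sketch of evaluating that last expression from plain Java with the JDK's javax.xml.xpath API. The namespace URI "urn:example:mns", the class name and the selectText helper are assumptions made for illustration; the question's XMLUtils.BuildField is not shown, so this does not try to reproduce it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CityXPathDemo {
    // Hypothetical namespace URI; use the one your schema actually declares.
    static final String MNS_URI = "urn:example:mns";

    // The union-in-predicate form from the answer above.
    static final String EITHER_CITY =
        "/mns:PhysicalAddress/*[self::mns:DomesticAddress | self::mns:InternationalAddress]/mns:City";

    // Maps the "mns" prefix used in the expression to MNS_URI.
    static final NamespaceContext MNS_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "mns".equals(prefix) ? MNS_URI : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return MNS_URI.equals(uri) ? "mns" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            return MNS_URI.equals(uri)
                ? Collections.singletonList("mns").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    // Parses the XML and returns the text of every node the expression selects.
    static List<String> selectText(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefixed name tests to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(MNS_CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<mns:PhysicalAddress xmlns:mns=\"" + MNS_URI + "\">"
                + "<mns:InternationalAddress><mns:City>Oslo</mns:City></mns:InternationalAddress>"
                + "</mns:PhysicalAddress>";
        System.out.println(selectText(xml, EITHER_CITY));
    }
}
```

The same helper accepts any of the other expressions in this thread, so they can be compared against a sample document before settling on one.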

How to write a step definition that compares a json response to a scenario outline table

I have json response example:
{
"colours": ["green","blue", "red"],
"type" : ["shoes","socks","t-shirts"],
"make" : ["nike", "adidas"]
}
I have Scenario outline table:
|colours|type |make |
|red |shoes |nike |
|blue |socks |nike |
|green |t-shirts|adidas|
I want to use the scenario table to assert against the json response. Now I know how to check this one by one, for example
* Assert colour is correct: <colours>
* Assert type is correct: <type>
* Assert make is correct: <make>
And then perform the step definition like the example below for colour:
@Step("Assert colour is correct: <colours>")
public void assertColourIsCorrect(String colourValue) {
String responseBody = BridgeTestState.getLastResponse().getBody().toString();
itemState itemStateResp = new Gson().fromJson(responseBody, itemState.class);
assertThat("colours", itemStateResp.getColour(), is(equalTo(colourValue.toLowerCase())));
}
Note: getColour() comes from a getter and setter I have set.
Now this works, but as you can see it's a bit long-winded, as I have to create three separate steps to assert against each column.
I want to be a little smarter than this but don't know how to implement. What I would like is a step definition where it will look at the json response and compare it to the table based on its field and then from there view the value.
Something along the lines of:
Assert correct "fields" and "values" are outputted.
I hope that makes sense, basically a smart one step definition to perform the check between the json response and the table row.
From the comments, it seems you want to go row by row, using the headers of a data table as keys into the JSON. You cannot achieve that with the examples table of a scenario outline, because that is specifically meant to be mapped directly into steps, the way you describe yourself. As I see it, there are two ways of dealing with this, depending on your use case.
First, still deal with it as parameters in the steps, i.e.,
Then the "colours" is <colours>
And the "type" is <type>
and then just have one step implementation
@Then("the {string} is {string}")
public void theKeyIsValue(String key, String value) {
assertThat(json.get(key)).contains(value);
}
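The json object in that step is not shown, so here is one possible way to back it. This is only a sketch: it assumes the response body has already been parsed (with Gson or anything else) into a Map of field name to list of values, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JsonFieldCheck {
    // True when the array stored under `key` contains `value`.
    static boolean fieldContains(Map<String, List<String>> json, String key, String value) {
        List<String> values = json.get(key);
        return values != null && values.contains(value);
    }

    // Checks one table row (header -> cell) against the response in a single call.
    static boolean rowMatches(Map<String, List<String>> json, Map<String, String> row) {
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (!fieldContains(json, cell.getKey(), cell.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical stand-in for the parsed response from the question.
    static Map<String, List<String>> sampleResponse() {
        Map<String, List<String>> json = new HashMap<>();
        json.put("colours", Arrays.asList("green", "blue", "red"));
        json.put("type", Arrays.asList("shoes", "socks", "t-shirts"));
        json.put("make", Arrays.asList("nike", "adidas"));
        return json;
    }

    public static void main(String[] args) {
        System.out.println(fieldContains(sampleResponse(), "colours", "red"));
    }
}
```

With a helper like this, the step body shrinks to a single assertion on fieldContains(json, key, value).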
Another, and most likely better, option would be to deal with it as a normal scenario, as already suggested in the comments (I did not understand why you claim that you can't). That is most often the better choice.
However, most likely the correct solution is - annoyingly enough - to actually rethink your scenario. There are some really great guidelines on best practices etc. at https://cucumber.io/docs/bdd/. They are fairly fast and easy to read, and will help with a lot of the initial problems.
It's hard without a complete example, but from what you write I suspect that your tests might be too technical. It's an extremely hard balance, but try to keep them so vague that they do not specify the "How" but only the "What". For example, Given the username "Kate" is a better step than Given I type the username "Kate", because in the latter you are specifying that there should be something you can type text in. I usually ask people if their tests work with a voice assistant.
Another thing I suspect is that you try to test too many things at once. One thing I notice, for instance, is that there is no apparent connection between your json and your table. If the data were meant to match on the index, for instance, it might make more sense. However, looking at the sparse data, I think the tests you need are:
Scenario: The colour options
Given ...
When the options are given
Then the following can be chosen
| Colour |
| red |
| blue |
| green |
Scenario: The clothing options
Given ..
When the options are given
Then the following can be chosen
| Type |
| shoes |
| socks |
| t-shirts |
That way you can still re-use the steps, you can use the headline for a key in the json, and judging by your data the tests actually relate more closely to the expected things.
Writing acceptance tests is an art that requires practice. I hope some of the suggestions here can be used; however, it is hard to come up with more direct suggestions without more context.
Doing what you want to do is counterproductive and against the underlying design principles of Cucumber, Scenario Outlines, and probably even data tables in Cucumber.
What the cuke should be doing is explaining WHAT the json response represents and WHY it's important. HOW the json response is constructed and the details of exploring its validity and content structure should be pushed down to the step definitions, or better yet helper methods called by the step definitions.
It's really hard to illustrate this with your sample data because it's really hard to work out WHAT
{
"colours": ["green","blue", "red"],
"type" : ["shoes","socks","t-shirts"],
"make" : ["nike", "adidas"]
}
represents. It's also pretty hard to understand why you want to make the assertions you want to make.
If you gave a real example and explained what the data represents and WHY it's important, and perhaps also WHY you need to check it and WHAT you would do if it wasn't correct, then we might be able to make more progress.
So I'll try with my own example, which won't be very good:
Given a set of clothing
When I get a json representation of the clothing
Then the json should be valid clothing
and the steps
Given 'a set of clothing' do
create_clothing_set
end
When 'I get a json representation of the clothing' do
@json = make_clothing_request type: :json
end
Then 'the json should be valid clothing' do
res = validate_clothing json: @json
expect res ...
end
Now your problem is how to write some code to validate your clothing i.e.
def validate_clothing(json: )
...
end
This is a much simpler problem being executed in a much more powerful environment. It has nothing to do with Cucumber: no interactions with features, scenarios etc. It's just a simple programming problem.
In general with Cucumber, either push technical problems down, so they become programming problems, or pull problems up, so they are outside Cucumber and become scripting problems. This keeps Cucumber on topic. Its job is to describe WHAT and WHY and provide a framework to automate.
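Since the question is about Java, here is a rough Java counterpart of such a helper. The validation rules (three required keys, each with at least one value) are invented purely for illustration, because the thread never says what makes clothing valid; the class name and the Map input are assumptions too.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class ClothingValidator {
    // Hypothetical rule set: these keys must exist and must not be empty.
    static final List<String> REQUIRED_KEYS = Arrays.asList("colours", "type", "make");

    // Returns a list of human-readable problems; an empty list means "valid".
    static List<String> validate(Map<String, List<String>> clothing) {
        List<String> problems = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            List<String> values = clothing.get(key);
            if (values == null) {
                problems.add("missing key: " + key);
            } else if (values.isEmpty()) {
                problems.add("no values for: " + key);
            }
        }
        return problems;
    }
}
```

The step definition then only has to assert that the returned list is empty, and the list itself makes a readable failure message.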

Is there a good way to identify whether there is date information contained in a String

I have the problem of trying to identify whether a paragraph contains date information. Here are the issues:
We don't know where the date string might appear. A paragraph would be something like "We would like set the appointment at Nov. 15th. Then we would .....". So we cannot directly use DateTime.parse()
The format of the date is arbitrary, it can be more formal forms like "Nov. 15th" or "08/21/1988" or "5th in this month".
Since date information can take so many forms, covering every case is unlikely; I just want to cover as many cases as possible. The most lightweight solution I can come up with would be regular expressions, I guess. And again, that would be a huge expression. Does anyone know if there are better solutions or available regular expressions for this?
(P.S. I would prefer more lightweight approaches; methods like machine learning might be more general but are not applicable to my task here.)
I'd probably approach it with a regular expression (or multiple) as well.
I'd make the regular expression match regions that look date-like by matching everything around "th", "nd", "st", month/day names and abbreviations, dot/line/slash/colon separated numbers or such things. Experiment with that and see how well it finds dates with a ton of test cases.
Parsing the possible dates is another story. I guess you'd need something as powerful as PHP's strtotime.
Another approach is to just clearly define a big collection of possible formats. Then, when one is detected, you can easily parse it. It feels too brute-force to me, though.
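A small sketch of that "collection of formats" idea, written in Java with java.time (the question does not name a language, so this is an assumption). The three formats listed are arbitrary picks, and the input is assumed to be a candidate substring that has already been isolated from the paragraph.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

class KnownDateFormats {
    // Illustrative list only; extend it with whatever formats your input uses.
    // Order matters: "MM/dd" comes first, so 11/12/2015 is read as November 12.
    static final List<DateTimeFormatter> FORMATS = Arrays.asList(
        DateTimeFormatter.ofPattern("MM/dd/uuuu", Locale.ENGLISH),
        DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.ENGLISH),
        DateTimeFormatter.ofPattern("MMM d, uuuu", Locale.ENGLISH));

    // Tries each format in turn and returns the first successful parse.
    static Optional<LocalDate> tryParse(String candidate) {
        for (DateTimeFormatter format : FORMATS) {
            try {
                return Optional.of(LocalDate.parse(candidate.trim(), format));
            } catch (DateTimeParseException notThisFormat) {
                // Not this format; fall through to the next one.
            }
        }
        return Optional.empty();
    }
}
```

An empty Optional simply means none of the listed formats matched, not that the text contains no date.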
As a starting point, there are seven pages of date regexes over at http://regexlib.com. If you don't know which one you're looking for, I would create an array and apply them one at a time. You'll still have a problem with dates like 11/12/2015 vs. 12/11/2015 so some kind of process for clarification is still necessary (e.g., automatically mail back and ask "Do you mean December 11 or November 12?").
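Applying an array of expressions one at a time could look roughly like the sketch below. The three patterns are my own deliberately loose examples, not ones taken from regexlib, and they will produce false positives (for example "Mayor 5"); treat them as placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateLikeFinder {
    // Each pattern covers one family of date-like text.
    static final List<Pattern> PATTERNS = Arrays.asList(
        // Numeric dates such as 08/21/1988 or 21.08.88
        Pattern.compile("\\b\\d{1,2}[/.-]\\d{1,2}[/.-]\\d{2,4}\\b"),
        // Month name or abbreviation followed by a day, such as "Nov. 15th"
        Pattern.compile(
            "\\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\\.?\\s+\\d{1,2}(?:st|nd|rd|th)?\\b",
            Pattern.CASE_INSENSITIVE),
        // A bare ordinal such as the "5th" in "5th in this month"
        Pattern.compile("\\b\\d{1,2}(?:st|nd|rd|th)\\b", Pattern.CASE_INSENSITIVE));

    // Runs every pattern over the text and collects all matches, pattern by pattern.
    static List<String> findDateLike(String text) {
        List<String> hits = new ArrayList<>();
        for (Pattern pattern : PATTERNS) {
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                hits.add(matcher.group());
            }
        }
        return hits;
    }

    static boolean containsDate(String text) {
        return !findDateLike(text).isEmpty();
    }
}
```

Overlapping hits are possible (the month pattern and the ordinal pattern both fire on "Nov. 15th"), so de-duplicate by position if you need exact counts.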

ANTLR: Multiple ASTs using the same ambiguous grammar?

I'm building an ANTLR parser for a small query language. The query language is by definition ambiguous, and we need all possible interpretations (ASTs) to process the query.
Example:
query : CLASSIFIED_TOKEN UNCLASSIFIED_TOKEN
| ANY_TOKEN UNCLASSIFIED_TOKEN
;
In this case, if input matches both rules, I need to get 2 ASTs with both interpretations. ANTLR will return the first matched AST.
Do you know a simple way to get all possible ASTs for the same grammar? I'm thinking about running the parser multiple times, "turning off" already matched rules between iterations; this seems dirty. Is there a better idea? Maybe another lexer/parser tool with Java support that can do this?
Thanks
If I were you, I'd remove the ambiguities. You can often do that by using contextual information to determine which grammar rules actually trigger. For instance, in
C* X;
in C (not your language, but this is just to make a point), you can't tell if this is just a pointless multiplication (legal to write in C), or a declaration of a variable X of type "pointer to C". So, there are two valid (ambiguous) parses. But if you know that C is a type declaration (from some context, perhaps an earlier code declaration), you can hack the parser to kill off the inappropriate choices and end up with just the one "correct" parse, no ambiguities.
If you really don't have the context, then you likely need a GLR parser, which will happily generate both parses in your final tree. I don't know of any available for Java.
Our DMS Software Reengineering Toolkit [not a Java-based product] has GLR parsing support, and we use that all the time to parse difficult languages with ambiguities. The way we handle the C example above is to produce both parses, because the GLR parser is happy to do this, and then, if we have additional information (such as a symbol table), post-process the tree to remove the inappropriate parses.
DMS is designed to support the customized analysis and transformation of arbitrary languages, such as your query language, and makes it easy to define the grammar. Once you have a context-free grammar (ambiguities or not), DMS can parse code and you can decide what to do later.
I doubt you're going to get ANTLR to return multiple parse trees without wholesale rewriting of the code.
I believe you're going to have to partition the ambiguities, each into its own unambiguous grammar and run the parse multiple times. If the total number of ambiguous productions is large you could have an unmanageable set of distinct grammars. For example, for three binary ambiguities (two choices) you'll end up with 8 distinct grammars, though there might be slightly fewer if one ambiguous branch eliminates one or more of the other ambiguities.
Good luck

Detecting equivalent expressions

I'm currently working on a Java application where I need to implement a system for building BPF expressions. I also need to implement a mechanism for detecting equivalent BPF expressions.
Building the expression is not too hard. I can build a syntax tree using the Interpreter design pattern and implement the toString for getting the BPF syntax.
However, detecting if two expressions are equivalent is much harder. A simple example would be the following:
A: src port 1024 and dst port 1024
B: dst port 1024 and src port 1024
In order to detect that A and B are equivalent, I probably need to transform each expression into a "normalized" form before comparing them. This would be easy for the above example; however, it gets harder when working with a combination of nested AND, OR and NOT operations.
Does anyone know how I should best approach this problem?
One way to compare boolean expressions may be to convert both to the disjunctive normal form (DNF), and compare the DNF. Here, the variables would be Berkeley Packet Filter tokens, and the same token (e.g. port 80) appearing anywhere in either of the two expressions would need to be assigned the same variable name.
There is an interesting-looking applet at http://www.izyt.com/BooleanLogic/applet.php - sadly I can't give it a try right now due to Java problems in my browser.
I'm pretty sure detecting equivalent expressions is either an NP-hard or NP-complete problem, even for boolean-only expressions. That means that to do it perfectly, you basically have to build complete tables of all possible combinations of inputs and the results, then compare the tables.
Maybe BPF expressions are limited in some way that changes that? I don't know, so I'm assuming not.
If your problems are small, that may not be a problem. I do exactly that as part of a decision-tree designing algorithm.
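For small expressions, that table comparison can be sketched in a few lines of Java. This assumes each distinct BPF token has been mapped to an index in a boolean array and that the tokens are treated as independent variables, which ignores any BPF-specific relationships between them; the class and method names are made up.

```java
import java.util.function.Predicate;

class TruthTableEquivalence {
    // Compares two expressions over `variableCount` boolean inputs by trying
    // every one of the 2^variableCount assignments.
    static boolean equivalent(Predicate<boolean[]> a, Predicate<boolean[]> b, int variableCount) {
        for (long bits = 0; bits < (1L << variableCount); bits++) {
            boolean[] assignment = new boolean[variableCount];
            for (int i = 0; i < variableCount; i++) {
                assignment[i] = ((bits >> i) & 1L) == 1L;
            }
            if (a.test(assignment) != b.test(assignment)) {
                return false; // found an input where the results differ
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // v[0] stands for "src port 1024", v[1] for "dst port 1024".
        Predicate<boolean[]> a = v -> v[0] && v[1];
        Predicate<boolean[]> b = v -> v[1] && v[0];
        System.out.println(equivalent(a, b, 2));
    }
}
```

The loop doubles in length with every extra variable, which is why this only stays practical while the number of distinct tokens is small.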
Alternatively, don't try to be perfect. Allow some false negatives (cases which are equivalent, but which you won't detect).
A simple approach may be to do a variant of the normal expression-evaluation, but evaluating an alternative representation of the expression rather than the result. Impose an ordering on commutative operators. Apply some obvious simplifications during the evaluation. Replace a rich operator set with a minimal set of primitive operators - e.g. using De Morgan's laws to eliminate OR operators.
This alternative representation forms a canonical representation for all members of a set of equivalent expressions. It should be an equivalence class in the sense that you always find the same canonical form for any member of that set. But that's only the set-theory/abstract-algebra sense of an equivalence class - it doesn't mean that all equivalent expressions are in the same equivalence class.
For efficient dictionary lookups, you can use hashes or comparisons based on that canonical representation.
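As a rough illustration of such a canonical representation, the sketch below flattens nested AND/OR nodes, sorts their operands, drops duplicates and removes double negation. The tiny Expr tree is a stand-in invented for this example (the asker's Interpreter-pattern classes are not shown), and it does not apply De Morgan's laws or any other deeper rewriting, so it will still miss many equivalences.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

class CanonicalForm {
    // Hypothetical minimal syntax tree node.
    static final class Expr {
        final String op;            // "VAR", "NOT", "AND" or "OR"
        final String token;         // set only for VAR, e.g. "src port 1024"
        final List<Expr> operands;
        Expr(String op, String token, List<Expr> operands) {
            this.op = op;
            this.token = token;
            this.operands = operands;
        }
    }

    static Expr var(String token) { return new Expr("VAR", token, Collections.<Expr>emptyList()); }
    static Expr not(Expr e) { return new Expr("NOT", null, Collections.singletonList(e)); }
    static Expr and(Expr... es) { return new Expr("AND", null, Arrays.asList(es)); }
    static Expr or(Expr... es) { return new Expr("OR", null, Arrays.asList(es)); }

    // Builds a string that is identical for trees differing only in operand
    // order, nesting of the same operator, duplicates, or double negation.
    static String canonical(Expr e) {
        if (e.op.equals("VAR")) {
            return e.token;
        }
        if (e.op.equals("NOT")) {
            Expr inner = e.operands.get(0);
            if (inner.op.equals("NOT")) {
                return canonical(inner.operands.get(0)); // not(not(x)) == x
            }
            return "not(" + canonical(inner) + ")";
        }
        TreeSet<String> parts = new TreeSet<>(); // sorted, duplicates dropped
        collect(e.op, e, parts);
        if (parts.size() == 1) {
            return parts.first();
        }
        return e.op.toLowerCase() + "(" + String.join(", ", parts) + ")";
    }

    // Flattens nested nodes that use the same operator into one operand set.
    private static void collect(String op, Expr e, TreeSet<String> out) {
        if (e.op.equals(op)) {
            for (Expr operand : e.operands) {
                collect(op, operand, out);
            }
        } else {
            out.add(canonical(e));
        }
    }
}
```

Because the result is a plain String, it can be used directly as a HashMap key or compared with equals.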
I'd definitely go with syntax normalization. That is, like aix suggested, transform the booleans using DNF and reorder the abstract syntax tree such that the lexically smallest arguments are on the left-hand side. Normalize all comparisons to < and <=. Then, two equivalent expressions should have equivalent syntax trees.

XSD: Index of sequence in Element name

I'm building an XSD to generate JAXB objects in Java. Then I ran into this:
<TotalBugs>
<Bug1>...</Bug1>
<Bug2>...</Bug2>
...
<BugN>...</BugN>
</TotalBugs>
How do I build a sequence of elements where the index of the sequence is in the element name? Specifically, how do I get the 1 in Bug1
You don't want to do it this way. XML has a top-down order by nature, so you don't have to number the elements yourself:
<totalBugs>
<bug><!-- Here comes 1st bug --></bug>
<bug><!-- Here comes 2nd bug --></bug>
...
<bug><!-- Here comes last bug --></bug>
</totalBugs>
You can access the 1st bug node in the list by the XPath expression:
/totalBugs/bug[1]
Note that, per the W3C standard, indexes start at 1. Please refer to w3schools for further reading.
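From Java, that positional lookup can be tried with the JDK's built-in XPath support. A minimal sketch, with a made-up class name and sample document:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FirstBugDemo {
    // Returns the text content of the first <bug> child of <totalBugs>.
    static String firstBug(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // XPath positions are 1-based, so [1] is the first bug.
            return XPathFactory.newInstance().newXPath().evaluate("/totalBugs/bug[1]", doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read the first bug", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstBug("<totalBugs><bug>first</bug><bug>second</bug></totalBugs>"));
    }
}
```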
I'm pretty sure XSD won't support what you need. However you can use <xsd:any> for that bit of the schema, then use something lower-level than JAXB to generate the XML for that particular part. (I think your generated classes will have fields like protected List<Element> any; which you can fill in using DOM).
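If you do go the DOM route, producing the numbered element names is just string concatenation. The sketch below builds the whole TotalBugs element with plain DOM; the class name, method name and the list-of-descriptions input are assumptions for illustration, and how the result is wired into the JAXB-generated class is left open.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NumberedBugs {
    // Builds <TotalBugs><Bug1>..</Bug1>...<BugN>..</BugN></TotalBugs>.
    static Element buildTotalBugs(List<String> descriptions) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element total = doc.createElement("TotalBugs");
            doc.appendChild(total);
            for (int i = 0; i < descriptions.size(); i++) {
                // The 1-based index becomes part of the element name.
                Element bug = doc.createElement("Bug" + (i + 1));
                bug.setTextContent(descriptions.get(i));
                total.appendChild(bug);
            }
            return total;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }
}
```

The individual BugN children could then be copied into the generated any list, if the generated class does end up with one.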
