Integration test, Elasticsearch, timing issue, document not found - Java

I have an integration test, in Java, of a facade that does a few things, among which is an index operation into an Elasticsearch database. This Elasticsearch database has been set up very naively (out-of-the-box defaults, actually; I'm in the learning phase).
The insertion is done, inside that facade, with the Java API, also very naively, with the example nearly completely copy-pasted from Elasticsearch, as described here: http://www.elasticsearch.org/guide/reference/java-api/index_.html.
Afterwards I test whether my facade did its stuff correctly; part of that is checking whether the document has really been inserted into the database. This I do, again, in the way Elasticsearch describes on their site: http://www.elasticsearch.org/guide/reference/java-api/search.html. I insert a document with a certain payload and look it up the same way.
This test works if I run in debug and set a breakpoint after the facade has done its stuff, but it fails with no results found if I don't set this breakpoint or don't run in debug. This makes me think I'm really doing something wrong. Also, the application itself works (inserts and so on), so there's likely something wrong with my integration test, and not with my copy-pasted code.
I guess that after the indexing operation returns, the indexing is not really finished yet, or there is some replication going on that doesn't complete before the search, or something like that, but what exactly eludes me, and I can't seem to get it solved either.
I haven't tried putting Elasticsearch on one node and one shard yet; maybe there's something wrong there, but I don't really see what exactly, so I haven't walked that path yet.
Like I said, I just started using Elasticsearch, so I might be missing something crucial and beginner-style. I can paste my exact code if need be, but like I said, it boils down to using two code snippets from the Elasticsearch site in a test.
Kasper

Elasticsearch doesn't make data available to search immediately after the index operation is called. By default it waits for up to 1 second (the refresh interval) before newly indexed data becomes visible. However, you can force Elasticsearch to make all data available immediately by calling refresh:
client.admin().indices().refresh(refreshRequest()).actionGet();
Try adding this operation after your facade is done indexing, before you check the final result.

I was unable to find the refreshRequest method (presumably it is meant as a static import, e.g. from org.elasticsearch.client.Requests). I have used something like this instead, and it seems to be working now. I will do some more tests:
client.admin().indices().refresh(new RefreshRequest(indexName)).actionGet();
where indexName is the String name of the index to be refreshed.
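For reference, here is a minimal sketch of how the whole check might look in a test, assuming the legacy Java API from the question; the facade, client, index name, field name and JUnit usage are placeholders for your own setup:
import static org.junit.Assert.assertEquals;

import org.elasticsearch.action.admin.indices.refresh.RefreshRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.junit.Test;

@Test
public void facadeIndexesTheDocument() {
    facade.doItsStuff(); // performs the index operation internally

    // Force a refresh so everything indexed so far is visible to search,
    // instead of waiting for the periodic (default 1 s) refresh.
    client.admin().indices().refresh(new RefreshRequest("myindex")).actionGet();

    SearchResponse response = client.prepareSearch("myindex")
            .setQuery(QueryBuilders.termQuery("payload", "expected-value"))
            .execute()
            .actionGet();

    assertEquals(1L, response.getHits().getTotalHits());
}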

Related

Is there any way to guarantee that an ElasticSearch index has been deleted

In some automated tests, I am trying to delete and immediately recreate an index at the start of every test, using Elasticsearch's high-level REST client (version 6.4), as follows:
DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest(indexName);
deleteIndexRequest.indicesOptions(IndicesOptions.lenientExpandOpen());
client.indices().delete(deleteIndexRequest, RequestOptions.DEFAULT);
CreateIndexRequest createIndexRequest = new CreateIndexRequest(indexName);
createIndexRequest.mapping("_doc", "{...}", XContentType.JSON);
client.indices().create(createIndexRequest, RequestOptions.DEFAULT);
The problem I have is that, intermittently, my tests fail at the point of creating the index, with an error:
{"error": {"root_cause":[{"type":"resource_already_exists_exception","reason":"index [(index-name)/(UUID)] already exists, ...,}] "status":400}
The more tests I run, the more likely I am to see the error, which seems to be a strong indicator that it's a race condition - presumably, when I try to recreate the index, the previous delete operation hasn't always completed.
This is backed up by the fact that if I put a breakpoint immediately after the delete operation and manually run a curl request to look at the index I tried to delete, I find that some of the time it's still there; on those occasions the error above appears if I continue the test.
I've tried asserting the isAcknowledged() method of the response to the delete operation, but that always returns true, even in cases when the error occurs.
I've also tried doing an exists() check before the create operation. Interestingly, in that case, if I run the tests without breakpoints, the exists() check always returns false (i.e. that the index doesn't exist), even in cases where the error then occurs; but if I put a breakpoint in before the create operation, the exists() check returns true in cases where the error will happen.
I'm at a bit of a loss. As far as I understand, my requests should be synchronous, and from a comment on this question, this should mean that the delete() operation only returns when the index has definitely been deleted.
I suspect a key part of the problem might be that these tests are running on a cluster of 3 nodes. In setting up the client, I'm only addressing one of the nodes:
client = new RestHighLevelClient(RestClient.builder(new HttpHost("example.com", 9200, "https")));
but I can see that each operation is being replicated to the other two nodes.
When I stop at a breakpoint before the create operation, in cases where the index is not deleted, I can see that it's not deleted on any of the nodes, and it seems not to matter how long I wait; it never gets deleted.
Is there some way I can reliably determine whether the index has been deleted before I create it? Or perhaps something I need to do before I attempt the delete operation, to guarantee that it will succeed?
Hey, I think there are quite a few things to consider here. For one, I'd test everything with curl or some kind of REST client before doing anything in code. That might just help you conceptually, but that's just my opinion.
This is one thing you should consider:
"If an external versioning variant is used, the delete operation automatically creates an index if it has not been created before (check out the create index API for manually creating an index)."
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html
That would kind of explain why exists() returns false: if the external versioning variant is used, the delete operation would actually create an index with the same name prior to deleting it.
You mentioned that you are working with a three-node cluster. Something you can try is:
"When making delete requests, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the delete request." Here is a super detailed explanation which is certainly worth reading: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards
I suggest you try:
curl -X DELETE 127.0.0.1:9200/fooindex?wait_for_active_shards=3
You said you have 3 nodes in your cluster, so this means that: "...indexing operation will require 3 active shard copies before proceeding, a requirement which should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard."
This check is probably not 100% watertight, since according to the docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-wait-for-active-shards
"It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation commences. Once the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the write operation’s response reveals the number of shard copies on which replication succeeded/failed." so perhaps use this parameter, but have your code check the response to see if any operations failed.
Something you can also try, though I can't seem to find good documentation to back this up, is the wait_for_completion parameter, which should be able to tell you if the cluster isn't ready to accept deletes:
curl -X DELETE 127.0.0.1:9200/index?wait_for_completion=true
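For what it's worth, one mitigation that often helps in test fixtures is to poll exists() after the delete instead of trusting isAcknowledged() alone. Given your observation that exists() can report false while create still fails, this narrows the window rather than provably closing it, but combined with wait_for_active_shards it tends to make the setup deterministic. A rough sketch against the 6.4 high-level REST client (the retry count and sleep interval are arbitrary):
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.admin.indices.get.GetIndexRequest;
import org.elasticsearch.action.support.IndicesOptions;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

static void deleteIndexAndWait(RestHighLevelClient client, String indexName) throws Exception {
    DeleteIndexRequest delete = new DeleteIndexRequest(indexName);
    delete.indicesOptions(IndicesOptions.lenientExpandOpen());
    client.indices().delete(delete, RequestOptions.DEFAULT);

    // Poll until the cluster stops reporting the index (bounded retry).
    GetIndexRequest exists = new GetIndexRequest().indices(indexName);
    for (int attempt = 0; attempt < 50; attempt++) {
        if (!client.indices().exists(exists, RequestOptions.DEFAULT)) {
            return;
        }
        Thread.sleep(100);
    }
    throw new IllegalStateException("index " + indexName + " still exists after delete");
}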

Java Cucumber: creating scenario outlines with dynamic examples

We have a test where basically we need to input a specific value into a web site and make sure another value comes out. The input/output data for this is stored in an XML file.
Now we can create a single Scenario that runs once and loops through, submitting each value. However, we run into some reporting problems: if 2 out of 100 pairs fail, we want to know which ones, and not just have an assertion error for the whole scenario.
We would get much clearer reporting using a Scenario Outline where all the values are in the examples table. Then the scenario itself runs repeatedly, and we can fail an individual set as an assertion error and have that come back clearly in a report.
Problem: we do not want to hard-code all the values from the XML into the .feature file. It's noisy, but also, if the values change, it's slow to update. We would rather just provide the XML, parse it, and go; if things change, we just drop in an updated XML.
Is there a way to create dynamic examples where we can run the scenario repeatedly, once for each data case, without explicitly defining it in the examples table?
Using Cucumber for this is a bad idea. You should test this functionality lower down your stack with a unit test.
At some point in your code, after the user has input their value, the value will be passed to a method/function that will return your answer. This is the place to do this sort of testing.
A Cucumber test going through the whole stack will be upwards of three orders of magnitude slower than a well-written unit test. So you could test thousands of pairs of values in your unit test in the time it takes to run one single cuke.
If you do this sort of testing in Cucumber you will quickly end up with a test suite that takes far too long to run, or that can only be run quickly at great expense. This is very damaging to a project.
Cuking should be about one happy path (the user can enter a value and see the result) and maybe a sad path (the user enters a bad value and sees an error/explanation). Anything else needs to be pushed down to unit tests.
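To make that concrete, here is a rough sketch of pushing the XML pairs down to a parameterized unit test (JUnit 5 here; XmlPairs.load() and Calculator.convert() are hypothetical stand-ins for your XML parser and the method under test). Each pair reports pass/fail individually, which gives you the per-case reporting you wanted:
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.stream.Stream;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

class ConversionTest {

    // One Arguments entry per input/expected pair from the XML file.
    static Stream<Arguments> pairs() {
        return XmlPairs.load("pairs.xml").stream()
                .map(p -> Arguments.of(p.getInput(), p.getExpected()));
    }

    @ParameterizedTest // each failing pair shows up as its own failed test
    @MethodSource("pairs")
    void convertsInputToExpectedOutput(String input, String expected) {
        assertEquals(expected, Calculator.convert(input));
    }
}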
The NoraUi framework does exactly what you want to do in your project. The NoraUi code is open source. If you have questions about this framework, you can post an issue with the tag "Question".

How to get all commits for a certain release in GitHub?

I know I can get all commits in a project using GET /repos/:owner/:repo/commits
Now I want to get all commits for a certain release of that project.
What should I do?
Judging by your answer to my question, you want the commits made since some tag. This will take a couple of steps to complete. First, you need to get the SHA for the tag in question; you'll want to use the git references API to get a specific reference. In the specific example that you linked, you'll want to do:
GET /repos/nasa/mct/git/refs/tags/v1.8b3
You'll want to get the 'sha' attribute from the object stored in the 'object' attribute of the response. With that 'sha', use the commits API to list commits starting from it, so your request will look like this:
GET /repos/nasa/mct/commits?sha=%(sha_from_first_request)s
That will give you 30 commits per page by default (if I remember correctly), so you should see if adding &per_page=100 to the end helps. I can't tell you exactly how to do this in Java, but I expect you'll be able to use one of the libraries written to interact with the API to make it easier.
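If you do want to make these two calls from plain Java without a client library, a bare-bones sketch with Java 11's built-in HttpClient and the org.json parser might look like this (the repo and tag are the ones from the example above; error handling and authentication are omitted):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.json.JSONObject;

public class ReleaseCommits {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        // Step 1: resolve the tag to a SHA via the git references API.
        HttpRequest refRequest = HttpRequest.newBuilder(
                URI.create("https://api.github.com/repos/nasa/mct/git/refs/tags/v1.8b3")).build();
        String refBody = http.send(refRequest, HttpResponse.BodyHandlers.ofString()).body();
        String sha = new JSONObject(refBody).getJSONObject("object").getString("sha");

        // Step 2: list commits starting from that SHA, 100 per page.
        HttpRequest commitsRequest = HttpRequest.newBuilder(
                URI.create("https://api.github.com/repos/nasa/mct/commits?sha=" + sha + "&per_page=100")).build();
        System.out.println(http.send(commitsRequest, HttpResponse.BodyHandlers.ofString()).body());
    }
}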

My failing Selenium test works manually

I have a Selenium test which had been working as expected for the past month.
Since last week, this one test alone fails 8 times out of 10 when the form is saved, throwing a Hibernate TransientObjectException. So it doesn't fail 100% of the time, just around 80-90%. But it fails at just that one point, when the Save button is clicked.
The developers tell me that they have changed nothing at all on the test server in the last week.
I tried the same form manually about 10 times and it saves perfectly all 10 times.
Could there be something wrong with my Selenium test ?
Any thoughts would be helpful.
The exception means that the object that is fed to Hibernate (I'm guessing a Java representation of the form) is not attached to the Hibernate session at the time of saving/updating.
Given that it works manually and not with Selenium, I'm guessing a race condition.
Something like an update/delete being performed while the matching object is not (yet) attached to the hibernate session.
Selenium is quite a bit faster at clicking than a human ;-)
My best bet would be to have the programmers look at any (async) calls to the database via Hibernate and the execution order of those calls, and see if there are any race conditions possible (or in this case, where).
Have you tried adding a wait command? It may be something as simple as trying to select the element a little too early.
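For example, an explicit wait on the Save button before clicking it (Selenium's WebDriverWait, shown here with the Selenium 3-style constructor; driver and the locator are placeholders for your own):
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// Wait up to 10 seconds for the Save button to be clickable before clicking it.
WebDriverWait wait = new WebDriverWait(driver, 10);
WebElement saveButton = wait.until(ExpectedConditions.elementToBeClickable(By.id("save")));
saveButton.click();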

Where to store expected output of a test?

When writing a test, I expect the tested method to return certain outputs. Usually I'm checking that for a given database operation I get a certain output. My practice has usually been to write the expected values as a quick map/properties structure in the test itself.
This solution is quick, and it is not vulnerable to run-time changes of an external file from which the expected results would be loaded.
Another solution is to place the data in a Java source file, so I bloat the test less and still get a compile-time-checked test. How about this?
Or is loading the expected results as resources a better approach? A .properties file is not good enough, since I can have only one value per key. Is commons-config the way to go?
I'd prefer a simple solution where I can name several properties per key, so for each entry I might have a doc-length and a numFound property value (sounds like the elements of an XML node).
How do you go about this?
You must keep in mind the cost of maintaining such tests. After writing several web service tests with Spring-WS test support, I must admit that storing requests (test setup) and expected responses in external XML files wasn't such a good idea. Each request-response pair had the same name prefix as its test case, so everything was automated and very clean. But still, refactoring and diagnosing test failures became painful. After a while I realized that embedding the XML in the test case as a String, although ugly, is much easier to maintain.
In your case, I assume you invoke some database query and you get a list of maps in response. What about writing some nice DSL to make assertions on these structures? Actually, FEST-Assert is quite good for that.
Let's say you test the following query (I know it's an oversimplification):
List<Map<String, Object>> rs = db.query("SELECT id, name FROM Users");
then you can simply write:
assertThat(rs).hasSize(1);
assertThat(rs.get(0))
    .hasSize(2)
    .includes(
        entry("id", 7),
        entry("name", "John")
    );
Of course it can and should be further simplified to fit your needs better. Isn't it easier to have a full test scenario on one screen rather than jumping from one file to another?
Or maybe you should try FitNesse (it looks like you are no longer doing unit testing, so an acceptance testing framework should be fine), where tests are stored in wiki-like documents, including tables.
Yes, using resources for expected results (and also setup data) works well and is pretty common.
XML may well be a useful format for you - being hierarchical can certainly help (one element per test method). It depends on the exact situation, but it's definitely an option. Alternatively, JSON may be easier for you. What are you comfortable with, in terms of serialization APIs?
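As a sketch of the JSON resource approach, assuming Jackson is on the classpath (the file name and the ExpectedResult fields are hypothetical), one file per test class keyed by test-method name gives you several values per key, which a flat .properties file can't do:
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.InputStream;
import java.util.Map;

class ExpectedResult {
    public int docLength; // the doc-length property from the question
    public int numFound;  // the numFound property from the question
}

class ExpectedResults {
    // Loads a map of test-method name -> expected values from the classpath.
    static Map<String, ExpectedResult> load() throws Exception {
        try (InputStream in = ExpectedResults.class.getResourceAsStream("/expected-results.json")) {
            return new ObjectMapper().readValue(in, new TypeReference<Map<String, ExpectedResult>>() {});
        }
    }
}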
Given that you mention you are usually testing that a certain DB operation returns expected output, you may want to take a look at using DBUnit:
// Load expected data from an XML dataset
IDataSet expectedDataSet = new FlatXmlDataSetBuilder().build(new File("expectedDataSet.xml"));
ITable expectedTable = expectedDataSet.getTable("TABLE_NAME");
// Assert actual database table match expected table
Assertion.assertEquals(expectedTable, actualTable);
DBUnit handles comparing the state of a table after some operation has completed and asserting that the data in the table matches an expected DataSet. The most common format for the DataSet that you compare the actual table state with is probably using an XmlDataSet, where the expected data is loaded from an XML file, but there are other subclasses as well.
If you are already doing testing like this, then it sounds like you may have written much of the same logic already - but DBUnit may give you, for free, additional features you haven't implemented on your own yet.
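For context, the actualTable in the snippet above would come from a live DBUnit connection; a minimal sketch, assuming you already have a JDBC connection to hand (IDatabaseConnection is DBUnit's wrapper around it):
import java.sql.Connection;
import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.ITable;

// Wrap the JDBC connection and read the current state of the table under test.
IDatabaseConnection dbUnitConnection = new DatabaseConnection(jdbcConnection);
ITable actualTable = dbUnitConnection.createDataSet().getTable("TABLE_NAME");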
