Need to load static content in JMeter tests - java

I need to figure out a way to load content from a file containing a list of ids in the preprocessing step in JMeter. This needs to happen only once, not once per request. So it should work like this:
Load the list of static ids from the file once.
For every request pick one id randomly from this list.
POST the request
I am exploring the JSR223 PreProcessor but without much luck so far. I am also not sure whether the preprocessor executes for every request, which I do not want.
My current JSR223 PreProcessor looks something like the following:
import java.util.*;
import java.io.*;

try {
    Random generator = new Random();
    List<String> uuids = new ArrayList<String>();
    try (BufferedReader br = new BufferedReader(new FileReader("/uuids.txt"))) {
        String line = br.readLine();
        while (line != null) {
            uuids.add(line);
            line = br.readLine();
        }
    }
    int n = uuids.size();
    int rn = generator.nextInt(n);
    vars.put("some_file", "/files/" + uuids.get(rn) + ".json.gz");
} catch (Throwable ex) {
    log.error("Something went wrong", ex);
    throw ex;
}

Your approach is a little bit wrong because:
JSR223 PreProcessor is executed before each request in its scope
JSR223 PreProcessor is executed by each thread (virtual user)
So I would recommend the following enhancement:
Add a setUp Thread Group to your test plan
Add a JSR223 Sampler to it with the following code:
SampleResult.setIgnore()
props.put('uuids', new File('uuids.txt').readLines())
This will let the file be read only once and by only one thread.
Whenever you want to access a random uuid, you can use the following __groovy() function:
${__groovy(props.get('uuids').get(org.apache.commons.lang3.RandomUtils.nextInt(0\,props.get('uuids').size())),)}
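If you would rather keep the random pick in a JSR223 PreProcessor instead of the __groovy() function, a minimal sketch (assuming the setUp Thread Group above has already stored the list in the 'uuids' property, and reusing the /files/...json.gz naming from the question) could look like this:
// JSR223 PreProcessor, Groovy engine; runs before each request in its scope
List<String> uuids = (List<String>) props.get("uuids");
int idx = org.apache.commons.lang3.RandomUtils.nextInt(0, uuids.size());
vars.put("some_file", "/files/" + uuids.get(idx) + ".json.gz");
The sampler can then reference the chosen file via ${some_file}.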
More information on Groovy scripting in JMeter: Apache Groovy - Why and How You Should Use It

You can instead use the JMeter plugin bzm - Random CSV Data Set Config.
Just input the CSV filename and it will pick a random uuid from the file every time.

Related

Hive UDF in Java fails when creating a table

What is the difference between those two queries:
SELECT my_fun(col_name) FROM my_table;
and
CREATE TABLE new_table AS SELECT my_fun(col_name) FROM my_table;
Where my_fun is a Java UDF.
I'm asking because when I create the new table (the second query) I receive a Java error.
Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
...
Caused by: org.apache.hadoop.hive.ql.exec.UDFArgumentException: Unable to instantiate UDF implementation class com.company_name.examples.ExampleUDF: java.lang.NullPointerException
I found that the source of the error is this line in my Java file:
encoded = Files.readAllBytes(Paths.get(configPath));
But the question is: why does it work when no table is created, and fail when the table is created?
The problem might be with the way you read the file. Try passing the file path as the second argument to the UDF, then read it as follows:
private BufferedReader getReaderFor(String filePath) throws HiveException {
    try {
        Path fullFilePath = FileSystems.getDefault().getPath(filePath);
        Path fileName = fullFilePath.getFileName();
        if (Files.exists(fileName)) {
            return Files.newBufferedReader(fileName, Charset.defaultCharset());
        }
        else if (Files.exists(fullFilePath)) {
            return Files.newBufferedReader(fullFilePath, Charset.defaultCharset());
        }
        else {
            throw new HiveException("Could not find \"" + fileName + "\" or \"" + fullFilePath + "\" in inersect_file() UDF.");
        }
    }
    catch (IOException exception) {
        throw new HiveException(exception);
    }
}

private void loadFromFile(String filePath) throws HiveException {
    set = new HashSet<String>();
    try (BufferedReader reader = getReaderFor(filePath)) {
        String line;
        while ((line = reader.readLine()) != null) {
            set.add(line);
        }
    } catch (IOException e) {
        throw new HiveException(e);
    }
}
The full code for a different generic UDF that uses a file reader can be found here.
I think several points are unclear, so this answer is based on assumptions.
First of all, it is important to understand that Hive currently optimizes several simple queries. Depending on the size of your data, the query that works for you, SELECT my_fun(col_name) FROM my_table;, is most likely running locally on the client where you execute the job, which is why your UDF can access the config file that is available locally; this "execution mode" is due to the size of your data. CTAS triggers a job regardless of the input data, and that job runs distributed on the cluster, where each worker fails to access your config file.
It also looks like you are trying to read your configuration file from the local file system, not from HDFS (Files.readAllBytes(Paths.get(configPath))). This means that your configuration either has to be replicated on all the worker nodes or has to be added to the distributed cache beforehand (you can use add file for this, doc here). You can find other questions here about accessing files from the distributed cache from UDFs.
One additional problem is that you are passing the location of your config file through an environment variable, which is not propagated to the worker nodes as part of your Hive job. You should pass this setting as a Hive config; there is an answer about accessing the Hive config from a UDF here, assuming that you are extending GenericUDF.
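To illustrate that last point, here is a minimal, hypothetical GenericUDF (not the asker's ExampleUDF; the property name my.udf.config.path is invented for this example) that reads its setting from the job configuration, so the value travels with the distributed job instead of depending on a client-side environment variable:

import org.apache.hadoop.hive.ql.exec.MapredContext;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

// Hypothetical UDF: picks up a property set in the Hive session, e.g.
//   set my.udf.config.path=/path/visible/to/workers;
public class ConfigAwareUDF extends GenericUDF {

    private String configPath;

    @Override
    public void configure(MapredContext context) {
        // Runs on the task side; the JobConf carries properties set via "set key=value;"
        configPath = context.getJobConf().get("my.udf.config.path");
    }

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) {
        return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        // Real logic would read the file referenced by configPath (ideally added to the
        // distributed cache with "add file"); returning the path keeps the sketch short.
        return configPath;
    }

    @Override
    public String getDisplayString(String[] children) {
        return "config_aware_udf()";
    }
}

Note that configure(MapredContext) is typically only invoked when the UDF runs inside a real MapReduce/Tez task, which matches the distributed CTAS case described above.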

How to use OpenNLP parser models in an Android app?

I went through this link for Java NLP: https://www.tutorialspoint.com/opennlp/index.htm
I tried the code below in Android:
try {
    File file = copyAssets();
    // InputStream inputStream = new FileInputStream(file);
    ParserModel model = new ParserModel(file);
    // Creating a parser
    Parser parser = ParserFactory.create(model);
    // Parsing the sentence
    String sentence = "Tutorialspoint is the largest tutorial library.";
    Parse topParses[] = ParserTool.parseLine(sentence, parser, 1);
    for (Parse p : topParses) {
        p.show();
    }
} catch (Exception e) {
}
I downloaded the file **en-parser-chunking.bin** from the internet and placed it in the assets of my Android project, but the code stops on the third line, i.e. ParserModel model = new ParserModel(file);, without giving any exception. How can I make this work in Android? If it cannot work, is there any other support for NLP in Android without consuming any services?
The reason the code stalls/breaks at runtime is that you need to use an InputStream instead of a File to load the binary model resource. Most likely, the File instance is null when you "load" it the way indicated in line 2. In theory, this constructor of ParserModel should detect this and an IOException should be thrown. Yet, sadly, the JavaDoc of OpenNLP is not precise about this kind of situation, and you are not handling this exception properly in the catch block.
Moreover, the code snippet you presented should be improved, so that you know what actually went wrong.
Therefore, loading a ParserModel from within an Activity should be done differently. Here is a variant that takes care of both aspects:
AssetManager assetManager = getAssets();
InputStream in = null;
try {
    in = assetManager.open("en-parser-chunking.bin");
    if (in != null) {
        ParserModel parserModel = new ParserModel(in);
        // From here, <parserModel> is initialized and you can start playing with it...
        // Creating a parser
        Parser parser = ParserFactory.create(parserModel);
        // Parsing the sentence
        String sentence = "Tutorialspoint is the largest tutorial library.";
        Parse topParses[] = ParserTool.parseLine(sentence, parser, 1);
        for (Parse p : topParses) {
            p.show();
        }
    }
    else {
        // resource file not found - whatever you want to do in this case
        Log.w("NLP", "OpenNLP binary model file could not be found in assets.");
    }
}
catch (Exception ex) {
    Log.e("NLP", "message: " + ex.getMessage(), ex);
    // proper exception handling here...
}
finally {
    if (in != null) {
        try {
            in.close();
        } catch (IOException ignore) {
            // nothing sensible to do here
        }
    }
}
This way, you're using an InputStream approach and at the same time taking care of proper exception and resource handling. Moreover, you can now use a debugger in case something remains unclear with the resource path references of your model files. For reference, see the official JavaDoc of AssetManager#open(String resourceName).
Note well:
Loading OpenNLP's binary resources can consume quite a lot of memory. For this reason, it may happen that your Android app's request to allocate the needed memory for this operation is not granted by the actual runtime (i.e., smartphone) environment.
Therefore, carefully monitor the amount of requested/required RAM while parserModel = new ParserModel(in); is invoked.
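If you want a rough idea of the available headroom before loading the model, a small sketch like the following could help (the 50 MB threshold is only an illustrative assumption):
// Inside an Activity: log current heap usage before the model is loaded
ActivityManager am = (ActivityManager) getSystemService(Context.ACTIVITY_SERVICE);
long maxHeap = Runtime.getRuntime().maxMemory();
long used = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
Log.i("NLP", "Heap used " + used + " of " + maxHeap + " bytes; memory class " + am.getMemoryClass() + " MB");
if (maxHeap - used < 50L * 1024 * 1024) {
    // Assumption: with less than ~50 MB of free heap, loading the model will likely fail
    Log.w("NLP", "Low free heap - loading the OpenNLP model may cause an OutOfMemoryError.");
}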
Hope it helps.

Apache Lucene doesn't filter stop words despite the usage of StopAnalyzer and StopFilter

I have a module based on Apache Lucene 5.5 / 6.0 which retrieves keywords. Everything is working fine except one thing — Lucene doesn't filter stop words.
I tried to enable stop word filtering with two different approaches.
Approach #1:
tokenStream = new StopFilter(new ASCIIFoldingFilter(new ClassicFilter(new LowerCaseFilter(stdToken))), EnglishAnalyzer.getDefaultStopSet());
tokenStream.reset();
Approach #2:
tokenStream = new StopFilter(new ClassicFilter(new LowerCaseFilter(stdToken)), StopAnalyzer.ENGLISH_STOP_WORDS_SET);
tokenStream.reset();
The full code is available here:
https://stackoverflow.com/a/36237769/462347
My questions:
Why doesn't Lucene filter stop words?
How can I enable stop-word filtering in Lucene 5.5 / 6.0?
Just tested both approach 1 and approach 2, and they both seem to filter out stop words just fine. Here is how I tested it:
public static void main(String[] args) throws IOException, ParseException, org.apache.lucene.queryparser.surround.parser.ParseException
{
    StandardTokenizer stdToken = new StandardTokenizer();
    stdToken.setReader(new StringReader("Some stuff that is in need of analysis"));

    TokenStream tokenStream;

    // Your code starts here
    tokenStream = new StopFilter(new ASCIIFoldingFilter(new ClassicFilter(new LowerCaseFilter(stdToken))), EnglishAnalyzer.getDefaultStopSet());
    tokenStream.reset();
    // And ends here

    CharTermAttribute token = tokenStream.getAttribute(CharTermAttribute.class);
    while (tokenStream.incrementToken()) {
        System.out.println(token.toString());
    }
    tokenStream.close();
}
Results:
some
stuff
need
analysis
Which has eliminated the four stop words in my sample.
The pitfall was the default Lucene stop-words list; I expected it to be much broader.
Here is the code, which first tries to load a customized stop-words list and, if that fails, falls back to the standard one:
CharArraySet stopWordsSet;

try {
    // use customized stop words list
    String stopWordsDictionary = FileUtils.readFileToString(new File(%PATH_TO_FILE%));
    stopWordsSet = WordlistLoader.getWordSet(new StringReader(stopWordsDictionary));
} catch (FileNotFoundException e) {
    // use standard stop words list
    stopWordsSet = CharArraySet.copy(StandardAnalyzer.STOP_WORDS_SET);
}

tokenStream = new StopFilter(new ASCIIFoldingFilter(new ClassicFilter(new LowerCaseFilter(stdToken))), stopWordsSet);
tokenStream.reset();
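If, instead of replacing the default list, you only want to extend it with a few extra words, a short sketch (the added words here are just examples) could look like this:
// Copy Lucene's default English stop set into a modifiable set and add custom entries
CharArraySet extendedStopWords = CharArraySet.copy(EnglishAnalyzer.getDefaultStopSet());
extendedStopWords.addAll(Arrays.asList("stuff", "analysis"));
tokenStream = new StopFilter(new ASCIIFoldingFilter(new ClassicFilter(new LowerCaseFilter(stdToken))), extendedStopWords);
tokenStream.reset();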

Integrating MaltParser into java code, without using a separate process

There are several resources already available for training and executing the grammatical dependency parser MaltParser; most notable is the project's homepage: http://www.maltparser.org/userguide.html#startusing. And looking at the NLTK code that uses MaltParser, I see how I could write equivalent Java code to start up a separate child process to run MaltParser: http://nltk.org/_modules/nltk/parse/malt.html. However, what I am asking, or rather looking for, is code that clearly and cleanly shows how to integrate MaltParser as a library into a Java program.
To be specific, I want to write Java code to do the following:
Train a parsing model.
Load a trained model and parse sentences in an online fashion (i.e. stream sentences and use a MaltParser object to parse each one).
To whoever has the knowledge, patience, and willingness: please help me answer 1 and 2!
I found a rudimentary solution to 2. I noticed that on http://www.maltparser.org/userguide.html#api it directs one to a listing of example files. I took this snippet out of one of those files:
/**
 * @author Johan Hall
 */
public static void main(String[] args) {
    try {
        MaltParserService service = new MaltParserService();
        // Initializes the parser model 'model0', sets the working directory to '.' and sets the logging file to 'parser.log'
        service.initializeParserModel("-c model0 -m parse -w . -lfi parser.log");
        // Creates an array of tokens, which contains the Swedish sentence 'Grundavdraget upphör alltså vid en taxerad inkomst på 52500 kr.'
        // in the CoNLL data format.
        String[] tokens = new String[11];
        tokens[0] = "1\tGrundavdraget\t_\tN\tNN\tDD|SS";
        tokens[1] = "2\tupphör\t_\tV\tVV\tPS|SM";
        tokens[2] = "3\talltså\t_\tAB\tAB\tKS";
        tokens[3] = "4\tvid\t_\tPR\tPR\t_";
        tokens[4] = "5\ten\t_\tN\tEN\t_";
        tokens[5] = "6\ttaxerad\t_\tP\tTP\tPA";
        tokens[6] = "7\tinkomst\t_\tN\tNN\t_";
        tokens[7] = "8\tpå\t_\tPR\tPR\t_";
        tokens[8] = "9\t52500\t_\tR\tRO\t_";
        tokens[9] = "10\tkr\t_\tN\tNN\t_";
        tokens[10] = "11\t.\t_\tP\tIP\t_";
        // Parses the Swedish sentence above
        DependencyStructure graph = service.parse(tokens);
        // Outputs the dependency graph created by MaltParser.
        System.out.println(graph);
        // Terminates the parser model
        service.terminateParserModel();
    } catch (MaltChainedException e) {
        System.err.println("MaltParser exception: " + e.getMessage());
    }
}
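Building on that snippet, a small wrapper for point 2 (a sketch only, reusing the same MaltParserService calls as above) could load the model once and then parse sentences as they arrive:

import org.maltparser.MaltParserService;
import org.maltparser.core.exception.MaltChainedException;
import org.maltparser.core.syntaxgraph.DependencyStructure;

// Hypothetical wrapper: initialize the parser model once, parse many pre-tokenized sentences
public class OnlineMaltParser implements AutoCloseable {

    private final MaltParserService service;

    public OnlineMaltParser(String options) throws MaltChainedException {
        service = new MaltParserService();
        service.initializeParserModel(options); // e.g. "-c model0 -m parse -w . -lfi parser.log"
    }

    // Each call parses one sentence given as CoNLL-formatted token lines
    public DependencyStructure parse(String[] conllTokens) throws MaltChainedException {
        return service.parse(conllTokens);
    }

    @Override
    public void close() throws MaltChainedException {
        service.terminateParserModel();
    }
}

For point 1 (training), the user guide describes learning mode via the -m learn option on the command line; whether your MaltParser version also exposes training through a Java API is worth checking in its javadoc.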

How Can I Read The Next Row From A CSV Data Set Config In JMeter?

I am in the process of creating a test plan in JMeter which visits a random number of pages (from 2 to 10), whose URLs are fetched from a CSV Data Set. I have created the CSV Data Set and the samplers, which are working fine, except that only one row is read from the Data Set per thread, which is not what I need - I want a new row to be read after the sampler has completed (or before, I'm not fussed).
I saw that this question is very similar and the solution was to use the Raw Data Source Pre-Processor, which does work but requires arduous alterations to the file in question (adding chunk sizes before each line), which is a bit of a pain when the file is about 500 lines long.
Is there a way I can set the CSV Data Set to advance to the next row on reading, or use some post- or pre-processor, such as BeanShell, to do this? I have seen people state that CSVRead can do this, but its documentation states that access is per-thread, which would be no good for me.
As a side note - ultimately all I want to do is access a random line in the file which gets passed to a HTTP sampler, if there is an easier or better way to do this I'm open to suggestions.
For this you can use BeanShell (essentially Java) code executed from a BeanShell Sampler / BeanShell PostProcessor / BeanShell PreProcessor.
The following code will read all the lines from your file and then select a single random one:
import java.text.*;
import java.io.*;
import java.util.*;

String[] params = Parameters.split(",");
String csvTest = params[0];
String csvDir = params[1];

ArrayList strList = new ArrayList();

try {
    File file = new File(System.getProperty("user.dir") + File.separator + csvDir + File.separator + csvTest);
    if (!file.exists()) {
        throw new Exception("ERROR: file " + csvTest + " not found in " + csvDir + " directory.");
    }

    BufferedReader bufRdr = new BufferedReader(new FileReader(file));
    String line = null;
    while ((line = bufRdr.readLine()) != null) {
        strList.add(line);
    }
    bufRdr.close();

    Random rnd = new java.util.Random();
    vars.put("csvUrl", strList.get(rnd.nextInt(strList.size())));
}
catch (Exception ex) {
    IsSuccess = false;
    log.error(ex.getMessage());
    System.err.println(ex.getMessage());
}
catch (Throwable thex) {
    System.err.println(thex.getMessage());
}
Then you can access the extracted URL via a variable (${csvUrl} in this example).
My only doubt is whether reading the full file on each iteration (if you have to execute this in a loop) is a good solution from a performance point of view.
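To avoid re-reading the file on every iteration, one option (a sketch, assuming a hypothetical urls.csv in the JMeter working directory, the same imports as the snippet above, and that all threads may share the list) is to cache the lines in a JMeter property the first time and only pick a random entry afterwards:
// Read the file only once per JVM, then reuse the cached list on later iterations
List lines = (List) props.get("cachedCsvLines");
if (lines == null) {
    lines = new ArrayList();
    BufferedReader reader = new BufferedReader(new FileReader("urls.csv")); // hypothetical file name
    String line;
    while ((line = reader.readLine()) != null) {
        lines.add(line);
    }
    reader.close();
    props.put("cachedCsvLines", lines);
}
vars.put("csvUrl", (String) lines.get(new Random().nextInt(lines.size())));
If several threads start at the same time they may each read the file once on their first iteration; doing the read in a setUp Thread Group, as in the JMeter answer at the top of this page, avoids that race.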
