i compiled the :
grammar Hello; // Define a grammar called Hello
r : 'hello' ID ; // match keyword hello followed by an identifier
ID : [a-z]+ ; // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines, \r (Windows)r
the command to generate the class files (notice im creating it with -package hellogrammer ) :
java -jar antlr-4.9.2-complete.jar -package hellogrammer -o c:\Dev\my\java\ANTLR\test_project\core\src\main\java\hellogrammer c:\Dev\my\java\ANTLR\test_project\core\src\main\java\Hello.g4
and it creates the files just fine , then i compile the files and it looks like this :
c:\Dev\my\java\ANTLR\test_project\core\target\classes\hellogrammer>ls -1
HelloBaseListener.class
HelloLexer.class
HelloListener.class
'HelloParser$RContext.class'
HelloParser.class
now when I try to execute the TestRig command im getting no response from the command line :
c:\Dev\my\java\ANTLR\test_project\core\target\classes>java -cp ".;C:/Dev/my/java/ANTLR/antlr-4.9.2-complete.jar" org.antlr.v4.gui.TestRig hellogrammer.Hello -tree
it just stacks with no error or any response ...
The TestRig requires two separate parameters, first the grammar name and the the start rule name. This TestRig then begins parsing input from the input stream, so, you can either type input (with a Ctrl-D for signal EOF), or you can redirect your input to stdin with <
Try:
c:\Dev\my\java\ANTLR\test_project\core\target\classes>java -cp ".;C:/Dev/my/java/ANTLR/antlr-4.9.2-complete.jar" org.antlr.v4.gui.TestRig hellogrammar.Hello r -tree < “your input file”
Related
Background:
Using a csv as input, I want to combine the first two columns into a new one (separated by an underscore) and add that new column to the end of a new csv.
Input:
column1,column2,column3
1,2,3
a,b,c
Desired output:
column1,column2,column3,column1_column2
1,2,3,1_2
a,b,c,a_b
The below awk phrase works from the command line:
awk 'BEGIN{FS=OFS=","} {print \$0, (NR>1 ? \$1"_"\$2 : "column1_column2")}' file.csv > full_template.csv
However, when placed within a nextflow script (below) it gives an error.
#!/usr/bin/env nextflow
params.input = '/file/location/here/file.csv'
process unique {
input:
path input from params.input
output:
path 'full_template.csv' into template
"""
awk 'BEGIN{FS=OFS=","} {print \$0, (NR>1 ? \$1"_"\$2 : "combined_header")}' $input > full_template.csv
"""
}
Here is the error:
N E X T F L O W ~ version 21.10.0
Launching `file.nf` [awesome_pike] - revision: 1b63d4b438
class groovyx.gpars.dataflow.expression.DataflowInvocationExpression cannot be cast to class java.nio.file.FileSystem (groovyx.gpars.dataflow.expression.Dclass groovyx.gpars.dataflow.expression.DataflowInvocationExpression cannot be cast to class java.nio.file.FileSystem (groovyx.gpars.dataflow.expression.DataflowInvocationExpression is in unnamed module of loader 'app'; java.nio.file.FileSystem is in module java.base of loader 'bootstrap')
I'm not sure what is causing this, and any help would be appreciated.
Thanks!
Edit:
Yes it seems this was not the source of the error (sorry!). I'm trying to use splitCsv on the resulting csv and this appears to be what's causing the error. Like so:
Channel
.fromPath(template)
.splitCsv(header:true, sep:',')
.map{ row -> tuple(row.column1, file(row.column2), file(row.column3)) }
.set { split }
I expect my issue is it's not acceptable to use .fromPath on a channel, but I can't figure out how else to do it.
Edit 2:
So this was a stupid mistake. I simply needed to add the .splitCsv option directly after the input line where I invoked the channel. Hardly elegant, but appears to be working great now.
process blah {
input:
what_you_want from template.splitCsv(header:true, sep:',').map{ row -> tuple(row.column1, file(row.column2), file(row.column3)) }
I was unable to reproduce the error you're seeing with your example code and Nextflow version. In fact, I get the expected output. This shouldn't be much of a surprise though, because you have correctly escaped the special dollar variables in your AWK command. The cause of the error is likely somewhere else in your code.
If escaping the special characters gets tedious, another way is to use a shell block instead:
It is an alternative to the Script definition with an important
difference, it uses the exclamation mark ! character as the variable
placeholder for Nextflow variables in place of the usual dollar
character.
The example becomes:
params.input_csv = '/file/location/here/file.csv'
input_csv = file( params.input_csv)
process unique {
input:
path input_csv
output:
path 'full_template.csv' into template
shell:
'''
awk 'BEGIN { FS=OFS="," } { print $0, (NR>1 ? $1 "_" $2 : "combined_header") }' \\
"!{input_csv}" > "full_template.csv"
'''
}
template.view { it.text }
Results:
$ nextflow run file.nf
N E X T F L O W ~ version 20.10.0
Launching `file.nf` [wise_hamilton] - revision: b71ff1eb03
executor > local (1)
[76/ddbb87] process > unique [100%] 1 of 1 ✔
column1,column2,column3,combined_header
1,2,3,1_2
a,b,c,a_b
I have a separated parser and lexer grammar and want to run org.antlr.v4.gui.TestRig to debug/test my grammar.
My lexer grammar start with:
lexer grammar TestLexer;
IDS: [a-zA-Z];
WS: [ \t];
NL: [\r?\n];
...
and my parser grammar start with:
parser grammar TestParser;
options { tokenVocab=TestLexer; }
testRule: WS* IDS+ NL;
...
My classpath env variable points to complete antlr.jar and current directory.
antlr is an alias to java org.antlr.v4.Tool
grun is an alias to java org.antlr.v4.gui.TestRig.
When I run antlr TestParser.g4 && javac *.java the parser code gets generated and compiled.
When I run grun TestParser testRule -gui I get the error:
Exception in thread "main" java.lang.ClassCastException: class TestParser
at java.lang.Class.asSubclass(Class.java:3404)
at org.antlr.v4.gui.TestRig.process(TestRig.java:135)
at org.antlr.v4.gui.TestRig.main(TestRig.java:119)
And when I run grun Test testRule -gui I get the error:
Can't load Test as lexer or parser
I don't have any problems when using a combined grammar.
What's missing in order to run TestRig?
When using separated lexer and parser you have to generate the code for the lexer and parser. This is not done automatically by generating the code for the parser alone.
Execute:
antlr TestLexer.g4
antlr TestParser.g4
javac *.java
After generating the code for both (lexer and parser) you have to run:
grun Test -gui testInput.txt
where testInput.txt contains some test input to parse.
Note: When using separated lexer and parser it's expected that the lexer ends on Lexer and the parser ends on Parser. The common part of the files is the name of grammar.
I.e TestLexer and TestParser -> Test is the name of the grammar.
I have some issues with getting the java version out as a string.
In a batch script I have done it like this:
for /f tokens^=2-5^ delims^=.-_^" %%j in ('%EXTRACTPATH%\Java\jdk_extract\bin\java -fullversion 2^>^&1') do set "JAVAVER=%%j.%%k.%%l_%%m"
The output is: 1.8.0_121
Now I want to do this for PowerShell, but my output is: 1.8.0_12, I miss one "1" in the end Now I have tried it with trim and split but nothing gives me the right output can someone help me out?
This is what I've got so var with PowerShell
$javaVersion = (& $extractPath\Java\jdk_extract\bin\java.exe -fullversion 2>&1)
$javaVersion = "$javaVersion".Trim("java full version """).TrimEnd("-b13")
The full output is: java full version "1.8.0_121-b13"
TrimEnd() works a little different, than you might expect:
'1.8.0_191-b12'.TrimEnd('-b12')
results in: 1.8.0_19 and so does:
'1.8.0_191-b12'.TrimEnd('1-b2')
The reason is, that TrimEnd() removes a trailing set of characters, not a substring. So .TrimEnd('-b12') means: remove all occurrences of any character of the set '-b12' from the end of the string. And that includes the last '1' before the '-'.
A better solution in your case would be -replace:
'java full version "1.8.0_191-b12"' -replace 'java full version "(.+)-b\d+"','$1'
Use a regular expression for matching and extracting the version number:
$javaVersion = if (& java -fullversion 2>&1) -match '\d+\.\d+\.\d+_\d+') {
$matches[0]
}
or
$javaVersion = (& java -fullversion 2>&1 | Select-String '\d+\.\d+\.\d+_\d+').Matches[0].Groups[0].Value
I have below shellscript
MYMAP12=$(java -jar hello-0.0.1-SNAPSHOT.jar)
echo "==="
echo ${MYMAP12}
The output of java -jar hello-0.0.1-SNAPSHOT.jar will be map {one=one, two=two, three=three}
how to get each element from the key in shell script
I tried echo ${MYMAP12{one}} but it gave me an error
As #chepner implied, the Java code is just outputting a text string which has to be parsed and manipulated in bash to make it useful. There are no doubt several ways to do this, here is one which uses pure bash (i.e. no external programs):
# This is the text string supplied by Java
MYMAP12='{one=one, two=two, three=three}'
# Create an associative array called 'map'
declare -A map
# Remove first and last characters ( { and } )
MYMAP12=${MYMAP12#?}
MYMAP12=${MYMAP12%?}
# Remove ,
MYMAP12=${MYMAP12//,/}
# The list is now delimited by spaces, the default in a shell
for item in $MYMAP12
do
# This splits around '='
IFS='=' read key val <<< $item
map[$key]=$val
done
echo "keys: ${!map[#]}"
echo "values: ${map[#]}"
Gives:
keys: two three one
values: two three one
EDIT:
You should to use the correct tool for the job, if you need an associative array (map, hash table, dictionary) then you need a language with that feature. These include bash, ksh, awk, perl, ruby, python and C++.
You can extract the keys and values using a POSIX shell (sh) but you cannot store them in an associative array since sh does not have that feature. The best you can do is a generic list, which is just a text string of whitespace separated values. What you can do is to write a lookup function which emulates it:
get_value() {
map="$1"
key="$2"
for pair in $MYMAP12
do
if [ "$key" = "${pair%=*}" ]
then
value="${pair#*=}"
# Remove last character ( , or } )
value=${value%?}
echo "$value"
return 0
fi
done
return 1
}
MYMAP12='{kone=one, ktwo=two, kthree=three}'
# Remove first character ( { )
MYMAP12=${MYMAP12#?}
val=$(get_value "$MYMAP12" "ktwo")
echo "value for 'ktwo' is $val"
Gives:
value for 'ktwo' is two
Using this function you can also test for the presence of a key, for example:
if get_value "$MYMAP12" "kfour"
then
echo "key kfour exists"
else
echo "key kfour does not exist"
fi
Gives:
key kfour does not exist
Note that this is inefficient compared to an associative array since we are sequentially searching a list, although with a short list of only three keys you won't see any difference.
if you change your output format to the right hand side
$ x="( [one]=foo [two]=bar [three]=baz )"
then, you can use bash associative arrays
$ declare -A map="$x"
$ echo "${map[one]}"
foo
In a web scanner application, i need to parse some script's output to get some informations, but the problem is that i don't get the same output in linux shell and in java output, let me describe it (this example is done with whatweb on one of the websites i need to scan at work, but i also have this problem whenever i have a colored output in shell):
Here is what i get from linux's output (with some colors):
http://www.ceris-ingenierie.com [200] Apache[2.2.9], Cookies[ca67a6ac78ebedd257fb0b4d64ce9388,jfcookie,jfcookie%5Blang%5D,lang], Country[EUROPEAN UNION][EU], HTTPServer[Fedora Linux][Apache/2.2.9 (Fedora)], IP[185.13.64.116], Joomla[1.5], Meta-Author[Administrator], MetaGenerator[Joomla! 1.5 - Open Source Content Management], PHP[5.2.6,], Plesk[Lin], Script[text/javascript], Title[Accueil ], X-Powered-By[PHP/5.2.6, PleskLin]
And here is what i get from Java:
[1m[34mhttp://www.ceris-ingenierie.com[0m [200] [1m[37mApache[0m[[1m[32m2.2.9[0m], [1m[37mCookies[0m[[1m[33mca67a6ac78ebedd257fb0b4d64ce9388,jfcookie,jfcookie%5Blang%5D,lang[0m], [1m[37mCountry[0m[[1m[33mEUROPEAN UNION[0m][[1m[35mEU[0m], [1m[37mHTTPServer[0m[[1m[31mFedora Linux[0m][[1m[36mApache/2.2.9 (Fedora)[0m], [1m[37mIP[0m[[1m[33m185.13.64.116[0m], [1m[37mJoomla[0m[[1m[32m1.5[0m], [1m[37mMeta-Author[0m[[1m[33mAdministrator[0m], [1m[37mMetaGenerator[0m[[1m[33mJoomla! 1.5 - Open Source Content Management[0m], [1m[37mPHP[0m[[1m[32m5.2.6,[0m], [1m[37mPlesk[0m[[1m[33mLin[0m], [1m[37mScript[0m[[1m[33mtext/javascript[0m], [1m[37mTitle[0m[[32mAccueil [0m], [1m[37mX-Powered-By[0m[[1m[33mPHP/5.2.6, PleskLin[0m]
My guess is that colors in linux's shell are generated by those unknown characters, but they are really a pain for parsing in java.
I get this output by running the script in a new thread, and doing raw_data+=data;(where raw_data is a String) whenever i have a new line in my output, to finally send raw_data to my parser.
How can i do to avoid getting those annoying chars and so, to get a more friendly output like i get in linux's shell?
In your Java code, where you are executing the shell script, you can add an extra sed filter to filter out the shell-control characters.
# filter out shell control characters
./my_script | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
Use tr -dc '[[:print:]]' to remove non-printable characters, like this:
# filter out shell control characters
./my_script | \
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g" | \
tr -dc '[[:print:]]'
You could even add a wrapper script around the original script to do this. And call the wrapper script. This allows you to do any other pre-processing, before feeding it into the Java program and keeps it clean of all unnecessary code and you can focus on the core logic of the application.
If you can't add a wrapper script for any reason and would like to add the filter in Java, Java doesn't support pipes in the command, directly. You'll have to call your command as an argument to bash it like this:
String[] cmd = {
"/bin/sh",
"-c",
"./my_script | sed -r 's/\\x1B\\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g'"
};
Process p = Runtime.getRuntime().exec(cmd);
Don't forget to escape all the '\' when you use the regex in Java.
Source and description for the sed filter: http://www.commandlinefu.com/commands/view/3584/remove-color-codes-special-characters-with-sed
You can use a regex here:
String raw_data= ...;
String cleaned_raw_data = raw_data.replaceAll("\\[\\d+m", "");
This will remove any sequence of characters starting with a \\[, ending with a m and having between them one or more digit (\\d+).
Note that [ is preceded by a \\ because [ has a special meaning for regular expressions (it's a meta-character).
Description