How to extract one integer value from multiple integers present

I have the following string, which is the output of the df command:
$ df | grep data
/dev/block/dm-1 11066964 2103848 8946732 20% /data
I want to extract the value
8946732
using a Java regex. I have tried
(.*?\s{3}\d+.\d+)
but it does not work.

If the value you want is always at a fixed index, you can split on whitespace:
>>> s = "/dev/block/dm-1 11066964 2103848 8946732 20% /data"
>>> s.split()[3]
'8946732'
Since a regex was requested, here is one:
>>> import re
>>> re.search(r'(?is)(.*?\s+\d+\s+\d+\s+)(\d+)',s).group(2)
'8946732'

The regex below matches a run of non-space characters, skips the next two groups of digits, then captures the group of digits that follows:
import re
re.match(r'\S+\s+(?:\d+\s+){2}(\d+)', "/dev/block/dm-1 11066964 2103848 8946732 20% /data").group(1)
# '8946732'
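As a minimal sketch of the same idea, the pattern can be anchored and wrapped in a function so that a line in an unexpected format raises an error instead of silently returning the wrong column. The function name `available_blocks` and the group name `avail` are illustrative choices, not something from the answers above.

```python
import re

# Device name, then the total and used columns (skipped), then the available column.
DF_LINE = re.compile(r'^\S+\s+(?:\d+\s+){2}(?P<avail>\d+)\s')


def available_blocks(line: str) -> int:
    """Return the 'Available' column of one df output line as an integer."""
    match = DF_LINE.match(line)
    if match is None:
        raise ValueError(f"unexpected df line: {line!r}")
    return int(match.group("avail"))


print(available_blocks("/dev/block/dm-1 11066964 2103848 8946732 20% /data"))
```

The unnamed form of the pattern, \S+\s+(?:\d+\s+){2}(\d+), should also work with java.util.regex, provided each backslash is doubled inside the Java string literal.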

If your df is the GNU coreutils version, you can ask it to print a specific column with the --output option:
df --output=avail /dev/block/dm-1 | tail -n1
The tail command just strips the Avail header line.

Related

Why can't Nextflow handle this awk phrase?

Background:
Using a csv as input, I want to combine the first two columns into a new one (separated by an underscore) and add that new column to the end of a new csv.
Input:
column1,column2,column3
1,2,3
a,b,c
Desired output:
column1,column2,column3,column1_column2
1,2,3,1_2
a,b,c,a_b
The awk command below works from the command line (the dollar signs are not escaped there):
awk 'BEGIN{FS=OFS=","} {print $0, (NR>1 ? $1"_"$2 : "column1_column2")}' file.csv > full_template.csv
However, when placed within a nextflow script (below) it gives an error.
#!/usr/bin/env nextflow
params.input = '/file/location/here/file.csv'
process unique {
input:
path input from params.input
output:
path 'full_template.csv' into template
"""
awk 'BEGIN{FS=OFS=","} {print \$0, (NR>1 ? \$1"_"\$2 : "combined_header")}' $input > full_template.csv
"""
}
Here is the error:
N E X T F L O W ~ version 21.10.0
Launching `file.nf` [awesome_pike] - revision: 1b63d4b438
class groovyx.gpars.dataflow.expression.DataflowInvocationExpression cannot be cast to class java.nio.file.FileSystem (groovyx.gpars.dataflow.expression.DataflowInvocationExpression is in unnamed module of loader 'app'; java.nio.file.FileSystem is in module java.base of loader 'bootstrap')
I'm not sure what is causing this, and any help would be appreciated.
Thanks!
Edit:
Yes it seems this was not the source of the error (sorry!). I'm trying to use splitCsv on the resulting csv and this appears to be what's causing the error. Like so:
Channel
.fromPath(template)
.splitCsv(header:true, sep:',')
.map{ row -> tuple(row.column1, file(row.column2), file(row.column3)) }
.set { split }
I suspect the problem is that .fromPath cannot be applied to a channel, but I can't figure out how else to do it.
Edit 2:
So this was a stupid mistake. I simply needed to add the .splitCsv option directly after the input line where I invoked the channel. Hardly elegant, but appears to be working great now.
process blah {
input:
what_you_want from template.splitCsv(header:true, sep:',').map{ row -> tuple(row.column1, file(row.column2), file(row.column3)) }

I was unable to reproduce the error you're seeing with your example code and Nextflow version. In fact, I get the expected output. This shouldn't be much of a surprise though, because you have correctly escaped the special dollar variables in your AWK command. The cause of the error is likely somewhere else in your code.
If escaping the special characters gets tedious, another way is to use a shell block instead:
It is an alternative to the Script definition with an important
difference, it uses the exclamation mark ! character as the variable
placeholder for Nextflow variables in place of the usual dollar
character.
The example becomes:
params.input_csv = '/file/location/here/file.csv'
input_csv = file( params.input_csv)
process unique {
input:
path input_csv
output:
path 'full_template.csv' into template
shell:
'''
awk 'BEGIN { FS=OFS="," } { print $0, (NR>1 ? $1 "_" $2 : "combined_header") }' \\
"!{input_csv}" > "full_template.csv"
'''
}
template.view { it.text }
Results:
$ nextflow run file.nf
N E X T F L O W ~ version 20.10.0
Launching `file.nf` [wise_hamilton] - revision: b71ff1eb03
executor > local (1)
[76/ddbb87] process > unique [100%] 1 of 1 ✔
column1,column2,column3,combined_header
1,2,3,1_2
a,b,c,a_b
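For comparison, the transformation that the awk one-liner performs could also be sketched in Python, which avoids dollar-sign escaping altogether. This is only an illustration under assumptions: the function name `add_combined_column` and the in-memory sample data are made up here, and every row is assumed to have at least two columns.

```python
import csv
import io


def add_combined_column(src, dst, header="column1_column2"):
    """Copy CSV rows from src to dst, appending '<col1>_<col2>' to each row."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for line_number, row in enumerate(reader, start=1):
        # The first row is the header; every other row gets the joined value.
        extra = header if line_number == 1 else f"{row[0]}_{row[1]}"
        writer.writerow(row + [extra])


source = io.StringIO("column1,column2,column3\n1,2,3\na,b,c\n")
result = io.StringIO()
add_combined_column(source, result)
print(result.getvalue(), end="")
```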

Get specific java version with powershell

I have some issues with getting the java version out as a string.
In a batch script I have done it like this:
for /f tokens^=2-5^ delims^=.-_^" %%j in ('%EXTRACTPATH%\Java\jdk_extract\bin\java -fullversion 2^>^&1') do set "JAVAVER=%%j.%%k.%%l_%%m"
The output is: 1.8.0_121
Now I want to do the same in PowerShell, but my output is 1.8.0_12; the final "1" is missing. I have tried Trim and Split, but nothing gives me the right output. Can someone help me out?
This is what I've got so far in PowerShell:
$javaVersion = (& $extractPath\Java\jdk_extract\bin\java.exe -fullversion 2>&1)
$javaVersion = "$javaVersion".Trim("java full version """).TrimEnd("-b13")
The full output is: java full version "1.8.0_121-b13"

TrimEnd() works a little differently than you might expect:
'1.8.0_191-b12'.TrimEnd('-b12')
results in: 1.8.0_19 and so does:
'1.8.0_191-b12'.TrimEnd('1-b2')
The reason is that TrimEnd() removes a trailing set of characters, not a substring. So .TrimEnd('-b12') means: remove every trailing character that belongs to the set '-', 'b', '1', '2'. That includes the last '1' before the '-'.
A better solution in your case would be -replace:
'java full version "1.8.0_191-b12"' -replace 'java full version "(.+)-b\d+"','$1'
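Python's str.rstrip has the same character-set semantics, so the pitfall can be reproduced outside PowerShell. The sketch below is an analogy rather than PowerShell code: it first shows the over-eager strip, then a regex replacement equivalent in spirit to the -replace call above.

```python
import re

full = 'java full version "1.8.0_191-b12"'

# rstrip, like TrimEnd, treats its argument as a set of characters,
# so the trailing '1' of '191' is removed along with '-b12'.
stripped = "1.8.0_191-b12".rstrip("-b12")
print(stripped)  # 1.8.0_19

# A regex removes exactly the surrounding text and the "-b<digits>" suffix.
version = re.sub(r'^java full version "(.+)-b\d+"$', r"\1", full)
print(version)  # 1.8.0_191
```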

Use a regular expression for matching and extracting the version number:
$javaVersion = if ((& java -fullversion 2>&1) -match '\d+\.\d+\.\d+_\d+') {
$matches[0]
}
or
$javaVersion = (& java -fullversion 2>&1 | Select-String '\d+\.\d+\.\d+_\d+').Matches[0].Groups[0].Value

JMeter - Groovy script concatenation of variables

Groovy is the preferred scripting language in JMeter:
We advise using Apache Groovy or any language that supports the Compilable interface of JSR223.
The following code in a JSR223 Sampler works in Java but not in Groovy:
String a= "0"+"1" +
"2"
+"3";
log.info(a);
I found the reason the + operator does not work as expected,
but what is the solution if I want to concatenate several variables in a script?
I could not get the suggested answer of using triple quotes to work: """The row Id is: ${row.id}..."""
Currently I use Java as the script language and use JMeter ${variable} substitution, although that is also not recommended:
In this case, ensure the script does not use any variable using ${varName} as caching would take only first value of ${varName}
String text ="...<id>${id}</id><id2>${id2}</id2>...";
What's a better approach in groovy in such case?
EDIT:
I tried using << instead, but got a different error when the expression is split across lines:
String text ="<id>" <<vars["id1"] << "<id><id2>"
<< vars["id2"] << "<id2>";
Receives an error:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Script12.groovy: 2: unexpected token: << @ line 2, column 1.
<< vars["id2"] << "<id2>";

Groovy uses the new line character to indicate end of statement except in cases where it knows the next line must extend the current one. Numerous binary operators on the start of the next line are supported. The '+' and '-' operators have binary and unary variants and currently (Groovy versions at least up to 2.5.x) don't support those operators at the start of the next line. You can place the operator on the end of the previous line (as in your first line) or use the line continuation character at the end of the previous line:
String a = "0" + "1" +
"2" \
+ "3"
log.info(a)

Why don't you use :
String text ="<id>" <<vars["id1"] << "<id><id2>" << vars["id2"] << "<id2>";
It works for me

If I had a hashmap to concat like yours, I would try:
def vars = ["id": "value", "id2": "value2", "id3": "value3"]
String text = ""
vars.each { k, v ->
text += "<${k}>${v}</${k}>"
}
println text

Getting no such file error when trying to run Maven wrapper? [duplicate]

I am trying to format a variable in linux
str="Initial Value = 168"
echo "New Value=$(echo $str| cut -d '=' -f2);">>test.txt
I am expecting the following output
Value = 168;
But instead get
Value = 168 ^M;

Don't edit your bash script on DOS or Windows. You can run dos2unix on the bash script. The issue is that Windows uses "\r\n" as a line separator, Linux uses "\n". You can also manually remove the "\r" characters in an editor on Linux.
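If dos2unix is not at hand, the same conversion is easy to sketch in a few lines. The helper name `strip_carriage_returns` and the sample script text are assumptions made for illustration.

```python
def strip_carriage_returns(text: str) -> str:
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return text.replace("\r\n", "\n")


# A two-line script as it would look after being saved with Windows line endings.
script = 'str="Initial Value = 168"\r\necho "$str"\r\n'
fixed = strip_carriage_returns(script)
print(repr(fixed))
```

When applying this to a real file, it is safest to read and write it in binary mode (or with newline="" in text mode) so that Python does not translate the line endings itself.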

str="Initial Value = 168"
newstr="${str##* }"
echo "$newstr" # 168
pattern matching is the way to go.

Try this:
#! /bin/bash
str="Initial Value = 168"
awk '{print $2"="$4}' <<< $str > test.txt
Output:
cat test.txt
Value=168
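I got a comment saying that this doesn't address the ^M. It actually does, at least when the carriage return is separated from the number by whitespace, as in this test: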
echo -e 'Initial Value = 168 \r' | cat -A
Initial Value = 168 ^M$
After awk:
echo -e 'Initial Value = 168 \r' | awk '{print $2"="$4}' | cat -A
Value=168$

First off, always quote your variables.
#!/bin/bash
str="Initial Value = 168"
echo "New Value=$(echo "$str" | cut -d '=' -f2);"
For me, this results in the output:
New Value= 168;
If you're getting a carriage return between the digits and the semicolon, then something may be wrong with your echo, or perhaps your input data is not what you think it is. Perhaps you're editing your script on a Windows machine and copying it back, and your variable assignment is getting DOS-style newlines. From the information you've provided in your question, I can't tell.
At any rate I wouldn't do things this way. I'd use printf.
#!/bin/bash
str="Initial Value = 168"
value=${str##*=}
printf "New Value=%d;\n" "$value"
The output of printf is predictable, and it handily strips off gunk like whitespace when you don't want it.
Note the replacement of your cut. The functionality of bash built-ins is documented in the Bash man page under "Parameter Expansion", if you want to look it up. The replacement I've included here is not precisely the same functionality as what you've got in your question, but is functionally equivalent for the sample data you've provided.

How to validate number.number.number;number this format in Reg ex

My string should be in the format number.number.number;number, e.g. 15.2.63;4.
How can I validate this format with a regex? I have done it the ordinary way using contains, split, etc., but that took many lines of code. How can I do it with a regex?

You can go with this:
^\d+[.]\d+[.]\d+;\d+$

Many ways to do it, here using PCRE:
laptop:~$ echo "12.34.56;7" | perl -ne 'print $_ if (/^\d+\.\d+\.\d+;\d+$/);'
12.34.56;7
laptop:~$ echo "12a.34b.56c;7" | perl -ne 'print $_ if (/\d+\.\d+\.\d+;\d+/);'
laptop:~$ echo "12.34.56;7" | perl -ne 'print $_ if (/^(\d+\.){2}\d+;\d+$/);'
12.34.56;7
If you know the exact length of each part, you can also fix the number of digits.
For example ^\d{2}\. will match 11. but won't match 123.
The answer above puts the dot in a character class ([.]); for a single character, escaping it (\.) does the same job.
But if your delimiter may vary, you can use, for example, [.;-] to allow . ; and - as delimiters.
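The same validation can be sketched in Python. The names `VERSION_FORMAT` and `is_valid` are illustrative; `re.fullmatch` is used so that the whole string must match without writing ^ and $ explicitly.

```python
import re

# Three dot-separated integers, a semicolon, then one more integer.
VERSION_FORMAT = re.compile(r'\d+\.\d+\.\d+;\d+')


def is_valid(value: str) -> bool:
    """Return True only if the whole string is number.number.number;number."""
    return VERSION_FORMAT.fullmatch(value) is not None


for candidate in ("15.2.63;4", "12a.34b.56c;7", "15.2.63", "15.2.63;4 "):
    print(repr(candidate), is_valid(candidate))
```

Only the first candidate is accepted; the others fail because of the letters, the missing ;number part, and the trailing space respectively.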

Try this..
^[1-9][\.\d]*[1-9][\.\d]*[1-9][\.\d]*[1-9][\;\d]?$
Hope this helps...
