I am trying to extract all PIDs from the output of pstree -pA <PID> on Linux.
I am working in Java and thought about doing it with a regular expression.
An example of the output is below:
eclipse(45905)---java(45906)-+-{java}(45907)
|-{java}(45908)
|-{java}(45909)
|-{java}(45910)
|-{java}(45911)
I have written the following code:
private static Pattern PATTERN = Pattern.compile("\\d+");
static List<String> getPidsFromOutput(String output) {
    List<String> $ = Lists.newArrayList();
    List<String> list = Splitter.on(CharMatcher.anyOf("()\n")).splitToList(output);
    for (String string : list) {
        Matcher matcher = PATTERN.matcher(string);
        if (matcher.matches()) {
            $.add(string);
        }
    }
    return $;
}
The problem is with processes whose name (i.e. the executable file) is itself a number: the code picks those up as well, which is a bug.
Do you have any suggestions for fixing this, or any other solution?
You should make sure the PID is surrounded by parentheses.
In addition, your code catches threads as well; to avoid them, ignore any entry that has {} around its name.
private static Pattern PATTERN = Pattern.compile(".*[^}]\\((\\d+)\\).*");
private Integer pid;
static Set<String> getPidsFromOutput(String output) {
    Set<String> $ = Sets.newHashSet();
    List<String> list = Splitter.on(CharMatcher.anyOf("\n")).splitToList(output);
    for (String line : list) {
        List<String> perProcess = Splitter.on(CharMatcher.anyOf("-")).splitToList(line);
        for (String p : perProcess) {
            Matcher matcher = PATTERN.matcher(p);
            if (matcher.matches()) {
                $.add(matcher.group(1));
            }
        }
    }
    log.info("pids from pstree: " + $);
    return $;
}
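Look for numbers that are surrounded by parentheses:
\((\d+)\)
Since the PID always follows the name in parentheses, group 1 holds only the PID, even when the process name is itself a number. Note that this also matches thread entries, whose names are in curly braces.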
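As a rough sketch of combining both suggestions using only java.util.regex: the optional first group swallows a {name} that sits directly before the parenthesized number, so a match with that group set is a thread and can be skipped. The names PstreePids and processPids are made up for illustration, and the sketch assumes thread entries always look like {name}(id), as in the sample output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PstreePids {
    // Group 1: optional "{name}" right before the parentheses (marks a thread).
    // Group 2: the digits inside the parentheses (the PID or thread ID).
    private static final Pattern ENTRY = Pattern.compile("(\\{[^}]*\\})?\\((\\d+)\\)");

    // Hypothetical helper: returns the PIDs of processes only, in order of appearance.
    public static List<String> processPids(String pstreeOutput) {
        List<String> pids = new ArrayList<>();
        Matcher m = ENTRY.matcher(pstreeOutput);
        while (m.find()) {
            if (m.group(1) == null) { // no "{name}" prefix, so not a thread
                pids.add(m.group(2));
            }
        }
        return pids;
    }

    public static void main(String[] args) {
        String sample = "eclipse(45905)---java(45906)-+-{java}(45907)\n"
                + "|-{java}(45908)\n";
        System.out.println(processPids(sample)); // [45905, 45906]
    }
}
```

A process whose name is a plain number, such as 123(456), contributes only the 456 here, because the digits must sit inside the parentheses.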
I'm currently working with regex and have been struggling a lot with it lately.
I want to build a script engine, and for that I need to load some presets.
Example:
create <Type> [after;before;at;between(2);<Integer>, <DateTime>, <Date>, <Time>, <String>] : Creator
edit <Type> [after;before;at;between(2);<Integer>, <DateTime>, <Date>, <Time>, <String>]
run [<File>, <Command>]
So I want to make sure I can read both <Type> [after;before;at;between(2);<Integer>, <DateTime>, <Date>, <Time>, <String>] and [<File>, <Command>].
To explain the format:
NAME <IMPORTANT_PARAMETER> [TEXT_PARAMETER(AMOUNT_OF_OPTIONAL_PARAMETER);<OPTIONAL_PARAMETER(S)>].
In this example I used 'command names' as IMPORTANT_PARAMETER.
For the first rule I made this regex: \<(\w+)\>(?:\s+\[(?:(.*;))(.*)\])?(?:\s+\:\s+(\w+))? and it more or less works within my code:
Pattern p = Pattern.compile("\\<(\\w+)\\>(?:\\s+\\[(?:(.*;))(.*)\\])?(?:\\s+\\:\\s+(\\w+))?");
Matcher m = p.matcher(parameters);
if(m.matches()){
    Command command2 = new Command(command);
    command2.addParameter(new Parameter(m.group(1)));
    String text = m.group(2);
    String[] texts = null;
    if(text != null){
        texts = text.split(";");
        command2.addTexts(Arrays.asList(texts));
    }
    String type = m.group(3);
    String[] types = null;
    if(type != null){
        types = type.split(", ");
        for (String string : types) {
            Pattern pTypes = Pattern.compile("\\<(?:(\\w+))\\>");
            Matcher mTypes = pTypes.matcher(string);
            if(mTypes.matches()){
                command2.addParameter(new Parameter(mTypes.group(1), true));
            }
        }
    }
    String className = m.group(4);
    if(className != null){
        command2.addClassName(className);
    }
    commandslist.add(command2);
}
I tried to use \[\<(\w+)\>(?:,\s+\<(\w+)\>)+\], but it only worked for two entries, e.g. run [<File>, <Command>]. It would be better to get a "list" of the optional elements in [<File>, <Command>], so that in the end I have m.group(1) = File; m.group(2) = Command; m.group(3) = blablabla; and so on.
I hope I have explained my problem well enough; ask me if anything needs more explanation.
Here is a link to the regexr: REGEXR or regex101: REGEX101
Thanks for helping :)
My suggestion is to match the stuff between the words you are after:
public static void main (String[] args) {
    final String STR1 = "run [<File>, <Command1>, <Command2>, <Command3>]";
    final String STR2 = "run [<File>, <Command1>, <Command2>, <Command3>, <Command4>]";
    System.out.println(parse(STR1));
    System.out.println(parse(STR2));
}
private static List<String> parse(String str) {
    List<String> list = new ArrayList<>();
    // \G continues from the end of the previous match; the first match skips "run [<File>, "
    Pattern p = Pattern.compile("(?:\\G,\\s+|^run\\s+\\[(?:<\\w+>,\\s+)+?)<(\\w+)>");
    Matcher m = p.matcher(str);
    while (m.find()) {
        list.add(m.group(1));
    }
    return list;
}
which results in the output:
[Command1, Command2, Command3]
[Command1, Command2, Command3, Command4]
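A capturing group that is repeated with + or * only keeps its last match, so the number of groups cannot grow with the number of entries. One possible alternative, sketched below, first isolates the bracketed part and then loops over every <Name> inside it with find(). The class and method names (BracketItems, items) are hypothetical, and unlike the answer above this version also returns the first entry (File).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketItems {
    // Everything between the first '[' and the next ']'.
    private static final Pattern BRACKETS = Pattern.compile("\\[([^\\]]*)\\]");
    // A single <Name> entry; group 1 is the name without the angle brackets.
    private static final Pattern ITEM = Pattern.compile("<(\\w+)>");

    // Hypothetical helper: lists the <Name> entries found inside the brackets.
    public static List<String> items(String rule) {
        List<String> result = new ArrayList<>();
        Matcher brackets = BRACKETS.matcher(rule);
        if (brackets.find()) {
            Matcher item = ITEM.matcher(brackets.group(1));
            while (item.find()) {
                result.add(item.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(items("run [<File>, <Command>]")); // [File, Command]
    }
}
```

Because the <Type> parameter sits outside the brackets, it is not part of the returned list and would still be read with the original pattern.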
From a String value I want to get the words before and after each <in>.
String ref = "application<in>rid and test<in>efd";
int result = ref.indexOf("<in>");
int result1 = ref.lastIndexOf("<in>");
String firstWord = ref.substring(0, result);
String[] wor = ref.split("<in>");
for (int i = 0; i < wor.length; i++) {
System.out.println(wor[i]);
}
}
My expected output:
String[] output = {application, rid, test, efd}
I tried two options. The first was indexOf, but if the String has more than two <in> I don't get my expected output.
The second was split, which also doesn't give my expected output.
Please suggest the best way to get the words before and after <in>.
You could use an expression like this: \b([^ ]+?)<in>([^ ]+?)\b. It matches the text before and after the <in> tag and places the two parts in separate groups.
Thus, given this:
String ref = "application<in>rid and test<in>efd";
Pattern p = Pattern.compile("\\b([^ ]+?)<in>([^ ]+?)\\b");
Matcher m = p.matcher(ref);
while(m.find())
    System.out.println("Prior: " + m.group(1) + " After: " + m.group(2));
Yields:
Prior: application After: rid
Prior: test After: efd
Alternatively using split:
String[] phrases = ref.split("\\s+");
for(String s : phrases)
    if(s.contains("<in>"))
    {
        String[] split = s.split("<in>");
        for(String t : split)
            System.out.println(t);
    }
Yields:
application
rid
test
efd
Regex is your friend :)
public static void main(String args[]) throws Exception {
    String ref = "application<in>rid and test<in>efd";
    Pattern p = Pattern.compile("\\w+(?=<in>)|(?<=<in>)\\w+");
    Matcher m = p.matcher(ref);
    while (m.find()) {
        System.out.println(m.group());
    }
}
O/P :
application
rid
test
efd
No doubt matching what you need with the Pattern/Matcher API is simpler for this problem.
However, if you're looking for a short and quick String#split solution, you can consider:
String ref = "application<in>rid and test<in>efd";
String[] toks = ref.split("<in>|\\s+.*?(?=\\b\\w+<in>)");
Output:
application
rid
test
efd
This regex splits on <in>, or on whitespace followed by zero or more characters (matched lazily) up to the next word that is immediately followed by <in>.
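To see the two alternatives at work, here is a small sketch that wraps the same split in a method and also shows one limitation. SplitOnIn and tokens are illustrative names only, not part of any library.

```java
import java.util.Arrays;

class SplitOnIn {
    // Alternative 1: the literal <in> tag.
    // Alternative 2: whitespace plus any filler, matched lazily, up to the next word followed by <in>.
    private static final String DELIMITER = "<in>|\\s+.*?(?=\\b\\w+<in>)";

    // Hypothetical helper around String#split with the delimiter above.
    public static String[] tokens(String text) {
        return text.split(DELIMITER);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("application<in>rid and test<in>efd")));
        // [application, rid, test, efd]
        System.out.println(Arrays.toString(tokens("a<in>b tail")));
        // [a, b tail]  (no later word is followed by <in>, so alternative 2 never matches)
    }
}
```

The second call shows that filler text after the last pair stays attached to the final token, so this approach fits best when the input ends with a pair.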
You can also try the code below; it is quite simple, although it only handles inputs where the pairs are joined by " and ":
class StringReplace1
{
    public static void main(String args[])
    {
        String ref = "application<in>rid and test<in>efd";
        System.out.println((ref.replaceAll("<in>", " ")).replaceAll(" and "," "));
    }
}
I have a text file and want to tokenize its lines -- but only the sentences with the # character.
For example, given...
Buah... Molt bon concert!! #Postconcert #gintonic
...I want to print only #Postconcert #gintonic.
I have already tried this code with some changes...
public class MyTokenizer {
    /**
     * @param args
     */
    public static void main(String[] args) {
        tokenize("Europe3.txt","allo.txt");
    }
    public static void tokenize(String sFile,String sFileOut) {
        String sLine="", sToken="";
        MyBufferedReaderWriter f = new MyBufferedReaderWriter();
        f.openRFile(sFile);
        MyBufferedReaderWriter fOut = new MyBufferedReaderWriter();
        fOut.openWFile(sFileOut);
        while ((sLine=f.readLine()) != null) {
            //StringTokenizer st = new StringTokenizer(sLine, "#");
            String[] tokens = sLine.split("\\#");
            for (String token : tokens)
            {
                fOut.writeLine(token);
                //System.out.println(token);
            }
            /*while (st.hasMoreTokens()) {
                sToken = st.nextToken();
                System.out.println(sToken);
            }*/
        }
        f.closeRFile();
    }
}
Can anyone help?
You can try something like this with a regex:
package com.stackoverflow.answers;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class HashExtractor {
    public static void main(String[] args) {
        String strInput = "Buah... Molt bon concert!! #Postconcert #gintonic";
        String strPattern = "(?:\\s|\\A)[##]+([A-Za-z0-9-_]+)";
        Pattern pattern = Pattern.compile(strPattern);
        Matcher matcher = pattern.matcher(strInput);
        while (matcher.find()) {
            // group() includes the whitespace matched before the '#', so trim it
            System.out.println(matcher.group().trim());
        }
    }
}
As per the given example, when using the split() function the values would be stored like this:
tokens[0]=Buah... Molt bon concert!!
tokens[1]=Postconcert
tokens[2]=gintonic
So you just need to skip the first value and prepend '#' (if you need it in your output) to the other string values.
Hope this helps.
You have not specifically asked for this, but I assume you are trying to extract all the #hashtags from your text file.
To do this, regex is your friend:
String text = "Buah... Molt bon concert!! #Postconcert #gintonic";
System.out.println(getHashTags(text));
public Collection<String> getHashTags(String text) {
    Pattern pattern = Pattern.compile("(#\\w+)");
    Matcher matcher = pattern.matcher(text);
    Set<String> htags = new HashSet<>();
    while (matcher.find()) {
        htags.add(matcher.group(1));
    }
    return htags;
}
Compile a pattern like #\w+: everything that starts with a # followed by one or more (+) word characters (\w).
Then we have to escape the \ for Java with a second one: \\.
Finally, put the expression in a group by surrounding it with parentheses, (#\w+), to get access to the matched text.
For every match, add the first matched group to the set htags; in the end we get a set with all the hashtags in it.
[#gintonic, #Postconcert]
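Note that a HashSet makes no promise about iteration order, which would explain why the output above lists #gintonic first. If the tags should come out in the order they appear in the line, a LinkedHashSet is one option. This is only a sketch, and OrderedHashTags and hashTags are made-up names.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedHashTags {
    // '#' followed by one or more word characters; no group needed, group() is the whole match.
    private static final Pattern HASHTAG = Pattern.compile("#\\w+");

    // Hypothetical helper: unique hashtags in order of first appearance.
    public static Set<String> hashTags(String text) {
        Set<String> tags = new LinkedHashSet<>();
        Matcher matcher = HASHTAG.matcher(text);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "Buah... Molt bon concert!! #Postconcert #gintonic";
        System.out.println(hashTags(text)); // [#Postconcert, #gintonic]
    }
}
```

A line without any # simply yields an empty set, so the caller can skip it when writing the output file.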
I am using regex in Java to get a specific output from a list of rooms at my university.
An excerpt from the list looks like this:
(A55:G260) Laboratorium 260
(A55:G292) Grupperom 292
(A55:G316) Grupperom 316
(A55:G366) Grupperom 366
(HDS:FLØYEN) Fløyen (appendix)
(ODO:PC-STUE) Pulpakammeret (PC-stue)
(SALEM:KONF) Konferanserom
I want to get the value that comes between the colon and the parenthesis.
The regex I am using at the moment is:
pattern = Pattern.compile("[:]([A-Za-z0-9ÆØÅæøå-]+)");
matcher = pattern.matcher(room.text());
I've included ÆØÅ because some of the rooms have Norwegian letters in them.
Unfortunately the output also includes the building code (e.g. "A55"). It comes out like this:
A55
A55
A55
:G260
:G292
:G316
Any ideas on how to solve this?
The problem is not your regular expression. You need to reference group(1) for the match result.
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
However, you may consider using a negated character class instead.
pattern = Pattern.compile(":([^)]+)");
You can try a regex like this:
public static void main(String[] args) {
    String s = "(HDS:FLØYEN) Fløyen (appendix)";
    // select everything after ":" up to the first ")" and replace the whole string with the selected data
    System.out.println(s.replaceAll(".*?:(.*?)\\).*", "$1"));
    String s1 = "(ODO:PC-STUE) Pulpakammeret (PC-stue)";
    System.out.println(s1.replaceAll(".*?:(.*?)\\).*", "$1"));
}
O/P :
FLØYEN
PC-STUE
You can also try it with String operations, as follows:
String val = "(HDS:FLØYEN) Fløyen (appendix)";
if(val.contains(":")){
    String valSub = val.split("\\s")[0];
    System.out.println(valSub);
    valSub = valSub.substring(1, valSub.length()-1);
    String valA = valSub.split(":")[0];
    String valB = valSub.split(":")[1];
    System.out.println(valA);
    System.out.println(valB);
}
Output :
(HDS:FLØYEN)
HDS
FLØYEN
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class test
{
    public static void main( String args[] ){
        // String to be scanned to find the pattern.
        String line = "(HDS:FLØYEN) Fløyen (appendix)";
        String pattern = ":([^)]+)";
        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        // Now create matcher object.
        Matcher m = r.matcher(line);
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}
I have a long template from which I need to extract certain strings based on certain patterns. Going through some examples, I found that quantifiers are useful in such situations. For example, the following is my template, from which I need to extract while and doWhile.
This is a sample document.
$while($variable)This text can be repeated many times until do while is called.$endWhile.
Some sample text follows this.
$while($variable2)This text can be repeated many times until do while is called.$endWhile.
Some sample text.
I need to extract the whole text from $while($variable) up to $endWhile. I then need to process the value of $variable. After that I need to insert the text between $while and $endWhile into the original text.
I have the logic for extracting the variable, but I'm not sure how to use quantifiers or pattern matching here.
Can someone please provide sample code for this? Any help will be greatly appreciated.
You can use a rather simple regex-based solution here with a Matcher:
Pattern pattern = Pattern.compile("\\$while\\((.*?)\\)(.*?)\\$endWhile", Pattern.DOTALL);
Matcher matcher = pattern.matcher(yourString);
while(matcher.find()){
    String variable = matcher.group(1); // this will include the $
    String value = matcher.group(2);
    // now do something with variable and value
}
If you want to replace the variables in the original text, you should use the Matcher.appendReplacement() / Matcher.appendTail() solution:
Pattern pattern = Pattern.compile("\\$while\\((.*?)\\)(.*?)\\$endWhile", Pattern.DOTALL);
Matcher matcher = pattern.matcher(yourString);
StringBuffer sb = new StringBuffer();
while(matcher.find()){
    String variable = matcher.group(1); // this will include the $
    String value = matcher.group(2);
    // now do something with variable and value
    // quoteReplacement keeps any '$' or '\' in value from being read as a group reference
    matcher.appendReplacement(sb, Matcher.quoteReplacement(value));
}
matcher.appendTail(sb);
Reference:
Methods of the Pattern Class
(Sun Java Tutorial)
Methods of the Matcher Class
(Sun Java Tutorial)
Pattern JavaDoc
Matcher JavaDoc
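For illustration only, here is how the appendReplacement() / appendTail() loop might look once "do something with variable and value" is filled in. The sketch assumes, purely as an example, that each variable maps to a repeat count held in a Map; that mapping, the class name WhileExpander and the method expand are not from the question.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhileExpander {
    private static final Pattern WHILE_BLOCK =
            Pattern.compile("\\$while\\((.*?)\\)(.*?)\\$endWhile", Pattern.DOTALL);

    // Hypothetical helper: replaces each $while(...)...$endWhile block with its body
    // repeated as often as repeatCounts says (0 times if the variable is unknown).
    public static String expand(String template, Map<String, Integer> repeatCounts) {
        Matcher matcher = WHILE_BLOCK.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String variable = matcher.group(1); // includes the leading $
            String body = matcher.group(2);
            int count = repeatCounts.getOrDefault(variable, 0);
            StringBuilder repeated = new StringBuilder();
            for (int i = 0; i < count; i++) {
                repeated.append(body);
            }
            // quoteReplacement: the body may contain '$', which would otherwise be a group reference
            matcher.appendReplacement(sb, Matcher.quoteReplacement(repeated.toString()));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String template = "Start. $while($n)row $endWhile. End.";
        System.out.println(expand(template, Collections.singletonMap("$n", 2)));
        // Start. row row . End.
    }
}
```

The text outside the blocks is copied through unchanged by appendReplacement() and appendTail(), so only the matched blocks are rewritten.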
public class PatternInString {
    static String testcase1 = "what i meant here";
    static String testcase2 = "here";
    public static void main(String[] args) {
        PatternInString testInstance = new PatternInString();
        boolean result = testInstance.occurs(testcase1, testcase2);
        System.out.println(result);
    }
    // Returns true if str2 occurs in str1 as a whole space-separated word.
    public boolean occurs(String str1, String str2) {
        // Splitting on spaces treats the first, middle, and last word alike,
        // and also works when str1 contains no space at all.
        for (String word : str1.split(" ")) {
            if (word.equals(str2)) {
                return true;
            }
        }
        return false;
    }
}