Regex to split string - java

I have this code which prints:
[( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Hello> ), ( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Bye> )]
I tried to split at [#] but it didn't work.
What should I put in split so that the result is only the part after #: Hello, Bye?
Query query = QueryFactory.create(queryString);
QueryExecution qe= QueryExecutionFactory.create(query, model);
ResultSet resultset = qe.execSelect();
ResultSet results = ResultSetFactory.copyResults(resultset);
final ResultSet results2 = ResultSetFactory.copyResults(results);
System.out.println( "== Available Options ==" );
ResultSetFormatter.out(System.out, results, query);
Scanner input = new Scanner(System.in);
final String inputs;
inputs = input.next();
final String[] indices = inputs.split("\\s*,\\s*");
final List<QuerySolution> selectedSolutions = new ArrayList<QuerySolution>(
indices.length) {
{
final List<QuerySolution> solutions = ResultSetFormatter
.toList(results2);
for (final String index : indices) {
add(solutions.get(Integer.valueOf(index)));
}
}
};
System.out.println(selectedSolutions);

If I understand correctly, you only want to extract "Hello" and "Bye" from your input String through regex.
In which case, I would just use iterative matching of whatever's in between # and >, as such:
// To clarify, this String is just an example
// Use yourScannerInstance.nextLine to get the real data
String input = "[( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Hello> ), "
+ "( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Bye> )]";
// Pattern improved by Brian
// was: #(.+?)>
Pattern p = Pattern.compile("#([^>]+)>");
Matcher m = p.matcher(input);
// To clarify, printing the String out is just for testing purpose
// Add "m.group(1)" to a Collection<String> to use it in further code
while (m.find()) {
System.out.println(m.group(1));
}
Output:
Hello
Bye
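The comments in the snippet above suggest collecting m.group(1) instead of printing it. A minimal sketch of that, assuming the solutions have already been rendered into one string as in the question; the names FragmentExtractor and extractFragments are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FragmentExtractor {
    // Captures everything between '#' and the next '>' (same pattern as above).
    private static final Pattern FRAGMENT = Pattern.compile("#([^>]+)>");

    // Hypothetical helper: returns every captured fragment in order of appearance.
    static List<String> extractFragments(String text) {
        List<String> fragments = new ArrayList<>();
        Matcher m = FRAGMENT.matcher(text);
        while (m.find()) {
            fragments.add(m.group(1));
        }
        return fragments;
    }

    public static void main(String[] args) {
        String input = "[( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Hello> ), "
                + "( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Bye> )]";
        System.out.println(extractFragments(input)); // prints [Hello, Bye]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it for every call.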

You can try this:
String[] str = yourOriginalString.split(",");
Then you can take the parts after # as follows:
String[] s=new String[2];
int j=0;
for(String i:str){
s[j]=i.split("#",2)[1];
j++;
}
The resulting strings still need some cleanup, for example:
String str = "[( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Hello> ), "
+ "( ?Random = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Bye> )]";
String[] arr = str.split(",");
String[] subArr = new String[arr.length];
int j = 0;
for (String i : arr) {
subArr[j] = i.split("#", 2)[1].replaceAll("\\>|\\)|\\]", "");
j++;
}
System.out.println(Arrays.toString(subArr));
Output:
[Hello , Bye ]

Try the regular expression:
(?<=#)([^#>]+)
e.g.:
private static final Pattern REGEX_PATTERN =
Pattern.compile("(?<=#)([^#>]+)");
public static void main(String[] args) {
String input = "[( ?A = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#Hello> ), ( ?A = <http://www.semanticweb.org/vassilis/ontologies/2013/5/Test#World> )]";
Matcher matcher = REGEX_PATTERN.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
Output:
Hello
World

Related

Read and Split(Parse) data in java

I am trying to split some simple data from a .txt file. I have found some useful structures on the internet but it was not enough to split the data the way I wanted. I get a string like this:
{X:0.8940594 Y:0.6853521 Z:1.470214}
And I want to transform it into this:
0.8940594
0.6853521
1.470214
Then I want to put the values into arrays X=[], Y=[], Z=[] (the data are the coordinates of an object).
Here is my code:
BufferedReader in = null; {
try {
in = new BufferedReader(new FileReader("file.txt"));
String read = null;
while ((read = in.readLine()) != null) {
String[] splited = read.split("\\s+");
for (String part : splited) {
System.out.println(part);
}
}
} catch (IOException e) {
System.out.println("There was a problem: " + e);
e.printStackTrace();
} finally {
try {
in.close();
} catch (Exception e) {
}
}
}
What do I need to add to my code to get the data the way I want?
Right now with this code I receive data like this:
{X:0.8940594
Y:0.6853521
Z:1.470214}

You can try using a regex similar to the following to match and capture the three numbers contained in each tuple:
{\s*X:(.*?)\s+Y:(.*?)\s+Z:(.*?)\s*}
Each expression in parentheses is a capture group, and its text is available through m.group(n) after a match has taken place.
int size = 100; // replace with actual size of your vectors/matrix
double[] A = new double[size];
double[] B = new double[size];
double[] C = new double[size];
String input = "{X:0.8940594 Y:0.6853521 Z:1.470214}";
String regex = "\\{\\s*X:(.*?)\\s+Y:(.*?)\\s+Z:(.*?)\\s*\\}";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
int counter = 0;
while (m.find()) {
A[counter] = Double.parseDouble(m.group(1));
B[counter] = Double.parseDouble(m.group(2));
C[counter] = Double.parseDouble(m.group(3));
++counter;
}
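The arrays above need the number of rows in advance. Here is a sketch of the same capture-group approach that grows three lists instead, assuming the lines have already been read from the file. CoordinateReader and addLine are illustrative names, and the number pattern is tightened from (.*?) to an optional sign, digits, and an optional fraction:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoordinateReader {
    // One capture group per coordinate; braces are escaped because { and } are regex metacharacters.
    private static final Pattern TUPLE = Pattern.compile(
            "\\{\\s*X:(-?\\d+(?:\\.\\d+)?)\\s+Y:(-?\\d+(?:\\.\\d+)?)\\s+Z:(-?\\d+(?:\\.\\d+)?)\\s*\\}");

    final List<Double> xs = new ArrayList<>();
    final List<Double> ys = new ArrayList<>();
    final List<Double> zs = new ArrayList<>();

    // Appends the coordinates of every tuple found in the line; returns how many were found.
    int addLine(String line) {
        Matcher m = TUPLE.matcher(line);
        int found = 0;
        while (m.find()) {
            xs.add(Double.parseDouble(m.group(1)));
            ys.add(Double.parseDouble(m.group(2)));
            zs.add(Double.parseDouble(m.group(3)));
            found++;
        }
        return found;
    }

    public static void main(String[] args) {
        CoordinateReader reader = new CoordinateReader();
        reader.addLine("{X:0.8940594 Y:0.6853521 Z:1.470214}");
        reader.addLine("{X:-1.5 Y:2.25 Z:0.0}");
        System.out.println("X=" + reader.xs);
        System.out.println("Y=" + reader.ys);
        System.out.println("Z=" + reader.zs);
    }
}
```

In the question's loop, addLine(read) would take the place of the split call.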

You can use the regex -?\d+\.\d+, for example:
String input = "{X:0.8940594 Y:0.6853521 Z:1.470214}";
Pattern pattern = Pattern.compile("-?\\d+\\.\\d+");
Matcher matcher = pattern.matcher(input);
List<String> result = new ArrayList<>();
while (matcher.find()) {
result.add(matcher.group());
}
System.out.println(result);
In your case you want to match real numbers, which is what this pattern does.

This code will solve your problem.
String input = "{X:0.8940594 Y:0.6853521 Z:1.470214} ";
String[] parts = input.split("(?<= )");
List<String> output = new ArrayList<>();
for (int i = 0; i < parts.length; i++) {
//System.out.println("*" + i);
//System.out.println(parts[i]);
String[] part = parts[i].split("(?<=:)");
String[] temp = part[1].split("}");
output.add(temp[0]);
}
System.out.println("This List contains numbers:" + output);
Output->This List contains numbers:[0.8940594 , 0.6853521 , 1.470214]

How about this?
public class Test {
public static void main(String[] args) {
String s = "{X:0.8940594 Y:0.6853521 Z:1.470214}";
String x = s.substring(s.indexOf("X:")+2, s.indexOf("Y:")-1);
String y = s.substring(s.indexOf("Y:")+2, s.indexOf("Z:")-1);
String z = s.substring(s.indexOf("Z:")+2, s.lastIndexOf("}"));
System.out.println(x);
System.out.println(y);
System.out.println(z);
}
}

Your regex splits on whitespace, but does not remove the curly braces.
So instead of splitting on whitespace alone, split on runs of characters from a class: whitespace and curly braces.
The line with the regex then becomes:
String[] splited = read.split("[\\s{}]+");
Here is an ideone link with the full snippet.
After this, skip the empty first element (each line starts with a delimiter, the {), split the remaining three strings on the :, and parse the right-hand side. You can use Double.parseDouble for this purpose.
Personally, I would try to avoid long regular expressions; they are hard to debug.
It may be best to remove the curly braces first, then split the result on whitespace and colons. This is more lines of code, but it's more robust and easier to debug.
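The step-by-step approach described above could look roughly like this. It is a sketch that assumes every line is well formed (one name:value pair per field); CoordinateParser and parseLine are illustrative names:

```java
import java.util.Arrays;

public class CoordinateParser {
    // Parses one line such as "{X:0.8940594 Y:0.6853521 Z:1.470214}" into {x, y, z}.
    static double[] parseLine(String line) {
        // Step 1: drop the curly braces and any surrounding whitespace.
        String body = line.replace("{", "").replace("}", "").trim();
        // Step 2: split on runs of whitespace to get "X:...", "Y:...", "Z:...".
        String[] fields = body.split("\\s+");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            // Step 3: keep the part after the colon and parse it (assumes a colon is present).
            values[i] = Double.parseDouble(fields[i].split(":", 2)[1]);
        }
        return values;
    }

    public static void main(String[] args) {
        double[] xyz = parseLine("{X:0.8940594 Y:0.6853521 Z:1.470214}");
        System.out.println(Arrays.toString(xyz)); // prints [0.8940594, 0.6853521, 1.470214]
    }
}
```

Each step can be inspected on its own in a debugger, which is the main advantage over a single long pattern.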

Replace word in Java

There is a line, for example "1 qqq 4 aaa 2", and a list {aaa, qqq}. I must replace every word (a token consisting only of letters) with the words from the list, in order. The expected answer for this example is "1 aaa 4 qqq 2". I tried:
StringTokenizer tokenizer = new StringTokenizer(str, " ");
while (tokenizer.hasMoreTokens()){
tmp = tokenizer.nextToken();
if(tmp.matches("^[a-z]+$"))
newStr = newStr.replaceFirst(tmp, words.get(l++));
}
But it does not work: the result is the same line.
All my code:
String space = " ", tmp, newStr;
Scanner stdin = new Scanner(System.in);
while (stdin.hasNextLine()) {
int k = 0, j = 0, l = 0;
String str = stdin.nextLine();
newStr = str;
List<String> words = new ArrayList<>(Arrays.asList(str.split(" ")));
words.removeIf(new Predicate<String>() {
@Override
public boolean test(String s) {
return !s.matches("^[a-z]+$");
}
});
Collections.sort(words);
StringTokenizer tokenizer = new StringTokenizer(str, " ");
while (tokenizer.hasMoreTokens()){
tmp = tokenizer.nextToken();
if(tmp.matches("^[a-z]+$"))
newStr = newStr.replaceFirst(tmp, words.get(l++));
}
System.out.printf(newStr);
}

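replaceFirst() does expect a regular expression as its first parameter, but that is not the cause here, because a plain lowercase word is a valid pattern that matches itself. The problem is that each call searches the whole of newStr, so it can hit a word that an earlier iteration already substituted. With "1 qqq 4 aaa 2", the first pass turns qqq into aaa, and the second pass turns that new aaa back into qqq, which leaves the original line.
Changing the call to
newStr = newStr.replaceFirst("^[a-z]+$", words.get(l++));
does not help either, since the anchored pattern only matches when the entire line consists of letters.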
Update:
Would that be a possibility for you:
StringBuilder _b = new StringBuilder();
while (_tokenizer.hasMoreTokens()){
String _tmp = _tokenizer.nextToken();
if(_tmp.matches("^[a-z]+$")){
_b.append(words.get(l++));
}
else{
_b.append(_tmp);
}
_b.append(" ");
}
String newStr = _b.toString().trim();
Update 2:
Change the StringTokenizer like this:
StringTokenizer _tokenizer = new StringTokenizer(str, " ", true);
That will also return the delimiters (all the spaces).
And then concatenate the String like this:
StringBuilder _b = new StringBuilder();
while (_tokenizer.hasMoreTokens()){
String _tmp = _tokenizer.nextToken();
if(_tmp.matches("^[a-z]+$")){
_b.append(words.get(l++));
}
else{
_b.append(_tmp);
}
}
String newStr = _b.toString().trim();
That should work.
Update 3:
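As @DavidConrad mentioned, StringTokenizer should not be used anymore. Here is another solution with String.split():
// Split before and after each whitespace character so that words and spaces become separate elements
final String[] _elements = str.split("(?<=\\s)|(?=\\s)");
final StringBuilder _b = new StringBuilder();
int l = 0;
for (int i = 0; i < _elements.length; i++){
if(_elements[i].matches("^[a-z]+$")){
_b.append(words.get(l++));
}
else{
_b.append(_elements[i]);
}
}
String newStr = _b.toString();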

Just out of curiosity, another solution (the others really don't answer the question), which takes the input line and sorts the words alphabetically in the result, as you commented in your question.
public class Replacer {
public static void main(String[] args) {
Replacer r = new Replacer();
Scanner in = new Scanner(System.in);
while (in.hasNextLine()) {
System.out.println(r.replace(in.nextLine()));
}
}
public String replace(String input) {
Matcher m = Pattern.compile("([a-z]+)").matcher(input);
StringBuffer sb = new StringBuffer();
List<String> replacements = new ArrayList<>();
while (m.find()) {
replacements.add(m.group());
}
Collections.sort(replacements);
m.reset();
for (int i = 0; m.find(); i++) {
m.appendReplacement(sb, replacements.get(i));
}
m.appendTail(sb);
return sb.toString();
}
}

How to extract variables from url query to readable format in java?

I have this url:
http://myhost.com/Request?to=s%3A73746647+d%3Afalse+f%3A-1.0+x%3A-74.454383+y%3A40.843021+r%3A-1.0+cd%3A-1.0+fn%3A-1+tn%3A-1+bd%3Atrue+st%3ACampus%7EDr&returnGeometries=true&nPaths=1&returnClientIds=true&returnInstructions=true&hour=12+00&from=s%3A-1+d%3Afalse+f%3A-1.0+x%3A-74.241765+y%3A40.830182+r%3A-1.0+cd%3A-1.0+fn%3A56481485+tn%3A26459042+bd%3Afalse+st%3AClaremont%7EAve&sameResultType=true
How can I extract the from and to arguments in a readable manner?
I have tried the following:
String patternString1 = "(&to=) (.+?) (&returnGeometries) (.+?) (&hour=)"
+" (.+?) (&from=) (.+?) (&sameResultType=)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(freshResponse.regression_requestUrl);
while(matcher.find()) {
System.out.println("found: "+matcher.group(1)+" "+matcher.group(3)+matcher.group(4));
}
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(url);
But even if I succeed in fetching the correct substrings, how can I convert them to coordinates that I can use to find this place? In other words, how do I get the coordinates clean and ready to use?

This should do what you want:
public class DecodeURL {
public static void main(String[] args) throws UnsupportedEncodingException {
String input = "http://myhost.com/Request?to=s%3A73746647+d%3Afalse+f%3A-1.0+"
+"x%3A-74.454383+y%3A40.843021+r%3A-1.0+cd%3A-1.0+fn%3A-1+tn%3A-1+bd%3A"
+"true+st%3ACampus%7EDr&returnGeometries=true&nPaths=1&returnClientIds="
+"true&returnInstructions=true&hour=12+00&from=s%3A-1+d%3Afalse+f%3A-"
+"1.0+x%3A-74.241765+y%3A40.830182+r%3A-1.0+cd%3A-1.0+fn%3A56481485+tn"
+"%3A26459042+bd%3Afalse+st%3AClaremont%7EAve&sameResultType=true";
String decoded = java.net.URLDecoder.decode(input, "UTF-8").replace("&", " & ");
String[] params = {"s","d","f","x","y","r","cd","fn","tn","bd","st"};
System.out.println("Decoded input URL: \n"+decoded);
// Output all FROM arguments
System.out.println("\nFROM:");
for (int i = 0; i < params.length; i++) {
System.out.println(params[i]+" = \t"+findInstance(decoded, "from", params[i]));
}
// Output all TO arguments
System.out.println("\nTO:");
for (int i = 0; i < params.length; i++) {
System.out.println(params[i]+" = \t"+findInstance(decoded, "to", params[i]));
}
}
public static String findInstance(String input, String type, String match) {
int start = input.indexOf(match+":", input.indexOf(type))+match.length()+1;
return input.substring(start, input.indexOf(" ", start));
}
}
Output
Decoded input URL:
http://myhost.com/Request?to=s:73746647 d:false f:-1.0 x:-74.454383 y:40.843021 r:-1.0 cd:-1.0 fn:-1 tn:-1 bd:true st:Campus~Dr & returnGeometries=true & nPaths=1 & returnClientIds=true & returnInstructions=true & hour=12 00 & from=s:-1 d:false f:-1.0 x:-74.241765 y:40.830182 r:-1.0 cd:-1.0 fn:56481485 tn:26459042 bd:false st:Claremont~Ave & sameResultType=true
FROM:
s = -1
d = false
f = -1.0
x = -74.241765
y = 40.830182
r = -1.0
cd = -1.0
fn = 56481485
tn = 26459042
bd = false
st = Claremont~Ave
TO:
s = 73746647
d = false
f = -1.0
x = -74.454383
y = 40.843021
r = -1.0
cd = -1.0
fn = -1
tn = -1
bd = true
st = Campus~Dr
To change which parameters are printed, simply edit the params array. For example, with String[] params = {"x","y"}, the program outputs only the coordinates (x,y).
I hope that helps you out. Good luck!

Try this:
String url = "http://myhost.com/Request?to=s%3A73746647+d%3Afalse+f%3A-1.0+x%3A-74.454383+y%3A40.843021+r%3A-1.0+cd%3A-1.0+fn%3A-1+tn%3A-1+bd%3Atrue+st%3ACampus%7EDr&returnGeometries=true&nPaths=1&returnClientIds=true&returnInstructions=true&hour=12+00&from=s%3A-1+d%3Afalse+f%3A-1.0+x%3A-74.241765+y%3A40.830182+r%3A-1.0+cd%3A-1.0+fn%3A56481485+tn%3A26459042+bd%3Afalse+st%3AClaremont%7EAve&sameResultType=true";
URL urlObject = new URL(url);
for (String s : urlObject.getQuery().split("&")) {
String d = URLDecoder.decode(s, "UTF-8");
if (d.startsWith("from=") || d.startsWith("to=")) {
int index = d.indexOf('=') + 1;
System.out.println(d.substring(0, index));
for (String t : d.substring(index).split(" "))
System.out.println(" " + t);
} else
System.out.println(s);
}
URLDecoder.decode() is useful, but be careful: the query string may contain %26, which decodes to & yet is data, not a delimiter.
So you should split("&") first and decode() each part afterwards.
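A sketch of that order of operations (split first, decode second), followed by turning a decoded value into numbers. QueryParser, parseQuery and parseFields are illustrative names, and the note parameter in the sample query is made up to show an encoded & surviving the split:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParser {
    // Splits the raw (still encoded) query on '&' first, then decodes each key and value.
    static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // no '=' in this piece; skipped in this sketch
            }
            params.put(decode(pair.substring(0, eq)), decode(pair.substring(eq + 1)));
        }
        return params;
    }

    // Turns a decoded value such as "x:-74.45 y:40.84" into name/value pairs.
    static Map<String, String> parseFields(String decodedValue) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : decodedValue.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            if (colon > 0) {
                fields.put(token.substring(0, colon), token.substring(colon + 1));
            }
        }
        return fields;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String rawQuery = "to=x%3A-74.454383+y%3A40.843021&note=a%26b"
                + "&from=x%3A-74.241765+y%3A40.830182";
        Map<String, String> params = parseQuery(rawQuery);
        Map<String, String> to = parseFields(params.get("to"));
        double x = Double.parseDouble(to.get("x"));
        double y = Double.parseDouble(to.get("y"));
        System.out.println("to: x=" + x + ", y=" + y);
        System.out.println("note: " + params.get("note"));
    }
}
```

With a full URL, the raw query would come from new URL(url).getQuery(), as in the snippet above.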

Split mathematical string in Java

I have this string: "23+43*435/675-23". How can I split it? The final result I want is:
String 1st=23
String 2nd=435
String 3rd=675
String 4th=23
I already used this method:
String s = "hello+pLus-minuss*multi/divide";
String[] split = s.split("\\+");
String[] split1 = s.split("\\-");
String[] split2 = s.split("\\*");
String[] split3 = s.split("\\/");
String plus = split[1];
String minus = split1[1];
String multi = split2[1];
String div = split3[1];
System.out.println(plus+"\n"+minus+"\n"+multi+"\n"+div+"\n");
But it gives me this result:
pLus-minuss*multi/divide
minuss*multi/divide
multi/divide
divide
But I need the result in this form:
pLus
minuss
multi
divide

Try this:
public static void main(String[] args) {
String s ="23+43*435/675-23";
String[] ss = s.split("[-+*/]");
for(String str: ss)
System.out.println(str);
}
Output:
23
43
435
675
23
I don't know why you want to store the parts in variables before printing them, but in that case try the code below:
public static void main(String[] args) {
String s = "hello+pLus-minuss*multi/divide";
String[] ss = s.split("[-+*/]");
String first =ss[1];
String second =ss[2];
String third =ss[3];
String fourth =ss[4];
System.out.println(first+"\n"+second+"\n"+third+"\n"+fourth+"\n");
}
Output:
pLus
minuss
multi
divide

Try this out :
String data = "23+43*435/675-23";
Pattern pattern = Pattern.compile("[^\\+\\*\\/\\-]+");
Matcher matcher = pattern.matcher(data);
List<String> list = new ArrayList<String>();
while (matcher.find()) {
list.add(matcher.group());
}
for (int index = 0; index < list.size(); index++) {
System.out.println(index + " : " + list.get(index));
}
Output :
0 : 23
1 : 43
2 : 435
3 : 675
4 : 23

I think it is only an indexing issue. Split the remainder (index 1) of each previous result, and take index 0 of the next split to get each part:
String[] split = s.split("\\+");
String[] split1 = split[1].split("\\-");
String[] split2 = split1[1].split("\\*");
String[] split3 = split2[1].split("\\/");
String hello = split[0];   // split[0]=hello, split[1]=pLus-minuss*multi/divide
String plus = split1[0];   // split1[0]=pLus, split1[1]=minuss*multi/divide
String minus = split2[0];  // split2[0]=minuss, split2[1]=multi/divide
String multi = split3[0];  // split3[0]=multi, split3[1]=divide
String div = split3[1];

If the order of operators matters, change your code to this:
String s = "hello+pLus-minuss*multi/divide";
String[] split = s.split("\\+");
String[] split1 = split[1].split("\\-");
String[] split2 = split1[1].split("\\*");
String[] split3 = split2[1].split("\\/");
String plus = split1[0];
String minus = split2[0];
String multi = split3[0];
String div = split3[1];
System.out.println(plus + "\n" + minus + "\n" + multi + "\n" + div + "\n");
Otherwise, to split on any operator and store the parts in variables, do this:
public static void main(String[] args) {
String s = "hello+pLus-minuss*multi/divide";
String[] ss = s.split("[-+*/]");
String plus = ss[1];
String minus = ss[2];
String multi = ss[3];
String div = ss[4];
System.out.println(plus + "\n" + minus + "\n" + multi + "\n" + div + "\n");
}
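Inside a character class the hyphen has to come first (or last) to be taken literally, which is why the pattern above is written [-+*/]. One caveat: if the expression starts with an operator, such as a leading minus, split() returns an empty first element. The sketch below skips such empty pieces, which is an assumption about what is wanted; OperandSplitter and operands are illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

public class OperandSplitter {
    // Returns the operands of the expression, dropping the operators and any empty pieces.
    // Note that the sign of a leading negative number is lost, because '-' is treated as a delimiter.
    static List<String> operands(String expression) {
        List<String> result = new ArrayList<>();
        for (String piece : expression.split("[-+*/]")) {
            if (!piece.isEmpty()) {
                result.add(piece);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(operands("23+43*435/675-23")); // prints [23, 43, 435, 675, 23]
        System.out.println(operands("-5+3"));             // prints [5, 3]
    }
}
```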

Words inside square brackets - RegExp

I am trying to get all blocks that are enclosed in square brackets from a path string:
String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String text = "[build]/directory/[something]/[build]/";
RegExp reg = RegExp.compile(linkPattern,"g");
MatchResult matchResult = reg.exec(text);
for (int i = 0; i < matchResult.getGroupCount(); i++) {
System.out.println("group" + i + "=" + matchResult.getGroup(i));
}
But I only get group0="[build]". What I want is:
1:"[build]" 2:"[something]" 3:"[build]"
EDIT:
Just to be clear, the words inside the brackets are generated as random text:
public static String genText()
{
final int LENGTH = (int)(Math.random()*12)+4;
StringBuffer sb = new StringBuffer();
for (int x = 0; x < LENGTH; x++)
{
sb.append((char)((int)(Math.random() * 26) + 97));
}
String str = sb.toString();
str = str.substring(0,1).toUpperCase() + str.substring(1);
return str;
}
EDIT 2:
The JDK regex classes work fine; the problem only occurs with GWT's RegExp.
SOLVED:
Answer from Didier L
String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String result = "";
String text = "[build]/directory/[something]/[build]/";
RegExp reg = RegExp.compile(linkPattern,"g");
MatchResult matchResult = null;
while((matchResult=reg.exec(text)) != null){
if(matchResult.getGroupCount()==1)
System.out.println( matchResult.getGroup(0));
}

I don't know which regex library you are using, but with the one from the JDK it would go along the lines of:
String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String text = "[build]/directory/[something]/[build]/";
Pattern pat = Pattern.compile(linkPattern);
Matcher mat = pat.matcher(text);
while (mat.find()) {
System.out.println(mat.group());
}
Output:
[build]
[something]
[build]

Try:
String linkPattern = "(\\[[A-Za-z_0-9]+\\])*";
EDIT:
Second try:
String linkPattern = "\\[(\\w+)\\]+";
Third try, see http://rubular.com/r/eyAQ3Vg68N
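For reference, the capture group in the second pattern makes it possible to read each word without its brackets. This sketch uses java.util.regex rather than GWT's RegExp; BracketWords and wordsInBrackets are illustrative names, and the trailing + after the closing bracket is dropped on the assumption that it was unintended:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketWords {
    // Group 1 holds the word between '[' and ']'.
    private static final Pattern BRACKETED = Pattern.compile("\\[(\\w+)\\]");

    // Returns the bracketed words in order of appearance, without the brackets.
    static List<String> wordsInBrackets(String path) {
        List<String> words = new ArrayList<>();
        Matcher m = BRACKETED.matcher(path);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(wordsInBrackets("[build]/directory/[something]/[build]/"));
        // prints [build, something, build]
    }
}
```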
