get a specific number in a string - java

I am trying to parse a text and get a value from a text like:
Page 1 of 6
I want to extract the last number using Java, so the output in this case should be 6.
Are there any Java string functions I can use, or is there another way?

You could use a regular expression for this (it's safer than using for example String.split):
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public static void main(String[] args) {
    String text = "Page 1 of 6";
    // group 1 is the current page, group 2 is the total number of pages
    Matcher m = Pattern.compile("Page (\\d+) of (\\d+)").matcher(text);
    if (m.matches()) {
        int page = Integer.parseInt(m.group(1));
        int pages = Integer.parseInt(m.group(2));
        System.out.printf("parsed page = %d and pages = %d.", page, pages);
    }
}
Outputs:
parsed page = 1 and pages = 6.

Something like this:
String s = "Page 1 of 6";
String[] values = s.split(" ");
System.out.println(Integer.parseInt(values[values.length - 1]));

I think it is basic string manipulation. What you can do is this:
String pageNumberString = "Page 1 of 6";
int ofIndex = pageNumberString.indexOf("of");
// trim() removes the space after "of"; Integer.parseInt rejects surrounding whitespace
int pageNumber = Integer.parseInt(pageNumberString.substring(ofIndex + 2).trim());
I think this should work.

Pattern p = Pattern.compile("(\\d+)$");
Matcher m = p.matcher("Page 1 of 6");
if (m.find()) { // find() must succeed before group() can be read
    System.out.println(Integer.parseInt(m.group(1)));
}

I'd use a regular expression, as long as the format of your numbers is going to stay similar.
This one, for example, will match any string with 2 numbers (separated by one or more non-digit characters) and capture the 2 numbers.
(\d+)[^\d]+(\d+)
Note: this will match some weird things like "Page1of2". It also won't match negative numbers. Not that you expect to ever get a negative page number.
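The pattern above is given without Java code, so here is a minimal sketch of how it might be applied. The class name TwoNumbers and the method parse are made up for illustration, and \D is used as shorthand for [^\d]. It assumes both numbers fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoNumbers {
    // (\d+)\D+(\d+): two runs of digits separated by at least one non-digit
    private static final Pattern TWO_NUMBERS = Pattern.compile("(\\d+)\\D+(\\d+)");

    /** Returns {first, second}, or null when the text does not contain two numbers. */
    static int[] parse(String text) {
        Matcher m = TWO_NUMBERS.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] numbers = parse("Page 1 of 6");
        System.out.println(numbers[1]); // the second number, i.e. the page count
    }
}
```

Using find() rather than matches() means the two numbers may appear anywhere in the text, which is looser than the "Page (\d+) of (\d+)" pattern in the first answer.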

Related

How to skip a portion of a line in Java- Don't know what I am doing

I need to use only every other number in a string of numbers.
This is how the XML file content comes to us, but I only want to use the first group and then every other group.
The second number in each group can be ignored. The most important numbers are the first ones, like 1, 3, 5 and 29.
Can you help? Each group has the form "x":"x",
<CatcList>{"1":"15","2":"15","3":"25","4":"25","5":"35","6":"35","29":"10","30":"10"}</CatcList>
Right now my script looks like this, but I am not the one who wrote it.
I only included the portion that would be needed. The StartPage would be the variable used.
If you have knowledge of how to add 1 to the EndPage Integer, that would be very helpful as well.
Thank you!
Util.StringList xs;
line.parseLine(",", "", xs);
for (Int i=0; i<xs.Size; i++) {
Int qty = xs[i].right(xs[i].Length - xs[i].find(":")-1).toInt()-1;
for (Int j=0; j<qty; j++) {
Output_0.File.DocId = product;
Output_0.File.ImagePath = Image;
Output_0.File.ImagePath1 = Image;
Output_0.File.StartPage = xs[i].left(xs[i].find(("-"))).toInt()-1;
Output_0.File.EndPage = xs[i].mid(xs[i].find("-")+1, (xs[i].find(":") - xs[i].find("-")-1)).toInt()-0;
Output_0.File.Quantity = qty.toString();
Output_0.File.commit();
You can use a Pattern with a loop and a condition to extract this information:
String string = "<CatcList>{\"1\":\"15\",\"2\":\"15\",\"3\":\"25\",\"4\":\"25\","
+ "\"5\":\"35\",\"6\":\"35\",\"29\":\"10\",\"30\":\"10\"}</CatcList> ";
String regex = "\"(\\d+)\":\"\\d+\"";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
int index = 0;
while (matcher.find()) {
if (index % 2 == 0) {//From your question you have to get alternative results 1 3 5 ...
System.out.println(matcher.group(1) + " ---> " + matcher.group());
}
index++;
}
Outputs
1 ---> "1":"15"
3 ---> "3":"25"
5 ---> "5":"35"
29 ---> "29":"10"
The regex "(\d+)":"\d+" matches any "number":"number" pair. I put the first number in a capturing group, (\d+), so that I can read just that part with group(1).
That XML value looks like a JSON map. All that .right, .mid, and .left code looks pretty confusing to me without details of how those methods work. Does something like this seem more clear:
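// the embedded quotes are escaped so that the literal compiles
String xmlElement = "{\"1\":\"15\",\"2\":\"15\",\"3\":\"25\",\"4\":\"25\",\"5\":\"35\",\"6\":\"35\",\"29\":\"10\",\"30\":\"10\"}";
// strip the braces and the quotes, leaving 1:15,2:15,... so the keys parse as integers
xmlElement = xmlElement.replaceAll("[}{\"]", "");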
String[] mapEntryStrings = xmlElement.split(",");
Map<Integer, String> oddStartPages = new HashMap<Integer, String>();
for (String entry : mapEntryStrings) {
String[] keyAndValue = entry.split(":");
int key = Integer.parseInt(keyAndValue[0]);
if (key % 2 == 1) {// if odd
oddStartPages.put(key, keyAndValue[1]);
}
}
Then, the set of keys in the oddStartPages Map is exactly the set of first numbers in your "first and every other group" requirement.
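One caveat about the odd-key test: it gives the right answer for the sample data because the wanted keys (1, 3, 5, 29) all happen to be odd, and a HashMap does not keep them in document order. If "every other group" is meant by position rather than by key value, a sketch along these lines may be closer. The class name EveryOtherKey and the method firstAndEveryOther are hypothetical, and the input is assumed to contain only "digits":"digits" pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EveryOtherKey {
    // One "key":"value" pair; group 1 is the key.
    private static final Pattern PAIR = Pattern.compile("\"(\\d+)\":\"\\d+\"");

    /** Collects the key of the 1st, 3rd, 5th, ... pair, in document order. */
    static List<Integer> firstAndEveryOther(String content) {
        List<Integer> keys = new ArrayList<>();
        Matcher m = PAIR.matcher(content);
        for (int position = 0; m.find(); position++) {
            if (position % 2 == 0) { // keep pairs at even positions: 0, 2, 4, ...
                keys.add(Integer.parseInt(m.group(1)));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String content = "{\"1\":\"15\",\"2\":\"15\",\"3\":\"25\",\"4\":\"25\","
                + "\"5\":\"35\",\"6\":\"35\",\"29\":\"10\",\"30\":\"10\"}";
        System.out.println(firstAndEveryOther(content)); // [1, 3, 5, 29]
    }
}
```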
Try this:
,?("(?<num>\d+)":"\d+"),?("\d+":"\d+")?
The group called num will contain every other occurrence of the first part of "x":"x"
so for the values:
"1":"14","2":"14","3":"14","4":"24","5":"33","6":"44","7":"55"
the group called 'num' will contain 1 3 5 and 7.
see example here
Edit: Once you have the numbers extracted, you can do with it what you need:
Pattern datePatt = Pattern.compile(",?(\"(?<num>\\d+)\":\"\\d+\"),?(\"\\d+\":\"\\d+\")?");
String dateStr = "\"1\":\"14\",\"2\":\"14\",\"3\":\"14\",\"4\":\"24\",\"5\":\"33\",\"6\":\"44\",\"7\":\"55\"";
Matcher m = datePatt.matcher(dateStr);
while (m.find()) {
System.out.printf("%s%n", m.group("num"));
}

Taking and skipping groups of strings?

I worked with strings in a couple of languages and then something bothered me about how we can select characters or slices (substrings) from the strings. Like we can get substrings from a string or a character from a particular position, but I was not able to find any method or operator which returns certain slices of a particular length skipping particular characters. Below is the explanation.
So suppose I have the following string: I am an example string. From this string, I want to be able to get groups of string of let's say length 2 and skip certain characters, let's say 3. Now to make things more interesting let's say I can start at any index, which for this example we'll take 5. So the string which I should get from the above conditions should be the following: anam sng. Illustration below (* for take, ! for skip).
** ** ** **
I am an example string.
| !!! !!! !!! !
Start Position --+
I know you could implement that using counting variables which keep track of each character whether to take or not using if condition. But I was thinking of a mathematical way or maybe even an inbuilt method or operator in some languages that could do the job.
I also searched whether Regex could do the job. But couldn't come up with anything.
Generic solution: skip the first start characters, then replace all occurrences of the regex (.{0,n}).{0,m} with the first group.
Python:
import re
input = 'I am an example string.'
n = 2
m = 3
start = 5
print(re.sub('(.{0,%d}).{0,%d}' % (n, m), "\\1", input[start:]))
Java:
final String input = "I am an example string.";
final int n = 2;
final int m = 3;
final int start = 5;
final String regex = String.format("(.{0,%d}).{0,%d}", n, m);
System.out.println(input.substring(start).replaceAll(regex, "$1"));
C++11:
string input = "I am an example string.";
int n = 2;
int m = 3;
int start = 5;
stringstream s;
s << "(.{0," << n << "}).{0," << m << "}";
regex r(s.str());
cout << regex_replace(input.substr(start), r, "$1");
Regex can do. You only need to try a little harder :)
public static void main(String[] args) {
String s = "I am an example stringpppqq";
Pattern p = Pattern.compile("(.{1,2})(?:.{3}|.{0,2}$)");
int index = 5;
Matcher m = p.matcher(s);
StringBuilder sb = new StringBuilder();
while (index < s.length() && m.find(index)) {
System.out.println(m.group(1));
sb.append(m.group(1));
index = index + 5;
System.out.println(index);
}
System.out.println(sb);
}
O/P :
anam sngqq
Python doesn't have this kind of slicing; you must use a loop. But you can do it with a list comprehension:
text = 'I am a sample string'
s = 5 # start position
l = 2 # slice length
d = 3 # distance between slices
chunks = [text[p:p + l] for p in range(s, len(text), l + d)]
result = ''.join(chunks)
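With a regex, you can match the two characters to keep in a group, followed by the three characters to skip. Note that this pattern only matches complete five-character runs, so a shorter tail at the end of the text is dropped.
import re
regex = r'(..)...'
found = re.findall(regex, text[s:])  # list of strings, one per match (single group)
result = ''.join(found)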
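For comparison, the same index arithmetic as the list comprehension can be written in Java without any regex. This is only a sketch: the class name TakeSkip and its method are invented for illustration, and it assumes start >= 0, take >= 1 and skip >= 0.

```java
class TakeSkip {
    /** Starting at index start, repeatedly keeps take characters and skips skip characters. */
    static String takeSkip(String text, int start, int take, int skip) {
        StringBuilder out = new StringBuilder();
        // p walks over the first index of every kept chunk
        for (int p = start; p < text.length(); p += take + skip) {
            int end = Math.min(p + take, text.length()); // the last chunk may be shorter
            out.append(text, p, end);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(takeSkip("I am an example string.", 5, 2, 3)); // anam sng
    }
}
```

Unlike the (..)... regex, this keeps a partial chunk at the end of the text.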

Get certain data from text - Java

I am creating a Bukkit plugin for Minecraft and I need to know a few things before I move on.
I want to check whether a text has this layout: "B:10 S:5", for example.
It stands for Buy:amount and Sell:amount.
What is the easiest way to check whether it follows this syntax?
The amounts can be any number that is 0 or greater.
Another problem is getting this data out of the text. How can I read the text after B: and S: and return it as an integer?
I have not tried anything yet because I have no idea where to start.
Thanks for the help!
In the simple problem you gave, you can get away with a simple answer. Otherwise, see the regex answer below.
boolean test(String str){
try{
//String str = "B:10 S:5";
String[] arr = str.split(" ");//split to left and right of space = [B:10,S:5]
String[] bArr = arr[0].split(":");//split ...first colon = [B,10]
String[] sArr = arr[1].split(":");//... second colon = [S,5]
//need to use try/catch here in case the string is not an int value.
String labelB = bArr[0];
Integer b = Integer.parseInt(bArr[1]);
String labelS = sArr[0];
Integer s = Integer.parseInt(sArr[1]);
}catch( Exception e){return false;}
return true;
}
See my answer here for a related task. More related details below.
How can I parse a string for a set?
Essentially, you need to use a regex and iterate through the groups. In case the grammar is not always B and S, I made the labels more abstract. I also allowed for extra spaces in the middle, in case they appear for whatever reason. The pattern has 4 groups (indicated by parentheses): label1, number1, label2, and number2. + means 1 or more. [] means a set of characters. a-z is a range of characters (don't put anything between A-Z and a-z). There are other ways of writing alphabetic and numeric patterns, but these are easier to read.
// compiling a pattern is expensive, so do it once and reuse it
Pattern p=Pattern.compile("([A-Za-z]+):([0-9]+)[ ]+([A-Za-z]+):([0-9]+)");
boolean test(String txt){
Matcher m=p.matcher(txt);
if(!m.matches())return false;
int groups=m.groupCount();//should equal 4 here; group 0 (the whole match) is not included in the count
System.out.println("Matched: " + m.group(0));
//Label1 = m.group(1);
//val1 = m.group(2);
//Label2 = m.group(3);
//val2 = m.group(4);
return true;
}
Use a regular expression.
In your case, ^B:(\d+) S:(\d+)$ is enough.
In Java, to use a regular expression:
import java.util.regex.Pattern;
public class RegExExample {
    public static void main(String[] args) {
        // backslashes must be doubled inside a Java string literal
        Pattern p = Pattern.compile("^B:(\\d+) S:(\\d+)$");
        for (int i = 0; i < args.length; i++)
            if (p.matcher(args[i]).matches())
                System.out.println("ARGUMENT #" + i + " IS VALID!");
            else
                System.out.println("ARGUMENT #" + i + " IS INVALID!");
    }
}
This sample program takes its inputs from the command line, validates each one against the pattern, and prints the result to STDOUT.
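The question also asks how to get the two amounts back as integers, which validation alone does not cover. A possible sketch follows; the class name BuySell and the method parse are hypothetical, and returning null for invalid input is just one choice among several.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuySell {
    // Group 1 is the buy amount, group 2 the sell amount.
    private static final Pattern LAYOUT = Pattern.compile("^B:(\\d+) S:(\\d+)$");

    /** Returns {buy, sell}, or null if the text does not follow the "B:n S:n" layout. */
    static int[] parse(String text) {
        Matcher m = LAYOUT.matcher(text);
        if (!m.matches()) {
            return null;
        }
        try {
            return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
        } catch (NumberFormatException e) {
            return null; // all digits, but too large to fit in an int
        }
    }

    public static void main(String[] args) {
        int[] amounts = parse("B:10 S:5");
        System.out.println("buy = " + amounts[0] + ", sell = " + amounts[1]);
    }
}
```

Because \d+ never matches a minus sign, negative amounts are rejected by the pattern itself.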

Getting match of Group with Asterisk?

How can I get the content for a group with an asterisk?
For example, I'd like to parse a comma-separated list, e.g. 1,2,3,4,5.
private static final String LIST_REGEX = "^(\\d+)(,\\d+)*$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);
public static void main(String[] args) {
final String list = "1,2,3,4,5";
final Matcher matcher = LIST_PATTERN.matcher(list);
System.out.println(matcher.matches());
for (int i = 0, n = matcher.groupCount(); i < n; i++) {
System.out.println(i + "\t" + matcher.group(i));
}
}
And the output is
true
0 1,2,3,4,5
1 1
How can I get every single entry, i. e. 1, 2, 3, ...?
I am searching for a common solution. This is only a demonstrative example.
Please imagine a more complicated regex like ^\\[(\\d+)(,\\d+)*\\]$ to match a list like [1,2,3,4,5]
You can use String.split().
for (String segment : "1,2,3,4,5".split(","))
System.out.println(segment);
Or you can capture repeatedly by calling Matcher.find() in a loop:
Pattern pattern = Pattern.compile("(\\d+),?");
for (Matcher m = pattern.matcher("1,2,3,4,5"); m.find();)
    System.out.println(m.group(1));
For the second example you added, you can do something similar:
for (String segment : "!!!!![1,2,3,4,5] //"
        .replaceFirst("^\\D*(\\d+(?:,\\d+)*)\\D*$", "$1")
        .split(","))
    System.out.println(segment);
I made an online code demo. I hope this is what you wanted.
How can I get all the matches (zero, one or more) for an arbitrary group with an asterisk (xyz)*? [The group is repeated and I would like to get every repeated capture.]
No, you cannot. Regex Capture Groups and Back-References tells why:
The Returned Value for a Given Group is the Last One Captured
Since a capture group with a quantifier holds on to its number, what value does the engine return when you inspect the group? All engines return the last value captured. For instance, if you match the string A_B_C_D_ with ([A-Z]_)+, when you inspect the match, Group 1 will be D_. With the exception of the .NET engine, all intermediate values are lost. In essence, Group 1 gets overwritten each time its pattern is matched.
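The quoted behaviour is easy to check in Java. A small sketch, reusing the A_B_C_D_ example from the quote (the class name LastCapture is made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastCapture {
    /** Returns what group 1 holds after matching ([A-Z]_)+ against the whole text, or null. */
    static String lastCapture(String text) {
        Matcher m = Pattern.compile("([A-Z]_)+").matcher(text);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Only the final repetition survives; A_, B_ and C_ were overwritten.
        System.out.println(lastCapture("A_B_C_D_")); // D_
    }
}
```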
I assume you may be looking for something like the following, which handles both of your examples.
private static final String LIST_REGEX = "^\\[?(\\d+(?:,\\d+)*)\\]?$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);
public static void main(String[] args) {
final String list = "[1,2,3,4,5]";
final Matcher matcher = LIST_PATTERN.matcher(list);
matcher.find();
int i = 0;
String[] vals = matcher.group(1).split(",");
System.out.println(matcher.matches());
System.out.println(i + "\t" + matcher.group(1));
for (String x : vals) {
i++;
System.out.println(i + "\t" + x);
}
}
Output
true
0 1,2,3,4,5
1 1
2 2
3 3
4 4
5 5
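Another way to get every entry, sketched here under the assumption that the list is always bracketed as in the [1,2,3,4,5] example, is to validate the whole string with one pattern and then walk over the entries with a second, simpler one. The names ListEntries, LIST and ENTRY are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListEntries {
    // Validates the overall shape; the repeated group is not read afterwards.
    private static final Pattern LIST = Pattern.compile("^\\[(\\d+)(,\\d+)*\\]$");
    // Finds one entry at a time.
    private static final Pattern ENTRY = Pattern.compile("\\d+");

    /** Returns every number in a valid list, or an empty list if the input is malformed. */
    static List<String> entries(String list) {
        List<String> result = new ArrayList<>();
        if (!LIST.matcher(list).matches()) {
            return result;
        }
        Matcher m = ENTRY.matcher(list);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(entries("[1,2,3,4,5]")); // [1, 2, 3, 4, 5]
    }
}
```

This keeps the strict validation of the original regex while sidestepping the fact that a repeated group only remembers its last capture.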

How does String.split work?

Why does the following code return the output below?
I would expect that 2 and 3 provide the same string splitting of 1.
Log.d(TAG, " 1 ---------------------------");
String originalText = "hello. .hello1";
Pattern p = Pattern.compile("[a-zA-Z]+|\\s|\\W|\\d");
Matcher m = p.matcher(originalText);
while (m.find()) {
Log.d(TAG, m.group(0));
}
Log.d(TAG, "2 --------------------------- " + originalText);
String [] scrollString = p.split(originalText);
int i;
for (i=0; i<scrollString.length; i++)
Log.d(TAG, scrollString[i]);
Log.d(TAG, "3 --------------------------- " + originalText);
scrollString = originalText.split("[a-zA-Z]+|\\s|\\W|\\d");
for (i=0; i<scrollString.length; i++)
Log.d(TAG, scrollString[i]);
OUTPUT:
1 ---------------------------
hello
.
.
hello
1
2 ---------------------------
3 ---------------------------
No. 1 will find the pattern and return that, whereas No. 2 and 3 will return the text in between the found pattern (which serves as the delimiter in those cases).
Your subject doesn't match what you are asking.
The subject asks about String.split(), but your code calls Pattern.split(). Which one do you really want help with?
When using String.split(), you pass in the regular expression to apply to the string, not the string you want to split!
JavaDoc for String.split();
final String s = "this is the string I want to split";
final String[] sa = s.split(" ");
You are calling .split on p (Pattern.split()):
Pattern p = Pattern.compile("[a-zA-Z]+|\\s|\\W|\\d");
String [] scrollString = p.split(originalText);
These two methods are called differently, but String.split(regex) is documented to yield the same result as Pattern.compile(regex).split(str).
The split() methods don't add the matched part of the string (the delimiter) to the result array.
If you want the delimiters, you'll have to use lookahead and lookbehind (or use version 1).
No. Every character in your string is covered by the split pattern, hence taken as something you don't want. Therefore, you get the empty result.
You can imagine that your pattern first finds "hello", then split hopes to find something, but alas!, it finds another "separation" character.
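The difference between finding and splitting can be checked with a short sketch that uses the pattern and input from the question (the class name SplitDemo and its helper methods are hypothetical). Every character belongs to some delimiter, so split() only produces empty strings, and with the default limit the trailing empty strings are removed from the result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitDemo {
    static final String REGEX = "[a-zA-Z]+|\\s|\\W|\\d";

    /** Number of matches that find() reports, i.e. the tokens printed in section 1. */
    static int tokenCount(String text) {
        Matcher m = Pattern.compile(REGEX).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** Number of pieces from String.split with the default limit. */
    static int pieces(String text) {
        return text.split(REGEX).length;
    }

    /** Same split through Pattern.split. */
    static int piecesViaPattern(String text) {
        return Pattern.compile(REGEX).split(text).length;
    }

    /** A negative limit keeps the empty strings instead of discarding the trailing ones. */
    static int piecesKeepingEmpty(String text) {
        return text.split(REGEX, -1).length;
    }

    public static void main(String[] args) {
        String text = "hello. .hello1";
        System.out.println(tokenCount(text));         // 6 delimiters found
        System.out.println(pieces(text));             // 0: nothing but empty strings, all dropped
        System.out.println(piecesViaPattern(text));   // 0 as well
        System.out.println(piecesKeepingEmpty(text)); // 7 empty strings around the 6 delimiters
    }
}
```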
