Right now, using Java, I just want it to be able to tokenize any string of integers to an array
input = 1dsa23f hj23nma9123
array = 1,23,23,9123;
I have been trying a few different ways to do it, string.matches("") and then tokenising after it's in the right format and what not but it is too limiting to the user.
It looks like you are looking for something like
String[] nums = text.split("\\D+");
\D regex is negation of \d (it is like [^\d]) which means \D+ will match one or more non-digits.
Only problem with this solution is that if your text start with non-digits result array will start with one empty string.
If you still want to use split then you can simply remove that non-digits part from start of your text.
String[] nums = text.replaceFirst("^\\D+","").split("\\D+");
Other approach than split which is focusing on finding delimiters would be focusing on finding parts which are interesting to us. So instead of searching for non-digits lets find digits.
We can do it in few ways like Patter/Matcher#find, or with Scanner. Problem here is that these approaches don't return array but single elements which you would need to store in some resizeable structure like List.
So solution using Pattern and Matcher could look like:
List<String> numbers = new ArrayList<>();
Matcher m = Pattern.compile("\\d+").matcher(yourText);
while(m.find()){
numbers.add(m.group());
}
Solution using Scanner is similar, we just need to set proper delimiter (to non-digit) and read everything which is not delimiter (delimiters at start of text will be ignored which will should prevent returning empty strings).
List<String> nums = new ArrayList<>();
Scanner sc = new Scanner(yourText);
sc.useDelimiter("\\D+");
while(sc.hasNext()){
nums.add(sc.next());
}
final String input = "1dsa23f hj23nma9123";
final String[] parts = input.split("[^0-9]+");
for (final String s: parts) {
final int i = Integer.parseInt(s);
}
Related
I have a string that is read in pairs, separated by comma. However, I do not always want to split at the comma because there is not always 1 comma in the input. For example, the string,
(http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b
,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6,file:///tmp/foo/bar/p,d,f.pdf)
Is read in all one line. For this case, I only want to split at the ,h, and no where else in the string. Essentially, after the split, the strings should be:
http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b
http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6
file:///tmp/foo/bar/p,d,f.pdf
Maintaining the order of the comma in the first string. (I will get rid of parenthesis). I have looked at this stack overflow question, and while helpful, does not correctly split this string. This is in Java. Any help is appreciated.
You can use regex to do the split. Please see below code snippet.
String str = "(http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6)";
String[] strArr = str.split("(,(?=http))");
You will have Array of all the value which would be possible according to your requirement.
Split on 'http' then re-add it.
Psuedo-code
String input = "http://www.wolframalpha.com/input/?i=103%2F30+%3D+4a-3b,+71%2F60+%3D+a+%2B+b
,http://www.wolframalpha.com/input/?i=x%5E2%2B5x%2B6"
List<String> split = input.split('http');
List<String> finalList = new ArrayList<String>();
for(String fixup in split)
{
finalList.put( "http" + fixup );
}
Final should contain the two URLs.
I understand splitting a string in Java is something like this:
String[] result = userCalcInput.split("[-+*/]");
But i want to split say the following expressions.
2+2
222*2
2/2+6
120+9/4+22
29*2+9
I want the user to be able to enter anything such as the above or their own choice as a string and then i can split this on the operator and push into the different lists. code so far below:
List<Integer> numberList = new ArrayList<Integer>();
List<String> operatorList = new ArrayList<String>();
Scanner userInputNoOne = new Scanner( System.in );
System.out.println("Enter your calculation: ");
userCalcInput = userInputNoOne.next();
String[] result = userCalcInput.split("[-+*/]");
if(userCalcInput = int){
//code here
}
else if (userCalcInput = NaN){
//code here
}
System.out.println(result);
I understand this wont work but just wanted to see if right idea and how to properly implement it.
thanks
I understand you are quite new to Java.
You could use regular expressions.
Regular expressions allow you to find specific patterns in a string, in that case you want to look for numbers and operators, which are defined separately in a 'regex' : \d for a digit and \W for a non-word character, such as an arithmetic operator.
You should try using the Pattern and Matcher classes
If I have a file with the following content:
11:17 GET this is my content #2013
11:18 GET this is my content #2014
11:19 GET this is my content #2015
How can I use a Scanner and ignore certain parts of a `String line = scanner.nextLine();?
The result that I like to have would be:
this is my content
this is my content
this is my content
So I'd like to trip everything from the start until GET, and then take everything until the # char.
How could this easily be done?
You can use the String.indexOf(String str) and String.indexOf(char ch) methods. For example:
String line = scanner.nextLine();
int start = line.indexOf("GET");
int end = line.indexOf('#');
String result = line.substring(start + 4, end);
One way might be
String strippedStart = scanner.nextLine().split(" ", 3)[2];
String result = strippedStart.substring(0, strippedStart.lastIndexOf("#")).trim();
This assumes the are always two space separated tokens at the beginning (11:22 GET or 11:33 POST, idk).
You could do something like this:-
String line ="11:17 GET this is my content #2013";
int startIndex = line.indexOf("GET ");
int endIndex = line.indexOf("#");
line = line.substring(startIndex+4, endIndex-1);
System.out.println(line);
In my opinion the best solution for your problem would be using Java regex. Using regex you can define which group or groups of text you want to retrieve and what kind of text comes where. I haven't been working with Java in a long time, so I'll try to help you out from the top of my head. I'll try to give you a point in the right direction.
First off, compile a pattern:
Pattern pattern = Pattern.compile("^\d{1,2}:\d{1,2} GET (.*?) #\d+$", Pattern.MULTILINE);
First part of the regex says that you expect one or two digits followed by a colon followed by one or two digits again. After that comes the GET (you can use GET|POST if you expect those words or \w+? if you expect any word). Then you define the group you want with the parentheses. Lastly, you put the hash and any number of digits with at least one digit. You might consider putting flags DOTALL and CASE_INSENSITIVE, although I don't think you'll be needing them.
Then you continue with the matcher:
Matcher matcher = pattern.matcher(textToParse);
while (matcher.find())
{
//extract groups here
String group = matcher.group(1);
}
In the while loop you can use matcher.group(1) to find the text in the group you selected with the parentheses (the text you'd like extracted). matcher.group(0) gives the entire find, which is not what you're currently looking for (I guess).
Sorry for any errors in the code, it has not been tested. Hope this puts you on the right track.
You can try this rather flexible solution:
Scanner s = new Scanner(new File("data"));
Pattern p = Pattern.compile("^(.+?)\\s+(.+?)\\s+(.*)\\s+(.+?)$");
Matcher m;
while (s.hasNextLine()) {
m = p.matcher(s.nextLine());
if (m.find()) {
System.out.println(m.group(3));
}
}
This piece of code ignores first, second and last words from every line before printing them.
Advantage is that it relies on whitespaces rather than specific string literals to perform the stripping.
i have a problem to build following regex:
[1,2,3,4]
i found a work-around, but i think its ugly
String stringIds = "[1,2,3,4]";
stringIds = stringIds.replaceAll("\\[", "");
stringIds = stringIds.replaceAll("\\]", "");
String[] ids = stringIds.split("\\,");
Can someone help me please to build one regex, which i can use in the split function
Thanks for help
edit:
i want to get from this string "[1,2,3,4]" to an array with 4 entries. the entries are the 4 numbers in the string, so i need to eliminate "[","]" and ",". the "," isn't the problem.
the first and last number contains [ or ]. so i needed the fix with replaceAll. But i think if i use in split a regex for ",", i also can pass a regex which eliminates "[" "]" too. But i cant figure out, who this regex should look like.
This is almost what you're looking for:
String q = "[1,2,3,4]";
String[] x = q.split("\\[|\\]|,");
The problem is that it produces an extra element at the beginning of the array due to the leading open bracket. You may not be able to do what you want with a single regex sans shenanigans. If you know the string always begins with an open bracket, you can remove it first.
The regex itself means "(split on) any open bracket, OR any closed bracket, OR any comma."
Punctuation characters frequently have additional meanings in regular expressions. The double leading backslashes... ugh, the first backslash tells the Java String parser that the next backslash is not a special character (example: \n is a newline...) so \\ means "I want an honest to God backslash". The next backslash tells the regexp engine that the next character ([ for example) is not a special regexp character. That makes me lol.
Maybe substring [ and ] from beginning and end, then split the rest by ,
String stringIds = "[1,2,3,4]";
String[] ids = stringIds.substring(1,stringIds.length()-1).split(",");
Looks to me like you're trying to make an array (not sure where you got 'regex' from; that means something different). In this case, you want:
String[] ids = {"1","2","3","4"};
If it's specifically an array of integer numbers you want, then instead use:
int[] ids = {1,2,3,4};
Your problem is not amenable to splitting by delimiter. It is much safer and more general to split by matching the integers themselves:
static String[] nums(String in) {
final Matcher m = Pattern.compile("\\d+").matcher(in);
final List<String> l = new ArrayList<String>();
while (m.find()) l.add(m.group());
return l.toArray(new String[l.size()]);
}
public static void main(String args[]) {
System.out.println(Arrays.toString(nums("[1, 2, 3, 4]")));
}
If the first line your code is following:
String stringIds = "[1,2,3,4]";
and you're trying to iterate over all number items, then the follwing code-frag only could work:
try {
Pattern regex = Pattern.compile("\\b(\\d+)\\b", Pattern.MULTILINE);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
for (int i = 1; i <= regexMatcher.groupCount(); i++) {
// matched text: regexMatcher.group(i)
// match start: regexMatcher.start(i)
// match end: regexMatcher.end(i)
}
}
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
}
I am trying to break apart a very simple collection of strings that come in the forms of
0|0
10|15
30|55
etc etc. Essentially numbers that are seperated by pipes.
When I use java's string split function with .split("|"). I get somewhat unpredictable results. white space in the first slot, sometimes the number itself isn't where I thought it should be.
Can anybody please help and give me advice on how I can use a reg exp to keep ONLY the integers?
I was asked to give the code trying to do the actual split. So allow me to do that in hopes to clarify further my problem :)
String temp = "0|0";
String splitString = temp.split("|");
results
\n
0
|
0
I am trying to get
0
0
only. Forever grateful for any help ahead of time :)
I still suggest to use split(), it skips null tokens by default. you want to get rid of non numeric characters in the string and only keep pipes and numbers, then you can easily use split() to get what you want. or you can pass multiple delimiters to split (in form of regex) and this should work:
String[] splited = yourString.split("[\\|\\s]+");
and the regex:
import java.util.regex.*;
Pattern pattern = Pattern.compile("\\d+(?=([\\|\\s\\r\\n]))");
Matcher matcher = pattern.matcher(yourString);
while (matcher.find()) {
System.out.println(matcher.group());
}
The pipe symbol is special in a regexp (it marks alternatives), you need to escape it. Depending on the java version you are using this could well explain your unpredictable results.
class t {
public static void main(String[]_)
{
String temp = "0|0";
String[] splitString = temp.split("\\|");
for (int i=0; i<splitString.length; i++)
System.out.println("splitString["+i+"] is " + splitString[i]);
}
}
outputs
splitString[0] is 0
splitString[1] is 0
Note that one backslash is the regexp escape character, but because a backslash is also the escape character in java source you need two of them to push the backslash into the regexp.
You can do replace white space for pipes and split it.
String test = "0|0 10|15 30|55";
test = test.replace(" ", "|");
String[] result = test.split("|");
Hope this helps for you..
You can use StringTokenizer.
String test = "0|0";
StringTokenizer st = new StringTokenizer(test);
int firstNumber = Integer.parseInt(st.nextToken()); //will parse out the first number
int secondNumber = Integer.parseInt(st.nextToken()); //will parse out the second number
Of course you can always nest this inside of a while loop if you have multiple strings.
Also, you need to import java.util.* for this to work.
The pipe ('|') is a special character in regular expressions. It needs to be "escaped" with a '\' character if you want to use it as a regular character, unfortunately '\' is a special character in Java so you need to do a kind of double escape maneuver e.g.
String temp = "0|0";
String[] splitStrings = temp.split("\\|");
The Guava library has a nice class Splitter which is a much more convenient alternative to String.split(). The advantages are that you can choose to split the string on specific characters (like '|'), or on specific strings, or with regexps, and you can choose what to do with the resulting parts (trim them, throw ayway empty parts etc.).
For example you can call
Iterable<String> parts = Spliter.on('|').trimResults().omitEmptyStrings().split("0|0")
This should work for you:
([0-9]+)
Considering a scenario where in we have read a line from csv or xls file in the form of string and need to separate the columns in array of string depending on delimiters.
Below is the code snippet to achieve this problem..
{ ...
....
String line = new BufferedReader(new FileReader("your file"));
String[] splittedString = StringSplitToArray(stringLine,"\"");
...
....
}
public static String[] StringSplitToArray(String stringToSplit, String delimiter)
{
StringBuffer token = new StringBuffer();
Vector tokens = new Vector();
char[] chars = stringToSplit.toCharArray();
for (int i=0; i 0) {
tokens.addElement(token.toString());
token.setLength(0);
i++;
}
} else {
token.append(chars[i]);
}
}
if (token.length() > 0) {
tokens.addElement(token.toString());
}
// convert the vector into an array
String[] preparedArray = new String[tokens.size()];
for (int i=0; i < preparedArray.length; i++) {
preparedArray[i] = (String)tokens.elementAt(i);
}
return preparedArray;
}
Above code snippet contains method call to StringSplitToArray where in the method converts the stringline into string array splitting the line depending on the delimiter specified or passed to the method. Delimiter can be comma separator(,) or double code(").
For more on this, follow this link : http://scrapillars.blogspot.in