What regex should I use here? - java

I am working on some socket programming stuff and attempting to match some strings. The format is as follows:
1.) Some text
where the one represents any number, and some text refers to anything (including letters, numbers, quotation marks, etc).
I tried using [0-9]*\\.\\).* but it doesn't return a match. What am I doing wrong and how do I fix it?
Edit
As requested, here is my code:
/** Parses data returned by the server */
public void getSocketData(String data) {
String[] lines = data.split("\\r?\\n");
this.fileHosts = new String[lines.length];
Pattern p = Pattern.compile("[0-9]*\\.\\).*");
for (int i = 0; i < lines.length; i++) {
String line = lines[i];
if (p.matcher(line).matches()) {
//The format is: 1.) "b.jpg" from "192.168.1.101:40000"
String[] info = line.split("\"");
this.fileHosts[i] = info[3]; //this should now contain <addr:port>
System.out.println("Adding " + fileHosts[i] + " to fileHosts");
}
else {
System.out.println("No Match!");
}
}
}//getSocketData

This works for me:
public static void main(String args[]) {
String s = "1.) Some text";
System.out.println(s.replaceFirst("^[0-9]+\\.\\).*$","matched"));
}
Output:
matched
EDIT: Same result with the following:
String s = "1.) \"b.jpg\" from \"192.168.1.101:40000\"";
That is the example from the comment in your code.
EDIT2: I also tried your code:
String s = "1.) \"b.jpg\" from \"192.168.1.101:40000\"";
Pattern p = Pattern.compile("^[0-9]+\\.\\).*$"); // works also if you use * instead of +
if (p.matcher(s).matches()) {
System.out.println("match");
}
else {
System.out.println("No Match!");
}
The result is
match
Try using this regex: ^[0-9]+\\.\\).*$
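If the pattern still fails against the real socket data, one possible cause (an assumption, since the raw server output isn't shown) is stray whitespace around each line. The following is a minimal sketch that tolerates such whitespace and pulls the address out with capture groups instead of a second split. The class and method names are hypothetical, and the pattern assumes the `1.) "file" from "addr:port"` layout from the comment in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HostLineParser {
    // Group 1 = quoted file name, group 2 = quoted addr:port.
    // The \s* at both ends tolerates leading/trailing whitespace.
    private static final Pattern LINE = Pattern.compile(
        "^\\s*[0-9]+\\.\\)\\s*\"([^\"]*)\"\\s+from\\s+\"([^\"]*)\"\\s*$");

    /** Returns the addr:port part, or null if the line has a different shape. */
    static String extractHost(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractHost("1.) \"b.jpg\" from \"192.168.1.101:40000\""));
        System.out.println(extractHost("  12.) \"a.png\" from \"10.0.0.5:8080\" "));
        System.out.println(extractHost("no match here"));
    }
}
```

Using `[0-9]+` rather than `[0-9]*` also rejects lines that start with `.)` and no number at all.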

Related

Split string into list of substrings of different character types

I am writing a spell checker that takes a text file as input and outputs the file with spelling corrected.
The program should preserve formatting and punctuation.
I want to split the input text into a list of string tokens such that each token is a run of one or more characters of a single type: word, punctuation, whitespace, or digit.
For example:
Input:
words.txt:
asdf don't ]'.'..;'' as12....asdf.
asdf
Input as list:
["asdf" , " " , "don't" , " " , "]'.'..;''" , " " , "as" , "12" ,
"...." , "asdf" , "." , "\n" , "asdf"]
Words like won't and i'll should be treated as a single token.
Having the data in this format would allow me to process the tokens like so:
String output = "";
for(String token : tokens) {
if(isWord(token)) {
if(!inDictionary(token)) {
token = correctSpelling(token);
}
}
output += token;
}
So my main question is: how can I split a string of text into a list of substrings as described above? Thank you.
The main difficulty here is finding a regex that matches what you consider to be a "word". In my example, I consider ' to be part of a word if it is preceded by a letter or if the following character is a letter:
public static void main(String[] args) {
String in = "asdf don't ]'.'..;'' as12....asdf.\nasdf";
//The pattern:
Pattern p = Pattern.compile("[\\p{Alpha}][\\p{Alpha}']*|'[\\p{Alpha}]+");
Matcher m = p.matcher(in);
//If you want to collect the words
List<String> words = new ArrayList<String>();
StringBuilder result = new StringBuilder();
//Now find something from the start
int pos = 0;
while(m.find(pos)) {
//Add everything from starting position to beginning of word
result.append(in.substring(pos, m.start()));
//Handle dictionary logic
String token = m.group();
words.add(token); //not used actually
if(!inDictionary(token)) {
token = correctSpelling(token);
}
//Add to result
result.append(token);
//Repeat from end position
pos = m.end();
}
//Append remainder of input
result.append(in.substring(pos));
System.out.println("Result: " + result.toString());
}
Because I like solving puzzles, I tried the following and I think it works fine:
public class MyTokenizer {
private final String str;
private int pos = 0;
public MyTokenizer(String str) {
this.str = str;
}
public boolean hasNext() {
return pos < str.length();
}
public String next() {
int type = getType(str.charAt(pos));
StringBuilder sb = new StringBuilder();
while(hasNext() && (str.charAt(pos) == '\'' || type == getType(str.charAt(pos)))) {
sb.append(str.charAt(pos));
pos++;
}
return sb.toString();
}
private int getType(char c) {
String sc = Character.toString(c);
if (sc.matches("\\d")) {
return 0;
}
else if (sc.matches("\\w")) {
return 1;
}
else if (sc.matches("\\s")) {
return 2;
}
else if (sc.matches("\\p{Punct}")) {
return 3;
}
else {
return 4;
}
}
public static void main(String... args) {
MyTokenizer mt = new MyTokenizer("asdf don't ]'.'..;'' as12....asdf.\nasdf");
while(mt.hasNext()) {
System.out.println(mt.next());
}
}
}
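For comparison, the exact token list from the question can also be produced with a single alternation pattern and `Matcher.find()`. This is only a sketch under stated assumptions: the class name is hypothetical, "word" means ASCII letters with optional internal apostrophes, and any character that is not a letter, digit, or whitespace is lumped together as punctuation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTokenizer {
    // Alternatives, tried left to right:
    //   1. letters, optionally joined by apostrophes (don't, i'll)
    //   2. a run of digits
    //   3. a run of whitespace
    //   4. a run of anything else (treated as punctuation)
    private static final Pattern TOKEN = Pattern.compile(
        "\\p{Alpha}+(?:'\\p{Alpha}+)*|\\d+|\\s+|[^\\p{Alpha}\\d\\s]+");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Brackets make whitespace tokens visible in the output.
        for (String t : tokenize("asdf don't ]'.'..;'' as12....asdf.\nasdf")) {
            System.out.println("[" + t + "]");
        }
    }
}
```

Because every character falls into exactly one of the four alternatives, concatenating the tokens reproduces the input unchanged, which is what the spell checker needs to preserve formatting.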

How to split a file name to remove a series number in Java?

I have a list of files in this form:
name_of_file_001.csv
name_of_file_002.csv
name_of_file_123.csv
or
name_of_file.csv
second_name_of_file.csv
I don't know in advance whether a file name ends with a number such as 001.
How can I extract just the base name (only name_of_file) in Java?
Try the following:
int i=0;
while(fullName.charAt(i) != '.' && fullName.charAt(i) != '0'){ // char is a primitive, so compare with != rather than equals()
i++;
}
String name=fullName.substring(0, i);
This takes the substring from the beginning of fullName up to the first occurrence of . or 0.
EDIT:
Following the comments, to handle numbers that start with a digit other than 0 (and inspired by this answer):
int i=0;
String patternStr = "[0-9.]"; // inside a character class the dot needs no escaping
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(fullName);
if(matcher.find()){
i=matcher.start(); //this will give you the first index of the regex
}
String name=fullName.substring(0, i);
EDIT2:
For the case where there is no extension and fullName doesn't match the regex (there are no digits):
if(matcher.find()){
i=matcher.start(); //this will give you the first index of the regex
}else {
i=fullName.length();
}
String name=fullName.substring(0, i);
In other words, we simply take the whole name.
I modified chsdk's solution with respect to mmxx's comment:
int i=0;
while(i < fullName.length() && ".0123456789".indexOf(fullName.charAt(i)) == -1) {
i++;
}
String name=fullName.substring(0, i);
EDIT:
Added
i < fullName.length()
This little class solves the problem for all the examples shown in main:
public class Example {
private static boolean isNaturalNumber(String str)
{
return str.matches("\\d+(\\.\\d+)?");
}
public static String getFileName(String s) {
String fn = s.split("\\.")[0];
int idx = fn.lastIndexOf('_');
if (idx < 0) {
return fn;
}
String lastPart = fn.substring(idx+1);
System.out.println("last part = " + lastPart);
if (isNaturalNumber(lastPart)) {
return fn.substring(0,idx);
} else {
return fn;
}
}
public static void main(String []args){
System.out.println(getFileName("file_name_001.csv"));
System.out.println(getFileName("file_name_1234.csv"));
System.out.println(getFileName("file_name.csv"));
System.out.println(getFileName("file_name"));
System.out.println(getFileName("file"));
}
}
EDIT 1: Replaced the exception-based check with a regex check.
EDIT 2: Handling file names without any underscores.
I solved the problem this way:
nameOfFile.split("\\.")[0].replaceAll("_[0-9]+$","");
split("\\.")[0] removes ".csv": name_of_file_001.csv => name_of_file_001
.replaceAll("_[0-9]+$","") removes a trailing "_001" if there is one: name_of_file_001 => name_of_file
The result is just the name of the file.
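The same idea can be folded into one anchored pattern. This is a minimal sketch, assuming the counter is always an underscore followed by digits directly before the extension; `BaseName` and `baseName` are hypothetical names, not part of any answer above.

```java
final class BaseName {
    // Removes an optional "_<digits>" counter and an optional ".ext" suffix,
    // but only at the very end of the name (the $ anchor).
    static String baseName(String fileName) {
        return fileName.replaceFirst("(?:_\\d+)?(?:\\.[^.]*)?$", "");
    }

    public static void main(String[] args) {
        System.out.println(baseName("name_of_file_001.csv"));
        System.out.println(baseName("name_of_file_123.csv"));
        System.out.println(baseName("name_of_file.csv"));
        System.out.println(baseName("second_name_of_file.csv"));
        System.out.println(baseName("file"));
    }
}
```

Anchoring with `$` matters: a pattern such as `_[0-9]*` without an anchor also matches a bare underscore, so it would strip every underscore in the name. Note that this sketch only removes the last extension, so `archive.tar.gz` becomes `archive.tar`, unlike the `split("\\.")[0]` approach.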

RegEx for dividing complex number String in Java

Looking for a Regular Expression in Java to separate a String that represents complex numbers. A code sample would be great.
The input string will be in the form:
"a+bi"
Example: "300+400i", "4+2i", "5000+324i".
I need to retrieve 300 & 400 from the String.
I know we can do it crudely in this way.
str.substring(0, str.indexOf('+'));
str.substring(str.indexOf('+')+1,str.indexOf("i"));
What about using String.split(regex) function:
String s[] = "300-400i".split("[\\Q+-\\Ei]");
System.out.println(s[0]+" "+s[1]); //prints 300 400
Regex that matches this is: /[0-9]{1,}[+-][0-9]{1,}i/
You can use this method:
Pattern complexNumberPattern = Pattern.compile("[0-9]{1,}");
Matcher complexNumberMatcher = complexNumberPattern.matcher(myString);
and use find and group methods on complexNumberMatcher to retrieve numbers from myString
Use this one:
[0-9]{1,}
It'll return the numbers.
Hope it helps.
Regex
([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)\s*\+\s*([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)i
Example Regex
http://rubular.com/r/FfOAt1zk0v
Example C#
string regexPattern =
// Match any float, negative or positive, group it
@"([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)" +
// ... possibly following that with whitespace
@"\s*" +
// ... followed by a plus
@"\+" +
// and possibly more whitespace:
@"\s*" +
// Match any other float, and save it
@"([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)" +
// ... followed by 'i'
@"i";
Regex regex = new Regex(regexPattern);
Console.WriteLine("Regex used: " + regex);
while (true)
{
Console.WriteLine("Write a number: ");
string imgNumber = Console.ReadLine();
Match match = regex.Match(imgNumber);
double real = double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
double img = double.Parse(match.Groups[2].Value, CultureInfo.InvariantCulture);
Console.WriteLine("RealPart={0};Imaginary part={1}", real, img);
}
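Since the question asks for Java, here is a rough Java sketch built around the same regular expression. The class name `ComplexParser` and the choice to return a `double[]` (or `null` on a mismatch) are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ComplexParser {
    // One float, positive or negative, captured as a group.
    private static final String NUM = "([-+]?\\d+\\.?\\d*|[-+]?\\d*\\.?\\d+)";
    // float, optional whitespace, a plus, optional whitespace, float, 'i'
    private static final Pattern COMPLEX =
        Pattern.compile(NUM + "\\s*\\+\\s*" + NUM + "i");

    /** Returns {real, imaginary}, or null when the input is not of the form a+bi. */
    static double[] parse(String s) {
        Matcher m = COMPLEX.matcher(s.trim());
        if (!m.matches()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        double[] c = parse("300+400i");
        System.out.println("RealPart=" + c[0] + ";Imaginary part=" + c[1]);
    }
}
```

As with the pattern it is based on, the two parts must be joined by a plus sign, so `300-400i` is rejected while `300+-400i` is accepted.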
Try this one. It works for me.
public static void main(String[] args) {
String[] attempts = new String[]{"300+400i", "4i+2", "5000-324i", "555", "2i", "+400", "-i"};
for (String s : attempts) {
System.out.println("Parsing\t" + s);
printComplex(s);
}
}
static void printComplex(String in) {
String[] parts = in.split("[+-]");
int re = 0, im = 0, pos = -1;
for (String s : parts) {
if (pos != -1) {
s = in.charAt(pos) + s;
} else {
pos = 0;
if ("".equals(s)) {
continue;
}
}
pos += s.length();
if (s.lastIndexOf('i') == -1) {
if (!"+".equals(s) && !"-".equals(s)) {
re += Integer.parseInt(s);
}
} else {
s = s.replace("i", "");
if ("+".equals(s)) {
im++;
} else if ("-".equals(s)) {
im--;
} else {
im += Integer.parseInt(s);
}
}
}
System.out.println("Re:\t" + re + "\nIm:\t" + im);
}
Output:
Parsing 300+400i
Re: 300
Im: 400
Parsing 4i+2
Re: 2
Im: 4
Parsing 5000-324i
Re: 5000
Im: -324
Parsing 555
Re: 555
Im: 0
Parsing 2i
Re: 0
Im: 2
In theory you could use something like this:
Pattern complexNumberPattern = Pattern.compile("(.*)\\+(.*)"); // the + must be escaped, otherwise it is a quantifier
Matcher complexNumberMatcher = complexNumberPattern.matcher(myString);
if (complexNumberMatcher.matches()) {
String prePlus = complexNumberMatcher.group(1);
String postPlus = complexNumberMatcher.group(2);
}
The advantage this would give you over selecting the numbers, is that it would allow you to read things like:
5b+17c as 5b and 17c
edit: just noticed you didn't want the letters, so never mind, but this would give you more control over it in case other letters appear in it.

Replace every match but the first

I need a regular expression to do the following:
I have this String: 123.45.678.7 and I need to remove every (.) character except the first. The result should be 123.456787
Can anyone help me please?
System.out.println(
"123.45.678.7".replaceAll("\\G((?!^).*?|[^\\.]*\\..*?)\\.", "$1"));
123.456787
This can also be done without a regular expression:
String str = "123.45.678.7";
String[] arr = str.split("\\.", 2);
System.out.println(arr[0] + "." + arr[1].replace(".", ""));
123.456787
This code matches all of the periods with a regular expression, then puts the first decimal point back in the String.
Here are the test results:
123.45.678.7, 123.456787
And here's the code.
public class SecondMatch {
public String match(String s) {
StringBuilder builder = new StringBuilder();
String[] parts = s.split("\\.");
for (int i = 0; i < parts.length; i++) {
builder.append(parts[i]);
if (i == 0) {
builder.append(".");
}
}
return builder.toString();
}
public static void main(String[] args) {
String s = "123.45.678.7";
String t = new SecondMatch().match(s);
System.out.println(s + ", " + t);
}
}
Just create a function...
function removeAllButFirst(myParam)
{
var numReplaces = 0;
var retVal = myParam.replace(/\./g, function(allMatch, currMatch) {
return ++numReplaces==1 ? '.' : ''; });
return retVal;
}
I just had a rummage and found this on the net. It's fairly crude, but would do the job (note that replaceFirst and replaceAll take regular expressions, so both . and $ must be escaped):
retVal = retVal.replaceFirst("\\.", "\\$");
retVal = retVal.replaceAll("\\.", "");
retVal = retVal.replaceFirst("\\$", ".");
This assumes you don't have any $ characters in your input; if you do, pick a different placeholder character.
It's not great, and there is probably a better single regex in Java using something like lookbehinds, but I'm not a Java dev so I couldn't say.
This regex should also work:
String str = "123.45.678.9";
String repl = str.replaceAll("([^.]*\\.[^.]*|[^.]*)\\.", "$1");
// repl = 123.456789
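For inputs with an arbitrary number of periods, a plain `indexOf` split avoids having to reason about regex alternation at all. This is a minimal sketch; `DotCollapser` and `keepFirstDot` are hypothetical names.

```java
final class DotCollapser {
    /** Keeps the first '.' and removes every later one. */
    static String keepFirstDot(String s) {
        int first = s.indexOf('.');
        if (first < 0) {
            return s; // no period at all, nothing to do
        }
        // Head up to and including the first period, then the tail with periods removed.
        // String.replace works on literal text, so "." needs no escaping here.
        return s.substring(0, first + 1) + s.substring(first + 1).replace(".", "");
    }

    public static void main(String[] args) {
        System.out.println(keepFirstDot("123.45.678.7"));
        System.out.println(keepFirstDot("1.2.3.4.5"));
        System.out.println(keepFirstDot("12345"));
    }
}
```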

java scanner - how to split a mac address?

The mac address string may be in format:
00:aa:bb:cc:dd:ee
or
00aabbccddee
I need a good way to retrieve the 6 parts.
Here is my code:
public class Mac
{
public static void main(String[] args)
{
String mac = "00:aa:bb:cc:dd:ee"; /* 00aabbccddee */
Scanner s = new Scanner(mac);
s.useDelimiter(":?"); /* zero or one occurrence */
String t = null;
while ((t = s.next("[0-9a-f][0-9a-f]")) != null)
System.out.println(t);
}
}
It throws an exception:
Exception in thread "main" java.util.InputMismatchException
at java.util.Scanner.throwFor(Scanner.java:840)
at java.util.Scanner.next(Scanner.java:1461)
at java.util.Scanner.next(Scanner.java:1394)
at Mac.main(Mac.java:11)
What's wrong with it?
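public static String[] getMacAddressParts(String macAddress) {
String[] parts = macAddress.split(":");
// Without colons, split() returns the whole string as a single element
if (parts.length == 1) {
parts = new String[6];
for (int i = 0; i < 6; i++) {
// substring's end index is exclusive, so take two characters
parts[i] = macAddress.substring(i * 2, i * 2 + 2);
}
}
return parts;
}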
String[] splitMac(String mac) {
String[] parts = null;
if (mac.length() == 6*3 - 1) { // six pairs plus five colons
parts = mac.split(":");
} else if (mac.length() == 6*2) {
parts = new String[6];
// substring's end index is exclusive
parts[0] = mac.substring(0,2);
parts[1] = mac.substring(2,4);
parts[2] = mac.substring(4,6);
parts[3] = mac.substring(6,8);
parts[4] = mac.substring(8,10);
parts[5] = mac.substring(10,12);
} else {
throw new RuntimeException("Invalid arg for mac addr: " + mac);
}
return parts;
}
A delimiter that matches zero or one occurrence (:?) also matches the empty string, so the scanner splits the input at every character and next() yields single-character tokens like this:
a
a
b
b
c
c
d
d
e
e
because the pattern tells the scanner to split whether or not it finds a colon.
Asking for the next token that matches "[0-9a-f][0-9a-f]" therefore throws InputMismatchException: each token is a single character, which can never match a two-character pattern. Checking with hasNext("Pattern") first prevents that.
Also, your code will throw a NoSuchElementException once next() runs out of tokens, because next() never returns null; check for a next token with hasNext("Pattern") in the while condition instead.
So just remove the ? from the delimiter and it will work for the colon-separated format.
Example:
String mac = "00:aa:bb:cc:dd:ee"; /* 00aabbccddee */
Scanner s = new Scanner(mac);
s.useDelimiter(":"); /* split on ':' only */
String t = null;
while (s.hasNext("[0-9a-f][0-9a-f]"))
{
t = s.next("[0-9a-f][0-9a-f]");
System.out.println(t);
}
String mac = "00:aa:bb:cc:dd:ee";
String[] partsArray = mac.split(":");
for(int i = 0; i < partsArray.length; i++){
String part = partsArray[i];
System.out.println ("Split mac address part "+i+" is - "+part);
}
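To cover both formats from the question in one place, the input can be validated with a single pattern and then cut into pairs. This is a sketch under stated assumptions: `MacSplitter` is a hypothetical name, and accepting upper-case hex digits is an addition that the question does not ask for.

```java
import java.util.Arrays;

final class MacSplitter {
    /** Splits 00:aa:bb:cc:dd:ee or 00aabbccddee into its six two-character parts. */
    static String[] split(String mac) {
        // Either six hex pairs separated by colons, or twelve hex digits in a row.
        if (!mac.matches("(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}|[0-9a-fA-F]{12}")) {
            throw new IllegalArgumentException("Invalid MAC address: " + mac);
        }
        String hex = mac.replace(":", ""); // now always 12 hex digits
        String[] parts = new String[6];
        for (int i = 0; i < 6; i++) {
            parts[i] = hex.substring(2 * i, 2 * i + 2);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("00:aa:bb:cc:dd:ee")));
        System.out.println(Arrays.toString(split("00aabbccddee")));
    }
}
```

Validating before stripping the colons rejects mixed inputs such as `00:aabb:cc:dd:ee`, which a bare `replace(":", "")` would silently accept.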
