Indent all lines after a certain character until another character - java

For an assignment we are creating a Java program that accepts a Java file, fixes messy code, and writes the result to a new file.
We are to assume there is at most one bracket ({ or }) per line and that each bracket occurs at the end of its line. If/else statements also use brackets.
I am having trouble finding a way to indent every line after an opening bracket until the next closing bracket, and then to decrease the indent after a closing bracket until the next opening bracket. We are also required to use the methods below:
Updated code a bit:
public static void processJavaFile() {
}
}

This algorithm should get you started. I left a few glitches that you'll have to fix.
(For example it doesn't indent your { brackets } as currently written, and it adds an extra newline for every semicolon)
The indentation is handled by a 'depth' counter which keeps track of how many 'tabs' to add.
Consider using an indexed for loop instead of a for-each loop if you want more control over each iteration. (I wrote this quick and dirty, just to give you an idea of how it might be done.)
public String parse(String input) {
StringBuilder output = new StringBuilder();
int depth = 0;
boolean isNewLine = false;
boolean wasSpaced = false;
boolean isQuotes = false;
String tab = " ";
for (char c : input.toCharArray()) {
switch (c) {
case '{':
output.append(c + "\n");
depth++;
isNewLine = true;
break;
case '}':
output.append("\n" + c);
depth--;
isNewLine = true;
break;
case '\n':
isNewLine = true;
break;
case ';':
output.append(c);
isNewLine = true;
break;
case '\'':
case '"':
if (!isQuotes) {
isQuotes = true;
} else {
isQuotes = false;
}
output.append(c);
break;
default:
if (c == ' ') {
if (!isQuotes) {
if (!wasSpaced) {
wasSpaced = true;
output.append(c);
}
} else {
output.append(c);
}
} else {
wasSpaced = false;
output.append(c);
}
break;
}
if (isNewLine) {
output.append('\n');
for (int i = 0; i < depth; i++) {
output.append(tab);
}
isNewLine = false;
}
}
return output.toString();
}
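Since the assignment guarantees at most one bracket per line, at the end of the line, the depth counter can also be applied line by line instead of character by character. The following is only a sketch under that assumption; the class name LineIndenter, the method indent and the four-space indent are made up for illustration and are not part of the assignment's required methods.

```java
import java.util.ArrayList;
import java.util.List;

class LineIndenter {
    // Hypothetical helper: re-indents already-split lines, assuming each
    // line holds at most one bracket and that it ends the line.
    static List<String> indent(List<String> lines) {
        List<String> out = new ArrayList<>();
        int depth = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                out.add("");
                continue;
            }
            // A closing bracket belongs to the outer level, so the depth
            // drops before the line is written (also covers "} else {").
            if (line.startsWith("}") || line.endsWith("}")) {
                depth = Math.max(0, depth - 1);
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                sb.append("    "); // assumed indent width: four spaces
            }
            out.add(sb.append(line).toString());
            // An opening bracket indents everything that follows it.
            if (line.endsWith("{")) {
                depth++;
            }
        }
        return out;
    }
}
```

Because the depth is decreased before a closing line is written and increased after an opening line is written, the brackets themselves end up at the outer level.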

Related

How do I implement the looping functionality in my BrainFuck Interpreter?

There's multiple questions here already, but I'll still proceed. This is a simple BrainFuck interpreter. I figured out all the other symbols, but I can't figure out how to implement loops. Can anyone help?
package com.lang.bfinterpreter;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import com.lang.exceptions.TapeSizeExceededException;
public class Interpreter {
private Interpreter() {
super();
}
private static String getCode(final String inputFile) throws IOException {
String code = "";
// store the entire code
final BufferedReader br = new BufferedReader(new FileReader(inputFile));
for (String line = br.readLine(); line != null; line = br.readLine()) {
code += line;
}
br.close();
return code;
}
public static void interpret(final String inputFile) throws IOException,TapeSizeExceededException,IndexOutOfBoundsException {
// get the program as a string
final String code = getCode(inputFile);
// create the Turing tape (static size)
Character[] tape = new Character[12000];
Integer indexPointer = 0;
for (int i = 0; i != 12000; i++) {
switch (code.toCharArray()[i]) {
case ',':
tape[indexPointer] = (char) System.in.read();
break;
case '.':
System.out.println(tape[indexPointer]);
break;
case '+':
tape[indexPointer]++;
break;
case '-':
tape[indexPointer]--;
break;
case '>':
if (indexPointer == 11999) {
throw new IndexOutOfBoundsException();
}
else {
indexPointer++;
}
break;
case '<':
if (indexPointer == 0) {
throw new IndexOutOfBoundsException();
}
else {
indexPointer--;
}
break;
case '[':
// I have a feeling I'll need stack to store nested loops
break;
case ']':
// I have a feeling I'll need stack to store nested loops
break;
default:
break;
}
}
}
}
I have a feeling that I will need to use Stack, but I just can't seem to figure out how. I have constructed expression evaluators before... will this require the same logic?
The most challenging part, I suppose, is finding the matching brackets. After you find where the matching bracket is, you can just check tape[indexPointer]'s value, and set i to the position after it, which should be rather easy to do.
Given an opening bracket at index i in code, to find its matching close bracket, you just need to go to the right of i in code. You start with a stack with a single [ in it - this is the [ at i. Every time you encounter a new [, you push it onto the stack. Every time you encounter a ], you pop a [ from the stack - the ] you encountered matches the [ you popped! When you pop the last [ from the stack (i.e. when the stack becomes empty), you know you have found the matching close bracket of the open bracket at i.
In code, you don't even need a Stack. You can just use an int to encode how many elements are in the stack - increment it when you push, decrement it when you pop.
private static int findMatchingCloseBracketAfterOpenBracket(char[] code, int openBracketIndex) {
// parameter validations omitted
int stack = 1;
for (int i = openBracketIndex + 1; i < code.length ; i++) {
if (code[i] == '[') {
stack++;
} else if (code[i] == ']') {
stack--;
}
if (stack == 0) {
return i;
}
}
return -1; // brackets not balanced!
}
To find the matching [ of a ], the idea is same, except you go the other direction, and reverse the push and pop actions.
private static int findMatchingOpenBracketBeforeCloseBracket(char[] code, int closeBracketIndex) {
// parameter validations omitted
int stack = 1;
for (int i = closeBracketIndex - 1; i >= 0 ; i--) {
if (code[i] == '[') {
stack--;
} else if (code[i] == ']') {
stack++;
}
if (stack == 0) {
return i;
}
}
return -1; // brackets not balanced!
}
(Refactoring the code duplication here is left as an exercise for the reader)
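One possible way to do that refactoring is to derive the scan direction from the bracket you start on. This is only a sketch; the class name BracketMatcher and the method findMatchingBracket are made up for illustration.

```java
class BracketMatcher {
    // Returns the index of the bracket matching code[start], or -1 if the
    // brackets are not balanced. Scans right from '[' and left from ']'.
    static int findMatchingBracket(char[] code, int start) {
        char own = code[start];
        if (own != '[' && own != ']') {
            throw new IllegalArgumentException("not a bracket at index " + start);
        }
        int step = (own == '[') ? 1 : -1;
        int depth = 0; // plays the role of the stack size
        for (int i = start; i >= 0 && i < code.length; i += step) {
            if (code[i] == own) {
                depth++;            // same kind of bracket: "push"
            } else if (code[i] == '[' || code[i] == ']') {
                depth--;            // opposite kind of bracket: "pop"
            }
            if (depth == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

The loop starts at the bracket itself, so the counter begins at 0 and reaches 1 on the first iteration; that is equivalent to starting at 1 and scanning from the neighbouring index.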
Updated: here's example code. Before the main execution loop you scan the whole program for matches and store them in an array:
Stack<Integer> stack = new Stack<>();
int[] targets = new int[code.length];
for (int i = 0, j; i < code.length; i++) {
if (code[i] == '[') {
stack.push(i);
} else if (code[i] == ']') {
if (stack.empty()) {
System.err.println("Unmatched ']' at byte " + (i + 1) + ".");
System.exit(1);
} else {
j = stack.pop();
targets[i]=j;
targets[j]=i;
}
}
}
if (!stack.empty()) {
System.err.println("Unmatched '[' at byte " + (stack.peek() + 1) + ".");
System.exit(1);
}
And then inside the main execution loop you just jump to the precomputed location:
case '[':
if (tape[indexPointer] == 0) {
i = targets[i];
}
break;
case ']':
if (tape[indexPointer] != 0) {
i = targets[i];
}
break;
(Note, we jump to the matching bracket, but the for loop will still autoincrement i as usual, so the next instruction that gets executed is the one after the matching bracket, as it should be.)
This is much faster than having to scan through a bunch of code looking for the matching bracket every time a bracket gets executed.
I notice also: you probably want to convert code into an array once, and not once per instruction you execute. You probably want to run your "for" loop while i < codelength, not 12000, and you also probably want to compute codelength only once.
Definitely '.' should output only one character, not add a newline as well. Also, 12000 bytes of array is too small. 30000 is the minimum, and much larger is better.
Good luck!
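Putting the pieces above together, a compact interpreter built around the precomputed jump table might look like the sketch below. It is not the asker's class: the name TinyBrainfuck, the run(program, input) signature, the byte tape and the "EOF reads as 0" convention are all assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyBrainfuck {
    // Runs 'program' with 'input' as stdin and returns everything printed.
    static String run(String program, String input) {
        char[] code = program.toCharArray(); // converted once, not per step

        // Pass 1: pair up the brackets.
        int[] targets = new int[code.length];
        Deque<Integer> open = new ArrayDeque<>();
        for (int i = 0; i < code.length; i++) {
            if (code[i] == '[') {
                open.push(i);
            } else if (code[i] == ']') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unmatched ']' at index " + i);
                }
                int j = open.pop();
                targets[i] = j;
                targets[j] = i;
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unmatched '[' at index " + open.peek());
        }

        // Pass 2: execute.
        byte[] tape = new byte[30000];
        int ptr = 0;
        int in = 0;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < code.length; i++) {
            switch (code[i]) {
                case '>': ptr++; break;
                case '<': ptr--; break;
                case '+': tape[ptr]++; break;
                case '-': tape[ptr]--; break;
                case '.': out.append((char) (tape[ptr] & 0xFF)); break;
                case ',': // assumption: end of input is stored as 0
                    tape[ptr] = (byte) (in < input.length() ? input.charAt(in++) : 0);
                    break;
                case '[': if (tape[ptr] == 0) i = targets[i]; break;
                case ']': if (tape[ptr] != 0) i = targets[i]; break;
                default: break; // every other character is a comment
            }
        }
        return out.toString();
    }
}
```

Moving the pointer off either end of the tape simply surfaces as an ArrayIndexOutOfBoundsException here; a real interpreter would probably want a friendlier error.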

Regexp not matching extension [duplicate]

Is there a standard (preferably Apache Commons or similarly non-viral) library for doing "glob" type matches in Java? When I had to do similar in Perl once, I just changed all the "." to "\.", the "*" to ".*" and the "?" to "." and that sort of thing, but I'm wondering if somebody has done the work for me.
Similar question: Create regex from glob expression
Globbing is also planned for Java 7.
See FileSystem.getPathMatcher(String) and the "Finding Files" tutorial.
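On Java 7 or later, a minimal use of that API could look like the sketch below. The class GlobDemo and the helper matchesGlob are made-up names; only FileSystems, PathMatcher and Paths come from the JDK.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobDemo {
    // Hypothetical helper: tests a file name against a glob using the
    // default file system's "glob:" syntax.
    static boolean matchesGlob(String glob, String name) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(name));
    }
}
```

Note that in this syntax a single * does not cross directory separators, so "*.java" matches "Foo.java" but not "src/Foo.java".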
There's nothing built-in, but it's pretty simple to convert something glob-like to a regex:
public static String createRegexFromGlob(String glob)
{
String out = "^";
for(int i = 0; i < glob.length(); ++i)
{
final char c = glob.charAt(i);
switch(c)
{
case '*': out += ".*"; break;
case '?': out += '.'; break;
case '.': out += "\\."; break;
case '\\': out += "\\\\"; break;
default: out += c;
}
}
out += '$';
return out;
}
this works for me, but I'm not sure if it covers the glob "standard", if there is one :)
Update by Paul Tomblin: I found a perl program that does glob conversion, and adapting it to Java I end up with:
private String convertGlobToRegEx(String line)
{
LOG.info("got line [" + line + "]");
line = line.trim();
int strLen = line.length();
StringBuilder sb = new StringBuilder(strLen);
// Remove beginning and ending * globs because they're useless
if (line.startsWith("*"))
{
line = line.substring(1);
strLen--;
}
if (line.endsWith("*"))
{
line = line.substring(0, strLen-1);
strLen--;
}
boolean escaping = false;
int inCurlies = 0;
for (char currentChar : line.toCharArray())
{
switch (currentChar)
{
case '*':
if (escaping)
sb.append("\\*");
else
sb.append(".*");
escaping = false;
break;
case '?':
if (escaping)
sb.append("\\?");
else
sb.append('.');
escaping = false;
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '#':
case '%':
sb.append('\\');
sb.append(currentChar);
escaping = false;
break;
case '\\':
if (escaping)
{
sb.append("\\\\");
escaping = false;
}
else
escaping = true;
break;
case '{':
if (escaping)
{
sb.append("\\{");
}
else
{
sb.append('(');
inCurlies++;
}
escaping = false;
break;
case '}':
if (inCurlies > 0 && !escaping)
{
sb.append(')');
inCurlies--;
}
else if (escaping)
sb.append("\\}");
else
sb.append("}");
escaping = false;
break;
case ',':
if (inCurlies > 0 && !escaping)
{
sb.append('|');
}
else if (escaping)
sb.append("\\,");
else
sb.append(",");
break;
default:
escaping = false;
sb.append(currentChar);
}
}
return sb.toString();
}
I'm editing into this answer rather than making my own because this answer put me on the right track.
Thanks to everyone here for their contributions. I wrote a more comprehensive conversion than any of the previous answers:
/**
 * Converts a standard POSIX Shell globbing pattern into a regular expression
 * pattern. The result can be used with the standard {@link java.util.regex} API to
 * recognize strings which match the glob pattern.
 * <p/>
 * See also, the POSIX Shell language:
 * http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13_01
 *
 * @param pattern A glob pattern.
 * @return A regex pattern to recognize the given glob pattern.
 */
public static final String convertGlobToRegex(String pattern) {
StringBuilder sb = new StringBuilder(pattern.length());
int inGroup = 0;
int inClass = 0;
int firstIndexInClass = -1;
char[] arr = pattern.toCharArray();
for (int i = 0; i < arr.length; i++) {
char ch = arr[i];
switch (ch) {
case '\\':
if (++i >= arr.length) {
sb.append('\\');
} else {
char next = arr[i];
switch (next) {
case ',':
// escape not needed
break;
case 'Q':
case 'E':
// extra escape needed
sb.append('\\');
default:
sb.append('\\');
}
sb.append(next);
}
break;
case '*':
if (inClass == 0)
sb.append(".*");
else
sb.append('*');
break;
case '?':
if (inClass == 0)
sb.append('.');
else
sb.append('?');
break;
case '[':
inClass++;
firstIndexInClass = i+1;
sb.append('[');
break;
case ']':
inClass--;
sb.append(']');
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '#':
case '%':
if (inClass == 0 || (firstIndexInClass == i && ch == '^'))
sb.append('\\');
sb.append(ch);
break;
case '!':
if (firstIndexInClass == i)
sb.append('^');
else
sb.append('!');
break;
case '{':
inGroup++;
sb.append('(');
break;
case '}':
inGroup--;
sb.append(')');
break;
case ',':
if (inGroup > 0)
sb.append('|');
else
sb.append(',');
break;
default:
sb.append(ch);
}
}
return sb.toString();
}
And the unit tests to prove it works:
/**
 * @author Neil Traft
 */
public class StringUtils_ConvertGlobToRegex_Test {
@Test
public void star_becomes_dot_star() throws Exception {
assertEquals("gl.*b", StringUtils.convertGlobToRegex("gl*b"));
}
@Test
public void escaped_star_is_unchanged() throws Exception {
assertEquals("gl\\*b", StringUtils.convertGlobToRegex("gl\\*b"));
}
@Test
public void question_mark_becomes_dot() throws Exception {
assertEquals("gl.b", StringUtils.convertGlobToRegex("gl?b"));
}
@Test
public void escaped_question_mark_is_unchanged() throws Exception {
assertEquals("gl\\?b", StringUtils.convertGlobToRegex("gl\\?b"));
}
@Test
public void character_classes_dont_need_conversion() throws Exception {
assertEquals("gl[-o]b", StringUtils.convertGlobToRegex("gl[-o]b"));
}
@Test
public void escaped_classes_are_unchanged() throws Exception {
assertEquals("gl\\[-o\\]b", StringUtils.convertGlobToRegex("gl\\[-o\\]b"));
}
@Test
public void negation_in_character_classes() throws Exception {
assertEquals("gl[^a-n!p-z]b", StringUtils.convertGlobToRegex("gl[!a-n!p-z]b"));
}
@Test
public void nested_negation_in_character_classes() throws Exception {
assertEquals("gl[[^a-n]!p-z]b", StringUtils.convertGlobToRegex("gl[[!a-n]!p-z]b"));
}
@Test
public void escape_carat_if_it_is_the_first_char_in_a_character_class() throws Exception {
assertEquals("gl[\\^o]b", StringUtils.convertGlobToRegex("gl[^o]b"));
}
@Test
public void metachars_are_escaped() throws Exception {
assertEquals("gl..*\\.\\(\\)\\+\\|\\^\\$\\#\\%b", StringUtils.convertGlobToRegex("gl?*.()+|^$#%b"));
}
@Test
public void metachars_in_character_classes_dont_need_escaping() throws Exception {
assertEquals("gl[?*.()+|^$#%]b", StringUtils.convertGlobToRegex("gl[?*.()+|^$#%]b"));
}
@Test
public void escaped_backslash_is_unchanged() throws Exception {
assertEquals("gl\\\\b", StringUtils.convertGlobToRegex("gl\\\\b"));
}
@Test
public void slashQ_and_slashE_are_escaped() throws Exception {
assertEquals("\\\\Qglob\\\\E", StringUtils.convertGlobToRegex("\\Qglob\\E"));
}
@Test
public void braces_are_turned_into_groups() throws Exception {
assertEquals("(glob|regex)", StringUtils.convertGlobToRegex("{glob,regex}"));
}
@Test
public void escaped_braces_are_unchanged() throws Exception {
assertEquals("\\{glob\\}", StringUtils.convertGlobToRegex("\\{glob\\}"));
}
@Test
public void commas_dont_need_escaping() throws Exception {
assertEquals("(glob,regex),", StringUtils.convertGlobToRegex("{glob\\,regex},"));
}
}
There are a couple of libraries that do glob-like pattern matching and are more modern than the ones listed:
Ant's DirectoryScanner
and
Spring's AntPathMatcher
I recommend both over the other solutions, since Ant-style globbing has pretty much become the standard glob syntax in the Java world (Hudson, Spring, Ant and, I think, Maven).
I recently had to do it and used \Q and \E to escape the glob pattern:
private static Pattern getPatternFromGlob(String glob) {
return Pattern.compile(
"^" + Pattern.quote(glob)
.replace("*", "\\E.*\\Q")
.replace("?", "\\E.\\Q")
+ "$");
}
This is a simple Glob implementation which handles * and ? in the pattern
public class GlobMatch {
private String text;
private String pattern;
public boolean match(String text, String pattern) {
this.text = text;
this.pattern = pattern;
return matchCharacter(0, 0);
}
private boolean matchCharacter(int patternIndex, int textIndex) {
if (patternIndex >= pattern.length()) {
return false;
}
switch(pattern.charAt(patternIndex)) {
case '?':
// Match any character
if (textIndex >= text.length()) {
return false;
}
break;
case '*':
// * at the end of the pattern will match anything
if (patternIndex + 1 >= pattern.length() || textIndex >= text.length()) {
return true;
}
// Probe forward to see if we can get a match
while (textIndex < text.length()) {
if (matchCharacter(patternIndex + 1, textIndex)) {
return true;
}
textIndex++;
}
return false;
default:
if (textIndex >= text.length()) {
return false;
}
String textChar = text.substring(textIndex, textIndex + 1);
String patternChar = pattern.substring(patternIndex, patternIndex + 1);
// Note the match is case insensitive
if (textChar.compareToIgnoreCase(patternChar) != 0) {
return false;
}
}
// End of pattern and text?
if (patternIndex + 1 >= pattern.length() && textIndex + 1 >= text.length()) {
return true;
}
// Go on to match the next character in the pattern
return matchCharacter(patternIndex + 1, textIndex + 1);
}
}
Similar to Tony Edgecombe's answer, here is a short and simple globber that supports * and ? without using regex, if anybody needs one.
public static boolean matches(String text, String glob) {
String rest = null;
int pos = glob.indexOf('*');
if (pos != -1) {
rest = glob.substring(pos + 1);
glob = glob.substring(0, pos);
}
if (glob.length() > text.length())
return false;
// handle the part up to the first *
for (int i = 0; i < glob.length(); i++)
if (glob.charAt(i) != '?'
&& !glob.substring(i, i + 1).equalsIgnoreCase(text.substring(i, i + 1)))
return false;
// recurse for the part after the first *, if any
if (rest == null) {
return glob.length() == text.length();
} else {
for (int i = glob.length(); i <= text.length(); i++) {
if (matches(text.substring(i), rest))
return true;
}
return false;
}
}
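The recursive version above creates a substring for every candidate position. For comparison, here is an iterative variant that remembers only the most recent * and backtracks to it on a mismatch. This is a different technique from the answer above, and the sketch is case-sensitive, unlike that code; the class and method names are made up.

```java
class IterativeGlob {
    // Matches 'text' against 'glob', where '*' matches any run of
    // characters (including none) and '?' matches exactly one character.
    static boolean matches(String text, String glob) {
        int t = 0;          // position in text
        int g = 0;          // position in glob
        int starG = -1;     // index of the last '*' seen in glob, or -1
        int starT = 0;      // text position that '*' is currently absorbing up to
        while (t < text.length()) {
            if (g < glob.length() && glob.charAt(g) == '*') {
                starG = g++;        // remember the star, try matching it to nothing
                starT = t;
            } else if (g < glob.length()
                    && (glob.charAt(g) == '?' || glob.charAt(g) == text.charAt(t))) {
                t++;
                g++;
            } else if (starG != -1) {
                g = starG + 1;      // mismatch: let the last star absorb one more char
                t = ++starT;
            } else {
                return false;
            }
        }
        // Text is exhausted; only trailing stars may remain in the glob.
        while (g < glob.length() && glob.charAt(g) == '*') {
            g++;
        }
        return g == glob.length();
    }
}
```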
This may be a slightly hacky approach. I figured it out from the code of NIO2's Files.newDirectoryStream(Path dir, String glob). Note that a new Path object is created for every match. So far I have been able to test this only on a Windows file system, but I believe it should work on Unix as well.
// a file system hack to get a glob matching
PathMatcher matcher = ("*".equals(glob)) ? null
: FileSystems.getDefault().getPathMatcher("glob:" + glob);
if ("*".equals(glob) || matcher.matches(Paths.get(someName))) {
// do your stuff here
}
UPDATE
Works on both - Mac and Linux.
The previous solution by Vincent Robert/dimo414 relies on Pattern.quote() being implemented in terms of \Q...\E, which is not documented in the API and therefore may not be the case for other/future Java implementations. The following solution removes that implementation dependency by escaping all occurrences of \E instead of using quote(). It also activates DOTALL mode ((?s)) in case the string to be matched contains newlines.
public static Pattern globToRegex(String glob)
{
return Pattern.compile(
"(?s)^\\Q" +
glob.replace("\\E", "\\E\\\\E\\Q")
.replace("*", "\\E.*\\Q")
.replace("?", "\\E.\\Q") +
"\\E$"
);
}
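Another way to avoid depending on how quote() treats a whole string is to translate the glob one character at a time and quote each literal character individually. This is an alternative sketch, not the code from the answers above; GlobPatterns and compileGlob are made-up names, and only * and ? are treated as wildcards.

```java
import java.util.regex.Pattern;

class GlobPatterns {
    // Builds a regex from a glob: '*' becomes ".*", '?' becomes ".",
    // and every other character is matched literally.
    static Pattern compileGlob(String glob) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Each quoted piece is a complete regex fragment on its own.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        // DOTALL lets the wildcards match newlines as well.
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }
}
```

Use it with matcher(text).matches(), which anchors the match at both ends, so no explicit ^ and $ are needed.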
I don't know about a "standard" implementation, but I know of a sourceforge project released under the BSD license that implemented glob matching for files. It's implemented in one file, maybe you can adapt it for your requirements.
There is sun.nio.fs.Globs but it is not part of the public API.
You can use it indirectly via:
FileSystems.getDefault().getPathMatcher("glob:<myPattern>")
But it returns a PathMatcher, which is inconvenient to work with, since it accepts only a Path as its parameter (not a File).
One possible option is to convert the PathMatcher to a regex pattern (just call its 'toString()' method).
Another option is to use a dedicated glob library such as glob-library-java.
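If the only obstacle is that you hold a File rather than a Path, File.toPath() (Java 7+) bridges the two. A small sketch, with FileGlob and nameMatches as made-up names, that matches only the last path component:

```java
import java.io.File;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

class FileGlob {
    // Hypothetical helper: applies a glob to the file's name only,
    // ignoring any parent directories.
    static boolean nameMatches(File file, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        Path name = file.toPath().getFileName(); // null for a root such as "/"
        return name != null && matcher.matches(name);
    }
}
```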
Long ago I was doing massive glob-driven text filtering, so I wrote a small piece of code (15 lines, no dependencies beyond the JDK).
It handles only '*' (which was sufficient for me), but it can easily be extended for '?'.
In my measurements it was several times faster than a pre-compiled regexp, and it does not require any pre-compilation (essentially it is a string-vs-string comparison every time the pattern is matched).
Code:
public static boolean miniglob(String[] pattern, String line) {
if (pattern.length == 0) return line.isEmpty();
else if (pattern.length == 1) return line.equals(pattern[0]);
else {
if (!line.startsWith(pattern[0])) return false;
int idx = pattern[0].length();
for (int i = 1; i < pattern.length - 1; ++i) {
String patternTok = pattern[i];
int nextIdx = line.indexOf(patternTok, idx);
if (nextIdx < 0) return false;
else idx = nextIdx + patternTok.length();
}
if (!line.endsWith(pattern[pattern.length - 1])) return false;
return true;
}
}
Usage:
public static void main(String[] args) {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
try {
// read from stdin space separated text and pattern
for (String input = in.readLine(); input != null; input = in.readLine()) {
String[] tokens = input.split(" ");
String line = tokens[0];
String[] pattern = tokens[1].split("\\*+", -1 /* want empty trailing token if any */);
// check matcher performance
long tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
miniglob(pattern, line);
}
long tm1 = System.currentTimeMillis();
System.out.println("miniglob took " + (tm1-tm0) + " ms");
// check regexp performance
Pattern reptn = Pattern.compile(tokens[1].replace("*", ".*"));
Matcher mtchr = reptn.matcher(line);
tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
mtchr.matches();
}
tm1 = System.currentTimeMillis();
System.out.println("regexp took " + (tm1-tm0) + " ms");
// check if miniglob worked correctly
if (miniglob(pattern, line)) {
System.out.println("+ >" + line);
}
else {
System.out.println("- >" + line);
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Copy/paste from here
By the way, it seems as if you did it the hard way in Perl.
This does the trick in Perl:
my @files = glob("*.html");
# Or, if you prefer:
my @files = <*.html>;

Calculator dot function issue android

I am working on a calculator to practice Java and Android development. Everything works fine except for the dot function. Here is the problem (see the very last dot):
Here is the code:
case R.id.btn_dot:
if (dotSet) {
screenTV.append("");
} else if (isEmpty() || empty) {
screenTV.append("0.");
dotSet = true;
count++;
} else {
screenTV.append(".");
dotSet = true;
count++;
}
an operator:
case R.id.btn_add:
if (isEmpty()) {
screenTV.append("");
} else if (screenTvGet().endsWith("+")) {
screenTV.append("");
} else if (!isEmpty()) {
screenTV.append("+");
dotSet = false;
empty = true;
resultSet = false;
count = 0;
}
break;
and a number:
case R.id.btn0:
if (resultSet) {
screenTV.append("");
} else if (isEmpty()) {
screenTV.append("");
} else {
screenTV.append("0");
empty = false;
}
Finally, the backspace function:
case R.id.btn_backspace:
String screenContent;
String screen = screenTV.getText().toString();
int screenMinusOne = screen.length() - 1;
String screenMinus = String.valueOf(screenMinusOne);
if (screen.endsWith("."))
dotSet = false;
if (isEmpty()) {
screenTV.setText("");
} else {
screenContent = screen.substring(0, screen.length() - 1);
screenTV.setText(screenContent);
}
break;
Ignore "count".
I believe you can see the whole picture. What I want is this: when I delete an operator with the backspace function and the previous number already contains a dot, the dot button should not add "." or "0." to the screen; it should do nothing (append ""). I hope my question is clear.
Remove dotSet. You don't need it.
Then, similar to how you do this for btn_add:
} else if (screenTvGet().endsWith("+")) {
screenTV.append("");
use a regex for btn_dot to check if there is a . in the last part of the text:
} else if (screenTvGet().matches(".*\\.\\d*")) {
screenTV.append("");
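To see what that regex accepts without an Android project, the check can be pulled into a plain method. The names DotRule and canAppendDot are made up for this sketch; the regex is the one from the answer.

```java
class DotRule {
    // True if a '.' may be appended, i.e. the text does NOT end in
    // "a dot followed only by digits" (the number being typed has no dot yet).
    static boolean canAppendDot(String screen) {
        return !screen.matches(".*\\.\\d*");
    }
}
```

Because String.matches has to match the whole string, the pattern only looks at the last number on the screen: "1.5+2" still allows a dot, while "1.5" and "1.5+2." do not. After backspacing the "+" in "1.5+", the screen is "1.5" again and the dot is refused without any dotSet flag.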

Code does not work as intended (Palindrome checker)

I am trying to check if a string is a palindrome, but it seems it does not work, because when I send a string that I know is not a palindrome, it returns that it is a palindrome, can anyone help? It also won't add to the variable counter.
package UnaryStack.RubCol1183;
public class CheckPalindrome {
static int counter = 0;
/** Decides whether the parentheses, brackets, and braces
in a string occur in left/right pairs.
@param expression a string to be checked
@return true if the delimiters are paired correctly */
public static boolean checkBalance(String expression)
{
StackInterface<Character> temporaryStack = new LinkedStack<Character>();
StackInterface<Character> reverseStack = new LinkedStack<Character>();
StackInterface<Character> originalStack = new LinkedStack<Character>();
int characterCount = expression.length();
boolean isBalanced = true;
int index = 0;
char nextCharacter = ' ';
for (;(index < characterCount); index++)
{
nextCharacter = expression.charAt(index);
switch (nextCharacter)
{
case '.': case '?': case '!': case '\'': case ' ': case ',':
break;
default:
{
{
reverseStack.push(nextCharacter);
temporaryStack.push(nextCharacter);
originalStack.push(temporaryStack.pop());
}
{
char letter1 = Character.toLowerCase(originalStack.pop());
char letter2 = Character.toLowerCase(reverseStack.pop());
isBalanced = isPaired(letter1, letter2);
if(isBalanced == false){
counter++;
}
}
break;
}
} // end switch
} // end for
return isBalanced;
} // end checkBalance
// Returns true if the given characters, open and close, form a pair
// of parentheses, brackets, or braces.
private static boolean isPaired(char open, char close)
{
return (open == close);
} // end isPaired
public static int counter(){
return counter;
}
}//end class
Your implementation seems way more complex than it needs to be.
// Check for invalid characters first if needed.
// Note: this cancels adjacent equal characters, so it accepts "aabb" and
// rejects "abcba"; it is not a general palindrome test.
StackInterface<Character> stack = new LinkedStack<Character>();
for (char ch : expression.toCharArray()) {
    // Test isEmpty() before peek() so an empty stack is never peeked.
    if (!stack.isEmpty() && stack.peek().equals(ch)) {
        stack.pop();
    } else {
        stack.push(ch);
    }
}
return stack.isEmpty();
Honestly, using a stack is overkill here. I would use the following method.
int i = 0;
int j = expression.length() - 1;
while(j > i) {
if(expression.charAt(i++) != expression.charAt(j--)) return false;
}
return true;
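The two-index loop above compares every character exactly as written. If, as in the question, punctuation and spaces should be skipped and case ignored, the same idea can be extended as in the sketch below. The class and method names are made up, and "skip everything that is not a letter or digit" is an assumption that is slightly broader than the question's list of characters.

```java
class PalindromeCheck {
    // Two indices walk inward from both ends, skipping characters that
    // are not letters or digits and comparing the rest case-insensitively.
    static boolean isPalindrome(String s) {
        int i = 0;
        int j = s.length() - 1;
        while (i < j) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                i++;
            } else if (!Character.isLetterOrDigit(s.charAt(j))) {
                j--;
            } else {
                if (Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j))) {
                    return false;
                }
                i++;
                j--;
            }
        }
        return true;
    }
}
```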
You put exactly the same elements in reverseStack and originalStack, because everything you push onto temporaryStack is immediately popped and pushed onto originalStack. This does not make sense.
reverseStack.push(nextCharacter);
temporaryStack.push(nextCharacter);
originalStack.push(temporaryStack.pop());
Therefore the expression
isBalanced = isPaired(letter1, letter2);
will always return true.
I found the logic errors in the method checkBalance() and turned the code into a fully working version. Here is what my finished code looks like:
public class CheckPalindrome {
static int counter;
/** Decides whether the parentheses, brackets, and braces
in a string occur in left/right pairs.
@param expression a string to be checked
@return true if the delimiters are paired correctly */
public static boolean checkBalance(String expression)
{
counter = 0;
StackInterface<Character> temporaryStack = new LinkedStack<Character>();
StackInterface<Character> reverseStack = new LinkedStack<Character>();
StackInterface<Character> originalStack = new LinkedStack<Character>();
boolean isBalanced = true;
int characterCount = expression.length();
int index = 0;
char nextCharacter = ' ';
for (;(index < characterCount); index++)
{
nextCharacter = expression.charAt(index);
switch (nextCharacter)
{
case '.': case '?': case '!': case '\'': case ' ': case ',':
break;
default:
{
{
reverseStack.push(nextCharacter);
temporaryStack.push(nextCharacter);
}
break;
}
} // end switch
} // end for
while(!temporaryStack.isEmpty()){
originalStack.push(temporaryStack.pop());
}
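while(!originalStack.isEmpty()){
char letter1 = Character.toLowerCase(originalStack.pop());
char letter2 = Character.toLowerCase(reverseStack.pop());
// Once a mismatch is seen the result must stay false; assigning
// isBalanced on every iteration would keep only the last comparison
// (e.g. "abca" would be reported as a palindrome).
if (!isPaired(letter1, letter2)) {
isBalanced = false;
counter++;
}
}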
return isBalanced;
} // end checkBalance
// Returns true if the given characters, open and close, form a pair
// of parentheses, brackets, or braces.
private static boolean isPaired(char open, char close)
{
return (open == close);
} // end isPaired
public static int counter(){
return counter;
}
}
I used two while loops outside the for loop, which fixes the logic errors pointed out. I also reset counter to 0 inside the method to fix a small problem I encountered. Feel free to revise the code if it still has errors; I think it is correct, but then again, I'm a beginner.

Is there an equivalent of java.util.regex for "glob" type patterns?

Is there a standard (preferably Apache Commons or similarly non-viral) library for doing "glob" type matches in Java? When I had to do similar in Perl once, I just changed all the "." to "\.", the "*" to ".*" and the "?" to "." and that sort of thing, but I'm wondering if somebody has done the work for me.
Similar question: Create regex from glob expression
Globbing is also planned for Java 7.
See FileSystem.getPathMatcher(String) and the "Finding Files" tutorial.
There's nothing built-in, but it's pretty simple to convert something glob-like to a regex:
public static String createRegexFromGlob(String glob)
{
String out = "^";
for(int i = 0; i < glob.length(); ++i)
{
final char c = glob.charAt(i);
switch(c)
{
case '*': out += ".*"; break;
case '?': out += '.'; break;
case '.': out += "\\."; break;
case '\\': out += "\\\\"; break;
default: out += c;
}
}
out += '$';
return out;
}
this works for me, but I'm not sure if it covers the glob "standard", if there is one :)
Update by Paul Tomblin: I found a perl program that does glob conversion, and adapting it to Java I end up with:
private String convertGlobToRegEx(String line)
{
LOG.info("got line [" + line + "]");
line = line.trim();
int strLen = line.length();
StringBuilder sb = new StringBuilder(strLen);
// Remove beginning and ending * globs because they're useless
if (line.startsWith("*"))
{
line = line.substring(1);
strLen--;
}
if (line.endsWith("*"))
{
line = line.substring(0, strLen-1);
strLen--;
}
boolean escaping = false;
int inCurlies = 0;
for (char currentChar : line.toCharArray())
{
switch (currentChar)
{
case '*':
if (escaping)
sb.append("\\*");
else
sb.append(".*");
escaping = false;
break;
case '?':
if (escaping)
sb.append("\\?");
else
sb.append('.');
escaping = false;
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '#':
case '%':
sb.append('\\');
sb.append(currentChar);
escaping = false;
break;
case '\\':
if (escaping)
{
sb.append("\\\\");
escaping = false;
}
else
escaping = true;
break;
case '{':
if (escaping)
{
sb.append("\\{");
}
else
{
sb.append('(');
inCurlies++;
}
escaping = false;
break;
case '}':
if (inCurlies > 0 && !escaping)
{
sb.append(')');
inCurlies--;
}
else if (escaping)
sb.append("\\}");
else
sb.append("}");
escaping = false;
break;
case ',':
if (inCurlies > 0 && !escaping)
{
sb.append('|');
}
else if (escaping)
sb.append("\\,");
else
sb.append(",");
break;
default:
escaping = false;
sb.append(currentChar);
}
}
return sb.toString();
}
I'm editing into this answer rather than making my own because this answer put me on the right track.
Thanks to everyone here for their contributions. I wrote a more comprehensive conversion than any of the previous answers:
/**
 * Converts a standard POSIX Shell globbing pattern into a regular expression
 * pattern. The result can be used with the standard {@link java.util.regex} API to
 * recognize strings which match the glob pattern.
 * <p/>
 * See also, the POSIX Shell language:
 * http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13_01
 *
 * @param pattern A glob pattern.
 * @return A regex pattern to recognize the given glob pattern.
 */
public static final String convertGlobToRegex(String pattern) {
StringBuilder sb = new StringBuilder(pattern.length());
int inGroup = 0;
int inClass = 0;
int firstIndexInClass = -1;
char[] arr = pattern.toCharArray();
for (int i = 0; i < arr.length; i++) {
char ch = arr[i];
switch (ch) {
case '\\':
if (++i >= arr.length) {
sb.append('\\');
} else {
char next = arr[i];
switch (next) {
case ',':
// escape not needed
break;
case 'Q':
case 'E':
// extra escape needed
sb.append('\\');
default:
sb.append('\\');
}
sb.append(next);
}
break;
case '*':
if (inClass == 0)
sb.append(".*");
else
sb.append('*');
break;
case '?':
if (inClass == 0)
sb.append('.');
else
sb.append('?');
break;
case '[':
inClass++;
firstIndexInClass = i+1;
sb.append('[');
break;
case ']':
inClass--;
sb.append(']');
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '#':
case '%':
if (inClass == 0 || (firstIndexInClass == i && ch == '^'))
sb.append('\\');
sb.append(ch);
break;
case '!':
if (firstIndexInClass == i)
sb.append('^');
else
sb.append('!');
break;
case '{':
inGroup++;
sb.append('(');
break;
case '}':
inGroup--;
sb.append(')');
break;
case ',':
if (inGroup > 0)
sb.append('|');
else
sb.append(',');
break;
default:
sb.append(ch);
}
}
return sb.toString();
}
And the unit tests to prove it works:
/**
* @author Neil Traft
*/
public class StringUtils_ConvertGlobToRegex_Test {
@Test
public void star_becomes_dot_star() throws Exception {
assertEquals("gl.*b", StringUtils.convertGlobToRegex("gl*b"));
}
@Test
public void escaped_star_is_unchanged() throws Exception {
assertEquals("gl\\*b", StringUtils.convertGlobToRegex("gl\\*b"));
}
@Test
public void question_mark_becomes_dot() throws Exception {
assertEquals("gl.b", StringUtils.convertGlobToRegex("gl?b"));
}
@Test
public void escaped_question_mark_is_unchanged() throws Exception {
assertEquals("gl\\?b", StringUtils.convertGlobToRegex("gl\\?b"));
}
@Test
public void character_classes_dont_need_conversion() throws Exception {
assertEquals("gl[-o]b", StringUtils.convertGlobToRegex("gl[-o]b"));
}
@Test
public void escaped_classes_are_unchanged() throws Exception {
assertEquals("gl\\[-o\\]b", StringUtils.convertGlobToRegex("gl\\[-o\\]b"));
}
@Test
public void negation_in_character_classes() throws Exception {
assertEquals("gl[^a-n!p-z]b", StringUtils.convertGlobToRegex("gl[!a-n!p-z]b"));
}
@Test
public void nested_negation_in_character_classes() throws Exception {
assertEquals("gl[[^a-n]!p-z]b", StringUtils.convertGlobToRegex("gl[[!a-n]!p-z]b"));
}
@Test
public void escape_carat_if_it_is_the_first_char_in_a_character_class() throws Exception {
assertEquals("gl[\\^o]b", StringUtils.convertGlobToRegex("gl[^o]b"));
}
@Test
public void metachars_are_escaped() throws Exception {
assertEquals("gl..*\\.\\(\\)\\+\\|\\^\\$\\#\\%b", StringUtils.convertGlobToRegex("gl?*.()+|^$#%b"));
}
@Test
public void metachars_in_character_classes_dont_need_escaping() throws Exception {
assertEquals("gl[?*.()+|^$#%]b", StringUtils.convertGlobToRegex("gl[?*.()+|^$#%]b"));
}
@Test
public void escaped_backslash_is_unchanged() throws Exception {
assertEquals("gl\\\\b", StringUtils.convertGlobToRegex("gl\\\\b"));
}
@Test
public void slashQ_and_slashE_are_escaped() throws Exception {
assertEquals("\\\\Qglob\\\\E", StringUtils.convertGlobToRegex("\\Qglob\\E"));
}
@Test
public void braces_are_turned_into_groups() throws Exception {
assertEquals("(glob|regex)", StringUtils.convertGlobToRegex("{glob,regex}"));
}
@Test
public void escaped_braces_are_unchanged() throws Exception {
assertEquals("\\{glob\\}", StringUtils.convertGlobToRegex("\\{glob\\}"));
}
@Test
public void commas_dont_need_escaping() throws Exception {
assertEquals("(glob,regex),", StringUtils.convertGlobToRegex("{glob\\,regex},"));
}
}
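As a quick sanity check, the regex strings that the tests above expect can be compiled with java.util.regex and tried against sample input. This is only a sketch: the class name ExpectedRegexCheck and the sample strings are made up for illustration, and the regexes are copied from the test expectations rather than produced by calling convertGlobToRegex.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the answer above.
final class ExpectedRegexCheck {

    // A glob has to match the whole name, so use matches() rather than find().
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // "gl*b" is expected to become "gl.*b"
        System.out.println(fullMatch("gl.*b", "glob"));
        // "{glob,regex}" is expected to become "(glob|regex)"
        System.out.println(fullMatch("(glob|regex)", "regex"));
        // "gl[!a-n!p-z]b" is expected to become "gl[^a-n!p-z]b"
        System.out.println(fullMatch("gl[^a-n!p-z]b", "glob"));
        // an escaped star stays a literal star
        System.out.println(fullMatch("gl\\*b", "gl*b"));
    }
}
```

Using matches() instead of find() matters here, because the converter does not add ^ and $ anchors to its output.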
There are a couple of libraries that do glob-like pattern matching and are more modern than the ones listed:
Ant's DirectoryScanner
and
Spring's AntPathMatcher
I recommend both over the other solutions, since Ant-style globbing has pretty much become the standard glob syntax in the Java world (Hudson, Spring, Ant and, I think, Maven).
I recently had to do it and used \Q and \E to escape the glob pattern:
private static Pattern getPatternFromGlob(String glob) {
return Pattern.compile(
"^" + Pattern.quote(glob)
.replace("*", "\\E.*\\Q")
.replace("?", "\\E.\\Q")
+ "$");
}
This is a simple Glob implementation which handles * and ? in the pattern
public class GlobMatch {
private String text;
private String pattern;
public boolean match(String text, String pattern) {
this.text = text;
this.pattern = pattern;
return matchCharacter(0, 0);
}
private boolean matchCharacter(int patternIndex, int textIndex) {
if (patternIndex >= pattern.length()) {
// Pattern exhausted: this is a match only if the text is exhausted too
return textIndex >= text.length();
}
switch(pattern.charAt(patternIndex)) {
case '?':
// Match any character
if (textIndex >= text.length()) {
return false;
}
break;
case '*':
// * at the end of the pattern will match anything
if (patternIndex + 1 >= pattern.length()) {
return true;
}
// Probe forward (including the empty remainder) to see if we can get a match;
// an exhausted text must not match when more pattern follows, e.g. "a*b" vs "a"
while (textIndex <= text.length()) {
if (matchCharacter(patternIndex + 1, textIndex)) {
return true;
}
textIndex++;
}
return false;
default:
if (textIndex >= text.length()) {
return false;
}
String textChar = text.substring(textIndex, textIndex + 1);
String patternChar = pattern.substring(patternIndex, patternIndex + 1);
// Note the match is case insensitive
if (textChar.compareToIgnoreCase(patternChar) != 0) {
return false;
}
}
// End of pattern and text?
if (patternIndex + 1 >= pattern.length() && textIndex + 1 >= text.length()) {
return true;
}
// Go on to match the next character in the pattern
return matchCharacter(patternIndex + 1, textIndex + 1);
}
}
Similar to Tony Edgecombe's answer, here is a short and simple globber that supports * and ? without using regex, if anybody needs one.
public static boolean matches(String text, String glob) {
String rest = null;
int pos = glob.indexOf('*');
if (pos != -1) {
rest = glob.substring(pos + 1);
glob = glob.substring(0, pos);
}
if (glob.length() > text.length())
return false;
// handle the part up to the first *
for (int i = 0; i < glob.length(); i++)
if (glob.charAt(i) != '?'
&& !glob.substring(i, i + 1).equalsIgnoreCase(text.substring(i, i + 1)))
return false;
// recurse for the part after the first *, if any
if (rest == null) {
return glob.length() == text.length();
} else {
for (int i = glob.length(); i <= text.length(); i++) {
if (matches(text.substring(i), rest))
return true;
}
return false;
}
}
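For comparison, the same * and ? matching can be done without recursion or substring allocation by remembering the most recent * and retrying from there when a later character fails. The following is a sketch of that well-known backtracking technique, not code from either answer above; the class name IterativeGlob and the helper sameLetter are made up for illustration, and the case-insensitive comparison is an assumption carried over from the two answers.

```java
// Hypothetical alternative sketch; names are illustrative only.
final class IterativeGlob {

    static boolean matches(String text, String glob) {
        int t = 0;       // current position in text
        int g = 0;       // current position in glob
        int star = -1;   // index of the most recent '*' in glob, or -1 if none yet
        int resume = 0;  // text position the most recent '*' currently extends to

        while (t < text.length()) {
            if (g < glob.length() && glob.charAt(g) == '*') {
                // Remember the star and first try letting it match nothing.
                star = g++;
                resume = t;
            } else if (g < glob.length()
                    && (glob.charAt(g) == '?' || sameLetter(glob.charAt(g), text.charAt(t)))) {
                // Single-character match: advance both positions.
                g++;
                t++;
            } else if (star >= 0) {
                // Mismatch: let the last star absorb one more character and retry.
                g = star + 1;
                t = ++resume;
            } else {
                return false;
            }
        }
        // Text is exhausted; only trailing stars may remain in the glob.
        while (g < glob.length() && glob.charAt(g) == '*') {
            g++;
        }
        return g == glob.length();
    }

    // Case-insensitive comparison of two chars (assumption: simple lower-casing is enough).
    private static boolean sameLetter(char a, char b) {
        return Character.toLowerCase(a) == Character.toLowerCase(b);
    }
}
```

For example, IterativeGlob.matches("index.html", "*.html") returns true, while IterativeGlob.matches("a", "a*b") returns false.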
This may be a slightly hacky approach. I worked it out from the code behind NIO2's Files.newDirectoryStream(Path dir, String glob). Note that a new Path object is created for every match. So far I have only been able to test this on a Windows file system, but I believe it should work on Unix as well.
// a file system hack to get a glob matching
PathMatcher matcher = ("*".equals(glob)) ? null
: FileSystems.getDefault().getPathMatcher("glob:" + glob);
if ("*".equals(glob) || matcher.matches(Paths.get(someName))) {
// do your stuff here
}
UPDATE
It works on both Mac and Linux as well.
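A self-contained version of the snippet above might look like the following sketch. The class name NioGlob and its method are made up for illustration; FileSystems.getDefault().getPathMatcher and Paths.get are standard java.nio.file calls. Note that Paths.get can throw InvalidPathException if the name is not a legal path on the current platform.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical wrapper for illustration; not part of the JDK.
final class NioGlob {

    // Returns true if the given name matches the glob under the default file system's rules.
    static boolean matches(String glob, String name) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(name));
    }

    public static void main(String[] args) {
        System.out.println(matches("*.html", "index.html"));
        System.out.println(matches("*.{java,class}", "Foo.java"));
    }
}
```

Because this is path matching rather than plain string matching, * does not cross directory boundaries: "*.html" is not expected to match "dir/index.html".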
The previous solution by Vincent Robert/dimo414 relies on Pattern.quote() being implemented in terms of \Q...\E, which is not documented in the API and therefore may not be the case for other/future Java implementations. The following solution removes that implementation dependency by escaping all occurrences of \E instead of using quote(). It also activates DOTALL mode ((?s)) in case the string to be matched contains newlines.
public static Pattern globToRegex(String glob)
{
return Pattern.compile(
"(?s)^\\Q" +
glob.replace("\\E", "\\E\\\\E\\Q")
.replace("*", "\\E.*\\Q")
.replace("?", "\\E.\\Q") +
"\\E$"
);
}
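A short usage sketch for the method above. The method body is repeated so the example compiles on its own; the class name GlobToRegexDemo and the sample inputs are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; globToRegex is copied from the answer above.
final class GlobToRegexDemo {

    static Pattern globToRegex(String glob) {
        return Pattern.compile(
            "(?s)^\\Q" +
            glob.replace("\\E", "\\E\\\\E\\Q")   // keep a literal \E from ending the quote
                .replace("*", "\\E.*\\Q")        // * becomes .* outside the quote
                .replace("?", "\\E.\\Q") +       // ? becomes . outside the quote
            "\\E$"
        );
    }

    public static void main(String[] args) {
        // '.' in the glob is quoted, so it only matches a literal dot
        System.out.println(globToRegex("a.c").matcher("abc").matches());
        // DOTALL lets * span a newline in the input
        System.out.println(globToRegex("*.txt").matcher("a\nb.txt").matches());
        // a literal backslash-E in the glob is matched literally
        System.out.println(globToRegex("a\\Eb*").matcher("a\\Eb123").matches());
    }
}
```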
I don't know about a "standard" implementation, but I know of a sourceforge project released under the BSD license that implemented glob matching for files. It's implemented in one file, maybe you can adapt it for your requirements.
There is sun.nio.fs.Globs but it is not part of the public API.
You can use it indirectly via:
FileSystems.getDefault().getPathMatcher("glob:<myPattern>")
But it returns a PathMatcher, which is inconvenient to work with, since it accepts only a Path as its parameter (not a File).
One possible option is to convert the PathMatcher to a regex pattern (just call its 'toString()' method), although what 'toString()' returns is not specified by the API and may depend on the JDK implementation.
Another option is to use a dedicated glob library such as glob-library-java.
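If the only obstacle is that PathMatcher takes a Path rather than a File, a thin adapter may be enough, since File.toPath() does the conversion. The following is a sketch under that assumption; the class name GlobFileFilter is made up, and it matches on the file name only, ignoring parent directories.

```java
import java.io.File;
import java.io.FileFilter;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

// Hypothetical adapter for illustration; not part of the JDK.
final class GlobFileFilter implements FileFilter {

    private final PathMatcher matcher;

    GlobFileFilter(String glob) {
        this.matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    }

    @Override
    public boolean accept(File file) {
        // getFileName() returns null for a root directory, which has no name to match.
        Path name = file.toPath().getFileName();
        return name != null && matcher.matches(name);
    }
}
```

It can then be passed to the existing File API, for example new File(".").listFiles(new GlobFileFilter("*.java")).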
Long ago I was doing massive glob-driven text filtering, so I wrote a small piece of code (15 lines, no dependencies beyond the JDK).
It handles only '*' (which was sufficient for me), but can easily be extended for '?'.
It is several times faster than a pre-compiled regexp and does not require any pre-compilation (essentially it is a string-vs-string comparison every time the pattern is matched).
Code:
public static boolean miniglob(String[] pattern, String line) {
if (pattern.length == 0) return line.isEmpty();
else if (pattern.length == 1) return line.equals(pattern[0]);
else {
if (!line.startsWith(pattern[0])) return false;
int idx = pattern[0].length();
for (int i = 1; i < pattern.length - 1; ++i) {
String patternTok = pattern[i];
int nextIdx = line.indexOf(patternTok, idx);
if (nextIdx < 0) return false;
else idx = nextIdx + patternTok.length();
}
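String last = pattern[pattern.length - 1];
// the suffix must not overlap what was already matched (e.g. "ab*ba" vs "aba")
return line.length() - last.length() >= idx && line.endsWith(last);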
}
}
Usage:
public static void main(String[] args) {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
try {
// read from stdin space separated text and pattern
for (String input = in.readLine(); input != null; input = in.readLine()) {
String[] tokens = input.split(" ");
String line = tokens[0];
String[] pattern = tokens[1].split("\\*+", -1 /* want empty trailing token if any */);
// check matcher performance
long tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
miniglob(pattern, line);
}
long tm1 = System.currentTimeMillis();
System.out.println("miniglob took " + (tm1-tm0) + " ms");
// check regexp performance
Pattern reptn = Pattern.compile(tokens[1].replace("*", ".*"));
Matcher mtchr = reptn.matcher(line);
tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
mtchr.matches();
}
tm1 = System.currentTimeMillis();
System.out.println("regexp took " + (tm1-tm0) + " ms");
// check if miniglob worked correctly
if (miniglob(pattern, line)) {
System.out.println("+ >" + line);
}
else {
System.out.println("- >" + line);
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Copy/paste from here
By the way, it seems as if you did it the hard way in Perl
This does the trick in Perl:
my @files = glob("*.html");
# Or, if you prefer:
my @files = <*.html>;
