How to check if a string is a number [duplicate] - java

I have a problem converting an array to a Map in core Java.
The requirement is below.
Given the following String array:
String str[] = {"abc","123","def","456","ghi","789","lmn","101112","opq"};
Convert it into a Map such that the resultant output is below
Output
====== ======
key Value
====== ======
abc true
123 false
def true
456 false
The above should be printed for each element in the array. I have written the code below, but it throws an exception and I'm stuck. Please let me know how this can be resolved. Thanks in advance.
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
public class CoversionToMap {
/**
* @param args
*/
public static void main(String[] args) {
String str[] = {"abc","123","def","456","ghi","789","lmn","101112","opq"};
Map m = new HashMap();
for(int i=0;i<str.length;i++){
if(Integer.parseInt(str[i]) < 0){
m.put(str[i],true);
}else{
m.put(str[i],false);
}
}
//Print the map values finally
printMap(m);
}
public static void printMap(Map mp) {
Iterator it = mp.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry)it.next();
System.out.println(pairs.getKey() + " = " + pairs.getValue());
}
}
}
exception:
Exception in thread "main" java.lang.NumberFormatException: For input string: "abc"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at CoversionToMap.main(CoversionToMap.java:22)

Everyone is suggesting exception handling for this, but nothing exceptional is happening here that would warrant using exceptions. You don't try turning left in your car and, if you crash, go right, do you? Something like this should do it:
Map<String, Boolean> m = new HashMap<String, Boolean>();
for (String str: strs) {
m.put(str, isInteger(str));
}
public boolean isInteger(String str) {
int size = str.length();
for (int i = 0; i < size; i++) {
if (!Character.isDigit(str.charAt(i))) {
return false;
}
}
return size > 0;
}
This is much clearer and more efficient than throwing and catching an exception, even when 99% of the inputs are integers: the integer value is never needed, so no conversion is required.
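Two caveats with the loop above: Character.isDigit also returns true for non-ASCII digits (for example Arabic-Indic digits), and a leading minus sign is rejected. A minimal sketch of a stricter variant, assuming that only ASCII digits with an optional leading '-' should count as an integer. The class and method names are made up for illustration, and the map follows the question's output, where true marks a non-numeric string:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AsciiIntegerCheck {

    // Accepts an optional leading '-' followed by one or more ASCII digits.
    // It does not check that the value fits into an int.
    static boolean isAsciiInteger(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int start = s.charAt(0) == '-' ? 1 : 0;
        if (start == s.length()) {
            return false; // a lone "-" is not a number
        }
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Maps each input to true when it is NOT numeric, as in the question's table.
    static Map<String, Boolean> classify(String[] values) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String value : values) {
            result.put(value, !isAsciiInteger(value));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] str = {"abc", "123", "def", "456"};
        classify(str).forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```

A LinkedHashMap is used here only so that the printed order follows the array order; a plain HashMap gives no ordering guarantee.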

Integer.parseInt(..) throws an exception for invalid input.
Your if clause should look like this:
if (isNumber(str[i])) {
...
} else {
...
}
Here isNumber can be implemented in multiple ways. For example:
using try { Integer.parseInt(..) } catch (NumberFormatException ex)
using commons-lang NumberUtils.isNumber(..)

You check whether parseInt returns a number smaller than 0 to see if the input is non-numeric.
However, that method doesn't return any value at all if the input is non-numeric; instead it throws an exception, as you have seen.
The simplest way to do what you want is to catch that exception and act accordingly:
try {
Integer.parseInt(str[i]);
// str[i] is numeric
} catch (NumberFormatException ignored) {
// str[i] is not numeric
}

If you want to check whether the string is a valid Java number, you can use the method isNumber from NumberUtils in org.apache.commons.lang.math (doc here: http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/math/NumberUtils.html).
This way you won't have to write your own implementation of isNumber.

You need to use a try/catch block instead of testing the return value for parseInt.
try {
Integer.parseInt(str[i]);
m.put(str[i],true);
} catch(NumberFormatException e) {
m.put(str[i],false);
}

Your error occurs here:
if(Integer.parseInt(str[i]) < 0){
Integer.parseInt throws a NumberFormatException when the input isn't a number, so you need to use a try/catch block, for example:
try{
int number = Integer.parseInt(str[i]);
m.put(str[i],false);
} catch (NumberFormatException nfe) {
m.put(str[i],true);
}

Assuming you won't use any external libraries, you can also use a Regular Expression Matcher to do that. Just like
for (String element : str) {
m.put(element, element.matches("\\d+"));
}
Note that this works only with non-negative integers, but you can adapt the regular expression to match the number formats you want to map as true. Also, if element is null, you'll get a NullPointerException, so a little defensive code is required here.
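The defensive code and the adapted expression mentioned above could look like the following sketch. It assumes that an optional leading minus sign should be accepted and that null counts as non-numeric; the class and method names are illustrative only. Compiling the pattern once also avoids recompiling it on every call, which String.matches does.

```java
import java.util.regex.Pattern;

final class RegexNumberCheck {

    // Optional '-' followed by one or more digits. Without the
    // UNICODE_CHARACTER_CLASS flag, \d matches ASCII digits only.
    private static final Pattern SIGNED_INTEGER = Pattern.compile("-?\\d+");

    // Null-safe: a null input is simply reported as "not a number".
    static boolean isSignedInteger(String s) {
        return s != null && SIGNED_INTEGER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123", "-7", "1.5", "abc", null};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isSignedInteger(sample));
        }
    }
}
```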

Here is an improved answer that also handles negative values, decimal points, etc. It uses regular expressions.
Here it is:
public class StringValidator {
public static void printMap(Map<String, Boolean> map) {
Iterator it = map.entrySet().iterator();
for(Map.Entry<String, Boolean> entry:map.entrySet()){
System.out.println(entry.getKey()+" = "+ entry.getValue());
}
}
}
class ValidateArray{
public static void main(String[] args) {
String str[] = {"abcd", "123", "101.112", "-1.54774"};
Map<String, Boolean> m = new HashMap<String, Boolean>();
for (String s : str) {
m.put(s, isNumber(s));
}
StringValidator.printMap(m);
}
public static boolean isNumber(String str) {
Pattern pattern = Pattern.compile("^-?\\d+\\.?\\d*$");
Matcher matcher = pattern.matcher(str);
return matcher.matches();
}
}

Replace your parseInt line with a call to isInteger(str[i]) where isInteger is defined by:
public static boolean isInteger(String text) {
try {
Integer.parseInt(text); // the result is discarded; only validity matters
return true;
} catch (NumberFormatException e) {
return false;
}
}

I would like to offer a contrary view to 'don't use exception handling' here. The following code:
try
{
InputStream in = new FileInputStream(file);
}
catch (FileNotFoundException exc)
{
// ...
}
is entirely equivalent to:
if (!file.exists())
{
// ...
}
else
try
{
InputStream in = new FileInputStream(file);
}
catch (FileNotFoundException exc)
{
// ...
}
except that in the former case:
The existence of the file is only checked once
There is no timing-window between the two checks during which things can change.
The processing at // ... is only programmed once.
So you don't see code like the second case. At least you shouldn't.
The present case is identical except that because it's a String there is no timing window. Integer.parseInt() has to check the input for validity anyway, and it throws an exception which must be caught somewhere anyway (unless you like RTEs stopping your threads). So why do everything twice?
The counter-argument that you shouldn't use exceptions for normal flow control just begs the question. Is it normal flow control? or is it an error in the input? [In fact I've always understood that principle to mean more specifically 'don't throw exceptions to your own code' within the method, and even then there are rare cases when it's the best answer. I'm not a fan of blanket rules of any kind.]
Another example is detecting EOF on an ObjectInputStream. You do it by catching EOFException. There is no other way apart from prefixing a count to the stream, which is a design change and a format change. So, is EOF part of the normal flow, or is it an exception? And how can it be part of the normal flow given that it is only reported via an exception?
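The "why do everything twice" argument can be made concrete with a small helper that validates and converts in a single pass. This is only a sketch; the helper name tryParseInt and the use of OptionalInt are choices made for illustration and are not part of the original answer.

```java
import java.util.OptionalInt;

final class ParseOnce {

    // Parses exactly once: the caller learns both whether the text was a
    // valid int and, if so, its value, without a separate pre-check.
    static OptionalInt tryParseInt(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            // Also covers null input and values outside the int range.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String[] str = {"abc", "123", "2147483648"};
        for (String s : str) {
            OptionalInt parsed = tryParseInt(s);
            System.out.println(s + " -> " + (parsed.isPresent() ? "numeric" : "not numeric"));
        }
    }
}
```

Unlike a digit-scanning check, this also rejects digit strings that overflow int, because Integer.parseInt performs the range check itself.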

Here's a more general way to validate, avoiding exceptions, and using what the Format subclasses already know. For example the SimpleDateFormat knows that Feb 31 is not valid, as long as you tell it not to be lenient.
import java.text.Format;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;
public class ValidatesByParsePosition {
private static NumberFormat _numFormat = NumberFormat.getInstance();
private static SimpleDateFormat _dateFormat = new SimpleDateFormat(
"MM/dd/yyyy");
public static void printMap(Map<String, Boolean> map) {
for (Map.Entry<String, Boolean> entry : map.entrySet()) {
System.out.println(entry.getKey() + " = " + entry.getValue());
}
}
public static void main(String[] args) {
System.out.println("Validating Nums with ParsePosition:");
String numStrings[] = { "abcd", "123", "101.112", "-1.54774", "1.40t3" };
Map<String, Boolean> rslts = new HashMap<String, Boolean>();
for (String s : numStrings) {
rslts.put(s, isOk(_numFormat, s));
}
ValidatesByParsePosition.printMap(rslts);
System.out.println("\nValidating dates with ParsePosition:");
String dateStrings[] = { "3/11/1952", "02/31/2013", "03/14/2014",
"05/25/2014", "3/uncle george/2015" };
rslts = new HashMap<String, Boolean>();
_dateFormat.setLenient(false);
for (String s : dateStrings) {
rslts.put(s, isOk(_dateFormat, s));
}
ValidatesByParsePosition.printMap(rslts);
}
public static boolean isOk(Format format, String str) {
boolean isOK = true;
int errorIndx = -1;
int parseIndx = 0;
ParsePosition pos = new ParsePosition(parseIndx);
while (isOK && parseIndx < str.length()) { // scan to the end so trailing garbage such as "12a" is rejected
format.parseObject(str, pos);
parseIndx = pos.getIndex();
errorIndx = pos.getErrorIndex();
isOK = errorIndx < 0;
}
if (!isOK) {
System.out.println("value \"" + str
+ "\" not parsed; error at char index " + errorIndx);
}
return isOK;
}
}
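For the numeric case, the same ParsePosition idea can be reduced to a single parse followed by a check that the whole string was consumed. The sketch below makes some assumptions: it fixes the locale to Locale.US so that '.' is the decimal separator, it creates a new NumberFormat per call because NumberFormat instances are not thread-safe, and the class and method names are illustrative.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

final class WholeStringParse {

    // Returns true only if the formatter consumed the entire string.
    static boolean isNumber(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        NumberFormat format = NumberFormat.getInstance(Locale.US);
        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(s, pos);
        // parse returns null on failure instead of throwing; a partial
        // parse such as "1.40t3" stops early, so the index is compared
        // with the string length.
        return parsed != null && pos.getIndex() == s.length();
    }

    public static void main(String[] args) {
        String[] samples = {"abcd", "123", "101.112", "-1.54774", "1.40t3"};
        for (String sample : samples) {
            System.out.println(sample + " = " + isNumber(sample));
        }
    }
}
```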

boolean intVal = false;
for(int i=0;i<str.length;i++) {
intVal = false;
try {
Integer.parseInt(str[i]);
// Any successful parse (including 0 and negative values) means the string is numeric.
intVal = true;
} catch (java.lang.NumberFormatException e) {
intVal = false;
}
m.put(str[i], !intVal);
}

Related

Pattern matching in Thousands of files

I have a regex pattern of words like welcome1|welcome2|changeme... which I need to search for in thousands of files (between 100 and 8000), each ranging from 1 KB to 24 MB in size.
I would like to know if there's a faster way of pattern matching than what I have been trying.
Environment:
jdk 1.8
Windows 10
Unix4j Library
Here's what I have tried so far:
try (Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))
.filter(FilePredicates.isFileAndNotDirectory())) {
List<String> obviousStringsList = Strings_PASSWORDS.stream()
.map(s -> ".*" + s + ".*").collect(Collectors.toList()); //because Unix4j apparently needs this
Pattern pattern = Pattern.compile(String.join("|", obviousStringsList));
GrepOptions options = new GrepOptions.Default(GrepOption.count,
GrepOption.ignoreCase,
GrepOption.lineNumber,
GrepOption.matchingFiles);
Instant startTime = Instant.now();
final List<Path> filesWithObviousStringss = stream
.filter(path -> !Unix4j.grep(options, pattern, path.toFile()).toStringResult().isEmpty())
.collect(Collectors.toList());
System.out.println("Time taken = " + Duration.between(startTime, Instant.now()).getSeconds() + " seconds");
}
I get Time taken = 60 seconds, which makes me think I'm doing something really wrong.
I've tried different ways with the stream, and on average every method takes about a minute to process my current folder of 6660 files.
Grep on msys2/mingw64 takes about 15 seconds and exec('grep...') in node.js takes about 12 seconds consistently.
I chose Unix4j because it provides a Java-native grep and clean code.
Is there a way to produce better results in Java that I'm missing?
The main reason why native tools can process such text files much faster is their assumption of one particular charset, especially when it is an ASCII-based 8-bit encoding, whereas Java performs a byte-to-character conversion whose abstraction is capable of supporting arbitrary charsets.
When we similarly assume a single charset with the properties named above, we can use low-level tools which may increase the performance dramatically.
For such an operation, we define the following helper methods:
private static char[] getTable(Charset cs) {
if(cs.newEncoder().maxBytesPerChar() != 1f)
throw new UnsupportedOperationException("Not an 8 bit charset");
byte[] raw = new byte[256];
IntStream.range(0, 256).forEach(i -> raw[i] = (byte)i);
char[] table = new char[256];
cs.newDecoder().onUnmappableCharacter(CodingErrorAction.REPLACE)
.decode(ByteBuffer.wrap(raw), CharBuffer.wrap(table), true);
for(int i = 0; i < 128; i++)
if(table[i] != i) throw new UnsupportedOperationException("Not ASCII based");
return table;
}
and
private static CharSequence mapAsciiBasedText(Path p, char[] table) throws IOException {
try(FileChannel fch = FileChannel.open(p, StandardOpenOption.READ)) {
long actualSize = fch.size();
int size = (int)actualSize;
if(size != actualSize) throw new UnsupportedOperationException("file too large");
MappedByteBuffer mbb = fch.map(FileChannel.MapMode.READ_ONLY, 0, actualSize);
final class MappedCharSequence implements CharSequence {
final int start, size;
MappedCharSequence(int start, int size) {
this.start = start;
this.size = size;
}
public int length() {
return size;
}
public char charAt(int index) {
if(index < 0 || index >= size) throw new IndexOutOfBoundsException();
byte b = mbb.get(start + index);
return b<0? table[b+256]: (char)b;
}
public CharSequence subSequence(int start, int end) {
int newSize = end - start;
if(start<0 || end < start || end > size)
throw new IndexOutOfBoundsException();
return new MappedCharSequence(start + this.start, newSize);
}
public String toString() {
return new StringBuilder(size).append(this).toString();
}
}
return new MappedCharSequence(0, size);
}
}
This allows mapping a file into virtual memory and projecting it directly to a CharSequence without copy operations, assuming that the mapping can be done with a simple table. For ASCII-based charsets, the majority of the characters do not even need a table lookup, as their numerical value is identical to the Unicode code point.
With these methods, you may implement the operation as
// You need this only once per JVM.
// Note that running inside IDEs like Netbeans may change the default encoding
char[] table = getTable(Charset.defaultCharset());
try(Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))
.filter(Files::isRegularFile)) {
Pattern pattern = Pattern.compile(String.join("|", Strings_PASSWORDS));
long startTime = System.nanoTime();
final List<Path> filesWithObviousStringss = stream//.parallel()
.filter(path -> {
try {
return pattern.matcher(mapAsciiBasedText(path, table)).find();
} catch(IOException ex) {
throw new UncheckedIOException(ex);
}
})
.collect(Collectors.toList());
System.out.println("Time taken = "
+ TimeUnit.NANOSECONDS.toSeconds(System.nanoTime()-startTime) + " seconds");
}
This runs much faster than the normal text conversion, but still supports parallel execution.
Besides requiring an ASCII based single byte encoding, there’s the restriction that this code doesn’t support files larger than 2 GiB. While it is possible to extend the solution to support larger files, I wouldn’t add this complication unless really needed.
I don’t know what “Unix4j” provides that isn’t already in the JDK, as the following code does everything with built-in features:
try(Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))
.filter(Files::isRegularFile)) {
Pattern pattern = Pattern.compile(String.join("|", Strings_PASSWORDS));
long startTime = System.nanoTime();
final List<Path> filesWithObviousStringss = stream
.filter(path -> {
try(Scanner s = new Scanner(path)) {
return s.findWithinHorizon(pattern, 0) != null;
} catch(IOException ex) {
throw new UncheckedIOException(ex);
}
})
.collect(Collectors.toList());
System.out.println("Time taken = "
+ TimeUnit.NANOSECONDS.toSeconds(System.nanoTime()-startTime) + " seconds");
}
One important property of this solution is that it doesn’t read the whole file, but stops at the first encountered match. Also, it doesn’t deal with line boundaries, which is suitable for the words you’re looking for, as they never contain line breaks anyway.
After analyzing the findWithinHorizon operation, I think line-by-line processing may be better for larger files, so you may try
try(Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))
.filter(Files::isRegularFile)) {
Pattern pattern = Pattern.compile(String.join("|", Strings_PASSWORDS));
long startTime = System.nanoTime();
final List<Path> filesWithObviousStringss = stream
.filter(path -> {
try(Stream<String> s = Files.lines(path)) {
return s.anyMatch(pattern.asPredicate());
} catch(IOException ex) {
throw new UncheckedIOException(ex);
}
})
.collect(Collectors.toList());
System.out.println("Time taken = "
+ TimeUnit.NANOSECONDS.toSeconds(System.nanoTime()-startTime) + " seconds");
}
instead.
You may also try to turn the stream to parallel mode, e.g.
try(Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))
.filter(Files::isRegularFile)) {
Pattern pattern = Pattern.compile(String.join("|", Strings_PASSWORDS));
long startTime = System.nanoTime();
final List<Path> filesWithObviousStringss = stream
.parallel()
.filter(path -> {
try(Stream<String> s = Files.lines(path)) {
return s.anyMatch(pattern.asPredicate());
} catch(IOException ex) {
throw new UncheckedIOException(ex);
}
})
.collect(Collectors.toList());
System.out.println("Time taken = "
+ TimeUnit.NANOSECONDS.toSeconds(System.nanoTime()-startTime) + " seconds");
}
It’s hard to predict whether this has a benefit, as in most cases, the I/O dominates such an operation.
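The snippets above join the search words with "|" as they are. If the words are meant literally, a sketch like the following may be safer; it rests on a few assumptions. Each word is quoted with Pattern.quote so that characters such as '.' are not treated as regex syntax, matching is case-insensitive as in the question's GrepOption.ignoreCase, and files are decoded as ISO-8859-1 because Files.lines with the default UTF-8 fails on malformed input. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WordScan {

    // Builds one alternation of literal words, e.g. \Qwelcome1\E|\Qchangeme\E.
    static Pattern literalAlternation(List<String> words) {
        String regex = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    // Stops reading the file at the first matching line.
    static boolean containsAny(Path file, Pattern pattern) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.ISO_8859_1)) {
            return lines.anyMatch(line -> pattern.matcher(line).find());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Walks the directory tree and keeps the regular files that contain a match.
    static List<Path> filesContaining(Path root, Pattern pattern) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(path -> containsAny(path, pattern))
                    .collect(Collectors.toList());
        }
    }
}
```

ISO-8859-1 maps every byte to a character, so decoding cannot fail; whether that is acceptable depends on the actual encoding of the files being scanned.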
I have never used Unix4j, but Java provides nice file APIs as well nowadays. Also, Unix4j#grep seems to return all the matches found (as you're using .toStringResult().isEmpty()), while you only seem to need to know whether at least one match exists (which means you should be able to stop as soon as one match is found). Maybe the library provides another method that better suits your needs, e.g. something like #contains? Without Unix4j, Stream#anyMatch could be a good candidate here. Here is a vanilla Java solution if you want to compare it with yours:
private boolean lineContainsObviousStrings(String line) {
return Strings_PASSWORDS // <-- weird naming BTW
.stream()
.anyMatch(line::contains);
}
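private boolean fileContainsObviousStrings(Path path) {
try (Stream<String> stream = Files.lines(path)) {
return stream.anyMatch(this::lineContainsObviousStrings);
} catch (IOException ex) {
// Files.lines declares IOException, which a method reference passed to filter() cannot throw
throw new UncheckedIOException(ex);
}
}
public List<Path> findFilesContainingObviousStrings() throws IOException {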
Instant startTime = Instant.now();
try (Stream<Path> stream = Files.walk(Paths.get(FILES_DIRECTORY))) {
return stream
.filter(FilePredicates.isFileAndNotDirectory())
.filter(this::fileContainsObviousStrings)
.collect(Collectors.toList());
} finally {
Instant endTime = Instant.now();
System.out.println("Time taken = " + Duration.between(startTime, endTime).getSeconds() + " seconds");
}
}
Please try this out too (if it is possible), I am curious how it performs on your files.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class Filescan {
public static void main(String[] args) throws IOException {
Filescan sc = new Filescan();
sc.findWords("src/main/resources/files", new String[]{"author", "book"}, true);
}
// kind of Tuple/Map.Entry
static class Pair<K,V>{
final K key;
final V value;
Pair(K key, V value){
this.key = key;
this.value = value;
}
@Override
public String toString() {
return key + " " + value;
}
}
public void findWords(String directory, String[] words, boolean ignorecase) throws IOException{
final String[] searchWords = ignorecase ? toLower(words) : words;
try (Stream<Path> stream = Files.walk(Paths.get(directory)).filter(Files::isRegularFile)) {
long startTime = System.nanoTime();
List<Pair<Path,Map<String, List<Integer>>>> result = stream
// you can test it with parallel execution, maybe it is faster
.parallel()
// searching
.map(path -> findWordsInFile(path, searchWords, ignorecase))
// filtering out empty optionals
.filter(Optional::isPresent)
// unwrap optionals
.map(Optional::get).collect(Collectors.toList());
System.out.println("Time taken = " + TimeUnit.NANOSECONDS.toSeconds(System.nanoTime()
- startTime) + " seconds");
System.out.println("result:");
result.forEach(System.out::println);
}
}
private String[] toLower(String[] words) {
String[] ret = new String[words.length];
for (int i = 0; i < words.length; i++) {
ret[i] = words[i].toLowerCase();
}
return ret;
}
private static Optional<Pair<Path,Map<String, List<Integer>>>> findWordsInFile(Path path, String[] words, boolean ignorecase) {
try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(path.toFile())))) {
String line = br.readLine();
line = ignorecase & line != null ? line.toLowerCase() : line;
Map<String, List<Integer>> map = new HashMap<>();
int linecount = 0;
while(line != null){
for (String word : words) {
if(line.contains(word)){
if(!map.containsKey(word)){
map.put(word, new ArrayList<Integer>());
}
map.get(word).add(linecount);
}
}
line = br.readLine();
line = ignorecase & line != null ? line.toLowerCase() : line;
linecount++;
}
if(map.isEmpty()){
// returning empty optional when nothing in the map
return Optional.empty();
}else{
// returning a path-map pair with the words and the rows where each word has been found
return Optional.of(new Pair<Path,Map<String, List<Integer>>>(path, map));
}
} catch (IOException ex) {
throw new UncheckedIOException(ex);
}
}
}

How to extract Full Name From a Url in Java

I need a library to extract a file's full name from its URL (a direct download link). I want a powerful library. I use FileNameUtils from Apache Commons, but this class does not support a lot of URLs.
I want a library which supports these URLs:
https://example.cdn.com/mp4/7/9/5/file_795f32460d111df334849ee8336e56ca.mp4?e=1535545105&h=4772d27a70cd9b1c665b712f62592c47&download=1
name : file_795f32460d111df334849ee8336e56ca.mp4
http://example.cdn.comr/post/93/3/Jozve-Kamele-arbi.abp.zip
name : Jozve-Kamele-arbi.abp.zip
http://cdl.example.com/?b=dl-software&f=Windows.8.1.Enterprise.x86.Aug.2018_n.part1.rar
name : dl-software&f=Windows.8.1.Enterprise.x86.Aug.2018_n.part1.rar
https://www.google.com/url?sa=t&source=web&rct=j&url=http://www.pdf995.com/samples/pdf.pdf&ved=2ahUKEwjV096X-ZHdAhVQzlkKHTpUBV4QFjAAegQIARAB&usg=AOvVaw3HFvAQ7GNf5QjsUo05ot-j
name: pdf.pdf
Can anyone help me? Thanks.
I apologize in advance if my grammar is not correct; I can't speak English well.
You could also try to solve this problem with regular expressions (e.g. (?i)([^=/&?]+\\.(" + EXTENSIONS + "))\\b) if you have a list of the file extensions you are interested in.
Here is an example of such a method which extracts a file name from a URL:
private static final String EXTENSIONS = "ez|aw|atom|atomcat|atomsvc|ccxml|cdmia|cdmic|cdmid|cdmio|cdmiq|cu|davmount|dbk|dssc|xdssc|ecma|emma|epub|exi|pfr|gml|gpx|gxf|stk|ipfix|jar|ser|class|js|json|jsonml|lostxml|hqx|cpt|mads|mrc|mrcx|mathml|mbox|mscml|metalink|meta4|mets|mods|mp4s|mp4|mxf|oda|opf|ogx|omdoc|oxps|xer|pdf|pgp|prf|p10|p7s|p8|ac|cer|crl|pkipath|pki|pls|cww|pskcxml|rdf|rif|rnc|rl|rld|rs|gbr|mft|roa|rsd|rss|rtf|sbml|scq|scs|spq|spp|sdp|setpay|setreg|shf|rq|srx|gram|grxml|sru|ssdl|ssml|tfi|tsd|plb|psb|pvb|tcap|pwn|aso|imp|acu|air|fcdt|xdp|xfdf|ahead|azf|azs|azw|acc|ami|apk|cii|fti|atx|mpkg|m3u8|swi|iota|aep|mpm|bmi|rep|cdxml|mmd|cdy|cla|rp9|c11amc|c11amz|csp|cdbcmsg|cmc|clkx|clkk|clkp|clkt|clkw|wbs|pml|ppd|car|pcurl|dart|rdz|fe_launch|dna|mlp|dpg|dfac|kpxx|ait|svc|geo|mag|nml|esf|msf|qam|slt|ssf|ez2|ez3|fdf|mseed|gph|ftc|fnc|ltf|fsc|oas|oa2|oa3|fg5|bh2|ddd|xdw|xbd|fzs|txd|ggb|ggt|gxt|g2w|g3w|gmx|kml|kmz|gac|ghf|gim|grv|gtm|tpl|vcg|hal|zmm|hbci|les|hpgl|hpid|hps|jlt|pcl|pclxl|sfd-hdstx|mpy|irm|sc|igl|ivp|ivu|igm|i2g|qbo|qfx|rcprofile|irp|xpr|fcs|jam|rms|jisp|joda|karbon|chrt|kfo|flw|kon|ksp|htke|kia|sse|lasxml|lbd|lbe|123|apr|pre|nsf|org|scm|lwp|portpkg|mcd|mc1|cdkey|mwf|mfm|flo|igx|mif|daf|dis|mbk|mqy|msl|plc|txf|mpn|mpc|xul|cil|cab|xlam|xlsb|xlsm|xltm|eot|chm|ims|lrm|thmx|cat|stl|ppam|pptm|sldm|ppsm|potm|docm|dotm|wpl|xps|mseq|mus|msty|taglet|nlu|nnd|nns|nnw|ngdat|n-gage|rpst|rpss|edm|edx|ext|odc|otc|odb|odf|odft|odg|otg|odi|oti|odp|otp|ods|ots|odt|odm|ott|oth|xo|dd2|oxt|pptx|sldx|ppsx|potx|xlsx|xltx|docx|dotx|mgp|dp|esa|paw|str|ei6|efif|wg|plf|pbd|box|mgz|qps|ptid|bed|mxl|musicxml|cryptonote|cod|rm|rmvb|link66|st|see|sema|semd|semf|ifm|itp|iif|ipk|mmf|teacher|dxp|sfs|sdc|sda|sdd|smf|sgl|smzip|sm|sxc|stc|sxd|std|sxi|sti|sxm|sxw|sxg|stw|svd|xsm|bdm|xdm|tao|tmo|tpt|mxs|tra|utz|umj|unityweb|uoml|vcx|vis|vsf|wbxml|wmlc|wmlsc|wtb|nbp|wpd|wqd|stf|xar|xfdl|hvd|hvs|hvp|osf|osfpvg|saf|spf|cmp|zaz|vxml|wgt|hlp|wsdl|wspolicy|7z|abw|ace|dmg|aam|aas|bcpio|torrent|bz|
vcd|cfs|chat|pgn|nsc|cpio|csh|dgc|wad|ncx|dtb|res|dvi|evy|eva|bdf|gsf|psf|pcf|snf|arc|spl|gca|ulx|gnumeric|gramps|gtar|hdf|install|iso|jnlp|latex|mie|application|lnk|wmd|wmz|xbap|mdb|obd|crd|clp|mny|pub|scd|trm|wri|nzb|p7r|rar|ris|sh|shar|swf|xap|sql|sit|sitx|srt|sv4cpio|sv4crc|t3|gam|tar|tcl|tex|tfm|obj|ustar|src|fig|xlf|xpi|xz|xaml|xdf|xenc|dtd|xop|xpl|xslt|xspf|yang|yin|zip|adp|s3m|sil|eol|dra|dts|dtshd|lvp|pya|ecelp4800|ecelp7470|ecelp9600|rip|weba|aac|caf|flac|mka|m3u|wax|wma|rmp|wav|xm|cdx|cif|cmdf|cml|csml|xyz|ttc|otf|ttf|woff|woff2|bmp|cgm|g3|gif|ief|ktx|png|btif|sgi|psd|sub|dwg|dxf|fbs|fpx|fst|mmr|rlc|mdi|wdp|npx|wbmp|xif|webp|3ds|ras|cmx|ico|sid|pcx|pnm|pbm|pgm|ppm|rgb|tga|xbm|xpm|xwd|dae|dwf|gdl|gtw|mts|vtu|appcache|css|csv|n3|dsc|rtx|tsv|ttl|vcard|curl|dcurl|mcurl|scurl|sub|fly|flx|gv|3dml|spot|jad|wml|wmls|java|nfo|opml|etx|sfv|uu|vcs|vcf|3gp|3g2|h261|h263|h264|jpgv|ogv|dvb|fvt|pyv|viv|webm|f4v|fli|flv|m4v|mng|vob|wm|wmv|wmx|wvx|avi|movie|smv|ice";
private static final Pattern FILE_DETECT = Pattern.compile("(?i)([^=/&?]+\\.(" + EXTENSIONS + "))\\b");
public static Optional<String> extractFileFrom(String url) {
Matcher matcher = FILE_DETECT.matcher(url);
return (matcher.find()) ? Optional.of(matcher.group(1)) : Optional.empty();
}
And here is a test which demonstrates how to use the method above:
public static void main(String[] args) throws ParseException {
List<String> strings = Arrays.asList(
"https://example.cdn.com/mp4/7/9/5/file_795f32460d111df334849ee8336e56ca.mp4?e=1535545105&h=4772d27a70cd9b1c665b712f62592c47&download=1",
"http://example.cdn.comr/post/93/3/Jozve-Kamele-arbi.abp.zip",
"http://cdl.example.com/?b=dl-software&f=Windows.8.1.Enterprise.x86.Aug.2018_n.part1.rar",
"https://www.google.com/url?sa=t&source=web&rct=j&url=http://www.pdf995.com/samples/pdf.pdf&ved=2ahUKEwjV096X-ZHdAhVQzlkKHTpUBV4QFjAAegQIARAB&usg=AOvVaw3HFvAQ7GNf5QjsUo05ot-j",
"https://www.google.com/url?sa=t&source=web&rct=j&url=http://www.pdf995.com/samples/pdf.PDF&ved=2ahUKEwjV096X-ZHdAhVQzlkKHTpUBV4QFjAAegQIARAB&usg=AOvVaw3HFvAQ7GNf5QjsUo05ot-j");
strings.stream().map(s -> extractFileFrom(s)).collect(Collectors.toList())
.forEach(System.out::println);
}
If you execute the main method you will see this on the console:
Optional[file_795f32460d111df334849ee8336e56ca.mp4]
Optional[Jozve-Kamele-arbi.abp.zip]
Optional[Windows.8.1.Enterprise.x86.Aug.2018_n.part1.rar]
Optional[pdf.pdf]
Optional[pdf.PDF]
I use this method; I hope it helps you too. It also strips the query string (after ?) and the fragment (after #).
public static String parseFileNameFromUrl(String url) {
if (url == null) {
return "";
}
try {
URL res = new URL(url);
String resHost = res.getHost();
if (resHost.length() > 0 && url.endsWith(resHost)) {
// handle ...example.com
return "";
}
} catch (MalformedURLException e) {
e.printStackTrace();
return "";
}
int startIndex = url.lastIndexOf('/') + 1;
int length = url.length();
// find end index for ?
int lastQuestionMarkPos = url.lastIndexOf('?');
if (lastQuestionMarkPos == -1) {
lastQuestionMarkPos = length;
}
// find end index for #
int lastHashPos = url.lastIndexOf('#');
if (lastHashPos == -1) {
lastHashPos = length;
}
// calculate the end index
int endIndex = Math.min(lastQuestionMarkPos, lastHashPos);
return url.substring(startIndex, endIndex);
}
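The four URLs in the question can also be handled without an extension list. The following is only a sketch under a stated assumption: a file name is taken to be the last '/'-separated segment that contains a dot, looked up first in the path and then in the query parameter values, starting from the last parameter. The class and method names are hypothetical, and percent-encoded names are not decoded.

```java
import java.net.URI;
import java.util.Optional;

final class UrlFileNames {

    static Optional<String> fileNameOf(String url) {
        if (url == null) {
            return Optional.empty();
        }
        URI uri;
        try {
            uri = URI.create(url);
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // not a syntactically valid URI
        }
        // 1. Try the last path segment, e.g. /post/93/3/name.zip
        String candidate = lastSegment(uri.getPath());
        if (candidate.contains(".")) {
            return Optional.of(candidate);
        }
        // 2. Fall back to query values, e.g. ?f=name.rar or ?url=http://host/a.pdf
        String query = uri.getRawQuery();
        if (query == null) {
            return Optional.empty();
        }
        String[] params = query.split("&");
        for (int i = params.length - 1; i >= 0; i--) {
            String value = params[i].substring(params[i].indexOf('=') + 1);
            String name = lastSegment(value);
            if (name.contains(".")) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    // Text after the last '/', or the whole string if there is none.
    private static String lastSegment(String path) {
        if (path == null) {
            return "";
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

The "contains a dot" rule is a heuristic: a query value such as v=1.2 would also be reported as a file name, so combining it with an extension list may still be worthwhile.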

Catch inside while loop

I am converting strings to Integer, so when I receive a non-numeric string an exception is thrown and execution stops. I want to skip that value and print all the remaining numbers, so I put the catch inside the while loop. Now each bad value is reported as it occurs and the remaining numbers are printed, but the code has to send mail to the team when an exception is thrown (I will place the mailing part inside the catch). Sending a mail for every single exception is not acceptable, so I need to collect all the exceptions raised inside the while loop and send one mail covering all of them. Is that possible?
Here is simple sample code. (I will handle the mailing part later; for now, please tell me the logic for collecting all the exceptions and printing them at once.)
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class dummy {
public static void main(String args[]) {
String getEach="";
List A = new ArrayList();
A.add("1");
A.add("2");
A.add("3");
A.add("AA");
A.add("4");
A.add("5");
A.add("dsfgfdsgfdshg");
A.add("30");
Iterator<String> map = A.iterator();
while (map.hasNext()) {
try {
getEach = map.next();
int getValue = Integer.parseInt(getEach);
System.out.println("Value:::::: "+getValue);
} catch (Exception E) {
System.out.println("There is an exception c" +E.getMessage());
}
}
}
}
Declare a List of Exception objects and collect each exception in it.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
public class dummy {
public static void main(String args[])
{
String getEach = "";
List<String> A = new ArrayList<String>();
A.add("1");
A.add("2");
A.add("3");
A.add("AA");
A.add("4");
A.add("5");
A.add("dsfgfdsgfdshg");
A.add("30");
Iterator<String> map = A.iterator();
List<Exception> errList = new ArrayList<Exception>();
while (map.hasNext()) {
try {
getEach = map.next();
int getValue = Integer.parseInt(getEach);
System.out.println("Value:::::: " + getValue);
} catch (Exception E)
{
//System.out.println("There is an exception c" + E.getMessage());
errList.add(E);
}
}
if(!errList.isEmpty())
{
for(Iterator<Exception> eIter = errList.iterator();eIter.hasNext();)
{
Exception e = eIter.next();
System.out.println("There is an exception c" + e.getMessage());
}
}
}
}
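If the single mail only needs to name the bad inputs, it may be enough to collect the rejected strings instead of the exception objects and build one summary at the end. A minimal sketch of that variation; the class name, method names, and message wording are assumptions, and the actual mailing call is left out:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BatchParse {

    // Parses every input; values that fail are appended to 'rejected'
    // instead of aborting the loop.
    static List<Integer> parseAll(List<String> inputs, List<String> rejected) {
        List<Integer> values = new ArrayList<>();
        for (String input : inputs) {
            try {
                values.add(Integer.parseInt(input));
            } catch (NumberFormatException e) {
                rejected.add(input);
            }
        }
        return values;
    }

    // One message for all failures; this is what a single mail could carry.
    static String summary(List<String> rejected) {
        return rejected.size() + " value(s) could not be parsed: " + String.join(", ", rejected);
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("1", "2", "3", "AA", "4", "5", "dsfgfdsgfdshg", "30");
        List<String> rejected = new ArrayList<>();
        for (int value : parseAll(inputs, rejected)) {
            System.out.println("Value:::::: " + value);
        }
        if (!rejected.isEmpty()) {
            System.out.println(summary(rejected)); // send one mail here instead
        }
    }
}
```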
You could, I suppose, just drop the try/catch altogether and use something like this:
while(map.hasNext()) {
getEach = map.next();
// Check whether getEach is the string representation of a
// signed or unsigned integer. Decimal values are deliberately
// not matched, because Integer.parseInt would throw on them.
// Values outside the int range would still throw.
// If getEach holds an integer then print it.
if (getEach.matches("-?\\d+")) {
int getValue = Integer.parseInt(getEach);
System.out.println("Value:::::: "+getValue);
}
}

How to remove all special characters from a string in java?

I want to remove all special characters from a string. I tried many options that were given on Stack Overflow, but none of them work for me.
Here is my code:
public class convert {
public static void main(String[] args) {
try {
List<List<String>> outerList = new ArrayList<List<String>>();
outerList.add(new ArrayList<String>(asList("11-","2")));
outerList.add(new ArrayList<String>(asList("(2^","1")));
outerList.add(new ArrayList<String>(asList("11","3)")));
int i,j;
for(i=0;i<outerList.size();i++){
for(j=0;j<outerList.get(0).size();j++){
outerList.get(i).get(j).replaceAll("[^\\w\\s]", "");
if(outerList.get(i).get(j).matches("-?\\d+")){
continue;
}else{
System.out.println("special characters not removed");
System.exit(0);
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
The (simple) error is that s.replaceAll(...) does not change s but returns a new, changed string:
String s = outerList.get(i).get(j).replaceAll("[^\\w\\s]", "");
outerList.get(i).set(j, s);
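The same write-back can be expressed with List.replaceAll (available since Java 8), which stores the result of the function in each position of the list. A sketch using the question's data and regular expression; the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StripSpecials {

    // Replaces every cell with a copy that has the special characters removed.
    // String.replaceAll returns a new string; List.replaceAll stores it back.
    static void stripInPlace(List<List<String>> rows) {
        for (List<String> row : rows) {
            row.replaceAll(cell -> cell.replaceAll("[^\\w\\s]", ""));
        }
    }

    public static void main(String[] args) {
        List<List<String>> outerList = new ArrayList<>();
        outerList.add(new ArrayList<>(Arrays.asList("11-", "2")));
        outerList.add(new ArrayList<>(Arrays.asList("(2^", "1")));
        outerList.add(new ArrayList<>(Arrays.asList("11", "3)")));
        stripInPlace(outerList);
        System.out.println(outerList);
    }
}
```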
If you want to drop everything that is not alphanumeric, you can use
String value = "hello#() world";
value = value.replaceAll("[^A-Za-z0-9]", "");
System.out.println(value); // => helloworld
Something similar has already been asked before.
Use StringUtils from Apache Commons Lang (http://commons.apache.org/proper/commons-lang/):
http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html

Java pairs of symbols program

Using stack data structure(s): If the input file is not balanced, the un-balance cause and the in-file localization details will be supplied. For flexibility reasons, read the balancing pairs of symbols from a text file. Test your program by considering the following pairs of symbols: ( ), { }, [ ], /* */
I'm having trouble with the last requirement: /* */
I also can't seem to grasp how to print the in-file localization details, i.e. which line number of the text file the error occurred on.
The text file looks like this:
(()(()
{}}}{}{
[[[]][][]
((}})){{]
()
[]
{}
[]{}
()()()[]
*/ /*
(a+b) = c
The code:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
public class P1 {
private boolean match = true;
// The stack
private java.util.Stack<Character> matchStack = new java.util.Stack<Character>();
// What to do with a match
public boolean ismatch() {
return match && matchStack.isEmpty();
}
// Finding a match
public void add(char c) {
Character k = leftSide(c);
if (k == null)
;
else if (k.charValue() == c)
matchStack.push(k);
else {
if (matchStack.isEmpty() || !matchStack.pop().equals(k))
match = false;
}
}
// Add string values
public void add(String s) {
for (int i = 0; i < s.length(); i++)
add(s.charAt(i));
}
// The various symbol pairs
protected static Character leftSide(char c) {
switch (c) {
case '(':
case ')':
return new Character('(');
case '[':
case ']':
return new Character('[');
case '{':
case '}':
return new Character('{');
default:
return null;
}
}
// Main method. Welcome message, read the test file, build the array, print
// results.
public static void main(String args[]) {
List<String[]> arrays = new ArrayList<String[]>();
// Welcome message
System.out
.println("Project #1\n"
+ "Welcome! The following program ensures both elements of various paired symbols are present.\n"
+ "Test Data will appear below: \n"
+ "-------------------------------");
// Read the file
try {
BufferedReader in = new BufferedReader(new FileReader(
"testfile.txt"));
String str;
// Keep reading while there is still more data
while ((str = in.readLine()) != null) {
// Line by line read & add to array
String arr[] = str.split(" ");
if (arr.length > 0)
arrays.add(arr);
// Let the user know the match status (i.e. print the results)
P1 mp = new P1();
mp.add(str);
System.out.print(mp.ismatch() ? "\nSuccessful Match:\n"
: "\nThis match is not complete:\n");
System.out.println(str);
}
in.close();
// Catch exceptions
} catch (FileNotFoundException e) {
System.out
.println("We're sorry, we are unable to find that file: \n"
+ e.getMessage());
} catch (IOException e) {
System.out
.println("We're sorry, we are unable to read that file: \n"
+ e.getMessage());
}
}
}
An easy way to implement this is to use a map of stacks such as Map<String, Stack<Location>>, where Location is a class you create that holds two ints (a line number and a character number). That gives you your location info.
The key (String) of this map is the left side (opener) of each pair. Every time you encounter an opener, look up the appropriate Stack in the map and push a new Location for that opener onto it. Every time you encounter a closer, look up its opener, use the opener to find the correct Stack in the map, and pop it once.
The reason to use String for the key is that not all of your openers can be represented by a Character, namely your /* opener, so a String will have to do. Your leftSide(char) function will now become leftSide(String); because you can't switch on Strings before Java 7, you'll either have to use if-else or use a map (Map<String, String>) to hold the closer-to-opener mappings.
When the end of the file is reached, the only Location objects remaining in the Stack objects should be the unclosed openers.
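The map-of-stacks idea described above could be sketched roughly as follows. PairChecker, Location, addLine, symbolAt, and finish are all names invented for this example, and ArrayDeque stands in for java.util.Stack. The pairs are passed in as an array here; in the assignment they would be read from a text file.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the map-of-stacks approach; all names are assumptions.
class PairChecker {
    // Line and column (both 1-based) at which a symbol starts.
    static final class Location {
        final int line, column;
        Location(int line, int column) { this.line = line; this.column = column; }
        @Override public String toString() { return "line " + line + ", column " + column; }
    }

    private final Map<String, String> closerToOpener = new LinkedHashMap<>();
    private final Map<String, Deque<Location>> open = new LinkedHashMap<>();
    private final List<String> errors = new ArrayList<>();

    // Each entry of pairs is {opener, closer}, for example {"/*", "*/"}.
    PairChecker(String[][] pairs) {
        for (String[] p : pairs) {
            closerToOpener.put(p[1], p[0]);
            open.put(p[0], new ArrayDeque<>());
        }
    }

    // Scans one line of text, pushing openers and popping on closers.
    void addLine(String text, int lineNo) {
        int i = 0;
        while (i < text.length()) {
            String symbol = symbolAt(text, i);
            if (symbol == null) { i++; continue; }
            Location here = new Location(lineNo, i + 1);
            if (open.containsKey(symbol)) {
                open.get(symbol).push(here);
            } else {
                Deque<Location> stack = open.get(closerToOpener.get(symbol));
                if (stack.isEmpty()) errors.add("Unmatched " + symbol + " at " + here);
                else stack.pop();
            }
            i += symbol.length();
        }
    }

    // Returns the longest opener or closer starting at index i, or null.
    private String symbolAt(String text, int i) {
        String best = null;
        List<String> symbols = new ArrayList<>(open.keySet());
        symbols.addAll(closerToOpener.keySet());
        for (String s : symbols) {
            if (text.startsWith(s, i) && (best == null || s.length() > best.length())) best = s;
        }
        return best;
    }

    // Call after the last line: whatever is still on a stack was never closed.
    List<String> finish() {
        List<String> all = new ArrayList<>(errors);
        for (Map.Entry<String, Deque<Location>> e : open.entrySet()) {
            for (Location loc : e.getValue()) all.add("Unclosed " + e.getKey() + " at " + loc);
        }
        return all;
    }

    public static void main(String[] args) {
        PairChecker checker = new PairChecker(
                new String[][] {{"(", ")"}, {"{", "}"}, {"[", "]"}, {"/*", "*/"}});
        checker.addLine("(()", 1);
        checker.addLine("*/ /*", 2);
        for (String problem : checker.finish()) System.out.println(problem);
    }
}
```

One limitation of keeping a separate stack per opener is that crossed pairs such as ([)] are not reported, because each symbol type is tracked independently. If nesting order matters, a single shared stack of (opener, Location) entries would catch that case.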
