Working with EnumSet

This code works, but only with a try/catch block.
public enum AccentuationUTF8 {/** */
é, /** */è, /** */ç, /** */à, /** */ù, /** */
ä, /** */ë, /** */ö, /** */ï, /** */ü, /** */
â, /** */ê, /** */î, /** */ô, /** */û, /** */
}
......
final EnumSet<AccentuationUTF8> esUtf8 = EnumSet.noneOf(AccentuationUTF8.class);
final String[] acc1 = {"é", "à", "u"};
for (final String string : acc1) {
try { // The ontologic problem
esUtf8.add(AccentuationUTF8.valueOf(string));
} catch (final Exception e) {
System.out.println(string + " not an accent.");
}
}
System.out.println(esUtf8.size() + "\t" + esUtf8.toString());
Output:
u not an accent.
2 [é, à]
I want to generate an EnumSet with all the accents of a word or of a sentence.
Edit after comments
Is it possible to build such an EnumSet without the try block that AccentuationUTF8.valueOf(string) requires?
Is there a better way to code this?
FINAL EDIT
Your responses suggested a good solution: since valueOf(String) throws an exception for a name that is not an enum constant, check the name first against a temporary HashSet of the constant names, whose contains() simply returns false.
So the ugly try/catch is now removed, and the code is now:
final Set<String> setTmp = new HashSet<>(AccentsUTF8.values().length);
for (final AccentsUTF8 object : AccentsUTF8.values()) {
setTmp.add(object.toString());
}
final EnumSet<AccentsUTF8> esUtf8 = EnumSet.noneOf(AccentsUTF8.class);
final String[] acc1 = {"é", "à", "u"};
for (final String string : acc1) {
if (setTmp.contains(string)) {
esUtf8.add(AccentsUTF8.valueOf(string));
} else {
System.out.println(string + " not an accent.");
}
}
System.out.println(esUtf8.size() + "\t" + esUtf8.toString());
Thanks for your attention.

I don't think an enum is the best approach here - partly because it's only going to work for valid Java identifiers.
It looks like what you really want is just a Set<Character>, with something like:
Set<Character> accentsInText = new HashSet<Character>();
for (int i = 0; i < text.length(); i++) {
Character c = text.charAt(i);
if (ALL_ACCENTS.contains(c)) {
accentsInText.add(c);
}
}
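The snippet above leaves ALL_ACCENTS undefined. Here is a minimal sketch of one way to fill it in, assuming the accents of interest are the fifteen characters from the question's enum. The class name AccentFinder and the method accentsIn are hypothetical, and the characters are written as Unicode escapes only so that the source does not depend on the file encoding.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class AccentFinder {
    // Assumption: the same fifteen accented letters as the question's enum
    // (e-acute, e-grave, c-cedilla, a-grave, u-grave, then the diaeresis
    // and circumflex forms of a, e, o/i, i/o, u).
    private static final String ACCENT_CHARS =
            "\u00e9\u00e8\u00e7\u00e0\u00f9"
          + "\u00e4\u00eb\u00f6\u00ef\u00fc"
          + "\u00e2\u00ea\u00ee\u00f4\u00fb";

    static final Set<Character> ALL_ACCENTS = new LinkedHashSet<>();
    static {
        for (char c : ACCENT_CHARS.toCharArray()) {
            ALL_ACCENTS.add(c);
        }
    }

    // Returns the accented characters found in text, in order of first appearance.
    static Set<Character> accentsIn(String text) {
        Set<Character> found = new LinkedHashSet<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (ALL_ACCENTS.contains(c)) {
                found.add(c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // "deja vu" written with e-acute and a-grave
        System.out.println(accentsIn("d\u00e9j\u00e0 vu"));
    }
}
```

This only recognises precomposed characters; text that stores accents as separate combining marks would need to be normalized first.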

Related

Unit Tests for spring boot application

I have never used JUnit before, and I need to test my code with it.
I have been searching Google all day, but every example I found uses Mockito, and my code doesn't use dependency injection (@Autowired).
How can I use it for these methods?
Thanks in advance.
public class WordService {
public WordService() {
}
public static String upperCaseFirst(String value) {
char[] listChar = value.toCharArray();
listChar[0] = Character.toUpperCase(listChar[0]);
return new String(listChar);
}
/**
* Find and return the search word
* @param name
* @return the word sought or null if not found
*/
public Word findWordByName(String name){
String nameUpper = upperCaseFirst(name);
WordDao w = new WordDao();
Word found = w.findWord(nameUpper);
List<String> definitions = new ArrayList<>();
if(found != null) {
for(int i=0; i<found.getDefinition().size(); i++) {
StringBuffer defBuffer = new StringBuffer();
String definitionFound = found.getDefinition().get(i);
definitionFound = definitionFound.replace("\n", "");
defBuffer.append(definitionFound);
defBuffer.append("_");
definitions.add(i, defBuffer.toString());
}
found.setDefinition(definitions);
}
return found;
}
/**
*
* @return Return a list of words
*/
public List<Word> findAllWord(){
WordDao w = new WordDao();
return w.findAllWords();
}
}

You can extract WordDao to class level as a field and create a setter for it.
After that, in the unit test you can mock WordDao and control what its methods return. For the second method it is something like:
WordDao wMocked = mock(WordDao.class);
Word word1 = new...
Word word2 = new...
List<Word> words = List.of(word1, word2);
when(wMocked.findAllWords()).thenReturn(words);
WordService ws = new WordService();
ws.setWordDao(wMocked);
assertEquals(words, ws.findAllWord());
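The refactoring described above can be sketched without any test library. This is an illustration under assumptions, not the asker's actual classes: Word and WordDao here are hypothetical stand-ins reduced to what the example needs, and a hand-written stub takes the place of the Mockito mock.

```java
import java.util.List;

// Hypothetical stand-in for the question's Word class.
class Word {
    final String name;
    Word(String name) { this.name = name; }
}

// Hypothetical stand-in for the real DAO, which would talk to a data store.
class WordDao {
    public List<Word> findAllWords() {
        throw new IllegalStateException("would hit the real data store");
    }
}

class WordService {
    // The DAO is now a field instead of being created inside each method.
    private WordDao wordDao = new WordDao();

    // Setter injection: lets a test swap in a fake DAO.
    public void setWordDao(WordDao wordDao) { this.wordDao = wordDao; }

    public List<Word> findAllWord() { return wordDao.findAllWords(); }
}

class WordServiceCheck {
    public static void main(String[] args) {
        List<Word> words = List.of(new Word("Alpha"), new Word("Beta"));
        // Stub that returns a fixed list instead of querying anything.
        WordDao stub = new WordDao() {
            @Override
            public List<Word> findAllWords() { return words; }
        };
        WordService service = new WordService();
        service.setWordDao(stub);
        System.out.println(service.findAllWord() == words);
    }
}
```

With Mockito on the classpath, the anonymous subclass would be replaced by mock(WordDao.class) and a when(...).thenReturn(...) call; the service code stays the same.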

Initialize multiple numeric fields at once in JAVA that begin with certain values

I am working on a Java class that contains a ton of numeric fields. Most of them begin with something like 'CMth' or 'FYTD'. Is it possible to initialize all fields of the same type that begin or end with a certain value? For example, I have the following fields:
CMthRepCaseACR CMthRepUnitACR CMthRecCaseACR CMthRecUnitACR CMthHecCaseACR CMthHecUnitACR FYTDHecCaseACR FYTDHecUnitACR CMthBBKCaseACR CMthBBKUnitACR CMthPIHCaseACR
I am trying to figure out whether it is possible to initialize to zero all fields that end with 'ACR' or begin with 'CMth'.
I know I can do something like cmtha = cmthb = cmthc = 0, but I was wondering whether there is a command that takes some kind of mask to select the fields to initialize.
Thanks

Assuming that you cannot change that said Java class (and e.g. use a collection or map to store the values) your best bet is probably reflection (see also: Trail: The Reflection API). Reflection gives you access to all fields of the class and you can then implement whatever matching you'd like.
Here's a short demo to get you started, minus error handling, sanity checks and adaptations to your actual class:
import java.util.stream.Stream;
public class Demo {
private static class DemoClass {
private int repCaseACR = 1;
private int CMthRepUnit = 2;
private int foo = 3;
private int bar = 4;
@Override
public String toString() {
return "DemoClass [repCaseACR=" + repCaseACR + ", CMthRepUnit=" + CMthRepUnit + ", foo=" + foo + ", bar="
+ bar + "]";
}
}
public static void main(String[] args) {
DemoClass demoClass = new DemoClass();
System.out.println("before: " + demoClass);
resetFields(demoClass, "CMth", null);
System.out.println("after prefix reset: " + demoClass);
resetFields(demoClass, null, "ACR");
System.out.println("after suffix reset: " + demoClass);
}
private static void resetFields(DemoClass instance, String prefix, String suffix) {
Stream.of(instance.getClass().getDeclaredFields())
.filter(field ->
(prefix != null && field.getName().startsWith(prefix))
|| (suffix != null && field.getName().endsWith(suffix)))
.forEach(field -> {
field.setAccessible(true);
try {
field.set(instance, 0);
} catch (IllegalArgumentException | IllegalAccessException e) {
// TODO handle me
}
});
}
}
Output:
before: DemoClass [repCaseACR=1, CMthRepUnit=2, foo=3, bar=4]
after prefix reset: DemoClass [repCaseACR=1, CMthRepUnit=0, foo=3, bar=4]
after suffix reset: DemoClass [repCaseACR=0, CMthRepUnit=0, foo=3, bar=4]
Note: Both links are seriously dated but the core functionality of reflection is still the same.
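If the class can be changed after all, the map-based storage mentioned at the top of this answer avoids reflection entirely. The following is a rough sketch of that idea; the class name CounterStore and its methods are hypothetical and not part of the original question.

```java
import java.util.Map;
import java.util.TreeMap;

class CounterStore {
    // One map entry per former field, keyed by the old field name.
    private final Map<String, Integer> values = new TreeMap<>();

    void set(String name, int value) { values.put(name, value); }

    int get(String name) { return values.getOrDefault(name, 0); }

    // True if name starts with prefix or ends with suffix; a null argument disables that test.
    private static boolean matches(String name, String prefix, String suffix) {
        return (prefix != null && name.startsWith(prefix))
            || (suffix != null && name.endsWith(suffix));
    }

    // Sets every matching counter to zero and leaves the others untouched.
    void reset(String prefix, String suffix) {
        values.replaceAll((name, value) -> matches(name, prefix, suffix) ? 0 : value);
    }

    @Override
    public String toString() { return values.toString(); }

    public static void main(String[] args) {
        CounterStore store = new CounterStore();
        store.set("CMthRepUnit", 2);
        store.set("FYTDHecCaseACR", 5);
        store.set("foo", 3);
        store.reset("CMth", null);
        System.out.println("after prefix reset: " + store);
        store.reset(null, "ACR");
        System.out.println("after suffix reset: " + store);
    }
}
```

The trade-off is that field names become strings, so a typo in a key is no longer caught by the compiler.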

Given a span of string like [0..2) how to find string equivalent?

I am using the Apache OpenNLP toolkit in Java. I want to display only the named entities in a given text, such as geographical names, persons, etc. The following code snippet gives string spans:
try {
System.out.println("Input : Pierre Vinken is 61 years old");
InputStream modelIn = new FileInputStream("en-ner-person.bin");
TokenNameFinderModel model = new TokenNameFinderModel(modelIn);
NameFinderME nameFinder = new NameFinderME(model);
String[] sentence = new String[]{
"Pierre",
"Vinken",
"is",
"61",
"years",
"old",
"."
};
Span nameSpans[] = nameFinder.find(sentence);
for(Span s: nameSpans)
System.out.println("Name Entity : "+s.toString());
}
catch (IOException e) {
e.printStackTrace();
}
Output :
Input : Pierre Vinken is 61 years old
Name Entity : [0..2) person
How can I get the equivalent string rather than the span? Is there an API for that?

Span has the method getCoveredText(CharSequence text), which does this for character offsets. But I don't understand why you need an API method to get the text corresponding to a span. A span clearly provides start (inclusive) and end (exclusive) integer offsets, which here index into the token array. So the following suffices:
StringBuilder builder = new StringBuilder();
for (int i = s.getStart(); i < s.getEnd(); i++) {
builder.append(sentence[i]).append(" ");
}
String name = builder.toString().trim();
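The same loop can be written as a small helper that has no OpenNLP dependency. This is a sketch only: SpanText and covered are hypothetical names, and the start and end arguments are assumed to be the values that Span.getStart() and Span.getEnd() return for a token-level span.

```java
import java.util.Arrays;

class SpanText {
    // Joins tokens[start..end) with single spaces; start is inclusive, end is exclusive.
    static String covered(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        String[] sentence = {"Pierre", "Vinken", "is", "61", "years", "old", "."};
        // The span [0..2) from the question covers the first two tokens.
        System.out.println(covered(sentence, 0, 2)); // prints "Pierre Vinken"
    }
}
```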

You can use the Span class itself.
The following class method returns the CharSequence that correspond to the Span instance from another CharSequence text:
/**
* Retrieves the string covered by the current span of the specified text.
*
* @param text
*
* @return the substring covered by the current span
*/
public CharSequence getCoveredText(CharSequence text) { ... }
Notice that this class also has two static methods that accept an array of Span and respectively a CharSequence or an array of tokens (String[]) to return the equivalent array of String.
/**
* Converts an array of {@link Span}s to an array of {@link String}s.
*
* @param spans
* @param s
* @return the strings
*/
public static String[] spansToStrings(Span[] spans, CharSequence s) {
String[] tokens = new String[spans.length];
for (int si = 0, sl = spans.length; si < sl; si++) {
tokens[si] = spans[si].getCoveredText(s).toString();
}
return tokens;
}
public static String[] spansToStrings(Span[] spans, String[] tokens) { ... }
I hope it helps...

Generating interface with ASM is not working

I need to generate an interface at runtime. This interface will be used in a dynamic proxy. At first, I found this article from Google, but then I found I could just use ASM instead. Here is my code that gets the bytecode of the interface:
private static byte[] getBytecode(String internalName, String genericClassTypeSignature, Method[] methods, Class<?>... extendedInterfaces) throws IOException {
ClassWriter cw = new ClassWriter(0);
String[] interfaces = new String[extendedInterfaces.length];
int i = 0;
for (Class<?> interfac : extendedInterfaces) {
interfaces[i] = interfac.getName().replace('.', '/');
i++;
}
cw.visit(V1_6, ACC_PUBLIC + ACC_ABSTRACT + ACC_INTERFACE, internalName, null, "java/lang/Object", interfaces);
ArrayList<String> exceptions = new ArrayList<String>();
for (Method m : methods) {
exceptions.clear();
for (Class<?> exception : m.getExceptionTypes()) {
exceptions.add(getInternalNameOf(exception));
}
cw.visitMethod(removeInvalidAbstractModifiers(m.getModifiers()) + ACC_ABSTRACT, m.getName(), getMethodDescriptorOf(m), getTypeSignatureOf(m), exceptions.toArray(new String[exceptions.size()]));
}
cw.visitEnd();
return cw.toByteArray();
}
private static int removeInvalidAbstractModifiers(int mod) {
int result = 0;
if (Modifier.isProtected(mod)) {
result += ACC_PROTECTED;
}
if (Modifier.isPublic(mod)) {
result += ACC_PUBLIC;
}
if (Modifier.isTransient(mod)) {
result += ACC_VARARGS;
}
return result;
}
Just for test purposes, I tried to convert JFrame to an interface. But when I load my generated interface, it gives me a java.lang.ClassFormatError:
java.lang.ClassFormatError: Method paramString in class javax/swing/JFrame$GeneratedInterface has illegal modifiers: 0x404
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
// ...
Modifier.toString(0x404) tells me that 0x404 means protected abstract. As far as I know, a protected abstract method in an abstract class is perfectly legal.
Here is the code for the paramString method (see above) in JFrame:
/**
* Returns a string representation of this <code>JFrame</code>.
* This method
* is intended to be used only for debugging purposes, and the
* content and format of the returned string may vary between
* implementations. The returned string may be empty but may not
* be <code>null</code>.
*
* @return a string representation of this <code>JFrame</code>
*/
protected String paramString() {
String defaultCloseOperationString;
if (defaultCloseOperation == HIDE_ON_CLOSE) {
defaultCloseOperationString = "HIDE_ON_CLOSE";
} else if (defaultCloseOperation == DISPOSE_ON_CLOSE) {
defaultCloseOperationString = "DISPOSE_ON_CLOSE";
} else if (defaultCloseOperation == DO_NOTHING_ON_CLOSE) {
defaultCloseOperationString = "DO_NOTHING_ON_CLOSE";
} else if (defaultCloseOperation == 3) {
defaultCloseOperationString = "EXIT_ON_CLOSE";
} else defaultCloseOperationString = "";
String rootPaneString = (rootPane != null ?
rootPane.toString() : "");
String rootPaneCheckingEnabledString = (rootPaneCheckingEnabled ?
"true" : "false");
return super.paramString() +
",defaultCloseOperation=" + defaultCloseOperationString +
",rootPane=" + rootPaneString +
",rootPaneCheckingEnabled=" + rootPaneCheckingEnabledString;
}
I see no reason why I should be getting this error. Could someone explain this to me?

Methods in an interface must be public.
Also, in your removeInvalidAbstractModifiers() method, you should be using |= to set a flag, rather than +=. The latter will cause problems if the flag is already set (which I realize it won't be if starting from 0, but it's a good habit to get into). Although why you're setting the flag in a method called "remove," I have no idea.
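Both points can be combined into one small helper. This is a sketch under assumptions rather than a drop-in fix: InterfaceFlags and forInterfaceMethod are hypothetical names, and the varargs bit is defined locally (0x0080, the same value as ACC_VARARGS in the class-file format) so that the example needs no ASM dependency.

```java
import java.lang.reflect.Modifier;

class InterfaceFlags {
    // Same bit as ACC_VARARGS in the class-file format.
    static final int ACC_VARARGS = 0x0080;

    // Access flags for an abstract interface method: always public and abstract,
    // keeping only the varargs bit of the original method's modifiers.
    static int forInterfaceMethod(int mod) {
        int result = Modifier.PUBLIC | Modifier.ABSTRACT;
        if ((mod & ACC_VARARGS) != 0) {
            result |= ACC_VARARGS;
        }
        return result;
    }

    public static void main(String[] args) {
        // A protected method such as paramString() now gets 0x401 instead of 0x404.
        int flags = forInterfaceMethod(Modifier.PROTECTED);
        System.out.println("0x" + Integer.toHexString(flags)); // prints "0x401"
    }
}
```

Note that this publishes protected methods as public in the generated interface. If that is not wanted, the alternative is to skip every method for which Modifier.isPublic returns false.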

How to normalize a URL in Java?

URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. The goal of the normalization process is to transform a URL into a normalized or canonical URL so it is possible to determine if two syntactically different URLs are equivalent.
Strategies include adding trailing slashes, https => http, etc. The Wikipedia page lists many.
Got a favorite method of doing this in Java? Perhaps a library (Nutch?), but I'm open. Smaller and fewer dependencies is better.
I'll handcode something for now and keep an eye on this question.
EDIT: I want to aggressively normalize to count URLs as the same if they refer to the same content. For example, I ignore the parameters utm_source, utm_medium, utm_campaign. For example, I ignore subdomain if the title is the same.

Have you taken a look at the URI class?
http://docs.oracle.com/javase/7/docs/api/java/net/URI.html#normalize()
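As a quick illustration of what URI.normalize() covers, here is a minimal sketch; the class name UriNormalizeDemo and the example.com URL are made up for the example. The method only cleans up the path by removing "." segments and resolving ".." segments, so it is a starting point rather than a full canonicalizer.

```java
import java.net.URI;

class UriNormalizeDemo {
    // Parses the string and returns it with redundant path segments removed.
    static String normalize(String url) {
        return URI.create(url).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("http://example.com/a/./b/../c")); // prints "http://example.com/a/c"
    }
}
```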

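I found this question last night, but there wasn't an answer I was looking for, so I made my own. Here it is in case somebody wants it in the future:
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
/**
* - Convert the scheme and host to lowercase (done by java.net.URL)
* - Normalize the path (done by java.net.URI)
* - Add the port number.
* - Remove the fragment (the part after the #).
* - Remove trailing slash.
* - Sort the query string params.
* - Remove some query string params like "utm_*" and "*session*".
*/
public class NormalizeURL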
{
public static String normalize(final String taintedURL) throws MalformedURLException
{
final URL url;
try
{
url = new URI(taintedURL).normalize().toURL();
}
catch (URISyntaxException e) {
throw new MalformedURLException(e.getMessage());
}
// replaceAll takes a regex; replace() would look for the literal text "/$"
final String path = url.getPath().replaceAll("/$", "");
final SortedMap<String, String> params = createParameterMap(url.getQuery());
final int port = url.getPort();
final String queryString;
if (params != null)
{
// Some params are only relevant for user tracking, so remove the most commons ones.
for (Iterator<String> i = params.keySet().iterator(); i.hasNext();)
{
final String key = i.next();
if (key.startsWith("utm_") || key.contains("session"))
{
i.remove();
}
}
queryString = "?" + canonicalize(params);
}
else
{
queryString = "";
}
return url.getProtocol() + "://" + url.getHost()
+ (port != -1 && port != 80 ? ":" + port : "")
+ path + queryString;
}
/**
* Takes a query string, separates the constituent name-value pairs, and
* stores them in a SortedMap ordered by lexicographical order.
* @return Null if there is no query string.
*/
private static SortedMap<String, String> createParameterMap(final String queryString)
{
if (queryString == null || queryString.isEmpty())
{
return null;
}
final String[] pairs = queryString.split("&");
final Map<String, String> params = new HashMap<String, String>(pairs.length);
for (final String pair : pairs)
{
if (pair.length() < 1)
{
continue;
}
String[] tokens = pair.split("=", 2);
for (int j = 0; j < tokens.length; j++)
{
try
{
tokens[j] = URLDecoder.decode(tokens[j], "UTF-8");
}
catch (UnsupportedEncodingException ex)
{
ex.printStackTrace();
}
}
switch (tokens.length)
{
case 1:
{
if (pair.charAt(0) == '=')
{
params.put("", tokens[0]);
}
else
{
params.put(tokens[0], "");
}
break;
}
case 2:
{
params.put(tokens[0], tokens[1]);
break;
}
}
}
return new TreeMap<String, String>(params);
}
/**
* Canonicalize the query string.
*
* @param sortedParamMap Parameter name-value pairs in lexicographical order.
* @return Canonical form of query string.
*/
private static String canonicalize(final SortedMap<String, String> sortedParamMap)
{
if (sortedParamMap == null || sortedParamMap.isEmpty())
{
return "";
}
final StringBuffer sb = new StringBuffer(350);
final Iterator<Map.Entry<String, String>> iter = sortedParamMap.entrySet().iterator();
while (iter.hasNext())
{
final Map.Entry<String, String> pair = iter.next();
sb.append(percentEncodeRfc3986(pair.getKey()));
sb.append('=');
sb.append(percentEncodeRfc3986(pair.getValue()));
if (iter.hasNext())
{
sb.append('&');
}
}
return sb.toString();
}
/**
* Percent-encode values according the RFC 3986. The built-in Java URLEncoder does not encode
* according to the RFC, so we make the extra replacements.
*
* @param string Decoded string.
* @return Encoded string per RFC 3986.
*/
private static String percentEncodeRfc3986(final String string)
{
try
{
return URLEncoder.encode(string, "UTF-8").replace("+", "%20").replace("*", "%2A").replace("%7E", "~");
}
catch (UnsupportedEncodingException e)
{
return string;
}
}
}

Because you also want to identify URLs which refer to the same content, I found this paper from the WWW2007 pretty interesting: Do Not Crawl in the DUST: Different URLs with Similar Text. It provides you with a nice theoretical approach.

No, there is nothing in the standard libraries to do this. Canonicalization includes things like decoding unnecessarily encoded characters, converting hostnames to lowercase, etc.
e.g. http://ACME.com/./foo%26bar becomes:
http://acme.com/foo&bar
URI's normalize() does not do this.
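The hostname part of that gap is easy to close by hand. The following is a rough sketch, not a complete canonicalizer: HostCaseDemo and lowerCaseSchemeAndHost are hypothetical names, and the method deliberately leaves percent-escapes untouched, rebuilding the URL from its raw components so that nothing is decoded or re-encoded along the way.

```java
import java.net.URI;
import java.util.Locale;

class HostCaseDemo {
    // Normalizes the path with URI.normalize() and lower-cases scheme and host.
    // URLs without a scheme or a host are returned with only the path normalized.
    static String lowerCaseSchemeAndHost(String url) {
        URI u = URI.create(url).normalize();
        if (u.getScheme() == null || u.getHost() == null) {
            return u.toString();
        }
        StringBuilder sb = new StringBuilder();
        sb.append(u.getScheme().toLowerCase(Locale.ROOT)).append("://");
        if (u.getRawUserInfo() != null) {
            sb.append(u.getRawUserInfo()).append('@');
        }
        sb.append(u.getHost().toLowerCase(Locale.ROOT));
        if (u.getPort() != -1) {
            sb.append(':').append(u.getPort());
        }
        sb.append(u.getRawPath());
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseSchemeAndHost("http://ACME.com/./foo%26bar"));
    }
}
```

Decoding escapes such as %26 is left out on purpose, because whether a given escape is safe to decode depends on the character and on where in the URL it appears.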

The RL library:
https://github.com/backchatio/rl
goes quite a ways beyond java.net.URI.normalize().
It's in Scala, but I imagine it should be useable from Java.

You can do this with the Restlet framework using Reference.normalize(). You should also be able to remove the elements you don't need quite conveniently with this class.

In Java, normalize parts of a URL
Example of a URL: https://i0.wp.com:55/lplresearch.com/wp-content/feb.png?ssl=1&myvar=2#myfragment
protocol: https
domain name: i0.wp.com
subdomain: i0
port: 55
path: /lplresearch.com/wp-content/feb.png
query: ?ssl=1&myvar=2
parameters: ssl=1, myvar=2
fragment: #myfragment
Code to do the URL parsing:
import java.util.*;
import java.util.regex.*;
public class regex {
// Each method returns null when the URL does not contain the requested part.
// group(1) may only be read after matches() has succeeded.
public static String getProtocol(String the_url){
Pattern p = Pattern.compile("^(http|https|smtp|ftp|file|pop)://.*");
Matcher m = p.matcher(the_url);
return m.matches() ? m.group(1) : null;
}
public static String getParameters(String the_url){
Pattern p = Pattern.compile(".*(\\?[-a-zA-Z0-9_.@!$&''()*+,;=]+)(#.*)*$");
Matcher m = p.matcher(the_url);
return m.matches() ? m.group(1) : null;
}
public static String getFragment(String the_url){
Pattern p = Pattern.compile(".*(#.*)$");
Matcher m = p.matcher(the_url);
return m.matches() ? m.group(1) : null;
}
public static void main(String[] args){
String the_url =
"https://i0.wp.com:55/lplresearch.com/" +
"wp-content/feb.png?ssl=1&myvar=2#myfragment";
System.out.println(getProtocol(the_url));
System.out.println(getFragment(the_url));
System.out.println(getParameters(the_url));
}
}
Prints
https
#myfragment
?ssl=1&myvar=2
You can then push and pull on the parts of the URL until they pass muster.

I have a simple way to solve it: force every URL onto the http scheme. Here is my code:
public static String normalizeURL(String oldLink)
{
int pos=oldLink.indexOf("://");
if (pos < 0)
{
return oldLink; // no scheme separator, leave the link unchanged
}
return "http"+oldLink.substring(pos);
}
