Sort Array List lexicographically ignoring integers - java

I have this code. I want to order a list of strings. Every item in the list consists of a three word sentence. I want to ignore the first word and sort the sentence lexicographically with the 2nd and 3rd words. If the 2nd or 3rd words contain an integer, I want to ignore sorting them but add them to the end of the list.
For example: (19th apple orange, 17th admin 7th, 19th apple table) should be sorted in the list as (19th apple orange, 19th apple table, 17th admin 7th)
So far my code only ignores the first word and sorts the rest of each sentence lexicographically:
public static List<String> sortOrders(List<String> orderList) {
// Write your code here
Collections.sort( orderList,
(a, b) -> a.split(" *", 2)[1].compareTo( b.split(" *", 2)[1] )
);
return orderList;
}

In your compare method check for numbers first and then strings. You just have to add code to the steps you described:
Here's a pseudo code of what you described
...
(a,b) -> {
// Every item in the list consists of a three word sentence.
var awords = a.split(" ")
var bwords = b.split(" ")
// I want to ignore the first word
var as = awords[1] + " " + awords[2]
var bs ...
// and sort the sentence lexicographically with the 2nd and 3rd words.
var r = as.compareTo(bs)
// If the 2nd or 3rd words contain an integer, I want to ignore sorting them but add them to the end of the list
if ( as.matches(".*\\d.*") ) {
return 1
} else {
return r
}
}
...
It's not clear what to do if both have numbers, e.g. a 1 a vs a 1 b, but that's something you have to clarify.
So basically you just have to divide your problem into its individual statements and add some code that solves each one (like the example above).
You might notice there are some gaps (like what to do if both of them contain numbers). Once you have a working solution you can clean it up.
Another alternative with a similar idea
var as = a.substring(a.indexOf(" ") + 1) // "a b c" -> "b c"
var bs = b.substring(b.indexOf(" ") + 1) // "a b c" -> "b c"
return as.matches(".*\\d.*") ? 1 : as.compareTo(bs);
Remember that the compare(T, T) method returns a value < 0 if a is "lower" than b. So if a has numbers, it is always "higher" and the method should return 1; if b has numbers, then a is "lower" and it should return -1; otherwise just compare the strings.
Here's the full program:
import java.util.*;
public class Sl {
public static void main(String ... args ) {
var list = Arrays.asList("19th apple orange", "17th admin 7th", "19th apple table");
Collections.sort(list, (a, b) -> {
// just use the last two words
var as = a.substring(a.indexOf(" "));
var bs = b.substring(b.indexOf(" "));
// if a has a number, will always be higher
return as.matches(".*\\d+.*") ? 1
// if b has a number, a will always be lower
: bs.matches(".*\\d+.*") ? -1
// if none of the above, compare lexicographically the strings
: as.compareTo(bs);
});
System.out.println(list);
}
}

If you aren't careful, you will get an error such as Exception in thread "main" java.lang.IllegalArgumentException: Comparison method violates its general contract!
To prevent that, create a comparator that splits off the first word and checks the rest (the second and third words) for digits. If the first string's remainder contains a digit and the second's does not, the comparator returns 1, so the first string is considered greater and moves toward the bottom of the list. Otherwise, if the second string's remainder contains a digit, it returns -1, so the first string is considered smaller and the second one goes toward the bottom. If neither contains a digit, the two remainders are compared lexicographically.
public static List<String> sortOrders(List<String> orderList) {
Comparator<String> comp = (a, b) -> {
String[] aa = a.split("\\s+", 2);
String[] bb = b.split("\\s+", 2);
boolean aam = aa[1].matches(".*[0-9]+.*");
boolean bbm = bb[1].matches(".*[0-9]+.*");
return aam && !bbm ? 1 : bbm ? -1 :
aa[1].compareTo(bb[1]);
};
return orderList.stream().sorted(comp).toList();
}
If you want to preserve your original data, use the above. If you want to sort in place, then apply the Comparator defined above and use Collections.sort(data, comp).
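One case worth checking in any comparator like this is when both strings contain digits: compare(a, b) and compare(b, a) must not both report the same direction, and the comparator above returns -1 either way in that situation. Below is a minimal sketch of one way to keep the results consistent. It assumes that two digit-containing entries should be treated as equal, so the stable sort leaves them in their input order (the question does not specify this). The names OrderSortSketch, rest and hasDigit are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OrderSortSketch {

    // Everything after the first word, e.g. "19th apple orange" -> "apple orange".
    static String rest(String s) {
        String[] parts = s.trim().split("\\s+", 2);
        return parts.length > 1 ? parts[1] : "";
    }

    // True if the text contains at least one digit.
    static boolean hasDigit(String text) {
        return text.matches(".*\\d.*");
    }

    static final Comparator<String> ORDER = (a, b) -> {
        String ra = rest(a);
        String rb = rest(b);
        boolean da = hasDigit(ra);
        boolean db = hasDigit(rb);
        if (da != db) {
            return da ? 1 : -1;   // exactly one side has a digit: that side goes last
        }
        if (da) {
            return 0;             // assumption: both have digits, keep their input order
        }
        return ra.compareTo(rb);  // neither has a digit: plain lexicographic order
    };

    // Returns a sorted copy; List.sort is stable, so ties keep their relative order.
    static List<String> sortOrders(List<String> orders) {
        List<String> copy = new ArrayList<>(orders);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortOrders(Arrays.asList(
                "19th apple orange", "17th admin 7th", "19th apple table")));
    }
}
```

If digit-containing entries should instead be ordered among themselves, the `return 0` branch is the only place that needs to change.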
I have tested this extensively using the following data generation code which generated random strings meeting your requirements. I suggest you test any answers you get (including this one) to ensure it satisfies your requirements.
String letters = "abcdefghijklmnopqrstuvwxyz";
Random r = new Random(123);
List<String> data = r.ints(200000, 1, 100).mapToObj(i -> {
StringBuilder sb = new StringBuilder();
boolean first = r.nextBoolean();
boolean second = r.nextBoolean();
int ltr = r.nextInt(letters.length());
String fstr = letters.substring(ltr,ltr+1);
ltr = r.nextInt(letters.length());
String sstr = letters.substring(ltr,ltr+1);
sb.append(fstr).append(first ? ltr : "").append(" ");
sb.append(fstr);
if (first) {
sb.append(r.nextInt(100));
}
sb.append(" ").append(sstr);
if (!first && second) {
sb.append(r.nextInt(100));
}
return sb.toString();
}).collect(Collectors.toCollection(ArrayList::new));

Related

Count occurrences in 2D Array

I'm trying to count the occurrences per line from a text file containing a large amount of codes (numbers).
Example of text file content:
9045,9107,2376,9017
2387,4405,4499,7120
9107,2376,3559,3488
9045,4405,3559,4499
I want to compare a similar set of numbers that I get from a text field, for example:
9107,4405,2387,4499
The only result I'm looking for, is if it contains more than 2 numbers (per line) from the text file. So in this case it will be true, because:
9045,9107,2376,9017 - false (1)
2387,4405,4499,7120 - true (3)
9107,2387,3559,3488 - false (2)
9045,4425,3559,4490 - false (0)
From what I understand, the best way to do this, is by using a 2d-array, and I've managed to get the file imported successfully:
Scanner in = null;
try {
in = new Scanner(new File("areas.txt"));
} catch (FileNotFoundException ex) {
Logger.getLogger(NewJFrame.class.getName()).log(Level.SEVERE, null, ex);
}
List < String[] > lines = new ArrayList < > ();
while ( in .hasNextLine()) {
String line = in .nextLine().trim();
String[] splitted = line.split(", ");
lines.add(splitted);
}
String[][] result = new String[lines.size()][];
for (int i = 0; i < result.length; i++) {
result[i] = lines.get(i);
}
System.out.println(Arrays.deepToString(result));
The result I get:
[[9045,9107,2376,9017], [2387,4405,4499,7120], [9107,2376,3559,3488], [9045,4405,3559,4499], [], []]
From here I'm a bit stuck on checking the codes individually per line. Any suggestions or advice? Is the 2d-array the best way of doing this, or is there maybe an easier or better way of doing it?
The expected number of inputs defines the type of searching algorithm you should use.
If you aren't searching through thousands of lines then a simple algorithm will do just fine. When in doubt favour simplicity over complex and hard to understand algorithms.
While it is not an efficient algorithm, in most cases a simple nested for-loop will do the trick.
A simple implementation would look like this:
final int FOUND_THRESHOLD = 2;
String[] comparedCodes = {"9107", "4405", "2387", "4499"};
String[][] allInputs = {
{"9045", "9107", "2376", "9017"}, // This should not match
{"2387", "4405", "4499", "7120"}, // This should match
{"9107", "2376", "3559", "3488"}, // This should not match
{"9045", "4405", "3559", "4499"}, // This should match
};
List<String[] > results = new ArrayList<>();
for (String[] input: allInputs) {
int numFound = 0;
// Compare the codes
for (String code: input) {
for (String c: comparedCodes) {
if (code.equals(c)) {
numFound++;
break; // Breaking out here prevents unnecessary work
}
}
if (numFound >= FOUND_THRESHOLD) {
results.add(input);
break; // Breaking out here prevents unnecessary work
}
}
}
for (String[] result: results) {
System.out.println(Arrays.toString(result));
}
which provides us with the output:
[2387, 4405, 4499, 7120]
[9045, 4405, 3559, 4499]
To expand on my comment, here's a rough outline of what you could do:
String textFieldContents = ... //get it
//build a set of the user input by splitting at commas
//a stream is used to be able to trim the elements before collecting them into a set
Set<String> userInput = Arrays.stream(textFieldContents .split(","))
.map(String::trim).collect(Collectors.toSet());
//stream the lines in the file
List<Boolean> matchResults = Files.lines(Path.of("areas.txt"))
//map each line to true/false
.map(line -> {
//split the line and stream the parts
return Arrays.stream(line.split(","))
//trim each part
.map(String::trim)
//select only those contained in the user input set
.filter(part -> userInput.contains(part))
//count matching elements and return whether there are more than 2 or not
.count() > 2L;
})
//collect the results into a list, each element position should correspond to the zero-based line number
.collect(Collectors.toList());
If you need to collect the matching lines instead of a flag per line you could replace map() with filter() (same content) and change the result type to List<String>.
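The filter() variant mentioned above could look roughly like the following sketch. To keep it self-contained it works on an in-memory list of lines instead of areas.txt, and the names MatchingLinesSketch, toCodes and matchingLines are invented for the example. Because each line is turned into a set, a code repeated within one line is counted once, which is an assumption about the data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class MatchingLinesSketch {

    // Splits a comma-separated line into a set of trimmed, non-empty codes.
    static Set<String> toCodes(String line) {
        return Arrays.stream(line.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    // Keeps only the lines that share more than `threshold` codes with the user input.
    static List<String> matchingLines(List<String> lines, String userInput, int threshold) {
        Set<String> wanted = toCodes(userInput);
        return lines.stream()
                .filter(line -> toCodes(line).stream().filter(wanted::contains).count() > threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "9045,9107,2376,9017",
                "2387,4405,4499,7120",
                "9107,2376,3559,3488",
                "9045,4405,3559,4499");
        // "More than 2" matches, as stated in the question.
        System.out.println(matchingLines(lines, "9107,4405,2387,4499", 2));
    }
}
```

With a real file, the in-memory list would be replaced by Files.lines(...) opened in a try-with-resources block so the stream is closed.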

arranging strings in ascending and descending order

Alright so my code doesn't work : I'm trying to arrange inputted strings in both a "descending" and an "ascending" but sometimes strings just won't go in the lists (either in the right order or it doesn't go in the descending/ascending strings at all)
import java.util.Scanner;
public class Stringseries
{
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
System.out.println("Start the sequence by inputting a string DIFFERENT than 'quit'. When you DO want to end it, input 'quit'");
String encore = scanner.nextLine();
int loop = 0;
String smallest = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"; // we set a "smallest" string to know where to put the new string in the "descending" and "ascending" strings.
String longest = "";
String ascending = "";
String descending = "";
String lastInput = "";
while (!encore.equals("quit")) {
loop = ++loop;
encore = encore.replaceAll("\\s+",""); // this way, the length of the strings is only defined by the characters in the string, and not characters + whitespaces.
if (loop == 1) {
descending = encore;
ascending = encore;
} if (loop >= 2) {
if (encore.length() < smallest.length()) {
descending = descending + " " + encore;
ascending = encore + " " + ascending;
} if (encore.length() > longest.length()) {
descending = encore + " " + descending;
ascending = ascending + " " + encore;
}
}
if (longest.length() < encore.length()) {
longest = encore;
} if (smallest.length() > encore.length()) {
smallest = encore;
}
System.out.println("Enter the string you want to put in your sequence of strings");
lastInput = encore;
encore = scanner.nextLine();
}
if (descending != null && !descending.isEmpty()) { // we check to see if the "descending" string is empty (we could do this with "ascending" mind you).
System.out.println("Here are your strings in ascending order : " + ascending);
System.out.println("Here are your strings in descending order : " + descending);
System.out.println("Here is the longest string : " + longest);
} else if (descending == null | descending == "") {
System.out.println("You have not entered any strings, therefore the program doesn't display any string :("); // customised message.
}
} // end Method
} // end Class
I would take a different approach entirely. Yours is very homegrown, and Java has stuff built in that can do this, most notably here, the Stream API and Comparators
String quitString = "quit";
List<String> userInputList = new ArrayList<>();
try(Scanner scanner = new Scanner(System.in)){ // This is called a "try with resources"
System.out.println("Start the sequence by inputting a string DIFFERENT than 'quit'. When you DO want to end it, input \"" + quitString + "\"." + System.lineSeparator());
String encore = scanner.nextLine();
while(!encore.equalsIgnoreCase(quitString)){
encore = encore.replaceAll("\\s+", ""); // this way, the length of the strings is only defined by the characters in the string, and not characters + whitespaces.
System.out.println("Enter the string you want to put in your sequence of strings");
encore = scanner.nextLine();
if(encore != null && !encore.isEmpty() && !encore.equalsIgnoreCase(quitString)) {
userInputList.add(encore);
}
}
}
catch(Exception e)
{
e.printStackTrace();
}
List<String> ascending =
userInputList.stream()
.sorted((strA, strB) -> strA.length() - strB.length())
.collect(Collectors.toList());
List<String> descending =
userInputList.stream()
.sorted((strA, strB) -> strB.length() - strA.length())
.collect(Collectors.toList());
StringBuilder sbAscending = new StringBuilder();
sbAscending.append("Here are your strings in ascending order: ");
ascending.forEach(userInput -> {
sbAscending.append(System.lineSeparator() + userInput);
});
System.out.println(sbAscending.toString());
StringBuilder sbDescending = new StringBuilder();
sbDescending.append("Here are your strings in descending order: ");
descending.forEach(userInput -> {
sbDescending.append(System.lineSeparator() + userInput);
});
System.out.println(sbDescending.toString());
Output:
Start the sequence by inputting a string DIFFERENT than 'quit'. When you DO want to end it, input "quit".
Start
Enter the string you want to put in your sequence of strings
test
Enter the string you want to put in your sequence of strings
test2
Enter the string you want to put in your sequence of strings
test23
Enter the string you want to put in your sequence of strings
test234
Enter the string you want to put in your sequence of strings
quit
Here are your strings in ascending order:
test
test2
test23
test234
Here are your strings in descending order:
test234
test23
test2
test
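As a side note, the two length comparisons can also be expressed with Comparator.comparingInt and reversed(), which avoids writing the subtraction by hand. A small sketch, with the class and method names (LengthOrderSketch, byLength) chosen just for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LengthOrderSketch {

    // Returns a copy of the input ordered by string length.
    static List<String> byLength(List<String> input, boolean ascending) {
        Comparator<String> cmp = Comparator.comparingInt(String::length);
        List<String> copy = new ArrayList<>(input);
        copy.sort(ascending ? cmp : cmp.reversed());
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("test23", "test", "test234", "test2");
        System.out.println(byLength(input, true));   // shortest first
        System.out.println(byLength(input, false));  // longest first
    }
}
```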
This assumes you want to do this yourself, since it seems to be a practice assignment. Otherwise use j.seashell's answer.
Your current code can only input values into the end of the lists. This means that if you input
Test
Second Test
Third Test
The result after the first two inputs will be
ascending = "Test SecondTest"
descending = "SecondTest Test"
Your next value is supposed to go between those two, so the correct result becomes
ascending = "Test ThirdTest SecondTest"
descending = "SecondTest ThirdTest Test"
but your code may only append to the strings right now.
You also filter out strings that are neither the shortest nor the longest string entered so far. To solve this you have to implement some way to split the lists and insert the new value in the middle of the split values. This can be done in several ways, for instance:
Using a list structure on the form List<String> ascending;
Splitting it each loop with ascending.split(" ");
Using insertion with substrings as in Insert a character in a string at a certain position
The simplest way would be to use Java's built-in List structure, i.e.
List<String> ascending = new ArrayList<>();
A possible solution to inserting the string in the correct position may then be
boolean inserted = false;
//We loop to the correct location and add it
for(int i = 0; i < ascending.size(); i++) {
if(ascending.get(i).length() > encore.length()) {
ascending.add(i, encore);
inserted = true;
break;
}
}
//If it wasn't inserted its the longest one yet, so add it at the end
if(!inserted) {
ascending.add(encore);
}
You can use the same loop with the comparison switched to < to get a descending list.
At the end you can print the values with
for(String value : ascending) {
System.out.println(value);
}
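Putting the pieces of this answer together, a sketch of the insertion for both directions might look like this. The helper name insertByLength and the boolean flag are made up for the example; the inputs are the whitespace-stripped strings from the walkthrough above.

```java
import java.util.ArrayList;
import java.util.List;

class SortedInsertSketch {

    // Inserts value so that the list stays ordered by length (ascending or descending).
    static void insertByLength(List<String> list, String value, boolean ascending) {
        for (int i = 0; i < list.size(); i++) {
            int existing = list.get(i).length();
            boolean goesBefore = ascending ? existing > value.length()
                                           : existing < value.length();
            if (goesBefore) {
                list.add(i, value);
                return;
            }
        }
        list.add(value); // no earlier slot found, so it belongs at the end
    }

    public static void main(String[] args) {
        List<String> ascending = new ArrayList<>();
        List<String> descending = new ArrayList<>();
        for (String s : new String[] {"Test", "SecondTest", "ThirdTest"}) {
            insertByLength(ascending, s, true);
            insertByLength(descending, s, false);
        }
        System.out.println(ascending);
        System.out.println(descending);
    }
}
```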
/*
Hello Mister Dracose.
Perhaps you should use something a bit more appropriate for this goal.
In fact, your current code cannot manage more than 2 strings at a time, so you would be better off using
*/
List<String> supplierNames1 = new ArrayList<String>();
/*
a Java collection to save all the user inputs before you go any further.
After that, you could use your ordering algorithm exactly the same way you are already doing.
Hope this helps.
*/
Use a linked list. Every time you add a word, walk down your list one item at a time and insert your new node at position n, where (n-1).length >= n.length > (n+1).length.
To read it backwards, you can either implement this as a doubly linked list, or read your singly linked list into a stack and pop the items off the stack.

Java custom Sort by 2 parts of same string

I have seen other questions like this, but couldn't adapt any of the information to my code. Either because it wasn't specific to my issue or I couldn't get my head around the answer. So, I am hoping to ask "how" with my specific code. Tell me if more is needed.
I have various files (all jpg's) with names with the format "20140214-ddEventBlahBlah02.jpg" and "20150302-ddPsBlagBlag2".
I have a custom comparator in use that sorts things in a Windows OS fashion... i.e. 02,2,003,4,4b,4c,10, etc. Instead of the computer way of sorting, which was screwed up. Everything is good, except I now want to sort these strings using 2 criteria in the strings.
1) The date (in the beginning). i.e. 20150302
2) The rest of the filename after the "-" i.e. ddPsBlagBlag2
I am currently using the comparator for a project that displays these files in reverse order. They are displaying according to what was added most recently. i.e. 20150302 is displaying before 20140214. Which is good. But I would like the files, after being sorted by date in reverse order, to display by name in normal Windows OS ascending order (not in reverse).
Code:
Collections.sort(file, new Comparator<File>()
{
private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();
@Override
public int compare(File o1, File o2)
{
return NATURAL_SORT.compare(o1.getName(), o2.getName());
}
});
Collections.reverse(file);
The code above takes the ArrayList of file names and sends it to the custom WindowsExplorerComparator class. After being sorted, Collections.reverse() is called on the ArrayList.
Code:
class WindowsExplorerComparator implements Comparator<String>
{
private static final Pattern splitPattern = Pattern.compile("\\d\\.|\\s");
@Override
public int compare(String str1, String str2) {
Iterator<String> i1 = splitStringPreserveDelimiter(str1).iterator();
Iterator<String> i2 = splitStringPreserveDelimiter(str2).iterator();
while (true)
{
//Til here all is equal.
if (!i1.hasNext() && !i2.hasNext())
{
return 0;
}
//first has no more parts -> comes first
if (!i1.hasNext() && i2.hasNext())
{
return -1;
}
//first has more parts than i2 -> comes after
if (i1.hasNext() && !i2.hasNext())
{
return 1;
}
String data1 = i1.next();
String data2 = i2.next();
int result;
try
{
//If both datas are numbers, then compare numbers
result = Long.compare(Long.valueOf(data1), Long.valueOf(data2));
//If numbers are equal than longer comes first
if (result == 0)
{
result = -Integer.compare(data1.length(), data2.length());
}
}
catch (NumberFormatException ex)
{
//compare text case insensitive
result = data1.compareToIgnoreCase(data2);
}
if (result != 0) {
return result;
}
}
}
private List<String> splitStringPreserveDelimiter(String str) {
Matcher matcher = splitPattern.matcher(str);
List<String> list = new ArrayList<String>();
int pos = 0;
while (matcher.find()) {
list.add(str.substring(pos, matcher.start()));
list.add(matcher.group());
pos = matcher.end();
}
list.add(str.substring(pos));
return list;
}
}
The code above is the custom WindowsExplorerComparator class being used to sort the ArrayList.
So, an example of what I would like the ArrayList to look like after being sorted (and date sort reversed) is:
20150424-ssEventBlagV002.jpg
20150323-ssEventBlagV2.jpg
20150323-ssEventBlagV3.jpg
20150323-ssEventBlagV10.jpg
20141201-ssEventZoolander.jpg
20141102-ssEventApple1.jpg
As you can see, first sorted by date (and reversed), then sorted in ascending order by the rest of the string name.
Is this possible? Please tell me it's an easy fix.
You're close. Whenever something is not working, debug your program and make sure that methods are returning what you would expect. When I ran your program, the first thing I noticed was that EVERY compare iteration which attempted to convert a string to Long threw a NumberFormatException. This was a big red flag, so I threw in some printlns to check what the values of data1 and data2 were.
Here's my output:
Compare: 20150323-ssEventBlagV 20150424-ssEventBlagV00
Compare: 20150323-ssEventBlagV 20150323-ssEventBlagV
Compare: 3. 2.
Compare: 20150323-ssEventBlagV 20150424-ssEventBlagV00
Compare: 20150323-ssEventBlagV 20150323-ssEventBlagV
Compare: 3. 2.
Compare: 20150323-ssEventBlagV1 20150323-ssEventBlagV
Compare: 20150323-ssEventBlagV1 20150424-ssEventBlagV00
Compare: 20141201-ssEventZoolander.jpg 20150323-ssEventBlagV1
Compare: 20141201-ssEventZoolander.jpg 20150323-ssEventBlagV
Compare: 20141201-ssEventZoolander.jpg 20150323-ssEventBlagV
The big thing to notice here is that it's trying to convert 3. and 2. to long values, which of course won't work.
The simplest solution with your code is to change your regular expression. In the future you might go for the simpler route of iterating over the string instead of using regex; I feel as though regex complicates this problem more than it helps.
New regex: \\d+(?=\\.)|\\s
Changes:
\\d -> \\d+ - Capture all digits before the period, not just the last one
\\. -> (?=\\.) - put the period in a lookahead so it must be present but is not consumed, and your method doesn't append it to the digits
New debug output:
Compare: 20150323-ssEventBlagV 20150424-ssEventBlagV
Compare: 20150323-ssEventBlagV 20150323-ssEventBlagV
Compare: 3 2
Compare: 20150323-ssEventBlagV 20150323-ssEventBlagV
Compare: 10 3
Compare: 20141201-ssEventZoolander.jpg 20150323-ssEventBlagV
As you can see the numbers at the end are actually getting parsed correctly.
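To see the effect of the regex change in isolation, the splitting logic can be run on a single file name with both patterns. This is only a sketch: the split method mirrors splitStringPreserveDelimiter from the question, but takes the pattern as a parameter, and the class name SplitPatternSketch is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitPatternSketch {

    // Splits str around every match of the pattern and keeps the matches as separate parts.
    static List<String> split(Pattern pattern, String str) {
        Matcher matcher = pattern.matcher(str);
        List<String> parts = new ArrayList<>();
        int pos = 0;
        while (matcher.find()) {
            parts.add(str.substring(pos, matcher.start()));
            parts.add(matcher.group());
            pos = matcher.end();
        }
        parts.add(str.substring(pos));
        return parts;
    }

    public static void main(String[] args) {
        String name = "20150323-ssEventBlagV10.jpg";
        // Original pattern: one digit plus the period end up in the same part.
        System.out.println(split(Pattern.compile("\\d\\.|\\s"), name));
        // New pattern: the whole digit run is its own part, the period stays with the extension.
        System.out.println(split(Pattern.compile("\\d+(?=\\.)|\\s"), name));
    }
}
```

With the original pattern the numeric part comes out as "0." (which Long.valueOf rejects), while the new pattern yields "10" on its own.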
One more minor thing:
Your result for digit comparison is backwards
result = Long.compare(Long.valueOf(data1), Long.valueOf(data2));
should be either:
result = -Long.compare(Long.valueOf(data1), Long.valueOf(data2));
or
result = Long.compare(Long.valueOf(data2), Long.valueOf(data1));
because it's sorting them backwards.
There are a few things you should do:
First, you need to fix your split expression as @ug_ stated. However, I think splitting on numbers is more appropriate.
private static final Pattern splitPattern = Pattern.compile("\\d+");
which, for 20150323-ssEventBlagV2.jpg will result in
[, 20150323, -ssEventBlagV, 2, .jpg]
Second, perform a date comparison separate from your Long comparison. Using SimpleDateFormat will make sure you are only comparing numbers that are formatted as dates.
try {
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
result = sdf.parse(data2).compareTo(sdf.parse(data1));
if (result != 0) {
return result;
}
} catch (final ParseException e) {
/* continue */
}
Last, swap the order of your Long compare
Long.compare(Long.valueOf(data2), Long.valueOf(data1));
And you should be good to go. Full code below.
private static final Pattern splitPattern = Pattern.compile("\\d+");
@Override
public int compare(String str1, String str2) {
Iterator<String> i1 = splitStringPreserveDelimiter(str1).iterator();
Iterator<String> i2 = splitStringPreserveDelimiter(str2).iterator();
while (true) {
// Til here all is equal.
if (!i1.hasNext() && !i2.hasNext()) {
return 0;
}
// first has no more parts -> comes first
if (!i1.hasNext() && i2.hasNext()) {
return -1;
}
// first has more parts than i2 -> comes after
if (i1.hasNext() && !i2.hasNext()) {
return 1;
}
String data1 = i1.next();
String data2 = i2.next();
int result;
try {
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
result = sdf.parse(data1).compareTo(sdf.parse(data2));
if (result != 0) {
return result;
}
} catch (final ParseException e) {
/* continue */
}
try {
// If both datas are numbers, then compare numbers
result = Long.compare(Long.valueOf(data2),
Long.valueOf(data1));
// If numbers are equal than longer comes first
if (result == 0) {
result = -Integer.compare(data1.length(),
data2.length());
}
} catch (NumberFormatException ex) {
// compare text case insensitive
result = data1.compareToIgnoreCase(data2);
}
if (result != 0) {
return result;
}
}
}
You will need to edit your WindowsExplorerComparator class so that it performs this sorting. Given two file names as Strings, you need to determine what order they go in using the following high-level algorithm.
Are they the same? If yes, return 0.
Split each file name into two strings, the date portion and the name portion.
Convert the date portions to dates using the Java date/time API and then compare the dates.
If the dates are the same, compare the two name portions using your current compare code and return the result from that.
This is a bit complicated and sort of confusing, but you will have to do it in one comparator and put in all of your custom logic.
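The steps above can be sketched as follows. This is an illustration under several assumptions: every file name starts with a yyyyMMdd date followed by "-", the date is parsed with java.time's DateTimeFormatter.BASIC_ISO_DATE, and naturalCompare is a simplified, hypothetical stand-in for the question's WindowsExplorerComparator (digit runs compared by value, everything else case-insensitively). No Collections.reverse() call is needed afterwards because the date comparison is already reversed.

```java
import java.math.BigInteger;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DateThenNameSketch {

    // Date prefix before the first '-', parsed as yyyyMMdd.
    static LocalDate datePart(String fileName) {
        return LocalDate.parse(fileName.substring(0, fileName.indexOf('-')),
                DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Everything after the first '-'.
    static String namePart(String fileName) {
        return fileName.substring(fileName.indexOf('-') + 1);
    }

    // Simplified natural order: digit runs by numeric value, other characters ignoring case.
    static int naturalCompare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (Character.isDigit(ca) && Character.isDigit(cb)) {
                int si = i, sj = j;
                while (i < a.length() && Character.isDigit(a.charAt(i))) i++;
                while (j < b.length() && Character.isDigit(b.charAt(j))) j++;
                int cmp = new BigInteger(a.substring(si, i))
                        .compareTo(new BigInteger(b.substring(sj, j)));
                if (cmp != 0) return cmp;
            } else {
                int cmp = Character.compare(Character.toLowerCase(ca), Character.toLowerCase(cb));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // Whichever string still has characters left sorts after the other.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    // Newest date first; within one date, names ascending in natural order.
    static final Comparator<String> DATE_DESC_THEN_NAME = (a, b) -> {
        int byDate = datePart(b).compareTo(datePart(a)); // arguments swapped: descending
        return byDate != 0 ? byDate : naturalCompare(namePart(a), namePart(b));
    };

    static List<String> sortFiles(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(DATE_DESC_THEN_NAME);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "20141102-ssEventApple1.jpg", "20150323-ssEventBlagV10.jpg",
                "20150424-ssEventBlagV002.jpg", "20150323-ssEventBlagV3.jpg",
                "20141201-ssEventZoolander.jpg", "20150323-ssEventBlagV2.jpg");
        sortFiles(names).forEach(System.out::println);
    }
}
```

To sort File objects as in the question, the same comparator can be applied to file.getName().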

Why does my sorting loop seem to append an element where it shouldn't?

I am trying to sort an array of Strings using compareTo(). This is my code:
static String Array[] = {" Hello ", " This ", "is ", "Sorting ", "Example"};
String temp;
public static void main(String[] args)
{
for (int j=0; j<Array.length;j++)
{
for (int i=j+1 ; i<Array.length; i++)
{
if (Array[i].compareTo(Array[j])<0)
{
String temp = Array[j];
Array[j] = Array[i];
Array[i] = temp;
}
}
System.out.print(Array[j]);
}
}
Now the output is:
Hello This Example Sorting is
I am getting results, but not the results I want to get, which are:
Hello This Example Is Sorting
How can I adjust my code to sort the string array properly?
Your output is correct. Note the whitespace characters at the beginning of " Hello " and " This ".
Another issue is with your methodology. Use the Arrays.sort() method:
String[] strings = { " Hello ", " This ", "Is ", "Sorting ", "Example" };
Arrays.sort(strings);
Output:
Hello
This
Example
Is
Sorting
Here the third element of the array, "is", should be "Is"; otherwise it will come last after sorting, because the sort method compares strings by the Unicode values of their characters, and upper-case letters come before lower-case ones.
Apart from the alternative solutions that were posted here (which are correct), no one has actually answered your question by addressing what was wrong with your code.
It seems as though you were trying to implement a selection sort algorithm. I will not go into the details of how sorting works here, but I have included a few links for your reference =)
Your code was syntactically correct, but logically wrong. You were partially sorting your strings by only comparing each string with the strings that came after it. Here is a corrected version (I retained as much of your original code to illustrate what was "wrong" with it):
static String Array[]={" Hello " , " This " , "is ", "Sorting ", "Example"};
public static void main(String[] args)
{
//I reduced the upper bound from Array.length to (Array.length - 1)
for(int j=0; j < Array.length - 1;j++)
{
//Keeps track of the smallest string's index (a local variable, so it can be used inside the static main method)
int shortestStringIndex = j;
for (int i=j+1 ; i<Array.length; i++)
{
//We keep track of the index to the smallest string
if(Array[i].trim().compareTo(Array[shortestStringIndex].trim())<0)
{
shortestStringIndex = i;
}
}
//We only swap with the smallest string
if(shortestStringIndex != j)
{
String temp = Array[j];
Array[j] = Array[shortestStringIndex];
Array[shortestStringIndex] = temp;
}
}
}
Further Reading
The problem with this approach is that its asymptotic complexity is O(n^2). In simplified words, it gets very slow as the size of the array grows (approaches infinity). You may want to read about better ways to sort data, such as quicksort.
I know this is a late reply but maybe it can help someone.
Removing whitespace can be done by using the trim() function.
After that, if you want to sort the array in a case-sensitive manner, you can just use:
Arrays.sort(yourArray);
and in a case-insensitive manner:
Arrays.sort(yourArray,String.CASE_INSENSITIVE_ORDER);
Hope this helps!
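If the strings should keep their padding but be ordered as if they were trimmed and case did not matter, both ideas can be combined in one comparator. A minimal sketch (the class and method names are invented for the example):

```java
import java.util.Arrays;
import java.util.Comparator;

class TrimmedSortSketch {

    // Sorts a copy of the array by trimmed value, ignoring case; the elements themselves are unchanged.
    static String[] sortIgnoringCaseAndPadding(String[] input) {
        Comparator<String> cmp = Comparator.comparing(String::trim, String.CASE_INSENSITIVE_ORDER);
        String[] copy = input.clone();
        Arrays.sort(copy, cmp);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {" Hello ", " This ", "is ", "Sorting ", "Example"};
        for (String s : sortIgnoringCaseAndPadding(words)) {
            System.out.println("[" + s + "]"); // brackets make the padding visible
        }
    }
}
```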
Instead of this line
if(Array[i].compareTo(Array[j])<0)
use this line
if(Array[i].trim().compareTo(Array[j].trim())<0)
and you are good to go. The reason your current code is not working is explained by other users already. This above replacement is one workaround amongst several that you could apply.
Starting from Java 8, you can also use parallelSort which is useful if you have arrays containing a lot of elements.
Example:
public static void main(String[] args) {
String[] strings = { "x", "a", "c", "b", "y" };
Arrays.parallelSort(strings);
System.out.println(Arrays.toString(strings)); // [a, b, c, x, y]
}
If you want to ignore the case, you can use:
public static void main(String[] args) {
String[] strings = { "x", "a", "c", "B", "y" };
Arrays.parallelSort(strings, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.compareToIgnoreCase(o2);
}
});
System.out.println(Arrays.toString(strings)); // [a, B, c, x, y]
}
otherwise B will be before a.
If you want to ignore leading and trailing whitespace during the comparison, you can use trim():
public static void main(String[] args) {
String[] strings = { "x", " a", "c ", " b", "y" };
Arrays.parallelSort(strings, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.trim().compareTo(o2.trim());
}
});
System.out.println(Arrays.toString(strings)); // [ a,  b, c , x, y]
}
See:
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/arrays.html
Difference between Arrays.sort() and Arrays.parallelSort()
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/Arrays.java?av=f
" Hello " , " This " , "is ", "Sorting ", "Example"
First of all, you put spaces at the start of " Hello " and " This ". Spaces have a lower value than alphabetic characters in Unicode, so those two get printed first. (The rest of the strings were sorted alphabetically.)
Upper-case letters have a lower value than lower-case letters in Unicode, so "Example" and "Sorting " get printed next, and finally "is ", which has the highest value.
If you ignore case and also trim the surrounding whitespace before comparing:
if (Array[i].trim().compareToIgnoreCase(Array[j].trim()) < 0)
you will get (spacing aside):
Example Hello is Sorting This
which I think is the output you were looking for. With compareToIgnoreCase() alone, " Hello " and " This " would still come first because of their leading spaces.
To begin with, your problem is that you use the method `compareTo()`, which is case sensitive. That means that capital letters are sorted apart from lower-case ones. The reason is that the comparison is based on Unicode values, where the capital letters have lower numbers than the lower-case letters. Thus you should use `compareToIgnoreCase()`, as many have also mentioned in previous posts.
This is my full example of how you can do it effectively.
After you create a Comparator object you can pass it to this version of `sort()`, which is defined in java.util.Arrays:
static <T> void sort(T[] array, Comparator<? super T> comp)
Take a close look at super. It makes sure that the array passed in is compatible with the type of the comparator.
The nice part of this approach is that you can just as easily sort the array of strings in reverse order by swapping the arguments:
return strB.compareToIgnoreCase(strA);
import java.util.Comparator;
public class IgnoreCaseComp implements Comparator<String> {
@Override
public int compare(String strA, String strB) {
return strA.compareToIgnoreCase(strB);
}
}
import java.util.Arrays;
public class IgnoreCaseSort {
public static void main(String[] args) {
String strs[] = {" Hello ", " This ", "is ", "Sorting ", "Example"};
System.out.print("Initial order: ");
for (String s : strs) {
System.out.print(s + " ");
}
System.out.println("\n");
IgnoreCaseComp icc = new IgnoreCaseComp();
Arrays.sort(strs, icc);
System.out.print("Case-insesitive sorted order: ");
for (String s : strs) {
System.out.print(s + " ");
}
System.out.println("\n");
Arrays.sort(strs);
System.out.print("Default, case-sensitive sorted order: ");
for (String s : strs) {
System.out.print(s + " ");
}
System.out.println("\n");
}
}
run:
Initial order: Hello This is Sorting Example
Case-insesitive sorted order: Hello This Example is Sorting
Default, case-sensitive sorted order: Hello This Example Sorting is
BUILD SUCCESSFUL (total time: 0 seconds)
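As a side note, the JDK already ships a comparator equivalent to the hand-written one: String.CASE_INSENSITIVE_ORDER. The sketch below is only an illustration of that shortcut; the class and method names (BuiltInIgnoreCaseSort, sortIgnoringCase) are made up for this example, and the input strings are trimmed versions of the ones used above.

```java
import java.util.Arrays;
import java.util.Comparator;

class BuiltInIgnoreCaseSort {
    // Returns a sorted copy of the input; the original array is left untouched.
    static String[] sortIgnoringCase(String[] input, boolean descending) {
        // Built-in comparator that behaves like compareToIgnoreCase()
        Comparator<String> order = String.CASE_INSENSITIVE_ORDER;
        if (descending) {
            // reversed() flips the order, same effect as swapping the operands by hand
            order = order.reversed();
        }
        String[] copy = Arrays.copyOf(input, input.length);
        Arrays.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"Hello", "This", "is", "Sorting", "Example"};
        // prints [Example, Hello, is, Sorting, This]
        System.out.println(Arrays.toString(sortIgnoringCase(words, false)));
        // prints [This, Sorting, is, Hello, Example]
        System.out.println(Arrays.toString(sortIgnoringCase(words, true)));
    }
}
```

Writing your own Comparator class is still useful when you need extra rules, but for plain case-insensitive ordering the constant is probably enough.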
Alternative Choice
The method compareToIgnoreCase() works well in many cases (such as comparing strings in English), but it will not work well for all languages and locales. That makes it a poor choice when the text may not be English. To make sure the ordering is right everywhere, you should use compare() from java.text.Collator.
You can get a collator for your locale by calling the method getInstance(). After that, you should set the Collator's strength property, which is done by calling setStrength() with Collator.PRIMARY as the parameter. With this alternative, IgnoreCaseComp can be written as shown below. This version of the code orders strings by the rules of the default locale instead of by raw character values.
import java.text.Collator;
import java.util.Comparator;
// This comparator uses a Collator to determine
// the case-insensitive ordering
// of the two given strings
public class IgnoreCaseComp implements Comparator<String> {
Collator col;
IgnoreCaseComp() {
//default locale
col = Collator.getInstance();
//this will consider only PRIMARY difference ("a" vs "b")
col.setStrength(Collator.PRIMARY);
}
@Override
public int compare(String strA, String strB) {
return col.compare(strA, strB);
}
}
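Since Collator itself implements Comparator, it can also be handed to Arrays.sort() directly, without a wrapper class. The following is a minimal sketch of that idea. The class and method names (CollatorSortSketch, sortWithCollator) are hypothetical, and the locale is passed in explicitly as an assumption so that the result does not depend on the machine's default locale.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatorSortSketch {
    // Returns a copy of the input sorted by the collation rules of the given locale.
    static String[] sortWithCollator(String[] input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY strength ignores case (and accent) differences
        collator.setStrength(Collator.PRIMARY);
        String[] copy = Arrays.copyOf(input, input.length);
        // Collator implements Comparator<Object>, so it can be used here as is
        Arrays.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"is", "Hello", "Example", "This", "Sorting"};
        // prints [Example, Hello, is, Sorting, This]
        System.out.println(Arrays.toString(sortWithCollator(words, Locale.ENGLISH)));
    }
}
```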

Find difference between two Strings

Suppose I have two long strings. They are almost the same.
String a = "this is a example";
String b = "this is a examp";
The code above is just an example. The actual strings are quite long.
The problem is that one string has 2 more characters than the other.
How can I check which two characters those are?
You can use StringUtils.difference(String first, String second) from Apache Commons Lang.
This is how they implemented it:
public static String difference(String str1, String str2) {
if (str1 == null) {
return str2;
}
if (str2 == null) {
return str1;
}
int at = indexOfDifference(str1, str2);
if (at == INDEX_NOT_FOUND) {
return EMPTY;
}
return str2.substring(at);
}
public static int indexOfDifference(CharSequence cs1, CharSequence cs2) {
if (cs1 == cs2) {
return INDEX_NOT_FOUND;
}
if (cs1 == null || cs2 == null) {
return 0;
}
int i;
for (i = 0; i < cs1.length() && i < cs2.length(); ++i) {
if (cs1.charAt(i) != cs2.charAt(i)) {
break;
}
}
if (i < cs2.length() || i < cs1.length()) {
return i;
}
return INDEX_NOT_FOUND;
}
To find the difference between 2 Strings you can use the StringUtils class and its difference method. It compares the two Strings and returns the remainder of the second String, starting from where it differs from the first.
StringUtils.difference(null, null) = null
StringUtils.difference("", "") = ""
StringUtils.difference("", "abc") = "abc"
StringUtils.difference("abc", "") = ""
StringUtils.difference("abc", "abc") = ""
StringUtils.difference("ab", "abxyz") = "xyz"
StringUtils.difference("abcde", "abxyz") = "xyz"
StringUtils.difference("abcde", "xyz") = "xyz"
Without iterating through the strings you can only know that they are different, not where - and that only if they are of different length. If you really need to know what the different characters are, you must step through both strings in tandem and compare characters at the corresponding places.
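Stepping through both strings in tandem can be sketched with plain JDK calls as follows. This is only an illustration: the names TandemCompare and firstDifference are invented for this example, and the sketch reports just the first position where the strings diverge.

```java
class TandemCompare {
    // Returns the index of the first position where the strings differ,
    // or -1 if they are equal.
    static int firstDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // No mismatch in the common part: either equal, or one is a prefix of the other
        return a.length() == b.length() ? -1 : limit;
    }

    public static void main(String[] args) {
        String a = "this is a example";
        String b = "this is a examp";
        int at = firstDifference(a, b);
        System.out.println(at);              // prints 15
        System.out.println(a.substring(at)); // prints le
    }
}
```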
The following Java snippet efficiently computes a minimal set of characters that have to be removed from (or added to) the respective strings in order to make the strings equal. It's an example of dynamic programming.
import java.util.HashMap;
import java.util.Map;
public class StringUtils {
/**
* Examples
*/
public static void main(String[] args) {
System.out.println(diff("this is a example", "this is a examp")); // prints (le,)
System.out.println(diff("Honda", "Hyundai")); // prints (o,yui)
System.out.println(diff("Toyota", "Coyote")); // prints (Ta,Ce)
System.out.println(diff("Flomax", "Volmax")); // prints (Fo,Vo)
}
/**
* Returns a minimal set of characters that have to be removed from (or added to) the respective
* strings to make the strings equal.
*/
public static Pair<String> diff(String a, String b) {
return diffHelper(a, b, new HashMap<>());
}
/**
* Recursively compute a minimal set of characters while remembering already computed substrings.
* Runs in O(n^2).
*/
private static Pair<String> diffHelper(String a, String b, Map<Long, Pair<String>> lookup) {
long key = ((long) a.length()) << 32 | b.length();
if (!lookup.containsKey(key)) {
Pair<String> value;
if (a.isEmpty() || b.isEmpty()) {
value = new Pair<>(a, b);
} else if (a.charAt(0) == b.charAt(0)) {
value = diffHelper(a.substring(1), b.substring(1), lookup);
} else {
Pair<String> aa = diffHelper(a.substring(1), b, lookup);
Pair<String> bb = diffHelper(a, b.substring(1), lookup);
if (aa.first.length() + aa.second.length() < bb.first.length() + bb.second.length()) {
value = new Pair<>(a.charAt(0) + aa.first, aa.second);
} else {
value = new Pair<>(bb.first, b.charAt(0) + bb.second);
}
}
lookup.put(key, value);
}
return lookup.get(key);
}
public static class Pair<T> {
public Pair(T first, T second) {
this.first = first;
this.second = second;
}
public final T first, second;
public String toString() {
return "(" + first + "," + second + ")";
}
}
}
To directly get only the changed section, and not just the end, you can use Google's Diff Match Patch.
List<Diff> diffs = new DiffMatchPatch().diffMain("stringend", "stringdiffend");
for (Diff diff : diffs) {
if (diff.operation == Operation.INSERT) {
return diff.text; // Return only single diff, can also find multiple based on use case
}
}
For Android, add: implementation 'org.bitbucket.cowwoc:diff-match-patch:1.2'
This package is far more powerful than just this feature, it is mainly used for creating diff related tools.
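// Returns the tail of the longer string, assuming the shorter string is a prefix of it.
String strDiffChop(String s1, String s2) {
if (s1.length() > s2.length()) {
return s1.substring(s2.length());
} else if (s2.length() > s1.length()) {
return s2.substring(s1.length());
} else {
return null; // same length: nothing to chop off
}
}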
Google's Diff Match Patch is good, but it was a pain to install into my Java Maven project. Just adding a Maven dependency did not work; Eclipse just created the directory and added the lastUpdated info files. Finally, on the third try, I added the following to my pom:
<dependency>
<groupId>fun.mike</groupId>
<artifactId>diff-match-patch</artifactId>
<version>0.0.2</version>
</dependency>
Then I manually placed the jar and source jar files into my .m2 repo from https://search.maven.org/search?q=g:fun.mike%20AND%20a:diff-match-patch%20AND%20v:0.0.2
After all that, the following code worked:
import fun.mike.dmp.Diff;
import fun.mike.dmp.DiffMatchPatch;
import java.util.LinkedList;
DiffMatchPatch dmp = new DiffMatchPatch();
LinkedList<Diff> diffs = dmp.diff_main("Hello World.", "Goodbye World.");
System.out.println(diffs);
The result:
[Diff(DELETE,"Hell"), Diff(INSERT,"G"), Diff(EQUAL,"o"), Diff(INSERT,"odbye"), Diff(EQUAL," World.")]
Obviously, this was not originally written (or even ported fully) into Java. (diff_main? I can feel the C burning into my eyes :-) )
Still, it works. And for people working with long and complex strings, it can be a valuable tool.
To find the words that are different in the two lines, one can use the following code.
String[] strList1 = str1.split(" ");
String[] strList2 = str2.split(" ");
List<String> list1 = Arrays.asList(strList1);
List<String> list2 = Arrays.asList(strList2);
// Prepare a union
List<String> union = new ArrayList<>(list1);
union.addAll(list2);
// Prepare an intersection
List<String> intersection = new ArrayList<>(list1);
intersection.retainAll(list2);
// Subtract the intersection from the union
union.removeAll(intersection);
for (String s : union) {
System.out.println(s);
}
In the end, you will have a list of the words that appear in only one of the two lists. It can easily be modified to give only the words unique to the first list or only those unique to the second. This can be done by removing the intersection from list1 or list2 alone instead of from the union.
Computing the exact location can be done by adding up the lengths of each word in the split list (along with the splitting regex) or by simply doing String.indexOf("subStr").
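Both ideas (keeping only the words unique to the first string, and locating them with indexOf) might look roughly like this. The class and method names (WordDiffSketch, onlyInFirst) and the sample sentences are assumptions made for this example. Note also that indexOf matches substrings, not whole words, so the reported position can be wrong if the word also occurs inside an earlier, longer word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordDiffSketch {
    // Words of `first` that do not occur anywhere in `second`.
    static List<String> onlyInFirst(String first, String second) {
        List<String> result = new ArrayList<>(Arrays.asList(first.split(" ")));
        result.removeAll(Arrays.asList(second.split(" ")));
        return result;
    }

    public static void main(String[] args) {
        String first = "the quick brown fox";
        String second = "the slow brown fox";
        for (String word : onlyInFirst(first, second)) {
            // prints: quick at index 4
            System.out.println(word + " at index " + first.indexOf(word));
        }
    }
}
```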
On top of using StringUtils.difference(String first, String second) as seen in other answers, you can also use StringUtils.indexOfDifference(String first, String second) to get the index of where the strings start to differ. Ex:
StringUtils.indexOfDifference("abc", "dabc") = 0
StringUtils.indexOfDifference("abc", "abcd") = 3
where 0 is used as the starting index.
Another great library for discovering the difference between strings is DiffUtils at https://github.com/java-diff-utils. I used Dmitry Naumenko's fork:
public void testDiffChange() {
final List<String> changeTestFrom = Arrays.asList("aaa", "bbb", "ccc");
final List<String> changeTestTo = Arrays.asList("aaa", "zzz", "ccc");
System.out.println("changeTestFrom=" + changeTestFrom);
System.out.println("changeTestTo=" + changeTestTo);
final Patch<String> patch0 = DiffUtils.diff(changeTestFrom, changeTestTo);
System.out.println("patch=" + Arrays.toString(patch0.getDeltas().toArray()));
String original = "abcdefghijk";
String badCopy = "abmdefghink";
List<Character> originalList = original
.chars() // Convert to an IntStream
.mapToObj(i -> (char) i) // Convert int to char, which gets boxed to Character
.collect(Collectors.toList()); // Collect in a List<Character>
List<Character> badCopyList = badCopy.chars().mapToObj(i -> (char) i).collect(Collectors.toList());
System.out.println("original=" + original);
System.out.println("badCopy=" + badCopy);
final Patch<Character> patch = DiffUtils.diff(originalList, badCopyList);
System.out.println("patch=" + Arrays.toString(patch.getDeltas().toArray()));
}
The results show exactly what changed where (zero based counting):
changeTestFrom=[aaa, bbb, ccc]
changeTestTo=[aaa, zzz, ccc]
patch=[[ChangeDelta, position: 1, lines: [bbb] to [zzz]]]
original=abcdefghijk
badCopy=abmdefghink
patch=[[ChangeDelta, position: 2, lines: [c] to [m]], [ChangeDelta, position: 9, lines: [j] to [n]]]
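For a simple use case like this, you can check the lengths of the strings and use the split function. For your example:
a.split(b)[1]
Keep in mind that split treats its argument as a regular expression, so wrap it as Pattern.quote(b) if b can contain regex metacharacters.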
I think the Levenshtein algorithm and the 3rd party libraries brought out for this very simple (and perhaps poorly stated?) test case are WAY overblown.
Assuming your example does not suggest the two characters always differ at the end, I'd suggest the JDK's Arrays.mismatch( char[], char[] ) (available since Java 9) to find the first index where the two strings differ.
String longer = "this is a example";
String shorter = "this is a examp";
int differencePoint = Arrays.mismatch( longer.toCharArray(), shorter.toCharArray() );
System.out.println( differencePoint );
You could now repeat the process if you suspect the second character is further along in the String.
Or, if as you suggest in your example the two characters are together, there is nothing further to do. Your answer then would be:
System.out.println( longer.charAt( differencePoint ) );
System.out.println( longer.charAt( differencePoint + 1 ) );
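If the extra characters are not next to each other, repeating the process could look something like the sketch below. It assumes the longer string is exactly the shorter one with some characters inserted; the class and method names (ExtraCharsSketch, extraChars) are invented for this example. Each round finds the first mismatch, records the character there, removes it, and tries again.

```java
import java.util.Arrays;

class ExtraCharsSketch {
    // Assumes `longer` is `shorter` with some characters inserted.
    // Returns the inserted characters in the order they were found.
    static String extraChars(String longer, String shorter) {
        StringBuilder extras = new StringBuilder();
        StringBuilder rest = new StringBuilder(longer);
        char[] target = shorter.toCharArray();
        while (rest.length() > target.length) {
            // Never -1 here, because the two arrays have different lengths
            int at = Arrays.mismatch(rest.toString().toCharArray(), target);
            extras.append(rest.charAt(at));
            rest.deleteCharAt(at);
        }
        return extras.toString();
    }

    public static void main(String[] args) {
        System.out.println(extraChars("this is a example", "this is a examp")); // prints le
        System.out.println(extraChars("axbyc", "abc"));                         // prints xy
    }
}
```

Like the snippet above it, this works on char values, so it is not suitable for strings with characters outside the Basic Multilingual Plane.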
If your string contains characters outside of the Basic Multilingual Plane - for example emoji - then you have to use a different technique. For example,
String a = "a 🐣 is cuter than a 🐇.";
String b = "a 🐣 is cuter than a 🐹.";
int firstDifferentChar = Arrays.mismatch( a.toCharArray(), b.toCharArray() );
int firstDifferentCodepoint = Arrays.mismatch( a.codePoints().toArray(), b.codePoints().toArray() );
System.out.println( firstDifferentChar ); // prints 22!
System.out.println( firstDifferentCodepoint ); // prints 20, which is correct.
System.out.println( a.codePoints().toArray()[ firstDifferentCodepoint ] ); // prints out 128007
System.out.println( new String( Character.toChars( 128007 ) ) ); // this prints the rabbit glyph.
You may try this (it works only when the shorter string occurs intact inside the longer one):
String a = "this is a example";
String b = "this is a examp";
String ans = a.replace(b, "");
System.out.print(ans);
// prints: le
