I got:
a list of lists of TextInfo objects. Each TextInfo object contains a piece of text and overrides toString to return that text. From the Y value of a TextInfo we can tell which TextInfos are on the same line (custom problem).
I want:
a list of strings. Each string is the result of concatenation of all elements of one sublist. And I want to make use of streams as much as possible.
So far:
List<String> allLinesByYCoordinate = groupAllTextLineByLineBasedOnY(allTextInfosOnPage);
public static List<String> groupAllTextLineByLineBasedOnY(List<TextInfo> allTextInfo) {
Map<Object, List<TextInfo>> groupedtextInfosPerLine = allTextInfo.stream().
collect(Collectors.groupingBy(x -> x.getY()));
List<String> allLines = new ArrayList<>();
for (Map.Entry<Object, List<TextInfo>> groupedtextInfos: groupedtextInfosPerLine.entrySet()) {
String temp = "";
for (TextInfo textInfo: groupedtextInfos.getValue()) {
temp += textInfo;
}
allLines.add(temp);
}
return allLines;
}
You might agree that the method groupAllTextLineByLineBasedOnY looks a bit too old-school. I'm trying to concatenate each of the lists of TextInfos in the big list, resulting in a list of strings where each string used to be a list of TextInfos.
I'm hoping to find a concise stream() solution
Let's refactor a little at a time.
First, you should never build a string with += in a loop, as it can be very inefficient; normally you would use a StringBuilder instead, but here we can simply stream the elements and collect them into a string. Also notice that we are now iterating over the map's values() instead of iterating over entrySet() and calling getValue() inside the loop.
public static List<String> groupAllTextLineByLineBasedOnY(List<TextInfo> allTextInfo) {
Map<Object, List<TextInfo>> groupedtextInfosPerLine = allTextInfo.stream()
.collect(Collectors.groupingBy(x -> x.getY()));
List<String> allLines = new ArrayList<>();
for (List<TextInfo> groupedtextInfos: groupedtextInfosPerLine.values()) {
allLines.add(groupedtextInfos.stream().map(Object::toString).collect(Collectors.joining()));
}
return allLines;
}
In the next refactoring, we can get rid of the intermediate list allLines and instead stream the textInfos and collect them into a List:
public static List<String> groupAllTextLineByLineBasedOnY(List<TextInfo> allTextInfo) {
Map<Object, List<TextInfo>> groupedtextInfosPerLine = allTextInfo.stream()
.collect(Collectors.groupingBy(x -> x.getY()));
return groupedtextInfosPerLine.values().stream()
.map(textInfos -> textInfos.stream().map(Object::toString).collect(Collectors.joining()))
.toList();
}
Finally, we can get rid of the groupedtextInfosPerLine variable:
public static List<String> groupAllTextLineByLineBasedOnY(List<TextInfo> allTextInfo) {
return allTextInfo.stream()
.collect(Collectors.groupingBy(TextInfo::getY)).values().stream()
.map(textInfos -> textInfos.stream().map(Object::toString).collect(Collectors.joining()))
.toList();
}
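Note that Collectors.groupingBy returns a HashMap by default, so the order of the lines in the result above is unspecified. If the lines should come out ordered by their Y coordinate, you can also pass a map factory. A minimal sketch, assuming getY() returns a Comparable type such as a number (the TextInfo class isn't shown in the question):
public static List<String> groupAllTextLineByLineBasedOnY(List<TextInfo> allTextInfo) {
    return allTextInfo.stream()
        // a TreeMap keeps the keys (Y values) sorted, unlike the default HashMap
        .collect(Collectors.groupingBy(TextInfo::getY, TreeMap::new, Collectors.toList()))
        .values().stream()
        .map(textInfos -> textInfos.stream().map(Object::toString).collect(Collectors.joining()))
        .toList();
}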
I think you're looking for a solution like this:
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;
class Scratch {
public static void main(String[] args) {
// Setup some example data
List<TextInfo> textInfos = List.of(
new TextInfo("line 3 a", 3),
new TextInfo("line 1 a", 1),
new TextInfo("line 2 a", 2),
new TextInfo("line 1 b", 1),
new TextInfo("line 3 b", 3),
new TextInfo("line 1 c", 1)
);
// This is the actual answer
Collection<String> allLines = textInfos.stream()
.collect(Collectors.toMap(
TextInfo::getY, // line number as key
TextInfo::toString, // convert TextInfo to String
(a, b) -> a + b, // Merge TextInfo on the same line
TreeMap::new)) // Ensure in order
.values();
// You would return allLines from the method
System.out.println(allLines);
}
static class TextInfo {
String text;
int y;
public TextInfo(String text, int y) {
this.text = text;
this.y = y;
}
public int getY() { return y; }
@Override
public String toString() { return text; }
}
}
If you run the code, it prints:
[line 1 aline 1 bline 1 c, line 2 a, line 3 aline 3 b]
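If the pieces of text on a line should be separated by a space instead of being joined directly, only the merge function needs to change; a small variation of the same stream:
Collection<String> allLines = textInfos.stream()
    .collect(Collectors.toMap(
        TextInfo::getY,          // line number as key
        TextInfo::toString,      // convert TextInfo to String
        (a, b) -> a + " " + b,   // join pieces on the same line with a space
        TreeMap::new))           // ensure in order
    .values();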
I got an array of elements like:
ArrayList<String> t = new ArrayList<>();
t.add("/folder1/sub-folder1");
t.add("/folder2/sub-folder2");
t.add("/folder1/sub-folder1/data");
I need to get the output /folder1/sub-folder1, which is the most repeated path.
In Python this can be achieved using the function below:
def getRepeatedPath(self, L):
""" Returns the highest repeated path/string in a provided list """
try:
pkgname = max(g(sorted(L)), key=lambda(x, v): (len(list(v)), -L.index(x)))[0]
return pkgname.replace("/", ".")
except:
return "UNKNOWN"
I am trying to write the equivalent lambda-based function in Java. I got stuck and need some help with the implementation.
public String mostRepeatedSubString(ArrayList<String> pathArray) {
Collections.sort(pathArray);
String mostRepeatedString = null;
Map<String,Integer> x = pathArray.stream.map(s->s.split("/")).collect(Collectors.toMap());
return mostRepeatedString;
}
Lots of tweaking, but I finally got it!
public static void main(String[] args) {
ArrayList<String> t = new ArrayList<String>();
t.add("folder1/sub-folder1");
t.add("folder2/sub-folder2");
t.add("folder1/sub-folder1/data");
System.out.println(mostRepeatedSubString(t));
}
public static String mostRepeatedSubString(List<String> pathArray) {
return pathArray
.stream()
// Split to lists of strings
.map(s -> Arrays.asList(s.split("/")))
// Group by first folder
.collect(Collectors.groupingBy(lst -> lst.get(0)))
// Find the key with the largest list value
.entrySet()
.stream()
.max((e1, e2) -> e1.getValue().size() - e2.getValue().size())
// Extract that largest list
.map(Entry::getValue)
.orElse(Arrays.asList())
// Intersect the lists in that list to find maximal matching
.stream()
.reduce(YourClassName::commonPrefix)
// Change back to a string
.map(lst -> String.join("/", lst))
.orElse("");
}
private static List<String> commonPrefix(List<String> lst1, List<String> lst2) {
int maxIndex = 0;
while (maxIndex < Math.min(lst1.size(), lst2.size()) && lst1.get(maxIndex).equals(lst2.get(maxIndex))) {
maxIndex++;
}
return lst1.subList(0, maxIndex);
}
Note that I had to remove the initial / from the paths, otherwise that character would have been used in the split, resulting in the first string in every path list being the empty string, which would always be the most common prefix. Shouldn't be too hard to do this in pre-processing though.
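For example, the pre-processing could be as simple as this (a sketch; normalized is just a hypothetical local name, and the resulting list is what you would pass to mostRepeatedSubString):
// Strip a single leading "/" so that split("/") doesn't produce a leading empty string.
List<String> normalized = pathArray.stream()
    .map(s -> s.startsWith("/") ? s.substring(1) : s)
    .collect(Collectors.toList());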
I have a String ArrayList into which I need to pass 22184 elements, from ["AA00001", "AA00005", "AA00003", ..., "ZZ00678"], and I need to generate sequence elements which are not present in the list. I have written code for that, and for smaller inputs it generates the required output. But when I add the 22184 elements and want to generate 200 unique ids which are not present in the ArrayList, I get the error:
The code of method main(String[]) is exceeding the 65535 bytes limit
Can someone please help?
import java.util.ArrayList;
public class GenerateIds
{
private static ArrayList<String> ids = new ArrayList<>();
static int n = 50; // number of IDs you want to generate
static int completed =0;
static char ID[] = new char[7];
public static void main(String[] args)
{
ids.add("AA00001");
ids.add("AA00004");
ids.add("AA00007");
generateIds(0);
for(String id : ids)
{
System.out.println(id);
}
}
private static void generateIds(int i)
{
if(n!=completed)
{
if(i<2)
{
for(char c ='A';c<='Z';c++)
{
ID[i]=c;
generateIds(i+1);
}
}
else if(i>=2 && i<7)
{
for(char c ='0';c<='9';c++)
{
ID[i]=c;
generateIds(i+1);
}
}else if(i==7)
{
String id = String.valueOf(ID);
if(!ids.contains(id))
{
ids.add(id);
completed++;
}
}
}
}
}
You can put your IDs in a text file, then use something like:
List<String> ids = Files.readAllLines(Paths.get("ids.txt"));
In Java, a method can't be larger than 65535 bytes.
The main method is becoming too large since you are doing all the adds inline:
ids.add("AA00001");
ids.add("AA00004");
ids.add("AA00007");
...
This will make the main method too long. What you can do to solve this (and to find the missing elements) is put all the String values in a List and stream over it, filtering out the ids that already exist:
public void findMissingElements() {
List<String> missingIds = allPossibleIds.stream()
.filter(isMissingIn(existingIds))
.collect(toList());
//do something with the missingIds...
}
As other readers such as matt suggested, you can e.g. put all the Strings in a file and read the file.
I wrote a small example to show how it would all work together. I rewrote your generateIds method with jOOλ to generate all the possible ids and renamed it to allPossibleIds (your recursive method would work too). As an example, I limited the numeric part to 3 digits to keep the search time down.
public class FindMissingIdsTest {
private List<String> allPossibleIds;
private List<String> existingIds;
@Before
public void setup() throws IOException {
allPossibleIds = allPossibleIds();
existingIds = retrieveIdsFromSubSystem();
}
@Test
public void findMissingElements() {
List<String> missingIds = allPossibleIds.stream()
.filter(isMissingIn(existingIds))
.collect(toList());
}
private Predicate<String> isMissingIn(List<String> existingIds) {
return possibleId -> !existingIds.contains(possibleId);
}
public List<String> allPossibleIds(){
List<String> alphabet = Seq.rangeClosed('A', 'Z').map(Object::toString).toList();
List<String> letterCombinations = Seq.seq(alphabet).crossJoin(Seq.seq(alphabet)).map(t -> t.v1 + t.v2).toList();
List<String> numericParts = IntStream.range(0, 1000)
.mapToObj(i -> String.format("%03d", i))
.collect(toList());
return Seq.seq(letterCombinations).crossJoin(Seq.seq(numericParts)).map(t -> t.v1 + t.v2).toList();
}
public List<String> retrieveIdsFromSubSystem() throws IOException {
return Files.readAllLines(Paths.get("ids.txt"));
}
}
To change to 5 digits again you can just change the 1000 to 100000 and the %03d to %05d.
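For instance, the numeric part would then be built like this (same code as above, only those two constants changed):
List<String> numericParts = IntStream.range(0, 100000)
    .mapToObj(i -> String.format("%05d", i))
    .collect(toList());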
If you can order the list, you could probably find a faster and better algorithm; it all depends on the situation. For example, with an ordered list you could build up the stream of all the ids, iterate over it, and follow along in the existing list with a pointer instead of doing a resource-consuming contains() for every id, as in the sketch below.
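A rough sketch of that idea, assuming existingIds is sorted ascending, contains no duplicates, and only holds ids the generator can produce, and that allPossibleIds is generated in the same ascending order (both names as in the test above):
List<String> missingIds = new ArrayList<>();
int pointer = 0;                        // current position in the sorted existingIds
for (String candidate : allPossibleIds) {
    if (pointer < existingIds.size() && existingIds.get(pointer).equals(candidate)) {
        pointer++;                      // candidate exists, advance in the existing list
    } else {
        missingIds.add(candidate);      // candidate never appears in the existing ids
    }
}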
I am writing a program that will receive a list of words. After that, it will store the repeated words and the non-repeated ones in two different sets. My code is the following:
public class Runner
{
public static void run (Set<String> words)
{
Set<String> uniques= new HashSet<String>();
Set<String> dupes= new HashSet<String>();
Iterator<String> w = words.iterator();
while (w.hasNext())
{
if (!(uniques.add(w.next())))
{
dupes.add(w.next());
}
else
{
uniques.add(w.next());
}
}
System.out.println ("Uniques: "+ uniques);
System.out.println ("Dupes: "+ dupes);
}
}
However, the output for the following:
right, left, up, left, down
is:
Uniques: [left, right, up, down]
Dupes: []
and my desired would be:
Uniques: [right, left, up, down]
Dupes: [ left ]
I want to achieve this using sets. I know it would be way easier to just use an ArrayList, but I am trying to understand sets.
The reason for your problem is that the argument words is a Set<String>; a set by definition will not contain duplicates, so the duplicates are gone before your loop even runs. The argument words should be a List<String>. The code also makes the mistake of calling w.next() more than once per iteration; every call to next() advances the iterator to a new element.
public static void run(List<String> words) {
Set<String> uniques= new HashSet<String>();
Set<String> dupes= new HashSet<String>();
Iterator<String> w = words.iterator();
while(w.hasNext()) {
String word = w.next();
if(!uniques.add(word)) {
dupes.add(word);
}
}
}
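One more detail: the desired output lists the words in the order they were first seen, but HashSet makes no ordering guarantee. If that order matters, a LinkedHashSet preserves insertion order; a minimal sketch of the same method with that change and the printouts kept:
public static void run(List<String> words) {
    Set<String> uniques = new LinkedHashSet<>();   // keeps insertion order
    Set<String> dupes = new LinkedHashSet<>();
    for (String word : words) {
        if (!uniques.add(word)) {   // add() returns false if the word was already present
            dupes.add(word);
        }
    }
    System.out.println("Uniques: " + uniques);
    System.out.println("Dupes: " + dupes);
}
With the input right, left, up, left, down this prints Uniques: [right, left, up, down] and Dupes: [left], matching the desired output.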
You are doing uniques.add(w.next()) twice. Why?
Also, don't keep calling w.next() - each call advances the iterator. Call it once and keep a local reference.
Use:
String next = w.next();
if(uniques.contains(next)) {
// it's a dupe
} else {
// it's not a dupe
}
I'm developing a Java application that reads a lot of string data like this:
1 cat (first read)
2 dog
3 fish
4 dog
5 fish
6 dog
7 dog
8 cat
9 horse
...(last read)
I need a way to keep all the [string, occurrences] pairs in order from last read to first read.
string occurrences
horse 1 (first print)
cat 2
dog 4
fish 2 (last print)
Currently I use two lists:
1) List<String> input; where I add all the data.
In my example:
input.add("cat");
input.add("dog");
input.add("fish");
...
2) List<String> possibilities; where I insert each string only once, in this way:
if(possibilities.contains("cat")){
possibilities.remove("cat");
}
possibilities.add("cat");
In this way I've got a list of all the possibilities, ordered by their last occurrence.
I use it like that:
int occurrence;
for(String possible:possibilities){
occurrence = Collections.frequency(input, possible);
System.out.println(possible + " " + occurrence);
}
That trick works, but it's too slow (I've got millions of inputs)... any help?
(English isn’t my first language, so please excuse any mistakes.)
Use a Map<String, Integer>, as @radoslaw pointed out. To keep the insertion order, use a LinkedHashMap and not a TreeMap, as described here:
LinkedHashMap keeps the keys in the order they were inserted, while a TreeMap is kept sorted via a Comparator or the natural Comparable ordering of the elements.
Imagine you have all the strings in some array, call it listOfAllStrings. Iterate over this array and use each string as a key in your map: if it does not exist yet, put it in the map; if it exists, add 1 to the current count...
Map<String, Integer> results = new LinkedHashMap<String, Integer>();
for (String s : listOfAllStrings) {
if (results.get(s) != null) {
results.put(s, results.get(s) + 1);
} else {
results.put(s, 1);
}
}
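On Java 8 and later the same loop can be written a bit more compactly with Map.merge, which puts 1 for a new key and otherwise adds 1 to the existing count (same LinkedHashMap, just a shorter way to increment):
Map<String, Integer> results = new LinkedHashMap<>();
for (String s : listOfAllStrings) {
    results.merge(s, 1, Integer::sum);   // insert 1, or add 1 to the current count
}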
Make use of a TreeMap, which will keep the keys ordered as specified by the compare method of your MyStringComparator class. It handles a MyString class that wraps String and adds an insertion index, like this:
// this better be immutable
class MyString {
    private MyString() {}
    public static MyString valueOf(String s, Long l) { ... }
    private String string;
    private Long index;
    public Long getIndex() { return index; }
    @Override
    public int hashCode() { return string.hashCode(); }
    @Override
    public boolean equals(Object o) {
        // rely on string.equals()
        return o instanceof MyString && string.equals(((MyString) o).string);
    }
}
class MyStringComparator implements Comparator<MyString> {
public int compare(MyString s1, MyString s2) {
return -s1.getIndex().compareTo(s2.getIndex());
}
}
Pass the comparator while constructing the map:
Map<MyString,Integer> map = new TreeMap<>(new MyStringComparator());
Then, while parsing your input, do
long counter = 0;
while (...) {
    MyString item = MyString.valueOf(readString, counter++);
    if (map.containsKey(item)) {
        map.put(item, map.get(item) + 1);
    } else {
        map.put(item, 1);
    }
}
There will be a lot of instantiation because of the immutable class, and the comparator will not be consistent with equals, but it should work.
Disclaimer: this is untested code just to show what I'd do, I'll come back and recheck it when I get my hands on a compiler.
Here is the complete solution for your problem,
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class DataDto implements Comparable<DataDto>{
public int count = 1; // a new DataDto already represents one occurrence
public String string;
public long lastSeenTime;
public DataDto(String string) {
this.string = string;
this.lastSeenTime = System.nanoTime(); // same time source as the update in main
}
public boolean equals(Object object) {
if(object != null && object instanceof DataDto) {
DataDto temp = (DataDto) object;
if(temp.string != null && temp.string.equals(this.string)) {
return true;
}
}
return false;
}
public int hashCode() {
return string.hashCode();
}
public int compareTo(DataDto o) {
    // most recently seen (largest lastSeenTime) first
    return Long.compare(o.lastSeenTime, this.lastSeenTime);
}
public String toString() {
return this.string + " : " + this.count;
}
public static final void main(String[] args) {
String[] listOfAllStrings = {"horse", "cat", "dog", "fish", "cat", "fish", "dog", "cat", "horse", "fish"};
Map<String, DataDto> results = new HashMap<String, DataDto>();
for (String s : listOfAllStrings) {
DataDto dataDto = results.get(s);
if(dataDto != null) {
dataDto.count = dataDto.count + 1;
dataDto.lastSeenTime = System.nanoTime();
} else {
dataDto = new DataDto(s);
results.put(s, dataDto);
}
}
List<DataDto> finalResults = new ArrayList<DataDto>(results.values());
System.out.println(finalResults);
Collections.sort(finalResults);
System.out.println(finalResults);
}
}
Output:
[horse : 2, cat : 3, fish : 3, dog : 2]
[fish : 3, horse : 2, cat : 3, dog : 2]
I think this solution will be suitable for your requirement.
If you know that your data is not going to exceed your memory capacity when you read it all into memory, then the solution is simple - use a LinkedList and a LinkedHashMap.
For example, if you use a Linked list:
LinkedList<String> input = new LinkedList();
You then proceed to use input.add() as you did originally. But once the input list is fully populated, you basically use Jordi Castilla's solution - except that you feed it the entries of the linked list in reverse order. To do that, you do:
Iterator<String> iter = list.descendingIterator();
LinkedHashMap<String,Integer> map = new LinkedHashMap<>();
while (iter.hasNext()) {
String s = iter.next();
if ( map.containsKey(s)) {
map.put( s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
Now, the only real difference between his solution and mine is that I'm using list.descendingIterator() which is a method in LinkedList that gives you the entries in backwards order, from "horse" to "cat".
The LinkedHashMap will keep the proper order - whatever was entered first will be printed first, and because we entered things in reverse order, then whatever was read last will be printed first. So if you print your map the result will be:
{horse=1, cat=2, dog=4, fish=2}
If you have a very long file, and you can't load the entire list of strings into memory, you had better keep just the map of frequencies. In this case, in order to keep the order of entry, we'll use an object such as this:
private static class Entry implements Comparable<Entry> {
private static long nextOrder = Long.MIN_VALUE;
private String str;
private int frequency = 1;
private long order = nextOrder++;
public Entry(String str) {
this.str = str;
}
public String getString() {
return str;
}
public int getFrequency() {
return frequency;
}
public void updateEntry() {
frequency++;
order = nextOrder++;
}
@Override
public int compareTo(Entry e) {
if ( order > e.order )
return -1;
if ( order < e.order )
return 1;
return 0;
}
@Override
public String toString() {
return String.format( "%s: %d", str, frequency );
}
}
The trick here is that every time you update the entry (add one to the frequency), it also updates the order. But the compareTo() method orders Entry objects from high order (updated/inserted later) to low order (updated/inserted earlier).
Now you can use a simple HashMap<String,Entry> to store the information as you read it (I'm assuming you are reading from some sort of scanner):
Map<String,Entry> m = new HashMap<>();
while ( scanner.hasNextLine() ) {
String str = scanner.nextLine();
Entry entry = m.get(str);
if ( entry == null ) {
entry = new Entry(str);
m.put(str, entry);
} else {
entry.updateEntry();
}
}
scanner.close();
Now you can sort the values of the entries:
List<Entry> orderedList = new ArrayList<Entry>(m.values());
m = null;
Collections.sort(orderedList);
Running System.out.println(orderedList) will give you:
[horse: 1, cat: 2, dog: 4, fish: 2]
In principle, you could use a TreeMap whose keys contained the "order" stuff, rather than a plain HashMap like this followed by sorting, but I prefer not to have mutable keys in a map, nor to change the keys constantly. Here we are only changing the values as we fill the map, and each key is inserted into the map only once.
What you could do:
1. Reverse the order of the list using Collections.reverse(input). This runs in linear time - O(n);
2. Create a Set from the input list. A Set guarantees uniqueness. To preserve insertion order, you'll need a LinkedHashSet;
3. Iterate over this set, just as you did above.
Code:
/* I don't know what logic you use to create the input list,
* so I'm using your input example. */
List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
"fish", "dog", "dog", "cat", "horse");
/* by the way, this changes the input list!
* Copy it in case you need to preserve the original input. */
Collections.reverse(input);
Set<String> possibilities = new LinkedHashSet<String>(input);
for (String s : possibilities) {
System.out.println(s + " " + Collections.frequency(input, s));
}
Output:
horse 1
cat 2
dog 4
fish 2