What is making my code use so much memory? - java

I have solved in various ways a simple problem on CodeEval, which specification can be found here (only a few lines long).
I have made 3 working versions (one of them in Scala) and I don't understand the difference of performances for my last Java version which I expected to be the best time and memory-wise.
I also compared this to a code found on Github. Here are the performance stats returned by CodeEval :
. Version 1 is the version found on Github
. Version 2 is my Scala solution :
object Main extends App {
val p = Pattern.compile("\\d+")
scala.io.Source.fromFile(args(0)).getLines
.filter(!_.isEmpty)
.map(line => {
val dists = new TreeSet[Int]
val m = p.matcher(line)
while (m.find) dists += m.group.toInt
val list = dists.toList
list.zip(0 +: list).map { case (x,y) => x - y }.mkString(",")
})
.foreach(println)
}
. Version 3 is my Java solution which I expected to be the best :
public class Main {
public static void main(String[] args) throws IOException {
Pattern p = Pattern.compile("\\d+");
File file = new File(args[0]);
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
Set<Integer> dists = new TreeSet<Integer>();
Matcher m = p.matcher(line);
while (m.find()) dists.add(Integer.parseInt(m.group()));
Iterator<Integer> it = dists.iterator();
int prev = 0;
StringBuilder sb = new StringBuilder();
while (it.hasNext()) {
int curr = it.next();
sb.append(curr - prev);
sb.append(it.hasNext() ? "," : "");
prev = curr;
}
System.out.println(sb);
}
br.close();
}
}
Version 4 is the same as version 3 except I don't use a StringBuilder to print the output and do like in version 1
Here is how I interpreted those results :
version 1 is too slow because of the too high number of System.out.print calls. Moreover, using split on very large lines (that's the case in the tests performed) uses a lot of memory.
version 2 seems slow too but it is mainly because of an "overhead" on running Scala code on CodeEval, even very efficient code run slowly on it
version 2 uses unnecessary memory to build a list from the set, which also takes some time but should not be too significant. Writing more efficient Scala would probably like writing it in Java so I preferred elegance to performance
version 3 should not use that much memory in my opinion. The use of a StringBuilder has the same impact on memory as calling mkString in version 2
version 4 proves the calls to System.out.println are slowering down the program
Does someone see an explanation to those results ?

I conducted some tests.
There is a baseline for every type of language. I code in java and javascript. For javascript here are my test results:
Rev 1: Default empty boilerplate for JS with a message to standard output
Rev 2: Same without file reading
Rev 3: Just a message to the standard output
You can see that no matter what, there will be at least 200 ms runtime and about 5 megs of memory usage. This baseline depends on the load of the servers as well! There was a time when codeevals was heavily overloaded, thus making impossible to run anything within the max time(10s).
Check this out, a totally different challenge than the previous:
Rev4: My solution
Rev5: The same code submitted again now. Scored 8000 more ranking point. :D
Conclusion: I would not worry too much about CPU and memory usage and rank. It is clearly not reliable.

Your scala solution is slow, not because of "overhead on CodeEval", but because you are building an immutable TreeSet, adding elements to it one by one. Replacing it with something like
val regex = """\d+""".r // in the beginning, instead of your Pattern.compile
...
.map { line =>
val dists = regex.findAllIn(line).map(_.toInt).toIndexedSeq.sorted
...
Should shave about 30-40% off your execution time.
Same approach (build a list, then sort) will, probably, help your memory utilization in "version 3" (java sets are real memory hogs). It is also a good idea to give your list an initial size while you are at it (otherwise, it'll grow by 50% every time it runs out of capacity, which is wasteful in both memory and performance). 600 sounds like a good number, since that's the upper bound for the number of cities from the problem description.
Now, since we know the upper boundary, an even faster and slimmer approach is to do away with lists and boxed Integeres, and just do int dists[] = new int[600];.
If you wanted to get really fancy, you'd also make use of the "route length" range that's mentioned in the description. For example, instead of throwing ints into an array and sorting (or keeping a treeset), make an array of 20,000 bits (or even 20K bytes for speed), and set those that you see in input as you read it ... That would be both faster and more memory efficient than any of your solutions.

I tried solving this question and figured that you don't need the names of the cities, just the distances in a sorted array.
It has much better runtime of 738ms, and memory of 4513792 with this.
Although this may not help improve your piece of code, it seems like a better way to approach the question. Any suggestions to improve the code further are welcome.
import java.io.*;
import java.util.*;
public class Main {
public static void main (String[] args) throws IOException {
File file = new File(args[0]);
BufferedReader buffer = new BufferedReader(new FileReader(file));
String line;
while ((line = buffer.readLine()) != null) {
line = line.trim();
String out = new Main().getDistances(line);
System.out.println(out);
}
}
public String getDistances(String s){
//split the string
String[] arr = s.split(";");
//create an array to hold the distances as integers
int[] distances = new int[arr.length];
for(int i=0; i<arr.length; i++){
//find the index of , - get the characters after that - convert to integer - add to distances array
distances[i] = Integer.parseInt(arr[i].substring(arr[i].lastIndexOf(",")+1));
}
//sort the array
Arrays.sort(distances);
String output = "";
output += distances[0]; //append the distance to the closest city to the string
for(int i=0; i<arr.length-1; i++){
//get distance between current element(city) and next
int distance_between = distances[i+1] - distances[i];
//append the distance to the string
output += "," + distance_between;
}
return output;
}
}

Related

Storing and comparing a large quantity of Strings in Java

My application stores a large number (about 700,000) of strings in an ArrayList. The strings are loaded from a text file like this:
List<String> stringList = new ArrayList<String>(750_000);
//there's a try catch here but I omitted it for this example
Scanner fileIn = new Scanner(new FileInputStream(listPath), "UTF-8");
while (fileIn.hasNext()) {
String s = fileIn.nextLine().trim();
if (s.isEmpty()) continue;
if (s.startsWith("#")) continue; //ignore comments
stringList.add(s);
}
fileIn.close();
Later on, Other strings are compared to this list, using this code:
String example = "Something";
if (stringList.contains(example))
doSomething();
This comparison will happen many hundreds (thousands?) of times.
This all works, but I want to know if there's anything I can do to make it better. I notice that the JVM increases in size from about 100MB to 600MB when it loads the 700K Strings. The strings are mainly about this size:
Blackened Recordings
Divergent Series: Insurgent
Google
Pixels Movie Money
X Ambassadors
Power Path Pro Advanced
CYRFZQ
Is there anything I can do to reduce the memory, or is that to be expected? Any suggestions in general?
ArrayList is a memory effective. Probably your issue is caused by java.util.Scanner. Scanner creates a lot of temp objects during parsing (Patterns, Matchers etc) and not suitable for big files.
Try to replace it with java.io.BufferedReader:
List<String> stringList = new ArrayList<String>();
BufferedReader fileIn = new BufferedReader(new FileReader("UTF-8"));
String line = null;
while ((line = fileIn.readLine()) != null) {
line = line.trim();
if (line.isEmpty()) continue;
if (line.startsWith("#")) continue; //ignore comments
stringList.add(line);
}
fileIn.close();
See java.util.Scanner source code
To pinpoint memory issue attach to your JVM any memory profiler, for example VisualVM from JDK tools.
Added:
Let's make few assumtions:
you have 700000 string with 20 characters each.
object reference size is 32 bits, object header - 24, array header - 16, char - 16, int 32.
Then every string will consume 24+32*2+32+(16+20*16) = 456 bits.
Whole ArrayList with string object will consume about 700000*(32*2+456) = 364000000 bits = 43.4 MB (very roughly).
Not quite an answer, but:
Your scenario uses around 70mb on my machine:
long usedMemory = -(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory());
{//
String[] strings = new String[700_000];
for (int i = 0; i < strings.length; i++) {
strings[i] = new String(new char[20]);
}
}//
usedMemory += Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
System.out.println(usedMemory / 1_000_000d + " mb");
How did you reach 500mb there? As far as I know, String has internally a char[], and each char has 16 bits. Taking the Object and String overhead in account, 500mb is still quite much for the strings only. You may perform some benchmarking tests on your machine.
As others already mentioned, you should change the datastructure for element look-ups/comparison.
You're likely going to be better off using a HashSet instead of an ArrayList as both add and contains are constant time operations in a HashSet.
However, it does assume that your object's hashCode implementation (which is part of Object, but can be overridden) is evenly distributed.
There is a Trie data structure which can be used as dictionary, with so many strings they can occur multiple times. https://en.wikipedia.org/wiki/Trie . It seems to fit your case.
UPDATE:
An alternative can be HashSet or HashMap string -> something if you want occurrences of strings for example. Hashed collection will be faster than list for sure.
I would start with HashSet.
Using an ArrayList is a very bad idea for your use case, because it is not sorted, and hence you cannot efficiently search for an entry.
The best built-in type for your case is a is a TreeSet<String>. It guarantees O(log(n)) Performance for add() and contains().
Be aware that TreeSet is not thread-safe in the basic implementation. Use an mt-safe wrapper (see the JavaDocs of TreeSet for this).
Here is a Java 8 approach. It uses Files.lines() method which take advantage of Stream API. This method reads all lines from a file as a Stream.
As a consequence no String objects are created till the terminal operation which is a static method MyExecutor.doSomething(String).
/**
* Process lines from a file.
* Uses Files.lines() method which take advantage of Stream API introduced in Java 8.
*/
private static void processStringsFromFile(final Path file) {
try (Stream<String> lines = Files.lines(file)) {
lines.map(s -> s.trim())
.filter(s -> !s.isEmpty())
.filter(s -> !s.startsWith("#"))
.filter(s -> s.contains("Something"))
.forEach(MyExecutor::doSomething);
} catch (IOException ex) {
logProcessStringsFailed(ex);
}
}
I conducted an Analysis of Memory Usage in NetBeans and here are the Memory Results for empty implementation of doSomething()
public static void doSomething(final String s) {
}
Live Bytes = 6702720 ≈ 6.4MB.

Parse and Extract unique value from a text file efficiently

I have two tsv files to parse and extract values from each file. Each line may have 4-5 attributes per line. The content of both the files are as below :
1 44539 C T 19.44
1 44994 A G 4.62
1 45112 TATGG 0.92
2 43635 Z Q 0.87
3 5672 AAS 0.67
There are some records in each file that have first 3 or 4 attributes same but different value. I want to retain higher value of such records and prepare new file with all unique values. For example:
1 44539 C T 19.44
1 44539 C T 25.44
I need to retain one with the higher value in above case record with value 25.44
I have drafted code for this however after few minutes the program runs slow. I am reading each record from a file forming a key value pair with the first 3 or 4 records as key and last record as value and storing it in hashmap and use it to again write to a file. Is there a better solution?
also how can I test if my code is giving me correct output in file?
One file is of size 498 MB with 23822225 records and other is of 515 MB with 24500367 records.
I get Exception in thread "main" java.lang.OutOfMemoryError: Java heap space error for the file with size 515 MB.
Is there a better way I can code to execute the program efficiently with out increasing heap size.
I might have to deal with larger files in future, what would be the trick to solve such problems?
public class UniqueExtractor {
private int counter = 0;
public static void main(String... aArgs) throws IOException {
UniqueExtractor parser = new UniqueExtractor("/Users/xxx/Documents/xyz.txt");
long startTime = System.currentTimeMillis();
parser.processLineByLine();
parser.writeToFile();
long endTime = System.currentTimeMillis();
long total_time = endTime - startTime;
System.out.println("done in " + total_time/1000 + "seconds ");
}
public void writeToFile()
{
System.out.println("writing to a file");
try {
PrintWriter writer = new PrintWriter("/Users/xxx/Documents/xyz_unique.txt", "UTF-8");
Iterator it = map.entrySet().iterator();
StringBuilder sb = new StringBuilder();
while (it.hasNext()) {
sb.setLength(0);
Map.Entry pair = (Map.Entry)it.next();
sb.append(pair.getKey());
sb.append(pair.getValue());
writer.println(sb.toString());
writer.flush();
it.remove();
}
}
catch(Exception e)
{
e.printStackTrace();
}
}
public UniqueExtractor(String fileName)
{
fFilePath = fileName;
}
private HashMap<String, BigDecimal> map = new HashMap<String, BigDecimal>();
public final void processLineByLine() throws IOException {
try (Scanner scanner = new Scanner(new File(fFilePath))) {
while (scanner.hasNextLine())
{
//System.out.println("ha");
System.out.println(++counter);
processLine(scanner.nextLine());
}
}
}
protected void processLine(String aLine)
{
StringBuilder sb = new StringBuilder();
String[] split = aLine.split(" ");
BigDecimal bd = null;
BigDecimal bd1= null;
for (int i=0; i < split.length-1; i++)
{
//System.out.println(i);
//System.out.println();
sb.append(split[i]);
sb.append(" ");
}
bd= new BigDecimal((split[split.length-1]));
//System.out.print("key is" + sb.toString());
//System.out.println("value is "+ bd);
if (map.containsKey(sb.toString()))
{
bd1 = map.get(sb.toString());
int res = bd1.compareTo(bd);
if (res == -1)
{
System.out.println("replacing ...."+ sb.toString() + bd1 + " with " + bd);
map.put(sb.toString(), bd);
}
}
else
{
map.put(sb.toString(), bd);
}
sb.setLength(0);
}
private String fFilePath;
}
There are a couple main things you may want to consider to improve the performance of this program.
Avoid BigDecimal
While BigDecimal is very useful, it has a lot of overhead, both in speed and space requirements. According to your examples, you don't have very much precision to worry about, so I would recommend switching to plain floats or doubles. These would take a mere fraction of the space (so you could process larger files) and would probably be faster to work with.
Avoid StringBuilder
This is not a general rule, but applies in this case: you appear to be parsing and then rebuilding aLine in processLine. This is very expensive, and probably unnecessary. You could, instead, use aLine.lastIndexOf('\t') and aLine.substring to cut up the String with much less overhead.
These two should significantly improve the performance of your code, but don't address the overall algorithm.
Dataset splitting
You're trying to handle enough data that you might want to consider not keeping all of it in memory at once.
For example, you could split up your data set into multiple files based on the first field, run your program on each of the files, and then rejoin the files into one. You can do this with more than one field if you need more splitting. This requires less memory usage because the splitting program does not have to keep more than a single line in memory at once, and the latter programs only need to keep a chunk of the original data in memory at once, not the entire thing.
You may want to try the specific optimizations outlined above, and then see if you need more efficiency, in which case try to do dataset splitting.

Null Pointer that makes no sense to me?

Im currently working on a program and any time i call Products[1] there is no null pointer error however, when i call Products[0] or Products[2] i get a null pointer error. However i am still getting 2 different outputs almost like there is a [0] and 1 or 1 and 2 in the array. Here is my code
FileReader file = new FileReader(location);
BufferedReader reader = new BufferedReader(file);
int numberOfLines = readLines();
String [] data = new String[numberOfLines];
Products = new Product[numberOfLines];
calc = new Calculator();
int prod_count = 0;
for(int i = 0; i < numberOfLines; i++)
{
data = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
if(data[i].contains("input"))
{
continue;
}
Products[prod_count] = new Product();
Products[prod_count].setName(data[1]);
System.out.println(Products[prod_count].getName());
BigDecimal price = new BigDecimal(data[2]);
Products[prod_count].setPrice(price);
for(String dataSt : data)
{
if(dataSt.toLowerCase().contains("imported"))
{
Products[prod_count].setImported(true);
}
else{
Products[prod_count].setImported(false);
}
}
calc.calculateTax(Products[prod_count]);
calc.calculateItemTotal(Products[prod_count]);
prod_count++;
This is the output :
imported box of chocolates
1.50
11.50
imported bottle of perfume
7.12
54.62
This print works System.out.println(Products[1].getProductTotal());
This becomes a null pointer System.out.println(Products[2].getProductTotal());
This also becomes a null pointer System.out.println(Products[0].getProductTotal());
You're skipping lines containing "input".
if(data[i].contains("input")) {
continue; // Products[i] will be null
}
Probably it would be better to make products an ArrayList, and add only the meaningful rows to it.
products should also start with lowercase to follow Java conventions. Types start with uppercase, parameters & variables start with lowercase. Not all Java coding conventions are perfect -- but this one's very useful.
The code is otherwise structured fine, but arrays are not a very flexible type to build from program logic (since the length has to be pre-determined, skipping requires you to keep track of the index, and it can't track the size as you build it).
Generally you should build List (ArrayList). Map (HashMap, LinkedHashMap, TreeMap) and Set (HashSet) can be useful too.
Second bug: as Bohemian says: in data[] you've confused the concepts of a list of all lines, and data[] being the tokens parsed/ split from a single line.
"data" is generally a meaningless term. Use meaningful terms/names & your programs are far less likely to have bugs in them.
You should probably just use tokens for the line tokens, not declare it outside/ before it is needed, and not try to index it by line -- because, quite simply, there should be absolutely no need to.
for(int i = 0; i < numberOfLines; i++) {
// we shouldn't need data[] for all lines, and we weren't using it as such.
String line = reader.readLine();
String[] tokens = line.split("(?<=\\d)\\s+|\\s+at\\s+");
//
if (tokens[0].equals("input")) { // unclear which you actually mean.
/* if (line.contains("input")) { */
continue;
}
When you offer sample input for a question, edit it into the body of the question so it's readable. Putting it in the comments, where it can't be read properly, is just wasting the time of people who are trying to help you.
Bug alert: You are overwriting data:
String [] data = new String[numberOfLines];
then in the loop:
data = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
So who knows how large it is - depends on the success of the split - but your code relies on it being numberOfLines long.
You need to use different indexes for the line number and the new product objects. If you have 20 lines but 5 of them are "input" then you only have 15 new product objects.
For example:
int prod_count = 0;
for (int i = 0; i < numberOfLines; i++)
{
data = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
if (data[i].contains("input"))
{
continue;
}
Products[prod_count] = new Product();
Products[prod_count].setName(data[1]);
// etc.
prod_count++; // last thing to do
}

Tips optimizing Java code

So, I've written a spellchecker in Java and things work as they should. The only problem is that if I use a word where the max allowed distance of edits is too large (like say, 9) then my code runs out of memory. I've profiled my code and dumped the heap into a file, but I don't know how to use it to optimize my code.
Can anyone offer any help? I'm more than willing to put up the file/use any other approach that people might have.
-Edit-
Many people asked for more details in the comments. I figured that other people would find them useful, and they might get buried in the comments. Here they are:
I'm using a Trie to store the words themselves.
In order to improve time efficiency, I don't compute the Levenshtein Distance upfront, but I calculate it as I go. What I mean by this is that I keep only two rows of the LD table in memory. Since a Trie is a prefix tree, it means that every time I recurse down a node, the previous letters of the word (and therefore the distance for those words) remains the same. Therefore, I only calculate the distance with that new letter included, with the previous row remaining unchanged.
The suggestions that I generate are stored in a HashMap. The rows of the LD table are stored in ArrayLists.
Here's the code of the function in the Trie that leads to the problem. Building the Trie is pretty straight forward, and I haven't included the code for the same here.
/*
* #param letter: the letter that is currently being looked at in the trie
* word: the word that we are trying to find matches for
* previousRow: the previous row of the Levenshtein Distance table
* suggestions: all the suggestions for the given word
* maxd: max distance a word can be from th query and still be returned as suggestion
* suggestion: the current suggestion being constructed
*/
public void get(char letter, ArrayList<Character> word, ArrayList<Integer> previousRow, HashSet<String> suggestions, int maxd, String suggestion){
// the new row of the trie that is to be computed.
ArrayList<Integer> currentRow = new ArrayList<Integer>(word.size()+1);
currentRow.add(previousRow.get(0)+1);
int insert = 0;
int delete = 0;
int swap = 0;
int d = 0;
for(int i=1;i<word.size()+1;i++){
delete = currentRow.get(i-1)+1;
insert = previousRow.get(i)+1;
if(word.get(i-1)==letter)
swap = previousRow.get(i-1);
else
swap = previousRow.get(i-1)+1;
d = Math.min(delete, Math.min(insert, swap));
currentRow.add(d);
}
// if this node represents a word and the distance so far is <= maxd, then add this word as a suggestion
if(isWord==true && d<=maxd){
suggestions.add(suggestion);
}
// if any of the entries in the current row are <=maxd, it means we can still find possible solutions.
// recursively search all the branches of the trie
for(int i=0;i<currentRow.size();i++){
if(currentRow.get(i)<=maxd){
for(int j=0;j<26;j++){
if(children[j]!=null){
children[j].get((char)(j+97), word, currentRow, suggestions, maxd, suggestion+String.valueOf((char)(j+97)));
}
}
break;
}
}
}
Here's some code I quickly crafted showing one way to generate the candidates and to then "rank" them.
The trick is: you never "test" a non-valid candidate.
To me your: "I run out of memory when I've got an edit distance of 9" screams "combinatorial explosion".
Of course to dodge a combinatorial explosion you don't do thing like trying to generate yourself all words that are at a distance from '9' from your misspelled work. You start from the misspelled word and generate (quite a lot) of possible candidates, but you refrain from creating too many candidates, for then you'd run into trouble.
(also note that it doesn't make much sense to compute up to a Levenhstein Edit Distance of 9, because technically any word less than 10 letters can be transformed into any other word less than 10 letters in max 9 transformations)
Here's why you simply cannot test all words up to a distance of 9 without either having an OutOfMemory error or simply a program never terminating:
generating all the LED up to 1 for the word "ptmizing", by only adding one letter (from a to z) generates already 9*26 variations (i.e. 324 variations) [there are 9 positions where you can insert one out of 26 letters)
generating all the LED up to 2, by only adding one letter to what we know have generates already 10*26*324 variations (60 840)
generating all the LED up to 3 gives: 17 400 240 variations
And that is only by considering the case where we add one, add two or add three letters (we're not counting deletion, swaps, etc.). And that is on a misspelled word that is only nine characters long. On "real" words, it explodes even faster.
Sure, you could get "smart" and generate this in a way not to have too many dupes etc. but the point stays: it's a combinatorial explosion that explodes fastly.
Anyway... Here's an example. I'm simply passing the dictionary of valid words (containing only four words in this case) to the corresponding method to keep this short.
You'll obviously want to replace the call to the LED with your own LED implementation.
The double-metaphone is just an example: in a real spellchecker words that do "sound alike"
despite further LED should be considered as "more correct" and hence often suggest first. For example "optimizing" and "aupteemising" are quite far from a LED point of view, but using the double-metaphone you should get "optimizing" as one of the first suggestion.
(disclaimer: following was cranked in a few minutes, it doesn't take into account uppercase, non-english words, etc.: it's not a real spell-checker, just an example)
#Test
public void spellCheck() {
final String src = "misspeled";
final Set<String> validWords = new HashSet<String>();
validWords.add("boing");
validWords.add("Yahoo!");
validWords.add("misspelled");
validWords.add("stackoverflow");
final List<String> candidates = findNonSortedCandidates( src, validWords );
final SortedMap<Integer,String> res = computeLevenhsteinEditDistanceForEveryCandidate(candidates, src);
for ( final Map.Entry<Integer,String> entry : res.entrySet() ) {
System.out.println( entry.getValue() + " # LED: " + entry.getKey() );
}
}
private SortedMap<Integer, String> computeLevenhsteinEditDistanceForEveryCandidate(
final List<String> candidates,
final String mispelledWord
) {
final SortedMap<Integer, String> res = new TreeMap<Integer, String>();
for ( final String candidate : candidates ) {
res.put( dynamicProgrammingLED(candidate, mispelledWord), candidate );
}
return res;
}
private int dynamicProgrammingLED( final String candidate, final String misspelledWord ) {
return Levenhstein.getLevenshteinDistance(candidate,misspelledWord);
}
Here you generate all possible candidates using several methods. I've only implemented one such method (and quickly so it may be bogus but that's not the point ; )
private List<String> findNonSortedCandidates( final String src, final Set<String> validWords ) {
final List<String> res = new ArrayList<String>();
res.addAll( allCombinationAddingOneLetter(src, validWords) );
// res.addAll( allCombinationRemovingOneLetter(src) );
// res.addAll( allCombinationInvertingLetters(src) );
return res;
}
private List<String> allCombinationAddingOneLetter( final String src, final Set<String> validWords ) {
final List<String> res = new ArrayList<String>();
for (char c = 'a'; c < 'z'; c++) {
for (int i = 0; i < src.length(); i++) {
final String candidate = src.substring(0, i) + c + src.substring(i, src.length());
if ( validWords.contains(candidate) ) {
res.add(candidate); // only adding candidates we know are valid words
}
}
if ( validWords.contains(src+c) ) {
res.add( src + c );
}
}
return res;
}
One thing you could try is, increase the Java's heap size, in order to overcome "out of memory error".
Following article will help you in order to understand how to increase heap size in Java
http://viralpatel.net/blogs/2009/01/jvm-java-increase-heap-size-setting-heap-size-jvm-heap.html
But I think the better approach to address your problem is, find out a better algorithm than the current algorithm
Well without more Information on the topic there is not much the community could do for you... You can start with the following:
Look at what your Profiler says (after it has run a little while): Does anything pile up? Are there a lot of Objects - this should normally give you a hint on what is wrong with your code.
Publish your saved dump somewhere and link it in your question, so someone else could take a look at it.
Tell us which profiler you are using, then somebody can give you hints on where to look for valuable information.
After you have narrowed down your problem to a specific part of your Code, and you cannot figure out why there are so many objects of $FOO in your memory, post a snippet of the relevant part.

Remove the delimiter , at the end

String prefix = "";
for (String serverId : serverIds) {
sb.append(prefix);
prefix = ",";
sb.append(serverId);
}
The following code runs faster than the above code . the "," prefix object does unnecessary object creation on every iteration . The above code takes 86324 nano seconds,while mine takes only 68165 nano seconds.
List<String> l = Arrays.asList("SURESH1","SURESH2","SURESH4","SURESH5");
StringBuffer l1 = new StringBuffer();
int sz = l.size();
int i=0; long t =
System.nanoTime();
for (String s : l)
{
l1.append(s);
if ( i != sz-1)
l1.append(","); i++;
}
}
long t2 = System.nanoTime();
System.out.println ((t2-t)); System.out.println(l1);
// The time taken for the above code is 68165 nano seconds
SURESH1,SURESH2,SURESH4,SURESH5
kindly let me know which one is better in ur view.
A few points:
My code doesn't require you to know the number of elements up-front. In other words, it can work over any Iterable<String>
Why are you using StringBuffer at all rather than StringBuilder?
The "empty prefix object" is only created once... how sure are you that there aren't any references to an empty string literal anywhere in your code?
Which code do you find simpler to read? That's likely to be more important than timing in most cases. (Currently your code as posted appears not to have enough open braces, for example...)
Why not use a library method in the first place (e.g. Guava's Joiner class)?
Never use timings this small in a benchmark. How accurate do you expect your system clock to be? You should repeat the same operation many, many until it's taken a sensible amount of time.
EDIT: Now one alternative which addresses the first point would be this change:
boolean first = true;
StringBuilder builder = new StringBuilder();
for (String value : values) {
if (first) {
first = false;
} else {
builder.append(",");
}
builder.append(value);
}
Or if you really like using a counter:
int i = 0;
StringBuilder builder = new StringBuilder();
for (String value : values) {
if (i != 0) {
builder.append(",");
}
builder.append(value);
i++;
}
I also have serious doubts about the way you have coded and run your benchmarks. For a start, your timings suggest that your code didn't get JIT compiled. There are many mistakes that people make with Java benchmarks that can invalidate the results. Show us the complete code.
The other point is that in most cases, this kind of micro-optimization is irrelevant to the performance of real programs. Either the program already runs fast enough, or you are wasting your time optimizing the wrong part of the program.

Categories

Resources