Memory leak issue when using the Java substring method

I have looked through all the memory leak solutions for the Java substring method, but I still get an out-of-memory error. I have an ArrayList of strings, each 1000-3500 characters long, which I index and store. Each string has to be run through a loop so that every prefix length of the same string is stored. To do this, I use a for loop and the substring method, and this seems to cause the memory leak.
Pseudocode of what I have done:
for(int i=0;i<str.length();i++)
{
//create substring and index it
str.substring(0,(str.length()-i));
}
Here str is a String, and the loop above runs until all the strings in the ArrayList are indexed. I tried to fix the leak in the following ways:
1.
for(int i=0;i<str.length();i++)
{
//create substring and index it
new String(str.substring(0,(str.length()-i)));
}
2.
for(int i=0;i<str.length();i++)
{
//create substring and index it
new String(str.substring(0,(str.length()-i)).intern());
}
3.
for(int i=0;i<str.length();i++)
{
//create substring and index it
new String(str.substring(0,(str.length()-i))).intern();
}
I still have the issue. My Java version is 1.7.0_17.
Edit:
I understand from the comments that this is not a memory leak problem. I am indexing contiguous strings. For example, given
String s = "abcdefghijkl";
I want to index each prefix:
abcdefghijk
abcdefghij
abcdefghi
abcdefgh
abcdefg
abcdef
abcde
abcd
..
..
a
To do this, I take a string, perform the substring operation, and index the resulting string.

There is no leak.
Note that you are creating a huge number of String objects: for a String of 1000 characters, the loop creates 1000 objects.
Is it really necessary to create so many String objects? Would it be possible, for example, to use a char[] to achieve what you want?
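To put a rough number on this: a string of length n has n prefixes whose lengths sum to n(n+1)/2, and each copied prefix needs its own character storage. The sketch below only does that arithmetic. The class and method names are hypothetical, and the figure of 2 bytes per char assumes a char[]-backed String (as in Java 7) and ignores per-object overhead.

```java
// Hypothetical helper: estimates how many characters are copied when
// every prefix of a string is materialized as its own String.
final class PrefixCost {
    // Sum of the prefix lengths 1 + 2 + ... + n = n * (n + 1) / 2.
    static long totalPrefixChars(int n) {
        return (long) n * (n + 1) / 2;
    }

    // Assumption: 2 bytes per char, object headers ignored.
    static long approxBytes(int n) {
        return 2L * totalPrefixChars(n);
    }

    public static void main(String[] args) {
        System.out.println(totalPrefixChars(1000)); // 500500
        System.out.println(totalPrefixChars(3500)); // 6126750
        System.out.println(approxBytes(3500));      // 12253500
    }
}
```

Under these assumptions a single 3500-character string already accounts for roughly 12 MB of prefix data, so running out of heap is plausible when the index keeps all prefixes of many strings alive.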

There are two things:
First: intern() keeps the string in an internal cache that is usually not garbage collected. Please don't use it unless you are 100% sure why you are using it.
Second: there is a String constructor that takes a char[], like this:
final char[] chars = str.toCharArray ();
for(int i=0;i<chars.length;i++)
{
//create substring and index it
new String(chars, 0, chars.length-i);
}
-> this is also more efficient (in terms of speed)
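If the index can work with CharSequence keys instead of String, one possible way to avoid copying the characters at all is to wrap the same char[] in views of different lengths. This is only a sketch under that assumption: the class name PrefixViews is made up for illustration, and it relies on CharBuffer.wrap(char[], int, int), which shares the backing array rather than copying it.

```java
import java.nio.CharBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: exposes every prefix of a string as a view
// over one shared char[] instead of as a separate String copy.
final class PrefixViews {
    static List<CharSequence> prefixes(String str) {
        char[] chars = str.toCharArray(); // copied once
        List<CharSequence> views = new ArrayList<CharSequence>(chars.length);
        for (int i = 0; i < chars.length; i++) {
            // Each view covers chars[0 .. chars.length - i) without copying.
            views.add(CharBuffer.wrap(chars, 0, chars.length - i));
        }
        return views;
    }

    public static void main(String[] args) {
        for (CharSequence prefix : prefixes("abcd")) {
            System.out.println(prefix);
        }
    }
}
```

Each view is still an object, but a small one of fixed size, so the total memory grows linearly with the string length rather than quadratically. Whether this fits depends on what the indexing code accepts as a key.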

This problem was fixed in JDK 1.7 (as of update 6): substring now returns a String backed by a new copy of the character array instead of sharing the original one.
public String(char value[], int offset, int count) {
// bounds checks omitted
// copies the requested range into a new array
this.value = Arrays.copyOfRange(value, offset, offset + count);
}
public String substring(int beginIndex, int endIndex) {
// bounds checks omitted
int subLen = endIndex - beginIndex;
return new String(value, beginIndex, subLen);
}
http://javaexplorer03.blogspot.in/2015/10/how-substring-memory-leak-fixed-in-jdk.html?q=itself+implements+Runnable

Related

How to delete characters at x?

How do I delete the characters at position x and keep the rest? The output should be "12345678", which means deleting every '9' found at a position x, where x = i*(i+1)/2, so that each step is one larger than the previous one. That gives the indexes 0, 1, 3, 6, 10, 15, 21, 28, etc.
public class removeMysteryI {
public static String removeMysteryI(String str) {
String newString = "";
int x=0;
for(int i=0;i<str.length();i++){
int y = (i*(i+1)/2)+1;
if(y<=str.length()){
x=i*(i+1)/2;
newString=str.substring(0, x) + str.substring(x + 1);
}
}
return newString;
}
public static void main(String[] args) {
String str = "9919239456978";
System.out.println(removeMysteryI(str));
}
}

OK, so there are a couple of mistakes in your code. One is easy to fix. The others not so easy.
The easy one first:
newString=str.substring(0, x) + str.substring(x + 1);
OK so that is creating a string with the character at position x removed. The problem is what it is operating on. The str variable is the input parameter. So at the end of the day newString will still only be str with one character removed.
The above actually needs to be operating on the string from the previous loop iterations ... if you are going to remove more than one character.
The next problem arises when you try to solve the first one. When you remove a character from a string, all characters after the removal point are renumbered; e.g. after removing the character at 5, the character at 6 becomes the character at 5, the character at 7 becomes the character at 6, and so on.
So if you are going to remove characters by "snipping" the string, you need to make sure that the indexes for the positions for the "snips" are adjusted for the number of characters you have already removed.
That can be done ... but you need to think about it.
The final problem is efficiency. Each time your current code removes a single character (as above), it is actually copying all remaining characters to a new string. For small strings, that's OK. For really large strings, the repeated copying could have a serious performance impact [1].
The solution to this is to use a different approach to removing the characters. Instead of snipping out the characters you want to discard, copy the characters that you want to keep. The StringBuilder class is one way of doing this [2]. If you are not permitted to use that, then you could do it with an array of char, and an index variable to keep track of your "append" position in the array. Finally, there is a String constructor that can create a String from the relevant part of the char[].
I'll leave it to you to work out the details.
[1] - Efficiency could be viewed as beyond the scope of this exercise.
[2] - @Horse's answer uses a StringBuilder but in a different way to what I am suggesting. This will also suffer from the repeated copying problem because each deleteCharAt call will copy all characters after the deletion point.
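For readers who want to see the char[] variant spelled out, here is one possible sketch. It is not the answerer's own code: the class and method names are made up, and it assumes the indexes to drop are 0, 1, 3, 6, 10, ... as described in the question.

```java
// Hypothetical sketch: copy the characters to keep into a char[]
// instead of repeatedly snipping characters out of a String.
final class TriangularIndexFilter {
    static String removeAtTriangularIndexes(String str) {
        char[] kept = new char[str.length()];
        int appendPos = 0;  // next free slot in kept
        int nextDelete = 0; // next index to drop: 0, 1, 3, 6, 10, ...
        int step = 1;       // distance to the index after nextDelete
        for (int i = 0; i < str.length(); i++) {
            if (i == nextDelete) {
                nextDelete += step;
                step++;
            } else {
                kept[appendPos++] = str.charAt(i);
            }
        }
        // Build the result from the filled part of the array only.
        return new String(kept, 0, appendPos);
    }

    public static void main(String[] args) {
        System.out.println(removeAtTriangularIndexes("9919239456978")); // 12345678
    }
}
```

Every character is read once and written at most once, so the cost stays linear in the length of the input.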

Follow the steps below:
Initialize with builderIndexToDelete = 0
Initialize with counter = 1
Repeat the following while the index is valid:
delete the character at builderIndexToDelete
advance builderIndexToDelete by counter - 1 (the -1 accounts for the character deleted in this iteration)
increment the counter
public static String deleteNaturalSumIndexes(String str) {
StringBuilder builder = new StringBuilder(str);
int counter = 1;
int builderIndexToDelete = 0;
while (builderIndexToDelete < builder.length()) {
builder.deleteCharAt(builderIndexToDelete);
builderIndexToDelete += (counter - 1);
counter++;
}
return builder.toString();
}
public static void main(String[] args) {
String str = "9919239456978";
System.out.println(deleteNaturalSumIndexes(str));
}
Thank you @dreamcrash and @StephenC
Using @StephenC's suggestion to improve performance:
public static String deleteNaturalSumIndexes(String str) {
StringBuilder builder = new StringBuilder();
int nextNum = 1;
int indexToDelete = 0;
while (indexToDelete < str.length()) {
// check whether this is a valid range to continue
// handles 0,1 specifically
if (indexToDelete + 1 < indexToDelete + nextNum) {
// min is used to limit the index of last iteration
builder.append(str, indexToDelete + 1, Math.min(indexToDelete + nextNum, str.length()));
}
indexToDelete += nextNum;
nextNum++;
}
return builder.toString();
}
public static void main(String[] args) {
System.out.println(deleteNaturalSumIndexes(""));
System.out.println(deleteNaturalSumIndexes("a"));
System.out.println(deleteNaturalSumIndexes("ab"));
System.out.println(deleteNaturalSumIndexes("abc"));
System.out.println(deleteNaturalSumIndexes("99192394569"));
System.out.println(deleteNaturalSumIndexes("9919239456978"));
}

Having problems with a search method between Arrays and Array lists

OK so I'm trying to design a simple program that checks to see if a substring of length 4 characters is within all initial strings. Here is my code as follows:
public class StringSearch{
private String[] s1Array = {"A","C","T","G","A","C","G","C","A","G"};
private String[] s2Array = {"T","C","A","C","A","A","C","G","G","G"};
private String[] s3Array = {"G","A","G","T","C","C","A","G","T","T"};
//{for (int i = 0; i < s1Array.length; i++){
// System.out.print(s1Array[i]);
//}}//check if Array loaded correctly
/**
* This is the search method.
*
* @param length length of sub string to search
* @param count counter for search engine
* @param i for-loop counter
* @return subStr returns strings of length = 4 that are found in all 3 input strings with at most
* one mismatched position.
*/
public String Search()
{
int length = 4;
int count = 0;
int i = 0;
ArrayList<StringSearch> subStr = new ArrayList<StringSearch>();
//String[] subStr = new String[4];
do
{
for (i = count; i < length; i++){
subStr.add(s1Array[i]); // cant find .add method???
count = count + 1;
}
if (s2Array.contains(subStr) && s3Array.contains(subStr)){ //can't find .contains method???
System.out.println(subStr + "is in all 3 lists.");
}
if (count = s1Array.length){
System.out.println("Task complete.");
}
else{
count = count - length;
count = count + 1;
}
}while (count <= s1Array.length);
}
}
For some reason, Java cannot seem to find the .add or .contains methods and I have no idea why. So my approach was to turn the initial Strings each into an array (since the assignment specified each string would be exactly N elements long, in this case N = 10) where 1 letter would be 1 element. The next thing I did was set up a for loop that would scan s1Array and add the first 4 elements to an ArrayList subStr which is used to search s2Array and s3Array. Here is where .add isn't a valid method, for whatever reason. Commenting that out and compiling again, I also ran into an issue with the .contains method not being a valid method. Why won't this work? What am I missing? Logically, it seems to make sense but I guess maybe I'm missing something in the syntax? Help would be appreciated, as I'm a Java novice.

There are lots of errors and misunderstandings here.
Let's start with #1
private String[] s1Array = {"A","C","T","G","A","C","G","C","A","G"};
Making an array of strings is just silly, you should either use a single string or an array of characters.
private String s1 = "ACTGACGCAG";
Or
private char[] s1Array = {'A','C','T','G','A','C','G','C','A','G'};
Now #2
ArrayList<StringSearch> subStr = new ArrayList<StringSearch>();
This means you are trying to make an ArrayList that contains objects of type StringSearch. StringSearch is a class that contains your three arrays and your Search function so I don't think this is what you want.
If you wanted to make a list of 3 strings you might do something like this:
ArrayList<String> stringList = new ArrayList<String>();
stringList.add(s1);
stringList.add(s2);
stringList.add(s3);
Now say you defined s1, s2 and s3 as strings you can do something like this.
for(int i = 0; i <= s1.length() - 4; i++)
{
String subStr = s1.substring(i, i + 4);
if(s2.contains(subStr) && s3.contains(subStr))
{
System.out.println(subStr + " is in all 3 lists.");
}
}
System.out.println("Task Complete.");
The above code should achieve what it looks like you are trying to do. However, it should be noted that this isn't the most efficient way, just a way, of doing it. You should start with some more basic concepts judging by the code you have so far.
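As a self-contained illustration of the loop above, the following sketch wraps it in a method that returns the matches instead of printing them. The class name, method name and parameter list are assumptions made for this example, and it only finds exact matches; it does not handle the "at most one mismatched position" requirement mentioned in the question's Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around the substring/contains loop.
final class CommonSubstrings {
    static List<String> commonSubstrings(String s1, String s2, String s3, int length) {
        List<String> found = new ArrayList<String>();
        // Slide a window of the given length over s1.
        for (int i = 0; i + length <= s1.length(); i++) {
            String candidate = s1.substring(i, i + length);
            // Keep the candidate if both other strings contain it; skip repeats.
            if (s2.contains(candidate) && s3.contains(candidate) && !found.contains(candidate)) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(commonSubstrings("ABCDEF", "XXBCDEXX", "BCDE", 4)); // [BCDE]
        System.out.println(commonSubstrings("ACTGACGCAG", "TCACAACGGG", "GAGTCCAGTT", 4)); // []
    }
}
```

With the three strings from the question, no length-4 substring occurs exactly in all of them, so the result is an empty list.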

After declaring subStr as ArrayList you can call add or contains only with StringSearch objects as parameters.

Instead of:
ArrayList<StringSearch> subStr = new ArrayList<StringSearch>();
Replace it with:
String subStr = "";
And within the for loop to get the first 4 letters in s1 to be in its own string (subStr) add the line:
subStr += s1Array[i];
Also, s1Array is a String array, not a String. The .contains method belongs to String objects, so with your current implementation you can write s1Array[i].contains, but you cannot write s1Array.contains. If you change your String arrays to Strings and edit your code to suit, everything should work the way you expect it to.

First of all, you need to learn about Java generics.
The most basic thing about generics is that once you declare a collection with a type parameter, here ArrayList<StringSearch>, you can only add objects of type StringSearch to it.
Second, logically, what you can do is implement an algorithm called
Longest Common Subsequence: check the arrays in pairs to see whether the longest common subsequences have length 4.

any other way to find char array length?

public static int getLenth(char[] t)
{
int i=0;
int count=0;
try
{
while(t[i]!='\0')
{
++count;
i++;
}
return count;
}
catch(ArrayIndexOutOfBoundsException aiobe)
{
return count;
}
}
This method returns the length of a char array. My question is: is there some other way to find the length of a char array without using try/catch statements and the like?
Thanks in advance :)

You can use the length property of the char array.
Simple example:
char [] cd={'a','b'};
System.out.println(cd.length);
Output:
2

You can/should use the built-in length property to determine the size of any array.
int len = t.length;
You can learn more about arrays from here.
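A short sketch of why the length property is the reliable choice: Java arrays store their own length and are not terminated by '\0', and a freshly allocated char[] is filled with '\0' characters. The class and method names below are made up for illustration.

```java
// Hypothetical demo of the built-in length property of arrays.
final class CharArrayLength {
    static int lengthOf(char[] t) {
        return t.length; // stored with the array, no scanning needed
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("hello".toCharArray())); // 5
        // A new char[5] contains five '\0' characters, so a loop that
        // stops at '\0' would report 0 here, while length reports 5.
        System.out.println(lengthOf(new char[5]));           // 5
    }
}
```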

If you want to write your own code instead of using the length property, one option is to append the characters from the array one by one to a StringBuffer and then read the count from its length() method. Note that the StringBuffer's extra initial capacity of 16 characters is unused space rather than space characters, so it does not affect the value returned by length().

Splitting string algorithm in Java

I'm trying to make the following algorithm work. What I want to do is split the given string into substrings consisting of either a series of numbers or an operator.
So for this string = "22+2", I would get an array in which [0]="22" [1]="+" and [2]="2".
This is what I have so far, but I get an index out of bounds exception:
public static void main(String[] args) {
String string = "114+034556-2";
int k,a,j;
k=0;a=0;j=0;
String[] subStrings= new String[string.length()];
while(k<string.length()){
a=k;
while(((int)string.charAt(k))<=57&&((int)string.charAt(k))>=48){
k++;}
subStrings[j]=String.valueOf(string.subSequence(a,k-1)); //exception here
j++;
subStrings[j]=String.valueOf(string.charAt(k));
j++;
}}
I would rather be told what's wrong with my reasoning than be offered an alternative, but of course I will appreciate any kind of help.

I'm deliberately not answering this question directly, because it looks like you're trying to figure out a solution yourself. I'm also assuming that you're purposefully not using the split or the indexOf functions, which would make this pretty trivial.
A few things I've noticed:
If your input string is long, you'd probably be better off working with a char array and a StringBuilder, so you can avoid the memory problems that arise from creating many immutable strings
Have you tried catching the exception, or printing out what the value of k is that causes your index out of bounds problem?
Have you thought through what happens when your string terminates? For instance, have you run this through a debugger when the input string is "454" or something similarly trivial?
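To illustrate the last hint, here is a sketch of the same scanning idea with a bounds check before every charAt call, so the loop stops cleanly when the string ends in a digit. The names DigitRunSplitter and split are invented for this example, and it returns a List rather than filling a fixed-size array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bounds-checked variant of the scanning loop.
final class DigitRunSplitter {
    static List<String> split(String s) {
        List<String> tokens = new ArrayList<String>();
        int k = 0;
        while (k < s.length()) {
            int start = k;
            // Advance over a run of digits, never reading past the end.
            while (k < s.length() && s.charAt(k) >= '0' && s.charAt(k) <= '9') {
                k++;
            }
            if (k > start) {
                tokens.add(s.substring(start, k)); // end index is exclusive
            }
            // Whatever follows the digits (if anything) is a one-character token.
            if (k < s.length()) {
                tokens.add(String.valueOf(s.charAt(k)));
                k++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("114+034556-2")); // [114, +, 034556, -, 2]
        System.out.println(split("454"));          // [454]
    }
}
```

Note that the end index of substring (and of subSequence) is exclusive, so the run of digits is taken from start up to k, not k - 1.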

You could use a regular expression with lookahead and lookbehind assertions to split the numbers from the operators:
String equation = "22+2";
String[] tmp = equation.split("(?=[+\\-/])|(?<=[+\\-/])");
System.out.println(Arrays.toString(tmp));
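A runnable version of this idea might look like the sketch below. The class and method names are made up, and the character class is extended with '*' as an assumption, since the snippet above only lists '+', '-' and '/'.

```java
import java.util.Arrays;

// Hypothetical wrapper around the lookaround-based split.
final class LookaroundSplit {
    static String[] tokens(String equation) {
        // Split at every position that is followed by an operator
        // (lookahead) or preceded by one (lookbehind); both are zero-width,
        // so no characters are consumed and the operators are kept.
        return equation.split("(?=[-+*/])|(?<=[-+*/])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("22+2")));         // [22, +, 2]
        System.out.println(Arrays.toString(tokens("114+034556-2"))); // [114, +, 034556, -, 2]
    }
}
```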

If you're interested in the general problem of parsing, then I'd recommend thinking about it on a character-by-character level, and moving through a finite state machine with each new character. (Often you'll need a terminator character that cannot occur in the input--such as the \0 in C strings--but we can get around that.).
In this case, you might have the following states:
initial state
just parsed a number.
just parsed an operator.
The characters determine the transitions from state to state:
You start in state 1.
Numbers transition into state 2.
Operators transition into state 3.
The current state can be tracked with something like an enum, changing the state after each character is consumed.
With that setup, then you just need to loop over the input string and switch on the current state.
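// A sketch of the state machine; it needs java.util.List and java.util.ArrayList.
enum State { INIT_STATE, JUST_NUM, JUST_OP }

static List<String> parse(String inputString) {
    State state = State.INIT_STATE;
    StringBuilder acc = new StringBuilder();
    List<String> subStrs = new ArrayList<String>();
    for (char c : inputString.toCharArray()) {
        State next = Character.isDigit(c) ? State.JUST_NUM : State.JUST_OP;
        if (state != next && acc.length() > 0) {
            // state change, so save and reset the accumulator:
            subStrs.add(acc.toString());
            acc.setLength(0);
        }
        // in either case the current character joins the accumulator:
        acc.append(c);
        // update the state
        state = next;
    }
    // the input has ended, so flush whatever is still accumulated:
    if (acc.length() > 0) {
        subStrs.add(acc.toString());
    }
    return subStrs;
}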
With a structure like that, you can more easily add new features / constructs by adding new states and updating the behavior depending on the current state and incoming character. For example, you could add a check to throw errors if letters appear in the string (and include offset locations, if you wanted to track that).

If your criterion is simply "anything that is not a number", then you can use some simple regex if you don't mind working with parallel arrays:
String[] operands = string.split("\\D"); // split around anything that is NOT a digit
char[] operators = string.replaceAll("\\d", "").toCharArray(); // remove all digits and turn the rest into a char array
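Put together as a small runnable sketch (the class and method names are hypothetical), the parallel-array approach behaves as follows on the string from the question:

```java
import java.util.Arrays;

// Hypothetical demo of the parallel-array approach.
final class ParallelSplit {
    static String[] operands(String s) {
        return s.split("\\D"); // every non-digit character is a separator
    }

    static char[] operators(String s) {
        return s.replaceAll("\\d", "").toCharArray(); // drop the digits, keep the rest
    }

    public static void main(String[] args) {
        String input = "114+034556-2";
        System.out.println(Arrays.toString(operands(input)));  // [114, 034556, 2]
        System.out.println(Arrays.toString(operators(input))); // [+, -]
    }
}
```

The two arrays only line up for well-formed input: an input that starts with an operator, for instance, yields an empty first operand.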

String input="22+2-3*212/21+23";
String number="";
String op="";
List<String> numbers=new ArrayList<String>();
List<String> operators=new ArrayList<String>();
for(int i=0;i<input.length();i++){
char c=input.charAt(i);
if(i==input.length()-1){
number+=String.valueOf(c);
numbers.add(number);
}else if(Character.isDigit(c)){
number+=String.valueOf(c);
}else{
if(c=='+' || c=='-' || c=='*' ||c=='/'){
op=String.valueOf(c);
operators.add(op);
numbers.add(number);
op="";
number="";
}
}
}
for(String x:numbers){
System.out.print("number="+x+",");
}
for(String x:operators){
System.out.print("operator="+x+",");
}
This will be the output:
number=22,number=2,number=3,number=212,number=21,number=23,operator=+,operator=-,operator=*,operator=/,operator=+,

Removing duplicate chars from a string passed as a parameter

I am a little confused how to approach this problem. The userKeyword is passed as a parameter from a previous section of the code. My task is to remove any duplicate chars from the inputted keyword(whatever it is). We have just finished while loops in class so some hints regarding these would be appreciated.
private String removeDuplicates(String userKeyword){
String first = userKeyword;
int i = 0;
while(i < first.length())
{
if (second.indexOf(first.charAt(i)) > -1){
}
i++;
return "";
Here's an update of what I have tried so far - sorry about that.

This is the perfect place to use java.util.Set, a construct which is designed to hold unique elements. By trying to add each word to a set, you can check if you've seen it before, like so:
static String removeDuplicates(final String str)
{
final Set<String> uniqueWords = new HashSet<>();
final String[] words = str.split(" ");
final StringBuilder newSentence = new StringBuilder();
for(int i = 0; i < words.length; i++)
{
if(uniqueWords.add(words[i]))
{
//Word is unique
newSentence.append(words[i]);
if((i + 1) < words.length)
{
//Add the space back in
newSentence.append(" ");
}
}
}
return newSentence.toString();
}
public static void main(String[] args)
{
final String str = "Words words words I love words words WORDS!";
System.out.println(removeDuplicates(str)); //Words words I love WORDS!
}

Have a look at this answer.
You might not understand this, but it does the job (it cleverly uses a HashSet that doesn't allow duplicate values).
I think your teacher might be looking for a solution using loops however - take a look at William Morisson's answer for this.
Good luck!
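Since the question asks for hints based on while loops, here is one possible loop-only sketch that follows the indexOf idea from the asker's attempt. It is an illustration rather than the expected classroom solution, and the class name is made up.

```java
// Hypothetical while-loop version built on String.indexOf.
final class KeywordCleaner {
    static String removeDuplicates(String userKeyword) {
        String result = "";
        int i = 0;
        while (i < userKeyword.length()) {
            char c = userKeyword.charAt(i);
            // indexOf returns -1 when c has not been added to result yet.
            if (result.indexOf(c) == -1) {
                result += c;
            }
            i++;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicates("banana")); // ban
    }
}
```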

For future reference, StackOverflow normally requires you to post what you have, and ask for suggestions for improvement.
As it's not an active day and I am bored, I've done this for you. This code is pretty efficient and makes use of no advanced data structures. I did this so you could more easily understand it.
Please do try to understand what I'm doing. Learning is what StackOverflow is for.
I've added comments in the code to assist you in learning.
private String removeDuplicates(String keyword){
//stores whether a character has been encountered before
//a hashset would likely use less memory.
boolean[] usedValues = new boolean[Character.MAX_VALUE + 1];
//Look into using a StringBuilder. Using += operator with strings
//is potentially wasteful.
String output = "";
//looping over every character in the keyword...
for(int i=0; i<keyword.length(); i++){
char charAt = keyword.charAt(i);
//characters are just numbers. if the value in usedValues array
//is true for this char's number, we've seen this char.
boolean shouldRemove = usedValues[charAt];
if(!shouldRemove){
output += charAt;
//now this character has been used in output. Mark that in
//usedValues array
usedValues[charAt] = true;
}
}
return output;
}
Example:
//output will be the alphabet.
System.out.println(removeDuplicates(
"aaaabcdefghijklmnopqrssssssstuvwxyyyyxyyyz"));
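The two suggestions in the code comments, a hash set instead of the boolean array and a StringBuilder instead of +=, could be combined roughly as in the sketch below. The class name is made up, and this is a variant for comparison rather than a replacement for the code above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant using a HashSet and a StringBuilder.
final class UniqueChars {
    static String removeDuplicates(String keyword) {
        Set<Character> seen = new HashSet<Character>();
        StringBuilder output = new StringBuilder(keyword.length());
        for (int i = 0; i < keyword.length(); i++) {
            char c = keyword.charAt(i);
            // Set.add returns false if the character was already present.
            if (seen.add(c)) {
                output.append(c);
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        // prints abcdefghijklmnopqrstuvwxyz
        System.out.println(removeDuplicates("aaaabcdefghijklmnopqrssssssstuvwxyyyyxyyyz"));
    }
}
```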
