I am a fairly new programmer with a question about arrays in Java. Consider a 2D array, [i][j]. The value of i is determined at run time. The value of j is known to be 7. At [i][6] and [i][7] I want to be able to store a deeper array or list of values. Is it possible to have something like an array within an array, where there is an x and y axis and a z axis only at the points [i][6] and [i][7], or will I need a full 3D cube of memory to be able to store and navigate my data?
The Details: My goal is to run a query which takes certain information from two tables (target and attacker). My query is fine and I can get a result set. What I really want to be able to do is to store the data from my result set and present it in a table in a more useful format, while also using it in a data visualization program. The fields I get are: server_id, target_ip, threat_level, client_id, attacker_ip and num_of_attacks. I could get 20 records that have the same server_id, target_ip, threat_level and client_id but different attacker_ip and num_of_attacks, because that machine got attacked 20 times. A third dimension would allow me to do this, but the third axis/array would be empty for server_id, target_ip, threat_level and client_id.
UPDATE: After reviewing the answers and doing some more thinking, I'm wondering if using an ArrayList of objects would be best for me, and/or possible. Keeping data organized and easily accessible is a big concern for me. In pseudocode it would be something like this:
Object[] servers
String server_id
String target
String threat_level
String client_id
String arr[][] // this array will hold attacker_ip in one axis and num_of_attacks in the other in order to keep the relation between the attacking ip and the number of attacks they make against one specific server
First, if you have an array DataType[i][j] and j is known to be 7, the two greatest indexes you can use are 5 and 6, not 6 and 7. This is because Java array indexes are 0-based. When creating the array you specify the number of elements, not the maximum index (which is always one less than the number of elements).
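To make the zero-based indexing concrete, here is a minimal sketch. The class name IndexDemo and the row count of 3 are made up for illustration; only the second dimension of 7 comes from the question.

```java
class IndexDemo {
    // Returns the largest valid index of an array, i.e. length - 1.
    static int lastIndex(int[] row) {
        return row.length - 1;
    }

    public static void main(String[] args) {
        int rows = 3;                       // stands in for the run-time value of i
        int[][] grid = new int[rows][7];    // 7 elements per row: indexes 0..6

        grid[0][6] = 42;                    // last valid column
        System.out.println(lastIndex(grid[0]));   // prints 6

        try {
            grid[0][7] = 1;                 // one past the end
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index 7 is out of bounds");
        }
    }
}
```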
Second, there is nothing wrong with using multidimensional arrays when the problem domain already uses them. I can think of scientific applications and data analysis applications, but not many more. If, on the contrary, you are modelling a business problem whose domain does not use multidimensional arrays, you are probably better off using more abstract data structures instead of forcing arrays into the design just because they seem very efficient, because of experience in other languages where arrays are more important, or for other reasons.
Without having much information, I'd say your "first dimension" could be better represented by a List type (say ArrayList). Why? Because you say its size is determined at runtime (and I assume this comes indirectly, not as a magic number that you obtain from somewhere). Lists are similar to arrays but have the particularity that they "know" how to grow. Your program can easily append new elements as it reads them from a source or otherwise discovers/creates them. It can also easily insert them at the beginning or in the middle, but this is rare.
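A small sketch of the "knows how to grow" point, assuming the elements arrive one at a time from some source. The class name GrowDemo and the helper collect are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GrowDemo {
    // Copies items one by one into a list that grows as needed.
    static List<String> collect(Iterable<String> source) {
        List<String> rows = new ArrayList<>();   // no size given up front
        for (String item : source) {
            rows.add(item);                      // appended at the end
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = collect(Arrays.asList("a", "b", "c"));
        rows.add(0, "first");                    // inserting at the beginning also works
        System.out.println(rows.size());         // prints 4
        System.out.println(rows);                // prints [first, a, b, c]
    }
}
```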
So, your first dimension would be: ArrayList<something>, where something is the type of your second dimension.
Regarding this second dimension, you say that it has a size of 7, but that the first 5 items accept single values while the last 2 accept multiple ones. This already tells me that the 7 items are not homogeneous, and thus an array is a poor fit. This dimension would be much better represented by a class. To understand this class's structure, let's say that the 5 single-valued elements are homogeneous (of type, say, BigDecimal). One of the most natural representations for this is an array, as the size is known. The 2 remaining, multi-valued elements also seem to constitute an array. However, given that each of its 2 elements contains an unknown number of data items, the element type of this array should not be BigDecimal as in the previous case, but ArrayList. The type of the elements of these ArrayLists is whatever the type of the multiple values is (say BigDecimal too).
The final result is:
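class SecondD {
    BigDecimal[] singleValued = new BigDecimal[5];
    // "new ArrayList<BigDecimal>[2]" does not compile (generic array creation),
    // so the array is created with the raw type and the warning is suppressed.
    @SuppressWarnings("unchecked")
    ArrayList<BigDecimal>[] multiValued = new ArrayList[2];
    {
        multiValued[0] = new ArrayList<BigDecimal>();
        multiValued[1] = new ArrayList<BigDecimal>();
    }
}
ArrayList<SecondD> data = new ArrayList<SecondD>();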
In this code snippet I'm not only declaring the structures, but also creating them so they are ready to use. Pure declaration would be:
class SecondD {
BigDecimal[] singleValued;
ArrayList<BigDecimal>[] multiValued;
}
ArrayList<SecondD> data= new ArrayList<SecondD>() ;
Array size is not important in Java from a type (and thus structural) point of view. That's why you don't see any 5 or 2.
Access to the data structure would be like
data.get(130).singleValued[2]
data.get(130).multiValued[1].get(27)
A possible variant that could be much clearer in certain cases is
class SecondD {
BigDecimal monday;
BigDecimal tuesday;
BigDecimal wednesday;
BigDecimal thursday;
BigDecimal friday;
ArrayList<BigDecimal> saturday= new ArrayList<BigDecimal>() ;
ArrayList<BigDecimal> sunday= new ArrayList<BigDecimal>() ;
}
ArrayList<SecondD> data= new ArrayList<SecondD>() ;
In this case we are "expanding" each array into individual items, each with a name. Typical access operations would be:
data.get(130).wednesday
data.get(130).sunday.get(27)
Which variant to choose? Well, that depends on how similar or different the operations on the different items are. If every time you perform an operation on monday you will also perform it on tuesday, wednesday, thursday, and friday (not saturday and sunday, because these are a completely different kind of thing, remember?), then an array could be better. For example, to sum the items when stored as an array, all that is necessary is:
SecondD element = data.get(130);
BigDecimal sum = BigDecimal.ZERO;
for (BigDecimal e : element.singleValued) sum = sum.add(e);
While if expanded:
SecondD element = data.get(130);
BigDecimal sum = BigDecimal.ZERO;
sum = sum.add(element.monday);
sum = sum.add(element.tuesday);
sum = sum.add(element.wednesday);
sum = sum.add(element.thursday);
sum = sum.add(element.friday);
In this case, with only 5 elements, the difference is not much. The first way makes things slightly shorter, while the second makes them clearer. Personally, I vote for clarity. Now, if instead of 5 items there had been 1,000, or even as few as 20, the repetition in the second case would have been too much and the first case would be preferred. I have another general rule for this too: if I can name every element separately, then it's probably better to do exactly that. If while trying to name the elements I find myself using numbers or sequential letters of the alphabet (either naturally, as in the days of the month, or because things just don't seem to have different names), then it's arrays. You could still find cases that are not clear even after applying these two criteria. In that case toss a coin, start developing the program, and think a bit about how things would be the other way. You can change your mind at any time.
If your application is indeed a scientific one, please forgive me for such a long (and useless) explanation. My answer could help others looking for something similar, though.
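Applied to the fields named in the question, the class-based approach might look like the sketch below. The class names Attack, ServerRecord and ServerDemo, the method totalAttacks, and the sample values are all assumptions made for illustration. Only the six field names come from the question.

```java
import java.util.ArrayList;
import java.util.List;

// One attacker and how many times it attacked a given server.
class Attack {
    final String attackerIp;
    final int numOfAttacks;

    Attack(String attackerIp, int numOfAttacks) {
        this.attackerIp = attackerIp;
        this.numOfAttacks = numOfAttacks;
    }
}

// The four single-valued fields plus a growable list of attacks.
class ServerRecord {
    final String serverId;
    final String targetIp;
    final String threatLevel;
    final String clientId;
    final List<Attack> attacks = new ArrayList<>();

    ServerRecord(String serverId, String targetIp, String threatLevel, String clientId) {
        this.serverId = serverId;
        this.targetIp = targetIp;
        this.threatLevel = threatLevel;
        this.clientId = clientId;
    }

    // Sums num_of_attacks over every attacker of this server.
    int totalAttacks() {
        int total = 0;
        for (Attack a : attacks) {
            total += a.numOfAttacks;
        }
        return total;
    }
}

class ServerDemo {
    public static void main(String[] args) {
        List<ServerRecord> servers = new ArrayList<>();
        ServerRecord s = new ServerRecord("srv-1", "10.0.0.5", "high", "client-9");
        s.attacks.add(new Attack("192.0.2.1", 3));
        s.attacks.add(new Attack("192.0.2.2", 5));
        servers.add(s);
        System.out.println(servers.get(0).totalAttacks());   // prints 8
    }
}
```

Each server row is stored once, and the attacker_ip/num_of_attacks pairs stay together, so no empty third axis is needed for the single-valued fields.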
Use ArrayList instead of array primitives. You can have your three dimensions, without the associated inefficient wastage of allocating a "cube"
If you are not creating a custom class as #nIcE cOw suggested, Collections are more cumbersome for this kind of thing than primitive arrays. This is because Java likes to be verbose and doesn't do certain things for you, such as operator overloading (as C++ does), or give you a concise way to instantiate an ArrayList from arrays.
To illustrate, here's #sbat's example with ArrayLists:
public static <T> ArrayList<T> toAL(T ... input) {
ArrayList<T> output = new ArrayList<T>();
for (T item : input) {
output.add(item);
}
return output;
}
public static void main(String[] args) {
ArrayList<ArrayList<ArrayList<Integer>>> a = toAL(
toAL(
toAL(0, 1, 2)
),
toAL(
toAL(4, 5)
),
toAL(
toAL(6)
)
);
System.out.println(a.get(0).get(0).get(2));
System.out.println(a.get(1).get(0).get(1));
System.out.println(a.get(2).get(0).get(0));
}
Of course, there's nothing syntactically wrong with doing:
int[][][] a = {{{0, 1, 2}}, {{4, 5}}, {{6}}};
System.out.println(a[0][0].length); // 3
System.out.println(a[1][0].length); // 2
System.out.println(a[2][0].length); // 1
In fact, that's what multidimensional arrays in Java are, they're arrays within arrays.
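Because the inner arrays are separate objects, the third level can be allocated only where it is needed. Here is a minimal sketch under that assumption, using columns 5 and 6 as the "deep" ones. The class name JaggedDemo and the helper allocatedSlots are made up for illustration.

```java
class JaggedDemo {
    // Counts how many int slots are actually allocated at the third level.
    static int allocatedSlots(int[][][] data) {
        int count = 0;
        for (int[][] row : data) {
            for (int[] cell : row) {
                if (cell != null) {
                    count += cell.length;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][][] data = new int[2][7][];      // third level not allocated yet
        data[0][5] = new int[] {10, 20, 30};   // depth only where it is needed
        data[0][6] = new int[] {1, 2};
        data[1][6] = new int[] {7};

        System.out.println(allocatedSlots(data));   // prints 6
        System.out.println(data[0][5][2]);          // prints 30
    }
}
```

The cells that were never assigned stay null, so code that walks the structure has to check for that.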
The only problem I see with this is that it might become confusing or difficult to maintain later on, but so would using ArrayLists within ArrayLists:
List<List<List<Integer>>> list = ...;
System.out.println(list.get(0).get(1).get(50)); // using ArrayList
However, there are still reasons as to why you might prefer an array over a collection. But ArrayLists or other collections may be preferable depending on the circumstance.
Some people tell me it's the first option, but other people tell me it's the second one.
Where do the rows and where do the columns actually go?
I'd appreciate the help, thanks.
1) Double array[][] = new Double[5][3];
2) Double array[][] = new Double[3][5];
This question usually comes down to which convention you prefer, or which one is predominantly used in your field or, more narrowly, your programming language.
I found this answer stating that Java is "row major". This is the convention I always follow while working with 2D arrays in Java. However, the Wikipedia article on row- and column-major order refers to the part of the Java Language Specification stating that this language is neither row-major nor column-major. Instead, it uses Iliffe vectors to store multi-dimensional arrays, meaning that data in the same row is stored contiguously in memory, but the rows themselves are not. The address of the first element of each row is stored in an array of pointers.
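One observable consequence of this layout is that each row is an array object of its own, reachable through a reference held in the outer array. A small sketch follows. The class name RowsDemo and the helper sameRow are hypothetical.

```java
class RowsDemo {
    // True when two row slots of the outer array refer to the same inner array.
    static boolean sameRow(double[][] m, int a, int b) {
        return m[a] == m[b];
    }

    public static void main(String[] args) {
        double[][] m = new double[3][5];

        m[2] = m[0];                        // row 2 now refers to the same array as row 0
        m[0][1] = 9.5;
        System.out.println(m[2][1]);        // prints 9.5
        System.out.println(sameRow(m, 0, 2));   // prints true

        m[1] = new double[2];               // rows may even have different lengths
        System.out.println(m[1].length);    // prints 2
    }
}
```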
Although it's impossible to classify Java's memory layout as strictly row- or column-major, the use of Iliffe vectors suggests treating it as row-major. Therefore, to create a matrix of 3 rows and 5 columns, you should use:
Double array[][] = new Double[3][5];
There is no real concept of matrices in Java. What you're referring to is a two-dimensional array, or in other words an array of arrays.
Writing Double[5][3] will create an array of length 5, containing arrays of length 3 and your other example will do the opposite. Therefore the answer to your question depends on how you want to visualise it. The most obvious way for me is to say that each inner array represents a row in the matrix, therefore I would lean towards Double[3][5], then indexing with a row and column would look like array[row][column] which makes a lot of sense.
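With the array[row][column] reading, the two sizes can be checked directly from the lengths. A minimal sketch, where the class name MatrixDemo and the helpers rows and cols are made up for illustration and cols assumes at least one row:

```java
class MatrixDemo {
    // Number of inner arrays, read as the number of rows.
    static int rows(Double[][] m) {
        return m.length;
    }

    // Length of the first inner array, read as the number of columns.
    static int cols(Double[][] m) {
        return m[0].length;
    }

    public static void main(String[] args) {
        Double[][] array = new Double[3][5];
        System.out.println(rows(array));   // prints 3
        System.out.println(cols(array));   // prints 5

        array[2][4] = 1.0;                 // last row, last column
        System.out.println(array[2][4]);   // prints 1.0
    }
}
```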
Hello, I have been researching this, but I cannot find anything on the Oracle website.
The question is as follows.
If you are using a static array like this
int[] foo = new int[10];
and you want to add a value at position 4 this way
foo[4] = 4;
that doesn't shift the elements of the array, so the time complexity will be O(1): if your array starts at 0x000001 and has 10 slots, and you want to put something at position x, you can access it at (x*sizeOf(int))+initialMemoryPosition (this is pseudocode).
Is this right? Is this how this type of array works in Java, and is its time complexity O(1)?
Thanks
The question is based on a misconception: in Java, you can't add elements to an array.
An array gets allocated once, initially, with a predefined number of entries. It is not possible to change that number later on.
In other words:
int a[] = new int[5];
a[4] = 5;
doesn't add anything. It just sets a value in memory.
So, if at all, we could say that we have somehow "O(1)" for accessing an address in memory, as nothing related to arrays depends on the number of entries.
Note: if you ask about ArrayList, things are different, as here adding to the end of the array can cause the creation of a new, larger (underlying) array, and moving of data.
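The difference can be sketched as follows. The class name AppendDemo and the helper append are assumptions for illustration. Arrays.copyOf is used here to show what "adding" to a fixed-size array really costs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AppendDemo {
    // "Appending" to an array means copying everything into a new, longer array.
    static int[] append(int[] src, int value) {
        int[] copy = Arrays.copyOf(src, src.length + 1);   // O(n) copy
        copy[copy.length - 1] = value;
        return copy;
    }

    public static void main(String[] args) {
        int[] a = new int[5];
        a[4] = 5;                            // O(1) overwrite, length stays 5
        int[] b = append(a, 6);

        System.out.println(a.length);        // prints 5
        System.out.println(b.length);        // prints 6
        System.out.println(b[5]);            // prints 6

        List<Integer> list = new ArrayList<>();
        list.add(6);                         // may trigger an internal resize and copy
        System.out.println(list.size());     // prints 1
    }
}
```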
An array is somewhere in memory. You don't have control over where, and you should not care where it is. The array is created when the new type[size] syntax is used.
Accessing the array is done using the [] index operator. It never modifies size or order, only the indexed location if you assign to it.
See also https://www.w3schools.com/java/java_arrays.asp
The time complexity is already correctly commented on. But that is the concern after getting the syntax right.
An old post regarding time complexity of collections can be found here.
Yes, it takes O(1) time. When you initialize an array with, say, int[] foo = new int[10], it creates a new array filled with 0s. Since an int takes 4 bytes (32 bits), an assignment such as foo[4] = 5 conceptually writes the value at the start of the element data plus 4 x 4 = 16 bytes, and no other element is touched. That's why arrays are 0-indexed, and how they assign values in O(1) time.
I have an array of 5 objects. Each object has two Strings and an int. Let's call the int "number". How can I add up the "number" of each object into a final total? Assume that the numbers change, so I cannot simply write 5 + 3 etc. For example:
Question question[] = new Question[5];
public Constructor()
{
String1 = "null";
String2 = "null";
number = 0;
}
So I have five objects that look like this, and they all have different values. The number refers to a score, so if the user does something right, the number will be added to a variable. I need to know how to add up the 5 variables when I go through the 5 objects in something like:
for (i=0; i < Question.length; i++)
{
object.dostuff
}
Many things have to happen first:
Initialize the array: seems you got that one covered.
Initialize objects within the array: Make sure every cell of your array actually contains a question instance (or to be more precise: a reference to a Question instance).
Iterate over the array: here your loop seems to go over the class (Question, with capital Q) but you need to iterate over the array (question with a small q). Piece of advice, since the variable question here represents an array of question it would make more sense if you make your name plural (questions) to help illustrate that this is an array. Basic rule is to make the name as explicit as possible, so questionArray would be an even better name. Past a certain point it's a question of taste. Rule of thumb is that if you have to look at the declaration of the variable then it's probably not named correctly.
Access methods, properties, etc. of the objects: when iterating over the array you need to access the right index (questions[i]) and then access the members of this object (questions[i].doStuff()). If you aim for OOP (which I assume is the point here) then you may want to make the obvious operations methods of your Question class. Then simply call the method with the proper parameter (questions[i].setNumber(i)). It all depends on what you need it to do.
Hope this helps (if this is a homework related question you should tag it as such, that would maximize your chance to get help here).
Don't use Question.length, use question.length
Add an accessor method and a method to increment the scores.
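Putting the advice from these answers together, a minimal sketch could look like this. The field names text and answer, the methods getNumber and addToNumber, and the class Quiz are all hypothetical, since the question does not show the real class.

```java
class Question {
    private final String text;
    private final String answer;
    private int number;                      // the score for this question

    Question(String text, String answer) {
        this.text = text;
        this.answer = answer;
        this.number = 0;
    }

    // Accessor for the score.
    int getNumber() {
        return number;
    }

    // Increments the score, e.g. when the user answers correctly.
    void addToNumber(int points) {
        number += points;
    }
}

class Quiz {
    // Adds up the score of every question in the array.
    static int totalScore(Question[] questions) {
        int total = 0;
        for (int i = 0; i < questions.length; i++) {   // the array, not the class
            total += questions[i].getNumber();
        }
        return total;
    }

    public static void main(String[] args) {
        Question[] questions = new Question[5];
        for (int i = 0; i < questions.length; i++) {
            questions[i] = new Question("question " + i, "answer " + i);
        }
        questions[0].addToNumber(5);
        questions[3].addToNumber(3);
        System.out.println(totalScore(questions));     // prints 8
    }
}
```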
Use map to extract the numbers from the list of tuples, then use reduce to cumulatively sum the numbers.
from functools import reduce  # reduce is not a builtin in Python 3
rows = [("1 this is sentence 1", "1 this is sentence 2", 1),
        ("2 this is sentence 1", "2 this is sentence 2", 2),
        ("3 this is sentence 1", "3 this is sentence 2", 3)]
numbers = map(lambda x: x[2], rows)
result = reduce(lambda x, y: x + y, numbers)
print(result)
output:
6
I've got 2 2D arrays, one int and one String, and I want them to appear one next to the other since they have the same number of rows. Is there a way to do this? I've thought about concatenating but that requires that they be the same type of array, so in that case, is there a way I could make my int array a String array?
If you want an array that can hold both Strings and ints, you basically have two choices:
Treat them both as Objects, so effectively Object[][] concatArray. Autoboxing will convert your ints to Integers.
Treat them both as Strings (using String.valueOf(int) and Integer.parseInt(String)).
I don't know for a fact, but I would guess autoboxing is a less expensive operation than converting ints to Strings and back.
Further, you can always find out the value type of a cell in the array by using the instanceof operator; if values are converted to String, you actually need to parse a value to find out whether it's just a bit of text or a text representation of a number.
These two considerations -- one a guess, the other possibly irrelevant in your case -- would support using option 1 above.
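Option 1 might be sketched like this, assuming both arrays have the same number of rows. The class name MixedDemo and the helper sideBySide are made up for illustration.

```java
import java.util.Arrays;

class MixedDemo {
    // Places each int row next to the matching String row in one Object[][].
    static Object[][] sideBySide(int[][] nums, String[][] texts) {
        Object[][] out = new Object[nums.length][];
        for (int r = 0; r < nums.length; r++) {
            out[r] = new Object[nums[r].length + texts[r].length];
            int c = 0;
            for (int n : nums[r]) {
                out[r][c++] = n;             // autoboxed to Integer
            }
            for (String s : texts[r]) {
                out[r][c++] = s;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] nums = {{1, 2}, {3, 4}};
        String[][] texts = {{"a"}, {"b"}};
        Object[][] both = sideBySide(nums, texts);

        System.out.println(Arrays.toString(both[0]));      // prints [1, 2, a]
        System.out.println(both[0][0] instanceof Integer); // prints true
        System.out.println(both[0][2] instanceof String);  // prints true
    }
}
```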
Just convert the ints to Strings as you concatenate (an int cannot be cast to a String directly). The end result would be a 2D array of type String.
To convert an int to a String, you can just use
int myInt = 5;
String newString = myInt + "";
It's dirty, but it's commonly practiced, thus recognizable, and it works.
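For whole arrays, the same conversion could be applied row by row. Here is a sketch that uses String.valueOf instead of the "" trick, again assuming equal row counts. The class name ConcatDemo and the helper sideBySide are hypothetical.

```java
import java.util.Arrays;

class ConcatDemo {
    // Converts each int to a String and appends the matching String row.
    static String[][] sideBySide(int[][] nums, String[][] texts) {
        String[][] out = new String[nums.length][];
        for (int r = 0; r < nums.length; r++) {
            out[r] = new String[nums[r].length + texts[r].length];
            int c = 0;
            for (int n : nums[r]) {
                out[r][c++] = String.valueOf(n);
            }
            // Copy the String row after the converted numbers.
            System.arraycopy(texts[r], 0, out[r], c, texts[r].length);
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] nums = {{1, 2}, {3, 4}};
        String[][] texts = {{"a"}, {"b"}};
        String[][] both = sideBySide(nums, texts);

        System.out.println(Arrays.toString(both[0]));   // prints [1, 2, a]
        System.out.println(Arrays.toString(both[1]));   // prints [3, 4, b]
    }
}
```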
There are two ways to do this that I can see:
You can create a custom data object that holds the strings and ints. This would be the best way if the two items belong together. It also makes comparing rows easier.
You can create a 4D array of objects and put all the values together like this. I wouldn't recommend it, but it does solve the problem.
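The first option might look like the sketch below. For brevity it assumes one int and one String per row, which is a simplification of the 2D arrays in the question. The names Row, RowDemo and zip are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// One row holding an int and the String that belongs with it.
class Row {
    final int number;
    final String text;

    Row(int number, String text) {
        this.number = number;
        this.text = text;
    }

    // Two rows are equal when both fields match, which makes comparing rows easy.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Row)) return false;
        Row other = (Row) o;
        return number == other.number && text.equals(other.text);
    }

    @Override
    public int hashCode() {
        return Objects.hash(number, text);
    }

    @Override
    public String toString() {
        return number + " " + text;
    }
}

class RowDemo {
    // Pairs up the two arrays element by element.
    static List<Row> zip(int[] numbers, String[] texts) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < numbers.length; i++) {
            rows.add(new Row(numbers[i], texts[i]));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Row> rows = zip(new int[] {1, 2}, new String[] {"one", "two"});
        System.out.println(rows);                                   // prints [1 one, 2 two]
        System.out.println(rows.get(0).equals(new Row(1, "one")));  // prints true
    }
}
```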
I hear the curse of dimensionality lurking in the background. That said, the first answer that comes to mind is:
final List<int,String> list = new List<int,String>();
Then, reviewing the OP's statement, we see two [][] arrays, which raises the question of ordering. Looking at the two good replies already given, we can use Integer.toString(int) to get the concatenation, which fulfills the OP's problem definition. After that, it comes down to whether ordering is significant or a flat list will do, and (again) what to do with the second array. Is it a tuple? If so, how do we "hold" the data in the row? We could then use List<Pair<Integer,String>> or List<Pair<String,String>>, which seems the canonical solution to me.
I'm programming a java application that reads strictly text files (.txt). These files can contain upwards of 120,000 words.
The application needs to store all +120,000 words. It needs to name them word_1, word_2, etc. And it also needs to access these words to perform various methods on them.
The methods all have to do with Strings. For instance, a method will be called to say how many letters are in word_80. Another method will be called to say what specific letters are in word_2200.
In addition, some methods will compare two words. For instance, a method will be called to compare word_80 with word_2200 and needs to return which has more letters. Another method will be called to compare word_80 with word_2200 and needs to return what specific letters both words share.
My question is: Since I'm working almost exclusively with Strings, is it best to store these words in one large ArrayList? Several small ArrayLists? Or should I be using one of the many other storage possibilities, like Vectors, HashSets, LinkedLists?
My two primary concerns are 1.) access speed, and 2.) having the greatest possible number of pre-built methods at my disposal.
Thank you for your help in advance!!
Wow! Thanks everybody for providing such a quick response to my question. All your suggestions have helped me immensely. I’m thinking through and considering all the options provided in your feedback.
Please forgive me for any fuzziness; and let me address your questions:
Q) English?
A) The text files are actually books written in English. The occurrence of a word in a second language would be rare – but not impossible. I’d put the percentage of non-English words in the text files at .0001%
Q) Homework?
A) I’m smilingly looking at my question’s wording now. Yes, it does resemble a school assignment. But no, it’s not homework.
Q) Duplicates?
A) Yes. And probably every five or so words, considering conjunctions, articles, etc.
Q) Access?
A) Both random and sequential. It’s certainly possible a method will locate a word at random. It’s equally possible a method will want to look for a matching word between word_1 and word_120000 sequentially. Which leads to the last question…
Q) Iterate over the whole list?
A) Yes.
Also, I plan on growing this program to perform many other methods on the words. I apologize again for my fuzziness. (Details do make a world of difference, do they not?)
Cheers!
I would store them in one large ArrayList and worry about (possibly unnecessary) optimisations later on.
Being inherently lazy, I don't think it's a good idea to optimise unless there's a demonstrated need. Otherwise, you're just wasting effort that could be better spent elsewhere.
In fact, if you can set an upper bound to your word count and you don't need any of the fancy List operations, I'd opt for a normal (native) array of string objects with an integer holding the actual number. This is likely to be faster than a class-based approach.
This gives you the greatest speed in accessing the individual elements whilst still retaining the ability to do all that wonderful string manipulation.
Note I haven't benchmarked native arrays against ArrayLists. They may be just as fast as native arrays, so you should check this yourself if you have less blind faith in my abilities than I do :-).
If they do turn out to be just as fast (or even close), the added benefits (expandability, for one) may be enough to justify their use.
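A sketch of the single-ArrayList approach, covering the kinds of operations the question mentions. The class name WordStore and its method names are hypothetical, and splitting on whitespace is a simplifying assumption about how words are separated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class WordStore {
    private final List<String> words = new ArrayList<>();

    // Splits the text on whitespace and keeps the words in document order.
    WordStore(String text) {
        for (String w : text.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
    }

    // word_1 is stored at index 0, word_2 at index 1, and so on.
    String word(int n) {
        return words.get(n - 1);
    }

    int letterCount(int n) {
        return word(n).length();
    }

    // Returns the number of the longer word (a when the lengths are equal).
    int longer(int a, int b) {
        return letterCount(b) > letterCount(a) ? b : a;
    }

    // Letters that occur in both words, in sorted order.
    Set<Character> sharedLetters(int a, int b) {
        Set<Character> shared = new TreeSet<>();
        for (char c : word(a).toCharArray()) {
            shared.add(c);
        }
        Set<Character> other = new TreeSet<>();
        for (char c : word(b).toCharArray()) {
            other.add(c);
        }
        shared.retainAll(other);
        return shared;
    }

    public static void main(String[] args) {
        WordStore store = new WordStore("hello big world");
        System.out.println(store.letterCount(1));       // prints 5
        System.out.println(store.longer(1, 2));         // prints 1
        System.out.println(store.sharedLetters(1, 3));  // prints [l, o]
    }
}
```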
Just confirming pax's assumptions with a very naive benchmark:
import java.util.ArrayList;
import java.util.Date;
import java.util.Random;

public class AccessBenchmark
{
    public static void main(String[] args)
    {
        int size = 120000;
        String[] arr = new String[size];
        ArrayList<String> al = new ArrayList<String>(size);
        // Fill both containers with the same strings.
        for (int i = 0; i < size; i++)
        {
            String put = Integer.toHexString(i);
            al.add(put);
            arr[i] = put;
        }
        Random rand = new Random();
        // Time 10,000,000 random reads from the plain array.
        Date start = new Date();
        for (int i = 0; i < 10000000; i++)
        {
            int get = rand.nextInt(size);
            String fetch = arr[get];
        }
        Date end = new Date();
        long diff = end.getTime() - start.getTime();
        System.out.println("array access took " + diff + " ms");
        // Time the same number of random reads from the ArrayList.
        start = new Date();
        for (int i = 0; i < 10000000; i++)
        {
            int get = rand.nextInt(size);
            String fetch = al.get(get);
        }
        end = new Date();
        diff = end.getTime() - start.getTime();
        System.out.println("array list access took " + diff + " ms");
    }
}
and the output:
array access took 578 ms
array list access took 907 ms
running it a few times the actual times seem to vary some, but generally array access is between 200 and 400 ms faster, over 10,000,000 iterations.
If you will access these Strings sequentially, a LinkedList would be the best choice.
For random access, ArrayLists have a nice memory usage/access speed trade-off.
My take:
For a non-threaded program, an ArrayList is always fastest and simplest.
For a threaded program, a java.util.concurrent.ConcurrentHashMap<Integer,String> or java.util.concurrent.ConcurrentSkipListMap<Integer,String> is awesome. Perhaps you would later like to allow threads so as to make multiple queries against this huge thing simultaneously.
If you're going for fast traversal as well as compact size, use a DAWG (Directed Acyclic Word Graph.) This data structure takes the idea of a trie and improves upon it by finding and factoring out common suffixes as well as common prefixes.
http://en.wikipedia.org/wiki/Directed_acyclic_word_graph
Use a Hashtable? This will give you your best lookup speed.
ArrayList/Vector if order matters (it appears to, since you are calling the words "word_xxx"), or Hashtable/HashMap if it doesn't.
I'll leave the exercise of figuring out why you would want to use an ArrayList vs. a Vector or a Hashtable vs. a HashMap up to you, since I have a sneaking suspicion this is your homework. Check the Javadocs.
You're not going to get any methods that help you as you've asked for in the examples above from your Collections Framework class, since none of them do String comparison operations. Unless you just want to order them alphabetically or something, in which case you'd use one of the Tree implementations in the Collections framework.
How about a radix tree or Patricia trie?
http://en.wikipedia.org/wiki/Radix_tree
The only advantage of a linked list over an array or array list would be if there are insertions and deletions at arbitrary places. I don't think this is the case here: You read in the document and build the list in order.
I THINK that when the original poster talked about finding "word_2200", he meant simply the 2200th word in the document, and not that there are arbitrary labels associated with each word. If so, then all he needs is indexed access to all the words. Hence, an array or array list. If there really is something more complex, if one word might be labeled "word_2200" and the next word is labeled "foobar_42" or some such, then yes, he'd need a more complex structure.
Hey, do you want to give us a clue WHY you want to do any of this? I'm hard pressed to remember the last time I said to myself, "Hey, I wonder if the 1,237th word in this document I'm reading is longer or shorter than the 842nd word?"
Depends on what the problem is - speed or memory.
If it's memory, the minimum solution is to write a function getWord(n) which scans the whole file each time it runs, and extracts word n.
Now, that's not a very good solution. A better solution is to decide how much memory you want to use: let's say 1000 items. Scan the file for words once when the app starts, and store a series of bookmarks containing the word number and the position in the file where it is located. Do this in such a way that the bookmarks are more or less evenly spaced through the file.
Then, open the file for random access. The function getWord(n) now looks at the bookmarks to find the biggest word # <= n (please use a binary search), does a seek to get to the indicated location, and scans the file, counting the words, to find the requested word.
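The bookmark lookup step could be sketched as a binary search for the last bookmark at or before word n. This leaves out the file seeking and scanning. The names Bookmarks and floorIndex and the sample word numbers and byte offsets are made up for illustration.

```java
class Bookmarks {
    // Index of the last bookmark whose word number is <= n, or -1 if there is none.
    // wordNumbers must be sorted in ascending order.
    static int floorIndex(int[] wordNumbers, int n) {
        int lo = 0;
        int hi = wordNumbers.length - 1;
        int best = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (wordNumbers[mid] <= n) {
                best = mid;          // candidate found, look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] wordNumbers = {1, 1000, 2000, 3000};      // bookmarked word numbers
        long[] offsets = {0L, 5821L, 11790L, 17432L};   // made-up byte positions

        int i = floorIndex(wordNumbers, 2200);
        System.out.println(i);                          // prints 2
        System.out.println(offsets[i]);                 // prints 11790
        System.out.println(2200 - wordNumbers[i]);      // prints 200 (words still to skip)
    }
}
```

After seeking to offsets[i], the reader would still have to scan forward past the remaining words to reach the one requested.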
An even quicker solution, using rather more memory, is to build some sort of cache for the blocks, on the basis that getWord() requests usually come through in clusters. You can rig things up so that if someone asks for word # X, and it's not in the bookmarks, then you seek for it and put it in the bookmarks, saving memory by consolidating whichever bookmark was least recently used.
And so on. It depends, really, on what the problem is: on what kind of patterns of retrieval are likely.
I don't understand why so many people are suggesting ArrayList, or the like, since you don't mention ever having to iterate over the whole list. Further, it seems you want to access them as key/value pairs ("word_348"="pedantic").
For the fastest access, I would use a TreeMap, which will do binary searches to find your keys. Its only downside is that it's unsynchronized, but that's not a problem for your application.
http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html