Removing Null elements in an array - java

I'm reading from an XML file to populate some data structures and I run into this sort of problem when I inspect my final structure:
arrayName
[0] = null
[1] = input
[2] = null
[3] = input
etc
input is what I want, and the type is my own class.
I'm used to C#, so normally I'd use LINQ to get rid of them. Any ideas for doing something like this in Java?
I'm going to look at what's wrong with the code that's making it happen but for now I need a quick solution.
Ideas?
EDIT:
I found the issue: I create an array of size doc.getChildNodes().getLength(), and when I'm setting elements in the array (while looping through), I check
getNodeType() == Node.ELEMENT_NODE
and that check fails about half the time. The problem is that I size the array on the full child count, so only about half of the array gets filled.
Ideas?

Arrays in Java have a fixed length (their contents can change, but the array itself cannot grow or shrink). There is no such thing as a dynamically sized array or changing the length. So you would iterate, count, create a new array, copy... or use an appropriate data structure in the first place, ideally one that encapsulates the creation of new arrays and offers manipulation methods, like ArrayList.
Something like LINQ does not exist yet; you need some explicit object that encapsulates the manipulation or filtering.
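For reference, a minimal sketch of that count-then-copy approach (MyType is just a placeholder for your own element class):
static MyType[] compact(MyType[] source) {
    // First pass: count the non-null entries.
    int count = 0;
    for (MyType e : source) {
        if (e != null) {
            count++;
        }
    }
    // Second pass: copy them into an exactly-sized array.
    MyType[] result = new MyType[count];
    int pos = 0;
    for (MyType e : source) {
        if (e != null) {
            result[pos++] = e;
        }
    }
    return result;
}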

If you don't have excessive amounts of data, you could just run it through a few collections.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        String[] arr = {"haha", "hoho", null, "hihi", null};
        System.out.println(Arrays.toString(arr));

        // Note: a HashSet also drops duplicates and does not preserve order.
        Set<String> set = new HashSet<>(Arrays.asList(arr));
        set.remove(null);

        arr = new String[set.size()];
        arr = set.toArray(arr);
        System.out.println(Arrays.toString(arr));
    }
}
Output:
[haha, hoho, null, hihi, null]
[hoho, haha, hihi]
Keep in mind that this first allocates the original array, then creates a list from that array, then creates a HashSet from that list, and eventually copies everything into a new array assigned to the same variable.
With a very large amount of data this might bog you down a little, but otherwise it's very easy to read and only uses built-in features to get what you want.

Unfortunately, there's nothing like LINQ in Java. The best thing would probably be to check for null beforehand and only insert the element if it is not null, e.g. (assuming your class name is InputClass):
InputClass c = parseFromXml(...);
if (c != null) {
    myList.add(c);
}
Alternatively, you can remove nulls by iterating and copying (I use a list as an intermediary artifact):
InputClass[] removeNulls(InputClass[] original) {
    List<InputClass> nonNulls = new ArrayList<InputClass>();
    for (InputClass i : original) {
        if (i != null) {
            nonNulls.add(i);
        }
    }
    return nonNulls.toArray(new InputClass[0]);
}
You can also use generics and declare the method as <T> T[] removeNulls(T[] original) instead.

It sounds like you want a method that gives you a new array with the null values removed; perhaps something like this:
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

public static <T> T[] removeNulls(T[] in) {
    if (in == null) {
        return in;
    }
    Class<?> cls = null;
    List<T> al = new ArrayList<T>();
    for (T t : in) {
        if (t != null) {
            if (cls == null) {
                cls = t.getClass();
            }
            al.add(t);
        }
    }
    // Note: if every element is null (or the array is empty), cls stays null
    // and Array.newInstance will throw a NullPointerException.
    @SuppressWarnings("unchecked")
    T[] out = (T[]) Array.newInstance(cls, al.size());
    return al.toArray(out);
}

Related

Why are ArrayList created with empty elements array but HashSet with null table?

Maybe a bit of a philosophical question.
Looking at Java's ArrayList implementation, I noticed that when creating a new instance, the internal "elementData" array (which holds the items) is initialized to an empty array:
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
However, a HashSet (which is backed by a HashMap) is created with the table and entrySet fields just left null:
transient Node<K,V>[] table;
transient Set<Map.Entry<K,V>> entrySet;

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
This got me thinking so I went and looked up C#'s List and HashSet:
https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,61f6a8d9f0c40f6e
https://referencesource.microsoft.com/#System.Core/System/Collections/Generic/HashSet.cs,2d265edc718b158b
List:
static readonly T[] _emptyArray = new T[0];

public List() {
    _items = _emptyArray;
}
HashSet:
private int[] m_buckets;

public HashSet()
    : this(EqualityComparer<T>.Default) { }

public HashSet(IEqualityComparer<T> comparer) {
    if (comparer == null) {
        comparer = EqualityComparer<T>.Default;
    }
    this.m_comparer = comparer;
    m_lastIndex = 0;
    m_count = 0;
    m_freeList = -1;
    m_version = 0;
}
So, is there a good reason why both languages picked empty for list and null for set/map?
They both use the "single shared instance" trick for the empty array, which is nice, but why not just use a null array?
Answering from a C# perspective.
For an empty ArrayList, you'll find that all the logic (get, add, grow, ...) works "as-is" if you have an empty array as backing store. No need for additional code to handle the uninitialized case, this makes the whole implementation neater. And since the empty array is cached, this does not result in an additional heap allocation, so you get the cleaner code at no extra cost.
For HashSet this is not possible, as accessing a bucket is done through the formula hashCode % m_buckets.Length. Trying to compute % 0 is considered as a division by 0, and therefore invalid. This means you need to handle specifically the "not initialized" case, so you gain nothing from pre-assigning the field with an empty array.
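The same arithmetic fails in Java as well; this is not from either library's source, just a quick standalone illustration of why a zero-length bucket table can't silently stand in for "not initialized":
public class EmptyBucketsDemo {
    public static void main(String[] args) {
        int[] buckets = new int[0];
        int hash = "key".hashCode();
        // Throws ArithmeticException ("/ by zero"): with zero buckets there is
        // no valid index, so the uninitialized case must be handled explicitly.
        int index = hash % buckets.length;
        System.out.println(index);
    }
}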
Initializing elementData to an empty array in ArrayList makes it possible to avoid a null check in the grow(int minCapacity) method, which calls:
elementData = Arrays.copyOf(elementData, newCapacity);
to increase the capacity of the backing array. When that method is called for the first time, that statement "copies" the empty array into the start of the new array (in effect it copies nothing).
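As a small illustration of why copying from the shared empty array is harmless (a sketch, not the JDK's actual code):
import java.util.Arrays;

public class GrowSketch {
    public static void main(String[] args) {
        Object[] elementData = {};  // stands in for DEFAULTCAPACITY_EMPTY_ELEMENTDATA
        int newCapacity = 10;
        // Copying a zero-length array simply yields a new array of nulls,
        // so the first grow() call needs no special "still null" branch.
        elementData = Arrays.copyOf(elementData, newCapacity);
        System.out.println(elementData.length);  // prints 10
    }
}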
In HashMap a similar strategy wouldn't be useful: when you resize the array of buckets, you don't copy the original array into the start of the new array; you have to go over all the entries and find the new bucket of each entry. Therefore, initializing the buckets array to an empty array instead of keeping it null would only mean checking whether the array's length == 0 instead of checking whether it's null. Replacing one condition with another wouldn't gain anything.

How to lazy initialize Collectors.toList() in java 8 stream api?

I want to collect items based on a filter, but the resulting list should not be initialized if no match was found. I'd prefer null over an empty list.
List<String> match = list
        .stream()
        .filter(item -> "match".equals(item.getProperty()))
        .collect(Collectors.toList());
if (match != null && !match.isEmpty()) {
    // handle seldom match
}
Problem: most of the time I will not have a match, resulting in an empty collection. Which means that most of the time the list is instantiated even though I don't need it.
Collectors.toList() allocates a List using ArrayList::new, which is a very cheap operation since ArrayList doesn't actually allocate the backing array until elements are inserted. All the constructor does is initialize an internal Object[] field to a statically created empty array. The actual backing array is allocated at its "initial size" only when the first element is inserted.
So why go through the pain of avoiding this construction? It sounds like a premature optimization.
If you're so worried about GC pressure, just don't use Streams. The stream and the Collector itself are probably quite a lot more "expensive" to create than the list.
The only case I can think of is when something other than Collectors.toList() would be expensive to compute; otherwise use:
... collect(Collectors.collectingAndThen(
        Collectors.toList(),
        list -> list.isEmpty() ? null : list))
But just keep in mind that someone using that List would most probably expect an empty one in case of missing elements, instead of a null.
Creating an empty ArrayList is quite cheap and laziness here would only make things worse.
Otherwise, here is a variant that returns null if you really, really want that:
@SuppressWarnings("unchecked")
private static <T> List<T> list(Stream<T> stream) {
    Spliterator<T> sp = stream.spliterator();
    if (sp.getExactSizeIfKnown() == 0) {
        System.out.println("Exact zero known");
        return null;
    }
    // Try to pull the first element; only allocate the list if one exists.
    T[] first = (T[]) new Object[1];
    boolean b = sp.tryAdvance(x -> first[0] = x);
    if (b) {
        List<T> list = new ArrayList<>();
        list.add(first[0]);
        sp.forEachRemaining(list::add);
        return list;
    }
    return null;
}
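For example, a hypothetical usage mirroring the question's pipeline (values is a stand-in for the source list, not a name from the question):
List<String> match = list(values.stream().filter(s -> s.startsWith("match")));
if (match != null) {
    // handle the seldom match
}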

Creating and filling an array in for each loop

// is there any way to do this
public Entry[] getArray(SimpleHashtable table) {
    Entry[] data;
    for (Entry k : table) {
        // do something here
    }
    return data; // array that contains all elements stored in table
}
Is there a way to create and fill array with elements in for each loop?
Explained in short:
I have made a class SimpleHashtable that stores objects; it implements Iterable.
Now I have to write a method that returns an array of the elements stored in that SimpleHashtable. The first idea that came to mind is to iterate over the SimpleHashtable and fill an array one element at a time, but I can't find any example of that.
I would like to avoid putting the elements in a list in the meantime until I have iterated over the SimpleHashtable (it seems messy).
I know that arrays in Java aren't resizable, and that seems to make things difficult.
Assuming your class SimpleHashtable has a size() method to get the number of elements (which you need in order to size your array), you could then iterate with a for-each loop (as requested) with something like:
SimpleHashtable<Object> sh;
// ...
Object[] arr = new Object[sh.size()];
int pos = 0;
for (Object obj : sh) {
    arr[pos++] = obj;
}

How do filter out this list with java 8 streams and functional interfaces?

If I have a list of arrays like this (pseudo Java code):
Note that the list valsSorted will always be sorted by x[0] ascending and x[1] descending.
List valsSorted = {[1 5][1 4][1 3][2 1][3 2][3 1][4 2][4 1][5 1][6 2][6 1]};
How do I filter this list with Java 8 streams and lambdas so that I get:
result = {[1 5][2 1][3 2][4 2][5 1][6 2]}
The first item of the array (x[0]) is an ID and the second is a version number. So the rule is: return all distinct IDs, each with its highest version.
If I were using a for loop, the following code would be fine:
ArrayList<int[]> result = new ArrayList<>();
int keep = -1;
for (int[] x : valsSorted) {
    int id = x[0];
    int version = x[1];
    if (keep == id) continue;
    keep = id;
    result.add(x);
}
Your use of the word "distinct" suggests using the distinct() stream operation. Unfortunately that operation is hardwired to use the equals() method of the stream elements, which isn't useful for arrays. One approach for dealing with this would be to wrap the arrays in a wrapper object that has the semantics of equality that you're looking for:
class Wrapper {
    final int[] array;

    Wrapper(int[] array) { this.array = array; }

    int[] getArray() { return array; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Wrapper))
            return false;
        else
            return this.array[0] == ((Wrapper) other).array[0];
    }

    @Override
    public int hashCode() { ... } // must be consistent with equals(), i.e. based on array[0]
}
Then wrap up your object before distinct() and unwrap it after:
// assumes: import static java.util.stream.Collectors.toList;
List<int[]> valsDistinct =
    valsSorted.stream()
              .map(Wrapper::new)
              .distinct()
              .map(Wrapper::getArray)
              .collect(toList());
This makes one pass over the data but it generates a garbage object per value. This also relies on the stream elements being processed in-order since you want the first one.
Another approach would be to use some kind of stateful collector, but that will end up storing the entire result list before any subsequent processing begins, which you said you wanted to avoid.
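For illustration only, one way to sketch such a collector-based variant is Collectors.toMap with a merge function that keeps the first array per ID (the class and variable names here are my own, and it assumes, as the question states, that the input is sorted so the first entry per ID carries the highest version):
import java.util.*;
import java.util.stream.Collectors;

public class DistinctById {
    public static void main(String[] args) {
        List<int[]> valsSorted = Arrays.asList(
                new int[]{1, 5}, new int[]{1, 4}, new int[]{1, 3},
                new int[]{2, 1}, new int[]{3, 2}, new int[]{3, 1},
                new int[]{4, 2}, new int[]{4, 1}, new int[]{5, 1},
                new int[]{6, 2}, new int[]{6, 1});

        Collection<int[]> result = valsSorted.stream()
                .collect(Collectors.toMap(
                        x -> x[0],                 // key: the ID
                        x -> x,                    // value: the array itself
                        (first, second) -> first,  // keep the first seen = highest version
                        LinkedHashMap::new))       // preserve encounter order of IDs
                .values();

        result.forEach(x -> System.out.println(Arrays.toString(x)));
    }
}
Note that, like the wrapper approach, this buffers the whole result before anything downstream runs.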
It might be worth considering making the data elements be actual classes instead of two-element arrays. This way you can provide a reasonable notion of equality, and you can also make the values comparable so that you can sort them easily.
(Credit: technique stolen from this answer.)
import java.awt.Point; // provides getX()/getY()
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class Test {
    List<Point> valsSorted = Arrays.asList(new Point(1, 5),
            new Point(1, 4),
            new Point(1, 3),
            new Point(2, 1),
            new Point(3, 2),
            new Point(3, 1),
            new Point(4, 2),
            new Point(4, 1),
            new Point(5, 1),
            new Point(6, 2),
            new Point(6, 1));

    public Test() {
        List<Point> c = valsSorted.stream()
                .collect(Collectors.groupingBy(Point::getX))
                .values()
                .stream()
                .map(j -> j.get(0))
                .collect(Collectors.toList());
        for (int i = 0; i < c.size(); i++) {
            System.out.println(c.get(i));
        }
    }

    public static void main(String[] args) {
        Test t = new Test();
    }
}
I decided to use the Point class, representing the ID field as x and the version number as y. From there, you create a stream and group the points by ID. Calling the values() method returns a Collection<List<Point>>; you can then stream that collection and take the first value from each list, which, according to your specification (descending version order), is the highest version number. From there all you have to do is collect them into a list, an array, or whatever you see necessary and assign it as needed.
The only problem here is that they are printed out of order. That should be an easy fix though.
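For example (my own suggestion, reusing the c list from the code above), sorting by the x field before printing restores the ID order:
c.sort(Comparator.comparingDouble(Point::getX)); // needs java.util.Comparator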

Best way to Iterate collection classes?

I want to ask about the best way to iterate over collection classes.
private ArrayList<String> no = new ArrayList<String>();
private ArrayList<String> code = new ArrayList<String>();
private ArrayList<String> name = new ArrayList<String>();
private ArrayList<String> colour = new ArrayList<String>();
private ArrayList<String> size = new ArrayList<String>();
// method for finding a specific value inside an ArrayList; if it matches, delete that element
void deleteSomeRows(Collection<String> column, String valueToDelete) {
    Iterator<String> iterator = column.iterator();
    do {
        if (iterator.next() == valueToDelete) {
            iterator.remove();
        }
    } while (iterator.hasNext());
}

deleteSomeRows(no, "value");
deleteSomeRows(code, "value");
deleteSomeRows(name, "value");
deleteSomeRows(colour, "value");
deleteSomeRows(size, "value");
The problem with the code above is that it takes a noticeable amount of time just to iterate each of those lists. Any solution to make it faster? Please help if you can. :D
You could simplify your code:
while (column.contains(valueToDelete)) {
    column.remove(valueToDelete);
}
You're not going to be able to speed up the ArrayList iteration itself, especially if your list is not sorted; you're stuck at O(n) for this problem. If you sorted the list and used a binary search to find the item to remove (repeating until it is no longer found), you could speed up the lookup.
This next suggestion isn't directly related to the time it takes, but it will cause you problems.
You should never compare String objects for equality using the == operator. That compares their references, not their contents.
Use this instead:
if (iterator.next().equals(valueToDelete))
EDIT: The problem here is not the iteration. The problem is removing the elements from the ArrayList. When you remove the first element from an ArrayList, then all subsequent elements have to be shifted one position to the left. So in the worst case, your current approach will have quadratic complexity.
It's difficult to avoid this in general. But in this case, the best tradeoff between simplicity and performance can probably be achieved like this: Instead of removing the elements from the original list, you create a new list which only contains the elements that are not equal to the "valueToDelete".
This could, for example, look like this:
import java.util.ArrayList;
import java.util.List;

public class QuickListRemove
{
    public static void main(String[] args)
    {
        List<String> size = new ArrayList<String>();
        size = deleteAll(size, "value");
    }

    private static <T> List<T> deleteAll(List<T> list, T valueToDelete)
    {
        List<T> result = new ArrayList<T>(list.size());
        for (T value : list)
        {
            if (!value.equals(valueToDelete))
            {
                result.add(value);
            }
        }
        return result;
    }
}
If you want to modify the collection while iterating over it, you should use an Iterator; otherwise you can use the for-each loop. (An Iterator version is shown after the for-each example below.)
For-each:
// T is the type of elements stored in myList
for (T val : myList)
{
    // do something
}
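And with an Iterator, when you need to remove elements while iterating (note that hasNext() is checked before every next() call and equals() is used for the comparison):
Iterator<String> it = column.iterator();
while (it.hasNext()) {
    if (it.next().equals(valueToDelete)) {
        it.remove();
    }
}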
Try putting a break after you find the element to delete.
