Context
I want to iterate over a Spark Dataset and update a HashMap for each row.
Here is the code I have:
// At this point, I have a my_dataset variable containing 300 000 rows and 10 columns
// - my_dataset.count() == 300 000
// - my_dataset.columns().length == 10
// Declare my HashMap
HashMap<String, Vector<String>> my_map = new HashMap<String, Vector<String>>();
// Initialize the map
for(String col : my_dataset.columns())
{
my_map.put(col, new Vector<String>());
}
// Iterate over the dataset and update the map
my_dataset.foreach( (ForeachFunction<Row>) row -> {
for(String col : my_map.keySet())
{
my_map.get(col).add(row.get(row.fieldIndex(col)).toString());
}
});
Issue
My issue is that the foreach doesn't iterate at all: the lambda is never executed, and I don't know why.
I implemented it as indicated here: How to traverse/iterate a Dataset in Spark Java?
At the end, all the inner Vectors remain empty (as they were initialized) even though the Dataset is not (see the first comments in the code sample above).
I know that the foreach never iterates because I did two tests:
Add an AtomicInteger to count the iterations and increment it right at the beginning of the lambda with the incrementAndGet() method. => The counter value remains 0 at the end of the process.
Print a debug message right at the beginning of the lambda. => The message is never displayed.
I'm not used to Java (even less to Java lambdas), so maybe I missed an important point, but I can't find what.
I am probably a little old school, but I have never liked lambdas too much, as they can get pretty complicated.
Here is a full example of a foreach():
package net.jgp.labs.spark.l240_foreach.l000;
import java.io.Serializable;
import org.apache.spark.api.java.function.ForeachFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
public class ForEachBookApp implements Serializable {
private static final long serialVersionUID = -4250231621481140775L;
private final class BookPrinter implements ForeachFunction<Row> {
private static final long serialVersionUID = -3680381094052442862L;
@Override
public void call(Row r) throws Exception {
System.out.println(r.getString(2) + " can be bought at " + r.getString(
4));
}
}
public static void main(String[] args) {
ForEachBookApp app = new ForEachBookApp();
app.start();
}
private void start() {
SparkSession spark = SparkSession.builder().appName("For Each Book").master(
"local").getOrCreate();
String filename = "data/books.csv";
Dataset<Row> df = spark.read().format("csv").option("inferSchema", "true")
.option("header", "true")
.load(filename);
df.show();
df.foreach(new BookPrinter());
}
}
As you can see, this example reads a CSV file and prints a message from the data. It is fairly simple.
The foreach() instantiates a new class, where the work is done.
df.foreach(new BookPrinter());
The work is done in the call() method of the class:
private final class BookPrinter implements ForeachFunction<Row> {
@Override
public void call(Row r) throws Exception {
...
}
}
As you are new to Java, make sure you have the right signature (for classes and methods) and the right imports.
You can also clone the example from https://github.com/jgperrin/net.jgp.labs.spark/tree/master/src/main/java/net/jgp/labs/spark/l240_foreach/l000. This should help you with foreach().
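One possible explanation for the empty map, which neither the question nor the answer above confirms, is that Spark serializes the function passed to foreach() together with the objects it captures and runs it against a copy, so the map held by the driver program is never touched. The plain-Java sketch below only imitates that effect with standard Java serialization; it does not use Spark, and the class and method names (ClosureCopyDemo, copyViaSerialization) are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Vector;

public class ClosureCopyDemo {

    // Round-trips an object through Java serialization. This stands in for
    // what happens to captured objects when a function is shipped elsewhere.
    @SuppressWarnings("unchecked")
    static <T> T copyViaSerialization(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Vector<String>> driverMap = new HashMap<>();
        driverMap.put("col", new Vector<>());

        // The "worker" only ever sees a deserialized copy of the map.
        HashMap<String, Vector<String>> workerMap = copyViaSerialization(driverMap);
        workerMap.get("col").add("value");

        System.out.println("worker copy size: " + workerMap.get("col").size()); // prints 1
        System.out.println("driver map size: " + driverMap.get("col").size());  // prints 0
    }
}
```

If that is indeed what happens here, bringing the rows back to the driver first (for example with my_dataset.collectAsList(), which is reasonable for 300 000 rows of 10 columns) and filling the map in an ordinary loop would update the map you actually read afterwards.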
Related
I implemented a queue using an array. Now I want to remove an element by searching for it: if the element is there, it must be removed from the queue.
public static void deleteFromQueue(PassengerQueue passengerQueue){
//passengerQueue.dequeue();
Scanner scan = new Scanner(System.in);
System.out.print("please Enter the Passenger name: ");
String name = scan.nextLine();
for (int i=0; i<passengerQueue.getPassenger().length; i++){
if (passengerQueue.getPassenger()[i].getName().equals(name)){
//
}
}
}
Here is my removal method.
Why do you specifically want to use an array? Please find an example attached using an ArrayList:
package stackoverflow;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import java.util.ArrayList;
import java.util.Iterator;
public class QuickTest {
@Test
public void test() throws Exception {
PassengerQueue passengerQueue = new PassengerQueue();
passengerQueue.add(new Passenger("testName1"));
passengerQueue.add(new Passenger("testName2"));
Assertions.assertEquals(2, passengerQueue.size());
PassengerUtil.removeByName(passengerQueue, "testName1");
Assertions.assertEquals(1, passengerQueue.size());
System.out.println("All done");
}
private static class PassengerUtil {
/** @param passengerQueue Modified by reference. **/
private static void removeByName(PassengerQueue passengerQueue, String specifiedName) {
// Using an Iterator so that I don't trigger ConcurrentModificationException.
for (Iterator<Passenger> it = passengerQueue.iterator() ; it.hasNext() ; ) {
Passenger currPassenger = it.next();
if (currPassenger.getName().equals(specifiedName)) {
it.remove();
}
}
}
}
private class PassengerQueue extends ArrayList<Passenger> {
}
private class Passenger {
private String name;
public Passenger(String name) {
if (name == null) {
throw new NullPointerException("The " + Passenger.class.getSimpleName() + " cannot have a Name equal to NULL!");
}
this.name = name;
}
public String getName() {
return this.name;
}
}
}
Please note the following:
My PassengerQueue object extends ArrayList. So I have a type-safe list of Passengers just by extending ArrayList - I don't need to do anything else.
I use an Iterator to iterate over the list. It's a bit more verbose than a normal for-each loop, but it's necessary to avoid triggering a ConcurrentModificationException. Java generally throws this exception when you remove elements from a list while a for-each loop is iterating over it, although some simple cases may not trigger it.
You called your list PassengerQueue. Please note that Java does have Queue (https://docs.oracle.com/javase/8/docs/api/java/util/Queue.html) collections. Just as I extended ArrayList, you can look at the Queue implementations and extend one of those instead if you really need your collection to function like a queue.
Your code, and mine, can currently delete multiple elements from the list if the list contains Passengers with the same name.
Your question title asked about deleting from an array using an index position. You can consider adding the Apache Commons Lang project to your classpath and using methods from their ArrayUtils
Actually, my answer can be improved to not even use an Iterator:
private static class PassengerUtil {
/** @param passengerQueue Modified by reference. **/
private static void removeByName(PassengerQueue passengerQueue, String specifiedName) {
passengerQueue.removeIf(currPassenger -> currPassenger.getName().equals(specifiedName));
}
}
Some reading on the latter code example here.
A 'Queue' is defined as a data structure that holds a sequence of items where you can only add something to the end (the 'tail') and where you can take something from the beginning, the 'head'. And sometimes it is said that you can get the current size of the sequence, ask whether the sequence is empty, and that you can look ('peek') at the first item without taking it.
That's the basics. And you can implement that in various ways.
There is an interface in Java (java.util.Queue) that provides the basic features described above. So when you declare
java.util.Queue myQueue = …
then you cannot search your queue for an item and remove it (ok, you can take all elements from your queue, one by one, and add again those you want to keep, but that's tedious).
But one common implementation of java.util.Queue is java.util.LinkedList, and a list can be searched.
So you write
java.util.Queue myQueue = new java.util.LinkedList();
and as you now know that the implementation of your queue is in fact a list, you can write
…
for( var i = ((java.util.List) myQueue).iterator(); i.hasNext(); )
{
if( matchesCriteriaForRemoval( i.next() ) ) i.remove();
}
…
But this works only because you know some implementation details of myQueue, and those details are exactly what you wanted to hide when you chose to declare it as java.util.Queue.
So when you have to be able to remove entries from your PassengerQueue, that data structure should provide a method to do so instead of revealing its internal implementation.
This means your code has to look like this:
public static void deleteFromQueue( PassengerQueue passengerQueue )
{
Scanner scan = new Scanner( System.in );
System.out.print( "please Enter the Passenger name: " );
String name = scan.nextLine();
passengerQueue.removeByName( name );
}
How this method PassengerQueue.removeByName() is implemented depends on the internal implementation of PassengerQueue; if it uses a java.util.List named passengers to store the passengers, it may look like this:
public final void removeByName( final String name )
{
for( var i = passengers.iterator(); i.hasNext(); )
{
if( passengerNameMatches( name, i.next() ) ) i.remove();
}
}
If you use another container for your passengers, that removal method has to be implemented differently …
Obviously I omitted all error handling, and the collections are generic types, but I used them as raw types for brevity.
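For the case the question started from, a queue that really is backed by an array, the removal method could look roughly like the sketch below. The class ArrayPassengerQueue and its fields are hypothetical (the asker's PassengerQueue is not shown), and it stores plain names instead of Passenger objects to stay short. It removes only the first match and shifts the remaining entries left so the queue order is preserved.

```java
import java.util.Arrays;

public class ArrayPassengerQueueDemo {

    // Hypothetical array-backed queue of passenger names (not the asker's class).
    static final class ArrayPassengerQueue {
        private final String[] names;
        private int size;

        ArrayPassengerQueue(int capacity) {
            names = new String[capacity];
        }

        void enqueue(String name) {
            if (size == names.length) {
                throw new IllegalStateException("queue is full");
            }
            names[size++] = name;
        }

        // Removes the first passenger with the given name; returns false if absent.
        boolean removeByName(String name) {
            for (int i = 0; i < size; i++) {
                if (names[i].equals(name)) {
                    // Shift the tail one slot to the left over the removed entry.
                    System.arraycopy(names, i + 1, names, i, size - i - 1);
                    names[--size] = null; // clear the now-unused last slot
                    return true;
                }
            }
            return false;
        }

        int size() {
            return size;
        }

        String[] toArray() {
            return Arrays.copyOf(names, size);
        }
    }

    public static void main(String[] args) {
        ArrayPassengerQueue queue = new ArrayPassengerQueue(4);
        queue.enqueue("Ann");
        queue.enqueue("Bob");
        queue.enqueue("Cy");
        queue.removeByName("Bob");
        System.out.println(Arrays.toString(queue.toArray())); // prints [Ann, Cy]
    }
}
```

The caller never sees the array, so the storage could later be swapped for a list without touching deleteFromQueue().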
I am new to parallel streams and am trying to write a sample program that calculates value * 100 (for values 1 to 100) and stores the results in a map.
When I execute the code, I get a different count on each run.
I may be doing something wrong, so please guide me if anyone knows the proper way to do this.
Code:
import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.stream.Collectors;
public class Main{
static int l = 0;
public static void main (String[] args) throws java.lang.Exception {
letsGoParallel();
}
public static int makeSomeMagic(int data) {
l++;
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for(int i = 1; i <= 100 ; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap = new HashMap<>();
dataList.parallelStream().map(f -> {
Integer xx = 0;
{
xx = makeSomeMagic(f);
}
resultMap.put(f, xx);
return 0;
}).collect(Collectors.toList());
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
System.out.println("Function Called: " + l);
}
}
Runnable Code
Last Output
Input Size: 100
Size: 100
Function Called: 98
The output differs on each run.
I want to use parallel streams in my own application, but this confusion/issue is holding me back.
In my application I have 100-200 unique numbers on which the same operation needs to be performed. In short, there is a function that processes each of them.
Your accesses to both the HashMap and the l variable are not thread safe, which is why the output is different in each run.
The correct way to do what you are trying to do is to collect the Stream elements into a Map:
Map<Integer, Integer> resultMap =
dataList.parallelStream()
.collect(Collectors.toMap (Function.identity (), Main::makeSomeMagic));
EDIT: The l variable is still updated in a not thread safe way with this code, so you'll have to add your own thread safety if the final value of the variable is important to you.
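If the call count does matter, one way to add that thread safety is to replace the int counter with an AtomicInteger. The following is a minimal sketch along those lines, not the only option; the class name ParallelCountDemo, the CALLS field and the compute() helper are invented for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelCountDemo {

    // Thread-safe replacement for the static int counter in the question.
    static final AtomicInteger CALLS = new AtomicInteger();

    static int makeSomeMagic(int data) {
        CALLS.incrementAndGet(); // atomic, so no increments are lost
        return data * 100;
    }

    // Lets the collector build the map instead of mutating a shared HashMap.
    static Map<Integer, Integer> compute(List<Integer> input) {
        return input.parallelStream()
                .collect(Collectors.toMap(Function.identity(), ParallelCountDemo::makeSomeMagic));
    }

    public static void main(String[] args) {
        List<Integer> dataList =
                IntStream.rangeClosed(1, 100).boxed().collect(Collectors.toList());
        Map<Integer, Integer> resultMap = compute(dataList);
        System.out.println("Input Size: " + dataList.size());
        System.out.println("Size: " + resultMap.size());
        System.out.println("Function Called: " + CALLS.get());
    }
}
```

The counter is still a side effect, but an atomic one, so it no longer loses updates the way l++ does.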
By putting some values in resultMap you're using a side-effect:
dataList.parallelStream().map(f -> {
Integer xx = 0;
{
xx = makeSomeMagic(f);
}
resultMap.put(f, xx);
return 0;
})
The API states:
Stateless operations, such as filter and map, retain no state from
previously seen element when processing a new element -- each element
can be processed independently of operations on other elements.
Going on with:
Stream pipeline results may be nondeterministic or incorrect if the
behavioral parameters to the stream operations are stateful. A
stateful lambda (or other object implementing the appropriate
functional interface) is one whose result depends on any state which
might change during the execution of the stream pipeline.
It follows an example similar to yours showing:
... if the mapping operation is performed in parallel, the results for
the same input could vary from run to run, due to thread scheduling
differences, whereas, with a stateless lambda expression the results
would always be the same.
That explains your observation: On each time run output differs.
The right approach is shown by @Eran.
It should work if you make the makeSomeMagic function synchronized and use the thread-safe data structure ConcurrentHashMap,
and then write this simple statement:
dataList.parallelStream().forEach(f -> resultMap.put(f, makeSomeMagic(f)));
Whole code is here :
import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.concurrent.ConcurrentHashMap;
public class Main{
static int l = 0;
public static void main (String[] args) throws java.lang.Exception {
letsGoParallel();
}
public synchronized static int makeSomeMagic(int data) { // synchronized so that l++ is thread safe
l++;
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for(int i = 1; i <= 100 ; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap = new ConcurrentHashMap<>();// use ConcurrentHashMap
dataList.parallelStream().forEach(f -> resultMap.put(f, makeSomeMagic(f)));
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
System.out.println("Function Called: " + l);
}
}
There is no need to count how many times the method is invoked.
The Stream does the looping for you.
Pass your logic (a function) to the Stream, and do not use non-thread-safe variables from multiple threads (this includes parallelStream),
like this:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class ParallelStreamClient {
// static int l = 0;---> no need to count times.
public static void main(String[] args) throws java.lang.Exception {
letsGoParallel();
}
public static int makeSomeMagic(int data) {
// l++;-----> this is not thread-safe
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for (int i = 1; i <= 100; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap =
dataList.parallelStream().collect(Collectors.toMap(i -> i,ParallelStreamClient::makeSomeMagic));
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
//System.out.println("Function Called: " + l);
}
}
I'm reworking places where I used unsafe (not type-safe) String or int representations of parts of the model,
and leveraging Enum and EnumSet best practices.
One particular difficulty is this use case: an Enum where every instance holds an EnumSet of [0..n] of its own siblings.
To strip it down to the essentials, I based my question on the StyleEnum from Joshua Bloch. So we have an enum of BOLD, ITALIC, UNDERLINE, STRIKETHROUGH... and let's imagine a B_AND_I which holds {BOLD, ITALIC}.
Please do not pay too much attention to the meaningless example: in the real system these subsets are built from changing rules loaded at startup time.
The goal is that once this computation has taken place, nothing can change an instance's particular sub-EnumSet.
So I came up with something like this:
public enum StyleEnum {
NONE(0, "none"), BOLD(100, "B"), ITALIC(250, "i"), UNDERLINE(350, "u"), STRIKETHROUGH(9, "b"), B_AND_I(99,"Emphase");
//// Pure dream == private final EnumSet<StyleEnum> complexComputedSubSet = new EnumSet<StyleEnum> ();
//// But not in the jdk
private final EnumSet<StyleEnum> complexComputedSubSet;
private final int intProp;
private final String strLabel;
StyleEnum(int intProp, String strLabel) {
this.intProp = intProp;
this.strLabel = strLabel;
//// option 2 would have been be this
// complexComputedSubSet = EnumSet.of(NONE);
//// But COMPILER :: illegal reference to static field from initializer
}//.... end of constructor
/**
* The static initializer will compute, based on some rules, the subset of (zero or more)
* other enum constants that a particular enum instance can hold in its bag.
*/
static {
//// at least, as option 3, why not this...
// for (StyleEnum e : EnumSet.allOf(StyleEnum.class)) {
// e.complexComputedSubSet = EnumSet.of(NONE);
// }
//// COMPILER :: cannot assign a value to final variable complexComputedSubSet
// main handling here : at class loading
// compute a set (rules coming from whatever you want or can).
//Once this static class level init is done
// nothing can change the computed EnumSet
// its getter will always return an unmodifiable computed EnumSet
//.... computing something
}
//....
//getter(){}
//whateverelse(){}
}
As you can see nothing is really pleasant or at least elegant here.
In my dreams :
private final EnumSet<StyleEnum> complexComputedSubSet= new EnumSet<StyleEnum> ();
//..
//static initializer
static {
EnumSet.allOf(StyleEnum.class).forEach(e-> computeSubSet(e));
//..
}
private static void computeSubSet(StyleEnum instance){
//...
instance.complexComputedSubSet.addAll(someComputedCollection);
}
Et voilà !
Instead, it seems all I can do is drop the final modifier from the field:
// getting away from the final keyword
private EnumSet<StyleEnum> complexComputedSubSet;
then, in the class's static initializer block, loop and instantiate with the (dummy) marker (NONE), introduced only for this (silly) purpose:
for (StyleEnum e : EnumSet.allOf(StyleEnum.class)) {
e.complexComputedSubSet = EnumSet.of(NONE);
}
And only after that compute and store the sub-EnumSet.
So all this pain, mostly, just because one cannot say "new EnumSet();"?
There must be some better way. Can you please point me in the right direction?
I would abandon keeping the auxiliary Set in an instance field, and instead implement it as a static Map:
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.EnumMap;
import java.util.EnumSet;
public enum StyleEnum {
NONE(0, "none"),
BOLD(100, "B"),
ITALIC(250, "i"),
UNDERLINE(350, "u"),
STRIKETHROUGH(9, "b"),
B_AND_I(99,"Emphase");
private static Map<StyleEnum, Set<StyleEnum>> complexSubsets;
private final int intProp;
private final String strLabel;
StyleEnum(int intProp, String strLabel) {
this.intProp = intProp;
this.strLabel = strLabel;
}
public Set<StyleEnum> getComplexSubset() {
initSubsets();
return complexSubsets.get(this);
}
private static void initSubsets() {
if (complexSubsets == null) {
Map<StyleEnum, Set<StyleEnum>> map = new EnumMap<>(StyleEnum.class);
map.put(NONE, Collections.unmodifiableSet(EnumSet.of(
BOLD, ITALIC)));
map.put(BOLD, Collections.unmodifiableSet(EnumSet.of(
UNDERLINE)));
map.put(ITALIC, Collections.unmodifiableSet(EnumSet.of(
UNDERLINE)));
map.put(UNDERLINE, Collections.emptySet());
map.put(STRIKETHROUGH, Collections.unmodifiableSet(EnumSet.of(
NONE)));
map.put(B_AND_I, Collections.unmodifiableSet(EnumSet.of(
BOLD, ITALIC)));
complexSubsets = Collections.unmodifiableMap(map);
assert complexSubsets.keySet().containsAll(
EnumSet.allOf(StyleEnum.class)) :
"Not all values have subsets defined";
}
}
}
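A variation on the same idea, sketched here under assumptions rather than taken from the answer: the map can be filled eagerly in the enum's static initializer, which the JVM runs once under its class-initialization lock, so no lazy null check is needed. It also shows that EnumSet.noneOf(...) is the factory that plays the role of the "new EnumSet()" the question wished for. The names StyleSubsetsDemo, Style and subset(), and the single hard-coded rule for B_AND_I, are placeholders for the real rules loaded at startup.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class StyleSubsetsDemo {

    enum Style {
        NONE, BOLD, ITALIC, UNDERLINE, STRIKETHROUGH, B_AND_I;

        // Assigned exactly once in the static block below, so it can be final.
        private static final Map<Style, Set<Style>> SUBSETS;

        static {
            Map<Style, Set<Style>> map = new EnumMap<>(Style.class);
            for (Style style : values()) {
                // EnumSet.noneOf creates an empty, mutable EnumSet.
                EnumSet<Style> subset = EnumSet.noneOf(Style.class);
                if (style == B_AND_I) { // placeholder for the real rules
                    subset.add(BOLD);
                    subset.add(ITALIC);
                }
                map.put(style, Collections.unmodifiableSet(subset));
            }
            SUBSETS = Collections.unmodifiableMap(map);
        }

        // Read-only view; callers cannot change the computed subset.
        Set<Style> subset() {
            return SUBSETS.get(this);
        }
    }

    public static void main(String[] args) {
        System.out.println(Style.B_AND_I.subset()); // prints [BOLD, ITALIC]
        System.out.println(Style.NONE.subset());    // prints []
    }
}
```

Calling EnumSet.noneOf from the enum constructor instead would not work, because the constants are not all created yet at that point; the static block runs after they are.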
The situation (UML given below): A Java package shall have a class Process that runs a calculation in a loop within a thread and notifies observers about the result of that calculation. A new calculation is initiated by adding a value from outside the class to a queue that the thread can read:
public class Process
{
/* ... */
Thread th;
BlockingQueue queue = new ArrayBlockingQueue(10);
public void calc()
{
th = new Thread(new Runnable()
{
public void run()
{
while (!Thread.currentThread().isInterrupted())
{
try
{
Integer value = (Integer) queue.take();
List<Float> result = calculator.calcAllResults(value);
/* ... Do something with result ...*/
} catch (InterruptedException e)
{
e.printStackTrace();
}
}
}
});
th.start(); // start() returns void, so its result cannot be assigned to th
}
public void addValue(Integer value)
{
queue.add(value);
}
/* ... */
}
The calculation itself in calcAllResults(int value) is as follows: The calculator object has a (interchangeable by a config file, not fixed) list of "smaller calculators" that calculate exactly one Float in the result list. Now, the calculator object first gathers some data and then just runs all the small calculators in a loop:
public class Calculator
{
/* ... */
private DataGatherer1 dataGatherer1;
public List<Float> calcAllResults(int value)
{
List<Float> result = new ArrayList<Float>(); // List is an interface and cannot be instantiated
DataType1 dt1 = dataGatherer1.getData(value);
for (int i = 0; i < calculators.size(); i++)
{
result.add(calculators.get(i).calcSingleResult(dt1));
}
return result;
}
/* ... */
}
My problem: For the moment there is only one way to calculate the result, but in the future there will be more. That means: Currently, the small calculator objects that run calcSingleResult depend on data of type DataType1 (see listing above), but in the future there might be another set of calculator objects that depend on data of type DataType2. Of course, there must then be another class, DataGatherer2, that gathers data of that new type. To make the code extensible in this way, I thought of two options for designing the package, but neither of them seems satisfactory, for the reasons mentioned below. To make the two approaches clear, I made two UML diagrams:
Design1
Design2
Design 1:
The calculator for calculating the result list has an AbstractDataGatherer object which can be of type DataGatherer1 or DataGatherer2 (depending on the state of the program). As both data types (DataType1 and DataType2) are of type AbstractDataObject, the method calcAllResults() can have all the logic for gathering data and calculation (an advantage of this design (?)), no matter what set of CalcType's is in its calculator list:
public class Calculator
{
/* ... */
private AbstractDataGatherer dataGatherer;
private List<AbstractCalculationType> calculators;
public List<Float> calcAllResults(int value)
{
List<Float> result = new ArrayList<Float>(); // List is an interface and cannot be instantiated
AbstractDataObject dt = dataGatherer.getData(value);
for (int i = 0; i < calculators.size(); i++)
{
result.add(calculators.get(i).calcSingleResult(dt));
}
return result;
}
/* ... */
}
The disadvantage is (in my eyes) that a specific CalcType expects a specific DataType in its calcSingleResult() routine. So AbstractDataObject dt must be cast. For example:
public class CalcType1A extends AbstractCalculationType
{
/* ... */
private DataType1 data;
public float calcSingleResult(AbstractDataObject data)
{
this.data = (DataType1) data;
}
/* ... */
}
Design 2:
The calcAllResults()-method is now inside an object of type AbstractCalculationStrategy. From this abstract class two (in future possibly more) subclasses exist: CalculationStrategy1 and CalculationStrategy2. Both define calcAllResults() alone. For example:
public class CalculationStrategy1 extends AbstractCalculationStrategy
{
/* ... */
private DataGatherer1 dataGatherer1;
private List<AbstractCalculationType1> calculators;
public List<Float> calcAllResults(int value)
{
List<Float> result = new ArrayList<Float>(); // List is an interface and cannot be instantiated
DataType1 dt1 = dataGatherer1.getData(value);
for (int i = 0; i < calculators.size(); i++)
{
result.add(calculators.get(i).calcSingleResult(dt1));
}
return result;
}
/* ... */
}
(The same for CalculationStrategy2, but with a DataGatherer2 object and AbstractCalculationType2 objects instead.)
This is basically the situation from above where only one calculation type existed. The advantage here is that each strategy gathers the specific data type that its calculator objects really want (no cast). The disadvantage is duplicate code: the logic in the method calcAllResults() is the same (or must in the future be implemented in the same way) for both (in the future possibly more) calculation strategies. Additionally, the UML diagram shows a symmetry that I would like to eliminate with OOP means, because OOP is meant to prevent duplicate code / symmetry, right?
So, what is the best design? Or are both options not okay?
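One direction that might avoid both the cast of Design 1 and the duplication of Design 2 is to make the data type a generic type parameter. The sketch below is only an illustration under assumptions: the interfaces DataGatherer<D> and CalculationType<D> and the generic Calculator<D> are hypothetical names, and the DataType1 stand-in with its single field is invented so the example can run.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericCalculatorDemo {

    // Hypothetical interfaces; D is the data type a family of calculators needs.
    interface DataGatherer<D> {
        D getData(int value);
    }

    interface CalculationType<D> {
        float calcSingleResult(D data);
    }

    // One Calculator for every data type: the loop is written once, without a cast.
    static final class Calculator<D> {
        private final DataGatherer<D> dataGatherer;
        private final List<CalculationType<D>> calculators;

        Calculator(DataGatherer<D> dataGatherer, List<CalculationType<D>> calculators) {
            this.dataGatherer = dataGatherer;
            this.calculators = calculators;
        }

        List<Float> calcAllResults(int value) {
            D data = dataGatherer.getData(value);
            List<Float> result = new ArrayList<>();
            for (CalculationType<D> calculator : calculators) {
                result.add(calculator.calcSingleResult(data));
            }
            return result;
        }
    }

    // Stand-in for DataType1; its single field is invented for the example.
    static final class DataType1 {
        final int base;

        DataType1(int base) {
            this.base = base;
        }
    }

    public static void main(String[] args) {
        List<CalculationType<DataType1>> calcs = new ArrayList<>();
        calcs.add(d -> d.base * 2f);
        calcs.add(d -> d.base + 0.5f);
        Calculator<DataType1> calculator = new Calculator<>(DataType1::new, calcs);
        System.out.println(calculator.calcAllResults(3)); // prints [6.0, 3.5]
    }
}
```

The compiler then checks that a gatherer and its small calculators agree on D, while Process could hold the result as Calculator<?> and still call calcAllResults(value). The code that builds the calculator list from the config file still has to choose a matching pair, so the type knowledge moves to the wiring code rather than disappearing.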
I have an ArrayList of objects, ArrayList<Product> productDatabase. Each object contains a String and a double, and these objects are added to the productDatabase by addProductToDatabase() as follows:
public void addProductToDatabase(String productName, double dimensions) {
Product newProduct = new Product(productName, dimensions);
productDatabase.add(newProduct);
}
I also want to make an ArrayList<ProductCount> productInventory which counts how many Products are accounted for. Before adding to ArrayList<ProductCount> productInventory, however, addProductToInventory() should first check whether the product exists in the productDatabase:
public Product getProduct(String name) {
for (int i = 0; i < productDatabase.size(); i++)
if (productDatabase.get(i).contains(name)) //Error: cannot find symbol- method contains.(java.lang.String)
return productDatabase.get(i);
return null; // no product with that name
}
public void addProductToInventory(String productName, double quantity)
{
Product p = getProduct(productName);
productInventory.add(new ProductCount(p, quantity));
}
Assume that you always have different objects (so no two will have the same name), but you're always unsure of the dimensions (so when you input the same productName + dimensions, you edit the stored dimensions).
At the end of the day, you have to put all the items in a large box and report what you've inventoried, so you also have getProductQuantityTotal() and getProductDimensionTotal(): as the names suggest, they return the total number of objects you've counted and the sum of the dimensions.
What do I have to add/change/remove in this code? Don't worry about syntax at first (BlueJ checks for common syntax errors, and I just typed this by hand). I'm sure that I'm missing a for statement somewhere, and I'm probably misusing contains() because it isn't recognised (I have import java.util.*; and import java.util.ArrayList;).
To answer the question in your post title: How to find a string in an object, for a list of those objects, here is some sample code that does this:
First, I created a trivial object that has a string field:
class ObjectWithStringField {
private final String s;
public ObjectWithStringField(String s) {
this.s = s;
}
public String getString() {
return s;
}
}
And then some code that populates a list of these objects and searches each one for the string. There's no magic here; it just iterates through the list until a match is found.
import java.util.List;
import java.util.Arrays;
/**
<P>{@code java StringInObjectInList}</P>
**/
public class StringInObjectInList {
public static final void main(String[] ignored) {
ObjectWithStringField[] owStrArr = new ObjectWithStringField[] {
new ObjectWithStringField("abc"),
new ObjectWithStringField("def"),
new ObjectWithStringField("ghi")};
//Yes this is a List instead of an ArrayList, but you can easily
//change this to work with an ArrayList. I'll leave that to you :)
List<ObjectWithStringField> objWStrList = Arrays.asList(owStrArr);
System.out.println("abc? " + doesStringInObjExistInList("abc", objWStrList));
System.out.println("abcd? " + doesStringInObjExistInList("abcd", objWStrList));
}
private static final boolean doesStringInObjExistInList(String str_toFind, List<ObjectWithStringField> owStrList_toSearch) {
for(ObjectWithStringField owStr : owStrList_toSearch) {
if(owStr.getString().equals(str_toFind)) {
return true;
}
}
return false;
}
}
Output:
[C:\java_code\]java StringInObjectInList
abc? true
abcd? false
In the real world, instead of a List, I'd use a Map<String,ObjectWithStringField>, where the key is that field. Then it'd be as simple as themap.containsKey("abc");. But here it is implemented as you require. You'll still have quite a bit of work to do, to get this working as specifically required by your assignment, but it should get you off to a good start. Good luck!
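To make the Map idea concrete, here is a rough sketch applied to the product database from the question. The Product stand-in and the class name ProductLookupDemo are assumptions (the asker's real classes are not shown), and it covers only the lookup part, not the inventory totals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProductLookupDemo {

    // Hypothetical stand-in for the asker's Product class.
    static final class Product {
        final String name;
        final double dimensions;

        Product(String name, double dimensions) {
            this.name = name;
            this.dimensions = dimensions;
        }
    }

    // Keyed by product name, so a lookup needs no loop.
    private final Map<String, Product> productDatabase = new LinkedHashMap<>();

    // Adding a name that already exists replaces the stored dimensions.
    void addProductToDatabase(String productName, double dimensions) {
        productDatabase.put(productName, new Product(productName, dimensions));
    }

    boolean containsProduct(String name) {
        return productDatabase.containsKey(name);
    }

    // Returns null when no product with that name exists.
    Product getProduct(String name) {
        return productDatabase.get(name);
    }

    public static void main(String[] args) {
        ProductLookupDemo db = new ProductLookupDemo();
        db.addProductToDatabase("box", 2.0);
        db.addProductToDatabase("box", 3.0); // same name: dimensions are updated
        System.out.println(db.containsProduct("box"));    // prints true
        System.out.println(db.getProduct("box").dimensions); // prints 3.0
        System.out.println(db.containsProduct("crate"));  // prints false
    }
}
```

Because put() overwrites the entry for an existing key, the "same name means edit the dimensions" rule from the question falls out of the data structure without extra code.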