I have a Singleton class which connects to Cassandra. I want to initialize processMetadata, procMetadata and topicMetadata all at once, not one by one. If they are initialized all at once, I will see consistent values across all three rather than a mix of old and new values.
In the code below, processMetadata, procMetadata and topicMetadata are initialized for the first time inside the initializeMetadata method and are then updated every 15 minutes.
public class CassUtil {
private static final Logger LOGGER = Logger.getInstance(CassUtil.class);
private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
// below are my three metadata lists, which I need to update all at once, not one by one
private List<ProcessMetadata> processMetadata = new ArrayList<>();
private List<ProcMetadata> procMetadata = new ArrayList<>();
private List<String> topicMetadata = new ArrayList<>();
private Session session;
private Cluster cluster;
private static class Holder {
private static final CassUtil INSTANCE = new CassUtil();
}
public static CassUtil getInstance() {
return Holder.INSTANCE;
}
private CassUtil() {
List<String> servers = TestUtils.HOSTNAMES;
String username = TestUtils.USERNAME;
String password = TestUtils.PASSWORD;
PoolingOptions opts = new PoolingOptions();
opts.setCoreConnectionsPerHost(HostDistance.LOCAL,
opts.getCoreConnectionsPerHost(HostDistance.LOCAL));
Builder builder = Cluster.builder();
cluster =
builder
.addContactPoints(servers.toArray(new String[servers.size()]))
.withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
.withPoolingOptions(opts)
.withReconnectionPolicy(new ConstantReconnectionPolicy(100L))
.withLoadBalancingPolicy(
DCAwareRoundRobinPolicy
.builder()
.withLocalDc(
!TestUtils.isProduction() ? "DC2" : TestUtils.getCurrentLocation()
.get().name().toLowerCase()).build())
.withCredentials(username, password).build();
try {
session = cluster.connect("testkeyspace");
} catch (NoHostAvailableException ex) {
LOGGER.logError("error= ", ExceptionUtils.getStackTrace(ex));
} catch (Exception ex) {
LOGGER.logError("error= " + ExceptionUtils.getStackTrace(ex));
}
}
// start a background thread which runs every 15 minutes
public void startScheduleTask() {
scheduler.scheduleAtFixedRate(new Runnable() {
public void run() {
try {
processMetadata = processMetadata(true);
topicMetadata = listOfTopic(TestUtils.GROUP_ID);
procMetadata = procMetadata();
} catch (Exception ex) {
LOGGER.logError("error= ", ExceptionUtils.getStackTrace(ex));
}
}
}, 0, 15, TimeUnit.MINUTES);
}
// called from main thread to initialize the metadata
// and start the background thread where it gets updated
// every 15 minutes
public void initializeMetadata() {
processMetadata = processMetadata(true);
topicMetadata = listOfTopic(TestUtils.GROUP_ID);
procMetadata = procMetadata();
startScheduleTask();
}
private List<String> listOfTopic(final String consumerName) {
List<String> listOfTopics = new ArrayList<>();
String sql = "select topics from topic_metadata where id=1 and consumerName=?";
try {
// get data from cassandra
} catch (Exception ex) {
LOGGER.logError("error= ", ExceptionUtils.getStackTrace(ex), ", Consumer Name= ",
consumerName);
}
return listOfTopics;
}
private List<ProcessMetadata> processMetadata(final boolean flag) {
List<ProcessMetadata> metadatas = new ArrayList<>();
String sql = "select * from process_metadata where id=1 and is_active=?";
try {
// get data from cassandra
} catch (Exception ex) {
LOGGER.logError("error= ", ExceptionUtils.getStackTrace(ex), ", active= ", flag);
}
return metadatas;
}
private List<ProcMetadata> procMetadata() {
List<ProcMetadata> metadatas = new ArrayList<>();
String sql = "select * from schema where id=1";
try {
// get data from cassandra
} catch (SchemaParseException ex) {
LOGGER.logError("schema parsing error= ", ExceptionUtils.getStackTrace(ex));
} catch (Exception ex) {
LOGGER.logError("error= ", ExceptionUtils.getStackTrace(ex));
}
return metadatas;
}
public List<ProcessMetadata> getProcessMetadata() {
return processMetadata;
}
public List<String> getTopicMetadata() {
return topicMetadata;
}
public List<ProcMetadata> getProcMetadata() {
return procMetadata;
}
}
So from my main thread, I call the initializeMetadata method only once; it initializes those three metadata lists and then starts a background thread which updates them every 15 minutes. After that I was using them like below from multiple threads:
CassUtil.getInstance().getProcessMetadata();
CassUtil.getInstance().getTopicMetadata();
CassUtil.getInstance().getProcMetadata();
Problem Statement:
Now I want to see the same state for processMetadata, topicMetadata and procMetadata. That is, these three should be updated at the same time, not one after the other, because I don't want to see a mixed state when I call the getters.
How can I avoid this issue? Do I need to create another class which holds these three metadata lists as constructor parameters?
An efficient way to keep the three lists consistent is to hold them in a single immutable class. Your class then keeps one field of that type, declared volatile so that all threads see the latest update to it.
Here is, for example, an immutable class that holds the state of the lists (it could be an ordinary top-level class, but since it is implementation-specific it can be a static nested class):
private static class State {
private final List<ProcessMetadata> processMetadata;
private final List<ProcMetadata> procMetadata;
private final List<String> topicMetadata;
public State(final List<ProcessMetadata> processMetadata,
final List<ProcMetadata> procMetadata, final List<String> topicMetadata) {
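// defensive copies wrapped read-only (java.util.Collections), so callers cannot mutate the shared state
this.processMetadata = Collections.unmodifiableList(new ArrayList<>(processMetadata));
this.procMetadata = Collections.unmodifiableList(new ArrayList<>(procMetadata));
this.topicMetadata = Collections.unmodifiableList(new ArrayList<>(topicMetadata));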
}
// Getters
}
Then your class would look something like this:
public class CassUtil {
...
private volatile State state = new State(
new ArrayList<>(), new ArrayList<>(), new ArrayList<>()
);
...
public void startScheduleTask() {
...
// argument order must match the State constructor: process, proc, topic
this.state = new State(
processMetadata(true), procMetadata(),
listOfTopic(TestUtils.GROUP_ID)
);
...
}
...
public void initializeMetadata() {
// argument order must match the State constructor: process, proc, topic
this.state = new State(
processMetadata(true), procMetadata(), listOfTopic(TestUtils.GROUP_ID)
);
startScheduleTask();
}
...
public List<ProcessMetadata> getProcessMetadata() {
return this.state.getProcessMetadata();
}
public List<String> getTopicMetadata() {
return this.state.getTopicMetadata();
}
public List<ProcMetadata> getProcMetadata() {
return this.state.getProcMetadata();
}
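One caveat with three separate getters: each call reads the volatile field on its own, so a caller that invokes all three in a row can still straddle a refresh. Below is a minimal sketch of reading the snapshot once instead. It assumes a hypothetical `getState()` accessor and a `SnapshotDemo` class standing in for `CassUtil`, with plain `String` lists in place of the real metadata types; none of these names come from the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for CassUtil; String lists replace the real metadata types.
public class SnapshotDemo {

    // Immutable holder: each list is copied once and exposed read-only.
    public static final class State {
        private final List<String> processMetadata;
        private final List<String> procMetadata;
        private final List<String> topicMetadata;

        public State(List<String> processMetadata, List<String> procMetadata,
                     List<String> topicMetadata) {
            this.processMetadata = Collections.unmodifiableList(new ArrayList<>(processMetadata));
            this.procMetadata = Collections.unmodifiableList(new ArrayList<>(procMetadata));
            this.topicMetadata = Collections.unmodifiableList(new ArrayList<>(topicMetadata));
        }

        public List<String> getProcessMetadata() { return processMetadata; }
        public List<String> getProcMetadata() { return procMetadata; }
        public List<String> getTopicMetadata() { return topicMetadata; }
    }

    private volatile State state = new State(
            Collections.<String>emptyList(), Collections.<String>emptyList(),
            Collections.<String>emptyList());

    // Publishes all three lists with a single volatile write.
    public void refresh(List<String> process, List<String> proc, List<String> topics) {
        this.state = new State(process, proc, topics);
    }

    // Assumed accessor (not in the original answer): one read, one consistent snapshot.
    public State getState() {
        return state;
    }

    public static void main(String[] args) {
        SnapshotDemo demo = new SnapshotDemo();
        demo.refresh(Arrays.asList("process-v1"), Arrays.asList("proc-v1"), Arrays.asList("topic-v1"));

        State snapshot = demo.getState(); // read the volatile field once
        demo.refresh(Arrays.asList("process-v2"), Arrays.asList("proc-v2"), Arrays.asList("topic-v2"));

        // The snapshot taken before the refresh still holds only v1 values.
        System.out.println(snapshot.getProcessMetadata() + " " + snapshot.getProcMetadata()
                + " " + snapshot.getTopicMetadata());
    }
}
```

Callers that need more than one list would hold on to the returned `State` and read every list from that same object, rather than going back to the singleton for each one.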
I am using a fork/join pool in Java for multitasking. I have come across a situation where, for every task, I need to hit a URL, wait for 10 minutes, and then hit another URL to read the data. The problem is that during those 10 minutes my CPU is idle and does not start any other tasks (beyond the number defined for the fork/join pool).
static ForkJoinPool pool = new ForkJoinPool(10);
public static void main(String[] args){
List<String> list = new ArrayList<>();
for(int i=1; i<=100; i++){
list.add("Str"+i);
}
final Tasker task = new Tasker(list);
pool.invoke(task);
}
public class Tasker extends RecursiveAction{
private static final long serialVersionUID = 1L;
List<String> myList;
public Tasker(List<String> checkersList) {
super();
this.myList = checkersList;
}
@Override
protected void compute() {
if(myList.size()==1){
System.out.println(myList.get(0) + "start");
//Date start = new Date();
try {
Thread.sleep(10*60*1000);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.out.println(myList.get(0) + "Finished");
}
else{
List<String> temp = new ArrayList<>();
temp.add( myList.get( myList.size()-1 ) );
myList.remove( myList.size()-1 );
Tasker left = new Tasker(myList);
Tasker right = new Tasker(temp);
left.fork();
right.compute();
left.join();
}
}
}
Now what should I do so that the CPU picks up all the tasks and then waits on them in parallel?
Unfortunately, ForkJoinPool does not work well in the face of Thread.sleep(), because it is designed for many short tasks that finish quickly, rather than tasks that block for a long time.
Instead, for what you are trying to accomplish, I would recommend using ScheduledThreadPoolExecutor and dividing your task into two parts.
import java.util.*;
import java.util.concurrent.*;
public class Main {
static ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(10);
public static void main(String[] args){
for(int i=1; i<=100; i++){
pool.schedule(new FirstHalf("Str"+i), 0, TimeUnit.NANOSECONDS);
}
}
static class FirstHalf implements Runnable {
String name;
public FirstHalf(String name) {
this.name = name;
}
public void run() {
System.out.println(name + "start");
pool.schedule(new SecondHalf(name), 10, TimeUnit.MINUTES);
}
}
static class SecondHalf implements Runnable {
String name;
public SecondHalf(String name) {
this.name = name;
}
public void run() {
System.out.println(name + "Finished");
}
}
}
If Java provides a thread pool which allows releasing the underlying resources (that is, the kernel thread participating in the thread pool) during a Thread.sleep(), you should use that instead, but I currently do not know of one.
According to the docs, the basic use section for fork/join says:
if (my portion of the work is small enough)
do the work directly
else
split my work into two pieces
invoke the two pieces and wait for the results
Hopefully this meets your needs if you are using fork/join:
public class Tasker extends RecursiveAction {
static ForkJoinPool pool = new ForkJoinPool(10);
static int threshold = 10;
public static void main(String[] args){
List<String> list = new ArrayList<>();
for(int i=1; i<=100; i++){
list.add("Str"+i);
}
final Tasker task = new Tasker(list);
pool.invoke(task);
}
private static final long serialVersionUID = 1L;
List<String> myList;
public Tasker(List<String> checkersList) {
super();
this.myList = checkersList;
}
void computeDirectly() {
for(String url : myList){
System.out.println(url + " start");
}
//Date start = new Date();
try {
//keep hitting url
while (true) {
for(String url : myList) {
//url hitting code here
System.out.println(url + " hitting");
}
Thread.sleep(10 * 60 * 1000);
}
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
for(String url : myList){
System.out.println(url + " Finished");
}
}
@Override
protected void compute() {
if (myList.size() <= threshold) {
computeDirectly();
return;
}
//temp list have only one url
//List<String> temp = new ArrayList<>();
//temp.add( myList.get( myList.size()-1 ) );
//myList.remove( myList.size()-1 );
//Tasker left = new Tasker(myList);
//Tasker right = new Tasker(temp);
//left.fork();
//right.compute();
//left.join();
List<String> first = new ArrayList<>();
List<String> second = new ArrayList<>();
//divide list
int len = myList.size();
int smHalf = len / 2;//smaller half
first = myList.subList(0, smHalf);
// subList's fromIndex is inclusive, so start at smHalf to avoid dropping an element
second = myList.subList(smHalf, len);
invokeAll(new Tasker(first), new Tasker(second));
}
}
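When splitting the work, keep in mind that `List.subList(from, to)` includes `from` and excludes `to`, so the two halves have to share the midpoint index or an element goes missing. A minimal sketch of that split in isolation follows; the `SplitDemo` class and `halves` helper are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitDemo {

    // Splits a list into two halves that together cover every element exactly once.
    static <T> List<List<T>> halves(List<T> items) {
        int mid = items.size() / 2;
        List<List<T>> parts = new ArrayList<>();
        // Copy the views so each half is independent of the source list.
        parts.add(new ArrayList<>(items.subList(0, mid)));            // indices [0, mid)
        parts.add(new ArrayList<>(items.subList(mid, items.size()))); // indices [mid, size)
        return parts;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("Str1", "Str2", "Str3", "Str4", "Str5");
        List<List<String>> parts = halves(urls);
        System.out.println(parts.get(0)); // [Str1, Str2]
        System.out.println(parts.get(1)); // [Str3, Str4, Str5]
    }
}
```

Copying each `subList` view is optional here; it just avoids surprises if the source list is modified while the subtasks run.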
A little bit of context: the client is sending to the server a SOSPFPacket object (via TCP) that has various attributes, such as a Vector<LSA> lsaArray. The LSA itself has a LinkedList<LinkDescription> links attribute. In my test case, there are two messages being sent. In both messages, there is only one LSA in the vector. In the first message, the LSA has one LinkDescription, in the second, it has two. When I send a message, I increment the messageId.
The server receives both messages with proper ids, but in the second message, the links only contain one link instead of two. I'm clueless...
Here are the object implementations:
import java.io.*;
import java.util.Vector;
public class SOSPFPacket implements Serializable {
public final static short HELLO = 0;
public final static short LSU = 1;
public final static short OVER_BURDENED = 2;
public static int id = Integer.MIN_VALUE;
public String srcProcessIP;
public short srcProcessPort;
public String srcIP;
public String dstIP;
public short sospfType; //0 - HELLO, 1 - LinkState Update, 2 - Over Burdened
public String routerID;
public int messageId = id++;
public String neighborID; //neighbor's simulated IP address
public Vector<LSA> lsaArray = new Vector<>();
public String lsaInitiator = null;
}
import java.io.Serializable;
import java.util.LinkedList;
public class LSA implements Serializable {
public String linkStateID;
public int lsaSeqNumber = Integer.MIN_VALUE;
public LinkedList<LinkDescription> links = new LinkedList<LinkDescription>();
@Override
public String toString() {
StringBuffer sb = new StringBuffer();
sb.append(linkStateID + ":").append(lsaSeqNumber + "\n");
for (LinkDescription ld : links) {
sb.append(ld);
}
sb.append("\n");
return sb.toString();
}
}
import java.io.Serializable;
public class LinkDescription implements Serializable {
public String linkID;
public int portNum;
public int tosMetrics;
public LinkDescription() {}
public LinkDescription(String linkID, int portNum, int tosMetrics) {
this.linkID = linkID;
this.portNum = portNum;
this.tosMetrics = tosMetrics;
}
public String toString() {
return linkID + "," + portNum + "," + tosMetrics;
}
}
To send the message, I do it via a Client.java thread implementing Runnable. Here are the relevant methods:
public void run() {
try {
_outputStream = new ObjectOutputStream(_clientSocket.getOutputStream());
sendMessage(SOSPFPacket.HELLO);
_inputStream = new ObjectInputStream(_clientSocket.getInputStream());
SOSPFPacket message = Util.receiveMessage(_inputStream);
if (message.sospfType == SOSPFPacket.OVER_BURDENED) {
System.out.println("Removing link with router " + message.srcIP + "...");
_router.removeLink(_remoteRouterIP);
return;
}
_remoteRouterDescription.setStatus(RouterStatus.TWO_WAY);
_router.addLinkDescriptionToDatabase(_remoteRouterDescription, _link.getWeight());
sendMessage(SOSPFPacket.HELLO);
message = Util.receiveMessage(_inputStream);
if (message.sospfType == SOSPFPacket.LSU) {
_router.synchronize(message.lsaArray);
}
_router.propagateSynchronization(message.lsaInitiator, message.srcIP);
} catch (IOException e) {
e.printStackTrace();
}
}
private void sendMessage(short messageType) {
try {
SOSPFPacket message = Util.makeMessage(_rd, _remoteRouterDescription, messageType, _router);
_outputStream.writeObject(message);
_outputStream.flush();
} catch (IOException e) {
e.printStackTrace();
}
}
public class Util {
public static SOSPFPacket makeMessage(RouterDescription local, RouterDescription external, short messageType, Router rd) {
SOSPFPacket message = new SOSPFPacket();
message.srcProcessIP = local.getProcessIPAddress();
message.srcProcessPort = local.getProcessPortNumber();
message.srcIP = local.getSimulatedIPAddress();
message.dstIP = external.getSimulatedIPAddress();
message.sospfType = messageType;
message.routerID = local.getSimulatedIPAddress();
message.neighborID = external.getSimulatedIPAddress();
rd.getLsd().getStore().forEach((k, v) -> message.lsaArray.addElement(v));
message.lsaInitiator = messageType == SOSPFPacket.LSU ? message.srcIP : null;
return message;
}
public static SOSPFPacket receiveMessage(ObjectInputStream inputStream) {
SOSPFPacket receivedMessage = null;
try {
receivedMessage = (SOSPFPacket) inputStream.readObject();
String messageType;
switch (receivedMessage.sospfType) {
case SOSPFPacket.HELLO:
messageType = "HELLO";
break;
case SOSPFPacket.LSU:
messageType = "LINKSTATEUPDATE";
break;
case SOSPFPacket.OVER_BURDENED:
messageType = "OVER_BURDENED";
break;
default:
messageType = "UNKNOWN_STATE";
break;
}
System.out.println("received " + messageType + " from " + receivedMessage.srcIP + ";");
} catch (ClassNotFoundException e) {
System.out.println("No message received.");
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return receivedMessage;
}
}
And the server instantiates a private ClientServiceThread when it receives a new connection, which is in charge of receiving the message.
private class ClientServiceThread implements Runnable {
Socket _clientSocket;
Thread _runner;
ClientServiceThread(Socket s) {
_clientSocket = s;
_runner = new Thread(this);
}
public Thread getRunner() { return _runner; }
public void run() {
ObjectInputStream inputStream = null;
ObjectOutputStream outputStream = null;
try {
inputStream = new ObjectInputStream(_clientSocket.getInputStream());
outputStream = new ObjectOutputStream(_clientSocket.getOutputStream());
while (true) {
try {
SOSPFPacket receivedMessage = Util.receiveMessage(inputStream);
//some logic not relevant since the receivedMessage is already not correct
}
}
}
}
}
Again, all SOSPFPacket fields are correctly received, except for the Vector<LSA> lsaArray...
Edit: I also tried sending a third sendMessage(SOSPFPacket.HELLO) after _router.propagateSynchronization(message.lsaInitiator, message.srcIP);. This time, the message being sent contains two LSA, the first one having two LinkDescription, the second one having one. Both LSA are received by the server, but still, only the first LinkDescription is received in the first LSA. The message id is correct in all three messages.
If I run everything a second time (i.e. I create a new Client and a new ClientService Thread for the already running routers), only then does the server finally receive two LinkDescription in the first LSA.
Java sends references to objects that have already been serialized, to preserve the integrity of object graphs.
You should call ObjectOutputStream.reset() after each writeObject().
Or use ObjectOutputStream.writeUnshared(), but note that it still shares referenced objects, i.e. if you try to send a list with both added and changed element objects, it will send the new list and new element objects, but not the element objects which have been changed.
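To see the back-reference behaviour in isolation, here is a minimal sketch that writes the same list twice to one ObjectOutputStream, adding an element in between, and reads both copies back. The `ResetDemo` class and `secondReadSize` helper are hypothetical names, and an in-memory byte array stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class ResetDemo {

    // Writes one list twice, mutating it between the writes, and returns
    // the size of the second list as seen by the reader.
    static int secondReadSize(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ArrayList<String> links = new ArrayList<>();
            links.add("link-1");

            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(links);
                links.add("link-2"); // change the object after it was first written
                if (resetBetweenWrites) {
                    out.reset();     // forget previously written objects
                }
                out.writeObject(links);
            }

            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();                            // first copy
                List<?> second = (List<?>) in.readObject(); // second copy or back-reference
                return second.size();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondReadSize(false)); // 1: the stale first copy is reused
        System.out.println(secondReadSize(true));  // 2: the updated list is written in full
    }
}
```

The same thing happens in the question: each SOSPFPacket is a new object, but the LSA instances inside it were already written once, so the stream sends a reference to the old copy instead of the updated links.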
Finally figured it out. Somehow it seems like the problem was the following line of code in Util.makeMessage: rd.getLsd().getStore().forEach((k, v) -> message.lsaArray.addElement(v));. I replaced it with rd.getLsd().getStore().forEach((k, v) -> message.lsaArray.add(new LSA(v))); with the following LSA constructor:
public LSA(LSA lsa) {
linkStateID = lsa.linkStateID;
lsaSeqNumber = lsa.lsaSeqNumber;
links = new LinkedList<>();
for (LinkDescription ld : lsa.links) {
LinkDescription linkD = new LinkDescription();
linkD.linkID = ld.linkID;
linkD.portNum = ld.portNum;
linkD.tosMetrics = ld.tosMetrics;
links.add(linkD);
}
}
In other words, I needed to deep copy the object contained in my message.
I have to write some dao tests for project where I want to:
create DDL schema from database (MySQL);
create tables in another test database in memory (H2);
insert some data to database;
select the just inserted item;
check some data from this item.
This is my test:
public class BridgeDBTest {
private static String JDBC_DRIVER;
private static String JDBC_URL;
private static String USER;
private static String PSWD;
private static final Logger logger = LoggerFactory.getLogger(BridgeDBTest.class);
@BeforeGroups(groups = "bridgeDB")
public void init(){
try {
JDBC_DRIVER = org.h2.Driver.class.getName();
JDBC_URL = "jdbc:h2:mem:test;DB_CLOSE_DELAY=-1";
USER = "root";
PSWD = "";
new HibernateTestUtil().setDialect("org.hibernate.dialect.HSQLDialect")
.translateCreateDllToOutputStream(new FileOutputStream(new File("src/test/resources/createSchema.sql")));
RunScript.execute(JDBC_URL, USER, PSWD, "src/test/resources/createSchema.sql", Charset.forName("UTF8"), false);
insertDataset(readDataSet());
}
catch (Exception expt) {
expt.printStackTrace();
logger.error("!!!" + expt);
throw new RuntimeException(expt.getMessage());
}
}
@Test(groups = "bridgeDB")
public void getItem(){
BridgeDAOImpl dao = new BridgeDAOImpl();
dao.setSessionFactory(new HibernateTestUtil().getSessionFactory());
try {
Bridge bridge = dao.get(1L);
assert(bridge.getName().equals("TEST-CN-DEVBOX01"));
} catch (ServiceException e) {
e.printStackTrace();
}
}
@AfterGroups(groups = "bridgeDB")
public void dropTables(){
try {
new HibernateTestUtil().setDialect("org.hibernate.dialect.HSQLDialect")
.translateDropDllToOutputStream(new FileOutputStream(new File("src/test/resources/dropSchema.sql")));
}
catch (Exception expt) {
expt.printStackTrace();
logger.error("!!!" + expt);
throw new RuntimeException(expt.getMessage());
}
}
private IDataSet readDataSet() throws Exception{
return new FlatXmlDataSetBuilder().build(new File("src/test/resources/datasetForTest.xml"));
}
private void insertDataset(IDataSet dataSet) throws Exception{
IDatabaseTester databaseTester = new JdbcDatabaseTester(JDBC_DRIVER, JDBC_URL, USER, PSWD);
databaseTester.setSetUpOperation(DatabaseOperation.CLEAN_INSERT);
databaseTester.setDataSet(dataSet);
databaseTester.onSetup();
}
}
BridgeDAOImpl uses the class HibernateUtil from src/main/..., but I need to use my class HibernateTestUtil from src/test/.... It is a modified HibernateUtil fitted for my tests (there I set the parameters for the Configuration class).
BridgeDAOImpl (see the fifth line in the try block):
public class BridgeDAOImpl extends GenericDAOImpl<Bridge, Long> implements BridgeDAO {
//...
public SearchResult<Bridge> list(int from, int limit, String filter, String order, Long authId) throws ServiceException {
SearchResult<Bridge> results = null;
Search search = new Search(Bridge.class);
Session session = getSessionFactory().getCurrentSession();
Transaction transaction = null;
try {
transaction = session.beginTransaction();
search.setFirstResult(from);
search.setMaxResults(limit);
HibernateUtil.buildSearch(filter, order, search, aliases);
results = searchAndCount(search);
transaction.commit();
}
catch (Exception expt) {
logger.error("!!!", expt);
if (transaction != null) {
transaction.rollback();
}
throw new ServiceException(expt.getMessage());
}
finally {
// session.close();
}
return results;
}
//...
}
How can I test my DAO without modifying it?
This is my first posting here, hope I won't seem too desperate with my question.
I have a work task which involves comparing two large sets of names to see whether any match exists between them (regardless of the order of words in the names).
I've tried both a regular, more straightforward approach and also one using Regex.
Standard approach:
public static boolean isMatch(String terroristName, String clientName) {
String[] terroristArray = terroristName.split(" ");
String[] clientArray = clientName.split(" ");
int size = clientArray.length;
int ctrl = 0;
boolean alreadyFound = false;
for (String client : clientArray) {
for (String terrorist : terroristArray) {
//if already found a match, stop comparing with rest of the words from terrorist name
if (!alreadyFound)
if (client.compareTo(terrorist) == 0) {
alreadyFound = true;
ctrl++;
break;
}
}
alreadyFound = false;
if (ctrl == 0 && !alreadyFound) {
//if first word of client is not found in whole terrorist name
//then exit loop, no match possible
break;
}
}
if (ctrl == size)
return true;
else
return false;
}
Regex approach:
public static boolean isRegexMatch(String terroristName, String clientName) {
boolean result = false;
String[] clientNameArray = clientName.split(" ");
String myPattern = "^";
//build pattern using client name
for (String cname : clientNameArray) {
myPattern += "(?=.*\\b" + cname + "\\b)";
}
myPattern += ".*$";
Pattern pattern = Pattern.compile(myPattern);
Matcher matcher = pattern.matcher(terroristName);
// check all occurance
while (matcher.find()) {
result = true;
}
return result;
}
Loop comparing the 2 lists of names:
for (Person terrorist : terrorists) {
System.setOut(matchPrintStream);
for (Person client : clients) {
if (Util.isRegexMatch(terrorist.getNoDuplicatesName(), client.getName())) {
System.out.println(client.getId() + ";" + client.getName() + ";" + terrorist.getId() + ";" +
terrorist.getName());
}
}
}
The two sets have the following sizes:
terrorists = approx. 16000
clients = approx. 3.4 million
Runtime of both methods is quite slow:
ps -ef | grep 42810
42810 41919 99 17:47 pts/0 00:52:23 java -Xms1024M -Xmx1024M -classpath ojdbc6.jar:TerroristBuster.jar ro.btrl.mis.mihai.Main
By the 00:52:23 runtime shown above, it had processed about 170 entries, meaning it would need several days to complete. I know the complexity is high, but I am unsure how to lower it. Would it help to use something other than a List? I assumed foreach would be fastest because of the random access.
Can this code be improved or changed in any way to reduce the runtime, or am I just dealing with too large a data set?
If you can use Java 8 this should be very easy to parallelise.
First, you don't have that many clients, so preprocess those:
final Collection<Collection<String>> processedClients = clients.parallelStream().
map(c -> c.split("\\s+")).
map(Arrays::asList).
collect(toList());
This takes each client name, splits it into the parts, and then uses the asList wrapper to turn it into a List. This is done parallelised, so should be fast.
Next we need to loop over all the terrorists:
terrorists.parallelStream().
map(t -> t.split("\\s+")).
map(t -> Stream.of(t).collect(toSet())).
forEach(t -> {
processedClients.parallelStream().forEach(c -> {
if (t.containsAll(c)) {
System.out.println("Match found t:" + t + ", c:" + c);
}
});
});
Here, for each terrorist, we split their name, but this time we turn it into a Set because Set has O(1) contains() - this means checking whether a whole client name is contained in a whole terrorist name will only take time proportional to the size of the client name.
We then use forEach to loop over the terrorists and another forEach to loop over the clients, checking whether the terrorist's name Set containsAll of the client name's parts.
Again this is in parallel.
In theory it shouldn't take long at all. Storing the processed client names in memory might require a bit of RAM, but it shouldn't be too much - about 1GB.
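As a small standalone illustration of the matching rule itself, the sketch below builds a Set from the words of one name and checks it with containsAll against the words of the other, so word order does not matter. The `NameMatchDemo` class and `matches` helper are hypothetical names, and the sample names are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NameMatchDemo {

    // True when every word of the client name appears in the listed name,
    // regardless of word order.
    static boolean matches(String listedName, String clientName) {
        Set<String> listedWords = new HashSet<>(Arrays.asList(listedName.trim().split("\\s+")));
        List<String> clientWords = Arrays.asList(clientName.trim().split("\\s+"));
        // HashSet lookups are O(1) on average, so this is linear in the client name length.
        return listedWords.containsAll(clientWords);
    }

    public static void main(String[] args) {
        System.out.println(matches("John Michael Smith", "Smith John")); // true
        System.out.println(matches("John Smith", "John Doe"));           // false
    }
}
```

Note that this is an exact, case-sensitive word comparison, like compareTo in the question; any normalisation (case, accents, punctuation) would have to happen before the split.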
EDIT
Here is a rewrite to an earlier version (1.7, but if you remove the diamond notation it should work on 1.5)
First you need two processing classes, these are submitted to individual work threads:
final class NameProcessor implements Callable<Collection<String>> {
private final String name;
public NameProcessor(final String name) {
this.name = name;
}
@Override
public Collection<String> call() throws Exception {
return Arrays.asList(name.split("\\s+"));
}
}
final class TerroristProcessor implements Runnable {
private final String name;
public TerroristProcessor(final String name) {
this.name = name;
}
@Override
public void run() {
final Set<String> splitName = new HashSet<>(Arrays.asList(name.split("\\s+")));
for (final Collection<String> client : proccessedClients) {
if (splitName.containsAll(client)) {
System.out.println("Match found t:" + name + ", c:" + client);
}
}
}
}
Now you need an ExecutorService and an ExecutorCompletionService:
final ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
final ExecutorCompletionService<Collection<String>> cs = new ExecutorCompletionService<>(es);
Now you first need to process your clients, as before:
for (final String name : clients) {
cs.submit(new NameProcessor(name));
}
final Collection<Collection<String>> proccessedClients = new LinkedList<>();
for (int i = 0; i < clients.size(); ++i) {
try {
proccessedClients.add(cs.take().get());
} catch (InterruptedException ex) {
return;
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
And then process the terrorists:
final Collection<Future<?>> futures = new LinkedList<>();
for (final String terrorist : terrorists) {
futures.add(es.submit(new TerroristProcessor(terrorist)));
}
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
for (final Future<?> f : futures) {
try {
f.get();
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
The loop over the futures is to check for processing errors.
EDIT
The OP wants to process custom objects rather than collections of String.
I would assume you have some sort of Person class like so:
final class Person {
private final int id;
private final String name;
//constructor
//getters and setters
}
Then you can simply create a wrapper class like so:
final class PersonWrapper {
private final Person person;
private final Collection<String> processedName;
//constructor
//getters and setters
}
And create a result class like so:
final class ProblemClient {
private final Person client;
private final Person terrorist;
//constructor
//getters and setters
}
And simply rewrite the code appropriately:
final class NameProcessor implements Callable<PersonWrapper> {
private final Person person;
public NameProcessor(final Person person) {
this.person = person;
}
@Override
public PersonWrapper call() throws Exception {
return new PersonWrapper(person, Arrays.asList(person.getName().split("\\s+")));
}
}
final ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
final ExecutorCompletionService<PersonWrapper> cs = new ExecutorCompletionService<>(es);
for (final Person client : clients) {
cs.submit(new NameProcessor(client));
}
final Collection<PersonWrapper> proccessedClients = new LinkedList<>();
for (int i = 0; i < clients.size(); ++i) {
try {
proccessedClients.add(cs.take().get());
} catch (InterruptedException ex) {
return;
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
final class TerroristProcessor implements Runnable {
private final Person person;
private final Collection<ProblemClient> results;
public TerroristProcessor(final Person person, final Collection<ProblemClient> results) {
this.person = person;
this.results = results;
}
@Override
public void run() {
final Set<String> splitName = new HashSet<>(Arrays.asList(person.getName().split("\\s+")));
for (final PersonWrapper client : proccessedClients) {
if (splitName.containsAll(client.getProcessedName())) {
results.add(new ProblemClient(client.getPerson(), person));
}
}
}
}
final Collection<ProblemClient> results = new ConcurrentLinkedQueue<>();
final Collection<Future<?>> futures = new LinkedList<>();
for (final Person terrorist : terrorists) {
futures.add(es.submit(new TerroristProcessor(terrorist, results)));
}
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
for (final Future<?> f : futures) {
try {
f.get();
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
//process results
for (final ProblemClient problemClient : results) {
//whatever.
}
As I said, it might be informative to see what the outcome of preprocessing terrorists first and then looping over clients is too:
final class TerroristPreprocessor implements Callable<PersonWrapper> {
private final Person person;
public TerroristPreprocessor(final Person person) {
this.person = person;
}
@Override
public PersonWrapper call() throws Exception {
final Set<String> splitName = new HashSet<>(Arrays.asList(person.getName().split("\\s+")));
return new PersonWrapper(person, splitName);
}
}
final ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
final ExecutorCompletionService<PersonWrapper> cs = new ExecutorCompletionService<>(es);
for (final Person terrorist : terrorists) {
cs.submit(new TerroristPreprocessor(terrorist));
}
final Collection<PersonWrapper> proccessedTerrorists = new LinkedList<>();
for (int i = 0; i < terrorists.size(); ++i) {
try {
proccessedTerrorists.add(cs.take().get());
} catch (InterruptedException ex) {
return;
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
final class ProblemClientFinder implements Runnable {
private final Person client;
private final Collection<ProblemClient> results;
public ProblemClientFinder(final Person client, final Collection<ProblemClient> results) {
this.client = client;
this.results = results;
}
@Override
public void run() {
final Collection<String> splitName = Arrays.asList(client.getName().split("\\s+"));
for (final PersonWrapper terrorist : proccessedTerrorists) {
if (terrorist.getProcessedName().containsAll(splitName)) {
results.add(new ProblemClient(client, terrorist.getPerson()));
}
}
}
}
final Collection<ProblemClient> results = new ConcurrentLinkedQueue<>();
final Collection<Future<?>> futures = new LinkedList<>();
for (final Person client : clients) {
futures.add(es.submit(new ProblemClientFinder(client, results)));
}
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
for (final Future<?> f : futures) {
try {
f.get();
} catch (ExecutionException ex) {
throw new RuntimeException(ex);
}
}
//process results
for (final ProblemClient problemClient : results) {
//whatever.
}
I want to test the MessageProcessor1.listAllKeyword method, which in turn calls HbaseUtil1.getAllKeyword. Initially I had a problem with the static initializer and the constructor, both of which try to initialize an HBase connection. I used PowerMock to suppress the static initializer and the constructor call, and that worked fine.
Even though I mocked HbaseUtil1.getAllKeyword, the actual method is still called. It runs all the HBase code and throws an exception because the HBase server is not up.
EasyMock.expect(hbaseUtil.getAllKeyword("msg", "u1")).andReturn(expectedList);
Please give me any idea of how to avoid the actual method call. I tried many ways, but none of them worked.
public class MessageProcessor1
{
private static Logger logger = Logger.getLogger("MQ-Processor");
private final static String CLASS_NAME = "MessageProcessor";
private static boolean keywordsTableExists = false;
public static PropertiesLoader props;
HbaseUtil1 hbaseUtil;
/**
* Checks whether the table exists in HBase. If it doesn't exist, a new
* table is created. This runs only once, when the class is loaded.
*/
static {
props = new PropertiesLoader();
String[] userTablefamilys = {
props.getProperty(Constants.COLUMN_FAMILY_NAME_COMMON_KEYWORDS),
props.getProperty(Constants.COLUMN_FAMILY_NAME_USER_KEYWORDS) };
keywordsTableExists = new HbaseUtil()
.creatTable(props.getProperty(Constants.HBASE_TABLE_NAME),
userTablefamilys);
}
/**
* This will load a new configuration every time this class is instantiated.
*/
{
props = new PropertiesLoader();
}
public String listAllKeyword(String userId) throws IOException {
HbaseUtil1 util = new HbaseUtil1();
Map<String, List<String>> projKeyMap = new HashMap<String, List<String>>();
//logger.info(CLASS_NAME+": inside listAllKeyword method");
//logger.debug("passed id : "+userId);
List<String> qualifiers = util.getAllKeyword("msg", userId);
List<String> keywords = null;
for (String qualifier : qualifiers) {
String[] token = qualifier.split(":");
if (projKeyMap.containsKey(token[0])) {
projKeyMap.get(token[0]).add(token[1]);
} else {
keywords = new ArrayList<String>();
keywords.add(token[1]);
projKeyMap.put(token[0], keywords);
}
}
List<Project> projects = buildProject(projKeyMap);
Gson gson = new GsonBuilder().excludeFieldsWithoutExposeAnnotation()
.create();
System.out.println("Json projects:::" + gson.toJson(projects));
//logger.debug("list all keyword based on project::::"+ gson.toJson(projects));
//return gson.toJson(projects);
return "raj";
}
private List<Project> buildProject(Map<String, List<String>> projKeyMap) {
List<Project> projects = new ArrayList<Project>();
Project proj = null;
Set<String> keySet = projKeyMap.keySet();
for (String hKey : keySet) {
proj = new Project(hKey, projKeyMap.get(hKey));
projects.add(proj);
}
return projects;
}
//@Autowired
//@Qualifier("hbaseUtil1")
public void setHbaseUtil(HbaseUtil1 hbaseUtil) {
this.hbaseUtil = hbaseUtil;
}
}
public class HbaseUtil1 {
private static Logger logger = Logger.getLogger("MQ-Processor");
private final static String CLASS_NAME = "HbaseUtil";
private static Configuration conf = null;
public HbaseUtil1() {
PropertiesLoader props = new PropertiesLoader();
conf = HBaseConfiguration.create();
conf.set(HConstants.ZOOKEEPER_QUORUM, props
.getProperty(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM));
conf.set(
HConstants.ZOOKEEPER_CLIENT_PORT,
props.getProperty(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENT_PORT));
conf.set("hbase.zookeeper.quorum", props
.getProperty(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM));
conf.set(
"hbase.zookeeper.property.clientPort",
props.getProperty(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENT_PORT));
}
public List<String> getAllKeyword(String tableName, String rowKey)
throws IOException {
List<String> qualifiers = new ArrayList<String>();
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
for (KeyValue kv : rs.raw()) {
System.out.println("KV: " + kv + ", keyword: "
+ Bytes.toString(kv.getRow()) + ", quaifier: "
+ Bytes.toString(kv.getQualifier()) + ", family: "
+ Bytes.toString(kv.getFamily()) + ", value: "
+ Bytes.toString(kv.getValue()));
qualifiers.add(new String(kv.getQualifier()));
}
table.close();
return qualifiers;
}
/**
* Create a table
*
* @param tableName
* name of table to be created.
* @param familys
* Array of the name of column families to be created with table
* @throws IOException
*/
public boolean creatTable(String tableName, String[] familys) {
HBaseAdmin admin = null;
boolean tableCreated = false;
try {
admin = new HBaseAdmin(conf);
if (!admin.tableExists(tableName)) {
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
admin.createTable(tableDesc);
System.out.println("create table " + tableName + " ok.");
}
tableCreated = true;
admin.close();
} catch (MasterNotRunningException e1) {
e1.printStackTrace();
} catch (ZooKeeperConnectionException e1) {
e1.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return tableCreated;
}
}
Below is my Test class.
@RunWith(PowerMockRunner.class)
@PrepareForTest(MessageProcessor1.class)
@SuppressStaticInitializationFor("com.serendio.msg.mqProcessor.MessageProcessor1")
public class MessageProcessorTest1 {
private MessageProcessor1 messageProcessor;
private HbaseUtil1 hbaseUtil;
@Before
public void setUp() {
messageProcessor = new MessageProcessor1();
hbaseUtil = EasyMock.createMock(HbaseUtil1.class);
}
@Test
public void testListAllKeyword(){
List<String> expectedList = new ArrayList<String>();
expectedList.add("raj:abc");
suppress(constructor(HbaseUtil1.class));
//suppress(method(HbaseUtil1.class, "getAllKeyword"));
try {
EasyMock.expect(hbaseUtil.getAllKeyword("msg", "u1")).andReturn(expectedList);
EasyMock.replay();
assertEquals("raj", messageProcessor.listAllKeyword("u1"));
} catch (IOException e) {
e.printStackTrace();
}catch (Exception e) {
e.printStackTrace();
}
}
}
HbaseUtil1 is instantiated inside the listAllKeyword method:
public String listAllKeyword(String userId) throws IOException {
HbaseUtil1 util = new HbaseUtil1();
...
So the mock you create in your test is never used.
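If possible, make the HbaseUtil1 object passable to, or settable on, the MessageProcessor1 class, and then set it in the test class. MessageProcessor1 already has a setHbaseUtil setter and an hbaseUtil field, but listAllKeyword ignores that field and creates its own instance.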
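To illustrate the idea of making the collaborator passable, here is a minimal sketch using plain Java and a hand-written stub instead of a mocking library. The names KeywordSource, KeywordProcessor and StubKeywordSource are hypothetical stand-ins, not classes from the question, and the processor is simplified to return the grouped map rather than a JSON string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seam: the single operation the processor needs from HbaseUtil1.
interface KeywordSource {
    List<String> getAllKeyword(String tableName, String rowKey);
}

// Simplified stand-in for MessageProcessor1. The collaborator is injected
// through the constructor instead of being created with "new" in the method.
class KeywordProcessor {
    private final KeywordSource source;

    KeywordProcessor(KeywordSource source) {
        this.source = source;
    }

    // Groups "project:keyword" qualifiers by project, like listAllKeyword does.
    Map<String, List<String>> keywordsByProject(String userId) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String qualifier : source.getAllKeyword("msg", userId)) {
            String[] token = qualifier.split(":");
            result.computeIfAbsent(token[0], k -> new ArrayList<>()).add(token[1]);
        }
        return result;
    }
}

// Hand-written stub that returns canned data and never touches HBase.
class StubKeywordSource implements KeywordSource {
    @Override
    public List<String> getAllKeyword(String tableName, String rowKey) {
        return Arrays.asList("raj:abc", "raj:def", "other:xyz");
    }
}
```

In the code from the question, the equivalent change would be for listAllKeyword to use the hbaseUtil field rather than new HbaseUtil1(), and for the test to call messageProcessor.setHbaseUtil(hbaseUtil) and EasyMock.replay(hbaseUtil) before invoking the method.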
Also, and note that I'm not 100% familiar with PowerMock, you could include HbaseUtil1 in the @PrepareForTest annotation. I think that will make PowerMock instantiate mocks instead of real objects and then use the expectations you provide in your test.