Spark: Run through List Asynchronously and perform SparkContext Actions - java

In my application I iterate over a list synchronously in order to process each hostname it contains.
While the application works, I now want to run through that list asynchronously. However, I am not entirely sure how to do this with Spark: the SparkContext object is not serializable, and you can only have a single SparkContext running at any given time, which seems to prevent performing multiple Spark actions simultaneously.
So my question is: how do other people approach this problem? Maybe I am not thinking about it in the right way.
Possible solution: map List<modelFilteredTopology> to a JavaRDD and then use JavaRDD.foreachAsync(...) with a VoidFunction that calls MapProcessing, so the actions run asynchronously. However, I may run into some head-scratching issues if I do this, and I believe I will still be limited by the SparkContext, as it won't execute the actions asynchronously anyway.
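To make the alternative concrete, here is a minimal, untested sketch of a thread-pool variant: Spark's scheduler is thread-safe, so several threads can share the single JavaSparkContext and each submit an independent job per hostname. The ExecutorService wiring and pool size are assumptions, schemaTopology, sc and sqlContext are the variables from the snippets below, and MapProcessing2's static fields would need to become instance fields for this to be safe:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: one shared JavaSparkContext, one submitted job per hostname.
// Jobs submitted from separate threads run concurrently on the cluster
// (optionally with spark.scheduler.mode=FAIR for fair sharing).
ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary
for (final modelFilteredTopology f : filteredList) {
    pool.submit(new Runnable() {
        public void run() {
            try {
                MapProcessing2 map = new MapProcessing2(schemaTopology, sc, sqlContext);
                map.runApp(f.getHOSTNAME(), f.getMODEL(), f.getCOMMIT_TS());
            } catch (Exception e) {
                e.printStackTrace(); // real code should collect and report failures
            }
        }
    });
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.HOURS); // waits for all hostname jobs (throws InterruptedException)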
Sample of my code below:
MainClass.java snippet - loads some data from a SchemaRDD into a List:
List<modelFilteredTopology> filteredList = FILTERED_TOPOLOGY.map(new Function<Row, modelFilteredTopology>() {
    public modelFilteredTopology call(Row row) {
        modelFilteredTopology modelF = new modelFilteredTopology();
        modelF.setHOSTNAME(row.getString(0));
        modelF.setMODEL(row.getString(1));
        modelF.setCOMMIT_TS(new SimpleDateFormat("yyyy-MM-dd").format(Calendar.getInstance().getTime()));
        return modelF;
    }
}).collect();
for (modelFilteredTopology f : filteredList) {
    MapProcessing2 map = new MapProcessing2(schemaTopology, sc, sqlContext);
    map.runApp(f.getHOSTNAME(), f.getMODEL(), f.getCOMMIT_TS());
}
MapProcessing2.class - this is a small snippet of a class that performs a ton of JavaRDD transformations. I have to make the SparkContext static, otherwise it will throw a serialization error (an alternative using transient fields is sketched after the snippet).
public class MapProcessing2 implements Serializable {
    private static final Integer HISTORYDATE = 30;
    private static JavaSchemaRDD TOPO_EXTRACT;
    private static JavaSparkContext SC;
    private static JavaSQLContext sqlContext;
    private static final String ORACLE_DRIVER = Properties.getString("OracleDB_Driver");
    private static final String ORACLE_CONNECTION_URL_PROD = Properties.getString("DB_Connection_PROD");
    private static final String ORACLE_USERNAME_PROD = Properties.getString("DB_Username_PROD");
    private static final String ORACLE_PASSWORD_PROD = Properties.getString("DB_Password_PROD");

    public MapProcessing2(JavaSchemaRDD TOPO_EXTRACT, JavaSparkContext SC, JavaSQLContext sqlContext) {
        this.TOPO_EXTRACT = TOPO_EXTRACT;
        this.SC = SC;
        this.sqlContext = sqlContext;
    }

    public void runApp(String hostname, String model, String commits) throws Exception {
        if (String.valueOf(hostname).equals("null")) {
            System.out.println("No Value Retrieved");
        } else {
            try {
                // Retrieve Tickets for GWR
                JavaSchemaRDD oltint = getOLTInterfaces(TOPO_EXTRACT, hostname);
                System.out.println("Number of OLT's Found for " + hostname + ": " + oltint.collect().size());
                if (oltint.collect().size() < 1) {
                    System.out.println("No OLT's found for: " + hostname);
                } else {
                    System.out.println("Processing Router: " + hostname + " Model:" + model);
                    System.out.println("Processing Tickets for: " + hostname + " " + model);
                }
            } catch (Exception e) {
                System.out.println("Error Processing GWR: " + hostname + " " + model + " Stacktrace Error: " + e);
            }
        }
    }
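As an aside on the serialization remark above: a common alternative to making such fields static is declaring driver-only references transient, so they are skipped when Spark serializes the enclosing object into task closures. A minimal, untested sketch:

// Untested sketch: transient fields are not serialized with the object,
// so the driver-side handles are never shipped to executors (inside
// executor-side closures they would be null and must not be used).
public class MapProcessing2 implements Serializable {
    private transient JavaSchemaRDD TOPO_EXTRACT;
    private transient JavaSparkContext SC;
    private transient JavaSQLContext sqlContext;

    public MapProcessing2(JavaSchemaRDD TOPO_EXTRACT, JavaSparkContext SC, JavaSQLContext sqlContext) {
        this.TOPO_EXTRACT = TOPO_EXTRACT;
        this.SC = SC;
        this.sqlContext = sqlContext;
    }
}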

Related

Best Practices to create Message for Logging or Exception in Java

I found this code in Java 6.
String mensajeExcluido = ArqSpringContext.getPropiedad("MENSAJE.EXCLUIDO");
LOG.warn("ERROR: Servicio: " + mensajeExcluido + ":" + someDTO.getProperty() +
        ",\tsomeValue:" + someDTO.getValue() + "'.");
throw new Exception(mensajeExcluido);
this code:
String mensajeExcluido = ArqSpringContext.getPropiedad("REGLA.MENSAJE");
String mensajeWarn = "ALERTA: Otro Servicio: " + mensajeExcluido + ":" +
        someDTO.getProperty() + ",\tsomeValue:" + someDTO.getValue() + "'.";
LOG.warn(mensajeWarn);
boolean exclusionVisible = Boolean.valueOf(ArqSpringContext.getPropiedad("EXCLUSION.VISIBLE"));
if (exclusionVisible) {
    mensajeWarn = "<br></br>" + mensajeWarn;
} else {
    mensajeWarn = "";
}
throw new Exception(mensajeExcluido + mensajeWarn);
and this code:
LOG.warn("No se pudo validar Client Service. Code: " +
someDTO.getStatusCode() + ".");
return "No se pudo validar Client Service. Code: " +
someDTO.getStatusCode() + ".";
In order to follow best practices...
What recommendations are applicable?
What changes would they make to the code?
How should texts be handled?
First, try to avoid building the message before checking whether the log statement will actually be printed (i.e., don't concatenate the message strings before checking the log level):
// Better this
if (LOG.isDebugEnabled())
    LOG.debug("This is a message with " + variable + " inside");

// Than this
final String message = "This is a message with " + variable + " inside";
if (LOG.isDebugEnabled())
    LOG.debug(message);
Most Java logging frameworks allow a check to know in advance if a log statement is going to be printed based on the given settings.
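As an aside, frameworks like SLF4J sidestep the explicit check for simple cases with parameterized messages: the placeholder substitution only happens if the statement is actually logged. A minimal sketch, assuming SLF4J and a binding are on the classpath:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ParamLoggingExample {
    private static final Logger LOG = LoggerFactory.getLogger(ParamLoggingExample.class);

    void example(String variable) {
        // The {} placeholder is only substituted if DEBUG is enabled,
        // so no explicit isDebugEnabled() guard is needed for simple cases.
        LOG.debug("This is a message with {} inside", variable);
    }
}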
If you want to save yourself the burden of writing those checks for each log statement, you can take advantage of Java 8 lambdas and write a utility like this:
import java.util.function.Supplier;
import java.util.logging.Logger;

import static java.util.logging.Level.FINE;

class MyLogger {

    public static MyLogger of(final Class<?> loggerClass) {
        return new MyLogger(loggerClass);
    }

    private final Logger logger;

    private MyLogger(final Class<?> loggerClass) {
        logger = Logger.getLogger(loggerClass.getName());
    }

    // The Supplier will be evaluated AFTER checking if the log statement must be executed
    public void fine(final Supplier<?> message) {
        if (logger.isLoggable(FINE))
            logger.log(FINE, message.get().toString());
    }
}
static final MyLogger LOG = MyLogger.of(String.class);

public void example() {
    LOG.fine(() -> "This is a message with a system property: " + System.getProperty("property"));
}
And finally, you can take advantage of Java string formatting to compose log messages using String.format, e.g.:
final String message = String.format("Print %s string and %d digit", "str", 42);
Those good practices applied to the examples you provided would be:
/*
 * Using java.util.logging in JDK 8+
 */
import java.util.logging.Level;
import java.util.logging.Logger;

import static java.lang.String.format;

class Dto {
    String getProperty() { return "property"; }
    String getValue() { return "property"; }
    String getStatusCode() { return "statusCode"; }
}

final Logger LOG = Logger.getGlobal();
final Dto someDTO = new Dto();

void example1() throws Exception {
    String mensajeExcluido = System.getProperty("MENSAJE.EXCLUIDO");
    // Check if the log will be printed before composing the log message
    if (LOG.isLoggable(Level.WARNING)) {
        // Using String.format is usually clearer and gives you more formatting options
        final String messageFormat = "ERROR: Servicio: %s:%s,\tsomeValue:%s'.";
        LOG.warning(format(messageFormat, mensajeExcluido, someDTO.getProperty(), someDTO.getValue()));
    }
    // Or using lambdas
    LOG.warning(() -> {
        final String message = "ERROR: Servicio: %s:%s,\tsomeValue:%s'.";
        return format(message, mensajeExcluido, someDTO.getProperty(), someDTO.getValue());
    });
    throw new Exception(mensajeExcluido);
}

void example2() throws Exception {
    String mensajeExcluido = System.getProperty("REGLA.MENSAJE");
    String mensajeWarn = format(
            // The concatenated message is probably missing a single quote at 'someValue'
            "ALERTA: Otro Servicio: %s:%s,\tsomeValue:%s'.",
            mensajeExcluido,
            someDTO.getProperty(),
            someDTO.getValue()
    );
    LOG.warning(mensajeWarn);
    boolean exclusionVisible = Boolean.parseBoolean(System.getProperty("EXCLUSION.VISIBLE"));
    String exceptionMessage = exclusionVisible ?
            mensajeExcluido + "<br></br>" + mensajeWarn : mensajeExcluido;
    throw new Exception(exceptionMessage);
}

String example3() {
    // You can compose the message only once and use it for both the log and the result
    String message =
            format("No se pudo validar Client Service. Code: %s.", someDTO.getStatusCode());
    LOG.warning(message);
    return message;
}

Using ScheduledExecutorService to save(Entities), I get detached entity passed to persist error

I have a very curious error that I can't seem to get my head around.
I need to use a ScheduledExecutorService to pass the Survey entity I created so it can be edited and then saved as a new entity.
public void executeScheduled(Survey eventObject, long interval) {
    HashMap<String, String> eventRRules = StringUtils.extractSerialDetails(eventObject.getvCalendarRRule());
    long delay = 10000;
    ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
    Runnable runnable = new Runnable() {
        private int counter = 1;
        private int executions = Integer.parseInt(eventRRules.get("COUNT"));
        Survey survey = eventObject;

        public void run() {
            String uid = eventObject.getUniqueEventId();
            logger.info("SurveyController - executeScheduled - Iteration: " + counter);
            String serialUid = null;
            if (counter == 1) {
                serialUid = uid + "-" + counter;
            } else {
                serialUid = StringUtils.removeLastAndConcatVar(eventObject.getUniqueEventId(), Integer.toString(counter));
            }
            if (++counter > executions) {
                service.shutdown();
            }
            survey.setUniqueEventId(serialUid);
            try {
                executeCreateSurvey(survey);
            } catch (Exception e) {
                logger.debug("SurveyController - executeScheduled - Exception caught: ");
                e.printStackTrace();
            }
        }
    };
    service.scheduleAtFixedRate(runnable, delay, interval, TimeUnit.MILLISECONDS);
}
When executeCreateSurvey(survey) is run without the ScheduledExecutorService, it works flawlessly.
Yet when it is executed inside the run() method, I get the "detached entity passed to persist" error every time save(survey) is called within executeCreateSurvey().
The executeCreateSurvey() method where the .save() method is called:
public ResponseEntity<?> executeCreateSurvey(Survey eventObject) {
    MailService mailService = new MailService(applicationProperties);
    Participant eventOwner = participantRepositoryImpl.createOrFindParticipant(eventObject.getEventOwner());
    eventObject.setEventOwner(eventOwner);
    Survey survey = surveyRepositoryImpl.createSurveyOrFindSurvey((Survey) eventObject);
    // Saves additional information if small errors (content errors, ...) occur
    String warnMessage = "";
    List<Participant> participants = new ArrayList<Participant>();
    for (Participant participantDetached : eventObject.getParticipants()) {
        // Check if participant already exists
        Participant participant = participantRepositoryImpl.createOrFindParticipant(participantDetached);
        participants.add(participant);
        // Only create PartSur if not existing (update case)
        if (partSurRepository.findAllByParticipantAndSurvey(participant, survey).isEmpty()) {
            PartSur partSur = new PartSur(participant, survey);
            partSurRepository.save(partSur);
            try {
                mailService.sendRatingInvitationEmail(partSur);
                surveyRepository.save(survey);
            } catch (Exception e) {
                // no special exception for "security" reasons
                String errorMessage = "error sending mail for participant: " + e.getMessage() + "\n";
                warnMessage += errorMessage;
                logger.warn("createSurvey() - " + errorMessage);
            }
        }
    }
    // Delete all PartSurs and Answers from removed participants
    List<PartSur> partSursForParticipantsRemoved = partSurRepository.findAllByParticipantNotIn(participants);
    logger.warn("createSurvey() - participants removed: " + partSursForParticipantsRemoved.size());
    partSurRepositoryImpl.invalidatePartSurs(partSursForParticipantsRemoved);
    return new ResponseEntity<>("Umfrage wurde angelegt. Warnungen: " + warnMessage, HttpStatus.OK);
}
What could the reason for this be?
I have not been able to find this problem described anywhere so far.

Interactive Brokers API - Executing multiple trades

I'm trying to create a program that uses the API to place multiple trades at once, then get prices for the stocks, and then rebalance every so often. I followed an online tutorial to get some of this code and made a few tweaks.
However, when I run the code, it often connects and will place an order if I restart IB TWS. But if I run the code again, it does not work or show any indication that it will connect. Can anyone help me figure out how to keep the connection going, so that I can run the Main.java file, have it execute multiple trades, and then end the connection? Do I need to change the client id number in either the code or the settings of TWS?
There are three files:
OrderManagement.java:
package SendMarketOrder;

//import statements//

class OrderManagement extends Thread implements EWrapper {
    private EClientSocket client = null; // IB API client socket object
    private Stock stock = new Stock();
    private Order order = new Order();
    private int orderId;
    private double limitprice;
    private String Ticker;

    // Constructor: creates the connection
    public OrderManagement() throws InterruptedException, ClassNotFoundException, SQLException {
        // Create a new EClientSocket object
        System.out.println("////////////// Creating a Connection ////////////");
        client = new EClientSocket(this); // creation of a socket to connect
        // connect to the TWS demo
        client.eConnect(null, 7497, 1);
        try {
            Thread.sleep(3000); // waits 3 seconds for user to accept
            while (!(client.isConnected()));
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("///////// Connected /////////");
    }

    public void sendMarketOrder(String cusip, String buyorSell, int shares) throws SQLException, ClassNotFoundException {
        // New order ID
        orderId++;
        order.m_action = buyorSell;
        order.m_orderId = orderId;
        order.m_orderType = "MKT";
        order.m_totalQuantity = shares;
        order.m_account = "DU33xxxxx"; // write own account
        order.m_clientId = 1;
        // Create a new contract
        stock.createContract(cusip);
        client.placeOrder(orderId, stock.contract, order);
        // Show order in console
        SimpleDateFormat time_formatter = new SimpleDateFormat("HH:mm:ss");
        String current_time_str = time_formatter.format(System.currentTimeMillis());
        System.out.println("////////////////////////////////////////////////\n" +
                "#Limit Price: " + order.m_lmtPrice + "///////////////////////////\n" +
                "#Client number: " + order.m_clientId + "///////////////////////////\n" +
                "#OrderType: " + order.m_orderType + "///////////////////////////\n" +
                "#Order Quantity: " + order.m_totalQuantity + "///////////////////////////\n" +
                "#Account number: " + order.m_account + "///////////////////////////\n" +
                "#Symbol: " + stock.contract.m_secId + "///////////////////////////\n" +
                "///////////////////////////////////////"
        );
    }
Stock.java:
public class Stock {
    private int StockId; // so we can identify the stock
    private String Symbol; // ticker

    public Stock() { // default constructor
    }

    public Stock(int StockId, String Symbol) { // constructor
        this.StockId = StockId;
        this.Symbol = Symbol;
    }

    // getters and setters
    public int getStockId() {
        return StockId;
    }

    public String getSymbol() {
        return Symbol;
    }

    Contract contract = new Contract();

    public void createContract(String cusip) {
        contract.m_secId = cusip;
        contract.m_secIdType = "CUSIP";
        contract.m_exchange = "SMART";
        contract.m_secType = "STK";
        contract.m_currency = "USD";
    }
}
Main.java:
package SendMarketOrder;

import java.sql.SQLException;

public class Main {
    public static void main(String[] args) throws InterruptedException, ClassNotFoundException, SQLException {
        OrderManagement order = new OrderManagement();
        order.sendMarketOrder("922908363", "BUY", 100);
        order.sendMarketOrder("92204A504", "BUY", 50);
        order.sendMarketOrder("92204A702", "BUY", 100);
        System.exit(0);
    }
}
These are my current TWS settings, if that helps.
Thanks in advance for the help!
I changed a few things around in the code and added comments.
package sendmarketorder; // usually lower-case pkg names

public class Main {
    // you throw a bunch of exceptions that are never encountered
    public static void main(String[] args) {
        // since there's a Thread.sleep in this class
        // it will block until ready
        OrderManagement order = new OrderManagement();
        // obviously you need some logic to buy/sell
        // you can use command line args here if you want
        order.sendMarketOrder("922908363", "BUY", 100);
        order.sendMarketOrder("92204A504", "BUY", 50);
        order.sendMarketOrder("92204A702", "BUY", 100);
        // the socket creates a reader thread so this will stop it.
        // if you didn't have this line the non-daemon thread would keep a
        // connection to TWS and that's why you couldn't reconnect
        //System.exit(0); // use better exit logic
    }
}
package sendmarketorder;

import com.ib.client.*;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;

// doesn't extend Thread, and if you implement EWrapper you have to implement all its methods
// in API 9.72 you can extend DefaultWrapper and just override the methods you need
public class OrderManagement implements EWrapper {
    private EClientSocket client = null; // IB API client socket object
    private int orderId = -1; // use as flag to send orders
    //private double limitprice;
    //private String Ticker;
    // keep track of all working orders
    private Map<Integer, Order> workingOrders = new HashMap<>();

    // Constructor: creates the connection
    public OrderManagement() {
        // Create a new EClientSocket object
        System.out.println("////////////// Creating a Connection ////////////");
        client = new EClientSocket(this); // creation of a socket to connect
        // connect to the TWS Demo
        client.eConnect(null, 7497, 123); // starts reader thread
        try {
            while (orderId < 0) { // not best practice but it works
                System.out.println("waiting for orderId");
                Thread.sleep(1000);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("///////// Connected /////////");
    }

    public void sendMarketOrder(String cusip, String buyorSell, int shares) {
        // make a new stock and order for each stock
        Stock stock = new Stock();
        Order order = new Order();
        // New order ID, but get it from the API as you have to increment on every run for life
        orderId++;
        order.m_action = buyorSell;
        order.m_orderId = orderId;
        order.m_orderType = "MKT";
        order.m_totalQuantity = shares;
        // I don't think you're supposed to use these fields
        //order.m_account = "DU33xxxxx"; // write own account
        //order.m_clientId = 1;
        // Create a new contract
        stock.createContract(cusip);
        // remember which orders are working
        workingOrders.put(orderId, order);
        client.placeOrder(orderId, stock.contract, order);
        // Show order in console
        SimpleDateFormat time_formatter = new SimpleDateFormat("HH:mm:ss");
        String current_time_str = time_formatter.format(System.currentTimeMillis());
        System.out.println("////////////////////////////////////////////////\n"
                + "#Limit Price: " + order.m_lmtPrice + "///////////////////////////\n"
                + "#Client number: " + order.m_clientId + "///////////////////////////\n"
                + "#OrderType: " + order.m_orderType + "///////////////////////////\n"
                + "#Order Quantity: " + order.m_totalQuantity + "///////////////////////////\n"
                + "#Account number: " + order.m_account + "///////////////////////////\n"
                + "#Symbol: " + stock.contract.m_secId + "///////////////////////////\n"
                + "///////////////////////////////////////"
        );
    }

    // always impl the error callback so you know what's happening
    @Override
    public void error(int id, int errorCode, String errorMsg) {
        System.out.println(id + " " + errorCode + " " + errorMsg);
    }

    @Override
    public void nextValidId(int orderId) {
        System.out.println("next order id " + orderId);
        this.orderId = orderId;
    }

    @Override
    public void orderStatus(int orderId, String status, int filled, int remaining, double avgFillPrice, int permId, int parentId, double lastFillPrice, int clientId, String whyHeld) {
        // so you know it's been filled
        System.out.println(EWrapperMsgGenerator.orderStatus(orderId, status, filled, remaining, avgFillPrice, permId, parentId, lastFillPrice, clientId, whyHeld));
        // completely filled when remaining == 0, or possible to cancel order from TWS
        if (remaining == 0 || status.equals("Cancelled")) {
            // remove from map, should always be there
            if (workingOrders.remove(orderId) == null) System.out.println("not my order!");
        }
        // if the map is empty then exit the program as all orders have been filled
        if (workingOrders.isEmpty()) {
            System.out.println("all done");
            client.eDisconnect(); // will stop reader thread
            // now is when you stop the program, but since all
            // non-daemon threads have finished, the jvm will close.
            //System.exit(0);
        }
    }

    // impl rest of interface...
}

How to properly wait for an Apache Spark launcher job when launching it from another application?

I am trying to avoid a "while(true)" loop for waiting until my Apache Spark job is done, but without success.
I have a Spark application which is supposed to process some data and put the result into a database. I call it from my Spring service and would like to wait until the job is done.
Example:
Launcher with method:
@Override
public void run(UUID docId, String query) throws Exception {
    launcher.addAppArgs(docId.toString(), query);
    SparkAppHandle sparkAppHandle = launcher.startApplication();
    sparkAppHandle.addListener(new SparkAppHandle.Listener() {
        @Override
        public void stateChanged(SparkAppHandle handle) {
            System.out.println(handle.getState() + " new state");
        }

        @Override
        public void infoChanged(SparkAppHandle handle) {
            System.out.println(handle.getState() + " new state");
        }
    });
    System.out.println(sparkAppHandle.getState().toString());
}
How do I properly wait until the state of the handle is FINISHED?
I am also using SparkLauncher from a Spring application. Here is a summary of the approach I took (following the examples in the JavaDoc).
The @Service used to launch the job also implements SparkAppHandle.Listener and passes a reference to itself via startApplication, e.g.:
...
...
@Service
public class JobLauncher implements SparkAppHandle.Listener {
...
...
...
    private SparkAppHandle launchJob(String mainClass, String[] args) throws Exception {
        String appResource = getAppResourceName();
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource(appResource).addAppArgs(args)
                .setMainClass(mainClass)
                .setMaster(sparkMaster)
                .setDeployMode(sparkDeployMode)
                .setSparkHome(sparkHome)
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .startApplication(this);
        LOG.info("Launched [" + mainClass + "] from [" + appResource + "] State [" + handle.getState() + "]");
        return handle;
    }

    /**
     * Callback method for changes to the Spark job
     */
    @Override
    public void infoChanged(SparkAppHandle handle) {
        LOG.info("Spark App Id [" + handle.getAppId() + "] Info Changed. State [" + handle.getState() + "]");
    }

    /**
     * Callback method for changes to the Spark job's state
     */
    @Override
    public void stateChanged(SparkAppHandle handle) {
        LOG.info("Spark App Id [" + handle.getAppId() + "] State Changed. State [" + handle.getState() + "]");
    }
Using this approach, one can take action when the state changes to "FAILED", "FINISHED" or "KILLED", for example:
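A minimal sketch of acting on the terminal states inside the callback; SparkAppHandle.State.isFinal() returns true once the application has reached a final state, and onJobDone is a hypothetical hook standing in for whatever the service should do next:

@Override
public void stateChanged(SparkAppHandle handle) {
    SparkAppHandle.State state = handle.getState();
    LOG.info("Spark App Id [" + handle.getAppId() + "] State Changed. State [" + state + "]");
    // isFinal() covers FINISHED, FAILED and KILLED (and LOST), i.e. the
    // handle will receive no further state transitions after this point.
    if (state.isFinal()) {
        onJobDone(state); // hypothetical hook: unblock a waiting caller, clean up, etc.
    }
}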
I hope this information is helpful to you.
I implemented this using a CountDownLatch, and it works as expected.
...
final CountDownLatch countDownLatch = new CountDownLatch(1);
SparkAppListener sparkAppListener = new SparkAppListener(countDownLatch);
SparkAppHandle appHandle = sparkLauncher.startApplication(sparkAppListener);
Thread sparkAppListenerThread = new Thread(sparkAppListener);
sparkAppListenerThread.start();
long timeout = 120;
countDownLatch.await(timeout, TimeUnit.SECONDS);
...

private static class SparkAppListener implements SparkAppHandle.Listener, Runnable {
    private static final Log log = LogFactory.getLog(SparkAppListener.class);
    private final CountDownLatch countDownLatch;

    public SparkAppListener(CountDownLatch countDownLatch) {
        this.countDownLatch = countDownLatch;
    }

    @Override
    public void stateChanged(SparkAppHandle handle) {
        String sparkAppId = handle.getAppId();
        State appState = handle.getState();
        if (sparkAppId != null) {
            log.info("Spark job with app id: " + sparkAppId + ",\t State changed to: " + appState + " - "
                    + SPARK_STATE_MSG.get(appState));
        } else {
            log.info("Spark job's state changed to: " + appState + " - " + SPARK_STATE_MSG.get(appState));
        }
        if (appState != null && appState.isFinal()) {
            countDownLatch.countDown();
        }
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {}

    @Override
    public void run() {}
}

Spymemcached set does not work when used with multithreading and high load

I am using memcached version 1.4.7 and spymemcached 2.8.4 as a client to set and get key values. When used in a multi-threaded, high-load environment, the spymemcached client is unable to set all the values in the cache.
I am running my load-test program with 40M long keys, equally divided among the worker threads; each worker thread tries to set 1M keys in the cache, so there are 40 worker threads running.
In my DefaultCache.java file, I have made a connection pool of 20 spymemcached clients. Every time a worker thread tries to set a key, DefaultCache.java hands it a random client, as shown in the getCache() method.
When my program exits, it prints:
Total no of keys loaded = 40000000
However, when I check the memcached telnet console, it is always missing a few thousand records. I have also verified this by randomly fetching a few keys, which return null. There is no eviction, and cmd_set, curr_items and total_items are each equal to 39.5M.
What could be the reason behind these missing keys in the cache?
Here is the code for reference:
public class TestCacheLoader {
    public static final Long TOTAL_RECORDS = 40000000L;
    public static final Long LIMIT = 1000000L;

    public static void main(String[] args) {
        long keyCount = loadKeyCacheData();
        System.out.println("Total no of keys loaded = " + keyCount);
    }

    public static long loadKeyCacheData() {
        DefaultCache cache = new DefaultCache();
        List<Future<Long>> futureList = new ArrayList<Future<Long>>();
        ExecutorService executorThread = Executors.newFixedThreadPool(40);
        long offset = 0;
        long keyCount = 0;
        long workerCount = 0;
        try {
            do {
                List<Long> keyList = new ArrayList<Long>(LIMIT.intValue());
                for (long counter = offset; counter < (offset + LIMIT) && counter < TOTAL_RECORDS; counter++) {
                    keyList.add(counter);
                }
                if (keyList.size() != 0) {
                    System.out.println("Initiating a new worker thread " + workerCount++);
                    KeyCacheThread keyCacheThread = new KeyCacheThread(keyList, cache);
                    futureList.add(executorThread.submit(keyCacheThread));
                }
                offset += LIMIT;
            } while (offset < TOTAL_RECORDS);
            for (Future<Long> future : futureList) {
                keyCount += (Long) future.get();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            cache.shutdown();
        }
        return keyCount;
    }
}
class KeyCacheThread implements Callable<Long> {
    private List<Long> keyList;
    private DefaultCache cache;

    public KeyCacheThread(List<Long> keyList, DefaultCache cache) {
        this.keyList = keyList;
        this.cache = cache;
    }

    public Long call() {
        return createKeyCache();
    }

    public Long createKeyCache() {
        String compoundKey = "";
        long keyCounter = 0;
        System.out.println(Thread.currentThread() + " started to process " + keyList.size() + " keys");
        for (Long key : keyList) {
            keyCounter++;
            compoundKey = key.toString();
            cache.set(compoundKey, 0, key);
        }
        System.out.println(Thread.currentThread() + " processed = " + keyCounter + " keys");
        return keyCounter;
    }
}
public class DefaultCache {
    private static final Logger LOGGER = Logger.getLogger(DefaultCache.class);

    // Field declarations were elided in the original snippet; reconstructed
    // here from the constructor assignments.
    private final String cacheNamespace;
    private final String cacheName;
    private final String addresses; // note: getServerAddresses below expects a List<Address>; simplified in the post
    private final int cacheLookupTimeout;
    private final int numberOfClients;
    private MemcachedClient[] clients;

    public DefaultCache() {
        this.cacheNamespace = "";
        this.cacheName = "keyCache";
        this.addresses = "127.0.0.1:11211";
        this.cacheLookupTimeout = 3000;
        this.numberOfClients = 20;
        try {
            LOGGER.debug("Cache initialization started for the cache : " + cacheName);
            ConnectionFactory connectionFactory = new DefaultConnectionFactory(DefaultConnectionFactory.DEFAULT_OP_QUEUE_LEN,
                    DefaultConnectionFactory.DEFAULT_READ_BUFFER_SIZE, DefaultHashAlgorithm.KETAMA_HASH) {
                public NodeLocator createLocator(List<MemcachedNode> list) {
                    KetamaNodeLocator locator = new KetamaNodeLocator(list, DefaultHashAlgorithm.KETAMA_HASH);
                    return locator;
                }
            };
            clients = new MemcachedClient[numberOfClients];
            for (int i = 0; i < numberOfClients; i++) {
                MemcachedClient client = new MemcachedClient(connectionFactory, AddrUtil.getAddresses(getServerAddresses(addresses)));
                clients[i] = client;
            }
            LOGGER.debug("Cache initialization ended for the cache : " + cacheName);
        } catch (IOException e) {
            LOGGER.error("Exception occured while initializing cache : " + cacheName, e);
            throw new CacheException("Exception occured while initializing cache : " + cacheName, e);
        }
    }

    public Object get(String key) {
        try {
            return getCache().get(cacheNamespace + key);
        } catch (Exception e) {
            return null;
        }
    }

    public void set(String key, Integer expiryTime, final Object value) {
        getCache().set(cacheNamespace + key, expiryTime, value);
    }

    public Object delete(String key) {
        return getCache().delete(cacheNamespace + key);
    }

    public void shutdown() {
        for (MemcachedClient client : clients) {
            client.shutdown();
        }
    }

    public void flush() {
        for (MemcachedClient client : clients) {
            client.flush();
        }
    }

    private MemcachedClient getCache() {
        MemcachedClient client = null;
        int i = (int) (Math.random() * numberOfClients);
        client = clients[i];
        return client;
    }

    private String getServerAddresses(List<Address> addresses) {
        StringBuilder addressStr = new StringBuilder();
        for (Address address : addresses) {
            addressStr.append(address.getHost()).append(":").append(address.getPort()).append(" ");
        }
        return addressStr.toString().trim();
    }
}
I saw the same thing. The reason is the reactor pattern they use for async operations, which means one worker thread per connection. That single thread becomes a bottleneck under high load on multi-core machines: one thread can load only one CPU while the remaining 23 sit idle.
We came up with a pool of connections, which increased the number of worker threads and allowed more of the hardware to be utilized. Check out the 3levelmemcache project on GitHub.
I am not sure, but it seems to be an issue with the spymemcached library itself.
I changed the implementation of DefaultCache.java to use xmemcached and everything started working fine. Now I am not missing any records, and the telnet stats show a matching number of set commands.
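For reference, a minimal xmemcached setup looks roughly like this (a sketch only; the address and pool size mirror the numbers from the question rather than a tested configuration):

import net.rubyeye.xmemcached.MemcachedClient;
import net.rubyeye.xmemcached.XMemcachedClientBuilder;
import net.rubyeye.xmemcached.utils.AddrUtil;

public class XmemcachedExample {
    public static void main(String[] args) throws Exception {
        // One builder/client instead of an array of spymemcached clients;
        // xmemcached can multiplex a single client over several connections.
        XMemcachedClientBuilder builder =
                new XMemcachedClientBuilder(AddrUtil.getAddresses("127.0.0.1:11211"));
        builder.setConnectionPoolSize(20); // mirrors the 20 clients in the question
        MemcachedClient client = builder.build();

        client.set("key-1", 0, 1L); // set(key, expirySeconds, value)
        Long value = client.get("key-1");
        System.out.println("key-1 = " + value);

        client.shutdown();
    }
}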
Thanks for your patience though.
