Java - Runnable, lambda functions, and class methods

I'm quite new to Java (I studied it at university, but that was Java 2).
I've now developed an application that downloads files from S3 in parallel. I've used ExecutorService and Runnable to download multiple files in parallel this way:
public class DownloaderController {
    private AmazonS3 s3Client;
    private ExecutorService fixedPool;
    private TransferManager dlManager;
    private List<MultipleFileDownload> downloads = new ArrayList<>();
    // fields referenced below that were missing from the original snippet
    private List<Download> downloadData = new ArrayList<>();
    private Date lastLogged = new Date();

    public DownloaderController() {
        checkForNewWork();
    }

    public void checkForNewWork() {
        Provider1 provider = new Provider1();
        fixedPool = Executors.newFixedThreadPool(4);
        List<Download> providedDownloadList = provider.toBeDownloaded();
        for (Download temp : providedDownloadList) {
            if (!downloadData.contains(temp)) {
                downloadData.add(temp);
                fixedPool.submit(temp.downloadCompletedHandler(s3Client));
            }
        }
    }

    public void printToTextArea(String msg) {
        Date now = new Date();
        if (!DateUtils.isSameDay(this.lastLogged, now)) {
            this._doLogRotate();
        }
        this.lastLogged = now;
        SimpleDateFormat ft = new SimpleDateFormat("dd/MM/yyyy H:mm:ss");
        String output = "[ " + ft.format(now) + " ] " + msg + System.getProperty("line.separator");
        Platform.runLater(() -> {
            // statusTextArea is an FXML object
            statusTextArea.appendText(output);
        });
    }
}
public class Provider1 implements DownloadProvider {
}

public abstract class Download {
    public abstract Runnable downloadCompletedHandler(AmazonS3 s3Client);
}
public class DownloadProvider1 extends Download {
    @Override
    public Runnable downloadCompletedHandler(AmazonS3 s3Client) {
        Runnable downloadwork = () -> {
            ObjectListing list = s3Client.listObjects(this.bucket, this.getFolder());
            List<S3ObjectSummary> objects = list.getObjectSummaries();
            AtomicLong workSize = new AtomicLong(0);
            List<DeleteObjectsRequest.KeyVersion> keys = new ArrayList<>();
            objects.forEach(obj -> {
                workSize.getAndAdd(obj.getSize());
                keys.add(new DeleteObjectsRequest.KeyVersion(obj.getKey()));
            });
            MultipleFileDownload fileDownload = dlManager.downloadDirectory("myBucket", "folder", new File("outputDirectory"));
            try {
                fileDownload.waitForCompletion();
            } catch (Exception e) {
                printToTextArea("Exception while downloading from Amazon S3");
            }
        };
        return downloadwork;
    }
}
In the DownloaderController I call a function every minute that adds Download objects to a list of folders that have to be downloaded from S3. When a new Download is added, it is also submitted to the ExecutorService pool. The Download object returns the code that has to be executed to download the folder from S3, and what to do when the download is finished.
My problem is: what is the best way to communicate between the Runnable and the DownloaderController?

Your code does not make the goal entirely clear. From what I understand, I would have done it something like this:
public class Download implements Runnable {
    private AmazonS3 s3Client;

    public Download(AmazonS3 client) { s3Client = client; }

    @Override
    public void run() { /* perform download */ }
}
That class does nothing but download the file (cf. Separation of Concerns) and is a Runnable. You can do executorService.submit(new Download(client)) and the download will eventually finish; you can also test it without it being called concurrently.
Now, you want a callback method for logging it being finished.
public class LoggingCallback {
    public void log() {
        System.out.println("finished");
    }
}
This, too, can be used as a Runnable via a method reference (the method doesn't have to be called run()).
And, to make sure it's triggered one after the other, maybe
class OneAfterTheOther implements Runnable {
    private Runnable first;
    private Runnable second;

    public OneAfterTheOther(Runnable r1, Runnable r2) {
        first = r1;
        second = r2;
    }

    @Override
    public void run() { first.run(); second.run(); }
}
which if submitted like this
Download dl = new Download(client);
LoggingCallback l = new LoggingCallback();
executorService.submit(new OneAfterTheOther(dl::run, l::log));
will do what I think you're trying to do.
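Since both steps are just Runnables, the same composition can also be written inline with a lambda instead of the helper class (an equivalent sketch):
executorService.submit(() -> {
    dl.run(); // download first
    l.log();  // then log completion
});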

Related

Java Quit class function

I have a problem. I created this class that opens a websocket stream with Binance:
public class AccountStream extends Driver {
    private Integer agentId;
    private String API_KEY;
    private String SECRET;
    private String listenKey;
    private Order newOrder;
    private String LOG_FILE_PATH;

    public AccountStream(Integer agentId) {
        this.agentId = agentId;
        // Load Binance config
        HashMap<String, String> binanceConfig = MainDriver.getBinanceConfig(agentId);
        API_KEY = binanceConfig.get("api_key");
        SECRET = binanceConfig.get("secret");
        startAccountEventStreaming();
        setConnectionCheckScheduler();
    }

    private void startAccountEventStreaming() {
        BinanceApiClientFactory factory = BinanceApiClientFactory.newInstance(API_KEY, SECRET);
        BinanceApiRestClient client = factory.newRestClient();
        // First, we obtain a listenKey which is required to interact with the user data stream
        listenKey = client.startUserDataStream();
        // Then, we open a new web socket client, and provide a callback that is called on every update
        BinanceApiWebSocketClient webSocketClient = factory.newWebSocketClient();
        // Listen for changes in the account
        webSocketClient.onUserDataUpdateEvent(listenKey, response -> {
            System.out.println(response);
        });
        // Ping the data stream every 30 minutes to prevent a timeout
        ScheduledExecutorService keepAliveUserDataStreamPool = Executors.newScheduledThreadPool(1);
        Runnable pingUserDataStream = () -> {
            client.keepAliveUserDataStream(listenKey);
        };
        keepAliveUserDataStreamPool.scheduleWithFixedDelay(pingUserDataStream, 0, 30, TimeUnit.MINUTES);
    }

    private void setConnectionCheckScheduler() {
        ScheduledExecutorService checkConnectionPool = Executors.newScheduledThreadPool(1);
        Runnable checkConnectionTask = () -> {
            if (!MainDriver.connected) {
                // DESTROY ENTIRE CLASS HERE
            }
        };
        checkConnectionPool.scheduleWithFixedDelay(checkConnectionTask, 0, 1, TimeUnit.SECONDS);
    }
}
Now, setConnectionCheckScheduler() schedules a check of a specific variable, which gets set when the program loses its internet connection, but the task also has to stop the code from continuing to ping the API. What I actually want is some kind of return that quits the entire class, so that I have to create a new instance of the class to start the API stream again.
How can I cancel all the code in this class (actually destroy the instance) from within the Runnable checkConnectionTask?
Basically you need to close/clean up the resources and running threads, then run the logic from the beginning. Wrapping the start logic in a separate method, and adding one for cleanup, will help:
private void start() {
    startAccountEventStreaming();
    setConnectionCheckScheduler();
}

private void shutdown() {
    // possibly more code to make sure resources are closed gracefully
    webSocketClient.close();
    keepAliveUserDataStreamPool.shutdown();
    checkConnectionPool.shutdown();
}

private void restart() {
    shutdown();
    start();
}
so you just call restart() in your checker:
Runnable checkConnectionTask = () -> {
    if (!MainDriver.connected) {
        restart();
    }
};
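Note that for shutdown() to compile, webSocketClient, keepAliveUserDataStreamPool and checkConnectionPool have to be promoted from local variables (as they are in the question's code) to fields, along these lines:
private BinanceApiWebSocketClient webSocketClient;
private ScheduledExecutorService keepAliveUserDataStreamPool;
private ScheduledExecutorService checkConnectionPool;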

Java: newSingleThreadScheduledExecutor is not giving the expected result when used in parallel with the Akka framework

I'm using the Akka framework for a use case in which I created one SupervisorActor and two child actors. In parallel to that, I have a token service which needs to update my cache before the token expires. Please find the code below:
public class TokenCacheService {
    final Logger logger = LoggerFactory.getLogger(TokenCacheService.class);
    private static final String KEY = "USER_TOKEN";
    private LoadingCache<String, String> tokenCache;
    private final ScheduledExecutorService cacheScheduler;
    ThreadFactory threadFactory = new ThreadFactoryBuilder()
            .setNameFormat("MyCacheRefresher-pool-%d").setDaemon(true)
            .build();

    public TokenCacheService(CacheConfig cacheConfig) {
        cacheScheduler = Executors.newSingleThreadScheduledExecutor(threadFactory);
        buildCache(cacheConfig);
    }

    public String getToken() {
        String token = StringUtils.EMPTY;
        try {
            token = tokenCache.get(KEY);
        } catch (ExecutionException ex) {
            logger.debug("unable to process get token...");
        }
        return token;
    }

    private void buildCache(CacheConfig cacheConfig) {
        tokenCache = CacheBuilder.newBuilder()
                .refreshAfterWrite(4, TimeUnit.HOURS)
                .expireAfterWrite(5, TimeUnit.HOURS)
                .maximumSize(2)
                .build(new CacheLoader<String, String>() {
                    @Override
                    @ParametersAreNonnullByDefault
                    public String load(String queryKey) {
                        logger.debug("cache load()");
                        return <token method call which returns token>;
                    }

                    @Override
                    @ParametersAreNonnullByDefault
                    public ListenableFutureTask<String> reload(final String key, String prevToken) {
                        logger.debug("cache reload()");
                        ListenableFutureTask<String> task =
                                ListenableFutureTask.create(() -> <token method call which returns token>);
                        cacheScheduler.execute(task);
                        return task;
                    }
                });
        cacheScheduler.scheduleWithFixedDelay(() -> tokenCache.refresh(KEY), 0, 4, TimeUnit.HOURS);
    }
}
It works fine with this test class:
public static void main(String[] args) throws InterruptedException {
    TokenCacheService tokenCacheService = new TokenCacheService();
    while (true) {
        System.out.println(tokenCacheService.getToken());
        Thread.sleep(180000);
    }
}
The above prints the expected logs, with the reload after 4 hours as expected. But when I run the same code within my actual application (with the Akka actors), I can only see the first cache load() log; it doesn't print any further logs for reloading the cache.
Please suggest what I am doing wrong here.
I tweaked the code a little bit by setting the thread priority to 1 and replacing scheduleWithFixedDelay with scheduleAtFixedRate:
ThreadFactory threadFactory = new ThreadFactoryBuilder()
        .setNameFormat("MyCacheRefresher-pool-%d")
        .setPriority(1)
        .build();

public TokenCacheService(CacheConfig cacheConfig) {
    idsTokenApplication = new IdsTokenApplication();
    cacheScheduler = Executors.newSingleThreadScheduledExecutor(threadFactory);
    buildCache(cacheConfig);
}

cacheScheduler.scheduleAtFixedRate(() -> tokenCache.refresh(KEY), 0,
        cacheConfig.getReloadCache(), TimeUnit.valueOf(cacheConfig.getReloadCacheTimeUnit()));
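For reference, the two scheduling methods differ in how the period is measured, which matters if a refresh ever runs long (an illustration using the names from the code above):
// scheduleAtFixedRate: the period is measured between task *start* times,
// so it aims to start a refresh every interval regardless of how long one run takes
// (runs never overlap; a slow run just delays the next start).
cacheScheduler.scheduleAtFixedRate(() -> tokenCache.refresh(KEY), 0, 4, TimeUnit.HOURS);

// scheduleWithFixedDelay: the delay is measured from the *end* of one run
// to the start of the next, so a slow refresh pushes every later run back.
cacheScheduler.scheduleWithFixedDelay(() -> tokenCache.refresh(KEY), 0, 4, TimeUnit.HOURS);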

Is there a way to get the Apache Commons FileAlterationMonitor to only alert once for a batch of incoming files?

I am monitoring several (about 15) paths for incoming files using the Apache Commons FileAlterationMonitor. These incoming files can come in batches of anywhere between 1 and 500 files at a time. I have everything set up and the application monitors the folders as expected; I have it set to poll the folders every minute. My issue is that, as expected, the listener that I have set up alerts for each incoming file, when all I really need, and want, is to know when a new batch of files comes in. So I would like to receive a single alert as opposed to up to 500 at a time.
Does anyone have any ideas for how to control the number of alerts, or only pick up the first or last notification, or something to that effect? I would like to stick with the FileAlterationMonitor if at all possible, because it will be running for long periods and, from what I can tell in testing so far, it doesn't seem to put a heavy load on the system or slow the rest of the application down. But I am definitely open to other ideas if what I'm looking for isn't possible with the FileAlterationMonitor.
public class FileMonitor {
    private final String newDirectory;
    private FileAlterationMonitor monitor;
    private final Alerts gui;
    private final String provider;

    public FileMonitor(String d, Alerts g, String pro) throws Exception {
        newDirectory = d;
        gui = g;
        provider = pro;
    }

    public void startMonitor() throws Exception {
        // Directory to monitor
        final File directory = new File(newDirectory);
        // create new observer
        FileAlterationObserver fao = new FileAlterationObserver(directory);
        // add listener to observer
        fao.addListener(new FileAlterationListenerImpl(gui, provider));
        // wait 1 minute between folder polls.
        monitor = new FileAlterationMonitor(60000);
        monitor.addObserver(fao);
        monitor.start();
    }
}
public class FileAlterationListenerImpl implements FileAlterationListener {
    private final Alerts gui;
    private final String provider;
    private final LogFiles monitorLogs;

    public FileAlterationListenerImpl(Alerts g, String pro) {
        gui = g;
        provider = pro;
        monitorLogs = new LogFiles();
    }

    @Override
    public void onStart(final FileAlterationObserver observer) {
        System.out.println("The FileListener has started on: " + observer.getDirectory().getAbsolutePath());
    }

    @Override
    public void onDirectoryCreate(File file) {
    }

    @Override
    public void onDirectoryChange(File file) {
    }

    @Override
    public void onDirectoryDelete(File file) {
    }

    @Override
    public void onFileCreate(File file) {
        try {
            switch (provider) {
                case "Spectrum":
                    gui.alertsAreaAppend("New/Updated schedules available for Spectrum zones!\r\n");
                    monitorLogs.appendNewLogging("New/Updated schedules available for Spectrum zones!\r\n");
                    break;
                case "DirecTV ZTA":
                    gui.alertsAreaAppend("New/Updated schedules available for DirecTV ZTA zones!\r\n");
                    monitorLogs.appendNewLogging("New/Updated schedules available for DirecTV ZTA zones!\r\n");
                    break;
                case "DirecTV RSN":
                    gui.alertsAreaAppend("New/Updated schedules available for DirecTV RSN zones!\r\n");
                    monitorLogs.appendNewLogging("New/Updated schedules available for DirecTV RSN zones!\r\n");
                    break;
                case "Suddenlink":
                    gui.alertsAreaAppend("New/Updated schedules available for Suddenlink zones!\r\n");
                    monitorLogs.appendNewLogging("New/Updated schedules available for Suddenlink zones!\r\n");
                    break;
            }
        } catch (IOException e) {
        }
    }

    @Override
    public void onFileChange(File file) {
    }
}
Above are the FileMonitor class and the overridden FileAlterationListener I have so far.
Any suggestions would be greatly appreciated.
Here's a quick and crude implementation:
public class FileAlterationListenerAlterThrottler {
    private static final int DEFAULT_THRESHOLD_MS = 5000;
    private final int thresholdMs;
    private final Map<String, Long> providerLastFileProcessedAt = new HashMap<>();

    public FileAlterationListenerAlterThrottler() {
        this(DEFAULT_THRESHOLD_MS);
    }

    public FileAlterationListenerAlterThrottler(int thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    public synchronized boolean shouldAlertFor(String provider) {
        long now = System.currentTimeMillis();
        long last = providerLastFileProcessedAt.computeIfAbsent(provider, x -> 0L);
        if (now - last < thresholdMs) {
            return false;
        }
        providerLastFileProcessedAt.put(provider, now);
        return true;
    }
}
And a quicker and cruder driver:
public class Test {
    public static void main(String[] args) throws Exception {
        int myThreshold = 1000;
        FileAlterationListenerAlterThrottler throttler = new FileAlterationListenerAlterThrottler(myThreshold);
        for (int i = 0; i < 3; i++) {
            doIt(throttler);
        }
        Thread.sleep(1500);
        doIt(throttler);
    }

    private static void doIt(FileAlterationListenerAlterThrottler throttler) {
        boolean shouldAlert = throttler.shouldAlertFor("Some Provider");
        System.out.println("Time now: " + System.currentTimeMillis());
        System.out.println("Should alert? " + shouldAlert);
        System.out.println();
    }
}
Yields:
Time now: 1553739126557
Should alert? true
Time now: 1553739126557
Should alert? false
Time now: 1553739126557
Should alert? false
Time now: 1553739128058
Should alert? true
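To wire the throttler into the listener, guard the alert in onFileCreate. A sketch (the 60-second threshold is an assumption, chosen here to match the one-minute poll interval):
private final FileAlterationListenerAlterThrottler throttler =
        new FileAlterationListenerAlterThrottler(60000); // assumed: matches the 1-minute poll

@Override
public void onFileCreate(File file) {
    if (!throttler.shouldAlertFor(provider)) {
        return; // part of the same batch as a recent file, so stay quiet
    }
    // ... existing per-provider alert code ...
}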

Java threads - waiting on all child threads in order to proceed

So, a little background:
I am working on a project in which a servlet is going to release crawlers upon a lot of text files within a file system. I was thinking of dividing the load among multiple threads. For example:
A crawler enters a directory and finds 3 files and 6 directories. It starts processing the files and spawns a new crawler in a new thread for each of the directories. So from my creator class I would create a single crawler upon a base directory. That crawler would assess the workload and, if deemed necessary, spawn another crawler under another thread.
My crawler class looks like this
package com.fujitsu.spider;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Serializable;
import java.util.ArrayList;
public class DocumentSpider implements Runnable, Serializable {
    private static final long serialVersionUID = 8401649393078703808L;
    private Spidermode currentMode = null;
    private String URL = null;
    private String[] terms = null;
    private float score = 0;
    private ArrayList<SpiderDataPair> resultList = null;

    public enum Spidermode {
        FILE, DIRECTORY
    }

    public DocumentSpider(String resourceURL, Spidermode mode, ArrayList<SpiderDataPair> resultList) {
        currentMode = mode;
        setURL(resourceURL);
        this.setResultList(resultList);
    }

    @Override
    public void run() {
        try {
            if (currentMode == Spidermode.FILE) {
                doCrawlFile();
            } else {
                doCrawlDirectory();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("SPIDER @ " + URL + " HAS FINISHED.");
    }

    public Spidermode getCurrentMode() {
        return currentMode;
    }

    public void setCurrentMode(Spidermode currentMode) {
        this.currentMode = currentMode;
    }

    public String getURL() {
        return URL;
    }

    public void setURL(String uRL) {
        URL = uRL;
    }

    public void doCrawlFile() throws Exception {
        File target = new File(URL);
        if (target.isDirectory()) {
            throw new Exception(
                    "This URL points to a directory while the spider is in FILE mode. Please change this spider to DIRECTORY mode.");
        }
        procesFile(target);
    }

    public void doCrawlDirectory() throws Exception {
        File baseDir = new File(URL);
        if (!baseDir.isDirectory()) {
            throw new Exception(
                    "This URL points to a FILE while the spider is in DIRECTORY mode. Please change this spider to FILE mode.");
        }
        File[] directoryContent = baseDir.listFiles();
        for (File f : directoryContent) {
            if (f.isDirectory()) {
                DocumentSpider spider = new DocumentSpider(f.getPath(), Spidermode.DIRECTORY, this.resultList);
                spider.terms = this.terms;
                (new Thread(spider)).start();
            } else {
                DocumentSpider spider = new DocumentSpider(f.getPath(), Spidermode.FILE, this.resultList);
                spider.terms = this.terms;
                (new Thread(spider)).start();
            }
        }
    }

    public void procesDirectory(String target) throws IOException {
        File base = new File(target);
        File[] directoryContent = base.listFiles();
        for (File f : directoryContent) {
            if (f.isDirectory()) {
                procesDirectory(f.getPath());
            } else {
                procesFile(f);
            }
        }
    }

    public void procesFile(File target) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(target));
        String line;
        while ((line = br.readLine()) != null) {
            String[] words = line.split(" ");
            for (String currentWord : words) {
                for (String a : terms) {
                    if (a.equalsIgnoreCase(currentWord)) {
                        score += 1f;
                    }
                    if (currentWord.toLowerCase().contains(a)) {
                        score += 1f;
                    }
                }
            }
        }
        br.close();
        resultList.add(new SpiderDataPair(this, URL));
    }

    public String[] getTerms() {
        return terms;
    }

    public void setTerms(String[] terms) {
        this.terms = terms;
    }

    public float getScore() {
        return score;
    }

    public void setScore(float score) {
        this.score = score;
    }

    public ArrayList<SpiderDataPair> getResultList() {
        return resultList;
    }

    public void setResultList(ArrayList<SpiderDataPair> resultList) {
        this.resultList = resultList;
    }
}
The problem I am facing is that my root crawler holds a list of results from every crawler, which I want to process further. The operation that processes the data in this list is called from the servlet (or the main method in this example). However, that operation is always called before all of the crawlers have completed their processing, so it launches too soon and works on incomplete data.
I tried solving this using the join methods, but unfortunately I can't seem to figure this one out.
package com.fujitsu.spider;
import java.util.ArrayList;
import com.fujitsu.spider.DocumentSpider.Spidermode;
public class Main {
    public static void main(String[] args) throws InterruptedException {
        ArrayList<SpiderDataPair> results = new ArrayList<SpiderDataPair>();
        String[] terms = {"SERVER", "CHANGE", "MO"};
        DocumentSpider spider1 = new DocumentSpider("C:\\Users\\Mark\\workspace\\Spider\\Files", Spidermode.DIRECTORY, results);
        spider1.setTerms(terms);
        DocumentSpider spider2 = new DocumentSpider("C:\\Users\\Mark\\workspace\\Spider\\File2", Spidermode.DIRECTORY, results);
        spider2.setTerms(terms);
        Thread t1 = new Thread(spider1);
        Thread t2 = new Thread(spider2);
        t1.start();
        t1.join();
        t2.start();
        t2.join();
        for (SpiderDataPair d : spider1.getResultList()) {
            System.out.println("PATH -> " + d.getFile() + " SCORE -> " + d.getSpider().getScore());
        }
        for (SpiderDataPair d : spider2.getResultList()) {
            System.out.println("PATH -> " + d.getFile() + " SCORE -> " + d.getSpider().getScore());
        }
    }
}
TL;DR:
I really wish to understand this subject, so any help would be immensely appreciated!
You need a couple of changes in your code:
In the spider:
List<Thread> threads = new LinkedList<Thread>();
for (File f : directoryContent) {
    if (f.isDirectory()) {
        DocumentSpider spider = new DocumentSpider(f.getPath(), Spidermode.DIRECTORY, this.resultList);
        spider.terms = this.terms;
        Thread thread = new Thread(spider);
        threads.add(thread);
        thread.start();
    } else {
        DocumentSpider spider = new DocumentSpider(f.getPath(), Spidermode.FILE, this.resultList);
        spider.terms = this.terms;
        Thread thread = new Thread(spider);
        threads.add(thread);
        thread.start();
    }
}
for (Thread thread : threads) thread.join();
The idea is to create a new thread for each spider and start it. Once they are all running, you wait until every one is done before the spider itself finishes. This way each spider thread keeps running until all of its work is done (and thus the top thread runs until all children, and their children, are finished).
You also need to change your runner so that it runs the two spiders in parallel instead of one after the other, like this:
Thread t1 = new Thread(spider1);
Thread t2 = new Thread(spider2);
t1.start();
t2.start();
t1.join();
t2.join();
You should use a higher-level library than bare Thread for this task. I would suggest looking into ExecutorService in particular and all of java.util.concurrent generally. There are abstractions there that can manage all of the threading issues while providing well-formed tasks a properly protected environment in which to run.
For your specific problem, I would recommend some sort of blocking queue of tasks and a standard producer-consumer architecture. Each task knows how to determine if its path is a file or directory. If it is a file, process the file; if it is a directory, crawl the directory's immediate contents and enqueue new tasks for each sub-path. You could also use some properly-synchronized shared state to cap the number of files processed, depth, etc. Also, the service provides the ability to await termination of its tasks, making the "join" simpler.
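A minimal sketch of that approach (the class and member names are illustrative, not from the original code, and the per-file work is reduced to recording the path):
import java.io.File;
import java.util.Queue;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class CrawlService {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicInteger pending = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);
    private final Queue<String> results = new ConcurrentLinkedQueue<>();

    public Queue<String> crawl(File root) throws InterruptedException {
        submit(root);
        done.await();      // the "join": block until every task has finished
        pool.shutdown();
        return results;
    }

    private void submit(File path) {
        pending.incrementAndGet(); // count the task before it can possibly finish
        pool.submit(() -> {
            try {
                if (path.isDirectory()) {
                    File[] children = path.listFiles();
                    if (children != null) {
                        for (File child : children) submit(child); // enqueue tasks for sub-paths
                    }
                } else {
                    results.add(path.getPath()); // stand-in for real file processing
                }
            } finally {
                if (pending.decrementAndGet() == 0) done.countDown();
            }
        });
    }
}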
With this architecture, you decouple the notion of threads and thread management (handled by the ExecutorService) from your business logic of tasks (typically a Runnable or Callable). The service itself can be tuned in how it instantiates threads, such as a fixed maximum number of threads or a number that scales with how many concurrent tasks exist (see the factory methods on java.util.concurrent.Executors). Threads, which are more expensive than the Runnables they execute, are re-used to conserve resources.
If your objective is primarily something functional that works at production quality, then the library is the way to go. However, if your objective is to understand the lower-level details of thread management, then you may want to investigate the use of latches, and perhaps thread groups, to manage threads at a lower level, exposing the details of the implementation so you can work with them.
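As a taste of the latch approach mentioned above, a parent can block until a known number of child threads finish (a sketch; the file list and the printed "processing" line are hypothetical stand-ins for real work):
import java.io.File;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class LatchDemo {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical work items; in the spider these would be discovered files.
        List<File> files = List.of(new File("a.txt"), new File("b.txt"));
        CountDownLatch latch = new CountDownLatch(files.size());
        for (File f : files) {
            new Thread(() -> {
                try {
                    System.out.println("processing " + f.getName()); // stand-in for real work
                } finally {
                    latch.countDown(); // count down even if the work throws
                }
            }).start();
        }
        latch.await(); // blocks until every worker has counted down
        System.out.println("all workers finished");
    }
}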

JUnit testing a folder-scanning class

I want to write several tests; from a high level, each of them should populate a directory structure with some files. I'd test at least each of these cases:
A single folder with a file that passes the filter.
A single folder with a file that does NOT pass the filter.
A nested folder with a file in each.
Code:
class FolderScan implements Runnable {
    private String path;
    private BlockingQueue<File> queue;
    private CountDownLatch latch;
    private File endOfWorkFile;
    private List<Checker> checkers;

    FolderScan(String path, BlockingQueue<File> queue, CountDownLatch latch,
            File endOfWorkFile) {
        this.path = path;
        this.queue = queue;
        this.latch = latch;
        this.endOfWorkFile = endOfWorkFile;
        checkers = new ArrayList<Checker>(Arrays.asList(new ExtentionsCheker(),
                new ProbeContentTypeCheker(), new CharsetDetector()));
    }

    public FolderScan() {
    }

    @Override
    public void run() {
        findFiles(path);
        queue.add(endOfWorkFile);
        latch.countDown();
    }

    private void findFiles(String path) {
        File root;
        try {
            root = new File(path);
            File[] list = root.listFiles();
            for (File currentFile : list) {
                if (currentFile.isDirectory()) {
                    findFiles(currentFile.getAbsolutePath());
                } else {
                    // reset per file, so one failing file doesn't veto all later ones
                    boolean checksPassed = true;
                    for (Checker currentChecker : checkers) {
                        if (!currentChecker.check(currentFile)) {
                            checksPassed = false;
                            break;
                        }
                    }
                    if (checksPassed)
                        queue.put(currentFile);
                }
            }
        } catch (InterruptedException | RuntimeException e) {
            System.out.println("Wrong input !!!");
            e.printStackTrace();
        }
    }
}
Questions:
How do I create files in each folder?
How do I prove that the queue contains the File objects that I expect?
How do I prove that the last element in the queue is the 'trigger' File?
How do I create files in each folder?
Extract the file IO and use a mocked repository for the tests. This means that the IO will live somewhere else, and you may wish to use the approach below to test it: a temporary folder created via the JUnit TemporaryFolder rule, in which you create the files to match each test.
How do I prove that the queue contains the File objects that I expect?
.equals works well for File objects, I believe.
A single folder with a file that does NOT pass the filter.
I'd pass in the checkers, so I can supply an "always pass" and an "always fail" checker.
public class TestFolderScan {
    @Rule
    public TemporaryFolder folder = new TemporaryFolder();

    @Test
    public void whenASingleFolderWithAFileThatPassesTheFilterThenItExistsInTheQueue() throws Exception {
        File expectedFile = folder.newFile("file.txt");
        File endOfWorkFile = new File("EOW");
        Queue queue = ...;
        FolderScan subject = new FolderScan(folder.getRoot().getPath(), queue, new AlwaysPassesChecker(), ...);
        subject.run();
        expected = new Queue(expectedFile, endOfWorkFile);
        assertEquals(expected, queue);
    }
}
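The "always pass" checker used above could be as simple as this (a sketch, assuming the Checker interface exposes the single boolean check(File) method that findFiles relies on):
public class AlwaysPassesChecker implements Checker {
    @Override
    public boolean check(File file) {
        return true; // accept every file, so the test exercises only the queue logic
    }
}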
