I am writing to an in-memory distributed database in user-defined batch sizes, in a multithreaded environment. But I want to limit the number of rows written to, e.g., 1000 rows/sec. The reason for this requirement is that my producer is writing too fast and the consumer is running into a leaf memory error. Is there any standard practice for throttling while batch-processing the records?
dataStream.map(line => readJsonFromString(line)).grouped(memsqlBatchSize).foreach { recordSet =>
  val dbRecords = recordSet.map(m => (m, Events.transform(m)))
  dbRecords.foreach { record =>
    try {
      Events.setValues(eventInsert, record._2)
      eventInsert.addBatch()
    } catch {
      case e: Exception =>
        logger.error(s"error adding batch: ${e.getMessage}")
        val error_event = Events.jm.writeValueAsString(mapAsJavaMap(record._1.asInstanceOf[Map[String, Object]]))
        logger.error(s"event: $error_event")
    }
  }
  // Bulk Commit Records
  try {
    eventInsert.executeBatch()
  } catch {
    case e: java.sql.BatchUpdateException =>
      val updates = e.getUpdateCounts
      logger.error(s"failed commit: ${updates.mkString(",")}")
      updates.zipWithIndex.filter { case (v, i) => v == Statement.EXECUTE_FAILED }.foreach { case (v, i) =>
        val error = Events.jm.writeValueAsString(mapAsJavaMap(dbRecords(i)._1.asInstanceOf[Map[String, Object]]))
        logger.error(s"insert error: $error")
        logger.error(e.getMessage)
      }
  } finally {
    connection.commit()
    eventInsert.clearBatch()
    logger.debug(s"committed: ${dbRecords.length}")
  }
}
I was hoping I could pass a user-defined argument, say throttleMax, and if the total number of records written by each thread reaches throttleMax, Thread.sleep() would be called for 1 second. But this is going to make the entire process very slow. Is there any other effective method that can be used to throttle the loading of the data to 1000 rows/sec?
As others have suggested (see the comments on the question), you have better options available to you than throttling here. However, you can throttle an operation in Java with some simple code like the following:
/**
 * Given an Iterator `inner`, returns a new Iterator which will emit items upon
 * request, but throttled to at most one item every `minDelayMs` milliseconds.
 */
public static <T> Iterator<T> throttledIterator(Iterator<T> inner, int minDelayMs) {
    return new Iterator<T>() {
        private long lastEmittedMillis = System.currentTimeMillis() - minDelayMs;

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public T next() {
            long now = System.currentTimeMillis();
            // sleep for whatever is left of the minimum delay since the last emission
            long requiredDelayMs = minDelayMs - (now - lastEmittedMillis);
            if (requiredDelayMs > 0) {
                try {
                    Thread.sleep(requiredDelayMs);
                } catch (InterruptedException e) {
                    // resume
                }
            }
            lastEmittedMillis = System.currentTimeMillis();
            return inner.next();
        }
    };
}
The above code uses Thread.sleep, so it is not suitable for use in a reactive system. In that case, you would want to use the throttle implementation provided by that system, e.g. throttle in Akka Streams.
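If you just need a hard cap on rows per second in the original batch loop, a shared token-bucket limiter such as Guava's RateLimiter is also a common alternative to hand-rolled Thread.sleep bookkeeping. A minimal sketch, assuming the Guava dependency and treating one permit as one row (the class and method names below are illustrative, not taken from the question):

import java.sql.PreparedStatement;
import java.sql.SQLException;
import com.google.common.util.concurrent.RateLimiter;

public class ThrottledBatchCommitter {
    // 1000 permits per second, shared by all writer threads; one permit == one row
    private final RateLimiter limiter = RateLimiter.create(1000.0);

    public void commitBatch(PreparedStatement eventInsert, int batchSize) throws SQLException {
        limiter.acquire(batchSize); // blocks until enough row-permits have accumulated
        eventInsert.executeBatch();
        eventInsert.clearBatch();
    }
}

Because RateLimiter is thread-safe, a single shared instance is enough to cap the combined rate of all producer threads.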
Related
My program gets very slow as more and more records are processed. I initially thought it was due to excessive memory consumption, as my program is String-intensive (I am using Java 11, so compact strings should be used whenever possible), so I increased the JVM heap:
-Xms2048m
-Xmx6144m
I also increased the task manager's memory as well as the heartbeat timeout in flink-conf.yaml:
jobmanager.heap.size: 6144m
heartbeat.timeout: 5000000
However, none of this helped. The program still gets very slow at about the same point, after processing roughly 3.5 million records, with only about 0.5 million more to go. As the program approaches the 3.5 million mark it gets very, very slow until it eventually times out; total execution time is about 11 minutes.
I checked the memory consumption in VisualVM, but it never goes above about 700 MB. My Flink pipeline looks as follows:
final StreamExecutionEnvironment environment = StreamExecutionEnvironment.createLocalEnvironment(1);
environment.setParallelism(1);
DataStream<Tuple> stream = environment.addSource(new TPCHQuery3Source(filePaths, relations));
stream.process(new TPCHQuery3Process(relations)).addSink(new FDSSink());
environment.execute("FlinkDataService");
The bulk of the work is done in the process function, where I implement database join algorithms; the columns are stored as Strings. Specifically, I am implementing query 3 of the TPC-H benchmark (see https://examples.citusdata.com/tpch_queries.html if you wish).
The timeout error is this:
java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id <id> timed out.
Once I got this error as well:
Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
I also captured a VisualVM monitoring screenshot (not reproduced here) at the point where things get very slow.
Here is the run loop of my source function:
while (run) {
    readers.forEach(reader -> {
        try {
            String line = reader.readLine();
            if (line != null) {
                Tuple tuple = lineToTuple(line, counter.get() % filePaths.size());
                if (tuple != null && isValidTuple(tuple)) {
                    sourceContext.collect(tuple);
                }
            } else {
                closedReaders.add(reader);
                if (closedReaders.size() == filePaths.size()) {
                    System.out.println("ALL FILES HAVE BEEN STREAMED");
                    cancel();
                }
            }
            counter.getAndIncrement();
        } catch (IOException e) {
            e.printStackTrace();
        }
    });
}
I basically read a line from each of the 3 files I need and, based on the order of the files, construct a tuple object (my custom class called Tuple, representing a row in a table), then emit that tuple if it is valid, i.e. it fulfils certain conditions on the date.
I am also hinting to the JVM to do garbage collection at the 1 millionth, 1.5 millionth, 2 millionth and 2.5 millionth record, like this:
System.gc()
Any thoughts on how I can optimize this?
String.intern() saved me. I interned every string before storing it in my maps, and that worked like a charm.
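For illustration, a minimal sketch of what that interning can look like when the tuples are built (the lineToTuple helper, the '|' separator and the Tuple constructor are assumptions based on the question, not the actual code):

// Hypothetical helper: interning each column value so that repeated strings
// (dates, flags, keys) share a single instance on the heap instead of millions of copies.
private Tuple lineToTuple(String line, int tableIndex) {
    String[] columns = line.split("\\|");
    String[] interned = new String[columns.length];
    for (int i = 0; i < columns.length; i++) {
        interned[i] = columns[i].intern();
    }
    return new Tuple(tableIndex, interned); // Tuple is the asker's custom row class
}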
These are the properties that I changed on my Flink standalone cluster to compute TPC-H query 03.
jobmanager.memory.process.size: 1600m
heartbeat.timeout: 100000
taskmanager.memory.process.size: 8g # default: 1728m
I implemented this query to stream only the Order table and kept the other tables as state. Also, I am computing it as a windowless query, which I think makes more sense and is faster.
public class TPCHQuery03 {
private final String topic = "topic-tpch-query-03";
public TPCHQuery03() {
this(PARAMETER_OUTPUT_LOG, "127.0.0.1", false, false, -1);
}
public TPCHQuery03(String output, String ipAddressSink, boolean disableOperatorChaining, boolean pinningPolicy, long maxCount) {
try {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);
if (disableOperatorChaining) {
env.disableOperatorChaining();
}
DataStream<Order> orders = env
.addSource(new OrdersSource(maxCount)).name(OrdersSource.class.getSimpleName()).uid(OrdersSource.class.getSimpleName());
// Filter market segment "AUTOMOBILE"
// customers = customers.filter(new CustomerFilter());
// Filter all Orders with o_orderdate < 12.03.1995
DataStream<Order> ordersFiltered = orders
.filter(new OrderDateFilter("1995-03-12")).name(OrderDateFilter.class.getSimpleName()).uid(OrderDateFilter.class.getSimpleName());
// Join customers with orders and package them into a ShippingPriorityItem
DataStream<ShippingPriorityItem> customerWithOrders = ordersFiltered
.keyBy(new OrderKeySelector())
.process(new OrderKeyedByCustomerProcessFunction(pinningPolicy)).name(OrderKeyedByCustomerProcessFunction.class.getSimpleName()).uid(OrderKeyedByCustomerProcessFunction.class.getSimpleName());
// Join the last join result with Lineitems
DataStream<ShippingPriorityItem> result = customerWithOrders
.keyBy(new ShippingPriorityOrderKeySelector())
.process(new ShippingPriorityKeyedProcessFunction(pinningPolicy)).name(ShippingPriorityKeyedProcessFunction.class.getSimpleName()).uid(ShippingPriorityKeyedProcessFunction.class.getSimpleName());
// Group by l_orderkey, o_orderdate and o_shippriority and compute revenue sum
DataStream<ShippingPriorityItem> resultSum = result
.keyBy(new ShippingPriority3KeySelector())
.reduce(new SumShippingPriorityItem(pinningPolicy)).name(SumShippingPriorityItem.class.getSimpleName()).uid(SumShippingPriorityItem.class.getSimpleName());
// emit result
if (output.equalsIgnoreCase(PARAMETER_OUTPUT_MQTT)) {
resultSum
.map(new ShippingPriorityItemMap(pinningPolicy)).name(ShippingPriorityItemMap.class.getSimpleName()).uid(ShippingPriorityItemMap.class.getSimpleName())
.addSink(new MqttStringPublisher(ipAddressSink, topic, pinningPolicy)).name(OPERATOR_SINK).uid(OPERATOR_SINK);
} else if (output.equalsIgnoreCase(PARAMETER_OUTPUT_LOG)) {
resultSum.print().name(OPERATOR_SINK).uid(OPERATOR_SINK);
} else if (output.equalsIgnoreCase(PARAMETER_OUTPUT_FILE)) {
StreamingFileSink<String> sink = StreamingFileSink
.forRowFormat(new Path(PATH_OUTPUT_FILE), new SimpleStringEncoder<String>("UTF-8"))
.withRollingPolicy(
DefaultRollingPolicy.builder().withRolloverInterval(TimeUnit.MINUTES.toMillis(15))
.withInactivityInterval(TimeUnit.MINUTES.toMillis(5))
.withMaxPartSize(1024 * 1024 * 1024).build())
.build();
resultSum
.map(new ShippingPriorityItemMap(pinningPolicy)).name(ShippingPriorityItemMap.class.getSimpleName()).uid(ShippingPriorityItemMap.class.getSimpleName())
.addSink(sink).name(OPERATOR_SINK).uid(OPERATOR_SINK);
} else {
System.out.println("discarding output");
}
System.out.println("Stream job: " + TPCHQuery03.class.getSimpleName());
System.out.println("Execution plan >>>\n" + env.getExecutionPlan());
env.execute(TPCHQuery03.class.getSimpleName());
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws Exception {
new TPCHQuery03();
}
}
The UDFs are here: OrderSource, OrderKeyedByCustomerProcessFunction, ShippingPriorityKeyedProcessFunction, and SumShippingPriorityItem. I am using com.google.common.collect.ImmutableList since the state will not be updated. Also, I am keeping only the necessary columns in the state, such as ImmutableList<Tuple2<Long, Double>> lineItemList.
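For reference, a minimal sketch of how such a state declaration might look inside one of those keyed process functions; the descriptor name and field name are assumptions, not the actual UDF code:

import com.google.common.collect.ImmutableList;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

// Hypothetical fragment of a keyed process function: only the two columns needed
// for the revenue sum are kept in keyed state, as an immutable list.
private transient ValueState<ImmutableList<Tuple2<Long, Double>>> lineItemState;

@Override
public void open(Configuration parameters) {
    lineItemState = getRuntimeContext().getState(new ValueStateDescriptor<>(
            "lineItemList",
            TypeInformation.of(new TypeHint<ImmutableList<Tuple2<Long, Double>>>() {})));
}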
I'm working with Akka (version 2.4.17) to build an observation Flow in Java (let's say of elements of type <T> to stay generic).
My requirement is that this Flow should be customizable to deliver a maximum number of observations per unit of time as soon as they arrive. For instance, it should be able to deliver at most 2 observations per minute (the first that arrive, the rest can be dropped).
I looked very closely at the Akka documentation, and in particular this page, which details the built-in stages and their semantics.
So far, I tried the following approaches.
With throttle and shaping() mode (to not close the stream when the limit is exceeded):
Flow.of(T.class)
    .throttle(2,
        new FiniteDuration(1, TimeUnit.MINUTES),
        0,
        ThrottleMode.shaping())
With groupedWithin and an intermediate custom method:
final int nbObsMax = 2;
Flow.of(T.class)
    .groupedWithin(Integer.MAX_VALUE, new FiniteDuration(1, TimeUnit.MINUTES))
    .map(list -> {
        List<T> listToTransfer = new ArrayList<>();
        for (int i = list.size() - nbObsMax; i > 0 && i < list.size(); i++) {
            listToTransfer.add(new T(list.get(i)));
        }
        return listToTransfer;
    })
    .mapConcat(elem -> elem) // Splitting List<T> in a Flow of T objects
Previous approaches give me the correct number of observations per unit of time but these observations are retained and only delivered at the end of the time window (and therefore there is an additional delay).
To give a more concrete example, if the following observations arrive in my Flow:
[Obs1 t=0s] [Obs2 t=45s] [Obs3 t=47s] [Obs4 t=121s] [Obs5 t=122s]
It should only output the following ones as soon as they arrive (processing time can be neglected here):
Window 1: [Obs1 t~0s] [Obs2 t~45s]
Window 2: [Obs4 t~121s] [Obs5 t~122s]
Any help will be appreciated, thanks for reading my first StackOverflow post ;)
I cannot think of an out-of-the-box solution that does what you want. throttle will emit in a steady stream because of how it is implemented with the bucket model, rather than granting the full allowance at the start of every time period.
To get the exact behavior you are after you would have to create your own custom rate-limit stage (which might not be that hard). You can find the docs on how to create custom stages here: http://doc.akka.io/docs/akka/2.5.0/java/stream/stream-customize.html#custom-linear-processing-stages-using-graphstage
One design that could work is an allowance counter, saying how many elements can still be emitted, that you reset every interval: for every incoming element you subtract one from the counter and emit; when the allowance is used up you keep pulling upstream but discard the elements rather than emit them. Using TimerGraphStageLogic for the GraphStageLogic allows you to set a timed callback that can reset the allowance.
I think this is exactly what you need: http://doc.akka.io/docs/akka/2.5.0/java/stream/stream-cookbook.html#Globally_limiting_the_rate_of_a_set_of_streams
Thanks to the answer of @johanandren, I've successfully implemented a custom time-based GraphStage that meets my requirements.
I post the code below, if anyone is interested:
import akka.stream.Attributes;
import akka.stream.FlowShape;
import akka.stream.Inlet;
import akka.stream.Outlet;
import akka.stream.stage.*;
import scala.concurrent.duration.FiniteDuration;
public class CustomThrottleGraphStage<A> extends GraphStage<FlowShape<A, A>> {

    private final FiniteDuration silencePeriod;
    private int nbElemsMax;

    public CustomThrottleGraphStage(int nbElemsMax, FiniteDuration silencePeriod) {
        this.silencePeriod = silencePeriod;
        this.nbElemsMax = nbElemsMax;
    }

    public final Inlet<A> in = Inlet.create("TimedGate.in");
    public final Outlet<A> out = Outlet.create("TimedGate.out");

    private final FlowShape<A, A> shape = FlowShape.of(in, out);

    @Override
    public FlowShape<A, A> shape() {
        return shape;
    }

    @Override
    public GraphStageLogic createLogic(Attributes inheritedAttributes) {
        return new TimerGraphStageLogic(shape) {

            private boolean open = false;
            private int countElements = 0;

            {
                setHandler(in, new AbstractInHandler() {
                    @Override
                    public void onPush() throws Exception {
                        A elem = grab(in);
                        if (open || countElements >= nbElemsMax) {
                            pull(in); // we drop all incoming observations since the rate limit has been reached
                        } else {
                            if (countElements == 0) { // we schedule the next instant to reset the observation counter
                                scheduleOnce("resetCounter", silencePeriod);
                            }
                            push(out, elem); // we forward the incoming observation
                            countElements += 1; // we increment the counter
                        }
                    }
                });
                setHandler(out, new AbstractOutHandler() {
                    @Override
                    public void onPull() throws Exception {
                        pull(in);
                    }
                });
            }

            @Override
            public void onTimer(Object key) {
                if (key.equals("resetCounter")) {
                    open = false;
                    countElements = 0;
                }
            }
        };
    }
}
I'm playing with the RxJava retryWhen operator. Very little is found about it on the internet, the only one worthy of any mention being this. That too falls short of exploring the various use cases that I'd like to understand. I also threw in asynchronous execution and retry with back-off to make it more realistic.
My setup is simple: I've a class ChuckNorrisJokesRepository that returns random number of Chuck Norris jokes from a JSON file. My class under test is ChuckNorrisJokesService which is shown below. The use cases I'm interested in are as follows:
Succeeds on 1st attempt (no retries)
Fails after 1 retry
Attempts to retry 3 times but succeeds on 2nd hence doesn't retry 3rd time
Succeeds on 3rd retry
Note: The project is available on my GitHub.
ChuckNorrisJokesService.java:
@Slf4j
@Builder
public class ChuckNorrisJokesService {

    @Getter
    private final AtomicReference<Jokes> jokes = new AtomicReference<>(new Jokes());

    private final Scheduler scheduler;
    private final ChuckNorrisJokesRepository jokesRepository;
    private final CountDownLatch latch;
    private final int numRetries;
    private final Map<String, List<String>> threads;

    public static class ChuckNorrisJokesServiceBuilder {
        public ChuckNorrisJokesService build() {
            if (scheduler == null) {
                scheduler = Schedulers.io();
            }
            if (jokesRepository == null) {
                jokesRepository = new ChuckNorrisJokesRepository();
            }
            if (threads == null) {
                threads = new ConcurrentHashMap<>();
            }
            requireNonNull(latch, "CountDownLatch must not be null.");
            return new ChuckNorrisJokesService(scheduler, jokesRepository, latch, numRetries, threads);
        }
    }

    public void setRandomJokes(int numJokes) {
        mergeThreadNames("getRandomJokes");

        Observable.fromCallable(() -> {
            log.debug("fromCallable - before call. Latch: {}.", latch.getCount());
            mergeThreadNames("fromCallable");
            latch.countDown();
            List<Joke> randomJokes = jokesRepository.getRandomJokes(numJokes);
            log.debug("fromCallable - after call. Latch: {}.", latch.getCount());
            return randomJokes;
        }).retryWhen(errors ->
            errors.zipWith(Observable.range(1, numRetries), (n, i) -> i).flatMap(retryCount -> {
                log.debug("retryWhen. retryCount: {}.", retryCount);
                mergeThreadNames("retryWhen");
                return Observable.timer(retryCount, TimeUnit.SECONDS);
            }))
            .subscribeOn(scheduler)
            .subscribe(j -> {
                    log.debug("onNext. Latch: {}.", latch.getCount());
                    mergeThreadNames("onNext");
                    jokes.set(new Jokes("success", j));
                    latch.countDown();
                },
                ex -> {
                    log.error("onError. Latch: {}.", latch.getCount(), ex);
                    mergeThreadNames("onError");
                },
                () -> {
                    log.debug("onCompleted. Latch: {}.", latch.getCount());
                    mergeThreadNames("onCompleted");
                    latch.countDown();
                }
            );
    }

    private void mergeThreadNames(String methodName) {
        threads.merge(methodName,
            new ArrayList<>(Arrays.asList(Thread.currentThread().getName())),
            (value, newValue) -> {
                value.addAll(newValue);
                return value;
            });
    }
}
For brevity, I'll only show the Spock test case for the 1st use case. See my GitHub for the other test cases.
def "succeeds on 1st attempt"() {
    setup:
    CountDownLatch latch = new CountDownLatch(2)
    Map<String, List<String>> threads = Mock(Map)
    ChuckNorrisJokesService service = ChuckNorrisJokesService.builder()
            .latch(latch)
            .threads(threads)
            .build()

    when:
    service.setRandomJokes(3)
    latch.await(2, TimeUnit.SECONDS)
    Jokes jokes = service.jokes.get()

    then:
    jokes.status == 'success'
    jokes.count() == 3
    1 * threads.merge('getRandomJokes', *_)
    1 * threads.merge('fromCallable', *_)
    0 * threads.merge('retryWhen', *_)
    1 * threads.merge('onNext', *_)
    0 * threads.merge('onError', *_)
    1 * threads.merge('onCompleted', *_)
}
This fails with:
Too few invocations for:
1 * threads.merge('fromCallable', *_) (0 invocations)
1 * threads.merge('onNext', *_) (0 invocations)
What I'm expecting is that fromCallable is called once, it succeeds, onNext is called once, followed by onCompleted. What am I missing?
P.S.: Full disclosure - I've also posted this question on RxJava GitHub.
I solved this after several hours of troubleshooting and with help from ReactiveX member David Karnok.
retryWhen is a complicated, perhaps even buggy, operator. The official doc and at least one answer here use the range operator, which completes immediately if there are no retries to be made, so the stream completes without ever signalling onError. See my discussion with David Karnok.
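For illustration, one common shape that avoids the range pitfall is a plain counter that re-emits the original error once the retries are exhausted. This is a sketch of the idea, not the exact code from my repository:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import rx.Observable;

// Hedged sketch: retry up to maxRetries times with an increasing back-off, then
// propagate the original error so that onError fires instead of the stream
// completing silently.
public static <T> Observable<T> withRetries(Observable<T> source, int maxRetries) {
    AtomicInteger attempts = new AtomicInteger();
    return source.retryWhen(errors -> errors.flatMap(error ->
            attempts.incrementAndGet() <= maxRetries
                    ? Observable.timer(attempts.get(), TimeUnit.SECONDS) // wait, then resubscribe
                    : Observable.<Long>error(error)));                   // out of retries: fail
}

The same shape works in RxJava 2 with io.reactivex.Observable or Flowable; only the imports change.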
The code is available on my GitHub complete with the following test cases:
Succeeds on 1st attempt (no retries)
Fails after 1 retry
Attempts to retry 3 times but succeeds on 2nd hence doesn't retry 3rd time
Succeeds on 3rd retry
I have this code that dumps documents into MongoDB once an ArrayBlockingQueue fills its quota. When I run the code, it seems to only run once and then gives me a stack trace. My guess is that the BulkWriteOperation somehow has to be 'reset' or started over again.
Also, I create the BulkWriteOperations in the constructor...
bulkEvent = eventsCollection.initializeOrderedBulkOperation();
bulkSession = sessionsCollection.initializeOrderedBulkOperation();
Here's the stacktrace.
10 records inserted
java.lang.IllegalStateException: already executed
at org.bson.util.Assertions.isTrue(Assertions.java:36)
at com.mongodb.BulkWriteOperation.insert(BulkWriteOperation.java:62)
at willkara.monkai.impl.managers.DataManagers.MongoDBManager.dumpQueue(MongoDBManager.java:104)
at willkara.monkai.impl.managers.DataManagers.MongoDBManager.addToQueue(MongoDBManager.java:85)
Here's the code for the Queues:
public void addToQueue(Object item) {
    if (item instanceof SakaiEvent) {
        if (eventQueue.offer((SakaiEvent) item)) {
        } else {
            dumpQueue(eventQueue);
        }
    }
    if (item instanceof SakaiSession) {
        if (sessionQueue.offer((SakaiSession) item)) {
        } else {
            dumpQueue(sessionQueue);
        }
    }
}
And here is the code that reads from the queues, adds the items to a BulkWriteOperation (created with initializeOrderedBulkOperation), executes it, and dumps the result to the database. Only 10 documents get written and then it fails.
private void dumpQueue(BlockingQueue q) {
    Object item = q.peek();
    Iterator itty = q.iterator();
    BulkWriteResult result = null;
    if (item instanceof SakaiEvent) {
        while (itty.hasNext()) {
            bulkEvent.insert(((SakaiEvent) itty.next()).convertToDBObject());
            //It's failing at that line^^
        }
        result = bulkEvent.execute();
    }
    if (item instanceof SakaiSession) {
        while (itty.hasNext()) {
            bulkSession.insert(((SakaiSession) itty.next()).convertToDBObject());
        }
        result = bulkSession.execute();
    }
    System.out.println(result.getInsertedCount() + " records inserted");
}
The general documentation applies to all driver implementations in this case:
"After execution, you cannot re-execute the Bulk() object without reinitializing."
The .execute() method effectively "drains" the current list of operations that have been sent to it, and the object now contains state information about how the commands were actually sent. So you cannot add more entries or call .execute() again on the same instance without reinitializing.
So after you call execute on each "Bulk" object, you need to call the initialization again:
bulkEvent = eventsCollection.initializeOrderedBulkOperation();
bulkSession = sessionsCollection.initializeOrderedBulkOperation();
Each of those lines should be placed again, respectively, after each .execute() call in your function. Then further calls to those instances can add operations and call execute again, continuing the cycle.
Note that "Bulk" operations objects will store as many items as you want to put into them but will break up requests to the server into maximum amounts of 1000 items. After execution the state of the operations list will reflect exactly how this is done should you want to inspect that.
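A hedged sketch of how that looks in the dumpQueue method from the question (the field and collection names are taken from the question; only the event branch is shown):

// After execute(), swap in a freshly initialized bulk operation so the next
// batch starts from a clean state instead of hitting "already executed".
private void dumpQueue(BlockingQueue q) {
    Object item = q.peek();
    if (item instanceof SakaiEvent) {
        Iterator itty = q.iterator();
        while (itty.hasNext()) {
            bulkEvent.insert(((SakaiEvent) itty.next()).convertToDBObject());
        }
        BulkWriteResult result = bulkEvent.execute();
        bulkEvent = eventsCollection.initializeOrderedBulkOperation(); // reinitialize for the next cycle
        System.out.println(result.getInsertedCount() + " records inserted");
    }
    // ... same pattern for SakaiSession and bulkSession ...
}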
I have to write a JavaScript function that returns some data to the caller.
In that function I have multiple ways to retrieve data, i.e.:
Lookup from cache
Retrieve from HTML5 LocalStorage
Retrieve from REST Backend (bonus: put the fresh data back into cache)
Each option may take its own time to finish and it may succeed or fail.
What I want to do is execute all three options asynchronously/in parallel and return the result of whichever returns first.
I understand that true parallel execution is not possible in JavaScript since it is single-threaded, but I want to at least execute them asynchronously and cancel the other tasks if one of them returns a successful result.
I have one more question: how to return early and continue executing the remaining tasks in a JavaScript function.
Example pseudo code:
function getOrder(id) {
    var order;
    // early return if the order is found in cache.
    if (order = cache.get(id)) return order;
    // continue to get the order from the backend REST API.
    order = cache.put(backend.get(id));
    return order;
}
Please advise how to implement those requirements in JavaScript.
Solutions discovered so far:
Fastest Result
JavaScript ES6 solution
Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise
Promise.race(iterable)
Returns a promise that settles as soon as the first promise in the iterable settles (resolves or rejects).
var p1 = new Promise(function(resolve, reject) { setTimeout(resolve, 500, "one"); });
var p2 = new Promise(function(resolve, reject) { setTimeout(resolve, 100, "two"); });
Promise.race([p1, p2]).then(function(value) {
// value == "two"
});
Java/Groovy solution
Ref: http://gpars.org/1.1.0/guide/guide/single.html
import groovyx.gpars.dataflow.Promise
import groovyx.gpars.dataflow.Select
import groovyx.gpars.group.DefaultPGroup
import java.util.concurrent.atomic.AtomicBoolean
/**
 * Demonstrates the use of dataflow tasks and selects to pick the fastest result of concurrently run calculations.
 * It shows a way to cancel the slower tasks once a result is known.
 */
final group = new DefaultPGroup()
final done = new AtomicBoolean()

group.with {
    Promise p1 = task {
        sleep(1000)
        if (done.get()) return
        10 * 10 + 1
    }
    Promise p2 = task {
        sleep(1000)
        if (done.get()) return
        5 * 20 + 2
    }
    Promise p3 = task {
        sleep(1000)
        if (done.get()) return
        1 * 100 + 3
    }
    final alt = new Select(group, p1, p2, p3, Select.createTimeout(500))
    def result = alt.select()
    done.set(true)
    println "Result: " + result
}
Early Return and Interactive Function
Angular Promises combined with ES6 generators???
angular.module('org.common')
.service('SpaceService', function ($q, $timeout, Restangular, $angularCacheFactory) {
var _spacesCache = $angularCacheFactory('spacesCache', {
maxAge: 120000, // items expire after two min
deleteOnExpire: 'aggressive',
onExpire: function (key, value) {
Restangular.one('organizations', key).getList('spaces').then(function (data) {
_spacesCache.put(key, data);
});
}
});
/**
* @class SpaceService
*/
return {
getAllSpaces: function (orgId) {
var deferred = $q.defer();
var spaces;
if (spaces = _spacesCache.get(orgId)) {
deferred.resolve(spaces);
} else {
Restangular.one('organizations', orgId).getList('spaces').then(function (data) {
_spacesCache.put(orgId, data);
deferred.resolve(data);
} , function errorCallback(err) {
deferred.reject(err);
});
}
return deferred.promise;
},
getAllSpaces1: function (orgId) {
var deferred = $q.defer();
var spaces;
var timerID = $timeout(
Restangular.one('organizations', orgId).getList('spaces').then(function (data) {
_spacesCache.put(orgId, data);
deferred.resolve(data);
}), function errorCallback(err) {
deferred.reject(err);
}, 0);
deferred.notify('Trying the cache now...'); //progress notification
if (spaces = _spacesCache.get(orgId)) {
$timeout.cancel(timerID);
deferred.resolve(spaces);
}
return deferred.promise;
},
getAllSpaces2: function (orgId) {
// set up a dummy canceler
var canceler = $q.defer();
var deferred = $q.defer();
var spaces;
$timeout(
Restangular.one('organizations', orgId).withHttpConfig({timeout: canceler.promise}).getList('spaces').then(function (data) {
_spacesCache.put(orgId, data);
deferred.resolve(data);
}), function errorCallback(err) {
deferred.reject(err);
}, 0);
if (spaces = _spacesCache.get(orgId)) {
canceler.resolve();
deferred.resolve(spaces);
}
return deferred.promise;
},
addSpace: function (orgId, space) {
_spacesCache.remove(orgId);
// do something with the data
return '';
},
editSpace: function (space) {
_spacesCache.remove(space.organization.id);
// do something with the data
return '';
},
deleteSpace: function (space) {
console.table(space);
_spacesCache.remove(space.organization.id);
return space.remove();
}
};
});
Personally, I would try the three asynchronous retrievals sequentially, starting with the least expensive and ending with the most expensive. However, responding to the first of three parallel retrievals is an interesting problem.
You should be able to exploit the characteristics of $q.all(promises), by which:
as soon as any of the promises fails then the returned promise is rejected
if all promises are successful then the returned promise is resolved.
But you want to invert the logic such that:
as soon as any of the promises is successful then the returned promise is resolved
if all promises fail then the returned promise is rejected.
This should be achievable with an invert() utility which converts success to failure and vice versa.
function invert(promise) {
    return promise.then(function(x) {
        return $q.reject(x); // success becomes failure
    }, function(x) {
        return x; // failure becomes success
    });
}
And a first() utility, to give the desired behaviour:
function first(arr) {
return invert($q.all(arr.map(invert)));
}
Notes:
the input arr is an array of promises
a native implementation of array.map() is assumed (otherwise you can explicitly loop to achieve the same effect)
the outer invert() in first() restores the correct sense of the promise it returns
I'm not particularly experienced in angular, so I may have made syntactic errors - however I think the logic is correct.
Then getOrder() will be something like this:
function getOrder(id) {
    return first([
        cache.get(id),
        localStorage.get(id).then(cache.put),
        backend.get(id).then(cache.put).then(localStorage.put)
    ]);
}
Thus, getOrder(id) should return a Promise of an order (not the order directly).
The problem in your example getOrder is that if the 3 lookup functions are asynchronous, you won't get the order back from them right away, and since they are not blocking, getOrder would return null. You would be better off defining a callback function that acts on the first returned order data and simply ignores the rest.
var doSomethingWithTheOrder = function CallBackOnce(yourResult) {
    if (!CallBackOnce.returned) {
        CallBackOnce.returned = true;
        // Handle the returned data
        console.log('handle', yourResult);
    } else {
        // Ignore the rest
        console.log('you are too late');
    }
}
Make your data lookup functions accept a callback
function cacheLookUp(id, callback) {
    // Make a real lookup here
    setTimeout(function () {
        callback('order data from cache');
    }, 3000);
}

function localeStorageLookUp(id, callback) {
    // Make a real lookup here
    setTimeout(function () {
        callback('order data from locale storage');
    }, 1500);
}

function restLookUp(id, callback) {
    // Make a real lookup here
    setTimeout(function () {
        callback('order data from rest');
    }, 5000);
}
And pass the callback function to each of them
function getOrder(id) {
    cacheLookUp(id, doSomethingWithTheOrder);
    localeStorageLookUp(id, doSomethingWithTheOrder);
    restLookUp(id, doSomethingWithTheOrder);
}
Create a broadcast event in your API calls, then use $scope.$on to listen for those broadcasts; when $on gets activated, run the function that refreshes those objects.
So, in your service, have a function that makes an AJAX call to your REST API.
You would have 3 AJAX calls and 3 listeners. Each of them would look something like this.
This is just pseudo code, but this is the general format:
$http({
    method: "GET",
    url: url_of_api
}).success(function (data, status, headers, config) {
    $rootScope.$broadcast('success', data);
});
In your controller, have a listener, something like this:
$scope.$on('success', function (event, args) {
    // Check the state of the other objects - have they been refreshed?
    // You probably want to set flags to check.
    if (noOtherFlagsAreSet) {
        $scope.data = args; // args is the returned data and $scope.data is what you're trying to refresh
    }
});