I am measuring the cost of requests to GAE by inspecting the x-appengine-estimated-cpm-us-dollars header. This works great, and in combination with x-appengine-resource-usage and x-traceurl I can get even more detailed information.
However, a large part of my application runs in the context of task queues, so a huge part of the instance-hour costs is consumed by queues. Whenever code is executed outside of a request, its costs are not included in the x-appengine-estimated-cpm-us-dollars header.
I am looking for a way to measure the full cost consumed by each request, i.e. the cost generated by the request itself plus the cost of the tasks that were added by this request.
It may be overkill, but there is a tool you can use to download the Google App Engine logs and convert them to SQLite.
http://code.google.com/p/google-app-engine-samples/source/browse/trunk/logparser/logparser.py
With this tool, the cpm_usd values for both task requests and normal requests are downloaded together. You can store each day's log in a separate SQLite file and do as much analysis as you want.
As for relating the cost of a task back to its original request: the log data downloaded with this tool includes the full output of the logging module. So you can simply:
log a generated id in the original request,
pass the id to the task,
log the received id again in the task request,
and find normal/task request pairs via the id.
for example:
import logging
from google.appengine.api import taskqueue

# in the original request
a_id = generate_a_random_id()
logging.info(a_id)  # the id will be included in this request's log
taskqueue.add(url='/path_to_task', params={'id': a_id})

# in the task request
a_id = self.request.get('id')
logging.info(a_id)
EDIT1
I think there is another possible way to estimate the cost of a normal request plus its task requests. The trick is to change the async task into a sync one (assuming the cost would be the same). I haven't tried it, but it is much easier to test.
# in the original request, add a parameter to identify debug runs
debug = self.request.get('DEBUG')
if debug:
    self.redirect('/path_to_task')
else:
    taskqueue.add(url='/path_to_task')
Thus, when you test the normal request with the DEBUG parameter, it will first process the normal request and return x-appengine-estimated-cpm-us-dollars for it. It will then redirect your test client to the corresponding task request (a task request can also be accessed and triggered via its URL, just like a normal request), which returns x-appengine-estimated-cpm-us-dollars for the task request. You can simply add the two together to get the total cost.
Hi, I'm trying to use the Elasticsearch reindex API via the REST high level client, and I am comparing two ways of doing it.
Rest API:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-task-api
[![Rest API Documentation screenshot][1]][1]
Running reindex asynchronously - If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at _tasks/<task_id>. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
rest high level client:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-reindex.html#java-rest-high-document-reindex-task-submission
[![rest high level client Documentation screenshot][2]][2]
Reindex task submission - It is also possible to submit a ReindexRequest and not wait for its completion with the use of the Task API. This is the equivalent of a REST request with the wait_for_completion flag set to false.
I'm trying to figure this out: from the REST API docs I know that I should delete the task document so Elasticsearch can reclaim the space. Since the REST high level client is basically doing the same thing, do I need to "delete the task document" if I choose to use this client instead of the REST API? If so, how can I do that?
Thanks
[1]: https://i.stack.imgur.com/OEVHi.png
[2]: https://i.stack.imgur.com/sw9Dw.png
The task document is just a summary of what happened during the reindex (so a small document). Since you asked for async execution with wait_for_completion=false, it is created in the system index .tasks, so you can query this index like any other to find the summary and delete it.
The .tasks index will not be accessible by default in future versions of Elasticsearch, and you will need to use the specific functions linked to _tasks in the Java REST API available here.
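For example, here is a minimal sketch of deleting the task summary with the high level client (this assumes a recent 7.x client instance named client; the index names are illustrative):

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.tasks.TaskSubmissionResponse;
import org.elasticsearch.index.reindex.ReindexRequest;

// Submit the reindex without waiting for completion (wait_for_completion=false).
ReindexRequest reindexRequest = new ReindexRequest();
reindexRequest.setSourceIndices("source_index");
reindexRequest.setDestIndex("dest_index");
TaskSubmissionResponse submission =
        client.submitReindexTask(reindexRequest, RequestOptions.DEFAULT);
String taskId = submission.getTask();

// ... later, once the reindex has finished, delete the task summary
// document from the .tasks system index to reclaim the space.
client.delete(new DeleteRequest(".tasks", taskId), RequestOptions.DEFAULT);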
I need to
start a BluePrism process and
wait until it has completed
in a Java application which runs on a machine without a BluePrism client.
I know that it is possible to start a process using a SOAP call.
How can I find out whether or not the started process is finished and whether or not it completed successfully?
A colleague of mine said that it is possible to get a notification from BluePrism by passing a special parameter in the SOAP request, but I could not find anything on that in the Web Services User Guide.
Update 1: One solution is to adapt this software so that it exposes the BluePrism queues via a REST API.
Update 2: This page suggests running a query like the one below against the BluePrism database.
SELECT
[BPAProcess].[name],
[BPAProcess].[description],
[BPASession].[sessionid],
[BPASession].[startdatetime],
[BPASession].[enddatetime],
[BPASession].[statusid],
[BPAStatus].[description]
FROM [BPAProcess]
JOIN [BPASession] ON
[BPASession].[processid] = [BPAProcess].[processid]
JOIN [BPAStatus] ON
[BPASession].[statusid] = [BPAStatus].[statusid]
WHERE [BPAStatus].[description] IN ('Completed', 'Stopped', 'Terminated')
AND [BPASession].[sessionid] = 'Your session id'
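Building on that query, one could poll the database from Java until the session reaches a terminal state. This is only a sketch: it assumes direct JDBC access to the BluePrism database via the Microsoft SQL Server driver, and the connection details and waitForSession helper are mine:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SessionWatcher {
    // Polls BPASession until the given session id reaches a terminal status.
    static String waitForSession(String sessionId) throws Exception {
        String sql = "SELECT [BPAStatus].[description] "
                + "FROM [BPASession] "
                + "JOIN [BPAStatus] ON [BPASession].[statusid] = [BPAStatus].[statusid] "
                + "WHERE [BPASession].[sessionid] = ?";
        try (Connection con = DriverManager.getConnection(
                "jdbc:sqlserver://bp-db-host;databaseName=BluePrism", "user", "password")) {
            while (true) {
                try (PreparedStatement ps = con.prepareStatement(sql)) {
                    ps.setString(1, sessionId);
                    try (ResultSet rs = ps.executeQuery()) {
                        if (rs.next()) {
                            String status = rs.getString(1);
                            if (status.equals("Completed") || status.equals("Stopped")
                                    || status.equals("Terminated")) {
                                return status; // terminal state reached
                            }
                        }
                    }
                }
                Thread.sleep(5000); // wait before polling again
            }
        }
    }
}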
Update 3: The BluePrism version is 6.4.2.
Update 4: Additional information is available in the BluePrism community.
If you expose the process in question as a web service (System -> Processes -> Exposure) and invoke it this way, the SOAP response will not be returned until the process has completed running. Your Java code can simply wait for the response to be returned from the endpoint to be sure that the process you invoked has completed.
While I can't seem to locate any formal documentation of this behavior, this aligns with the intended design to enable the return of output values from the process/object being invoked back to the SOAP caller. (The output values couldn't possibly be known if the request resolves before the process is finished executing.)
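A minimal sketch of such a blocking call follows. The endpoint URL, SOAPAction and envelope body are placeholders; the real values come from the WSDL that BluePrism generates for the exposed process:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RunProcess {
    public static void main(String[] args) throws Exception {
        // Placeholder envelope - the real body schema comes from the WSDL.
        String envelope =
                "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
              + "<soap:Body><MyProcess/></soap:Body></soap:Envelope>";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://bp-server:8181/ws/MyProcess"))
                .header("Content-Type", "text/xml; charset=utf-8")
                .header("SOAPAction", "\"MyProcess\"")
                .POST(HttpRequest.BodyPublishers.ofString(envelope))
                .build();

        // send() blocks until the SOAP response arrives, i.e. until the
        // BluePrism process has completed and its outputs are known.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // contains the process output values
    }
}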
I'm running an analysis of the time it takes to run a CouchDB purge from a Java program. The CouchDB connections and calls are handled using Ektorp. For a small number of documents the purge goes through and I receive a success response.
But when I purge ~10,000 or more documents, I get the following error:
org.ektorp.DbAccessException: 500:Internal Server Error
URI: /dbname/_purge
Response Body:
{
"error" : "timeout",
"reason" : "{gen_server,call,
....
Checking the DB status with a curl command shows that the purge has actually taken place. But this timeout does not let me measure the actual duration of the purge in my Java program, since the call throws an exception.
From some research, I believe this is due to the default timeout value of an Erlang gen_server process. Is there any way for me to fix this?
I have tried changing the timeout values of the StdHttpClient to no avail.
HttpClient authenticatedHttpClient = new StdHttpClient.Builder()
        .url(url)
        .username(Conf.COUCH_USERNAME)
        .password(Conf.COUCH_PASSWORD)
        .connectionTimeout(600 * 1000)  // 10 minutes, in milliseconds
        .socketTimeout(600 * 1000)
        .build();
CouchDB Dev here. You are not supposed to use purge with large numbers of documents. This is to remove accidentally added data from the DB, like credit card or social security numbers. This isn’t meant for general operations.
Consequently, you can’t raise that gen_server timeout :)
How can I get the loading time of each component of a page in CQ5, from the server side?
In my current implementation, we get the longest page load times from the request.log file. But I need the loading time of each component of a page, measured on the server side.
I found this link, but it works from the client side:
http://www.wemblog.com/2014/05/how-to-find-component-load-time-on-page.html
You have to include a logger in every component tag class and start/stop a stopwatch at the entry and exit of the call.
// e.g. org.springframework.util.StopWatch
Logger LOG = LoggerFactory.getLogger(MyComponent.class);
StopWatch stopWatch = new StopWatch("myComponent");
stopWatch.start();
// ... component rendering logic ...
stopWatch.stop();
LOG.info("Rendering took {} ms", stopWatch.getTotalTimeMillis());
Once you have included this in your tag class, you will be able to find the time taken by each component on a particular page.
You can use PuTTY to access your server logs.
Starting with AEM 6.0, there is an OOTB feature to measure the rendering time of each component on a page.
It is accessible through the Touch UI's Developer mode.
However, it won't work if AEM is installed with the 'nosamplecontent' run mode.
You can use the RequestProgressTracker, as explained in the Sling documentation.
It's obtainable from SlingHttpServletRequest#getRequestProgressTracker. In order to get the timing stats for your components, you can use a Servlet Filter to execute the code on every request.
Whenever the filter is called:
Get the RequestProgressTracker from the request object
Call getMessages to obtain an iterator over a collection of request progress messages for the current request
Analyze the messages to find the resource types and timing information. Unfortunately, every message is only available as a String so you'll need to parse it to get the data.
The Sling documentation shows some example messages; the last one is the kind we're looking for:
TIMER_END{103,/libs/sling/servlet/default/explorer/node.esp#0}
The number 103 is the number of milliseconds the script took to execute, and the value after the comma is the script that was run. You can tailor a regular expression to extract both values from every such message.
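A minimal sketch of such a filter follows (the class name, regex, and use of System.out are mine; in a real project you would register it as an OSGi service with sling.filter.scope=request and report to a logger or monitoring system):

import java.io.IOException;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.request.RequestProgressTracker;

public class ComponentTimingFilter implements Filter {

    // Matches e.g. TIMER_END{103,/libs/sling/servlet/default/explorer/node.esp#0}
    private static final Pattern TIMER_END = Pattern.compile("TIMER_END\\{(\\d+),([^}]+)\\}");

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        chain.doFilter(request, response); // render the page first

        if (request instanceof SlingHttpServletRequest) {
            RequestProgressTracker tracker =
                    ((SlingHttpServletRequest) request).getRequestProgressTracker();
            Iterator<String> messages = tracker.getMessages();
            while (messages.hasNext()) {
                Matcher m = TIMER_END.matcher(messages.next());
                if (m.find()) {
                    long millis = Long.parseLong(m.group(1));
                    String script = m.group(2);
                    System.out.println(script + " took " + millis + " ms");
                }
            }
        }
    }

    public void init(FilterConfig config) {
    }

    public void destroy() {
    }
}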
One of the projects I recently worked on used this approach to report on component performance. We had a neat dashboard in NewRelic with live stats on every component we built.
I am using the Bloomberg API to grab data. Currently, I have 3 processes which get data in the typical way described in the developer's guide. Something like:
Service refDataService = session.getService("//blp/refdata");
Request request = refDataService.createRequest("ReferenceDataRequest");
request.append("securities", "IBM US Equity");
request.append("fields", "PX_LAST");
cid = session.sendRequest(request, null);
That works. Now I would like to expand the logic into something more like an update queue. I would like each process to send its Request to an update-queue process, which in turn would be responsible for creating the session and service and sending the requests. However, I don't see any way to create a Request without the Service. Also, since the request types (reference data, historical data, intraday ticks) are so varied and have such different properties, it is not trivial to create a container object that my update queue could read.
Any ideas on how to accomplish this? My ultimate goal is to have a process (I'm calling update queue) which takes in a list of requests, removes any duplicates, and goes out to Bloomberg for the data in 30 second intervals.
Thank you!
I have updated the jBloomberg library to include tick data. You can submit different types of query to a BloombergSession which acts as a queue. So if you want to submit different types of request you can write something like:
RequestBuilder<IntradayTickData> tickRequest =
new IntradayTickRequestBuilder("SPX Index",
DateTime.now().minusHours(2),
DateTime.now());
RequestBuilder<IntradayBarData> barRequest =
new IntradayBarRequestBuilder("SPX Index",
DateTime.now().minusHours(2),
DateTime.now())
.period(5, TimeUnit.MINUTES);
RequestBuilder<ReferenceData> refRequest =
new ReferenceRequestBuilder("SPX Index", "NAME");
Future<IntradayTickData> ticks = session.submit(tickRequest);
Future<IntradayBarData> bars = session.submit(barRequest);
Future<ReferenceData> name = session.submit(refRequest);
More examples are available in the javadoc.
If you need to fetch the same information regularly, you can reuse a builder and use it in combination with a ScheduledThreadPoolExecutor for example.
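For example, here is a sketch of the 30-second polling loop from the question, reusing the jBloomberg session from the snippet above (the field name and the processing step are placeholders):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Resubmit the same request every 30 seconds and process the snapshot.
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
scheduler.scheduleAtFixedRate(() -> {
    try {
        ReferenceData data = session.submit(
                new ReferenceRequestBuilder("SPX Index", "PX_LAST")).get();
        // ... process the data ...
    } catch (Exception e) {
        e.printStackTrace(); // log and keep the schedule alive
    }
}, 0, 30, TimeUnit.SECONDS);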
Note: the library is still in beta, so don't use it blindly in a black box that trades automatically!