I am working with AWS Kinesis and CloudWatch. How can I fetch many metrics of one shard with one request? This is how I get one metric:
GetMetricStatisticsRequest request = new GetMetricStatisticsRequest();
request.withNamespace(namespace)
       .withDimensions(dimensions)
       .withPeriod(duration)
       .withStatistics(statistic)
       .withMetricName(metricName)
       .withStartTime(startTime)
       .withEndTime(endTime);
You can't fetch data for multiple metrics with one call to GetMetricStatistics.
The GetMetricStatistics API takes a metric name and a list of dimensions, which together identify exactly one metric. To get data for multiple metrics, you have to make one GetMetricStatistics call per metric.
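A minimal sketch of that loop, reusing the same dimensions and time range (the metricNames list and the cloudWatch client are assumptions for illustration):

for (String name : metricNames) {
    // One GetMetricStatistics call per metric: same namespace, dimensions,
    // period and time range; only the metric name changes.
    GetMetricStatisticsRequest request = new GetMetricStatisticsRequest()
            .withNamespace(namespace)
            .withDimensions(dimensions)
            .withPeriod(duration)
            .withStatistics(statistic)
            .withMetricName(name)
            .withStartTime(startTime)
            .withEndTime(endTime);
    GetMetricStatisticsResult result = cloudWatch.getMetricStatistics(request);
    // ... consume result.getDatapoints() ...
}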
Hi, I'm trying to use the Elasticsearch reindex API via the REST high level client, and I am comparing two ways of doing it.
Rest API:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-task-api
[![Rest API Documentation screenshot][1]][1]
Running reindex asynchronously - If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at _tasks/<task_id>. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.
rest high level client:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-reindex.html#java-rest-high-document-reindex-task-submission
[![rest high level client Documentation screenshot][2]][2]
Reindex task submission - It is also possible to submit a ReindexRequest and not wait for its completion with the use of the Task API. This is the equivalent of a REST request with the wait_for_completion flag set to false.
I'm trying to figure this out: from the REST API doc I know that I should delete the task document so Elasticsearch can reclaim the space. Since the REST high level client is basically doing the same thing, do I need to "delete the task document" if I choose to use this client instead of the REST API? If so, how can I do that?
Thanks
[1]: https://i.stack.imgur.com/OEVHi.png
[2]: https://i.stack.imgur.com/sw9Dw.png
The task document is just a summary of what happened during the reindex (so a small document). Since you asked for it to run asynchronously with wait_for_completion=false, the document is created in the .tasks system index, so you can query that index like any other to find the summary and delete it.
The .tasks index will not be directly accessible by default in future versions of Elasticsearch, and you will then need to use the dedicated _tasks functions of the Java REST API instead.
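A hedged sketch of that cleanup with the high level client (submitReindexTask and the DeleteRequest signature are from the 7.x client; the task id value is illustrative):

// Submit the reindex without waiting for completion (wait_for_completion=false).
TaskSubmissionResponse submission =
        client.submitReindexTask(reindexRequest, RequestOptions.DEFAULT);
String taskId = submission.getTask(); // e.g. "oTUltX4IQMOUUVeiohTt8A:12345"

// ... later, once you are done with the task, delete its summary document
// from the .tasks system index so Elasticsearch can reclaim the space:
client.delete(new DeleteRequest(".tasks", taskId), RequestOptions.DEFAULT);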
I'm updating a microservice to Spring Boot 2 and migrating metrics from Dropwizard to Micrometer. We are using Prometheus to store metrics and Grafana to display them. I want to measure requests per second to all URLs. The Micrometer documentation states that:
Timers are intended for measuring short-duration latencies, and the frequency of such events.
So timers seem to be the way to do the job:
Timer.Sample sample = Timer.start(registry);
// ... code which executes the request ...
List<Tag> tags = Arrays.asList(
        Tag.of("status", status),
        Tag.of("uri", uri),
        Tag.of("method", request.getMethod()));
Timer timer = Timer.builder(TIMER_REST)
        .tags(tags)
        .publishPercentiles(0.95, 0.99)
        .distributionStatisticExpiry(Duration.ofSeconds(30))
        .register(registry);
sample.stop(timer);
but it doesn't produce any rate per second; instead we get metrics similar to:
# TYPE timer_rest_seconds summary
timer_rest_seconds{method="GET",status="200",uri="/test",quantile="0.95",} 0.620756992
timer_rest_seconds{method="GET",status="200",uri="/test",quantile="0.99",} 0.620756992
timer_rest_seconds_count{method="GET",status="200",uri="/test",} 7.0
timer_rest_seconds_sum{method="GET",status="200",uri="/test",} 3.656080641
# HELP timer_rest_seconds_max
# TYPE timer_rest_seconds_max gauge
timer_rest_seconds_max{method="GET",status="200",uri="/test",} 0.605290436
What would be the proper way to solve this? Should the rate per second be calculated via Prometheus queries, or returned via a Spring Actuator endpoint?
The rate per second should be computed on the Prometheus side. Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets the user select and aggregate time series data in real time. You can use the rate() function:
The following example expression returns the per-second rate of HTTP requests as measured over the last 5 minutes, per time series in the range vector:
rate(http_requests_total{job="api-server"}[5m])
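Applied to your timer (metric and label names taken from your sample output), this returns the per-second request rate per time series:

rate(timer_rest_seconds_count{uri="/test"}[5m])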
I am using Amazon AWS SES to send my email campaigns. I have around 35,000 subscribers on my list. At present I am using code similar to the following.
for (Entry<Integer, String> emailEntry : email_ids.entrySet()) {
    MimeMessage msg = getMimeMessage(emailEntry.getKey(), emailEntry.getValue());
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    msg.writeTo(outputStream);
    RawMessage rawMessage = new RawMessage(ByteBuffer.wrap(outputStream.toByteArray()));
    ses.sendRawEmail(new SendRawEmailRequest(rawMessage));
}
This way I was able to send email to all my subscribers the way I wanted, but there was a huge bill for data transfer. Each MimeMessage is about 150 KB in size, and sending it to 35,000 subscribers resulted in about 5.5 GB of data transfer.
So I decided to use BulkTemplateEmail in my application: create the template once and send it to all 35,000 recipients. This way the email content has to be sent to SES only once, and there is a significant gain in terms of data transfer.
Can anyone provide me a sample to do this via the Java AWS SDK? I want to add a List-Unsubscribe header on each Destination, and this is where I am actually stuck: I couldn't find any methods to add custom email headers for each Destination. Is this possible with BulkTemplateEmail?
Any info is highly appreciated.
When sending emails using SES, Amazon charges for data transfer out. The current price is $0.12 per GB. For large volumes of emails this can result in serious charges.
Amazon SES pricing
For embedded images, attachments, etc., another solution is to use links instead of embedded objects. This way you can reduce the data transfer fees. The saving can be moderate to high for email campaigns where a lot of emails are never opened, since the linked content is only downloaded when an email is actually opened.
If your links reference files on your EC2 instances, remember that you will still be charged for data transfer out from EC2. Serving the files from S3 will provide a lower cost.
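As for a Java SDK sample of the bulk send itself, here is a minimal sketch with the v1 SDK (sender, template name and template data are made-up placeholders; as far as I can tell, SendBulkTemplatedEmailRequest exposes no per-Destination custom headers, so a per-recipient List-Unsubscribe header does not seem possible this way):

AmazonSimpleEmailService ses = AmazonSimpleEmailServiceClientBuilder.defaultClient();

SendBulkTemplatedEmailRequest request = new SendBulkTemplatedEmailRequest()
        .withSource("newsletter@example.com")   // hypothetical sender
        .withTemplate("MyCampaignTemplate")     // hypothetical template
        .withDefaultTemplateData("{\"name\":\"subscriber\"}")
        .withDestinations(
                new BulkEmailDestination()
                        .withDestination(new Destination().withToAddresses("user1@example.com"))
                        .withReplacementTemplateData("{\"name\":\"Alice\"}"),
                new BulkEmailDestination()
                        .withDestination(new Destination().withToAddresses("user2@example.com"))
                        .withReplacementTemplateData("{\"name\":\"Bob\"}"));

ses.sendBulkTemplatedEmail(request);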
We are developing an application that uses the Google Cloud Datastore; important detail: it's not a GAE application!
Everything works fine for normal usage. We designed a test that fetches over 30,000 records, but when we tried to run the test we got the following error:
java.net.SocketTimeoutException: Read timed out
We found that a Timeout Exception occurs after 30 seconds, so this explains the error.
I have two questions:
Is there a way to increase this timeout?
Is it possible to use pagination to query the Datastore? We found that when you have a GAE application you can use a cursor, but our application isn't one.
You can use cursors in the exact same way as a GAE app using Datastore. Take a look at this page for info.
In particular, the QueryResultBatch object has a .getEndCursor() method which you can then use when you reissue a Query with setStartCursor(...). Here's a code snippet from the page above:
Query q = ...
if (response.getBatch().getMoreResults() == QueryResultBatch.MoreResultsType.NOT_FINISHED) {
    ByteString endCursor = response.getBatch().getEndCursor();
    q.setStartCursor(endCursor);
    // reissue the query to get more results...
}
You should definitely use cursors to ensure that you get all your results. Besides the time limit, the RPC has other constraints, such as a maximum total RPC size, so you shouldn't depend on a single RPC answering your entire query. A full loop might look like the sketch below.
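A hedged sketch of a complete pagination loop built from the snippet above (datastore.runQuery and the protobuf builder names follow the old v1beta2-style client and may differ in your version):

Query.Builder q = Query.newBuilder(); // ... build your query here ...
RunQueryResponse response;
do {
    RunQueryRequest request = RunQueryRequest.newBuilder().setQuery(q).build();
    response = datastore.runQuery(request);
    // ... process the entities in response.getBatch() ...
    q.setStartCursor(response.getBatch().getEndCursor());
} while (response.getBatch().getMoreResults()
        == QueryResultBatch.MoreResultsType.NOT_FINISHED);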
I am using the Bloomberg API to grab data. Currently, I have 3 processes which get data in the typical way as per the developer's guide. Something like:
Service refDataService = session.getService("//blp/refdata");
Request request = refDataService.createRequest("ReferenceDataRequest");
request.append("securities", "IBM US Equity");
request.append("fields", "PX_LAST");
cid = session.sendRequest(request, null);
That works. Now I would like to expand the logic to be something more like an update queue. I would like each process to send its Request to an update-queue process, which would in turn be responsible for creating the session and service, and then sending the requests. However, I don't see any way to create the request without the Service. Also, since the request types (reference data, historical data, intraday ticks) are so varied and have such different properties, it is not trivial to create a container object which my update queue could read.
Any ideas on how to accomplish this? My ultimate goal is to have a process (which I'm calling the update queue) that takes in a list of requests, removes any duplicates, and goes out to Bloomberg for the data at 30-second intervals.
Thank you!
I have updated the jBloomberg library to include tick data. You can submit different types of query to a BloombergSession, which acts as a queue. So if you want to submit different types of request, you can write something like:
RequestBuilder<IntradayTickData> tickRequest =
        new IntradayTickRequestBuilder("SPX Index",
                DateTime.now().minusHours(2),
                DateTime.now());

RequestBuilder<IntradayBarData> barRequest =
        new IntradayBarRequestBuilder("SPX Index",
                DateTime.now().minusHours(2),
                DateTime.now())
                .period(5, TimeUnit.MINUTES);

RequestBuilder<ReferenceData> refRequest =
        new ReferenceRequestBuilder("SPX Index", "NAME");

Future<IntradayTickData> ticks = session.submit(tickRequest);
Future<IntradayBarData> bars = session.submit(barRequest);
Future<ReferenceData> name = session.submit(refRequest);
More examples available in the javadoc.
If you need to fetch the same information regularly, you can reuse a builder and use it in combination with a ScheduledThreadPoolExecutor, for example, as in the sketch below.
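A hedged sketch of that pattern (the executor wiring is an assumption, not part of jBloomberg; session is the BloombergSession from above):

ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
RequestBuilder<ReferenceData> refRequest =
        new ReferenceRequestBuilder("SPX Index", "PX_LAST");

// Reuse the same builder and poll Bloomberg every 30 seconds.
executor.scheduleAtFixedRate(() -> {
    Future<ReferenceData> result = session.submit(refRequest);
    // ... consume result.get() and handle its exceptions ...
}, 0, 30, TimeUnit.SECONDS);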
Note: the library is still in a beta state, so don't use it blindly in a black box that trades automatically!