Find the total amount of disk space used by an Elasticsearch cluster - Java

I am looking for a way to get the total amount of disk space used by an Elasticsearch cluster. I have found suggestions to use the following REST API endpoints to obtain this information, among other things:
GET /_cat/stats
GET /_nodes/stats
Can the same information be obtained using the Elasticsearch Java High-Level REST Client, or the Transport Client in an older version of Elasticsearch?

Yes, you are correct that disk stats can be obtained using the _nodes/stats API. The REST high-level client doesn't provide a dedicated API for node stats (you can see all the APIs it supports here), but you can use the low-level REST client, which is accessible from the high-level client. Below is a working example:
import java.io.IOException;

import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

private void getDiskStats(RestHighLevelClient restHighLevelClient) throws IOException {
    // The low-level client can hit any endpoint the high-level client doesn't wrap
    RestClient lowLevelClient = restHighLevelClient.getLowLevelClient();
    Request request = new Request("GET", "/_nodes/stats");
    Response response = lowLevelClient.performRequest(request);
    if (response.getStatusLine().getStatusCode() == 200) {
        System.out.println("resp: \n" + EntityUtils.toString(response.getEntity()));
    }
}
You can see I am printing the output of the above API call to the console; I verified that it contains the disk usage stats, which come in the following format:
"most_usage_estimate": {
"path": "/home/opster/runtime/elastic/elasticsearch-7.8.1/data/nodes/0",
"total_in_bytes": 124959473664,
"available_in_bytes": 6933352448,
"used_disk_percent": 94.45151916481107
},
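If you need the cluster-wide figure as a number rather than raw JSON, you can sum the per-node filesystem stats from the same response. A minimal sketch using Jackson, assuming the 7.x response layout where each node carries an fs.total object (note that "used" here is the whole disk's usage, not only what Elasticsearch wrote):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

private long totalUsedBytes(String nodesStatsJson) throws IOException {
    JsonNode root = new ObjectMapper().readTree(nodesStatsJson);
    long used = 0;
    // "nodes" maps node IDs to their stats; iterating a JsonNode object visits its values
    for (JsonNode node : root.get("nodes")) {
        JsonNode fs = node.get("fs").get("total");
        // used = total capacity minus what is still available
        used += fs.get("total_in_bytes").asLong() - fs.get("available_in_bytes").asLong();
    }
    return used;
}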

Related

Elasticsearch Node IP address and other details using java rest high level client

How can we get node details for Elasticsearch using the Java high-level REST client?
We can get node details in Kibana using GET /_cat/nodes.
I want to know how to get the same details using the high-level REST client; I need the IP address of each node and whether or not it is the master node.
It looks like the JHLRC doesn't have an API for _cat/nodes, but the same information can easily be obtained using the low-level client, which is available within the JHLRC, as shown in the code below:
private void getNodesAPI(RestHighLevelClient restHighLevelClient) throws IOException {
    RestClient lowLevelClient = restHighLevelClient.getLowLevelClient();
    // ?v adds column headers to the tabular _cat output
    Request request = new Request("GET", "/_cat/nodes?v");
    Response response = lowLevelClient.performRequest(request);
    if (response.getStatusLine().getStatusCode() == 200) {
        System.out.println("resp: \n" + EntityUtils.toString(response.getEntity()));
    }
}
I ran the above code locally and it works; below is the output on the console:
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1 48 99 21 2.59 1.93 1.92 * opster
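If you want to consume this output programmatically rather than read the table, the _cat APIs also accept a format=json parameter, so only the request line above changes:

// Same low-level call, but returns a JSON array of node objects
Request request = new Request("GET", "/_cat/nodes?format=json&h=ip,node.role,master,name");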

Streaming Chunked HTTP Entities with vert.x

I'm beginning an initial review of vert.x and comparing it to akka-http. One area where akka appears to shine is streaming of response bodies.
In akka-http it is possible to create a streaming entity that utilizes back-pressure which allows the client to decide when it is ready to consume data.
As an example, it is possible to create a response with an entity consisting of 1 billion instances of "42" values:
// Iterator is "lazy", therefore this function returns immediately
val bodyData: () => Iterator[ChunkStreamPart] = () =>
  Iterator
    .continually("42")
    .take(1000000000)
    .map(ChunkStreamPart.apply)

val route =
  get {
    val entity: HttpEntity =
      Chunked(ContentTypes.`text/plain(UTF-8)`, Source fromIterator bodyData)
    complete(HttpResponse(entity = entity))
  }
The above code will not "blow up" the server's memory and will return the response to the client before the billion values have been generated.
The "42" values will get created on-the-fly as the client tries to consume the response body.
Question: is this streaming capability also present in vert.x?
A cursory review of the HttpServerResponse class would indicate that it is not, since the write member function can only take a String or a vert.x Buffer. From my limited understanding, it seems that Buffer is not lazy and holds its data in memory, which means the one billion "42" example would crash a server with just a few concurrent requests.
Thank you in advance for your consideration and response.
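(For reference: vert.x's HttpServerResponse implements WriteStream&lt;Buffer&gt;, whose writeQueueFull() and drainHandler(...) methods are its back-pressure hooks. Below is a minimal, hedged sketch of the "billion 42s" example built on those hooks; it illustrates the pattern against the vert.x core API rather than claiming feature parity with akka-http's Source-based streaming.)

import io.vertx.core.Vertx;
import io.vertx.core.http.HttpServerResponse;

public class ChunkedServer {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        vertx.createHttpServer().requestHandler(req -> {
            HttpServerResponse resp = req.response();
            resp.setChunked(true); // no Content-Length; send chunks as they are produced
            writeChunks(resp, 1_000_000_000L);
        }).listen(8080);
    }

    // Writes "42" the requested number of times, pausing whenever the write
    // queue is full and resuming from the drain handler -- the back-pressure loop.
    private static void writeChunks(HttpServerResponse resp, long remaining) {
        long left = remaining;
        while (left > 0 && !resp.writeQueueFull()) {
            resp.write("42");
            left--;
        }
        if (left > 0) {
            final long rest = left;
            resp.drainHandler(v -> writeChunks(resp, rest));
        } else {
            resp.end();
        }
    }
}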

How can I check whether the location declared by a twitter's user exists?

I'm building a Twitter crawler and I have built a search engine on top of it using Lucene. Since many users submit locations that don't exist (e.g. "in my kitchen", "wonderland", "from LA to Paris"...), I think I should decide which users to index depending on their location, in order to make them reachable later through a location search. I retrieve users by sampling English tweets (using TwitterStream.sample("en")).
My first idea was to download from some websites a list of all cities in the world and check for a match. However, there's a problem with this approach: it's difficult to find a document which contains all cities in the world spelled in all possible languages. The user, indeed, could submit the name of his city (or country) either in English or in his own language.
You need to use a geocoding service, e.g. Google Maps or Yandex Maps.
I'm facing the fact that the first link tells the Google API to look for cities in the USA by default. So... if a user says he's in "Paris", the Google API responds with NO_RESPONSE.
I have read the first link with much attention and the second link with less attention, because the latter seems to be useful only for JavaScript applications (I'm doing everything in Java).
No, that is not correct. You can get the information via a plain HTTP request; refer to the HTTP request parameters.
Here is a small code snippet for Yandex Maps using the Apache HTTP client's fluent API (the constants were not defined in the original snippet, so plausible values are filled in below and marked as such):
import java.io.IOException;

import org.apache.commons.codec.Charsets;
import org.apache.http.HttpResponse;
import org.apache.http.HttpVersion;
import org.apache.http.StatusLine;
import org.apache.http.client.fluent.Form;
import org.apache.http.client.fluent.Request;

// Assumed values: the geocoder endpoint and client settings were elided in the original
private static final String SEARCH_URL = "https://geocode-maps.yandex.ru/1.x/";
private static final int CONNECTION_TIMEOUT_MILS = 5000;
private static final int ERROR_STATUS_MIN = 300;

private void request(String geocode) throws IOException {
    HttpResponse response = Request.Post(SEARCH_URL).version(HttpVersion.HTTP_1_1)
            .bodyForm(createForm(geocode).build(), Charsets.UTF_8).useExpectContinue()
            .connectTimeout(CONNECTION_TIMEOUT_MILS)
            .socketTimeout(CONNECTION_TIMEOUT_MILS)
            .execute().returnResponse();
    assertStatus(response, geocode);
    getCoordinatesFromResponse(response, geocode); // parses the JSON body (defined elsewhere)
}

private Form createForm(String geocode) {
    // Ask for JSON with a single best match for the free-text location
    return Form.form().add("format", "json").add("results", "1").add("geocode", geocode);
}

private void assertStatus(HttpResponse response, String requestString) {
    StatusLine statusLine = response.getStatusLine();
    if (statusLine.getStatusCode() >= ERROR_STATUS_MIN) {
        throw new RuntimeException(String.format(
                "Error sending request '%s' to the map service, server response: %s",
                requestString, response));
    }
}
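For the "does this location exist" check itself, here is a hedged sketch of what a getCoordinatesFromResponse-style parser could do: read the geocoder's found counter and treat zero matches as a non-existent location. The JSON path follows Yandex's documented 1.x response layout and should be verified against the current API:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.http.util.EntityUtils;

private boolean locationExists(HttpResponse response) throws IOException {
    JsonNode root = new ObjectMapper().readTree(EntityUtils.toString(response.getEntity()));
    // "found" counts how many geo objects matched the free-text query
    String found = root.path("response").path("GeoObjectCollection")
            .path("metaDataProperty").path("GeocoderResponseMetaData")
            .path("found").asText("0");
    return Integer.parseInt(found) > 0;
}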

Sharepoint API for Java

I am trying to perform the following steps through Java:
1) Connect to a sharepoint site with a given URL.
2) Get the list of files listed on that page
3) Filter the files using Modified date
4) Perform some more checks using Create Date and Modified Date
5) And finally save that file(s) into the Unix box.
As of now, I am able to access a particular file and read through it. However, I need to get hold of the file's metadata before reading it.
Is there an API or another way to do all of this in Java?
Thanks
With SharePoint 2013, the REST services will make your life easier. In previous versions, you could use the good old SOAP web services.
For instance, you could connect to a list with this query on the REST API:
http://server/site/_api/lists/getbytitle('listname')/items
This will give you all items from that list. With OData you can do additional stuff like filtering:
$filter=StartDate ge datetime'2015-05-21T00%3a00%3a00'
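For the Java side of the question, here is a hedged sketch of calling that endpoint with a plain HttpURLConnection and an OData filter on the Modified field. The server URL and list name are the placeholders from above, and authentication is deliberately left out (real SharePoint calls need NTLM or OAuth credentials):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SharePointRestDemo {
    public static void main(String[] args) throws Exception {
        // URL-encode the OData filter; swap '+' for '%20' since this is a query string
        String filter = URLEncoder.encode(
                "Modified ge datetime'2015-05-21T00:00:00'",
                StandardCharsets.UTF_8.name()).replace("+", "%20");
        URL url = new URL(
                "http://server/site/_api/lists/getbytitle('listname')/items?$filter=" + filter);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        // Ask for JSON instead of the default ATOM feed
        conn.setRequestProperty("Accept", "application/json; odata=verbose");
        // ... add Authorization / NTLM handling here ...
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            in.lines().forEach(System.out::println);
        }
    }
}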
Additionally, you can provide CAML queries to these services, allowing you to define detailed queries. Here's an example in JavaScript:
var re = new SP.RequestExecutor(webUrl);
re.executeAsync({
    url: "http://server/site/_api/web/lists/getbytitle('listname')/GetItems",
    method: 'POST',
    headers: {
        "Accept": "application/json; odata=verbose",
        "Content-Type": "application/json; odata=verbose"
    },
    body: {
        "query": {
            "__metadata": {
                "type": "SP.CamlQuery"
            },
            "ViewXml": "<View>" +
                "<Query>" + query + "</Query>" +
                "</View>"
        }
    },
    success: successHandler,
    error: errorHandler
});
If all of this doesn't provide enough flexibility, you might as well take these list items in memory and do additional work in your (server side) code.
I have developed a SharePoint REST API Java wrapper that covers the most common operations of the REST API.
https://github.com/kikovalle/PLGSharepointRestAPI-java

How does caching work in JAX-RS?

Suppose I have the following web service call using the @GET method:
@GET
@Path(value = "/user/{id}")
@Produces(MediaType.APPLICATION_JSON)
public Response getUserCache(@PathParam("id") String id, @Context HttpHeaders headers) throws Exception {
    HashMap<String, Object> map = new HashMap<String, Object>();
    map.put("id", id);
    SqlSession session = ConnectionFactory.getSqlSessionFactory().openSession();
    Cre8Mapper mapper = session.getMapper(Cre8Mapper.class);
    // slow it down 5 seconds
    Thread.sleep(5000);
    // get data from database
    User user = mapper.getUser(map);
    if (user == null) {
        return Response.ok().status(Status.NOT_FOUND).build();
    } else {
        CacheControl cc = new CacheControl();
        // save data for 60 seconds
        cc.setMaxAge(60);
        cc.setPrivate(true);
        return Response.ok(gson.toJson(user)).cacheControl(cc).status(Status.OK).build();
    }
}
To experiment, I slow the current thread down 5 seconds before fetching data from my database.
When I call my web service using Firefox Poster, within 60 seconds the 2nd, 3rd, and subsequent calls seemed much faster, until 60 seconds had passed.
However, when I paste the URI into a browser (Chrome), it seemed to slow down 5s every time. I'm really confused about how caching is actually done with this technique. Here are my questions:
1. Does Poster actually look at the max-age header and decide when to fetch the data?
2. On the client side (web, Android, ...), when accessing my web service, do I need to check the header and then perform caching manually, or does the browser already cache the data itself?
3. Is there a way to avoid fetching data from the database every time? I guess I would have to store my data in memory somehow, but could that potentially run out of memory?
4. In this JAX-RS caching tutorial, how does caching actually work? The first line always fetches the data from the database:
Book myBook = getBookFromDB(id);
So how is it considered cached? Unless the code doesn't execute in top-down order.
#Path("/book/{id}")
#GET
public Response getBook(#PathParam("id") long id, #Context Request request) {
Book myBook = getBookFromDB(id);
CacheControl cc = new CacheControl();
cc.setMaxAge(86400);
EntityTag etag = new EntityTag(Integer.toString(myBook.hashCode()));
ResponseBuilder builder = request.evaluatePreconditions(etag);
// cached resource did change -> serve updated content
if (builder == null){
builder = Response.ok(myBook);
builder.tag(etag);
}
builder.cacheControl(cc);
return builder.build();
}
From your questions I see that you're mixing client-side caching (HTTP) with server-side caching (database). I think the root cause of the confusion is the different behavior you observed in Firefox and Chrome, so first I will try to clear that up:

When I call my web service using Firefox Poster, within 60 seconds it seemed much faster on the 2nd, 3rd calls and so forth, until it passed 60 seconds. However, when I paste the URI into a browser (Chrome), it seemed to slow down 5s every time.
Example:
@Path("/book")
@GET
public Response getBook() throws InterruptedException {
    String book = " Sample Text Book";
    TimeUnit.SECONDS.sleep(5); // thanks @fge
    final CacheControl cacheControl = new CacheControl();
    cacheControl.setMaxAge((int) TimeUnit.MINUTES.toSeconds(1));
    return Response.ok(book).cacheControl(cacheControl).build();
}
I have a RESTful web service up and running; the URL for it is:
http://localhost:8780/caching-1.0/api/cache/book - GET
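(If you want to inspect the header without a browser, a small probe with Java 11's java.net.http.HttpClient against that local test endpoint works; given the resource above, expect a Cache-Control header containing max-age=60:)

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CacheHeaderProbe {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8780/caching-1.0/api/cache/book")).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // Print what the server actually asked clients to cache
        response.headers().firstValue("Cache-Control").ifPresent(System.out::println);
        System.out.println(response.body());
    }
}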
Firefox:
First request: the browser sent a request to the server and got the response back with the cache-control headers.
Second request within 60 seconds (using Enter): this time Firefox didn't go to the server for the response; instead it loaded the data from its cache.
Third request after 60 seconds (using Enter): this time Firefox made a request to the server and got a fresh response.
Fourth request using Refresh (F5 or Ctrl+F5): if I refresh the page (instead of hitting Enter) within 60 seconds of the previous request, Firefox doesn't load the data from its cache; instead it makes a request to the server with a special header.
Chrome:
Second request within 60 seconds (using Enter): this time Chrome sent the request to the server again instead of loading the data from its cache, and it added the header cache-control = "max-age=0" to the request.
Aggregating the results:
Because Chrome reacts differently to hitting Enter, you saw different behavior in Firefox and Chrome; it has nothing to do with JAX-RS or your HTTP response. To summarize: clients (Firefox/Chrome/Safari/Opera) will cache data for the time period specified in cache-control, and a client will not make a new request to the server unless that time expires or you force a refresh.
I hope this clarifies your questions 1, 2, and 3.
4. In this JAX-RS caching tutorial, how does caching actually work? The first line always fetches the data from the database: Book myBook = getBookFromDB(id); So how is it considered cached? Unless the code doesn't execute in top-down order.

The example you're referring to is not about minimizing database calls; it's about saving bandwidth over the network. The client already has the data and is revalidating with the server to check whether the data has been updated; only if it has do you send the actual entity in the response.
Yes.
When using a browser like Firefox or Chrome, you don't need to worry about the HTTP cache because modern browsers handle it; Firefox, for example, uses an in-memory cache. On Android it depends on how you interact with the origin server: a WebView is effectively a browser object, but if you use HttpClient you have to handle the HTTP cache on your own.
This is not about HTTP caching but about your server-side logic. The common answer is to use a database cache so that you don't have to hit the database on every HTTP request.
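As an illustration of that server-side layer, here is a minimal sketch using a ConcurrentHashMap with a timestamp check. User and getUserFromDB(...) stand in for the question's mapper call, and a real application would use a proper cache (Ehcache, Caffeine, ...) with eviction so memory stays bounded:

import java.util.concurrent.ConcurrentHashMap;

// Entries older than 60 seconds are reloaded from the database.
private static final long TTL_MILLIS = 60_000;
private static final ConcurrentHashMap<String, CachedUser> CACHE = new ConcurrentHashMap<>();

private static final class CachedUser {
    final User user;
    final long loadedAt;
    CachedUser(User user, long loadedAt) { this.user = user; this.loadedAt = loadedAt; }
}

private User getUserCached(String id) {
    CachedUser entry = CACHE.get(id);
    if (entry == null || System.currentTimeMillis() - entry.loadedAt > TTL_MILLIS) {
        entry = new CachedUser(getUserFromDB(id), System.currentTimeMillis());
        CACHE.put(id, entry); // note: unbounded map; use a real cache with eviction in production
    }
    return entry.user;
}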
JAX-RS itself just gives you ways to work with the HTTP cache headers: you use CacheControl and/or EntityTag for time-based caching and conditional requests. For example, when you use an EntityTag, the builder handles the 304 response status code for you, so you never need to worry about it.
