Related
We have an Apache Flink POC application that works fine locally, but after we deploy it into Kinesis Data Analytics (KDA) it does not emit any records into the sink.
Used technologies
Local
Source: Kafka 2.7
1 broker
1 topic with 1 partition and a replication factor of 1
Processing: Flink 1.12.1
Sink: Managed Elasticsearch Service 7.9.1 (the same instance as in the AWS case)
AWS
Source: Amazon MSK Kafka 2.8
3 brokers (but we connect to only one)
1 topic with 1 partition and a replication factor of 3
Processing: Amazon KDA Flink 1.11.1
Parallelism: 2
Parallelism per KPU: 2
Sink: Managed ElasticSearch Service 7.9.1
Application logic
The FlinkKafkaConsumer reads messages in JSON format from the topic
The JSON messages are mapped to domain objects called Telemetry
private static DataStream<Telemetry> SetupKafkaSource(StreamExecutionEnvironment environment) {
    Properties kafkaProperties = new Properties();
    kafkaProperties.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "BROKER1_ADDRESS.amazonaws.com:9092");
    kafkaProperties.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "flink_consumer");

    FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>("THE_TOPIC", new SimpleStringSchema(), kafkaProperties);
    consumer.setStartFromEarliest(); // Just for repeatable testing

    return environment
            .addSource(consumer)
            .map(new MapJsonToTelemetry());
}
The Telemetry's timestamp is used as the event timestamp.
3.1. With forMonotonousTimestamps
Telemetry’s StateIso is used for keyBy.
4.1. The two-letter ISO code of the US state
A 5-second tumbling window strategy is applied
private static SingleOutputStreamOperator<StateAggregatedTelemetry> SetupProcessing(DataStream<Telemetry> telemetries) {
    WatermarkStrategy<Telemetry> wmStrategy =
            WatermarkStrategy
                    .<Telemetry>forMonotonousTimestamps()
                    .withTimestampAssigner((event, timestamp) -> event.TimeStamp);

    return telemetries
            .assignTimestampsAndWatermarks(wmStrategy)
            .keyBy(t -> t.StateIso)
            .window(TumblingEventTimeWindows.of(Time.seconds(5)))
            .process(new WindowCountFunction());
}
A custom ProcessWindowFunction is called to perform some basic aggregation.
6.1. We calculate a single StateAggregatedTelemetry per state and window
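For context, a minimal sketch of what our WindowCountFunction roughly does; the StateAggregatedTelemetry fields used here (StateIso, Flawless, Faulty) are the ones referenced by the sink below, and the actual health-check logic is omitted:

import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public static class WindowCountFunction
        extends ProcessWindowFunction<Telemetry, StateAggregatedTelemetry, String, TimeWindow> {

    @Override
    public void process(String stateIso,
                        Context context,
                        Iterable<Telemetry> telemetries,
                        Collector<StateAggregatedTelemetry> out) {
        StateAggregatedTelemetry aggregate = new StateAggregatedTelemetry();
        aggregate.StateIso = stateIso;
        for (Telemetry telemetry : telemetries) {
            if (isHealthy(telemetry)) {   // the real classification logic is omitted here
                aggregate.Flawless++;
            } else {
                aggregate.Faulty++;
            }
        }
        out.collect(aggregate);           // exactly one result per state and window
    }

    private boolean isHealthy(Telemetry telemetry) {
        return true; // placeholder
    }
}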
Elasticsearch is configured as the sink.
7.1. StateAggregatedTelemetry data is mapped into a HashMap and pushed to the sink.
7.2. All setBulkFlushXYZ methods are set to low values
private static void SetupElasticSearchSink(SingleOutputStreamOperator<StateAggregatedTelemetry> telemetries) {
    List<HttpHost> httpHosts = new ArrayList<>();
    httpHosts.add(HttpHost.create("https://ELKCLUSTER_ADDRESS.amazonaws.com:443"));

    ElasticsearchSink.Builder<StateAggregatedTelemetry> esSinkBuilder = new ElasticsearchSink.Builder<>(
            httpHosts,
            (ElasticsearchSinkFunction<StateAggregatedTelemetry>) (element, ctx, indexer) -> {
                Map<String, Object> record = new HashMap<>();

                record.put("stateIso", element.StateIso);
                record.put("healthy", element.Flawless);
                record.put("unhealthy", element.Faulty);
                ...

                LOG.info("Telemetry has been added to the buffer");
                indexer.add(Requests.indexRequest()
                        .index("INDEXPREFIX-" + from.format(DateTimeFormatter.ofPattern("yyyy-MM-dd")))
                        .source(record, XContentType.JSON));
            }
    );

    // Using low values to make sure that the flush will happen
    esSinkBuilder.setBulkFlushMaxActions(25);
    esSinkBuilder.setBulkFlushInterval(1000);
    esSinkBuilder.setBulkFlushMaxSizeMb(1);
    esSinkBuilder.setBulkFlushBackoff(true);

    esSinkBuilder.setRestClientFactory(restClientBuilder -> {});

    LOG.info("Sink has been attached to the DataStream");
    telemetries.addSink(esSinkBuilder.build());
}
Excluded things
We managed to put Kafka, KDA, and Elasticsearch into the same VPC and the same subnets to avoid the need to sign each request
From the logs we could see that Flink can reach the ES cluster.
Request
{
"locationInformation": "org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge.verifyClientConnection(Elasticsearch7ApiCallBridge.java:135)",
"logger": "org.apache.flink.streaming.connectors.elasticsearch7.Elasticsearch7ApiCallBridge",
"message": "Pinging Elasticsearch cluster via hosts [https://...es.amazonaws.com:443] ...",
"threadName": "Window(TumblingEventTimeWindows(5000), EventTimeTrigger, WindowCountFunction) -> (Sink: Print to Std. Out, Sink: Unnamed, Sink: Print to Std. Out) (2/2)",
"applicationARN": "arn:aws:kinesisanalytics:...",
"applicationVersionId": "39",
"messageSchemaVersion": "1",
"messageType": "INFO"
}
Response
{
"locationInformation": "org.elasticsearch.client.RequestLogger.logResponse(RequestLogger.java:59)",
"logger": "org.elasticsearch.client.RestClient",
"message": "request [HEAD https://...es.amazonaws.com:443/] returned [HTTP/1.1 200 OK]",
"threadName": "Window(TumblingEventTimeWindows(5000), EventTimeTrigger, WindowCountFunction) -> (Sink: Print to Std. Out, Sink: Unnamed, Sink: Print to Std. Out) (2/2)",
"applicationARN": "arn:aws:kinesisanalytics:...",
"applicationVersionId": "39",
"messageSchemaVersion": "1",
"messageType": "DEBUG"
}
We could also verify that the messages had been read from the Kafka topic and sent for processing by looking at the Flink Dashboard
What we have tried without luck
We implemented a RichParallelSourceFunction which emits 1_000_000 messages and then exits
This worked well in the local environment
The job finished in the AWS environment, but there was no data on the sink side
We implemented another RichParallelSourceFunction which emits 100 messages each second (a sketch of this test source is included below, after this list)
Basically we had two loops, a while(true) outer one and a for inner one
After the inner loop we called Thread.sleep(1000)
This worked perfectly fine in the local environment
But in AWS we could see that the checkpoint size grew continuously and no message appeared in ELK
We have tried to run the KDA application with different parallelism settings
But there was no difference
We also tried to use different watermarking strategies (forBoundedOutOfOrderness, withIdleness, noWatermarks)
But there was no difference
We added logs to the ProcessWindowFunction and to the ElasticsearchSinkFunction
Whenever we ran the application from IDEA, these logs showed up on the console
Whenever we ran the application in KDA, there were no such logs in CloudWatch
The logs that were added to the main method do appear in the CloudWatch logs
We suppose that we don't see data on the sink side because the window processing logic is never triggered; that's why we don't see the processing logs in CloudWatch.
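For reference, here is a minimal sketch of the throttled test source mentioned above; the class name and the Telemetry fields it populates are illustrative assumptions, not the exact code we ran:

import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

public static class ThrottledTestSource extends RichParallelSourceFunction<Telemetry> {
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Telemetry> ctx) throws Exception {
        while (running) {                       // outer while(true)-style loop
            for (int i = 0; i < 100; i++) {     // inner loop: 100 messages per burst
                Telemetry telemetry = new Telemetry();
                telemetry.StateIso = "TX";      // assumption: any valid two-letter state code
                telemetry.TimeStamp = System.currentTimeMillis();
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(telemetry);
                }
            }
            Thread.sleep(1000);                 // throttle to ~100 messages per second
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}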
Any help would be more than welcome!
Update #1
We tried to downgrade the Flink version from 1.12.1 to 1.11.1
There was no change
We tried a processing time window instead of an event time one
It did not even work in the local environment
Update #2
The average message size is around 4kb. Here is an excerpt of a sample message:
{
"affiliateCode": "...",
"appVersion": "1.1.14229",
"clientId": "guid",
"clientIpAddr": "...",
"clientOriginated": true,
"connectionType": "Cable/DSL",
"countryCode": "US",
"design": "...",
"device": "...",
...
"deviceSerialNumber": "...",
"dma": "UNKNOWN",
"eventSource": "...",
"firstRunTimestamp": 1609091112818,
"friendlyDeviceName": "Comcast",
"fullDevice": "Comcast ...",
"geoInfo": {
"continent": {
"code": "NA",
"geoname_id": 120
},
"country": {
"geoname_id": 123,
"iso_code": "US"
},
"location": {
"accuracy_radius": 100,
"latitude": 37.751,
"longitude": -97.822,
"time_zone": "America/Chicago"
},
"registered_country": {
"geoname_id": 123,
"iso_code": "US"
}
},
"height": 720,
"httpUserAgent": "Mozilla/...",
"isLoggedIn": true,
"launchCount": 19,
"model": "...",
"os": "Comcast...",
"osVersion": "...",
...
"platformTenantCode": "...",
"productCode": "...",
"requestOrigin": "https://....com",
"serverTimeUtc": 1617809474787,
"serviceCode": "...",
"serviceOriginated": false,
"sessionId": "guid",
"sessionSequence": 2,
"subtype": "...",
"tEventId": "...",
...
"tRegion": "us-east-1",
"timeZoneOffset": 5,
"timestamp": 1617809473305,
"traits": {
"isp": "Comcast Cable",
"organization": "..."
},
"type": "...",
"userId": "guid",
"version": "v1",
"width": 1280,
"xb3traceId": "guid"
}
We are using ObjectMapper to parse only some of the fields of the JSON. Here is what the Telemetry class looks like:
public class Telemetry {
    public String AppVersion;
    public String CountryCode;
    public String ClientId;
    public String DeviceSerialNumber;
    public String EventSource;
    public String SessionId;
    public TelemetrySubTypes SubType; // enum
    public String TRegion;
    public Long TimeStamp;
    public TelemetryTypes Type; // enum
    public String StateIso;
    ...
}
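For completeness, a minimal sketch of the MapJsonToTelemetry map function; the real implementation maps more fields, and the choice of the Jackson tree model here is an assumption:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.MapFunction;

public class MapJsonToTelemetry implements MapFunction<String, Telemetry> {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public Telemetry map(String json) throws Exception {
        JsonNode root = MAPPER.readTree(json);
        Telemetry telemetry = new Telemetry();
        telemetry.AppVersion = root.path("appVersion").asText();
        telemetry.CountryCode = root.path("countryCode").asText();
        telemetry.ClientId = root.path("clientId").asText();
        telemetry.SessionId = root.path("sessionId").asText();
        telemetry.TimeStamp = root.path("timestamp").asLong();
        // StateIso, Type, SubType, etc. are derived the same way in the real code
        return telemetry;
    }
}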
Update #3
Source
Subtasks tab:
ID | Bytes received | Records received | Bytes sent | Records sent | Status
0  | 0 B            | 0                | 0 B        | 0            | RUNNING
1  | 0 B            | 0                | 2.83 MB    | 15,000       | RUNNING
Watermarks tab
No Data
Window
Subtasks tab:
ID | Bytes received | Records received | Bytes sent | Records sent | Status
0  | 1.80 MB        | 9,501            | 0 B        | 0            | RUNNING
1  | 1.04 MB        | 5,499            | 0 B        | 0            | RUNNING
Watermarks
SubTask | Watermark
1       | No Watermark
2       | No Watermark
According to the comments and the additional information you have provided, it seems that the issue is the fact that two Flink consumers can't consume from the same partition. So, in your case only one parallel instance of the source operator will consume from the Kafka partition and the other one will be idle.
In general, a Flink operator will select MIN([all_input_parallel_watermarks]) as its watermark. In your case one Kafka consumer will produce normal watermarks and the other will never produce anything (Flink assumes Long.MIN_VALUE in that case), so Flink will select the lower one, which is Long.MIN_VALUE. Therefore the window will never be fired, because while the data is flowing, one of the watermarks is never generated. Good practice is to use the same parallelism as the number of Kafka partitions when working with Kafka.
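If that is the case, the usual workarounds are to pin the source parallelism to the number of partitions or to declare the source idle after a timeout (withIdleness is available since Flink 1.11). A sketch, reusing the names from the question; the 30-second idleness timeout is an arbitrary example value:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// Option 1: give the Kafka source the same parallelism as the topic (1 partition here)
//     environment.addSource(consumer).setParallelism(1)

// Option 2: mark subtasks that receive no data as idle so they don't hold the watermark back
WatermarkStrategy<Telemetry> wmStrategy =
        WatermarkStrategy
                .<Telemetry>forMonotonousTimestamps()
                .withIdleness(Duration.ofSeconds(30))
                .withTimestampAssigner((event, timestamp) -> event.TimeStamp);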
After having a support session with the AWS folks, it turned out that we had missed setting the time characteristic on the streaming environment.
In 1.11.1 the default value of TimeCharacteristic was IngestionTime.
Since 1.12.1 (see related release notes) the default value is EventTime:
In Flink 1.12 the default stream time characteristic has been changed to EventTime, thus you don’t need to call this method for enabling event-time support anymore.
So, after we set EventTime explicitly, it started to generate watermarks like a charm:
streamingEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
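For anyone hitting the same problem, this is roughly how the fix fits into the job's entry point (a sketch; the job name and the exact wiring are assumptions, the setup methods are the ones shown in the question):

import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment streamingEnv = StreamExecutionEnvironment.getExecutionEnvironment();

    // Required on Flink 1.11.x (the KDA runtime): the default there is IngestionTime, not EventTime
    streamingEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

    DataStream<Telemetry> telemetries = SetupKafkaSource(streamingEnv);
    SingleOutputStreamOperator<StateAggregatedTelemetry> aggregated = SetupProcessing(telemetries);
    SetupElasticSearchSink(aggregated);

    streamingEnv.execute("telemetry-aggregation");
}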
I'm using Elasticsearch in my project, but when deploying the project on a server it throws the exception below. I read other similar questions but didn't find a solution:
I changed the port to 9300 and that didn't solve it either.
NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{EEv7PPi1SYqxodHCtCrfEw}{192.168.0.253}{192.168.0.253:9200}]]
at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:344)
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:242)
at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:59)
This is my configuration for elasticsearch in my code:
public static void postConstruct() {
    try {
        Settings settings = Settings.builder()
                .put("cluster.name", "my-application").build();
        client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(Bundle.application.getString("ELASTIC_ADDRESS")), Integer.parseInt(Bundle.application.getString("9200"))));
        try {
            client.admin().indices().prepareCreate("tempdata").get();
        } catch (Exception e) {
            e.printStackTrace();
        }
    } catch (UnknownHostException e) {
        e.printStackTrace();
    }
}
The version of Elasticsearch both in my project and on the server is the same, and this is what I get when I curl 'http://x.x.x.x:9200/?pretty':
{
"name" : "node-1",
"cluster_name" : "my-application",
"cluster_uuid" : "_na_",
"version" : {
"number" : "5.2.2",
"build_hash" : "f9d9b74",
"build_date" : "2017-02-24T17:26:45.835Z",
"build_snapshot" : false,
"lucene_version" : "6.4.1"
},
"tagline" : "You Know, for Search"
}
When I change the port to 9300, after some seconds the exception I see is this:
MasterNotDiscoveredException[null]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:211)
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:307)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:237)
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1157)
This is the Elasticsearch log, and I have no idea what host1 and host2 are:
[2018-07-16T15:40:59,476][DEBUG][o.e.a.a.i.g.TransportGetIndexAction] [gCJIhnQ] no known master node, scheduling a retry
[2018-07-16T15:41:29,478][DEBUG][o.e.a.a.i.g.TransportGetIndexAction] [gCJIhnQ] timed out while retrying [indices:admin/get] after failure (timeout [30s])
[2018-07-16T15:41:29,481][WARN ][r.suppressed ] path: /bad-request, params: {index=bad-request}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:211) [elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:307) [elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:237) [elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1157) [elasticsearch-5.2.2.jar:5.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:527) [elasticsearch-5.2.2.jar:5.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
Because the number of comments has increased, here are some tips that might be helpful. I assume that you are using a standalone Elasticsearch instance, started using ES_HOME/bin/elasticsearch, as the master node on the server machine.
Make sure the Elasticsearch instance on the server is configured as a master node. Refer to https://www.elastic.co/guide/en/elasticsearch/reference/6.3/modules-node.html for more details about nodes in Elasticsearch.
Make sure the Elasticsearch instance on the server is bound to a non-loopback address. For details, refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/network.host.html
Check that the transport client version is compatible with the server version.
Increase the mmap counts as they said here. In Linux, run the command: sysctl -w vm.max_map_count=262144
Check the transport port number on the server and that it is reachable from outside. It defaults to 9300-9400.
Check the Elasticsearch service logs on the server to make sure it is configured as you expect.
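To rule out a client-side misconfiguration, a minimal transport client setup consistent with the tips above could look like this (the cluster name matches the question; the hard-coded host and port are placeholders for your configured values):

import java.net.InetAddress;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public static TransportClient buildClient() throws Exception {
    Settings settings = Settings.builder()
            .put("cluster.name", "my-application")    // must match cluster_name from /?pretty
            .build();
    return new PreBuiltTransportClient(settings)
            .addTransportAddress(new InetSocketTransportAddress(
                    InetAddress.getByName("x.x.x.x"), // the server's non-loopback address
                    9300));                           // transport port, not the 9200 HTTP port
}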
Problem: MongoDB writes fail with the error -
Timed out after 30000 ms while waiting for a server that matches
PrimaryServerSelector. Client view of cluster state is
{type=REPLICA_SET, servers=[{address=intdb01:27017,
type=REPLICA_SET_SECONDARY, roundTripTime=0.7 ms, state=CONNECTED}]
This happens when the primary has switched from intdb01 to intdb02. It looks like the client driver is still expecting intdb01 to be the primary node.
Our Setup
We have 3 mongoDb nodes in a replicaSet called rs0.
When we connect to it using Java, we give all 3 servers in the connection string as follows:
mongodb://intdb01:27017,intdb02:27017,intdb03:27017/?replicaSet=rs0
db.version --> 3.0.4
Java Driver Version: mongodb-driver-3.0.4.jar, mongodb-driver-core-3.0.4.jar
Client Connection code:
if (mongoClient == null) {
    MongoClientURI mcu = new MongoClientURI(mongoConnect);
    mongoClient = new MongoClient(mcu);
}
mongoConnect contains the connection string shown above.
Mongo Replicaset Status Info
> rs.status()
{
"set" : "rs0",
"date" : ISODate("2016-07-19T21:14:03.001Z"),
"myState" : 1,
"members" : [{
"_id" : 0,
"name" : "intdb01",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 439786,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"lastHeartbeat" : ISODate("2016-07-19T21:14:01.877Z"),
"lastHeartbeatRecv" : ISODate("2016-07-19T21:14:01.611Z"),
"pingMs" : 0,
"configVersion" : 4
},
{
"_id" : 1,
"name" : "intdb02:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 2948844,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"electionTime" : Timestamp(1468523057, 1),
"electionDate" : ISODate("2016-07-14T19:04:17Z"),
"configVersion" : 4,
"self" : true
},
{
"_id" : 2,
"name" : "intdb03:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 439779,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"lastHeartbeat" : ISODate("2016-07-19T21:14:01.294Z"),
"lastHeartbeatRecv" : ISODate("2016-07-19T21:14:01.294Z"),
"pingMs" : 0,
"configVersion" : 4
}
],
"ok" : 1
}
I have checked and rechecked the code/config with the documentation but cannot find what is amiss.
We can simulate this by stepping down the primary.
The Secondary takes over but the app writes start failing. :-(
Any pointers/tips much appreciated!
Adding Stack trace-
Caused by: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches PrimaryServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=intdb01:27017, type=REPLICA_SET_SECONDARY, roundTripTime=0.7 ms, state=CONNECTED}]
at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:370)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:175)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:141)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:72)
at com.mongodb.Mongo.execute(Mongo.java:747)
at com.mongodb.Mongo$2.execute(Mongo.java:730)
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:482)
at com.mongodb.MongoCollectionImpl.update(MongoCollectionImpl.java:474)
at com.mongodb.MongoCollectionImpl.updateOne(MongoCollectionImpl.java:325)
at com.grid.core.persistence.mongodb.dao.GridEventDao.save(GridEventDao.java:69)
... 6 more
2016-07-19 17:05:01,882 [pool-2-thread-2] INFO org.mongodb.driver.cluster - No server chosen by PrimaryServerSelector from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=SINGLE, all=[ServerDescription{address=intdb01:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 0, 4]}, minWireVersion=0, maxWireVersion=3, electionId=null, maxDocumentSize=16777216, roundTripTimeNanos=724681, setName='rs0', canonicalAddress=intdb01:27017, hosts=[intdb01:27017, intdb02:27017], passives=[intdb03:27017], arbiters=[], primary='intdb02:27017', tagSet=TagSet{[]}}]}. Waiting for 30000 ms before timing out
I am trying to figure out whether this is a coding error or a configuration issue.
The last INFO message in the stack trace says connectionMode is SINGLE. I have a suspicion that this is causing the issue, but I cannot find any information about it on the developer documentation site.
All of the server addresses should appear in the error message. For example, try this with 3 arbitrary hostnames that won't resolve:
com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches PrimaryServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=xxx:27117, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketException: xxx}, caused by {java.net.UnknownHostException: xxx}}, {address=xxx:27118, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketException: xxx}, caused by {java.net.UnknownHostException: xxx}}, {address=xxx:27119, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketException: xxx}, caused by {java.net.UnknownHostException: xxx}}]
at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:370)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:175)
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:106)
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:58)
at com.mongodb.Mongo.execute(Mongo.java:747)
at com.mongodb.Mongo$2.execute(Mongo.java:730)
at com.mongodb.DBCollection.executeWriteOperation(DBCollection.java:327)
at com.mongodb.DBCollection.insert(DBCollection.java:323)
at com.mongodb.DBCollection.insert(DBCollection.java:314)
at com.mongodb.DBCollection.insert(DBCollection.java:284)
at com.mongodb.DBCollection.insert(DBCollection.java:250)
at com.mongodb.DBCollection.insert(DBCollection.java:187)
at InsertTest2.main(InsertTest2.java:26)
Notice how all 3 server addresses appear. I would suggest that you create a simple test program like this and verify, then work backwards from there. Also verify that the hostnames resolve correctly; you might also try IP addresses.
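Something along these lines would do as a starting point (a sketch for the 3.0.x driver; the database and collection names are arbitrary):

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import org.bson.Document;

public class InsertTest2 {
    public static void main(String[] args) {
        MongoClientURI uri = new MongoClientURI(
                "mongodb://intdb01:27017,intdb02:27017,intdb03:27017/?replicaSet=rs0");
        MongoClient mongoClient = new MongoClient(uri);
        try {
            // A trivial write: with the full seed list the driver should keep finding the
            // new primary after a step-down (once the election has completed).
            mongoClient.getDatabase("test")
                       .getCollection("connectivityTest")
                       .insertOne(new Document("ts", System.currentTimeMillis()));
            System.out.println("Insert succeeded");
        } finally {
            mongoClient.close();
        }
    }
}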
I'm trying to set up OrientDB on 3 servers. For the purpose of an example, I'm trying to set up a class "client", with 3 clusters "client_1", "client_2", and "client_3". My servers are called node1, node2, and node3. I want the clusters arranged such that I have 2 copies of each cluster, so if 1 node goes down I still have access to all the data, for example:
node1 is master for client_1 and has a copy of client_2.
node2 is master for client_2 and has a copy of client_3.
node3 is master for client_3 and has a copy of client_1.
I've tried setting this up with the following steps:
1. Download OrientDB 2.1.1 Community and extract onto the 3 servers.
2. Delete the GratefulDeadConcerts database from the databases directory on
each server.
3. Edit default-distributed-db-config.json on node1 as follows:
{
  "autoDeploy": true,
  "hotAlignment": false,
  "executionMode": "undefined",
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "client_1": {
      "servers": [ "node1", "node2" ]
    },
    "client_2": {
      "servers": [ "node2", "node3" ]
    },
    "client_3": {
      "servers": [ "node3", "node1" ]
    },
    "*": {
      "servers": [ "<NEW_NODE>" ]
    }
  }
}
Start node1 with dserver.sh.
Create a database using console on node1:
connect remote:localhost root password
create database remote:localhost/testdb root password plocal graph
Create a class and rename the default cluster:
create class client extends v
alter cluster client name client_1
Start node2 with dserver.sh, wait for the database to auto-deploy, then start node3 and wait for the deploy
At this point I have a database on 3 nodes, with a class called "client"
with only one cluster "client_1".
On node2, add the client_2 cluster:
alter class client addcluster client_2
Similarly, on node3:
alter class client addcluster client_3
If I reconnect all console sessions and execute "list clusters" I now see all 3 clusters of the client class on each node. I also see the .cpm and .pcl files for each of the 3 clusters on each node. However, it appears that my intention in default-distributed-db-config.json is being taken into account: if I wait a couple of minutes and then insert a record from each node, I see that the timestamps and file sizes only change on the files relating to the clusters that are supposed to be present on each node (it would be nice and less confusing if the files didn't exist on the wrong nodes, but it's not the end of the world).
So... now it appears that I have the database set up the way I intended, but the point of doing this is so that we can survive a server going down, so I shut down node3 with ctrl-c. I can still see each of the records (I inserted 3, one per cluster) from both node1 and node2 - so far so good.
If I take a look at the contents of distributed-db.json on node1 or node2, I now see that my "client" class clusters have been reconfigured - there's no node3 in the config any longer:
"client_3": { "servers": [ "node1" ], "#version": 0, "#type": "d" },
"client_2": { "servers": [ "node2" ], "#version": 0, "#type": "d" },
"client_1": { "servers": [ "node1", "node2" ], "#version": 0,
"#type": "d" }
Now I restart node3. The config does not get updated again:
"client_3": { "servers": [ "node1" ], "#version": 0,
"#type": "d" },
"client_2": { "servers": [ "node2" ], "#version": 0, "#type": "d" },
"client_1": { "servers": [ "node1", "node2" ], "#version": 0,
"#type": "d" }
Is there something wrong in the way I've created/configured the database or is this a bug?
I think the issue here is that "hotAlignment" needs to be set to "true" in the file "default-distributed-db-config.json". Per the OrientDB 2.2.x sharding doc, "If hotAlignment=false is set, when a node re-joins the cluster (after a failure or simply unreachability) the full copy of database from a node could have no all information about the shards." Note, though, this bullet from the changes between 2.1.x to 2.2.x: "Removed hotAlignment setting: servers, once they join the cluster, remain always in the configuration until they are manually removed."
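In other words, for 2.1.x the top of your default-distributed-db-config.json would change like this (a sketch; only hotAlignment differs from what you posted, the rest of the file stays as-is):

{
  "autoDeploy": true,
  "hotAlignment": true,
  "executionMode": "undefined",
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "clusters": {
    ...
  }
}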
I am using a 3-node cluster setup with Elasticsearch 1.3.1. I have 17 indices, each one holding a minimum of 0.5 M documents (1 GiB) and a maximum of 1.4 M (3 GiB). Now I would like to try the snapshot and restore process in my cluster. I used the following REST calls to do that...
To create a repository:
curl -XPUT 'http://host.name:9200/_snapshot/es_snapshot_repo' -d '{
  "type": "fs",
  "settings": {
    "location": "/data/es_snapshot_bkup_repo/es_snapshot_repo"
  }
}'
Verified the repository:
curl -XGET 'http://host.name:9200/_snapshot/es_snapshot_repo?pretty'
The response is:
{
"es_snapshot_repo" : {
"type" : "fs",
"settings" : {
"location" : "/data/es_snapshot_bkup_repo/es_snapshot_repo"
}
}
}
Created the snapshot using:
curl -XPUT "http://host.name:9200/_snapshot/es_snapshot_repo/snap_001" -d '{
"indices": "index_01",
"ignore_unavailable": "true",
"include_global_state": false,
"wait_for_completion": true
}'
The response is:
{
"accepted": true
}
Then I try to restore the snapshot with the following request:
curl -XPOST "http://host.name:9200/_snapshot/es_snapshot_repo/snap_001/_restore" -d '{
  "indices": "index_01",
  "ignore_unavailable": "true",
  "include_global_state": false,
  "rename_pattern": "index_01",
  "rename_replacement": "index_01_bk",
  "include_aliases": false
}'
ISSUE:
As I mentioned, I have 3 nodes. The index which I am trying to snapshot & restore has 6 shards and 2 replicas.
Most of the shards and their replicas are restored properly, but sometimes 1 and sometimes 2 primary shards (and their replicas) are not restored; those primary shards stay in the INITIALIZING state. I allowed the cluster to relocate them for more than an hour, but the shards did not relocate to the correct node... I got the following exception on my node.
The restore process tries to place the shard on the other 2 nodes... but that is not possible...
[2014-08-27 07:10:35,492][DEBUG][cluster.service ] [node_01] processing [
shard-failed (
[snap_001][4],
node[r4UoA7vJREmQfh6lz634NA],
[P],
restoring[es_snapshot_repo:snap_001],
s[INITIALIZING]),
reason [Failed to start shard,
message [IndexShardGatewayRecoveryException[[snap_001][4] failed recovery];
nested: IndexShardRestoreFailedException[[snap_001][4] restore failed];
nested: IndexShardRestoreFailedException[[snap_001][4] failed to restore snapshot [snap_001]];
nested: IndexShardRestoreFailedException[[snap_001][4] failed to read shard snapshot file];
nested: FileNotFoundException[/data/es_snapshot_bkup_repo/es_snapshot_repo/indices/index_01/4/snapshot-snap_001 (No such file or directory)]; ]]]:
done applying updated cluster_state (version: 56391)
Could anyone help me overcome this issue, and please correct me if I have made any mistake in this process...
FYI: I am sending the curl requests to the master node.
We need to provide a shared file system location which is accessible by all the Elasticsearch nodes with read & write permissions.
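For example (a sketch; the NFS server name and export path are placeholders), every node would mount the same export at the same path before the repository is registered:

# On each of the three Elasticsearch nodes:
mkdir -p /data/es_snapshot_bkup_repo/es_snapshot_repo
mount -t nfs nfs-server:/exports/es_snapshots /data/es_snapshot_bkup_repo

# Then register the repository once, exactly as in the question:
curl -XPUT 'http://host.name:9200/_snapshot/es_snapshot_repo' -d '{
  "type": "fs",
  "settings": {
    "location": "/data/es_snapshot_bkup_repo/es_snapshot_repo"
  }
}'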