I am trying to implement AT_PLUS scan consistency within my Couchbase Java application. I have updated my queries to include consistentWith(mutationState):
RawJsonDocument courseJsonDocument = toRawJsonDocument(course, true);
RawJsonDocument insertedJsonDocument = bucket.insert(courseJsonDocument);
MutationState mutationState = MutationState.from(insertedJsonDocument);
.....
N1qlQuery.simple(GET_COURSE_BY_ID_QUERY, N1qlParams.build().consistentWith(mutationState));
I'm trying to achieve read-your-own-write semantics, but when I run the query immediately after inserting the document, nothing is found, so I must be doing something wrong. I think what I am missing is actually enabling enhanced durability on the client configuration.
I see examples of how to do it in .NET, but I can't figure out how to 'enable enhanced durability' in Java. Here is my cluster configuration:
Cluster cluster = CouchbaseCluster.create(
        DefaultCouchbaseEnvironment.builder()
                .queryServiceConfig(QueryServiceConfig.create(1, 100))
                .mutationTokensEnabled(true)
                .observeIntervalDelay(Delay.fixed(100, TimeUnit.MICROSECONDS))
                .connectTimeout(timeout)
                .build(),
        clusterHost);
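For completeness, here is a condensed sketch of the whole read-your-own-write flow I'm attempting, assuming the bucket was opened from the environment above (GET_COURSE_BY_ID_QUERY stands in for my actual N1QL statement):
// Insert the document; mutationTokensEnabled(true) on the environment is
// required for the returned document to carry a mutation token.
RawJsonDocument inserted = bucket.insert(toRawJsonDocument(course, true));
// Capture the mutation state of that write...
MutationState mutationState = MutationState.from(inserted);
// ...and query at AT_PLUS consistency: the index must have caught up to
// at least these mutations before the query runs.
N1qlQueryResult result = bucket.query(
        N1qlQuery.simple(GET_COURSE_BY_ID_QUERY,
                N1qlParams.build().consistentWith(mutationState)));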
I am using elastic-apm with a Spring application to monitor API requests and track all SQL statements executed for a given endpoint. The problem is that, given the amount of traffic, Elasticsearch is collecting a huge amount of data, and I would like to enable capturing spans only for specific endpoints.
I tried using the public API of elastic-apm: https://www.elastic.co/guide/en/apm/agent/java/current/public-api.html
I can customize a transaction and span, but I couldn't find a way to enable/disable them for specific endpoints.
I have tried this, but no luck:
ElasticApm.currentSpan().startSpan();
ElasticApm.currentSpan().end();
Looks like it can be done using the drop_event processor in apm-server.yml.
processors:
  - drop_event:
      when:
        equals:
          transaction.custom.transactions_sampled: false
and in the code, set the custom context:
Transaction elasticTransaction = ElasticApm.currentTransaction();
elasticTransaction.addCustomContext("transactions.sampled", false);
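To flip that flag per endpoint, one option is a request interceptor. This is a minimal sketch, assuming Spring MVC; the interceptor class name and the endpoint paths are hypothetical:
import co.elastic.apm.api.ElasticApm;
import org.springframework.web.servlet.HandlerInterceptor;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical interceptor: mark only selected endpoints for sampling; all
// other transactions carry the flag as false and get dropped by the
// drop_event processor above.
public class ApmSamplingInterceptor implements HandlerInterceptor {

    private static final Set<String> SAMPLED_PATHS =
            new HashSet<>(Arrays.asList("/api/orders", "/api/payments"));

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response, Object handler) {
        boolean sampled = SAMPLED_PATHS.contains(request.getRequestURI());
        // Dots in custom context keys appear de-dotted in the stored document,
        // so this key matches transaction.custom.transactions_sampled above.
        ElasticApm.currentTransaction().addCustomContext("transactions.sampled", sampled);
        return true;
    }
}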
I am having some trouble with the Java async driver (3.8.1).
Let me describe my environment:
I have a replica set (rs0) with 3 instances: let me call them A, B, and C.
In my application I use MongoDB with two different Java drivers, sync and async.
At first I had no problems, but when the primary went down (and came back up a few minutes later as a secondary), the part of the code that uses the async driver was no longer able to use transactions and sessions.
The error is the following:
com.mongodb.MongoClientException: Sessions are not supported by the MongoDB cluster to which this client is connected
at com.mongodb.async.client.MongoClientImpl$1.onResult(MongoClientImpl.java:90)
at com.mongodb.async.client.MongoClientImpl$1.onResult(MongoClientImpl.java:83)
at com.mongodb.async.client.ClientSessionHelper$2.onResult(ClientSessionHelper.java:77)
at com.mongodb.async.client.ClientSessionHelper$2.onResult(ClientSessionHelper.java:73)
at com.mongodb.internal.connection.BaseCluster$ServerSelectionRequest.onResult(BaseCluster.java:433)
at com.mongodb.internal.connection.BaseCluster.handleServerSelectionRequest(BaseCluster.java:309)
at com.mongodb.internal.connection.BaseCluster.access$800(BaseCluster.java:65)
at com.mongodb.internal.connection.BaseCluster$WaitQueueHandler.run(BaseCluster.java:482)
at java.lang.Thread.run(Unknown Source)
2019-01-21 17:02:01.906 ERROR 17560 --- [271de4498944329] org.mongodb.driver.client : Callback onResult call produced an error
java.lang.NullPointerException: null
at it.mypackage.mongo.service.ProcessoDocumentService$1.onResult(ProcessoDocumentService.java:124)
at it.mypackage.mongo.service.ProcessoDocumentService$1.onResult(ProcessoDocumentService.java:1)
at com.mongodb.internal.async.ErrorHandlingResultCallback.onResult(ErrorHandlingResultCallback.java:49)
at com.mongodb.async.client.MongoClientImpl$1.onResult(MongoClientImpl.java:90)
at com.mongodb.async.client.MongoClientImpl$1.onResult(MongoClientImpl.java:83)
at com.mongodb.async.client.ClientSessionHelper$2.onResult(ClientSessionHelper.java:77)
at com.mongodb.async.client.ClientSessionHelper$2.onResult(ClientSessionHelper.java:73)
at com.mongodb.internal.connection.BaseCluster$ServerSelectionRequest.onResult(BaseCluster.java:433)
at com.mongodb.internal.connection.BaseCluster.handleServerSelectionRequest(BaseCluster.java:309)
at com.mongodb.internal.connection.BaseCluster.access$800(BaseCluster.java:65)
at com.mongodb.internal.connection.BaseCluster$WaitQueueHandler.run(BaseCluster.java:482)
at java.lang.Thread.run(Unknown Source)
Just FYI, if I comment out the part of the code that uses sessions and transactions, the error is a classic timeout, as if the driver were no longer able to find the replica set.
Could someone help me? What am I missing?
This is how I create my MongoClient:
connectionString = new ConnectionString("mongodb://address1:27017,address2:27018,address3:27019/?replicaSet=rs0");
MongoClientSettings settings = MongoClientSettings.builder().applyConnectionString(connectionString)
.build();
settings = settings.builder().credential(credential).build();
asyncMongoClientInstance = MongoClients.create(settings);
I found the solution myself; as the wise man once said: "If you want help, find it at the end of your arm."
Let's focus on this part of the code:
connectionString = new ConnectionString("mongodb://address1:27017,address2:27018,address3:27019/?replicaSet=rs0");
MongoClientSettings settings = MongoClientSettings.builder().applyConnectionString(connectionString)
.build();
settings = settings.builder().credential(credential).build();
asyncMongoClientInstance = MongoClients.create(settings);
I was reassigning the settings variable to a new object built from a fresh builder, without the connection string. So the async library no longer knew which addresses to connect to.
Why did I do that? I wanted to dynamically add credentials to the settings, but that is not possible this way: settings.builder() just invokes the static MongoClientSettings.builder(), which starts from scratch. So now I build each settings object in a single chain, one with credentials and one without, instead of deriving one from the other.
MongoClientSettings settings = MongoClientSettings.builder().applyConnectionString(connectionString).credential(credential).build();
It definitely works with this object now.
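Putting it all together, the working construction looks like this (a minimal sketch; the user, database, and password in createCredential are placeholders):
// Build the settings once, in a single builder chain, so the connection
// string and the credentials end up in the same object.
ConnectionString connectionString = new ConnectionString(
        "mongodb://address1:27017,address2:27018,address3:27019/?replicaSet=rs0");
MongoCredential credential = MongoCredential.createCredential(
        "myUser", "admin", "myPassword".toCharArray());
MongoClientSettings settings = MongoClientSettings.builder()
        .applyConnectionString(connectionString)
        .credential(credential)
        .build();
// com.mongodb.async.client.MongoClients
asyncMongoClientInstance = MongoClients.create(settings);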
I have a software component which submits MR jobs to Hadoop. I now want to check whether there are other jobs running before submitting. I found out that there is a Cluster object in the new API which can be used to query the cluster for running jobs, get their configurations, and extract the relevant information from them. However, I am having problems using it.
Just doing new Cluster(conf) where conf is a valid Configuration which can be used to access this cluster (e.g., to submit jobs to it) leaves the object unconfigured, and the getAllJobStatuses() method of Cluster returns null.
Extracting mapreduce.jobtracker.address from the configuration, constructing an InetSocketAddress from it and using the other constructor of Cluster throws Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses..
Using the old api, doing something like new JobClient(conf).getAllJobs() throws an NPE.
What am I missing here? How can I programmatically get the running jobs?
I investigated even more, and I solved it. Thomas Jungblut was right, it was because of the mini cluster. I used the mini cluster following this blog post which turned out to work for MR jobs, but set up the mini cluster in a deprecated way with an incomplete configuration. The Hadoop Wiki has a page on how to develop unit tests which also explains how to correctly set up a mini cluster.
Essentially, I do the mini cluster setup the following way:
// Create a YarnConfiguration for bootstrapping the minicluster
final YarnConfiguration bootConf = new YarnConfiguration();
// Base directory to store HDFS data in
final File hdfsBase = Files.createTempDirectory("temp-hdfs-").toFile();
bootConf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, hdfsBase.getAbsolutePath());
// Start Mini DFS cluster
final MiniDFSCluster hdfsCluster = new MiniDFSCluster.Builder(bootConf).build();
// Configure and start Mini MR YARN cluster
bootConf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 64);
bootConf.setClass(YarnConfiguration.RM_SCHEDULER, FifoScheduler.class, ResourceScheduler.class);
final MiniMRYarnCluster yarnCluster = new MiniMRYarnCluster("test-cluster", 1);
yarnCluster.init(bootConf);
yarnCluster.start();
// Get the "real" Configuration to use from now on
final Configuration conf = yarnCluster.getConfig();
// Get the filesystem
final FileSystem fs = new Path("hdfs://localhost:" + hdfsCluster.getNameNodePort() + "/").getFileSystem(conf);
Now I have a conf and fs I can use to submit jobs and access HDFS, and new Cluster(conf) and cluster.getAllJobStatuses() work as expected.
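To answer the original question, checking for running jobs before submitting now looks like this (a minimal sketch against the conf obtained above, using org.apache.hadoop.mapreduce.Cluster and JobStatus):
// List all jobs known to the cluster and pick out the running ones.
Cluster cluster = new Cluster(conf);
for (JobStatus status : cluster.getAllJobStatuses()) {
    if (status.getState() == JobStatus.State.RUNNING) {
        System.out.println("Running: " + status.getJobID());
    }
}
cluster.close();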
When everything is done, to shut down and clean up, I call:
yarnCluster.stop();
hdfsCluster.shutdown();
FileUtils.deleteDirectory(hdfsBase); // from Apache Commons IO
Note: JAVA_HOME must be set for this to work. When building on Jenkins, make sure JAVA_HOME is set for the default JDK. Alternatively, you can explicitly specify a JDK to use; Jenkins will then set up JAVA_HOME automatically.
I tried it like this and it worked for me, but only after submitting the job:
JobClient jc = new JobClient(job.getConfiguration());
for (JobStatus js : jc.getAllJobs()) {
    if (js.getState().getValue() == State.RUNNING.getValue()) {
        // a job is currently running
    }
}
jc.close();
Alternatively, we can get the Cluster from the Job API; it has methods which return all the jobs and their statuses:
cluster.getAllJobStatuses();
I am trying to retrieve all the instances running in my AWS account (instance IDs, etc.) using the following code. I am not able to print the instance IDs; when I debug, I just get null values. But I have three instances running on AWS. Can someone point out what I am doing wrong here?
DescribeInstancesResult result = ec2.describeInstances();
List<Reservation> reservations = result.getReservations();
for (Reservation reservation : reservations) {
    List<Instance> instances = reservation.getInstances();
    for (Instance instance : instances) {
        System.out.println(instance.getInstanceId());
    }
}
The most common cause of issues like this is a missing region specification when initializing the client; see the section 'To create and initialize an Amazon EC2 client' within 'Create an Amazon EC2 Client' for details.
Specifically, step 2 only creates an EC2 client without specifying the region explicitly:
2) Use the AWSCredentials object to create a new AmazonEC2Client instance, as follows:
amazonEC2Client = new AmazonEC2Client(credentials);
This yields a client talking to us-east-1. Surprisingly, the AWS SDKs and the AWS Management Console use different defaults, as outlined in step 3, which also shows how to specify a different endpoint:
3) By default, the service endpoint is ec2.us-east-1.amazonaws.com. To specify a different endpoint, use the setEndpoint method. For example:
amazonEC2Client.setEndpoint("ec2.us-west-2.amazonaws.com");
The AWS SDK for Java uses US East (N. Virginia) as the default region
if you do not specify a region in your code. However, the AWS
Management Console uses US West (Oregon) as its default. Therefore,
when using the AWS Management Console in conjunction with your
development, be sure to specify the same region in both your code and
the console. [emphasis mine]
The differing defaults are easy to trip over, and the respective default in the AWS Management Console has in fact changed over time. As so often in software development, I recommend always being explicit about this in your code to avoid such subtle error sources.
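For illustration, with the newer client builder the region can be pinned explicitly; a minimal sketch, where Regions.US_WEST_2 stands in for whichever region your instances actually run in:
// Build the client against an explicit region instead of relying on defaults
// (com.amazonaws.services.ec2.AmazonEC2ClientBuilder, com.amazonaws.regions.Regions).
AmazonEC2 ec2 = AmazonEC2ClientBuilder.standard()
        .withRegion(Regions.US_WEST_2)
        .build();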
So I have a Java program running within an Amazon EC2 instance. Is there a way to programmatically get its own tags? I have tried instantiating a new AmazonEC2Client to use the describeTags() function, but it only gives me null. Any help would be appreciated, thank you.
Edit: To make things clearer, the instances are going to be unmanned worker machines spun up solely to do some computations.
This should help you get started...
String instanceId = EC2MetadataUtils.getInstanceId();
AmazonEC2 client = AmazonEC2ClientBuilder.standard()
        .withCredentials(new DefaultAWSCredentialsProviderChain())
        .build();
DescribeTagsRequest req = new DescribeTagsRequest()
        .withFilters(new Filter("resource-id", Collections.singletonList(instanceId)));
DescribeTagsResult describeTagsResult = client.describeTags(req);
List<TagDescription> tags = describeTagsResult.getTags();
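From there, pulling a single tag's value out of that list is straightforward; a small sketch, where "Environment" is a hypothetical tag key:
// Hypothetical lookup of one tag's value from the list above.
String environment = tags.stream()
        .filter(tag -> "Environment".equals(tag.getKey()))
        .map(TagDescription::getValue)
        .findFirst()
        .orElse(null);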
You should be able to get the current instance ID by sending a request to http://169.254.169.254/latest/meta-data/instance-id. This only works from within EC2. With this you can access quite a bit of information about the instance; however, tags do not appear to be included.
You should be able to take the instance ID along with the correct authentication to get the instance tags. If you are going to run this on an instance, you may want to provide an IAM user with limited access instead of one with access to everything, in case the instance is compromised.
While using user-data may be the simplest solution, the OP was asking specifically about tagging, and unfortunately Amazon hasn't made this as easy as it could be. However, it can be done. You want to use a combination of two Amazon services.
First you need to retrieve the Instance ID. This can be achieved by hitting the URL from within your instance:
http://169.254.169.254/latest/meta-data/instance-id
Once you have the resource ID, you'll want to use Amazon's EC2 API to access the tags. Since you said you're using Java, I would suggest using the AWS SDK for Java that Amazon makes available. Within this SDK you'll find a method called describeTags (documentation). You can use a resource ID as one of the filters to get the tags specific to your instance. Supported filters are:
tag key
resource-id
resource-type
I suggest doing this retrieval at boot using something like cloud-init and caching the tags on your server for use later if necessary.