EMR cluster hangs in Step state 'Running/Pending' - java

I am launching an EMR cluster through the Java SDK with a custom jar step. The cluster launches successfully, but after bootstrapping, while the step is in the Pending/Running state, the cluster gets stuck.
I am not even able to SSH into the machine.
Here is my code to launch the cluster with the custom jar step:
String dataTrasnferJar = "s3://test/testApplication.jar";
if (dataTrasnferJar == null || dataTrasnferJar.isEmpty())
throw new InvalidS3ObjectException(
"EMR custom jar file path is null/empty. Please provide a valid jar file path");
HadoopJarStepConfig customJarConfig = new HadoopJarStepConfig().withJar(dataTrasnferJar);
StepConfig customJarStep = new StepConfig("Mongo_to_S3_Data_Transfer", customJarConfig)
.withActionOnFailure(ActionOnFailure.CONTINUE);
AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.standard()
.withCredentials(awsCredentialsProvider)
.withRegion(region)
.build();
Application spark = new Application().withName("Spark");
String clusterName = "my-cluster-" + System.currentTimeMillis();
RunJobFlowRequest request = new RunJobFlowRequest()
.withName(clusterName)
.withReleaseLabel("emr-6.0.0")
.withApplications(spark)
.withVisibleToAllUsers(true)
.withSteps(customJarStep)
.withLogUri(loggingS3Bucket)
.withServiceRole("EMR_DefaultRole")
.withJobFlowRole("EMR_EC2_DefaultRole")
.withInstances(new JobFlowInstancesConfig()
.withEc2KeyName(key_pair)
.withInstanceCount(instanceCount)
.withEc2SubnetIds(subnetId)
.withAdditionalMasterSecurityGroups(securityGroup)
.withKeepJobFlowAliveWhenNoSteps(true)
.withMasterInstanceType(instanceType));
RunJobFlowResult result = emr.runJobFlow(request);

The emr-6.0.0 release is still in development. Can you try the same with emr-5.29.0?
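As a side note, a small diagnostic sketch (not from the original answer; it assumes the emr client and result variables from the question's code and the model classes from the AWS SDK for Java v1 com.amazonaws.services.elasticmapreduce.model package) for checking where the cluster and step are stuck:
// Hedged sketch: poll the cluster and step states after runJobFlow returns.
// Assumes `emr` and `result` from the code above (AWS SDK for Java v1).
String clusterId = result.getJobFlowId();
String clusterState = emr.describeCluster(new DescribeClusterRequest().withClusterId(clusterId))
        .getCluster().getStatus().getState();
System.out.println("Cluster state: " + clusterState);
for (StepSummary step : emr.listSteps(new ListStepsRequest().withClusterId(clusterId)).getSteps()) {
    System.out.println(step.getName() + " -> " + step.getStatus().getState());
}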

Unable to execute java code in Linux server

I have built a Java API service that calls Dialogflow to get a response. When I run the code on Windows it works fine, but after I publish it on a Linux server and try to test it, it throws an error.
I set the credentials with export GOOGLE_APPLICATION_CREDENTIALS="[PATH]" as a system environment variable. I have also set the path on the Linux server, but it does not work. Then I put the .json file in the project and tried to read it directly before making the call again. (Both approaches work fine only on my local system, not on the Linux server after publishing the project.)
List<String> texts= new ArrayList<String>();
texts.add("Hi");
String sessionId="3f46dfa4-5204-84f3-1488-5556f3d6b8a1";
String languageCode="en-US";
GoogleCredentials credentials = GoogleCredentials.fromStream(new FileInputStream("E:\\GoogleDialogFlow\\ChatBox\\Json.json"))
.createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
SessionsSettings.Builder settingsBuilder = SessionsSettings.newBuilder();
SessionsSettings sessionsSettings = settingsBuilder.setCredentialsProvider(FixedCredentialsProvider.create(credentials)).build();
SessionsClient sessionsClient = SessionsClient.create(sessionsSettings);
SessionName session = SessionName.of(projectId, sessionId);
com.google.cloud.dialogflow.v2.TextInput.Builder textInput = TextInput.newBuilder().setText("Hi").setLanguageCode(languageCode);
QueryInput queryInput = QueryInput.newBuilder().setText(textInput).build();
DetectIntentResponse response = sessionsClient.detectIntent(session, queryInput);
QueryResult queryResult = response.getQueryResult();
System.out.println("====================");
System.out.format("Query Text: '%s'\n", queryResult.getQueryText());
System.out.format("Detected Intent: %s (confidence: %f)\n",
queryResult.getIntent().getDisplayName(), queryResult.getIntentDetectionConfidence());
System.out.format("Fulfillment Text: '%s'\n", queryResult.getFulfillmentText());
credentials.toBuilder().build();
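One detail worth flagging: the credentials are loaded from the hard-coded Windows path E:\GoogleDialogFlow\ChatBox\Json.json, which cannot exist on the Linux server. A minimal sketch, assuming the service-account JSON has been copied to the server and GOOGLE_APPLICATION_CREDENTIALS points at it, of loading the key from the environment instead:
// Sketch: resolve the service-account key from the environment instead of a fixed Windows path.
// Assumes GOOGLE_APPLICATION_CREDENTIALS is exported on the Linux server and points at the JSON key.
String keyPath = System.getenv("GOOGLE_APPLICATION_CREDENTIALS");
if (keyPath == null || keyPath.isEmpty()) {
    throw new IllegalStateException("GOOGLE_APPLICATION_CREDENTIALS is not set");
}
GoogleCredentials credentials = GoogleCredentials
        .fromStream(new FileInputStream(keyPath))
        .createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
SessionsSettings sessionsSettings = SessionsSettings.newBuilder()
        .setCredentialsProvider(FixedCredentialsProvider.create(credentials))
        .build();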

I have trouble with updating existing aws cloud front CNAMEs

I am trying to add CNAMEs to an existing distribution in AWS CloudFront programmatically.
I have tried the following code, but it did not produce any result. If someone knows how to do this programmatically, please be kind enough to mention it. Thank you.
AmazonCloudFront cloudFront = AmazonCloudFrontAsyncClientBuilder.standard()
.withRegion(Regions.AP_EAST_1)
.withCredentials(new AWSStaticCredentialsProvider(
new BasicAWSCredentials(route53Manager.getAccessKey(), route53Manager.getSecretKey())))
.build();
GetDistributionConfigResult result = cloudFront.getDistributionConfig(
new GetDistributionConfigRequest("E1EJBNNYJZ6G34"));
Aliases aliases = new Aliases()
.withItems(subDomain)
.withQuantity(1);
DistributionConfig config = result.getDistributionConfig()
.withEnabled(true)
.withAliases(aliases);
It looks like you are missing the update-distribution call and a few other things. See the code below:
AmazonCloudFront cloudFront = AmazonCloudFrontAsyncClientBuilder.standard()
.withRegion(Regions.AP_EAST_1)
.withCredentials(new AWSStaticCredentialsProvider(
new BasicAWSCredentials(route53Manager.getAccessKey(), route53Manager.getSecretKey())))
.build();
//create the request
GetDistributionConfigRequest distributionConfigRequest = new GetDistributionConfigRequest("E1EJBNNYJZ6G34");
//submit the request and get the resulting config
GetDistributionConfigResult distributionConfigResult = cloudFront.getDistributionConfig(distributionConfigRequest);
Aliases aliases = new Aliases()
.withItems(subDomain)
.withQuantity(1);
DistributionConfig config = distributionConfigResult.getDistributionConfig()
.withEnabled(true)
.withAliases(aliases);
//create the update request
UpdateDistributionRequest updateDistributionRequest = new UpdateDistributionRequest(config, distributionConfigRequest.getId(), distributionConfigResult.getETag());
//submit the request to update the config
UpdateDistributionResult updateDistributionResult = cloudFront.updateDistribution(updateDistributionRequest);
//print output of result to console
System.out.println(updateDistributionResult);
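One caveat, not covered above: withAliases replaces whatever CNAMEs the distribution already has, so only subDomain would remain after the update. A small sketch (using the distributionConfigResult and subDomain variables from the code above) that appends the new CNAME to the existing list instead:
// Sketch: merge the new CNAME with the aliases already on the distribution
// instead of replacing them (assumes distributionConfigResult and subDomain from above).
DistributionConfig existingConfig = distributionConfigResult.getDistributionConfig();
List<String> cnames = new ArrayList<>(existingConfig.getAliases().getItems());
cnames.add(subDomain);
DistributionConfig mergedConfig = existingConfig
        .withEnabled(true)
        .withAliases(new Aliases().withItems(cnames).withQuantity(cnames.size()));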

Redirecting Standard Output of a JAR in c#

I have a C# app that at some point needs to communicate with a JAR app. No problem; I use the following code to start the app, but unfortunately the window that should show the app's console remains black and the jar file runs in GUI mode.
If I call the same file with the same command from Run or CMD, it works and the console registers messages from the JAR. Any ideas why, when started from my C# app, it won't register any message to the console?
ProcessStartInfo psi32 = default(ProcessStartInfo);
Process proc1 = new Process();
string path1 = Application.StartupPath;
dynamic process3 = path1 + "java.exe";
dynamic jar = "-jar ";
dynamic param1 = "/ssh.jar";
dynamic args3 = string.Format("{0}{1} {2}", jar, path1, param1);
psi32 = new ProcessStartInfo(process3, args3);
psi32.RedirectStandardInput = true;
psi32.UseShellExecute = false;
proc1.StartInfo = psi32;
proc1.Start();
string message;
using (StreamReader reader = proc1.StandardOutput)
{
message= reader.ReadToEnd();
}
if (message.Contains("failed"))
{
message = "found it...";
}

Write JavaPairDStream to single HDFS location

I am exploring Spark Streaming using Java.
I currently have the Cloudera QuickStart VM (CDH 5.5) downloaded, and I have written Java code for Spark Streaming.
I have written a program that returns a JavaPairDStream. When I try to write the output to HDFS, it works, but it creates multiple folders (one per batch, based on the timestamp). The documentation says that this is how it is supposed to work, but is there a way to write the output to the same folder/file in HDFS? I tried to use repartition(1), but that did not work.
Please see the code below:
if (args.length < 3) {
System.err.println("Invalid arguments");
System.exit(1);
}
SparkConf sparkConf = new SparkConf().setMaster("local").setAppName("Product Reco Spark Streaming");
JavaStreamingContext javaStreamContext = new JavaStreamingContext(sparkConf, new Duration(10000));
String inputFile = args[0];
String outputPath = args[1];
String outputFile = args[2];
JavaDStream<String> dStream = javaStreamContext.textFileStream(inputFile);
JavaPairDStream<String, String> finalDStream = fetchProductRecommendation(dStream); // Does some logic to get the final DStream
finalDStream.print();
finalDStream.repartition(1).saveAsNewAPIHadoopFiles(outputPath, outputFile, String.class, String.class, TextOutputFormat.class);
javaStreamContext.start();
javaStreamContext.awaitTermination();
To run this program, here is the command that I am using
spark-submit --master local /home/cloudera/Spark/JarLib_ProductRecoSparkStream.jar /user/ProductRecomendations/SparkInput/ /user/ProductRecomendations/SparkOutput/ productRecoOutput
Please let me know if you need more information, since this is the first time I am writing Spark Streaming code.
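For what it's worth, one workaround sketch (not from this thread; it assumes a Spark release with the lambda-friendly foreachRDD overload, roughly 1.6 and later, plus the finalDStream and outputPath variables above): coalesce each batch to a single partition inside foreachRDD so that every batch produces exactly one part file. HDFS still refuses to overwrite an existing output directory, so a per-batch suffix remains necessary:
// Sketch: one part file per batch via coalesce(1) inside foreachRDD.
// Register this before javaStreamContext.start(); the "batch-" prefix is just an illustrative name.
finalDStream.foreachRDD((rdd, time) -> {
    if (!rdd.isEmpty()) {
        rdd.coalesce(1).saveAsTextFile(outputPath + "/batch-" + time.milliseconds());
    }
});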

AWS was not able to validate the provided access credentials

I have been trying to create a security group using the AWS SDK, but somehow it fails to authenticate. I have given administrative rights to the specific access key and secret key, yet it still fails to validate. On the other hand, when I tried the same credentials with the AWS S3 example, it executed successfully.
I get the following error while creating the security group:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 1584a035-9a88-4dc7-b5e2-a8b7bde6f43c)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:9393)
at com.amazonaws.services.ec2.AmazonEC2Client.createSecurityGroup(AmazonEC2Client.java:1146)
at com.sunil.demo.ec2.SetupEC2.createSecurityGroup(SetupEC2.java:84)
at com.sunil.demo.ec2.SetupEC2.main(SetupEC2.java:25)
Here is the Java Code:
public class SetupEC2 {
AWSCredentials credentials = null;
AmazonEC2Client amazonEC2Client ;
public static void main(String[] args) {
SetupEC2 setupEC2Instance = new SetupEC2();
setupEC2Instance.init();
setupEC2Instance.createSecurityGroup();
}
public void init(){
// Initialize AWS credentials
try {
credentials = new BasicAWSCredentials("XXXXXXXX", "XXXXXXXXX");
} catch (Exception e) {
throw new AmazonClientException(
"Cannot load the credentials from the credential profiles file. " +
"Please make sure that your credentials file is at the correct " +
"location (/home/sunil/.aws/credentials), and is in valid format.",
e);
}
// Initialize EC2 instance
try {
amazonEC2Client = new AmazonEC2Client(credentials);
amazonEC2Client.setEndpoint("ec2.ap-southeast-1.amazonaws.com");
amazonEC2Client.setRegion(Region.getRegion(Regions.AP_SOUTHEAST_1));
} catch (Exception e) {
e.printStackTrace();
}
}
public boolean createSecurityGroup(){
boolean securityGroupCreated = false;
String groupName = "sgec2securitygroup";
String sshIpRange = "0.0.0.0/0";
String sshprotocol = "tcp";
int sshFromPort = 22;
int sshToPort =22;
String httpIpRange = "0.0.0.0/0";
String httpProtocol = "tcp";
int httpFromPort = 80;
int httpToPort = 80;
String httpsIpRange = "0.0.0.0/0";
String httpsProtocol = "tcp";
int httpsFromPort = 443;
int httpsToProtocol = 443;
try {
CreateSecurityGroupRequest createSecurityGroupRequest = new CreateSecurityGroupRequest();
createSecurityGroupRequest.withGroupName(groupName).withDescription("Created from AWS SDK Security Group");
createSecurityGroupRequest.setRequestCredentials(credentials);
CreateSecurityGroupResult csgr = amazonEC2Client.createSecurityGroup(createSecurityGroupRequest);
String groupid = csgr.getGroupId();
System.out.println("Security Group Id : " + groupid);
System.out.println("Create Security Group Permission");
Collection<IpPermission> ips = new ArrayList<IpPermission>();
// Permission for SSH only to your ip
IpPermission ipssh = new IpPermission();
ipssh.withIpRanges(sshIpRange).withIpProtocol(sshprotocol).withFromPort(sshFromPort).withToPort(sshToPort);
ips.add(ipssh);
// Permission for HTTP, anyone can access
IpPermission iphttp = new IpPermission();
iphttp.withIpRanges(httpIpRange).withIpProtocol(httpProtocol).withFromPort(httpFromPort).withToPort(httpToPort);
ips.add(iphttp);
// Permission for HTTPS, anyone can access
IpPermission iphttps = new IpPermission();
iphttps.withIpRanges(httpsIpRange).withIpProtocol(httpsProtocol).withFromPort(httpsFromPort).withToPort(httpsToProtocol);
ips.add(iphttps);
System.out.println("Attach Owner to security group");
// Register this security group with owner
AuthorizeSecurityGroupIngressRequest authorizeSecurityGroupIngressRequest = new AuthorizeSecurityGroupIngressRequest();
authorizeSecurityGroupIngressRequest.withGroupName(groupName).withIpPermissions(ips);
amazonEC2Client.authorizeSecurityGroupIngress(authorizeSecurityGroupIngressRequest);
securityGroupCreated = true;
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace();
securityGroupCreated = false;
}
System.out.println("securityGroupCreated: " + securityGroupCreated);
return securityGroupCreated;
}
}
Try updating your system time.
When the difference between the AWS date/time and your date/time is too big, the credentials will not be accepted.
For Debian/Ubuntu users:
If you have never set your time zone, you can do so with:
sudo dpkg-reconfigure tzdata
Stop the NTP service, because time differences that are too large cannot be corrected by the running service:
sudo /etc/init.d/ntp stop
Synchronize your time and date (-q: set the time and quit, i.e. run only once; -g: allow the first adjustment to be big; -x: slew up to 600 seconds, so large differences are also adjusted; -n: do not fork, the process will not go into the background):
sudo ntpd -q -g -x -n
Restart the service:
sudo /etc/init.d/ntp start
Check the current system date/time:
sudo date
Write the system date/time to your hardware clock:
sudo hwclock --systohc
Show your hardware clock's date/time:
sudo hwclock
You must specify the profile and the region:
aws ec2 describe-instances --profile nameofyourprofile --region eu-west-1
"A client error (AuthFailure) occurred when calling the [Fill-in the blanks] operation: AWS was not able to validate the provided access credentials"
If you are confident that your AWS credentials (access key, secret key, and the corresponding profile name) are valid, your date and time being off is a very likely culprit.
In my case I was confident, but I was wrong: I had used the wrong keys. It doesn't hurt to double-check.
Let's say that you created an IAM user called "guignol". Configure "guignol" in ~/.aws/config as follows:
[profile guignol]
region = us-east-1
aws_access_key_id = AKXXXYYY...
aws_secret_access_key = ...
Install the AWS CLI (command line interface) if you haven't already done so. As a test, run aws ec2 describe-instances --profile guignol. If you get an error message that AWS was not able to validate the credentials, run aws configure --profile guignol, enter your credentials, and run the test command again.
If you put your credentials in ~/.aws/credentials, then you don't need to provide a parameter to your AmazonEC2Client call. If you do this, then on an EC2 instance the same code will work with assumed STS roles.
For more info see: http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/credentials.html
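As a sketch of that suggestion (assuming an AWS SDK for Java v1 version that ships AmazonEC2ClientBuilder, and credentials already present in ~/.aws/credentials or an instance profile), the client can be built without passing keys at all:
// Sketch: let the default credentials provider chain (environment variables,
// ~/.aws/credentials, or the EC2 instance profile) supply the credentials.
AmazonEC2 ec2 = AmazonEC2ClientBuilder.standard()
        .withRegion(Regions.AP_SOUTHEAST_1)
        .build();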
In my case, killing the terminal and running the command again helped.
In my case I copied the CDK environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN for programmatic access, but it turned out I already had an old session token in my ~/.aws/credentials which I had forgotten about. I needed to remove the old tokens from the file.
