InstanceProfile is required for creating cluster - java

I was trying to run Elastic MapReduce from Eclipse but couldn't do so.
My code is as below:
public class RunEMR {

    /**
     * @param args
     */
    public static void main(String[] args) {
        AWSCredentials credentials = new BasicAWSCredentials("xxxx", "xxxx");
        AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(credentials);

        StepFactory stepFactory = new StepFactory();

        StepConfig enableDebugging = new StepConfig()
                .withName("Enable Debugging")
                .withActionOnFailure("TERMINATE_JOB_FLOW")
                .withHadoopJarStep(stepFactory.newEnableDebuggingStep());

        StepConfig installHive = new StepConfig()
                .withName("Install Hive")
                .withActionOnFailure("TERMINATE_JOB_FLOW")
                .withHadoopJarStep(stepFactory.newInstallHiveStep());

        StepConfig hiveScript = new StepConfig()
                .withName("Hive Script")
                .withActionOnFailure("TERMINATE_JOB_FLOW")
                .withHadoopJarStep(stepFactory.newRunHiveScriptStep("s3://mywordcountbuckett/binary/WordCount.jar"));

        RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("Hive Interactive")
                .withSteps(enableDebugging, installHive)
                .withLogUri("s3://mywordcountbuckett/")
                .withInstances(new JobFlowInstancesConfig()
                        .withEc2KeyName("xxxx")
                        .withHadoopVersion("0.20")
                        .withInstanceCount(3)
                        .withKeepJobFlowAliveWhenNoSteps(true)
                        .withMasterInstanceType("m1.small")
                        .withSlaveInstanceType("m1.small"));

        RunJobFlowResult result = emr.runJobFlow(request);
    }
}
The error that I got was:
Exception in thread "main" com.amazonaws.AmazonServiceException: InstanceProfile is required for creating cluster. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 7a96ee32-9744-11e5-947d-65ca8f7db0a5)
I have tried for a couple of hours but have been unable to fix it. Does anyone know how?

I got the same exception, InstanceProfile is required for creating cluster.
I had to set the service role and the job flow role as below:
aRunJobFlowRequest.setServiceRole("EMR_DefaultRole");
aRunJobFlowRequest.setJobFlowRole("EMR_EC2_DefaultRole");
After that I was OK.
The AWS documentation for EMR IAM roles says:
AWS Identity and Access Management (IAM) roles provide a way for IAM users or AWS services to have certain specified permissions and access to resources. For example, this may allow users to access resources or other services to act on your behalf. You must specify two IAM roles for a cluster: a role for the Amazon EMR service (service role), and a role for the EC2 instances (instance profile) that Amazon EMR manages.
So the word InstanceProfile in the exception message presumably refers to the role for the EC2 instances (the instance profile) from the documentation, but I got past that exception after specifying the JobFlowRole, which is a little weird.
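Applied to the request in the question, a minimal sketch might look like this (assuming the default roles EMR_DefaultRole and EMR_EC2_DefaultRole already exist in the account, e.g. created beforehand with aws emr create-default-roles):
RunJobFlowRequest request = new RunJobFlowRequest()
        .withName("Hive Interactive")
        .withSteps(enableDebugging, installHive)
        .withLogUri("s3://mywordcountbuckett/")
        // Both roles are required by the newer EMR API.
        .withServiceRole("EMR_DefaultRole")
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withInstances(new JobFlowInstancesConfig()
                .withEc2KeyName("xxxx")
                .withInstanceCount(3)
                .withKeepJobFlowAliveWhenNoSteps(true)
                .withMasterInstanceType("m1.small")
                .withSlaveInstanceType("m1.small"));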

For an EC2 role (here the job flow role), an instance profile with the same name is created internally, hence the names are used interchangeably.
If you are creating an EMR cluster from scratch (for example with boto3), you should also create the EMR service role, an EC2 job flow role, and an instance profile linked to that EC2 job flow role.
AWS doc
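Although the answer mentions boto3, the same setup can be sketched with the Java IAM client; the role and profile names below are placeholders, and the roles themselves (with their trust and permission policies) are assumed to exist already:
// Rough sketch: create an instance profile and link the EC2 (job flow) role to it.
AmazonIdentityManagement iam = AmazonIdentityManagementClientBuilder.defaultClient();
iam.createInstanceProfile(new CreateInstanceProfileRequest()
        .withInstanceProfileName("EMR_EC2_DefaultRole"));
iam.addRoleToInstanceProfile(new AddRoleToInstanceProfileRequest()
        .withInstanceProfileName("EMR_EC2_DefaultRole")
        .withRoleName("EMR_EC2_DefaultRole"));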

The version you are trying to use is deprecated, and IAM roles are now required. Follow the example given in the documentation: http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/calling-emr-with-java-sdk.html.

Related

Creating EKS cluster by using Java application

Everyone, I am new to the AWS SDK. I am trying to create an EKS cluster from my Java application.
I have used the eksctl create cluster command to create a cluster, and I have also done this using cluster templates.
I have tried to use the AWS SDK to create clusters, but that didn't work and I have no idea how to go about it.
If anyone has good sample code or an explanation of using the AWS SDK to create a cluster (with a cluster template or otherwise), that would be very helpful.
Here is a sample of Java code; I hope it serves your purpose for EKS cluster creation:
String accessKey = "your_aws_access_key";
String secretKey = "your_aws_secret_key";

AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);

ClientConfiguration clientConfig = new ClientConfiguration();
clientConfig.setProtocol(Protocol.HTTPS);
clientConfig.setMaxErrorRetry(PredefinedRetryPolicies.DEFAULT_MAX_ERROR_RETRY);
clientConfig.setRetryPolicy(new RetryPolicy(PredefinedRetryPolicies.DEFAULT_RETRY_CONDITION,
        PredefinedRetryPolicies.DEFAULT_BACKOFF_STRATEGY,
        PredefinedRetryPolicies.DEFAULT_MAX_ERROR_RETRY, false));

AmazonEKS amazonEKS = AmazonEKSClientBuilder.standard()
        .withClientConfiguration(clientConfig)
        .withCredentials(new AWSStaticCredentialsProvider(credentials))
        .withRegion("us-east-1") // replace with your region
        .build();

CreateClusterResult eksCluster = amazonEKS.createCluster(
        new CreateClusterRequest().withName("cluster-name") // plus other parameters
);
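Note that the EKS CreateCluster call also needs a cluster role ARN and a VPC configuration before it will succeed; a minimal sketch of those extra parameters (the role ARN and subnet IDs below are placeholders) could look like:
CreateClusterResult result = amazonEKS.createCluster(new CreateClusterRequest()
        .withName("cluster-name")
        // IAM role assumed by the EKS control plane (placeholder ARN).
        .withRoleArn("arn:aws:iam::123456789012:role/eksClusterRole")
        // EKS expects subnets in at least two availability zones (placeholder IDs).
        .withResourcesVpcConfig(new VpcConfigRequest()
                .withSubnetIds("subnet-aaaa1111", "subnet-bbbb2222")));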

I am trying to write to Amazon S3 using assumeRole via FileIO with ParquetIO

Step 1: Assume role
public static AWSCredentialsProvider getCredentials() {
    if (roleARN.length() > 0) {
        STSAssumeRoleSessionCredentialsProvider credentialsProvider =
                new STSAssumeRoleSessionCredentialsProvider.Builder(roleARN, Constants.SESSION_NAME)
                        .withStsClient(AWSSecurityTokenServiceClientBuilder.defaultClient())
                        .build();
        return credentialsProvider;
    }
    return new ProfileCredentialsProvider();
}
Step 2: Set credentials on the pipeline
credentials = getCredentials();
pipeline.getOptions().as(AwsOptions.class).setAwsRegion(Regions.US_WEST_2.getName());
pipeline.getOptions().as(AwsOptions.class).setAwsCredentialsProvider(
        new AWSStaticCredentialsProvider(new BasicAWSCredentials(
                credentials.getCredentials().getAWSAccessKeyId(),
                credentials.getCredentials().getAWSSecretKey())));
Step 3: Run the pipeline to write to S3
PCollection<GenericRecord> parquetRecord = formattedEvent
        .apply("ParquetRecord", ParDo.of(new ParquetWriter()))
        .setCoder(AvroCoder.of(getOutput_schema()));

parquetRecord.apply(FileIO.<GenericRecord, GenericRecord>writeDynamic()
        .by(elm -> elm)
        .via(ParquetIO.sink(getOutput_schema()))
        .to(outputPath).withNumShards(1)
        .withNaming(type -> FileNaming.getNaming("part", ".snappy.parquet", "" + DateTime.now().getMillisOfSecond()))
        .withDestinationCoder(AvroCoder.of(getOutput_schema())));
I am using 'org.apache.beam:beam-sdks-java-io-parquet:jar:2.22.0' and
'org.apache.beam:beam-sdks-java-io-amazon-web-services:jar:2.22.0'
Issue: currently, assumeRole does not seem to be working.
Errors:
org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: The AWS Access Key Id you provided does not exist in our records.
Or
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unexpected IOException (of type java.io.IOException): Failed to serialize and deserialize property 'awsCredentialsProvider' with value 'com.amazonaws.auth.InstanceProfileCredentialsProvider@71262020'
The recent release of Beam (2.24.0) has a feature to assume a role.
Where do you run this pipeline from (an AWS account)? If so, it is better to grant assume-role access to the role that runs the pipeline; then, from the pipeline, FileIO will simply use the default AWS client.
It is better to move the assume-role operation out of the pipeline and just grant S3 permissions to the role running the pipeline.
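If the role running the pipeline already has the needed S3 permissions, a minimal sketch of the suggested setup (dropping the explicit static credentials from Step 2) could be:
// Rely on the default credentials provider chain (instance profile / environment /
// AWS profile) instead of extracting keys from an assumed-role session.
pipeline.getOptions().as(AwsOptions.class).setAwsRegion(Regions.US_WEST_2.getName());
pipeline.getOptions().as(AwsOptions.class)
        .setAwsCredentialsProvider(new DefaultAWSCredentialsProviderChain());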

AmazonClientException: Unable To Load Credentials from any Provider in the Chain

My Mule application writes JSON records to a Kinesis stream. I use the KPL producer library. When run locally, it picks up AWS credentials from .aws/credentials and writes records to Kinesis successfully.
However, when I deploy my application to CloudHub, it throws AmazonClientException, obviously because it does not have access to any of the locations that the DefaultAWSCredentialsProviderChain class supports. (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html)
This is how I attach the credentials, which works locally with .aws/credentials:
config.setCredentialsProvider(new DefaultAWSCredentialsProviderChain());
I couldn't figure out a way to provide the credentials explicitly using my my-app.properties file.
Then I tried to create a separate configuration class with getters/setters, set the access key and secret key as private fields, and implement a getter:
public AWSCredentialsProvider getCredentials() {
    if (accessKey == null || secretKey == null) {
        return new DefaultAWSCredentialsProviderChain();
    }
    return new StaticCredentialsProvider(new BasicAWSCredentials(getAccessKey(), getSecretKey()));
}
This was intended to be used instead of the DefaultAWSCredentialsProviderChain class, like this:
config.setCredentialsProvider(new AWSConfig().getCredentials());
It still throws the same error when deployed.
The following sample states that it is possible to provide explicit credentials, but I need help figuring out how, because I can't find proper documentation or an example:
https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/src/com/amazonaws/services/kinesis/producer/sample/SampleProducer.java
I faced the same issue, and this is the solution I found; I hope it works for you as well.
@Value("${s3_accessKey}")
private String s3_accessKey;

@Value("${s3_secretKey}")
private String s3_secretKey;

// The two values above come from the application.properties file.

BasicAWSCredentials creds = new BasicAWSCredentials(s3_accessKey, s3_secretKey);

AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withCredentials(new AWSStaticCredentialsProvider(creds))
        .withRegion(Regions.US_EAST_2)
        .build();

GAE HttpResponseException: 401

I am trying to access the DataStore of one app from another GAE project using Remote API.
I am using the following code:
String serverString = "http://example.com"; // this should be the target App Engine app
RemoteApiOptions options;
if (serverString.equals("localhost")) {
    options = new RemoteApiOptions().server(serverString, 8080).useDevelopmentServerCredential();
} else {
    options = new RemoteApiOptions().server(serverString, 80).useApplicationDefaultCredential();
}
RemoteApiInstaller installer = new RemoteApiInstaller();
installer.install(options);
datastore = DatastoreServiceFactory.getDatastoreService();
try {
    results = datastore.get(KeyFactory.createKey("some key"));
} catch (EntityNotFoundException e) {
    e.printStackTrace();
    return null;
}
When I run this locally, I get a NullPointerException at installer.install(options);.
When deployed, the error seen in Error Reporting on App Engine is: HttpResponseException: 401 You must be logged in as an administrator, or access from an approved application.
That being said, I made a small Java application with the following code:
String serverString = "http://example.com"; // same string as the one used in the code above
RemoteApiOptions options;
if (serverString.equals("localhost")) {
    options = new RemoteApiOptions().server(serverString, 8080).useDevelopmentServerCredential();
} else {
    options = new RemoteApiOptions().server(serverString, 80).useApplicationDefaultCredential();
}
RemoteApiInstaller installer = new RemoteApiInstaller();
installer.install(options);
try {
    DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
    System.out.println("Key of new entity is " + ds.put(new Entity("Hello Remote API!")));
} finally {
    installer.uninstall();
}
And this one works! The Hello Remote API entity is added.
The reason it does not work when running on App Engine vs running locally has to do with the credentials that are being picked up. When running locally, it is likely using your own credentials (which has access to both projects); by contrast, when running on App Engine, you are likely picking up the App Engine default service account, which only has access to that App Engine project.
Try fixing this by opening the Cloud IAM section of Cloud Console for the project containing the Cloud Datastore that you wish to access. There, grant the appropriate level of access to the default App Engine service account that is being used by the other project.
If you don't want all App Engine services in the other project to have this kind of access, you might also consider, instead, generating a service account for this cross-project access that you grant the appropriate access to (rather than granting that access to the default App Engine service account). Then, in your code that calls the API, you would explicitly use that service account by calling the useServiceAccountCredential() method of RemoteApiOptions to ensure that the API requests that are issued use the specified service account rather than the default App Engine service account.
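As a sketch of that second option (the service account ID and key file path below are placeholders, and the exact useServiceAccountCredential overload should be checked against the Remote API version you use):
// Use a dedicated service account instead of the default App Engine one.
RemoteApiOptions options = new RemoteApiOptions()
        .server("example.com", 443)
        .useServiceAccountCredential(
                "cross-project-sa@my-project.iam.gserviceaccount.com", // placeholder account
                "/path/to/service-account-key.p12");                   // placeholder key file
RemoteApiInstaller installer = new RemoteApiInstaller();
installer.install(options);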

How can I get list of IAM Roles from EC2 using Java SDK or Amazon API?

There are several notes on how to run an instance with a given IAM role and how to create one. But what about retrieving such data from the EC2 service using the Amazon client (Java SDK) or HTTP requests via the Amazon API? Can I get such a list of IAM roles somehow (they were created beforehand in the EC2 console by the DevOps team, so I must somehow expose them in another web application)? Thanks in advance.
Okay, it seems the AmazonIdentityManagementClient listInstanceProfiles() call does the trick.
Something like the following solution should work. Sorry for the bother.
public Collection<String> getIAMRolesRange() {
    AmazonIdentityManagementClient identityManagementClient =
            new AmazonIdentityManagementClient(new BasicAWSCredentials(
                    awsAccount.getAccessKeyId(), awsAccount.getAccessSecret()));
    ListInstanceProfilesResult listInstanceProfilesResult = identityManagementClient.listInstanceProfiles();
    List<String> iamRoles = new LinkedList<String>();
    for (InstanceProfile instanceProfile : listInstanceProfilesResult.getInstanceProfiles()) {
        // iamRoleToStringFunction is a Guava Function that maps a Role to its name.
        iamRoles.addAll(Collections2.transform(instanceProfile.getRoles(), iamRoleToStringFunction));
    }
    return iamRoles;
}
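One caveat: listInstanceProfiles() is paginated, so in accounts with many profiles the result may be truncated. A rough sketch of handling that, reusing the client, list, and mapping function from the method above, could be:
// Follow the pagination marker until all instance profiles have been read.
ListInstanceProfilesRequest request = new ListInstanceProfilesRequest();
ListInstanceProfilesResult result;
do {
    result = identityManagementClient.listInstanceProfiles(request);
    for (InstanceProfile profile : result.getInstanceProfiles()) {
        iamRoles.addAll(Collections2.transform(profile.getRoles(), iamRoleToStringFunction));
    }
    request.setMarker(result.getMarker());
} while (Boolean.TRUE.equals(result.getIsTruncated()));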
