Apache Camel interfaces well with AWS S3, but I have found a scenario it does not handle correctly. Going over all of the Camel examples I have seen online, I have never seen anyone use the recommended, industry-standard AWS temporary credentials in non-local environments. Using static credentials that live for ~6 months is a security issue as well as a manual burden (they have to be refreshed), and realistically they shouldn't be used anywhere except local environments.
Given a custom S3 client setup, Camel can take temporary credentials; however, a Camel route pointed at AWS S3 will hit a credential expiration at some point. Camel does not detect this and will keep trying to poll the S3 bucket indefinitely without throwing any exceptions or timeout errors.
I have tried to add a timeout configuration to my endpoint like so:
aws-s3://" + incomingAWSBucket + "?" + "amazonS3Client=#amazonS3Client&timeout=4000
Can anyone explain how to interface Camel with AWS temporary credentials or throw an exception if AWS credentials expire (given the aforementioned setup)?
Thanks for the help!
UPDATE:
I pushed a feature to Apache Camel to handle the issue above:
https://github.com/apache/camel/blob/master/components/camel-aws-s3/src/main/docs/aws-s3-component.adoc#use-useiamcredentials-with-the-s3-component
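For reference, a rough sketch of what a route using that option looks like (this assumes the useIAMCredentials endpoint option described in the linked doc; the bucket name and target directory are placeholders):
import org.apache.camel.builder.RouteBuilder;

// Rough sketch only: relies on the useIAMCredentials option from the linked doc;
// "my-incoming-bucket" and the file target are placeholders.
public class S3IamRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("aws-s3://my-incoming-bucket?useIAMCredentials=true&deleteAfterRead=false")
            .to("file://target/s3-downloads");
    }
}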
The answer to this question is dense enough for a tutorial if others want it. For now, I will copy and paste it to the correct forums and threads to get the word out:
Without complaining too much, I'd just like to say that for how powerful Camel is, its documentation and example base are really lacking for production scenarios in the AWS world... Sigh... That's a mouthful and probably a stretch for any open source lib.
I figured out how to solve the credential problem by referencing the official camel-s3 documentation to first see how to create an advanced S3 configuration (relying on the AWS SDK itself -- you can see a bare-bones example there -- it builds the S3 client manually).
After I figured this out, I went to the AWS SDK documentation on IAM credentials to figure out how this could work on an EC2 instance, since I am able to build the client itself. In the aforementioned docs, there are a few bare-bones examples as well. Upon testing with the examples listed, I found that the credential refresh (the sole purpose of this question) was not working. It could get credentials at first, but it did not refresh them during my tests after they were manually expired.
Lastly, I figured out that you can specify a provider chain that handles refreshing the credentials on its own. The AWS documentation that explains this is here.
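As a rough sketch (not my exact code), this is the kind of chain I mean. The instance-profile provider pulls the IAM role credentials from the EC2 instance metadata and refreshes them before they expire, while the profile provider is only a local-development fallback:
// Sketch of a self-refreshing provider chain (AWS SDK v1); the exact composition is up to you.
AWSCredentialsProvider chain = new AWSCredentialsProviderChain(
        new InstanceProfileCredentialsProvider(false), // refreshes IAM role credentials from instance metadata
        new ProfileCredentialsProvider());             // falls back to ~/.aws/credentials locally

AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withCredentials(chain)
        .withRegion(Regions.US_WEST_2)
        .build();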
In the end, I still need static credentials for my local Camel setups that poll AWS S3 buckets; however, my remote environments that live on EC2s can access the buckets with temporary credentials that refresh themselves flawlessly. WOWSA! :)
To do this, I simply made a factory that uses a local Camel setup for my local development and a remote Camel setup that relies on the temporary IAM credentials. This removes the security concern and the work of manually refreshing credentials for all remote environments!
I will not explain how to create a factory or how my local & remote configurations are set up in their entirety, but I will include my code sample of the AmazonS3ClientBuilder that creates an S3 client for remote setups.
AmazonS3ClientBuilder.standard()
.withCredentials(new InstanceProfileCredentialsProvider(false))
.withRegion(Regions.US_WEST_2)
.build();
If there is a desire on how I got this to work, I can provide an example project that shows the entire process.
By request, here are my local and remote implementations of the s3 client:
Local:
public class LocalAWSS3ClientManagerImpl implements AWSS3ClientManager {

    private static Logger logger = LoggerFactory.getLogger(LocalAWSS3ClientManagerImpl.class);
    private PriorityCodeSourcesRoutesProperties priorityCodeSourcesRoutesProperties;
    private SimpleRegistry registry = new SimpleRegistry();
    private CamelContext camelContext;

    public LocalAWSS3ClientManagerImpl(PriorityCodeSourcesRoutesProperties priorityCodeSourcesRoutesProperties) {
        this.priorityCodeSourcesRoutesProperties = priorityCodeSourcesRoutesProperties;
        registry.put("amazonS3Client", getS3Client());
        camelContext = new DefaultCamelContext(registry);
        logger.info("Creating an AWS S3 manager for a local instance (you should not see this on AWS EC2s).");
    }

    private AmazonS3 getS3Client() {
        try {
            String awsBucketAccessKey = priorityCodeSourcesRoutesProperties.getAwsBucketAccessKey();
            String awsBucketSecretKey = priorityCodeSourcesRoutesProperties.getAwsBucketSecretKey();
            AWSCredentials awsCredentials = new BasicAWSCredentials(awsBucketAccessKey, awsBucketSecretKey);
            return AmazonS3ClientBuilder.standard()
                    .withCredentials(new AWSStaticCredentialsProvider(awsCredentials))
                    .build();
        } catch (RuntimeException ex) {
            logger.error("Could not create AWS S3 client with the given credentials from the local config.");
        }
        return null;
    }

    public Endpoint getIncomingAWSEndpoint(final String incomingAWSBucket, final String region,
                                           final String fileNameToSaveAndDownload) {
        return camelContext.getEndpoint(
                "aws-s3://" + incomingAWSBucket + "?" + "amazonS3Client=#amazonS3Client"
                        + "&region=" + region + "&deleteAfterRead=false" + "&prefix=" + fileNameToSaveAndDownload);
    }

    public Endpoint getOutgoingLocalEndpoint(final String outgoingEndpointDirectory,
                                             final String fileNameToSaveAndDownload) {
        return camelContext.getEndpoint(
                "file://" + outgoingEndpointDirectory + "?" + "fileName="
                        + fileNameToSaveAndDownload + "&readLock=markerFile");
    }
}
Remote:
public class RemoteAWSS3ClientManagerImpl implements AWSS3ClientManager {

    private static Logger logger = LoggerFactory.getLogger(RemoteAWSS3ClientManagerImpl.class);
    private PriorityCodeSourcesRoutesProperties priorityCodeSourcesRoutesProperties;
    private SimpleRegistry registry = new SimpleRegistry();
    private CamelContext camelContext;

    public RemoteAWSS3ClientManagerImpl(PriorityCodeSourcesRoutesProperties priorityCodeSourcesRoutesProperties) {
        this.priorityCodeSourcesRoutesProperties = priorityCodeSourcesRoutesProperties;
        registry.put("amazonS3Client", getS3Client());
        camelContext = new DefaultCamelContext(registry);
        logger.info("Creating an AWS S3 client for a remote instance (normal for ec2s).");
    }

    private AmazonS3 getS3Client() {
        try {
            logger.info("Attempting to create an AWS S3 client with IAM role's temporary credentials.");
            return AmazonS3ClientBuilder.standard()
                    .withCredentials(new InstanceProfileCredentialsProvider(false))
                    .withRegion(Regions.US_WEST_2)
                    .build();
        } catch (RuntimeException ex) {
            logger.error("Could not create AWS S3 client with the given credentials from the instance. "
                    + "The default credential chain was used to create the AWS S3 client. "
                    + ex.toString());
        }
        return null;
    }

    public Endpoint getIncomingAWSEndpoint(final String incomingAWSBucket, final String region,
                                           final String fileNameToSaveAndDownload) {
        return camelContext.getEndpoint(
                "aws-s3://" + incomingAWSBucket + "?" + "amazonS3Client=#amazonS3Client"
                        + "&region=" + region + "&deleteAfterRead=false" + "&prefix=" + fileNameToSaveAndDownload);
    }

    public Endpoint getOutgoingLocalEndpoint(final String outgoingEndpointDirectory,
                                             final String fileNameToSaveAndDownload) {
        return camelContext.getEndpoint(
                "file://" + outgoingEndpointDirectory + "?" + "fileName="
                        + fileNameToSaveAndDownload + "&readLock=markerFile");
    }
}
Related
My Mule application writes JSON records to a Kinesis stream. I use the KPL producer library. When run locally, it picks up AWS credentials from .aws/credentials and writes records to Kinesis successfully.
However, when I deploy my application to CloudHub, it throws AmazonClientException, obviously because it does not have access to any of the credential sources that the DefaultAWSCredentialsProviderChain class supports. (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html)
This is how I attach credentials, and locally they are picked up from .aws/credentials:
config.setCredentialsProvider(new DefaultAWSCredentialsProviderChain());
I couldn't figure out a way to provide credentials explicitly using my my-app.properties file.
Then I tried to create a separate configuration class with getters/setters: set the access key and secret key as private fields and then implement a getter:
public class AWSConfig {
    // ... private accessKey and secretKey fields with getters/setters ...

    public AWSCredentialsProvider getCredentials() {
        if (accessKey == null || secretKey == null) {
            return new DefaultAWSCredentialsProviderChain();
        }
        return new StaticCredentialsProvider(new BasicAWSCredentials(getAccessKey(), getSecretKey()));
    }
}
This was intended to be used instead of the DefaultAWSCredentialsProviderChain class, like this:
config.setCredentialsProvider(new AWSConfig().getCredentials());
Still throws the same error when deployed.
The following repo states that it is possible to provide explicit credentials. I need help figuring out how, because I can't find proper documentation or an example.
https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/src/com/amazonaws/services/kinesis/producer/sample/SampleProducer.java
I faced the same issue, and this is the solution I found. I hope it works for you too.
#Value("${s3_accessKey}")
private String s3_accessKey;
#Value("${s3_secretKey}")
private String s3_secretKey;
//this above value I am taking from Application.properties file
BasicAWSCredentials creds = new BasicAWSCredentials(s3_accessKey,
s3_secretKey);
AmazonS3 s3Client = AmazonS3ClientBuilder.standard().
withCredentials(new AWSStaticCredentialsProvider(creds))
.withRegion(Regions.US_EAST_2)
.build();
Here is a link to the documentation for the Java S3 SDK version 1. Does version 2.0 have something similar, or was that option removed?
Yes! It is possible in AWS SDK v2 to execute S3 operations on regions other than the one configured in the client.
In order to do this, set useArnRegionEnabled to true on the client.
An example of this using Scala is:
val s3Configuration = S3Configuration.builder.useArnRegionEnabled(true).build

val client = S3Client
  .builder
  .credentialsProvider({$foo})
  .region(Region.EU_WEST_1)
  .overrideConfiguration({$foo})
  .serviceConfiguration(s3Configuration)
  .build
Here is the documentation: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Configuration.Builder.html#useArnRegionEnabled-java.lang.Boolean-
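For completeness, a roughly equivalent sketch in Java (credentials provider and override configuration omitted; same software.amazon.awssdk classes as above):
// Java sketch equivalent to the Scala snippet above.
S3Configuration s3Configuration = S3Configuration.builder()
        .useArnRegionEnabled(true)
        .build();

S3Client client = S3Client.builder()
        .region(Region.EU_WEST_1)
        .serviceConfiguration(s3Configuration)
        .build();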
Not supported, per here:
In version 1.x, services such as Amazon S3, Amazon SNS, and Amazon SQS allowed access to resources across Region boundaries. This is no longer allowed in version 2.x using the same client. If you need to access a resource in a different region, you must create a client in that region and retrieve the resource using the appropriate client.
This works for me using Java AWS SDK 2.16.98 and only requires the name of the bucket rather than the full ARN.
private S3Client bucketSpecificClient;
private String bucketName = "my-bucket-in-some-region";

// This client seems to be able to look up the location of buckets from any region.
private S3Client defaultClient = S3Client.builder()
        .endpointOverride(URI.create("https://s3.us-east-1.amazonaws.com"))
        .region(Region.US_EAST_1)
        .build();

public S3Client getClient() {
    if (bucketSpecificClient == null) {
        String bucketLocation = defaultClient
                .getBucketLocation(builder -> builder.bucket(this.bucketName))
                .locationConstraintAsString();
        Region region = bucketLocation.trim().equals("") ? Region.US_EAST_1 : Region.of(bucketLocation);
        bucketSpecificClient = S3Client.builder().region(region).build();
    }
    return bucketSpecificClient;
}
Now you can use bucketSpecificClient to perform operations on objects in the bucket my-bucket-in-some-region
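For example (the object key here is just a placeholder), you could then read an object with the region-matched client:
// Hypothetical usage; "some/key.txt" is a placeholder key.
GetObjectRequest request = GetObjectRequest.builder()
        .bucket("my-bucket-in-some-region")
        .key("some/key.txt")
        .build();

ResponseBytes<GetObjectResponse> objectBytes = getClient().getObjectAsBytes(request);
System.out.println(objectBytes.asUtf8String());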
I have a problem with the Vert.x HttpClient.
Here's code that tests a GET request using Vert.x and plain Java.
Vertx vertx = Vertx.vertx();
HttpClientOptions options = new HttpClientOptions()
.setTrustAll(true)
.setSsl(false)
.setDefaultPort(80)
.setProtocolVersion(HttpVersion.HTTP_1_1)
.setLogActivity(true);
HttpClient client = vertx.createHttpClient(options);
client.getNow("google.com", "/", response -> {
System.out.println("Received response with status code " + response.statusCode());
});
System.out.println(getHTML("http://google.com"));
Where getHTML() is from here: How do I do a HTTP GET in Java?
This is my output:
<!doctype html><html... etc <- correct output from plain java
Feb 08, 2017 11:31:21 AM io.vertx.core.http.impl.HttpClientRequestImpl
SEVERE: java.net.UnknownHostException: failed to resolve 'google.com'. Exceeded max queries per resolve 3
But Vert.x can't connect. What's wrong here? I'm not using any proxy.
For reference: a solution, as described in this question and in tsegismont's comment here, is to set the flag vertx.disableDnsResolver to true:
-Dvertx.disableDnsResolver=true
in order to fall back to the JVM DNS resolver as explained here:
sometimes it can be desirable to use the JVM built-in resolver, the JVM system property -Dvertx.disableDnsResolver=true activates this behavior
I observed this DNS resolution issue with a redis client in a kubernetes environment.
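If you prefer not to pass the flag on the command line, here is a small sketch of the same thing done programmatically; the property has to be set before Vert.x initializes its resolver:
import io.vertx.core.Vertx;

public class Main {
    public static void main(String[] args) {
        // Same effect as -Dvertx.disableDnsResolver=true, set before any Vert.x class is used.
        System.setProperty("vertx.disableDnsResolver", "true");
        Vertx vertx = Vertx.vertx(); // now uses the JVM built-in resolver
        // ... deploy verticles, create clients, etc.
    }
}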
I had this issue; what caused it for me was stale DNS servers being picked up by the Java runtime, i.e. servers registered for a network the machine was no longer connected to. The issue originates in the Sun JNDI implementation; it also exists in Netty, which uses JNDI to bootstrap its list of name servers on most platforms, and finally shows up in Vert.x.
I think a good place to fix this would be in the Netty layer where the set of default DNS servers is bootstrapped. I have raised a ticket with the Netty project so we'll see if they agree with me! Here is the Netty ticket
In the meantime, a fairly basic workaround is to filter the default DNS servers detected by Netty based on whether they are reachable. Here is a code sample in Kotlin to apply before constructing the main Vert.x instance.
// The default set of name servers provided by JNDI can contain stale entries
// This default set is picked up by Netty and in turn by VertX
// To work around this, we filter for only reachable name servers on startup
val nameServers = DefaultDnsServerAddressStreamProvider.defaultAddressList()
val reachableNameServers = nameServers.stream()
.filter {ns -> ns.address.isReachable(NS_REACHABLE_TIMEOUT)}
.map {ns -> ns.address.hostAddress}
.collect(Collectors.toList())
if (reachableNameServers.size == 0)
throw StartupException("There are no reachable name servers available")
val opts = VertxOptions()
opts.addressResolverOptions.servers = reachableNameServers
// The primary Vertx instance
val vertx = Vertx.vertx(opts)
A little more detail in case it is helpful. I have a company machine which at some point was connected to the company network by a physical cable. Details of the company's internal name servers were set up by DHCP on the physical interface. Using the wireless interface at home, DNS for the wireless interface gets set to my home DNS, while the config for the physical interface is not updated. This is fine since that device is not active; ipconfig /all does not show the internal company DNS servers. However, looking in the registry they are still there:
Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces
They get picked up by the JNDI mechanism, which feeds Netty and in turn Vert.x. Since they are not reachable from my home location, DNS resolution fails. I can imagine this home/office situation is not unique to me! I don't know whether something similar could occur with multiple virtual interfaces on containers or VMs; it could be worth looking into if you are having problems.
Here is the sample code that works for me.
public class TemplVerticle extends HttpVerticle {

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // Create the web client and enable SSL/TLS with a trust store
        WebClient client = WebClient.create(vertx,
                new WebClientOptions()
                        .setSsl(true)
                        .setTrustAll(true)
                        .setDefaultPort(443)
                        .setKeepAlive(true)
                        .setDefaultHost("www.w3schools.com")
        );
        client.get("www.w3schools.com")
                .as(BodyCodec.string())
                .send(ar -> {
                    if (ar.succeeded()) {
                        HttpResponse<String> response = ar.result();
                        System.out.println("Got HTTP response body");
                        System.out.println(response.body().toString());
                    } else {
                        ar.cause().printStackTrace();
                    }
                });
    }
}
Try using the WebClient instead of the HttpClient; here is an example (with Rx):
private val client: WebClient = WebClient.create(vertx, WebClientOptions()
.setSsl(true)
.setTrustAll(true)
.setDefaultPort(443)
.setKeepAlive(true)
)
open fun <T> get(uri: String, marshaller: Class<T>): Single<T> {
return client.getAbs(host + uri).rxSend()
.map { extractJson(it, uri, marshaller) }
}
Another option is to use getAbs.
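A minimal sketch of the getAbs variant in plain Java (same options as above, but the absolute URL carries the host and path, so no default host/port is needed):
// Sketch: getAbs takes the full URL instead of relying on default host/port options.
WebClient client = WebClient.create(vertx, new WebClientOptions()
        .setSsl(true)
        .setTrustAll(true));

client.getAbs("https://www.w3schools.com/")
        .as(BodyCodec.string())
        .send(ar -> {
            if (ar.succeeded()) {
                System.out.println(ar.result().body());
            } else {
                ar.cause().printStackTrace();
            }
        });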
My Spring Boot app works fine running locally, connecting to sandbox S3 & sandbox SQS using the DefaultAWSCredentialsProviderChain with credentials set as system properties.
When the application is deployed to the EC2 environment and uses profile credentials, I get a continuous stream of the following error in CloudWatch:
{
"Host": "<myhost>",
"Date": "2016-12-20T21:52:56,777",
"Thread": "simpleMessageListenerContainer-1",
"Level": "WARN ",
"Logger": "org.springframework.cloud.aws.messaging.listener.SimpleMessageListenerContainer",
"Msg": "An Exception occurred while polling queue 'my-queue-name'. The failing operation will be retried in 10000 milliseconds",
"Identifiers": {
"Jvm-Instance": "",
"App-Name": "my-app",
"Correlation-Id": "ca9a556e-2fbc-3g49-9fb8-0e9213bb79bc",
"Session-Id": "",
"Thread-Group": "main",
"Thread-Id": "32",
"Version": ""
}
}
java.lang.NullPointerException
at org.springframework.cloud.aws.messaging.listener.SimpleMessageListenerContainer$AsynchronousMessageListener.run(SimpleMessageListenerContainer.java:255) [spring-cloud-aws-messaging-1.1.1.RELEASE.jar:1.1.1.RELEASE]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
The problem boils down to SimpleMessageListenerContainer.java:255 :
ReceiveMessageResult receiveMessageResult = getAmazonSqs().receiveMessage(this.queueAttributes.getReceiveMessageRequest());
this.queueAttributes is null.
I have tried everything, from @EnableContextCredentials(instanceProfile=true) to setting cloud.aws.credentials.instanceProfile=true while making sure accessKey & secretKey are null. The SQS queue definitely exists, and I have verified through the AWS CLI on the EC2 instance itself that the profile credentials exist and are valid.
Additionally, in the AWS environment the app also uses the S3 client to generate unique keys for bucket storage, which all works. It's only polling messages from SQS that seems to be failing.
I am processing messages like so:
@SqsListener("${aws.sqs.queue.name}")
public void receive(S3EventNotification s3EventNotificationRecord) {
more config:
@Bean
public AWSCredentialsProvider awsCredentialsProvider(
        @Value("${aws.credentials.accessKey}") String accessKey,
        @Value("${aws.credentials.secretKey}") String secretKey,
        JasyptPropertyDecryptor propertyDecryptor) {
    if (!Strings.isNullOrEmpty(accessKey) || !Strings.isNullOrEmpty(secretKey)) {
        Preconditions.checkState(
                !Strings.isNullOrEmpty(accessKey) && !Strings.isNullOrEmpty(secretKey),
                "Error in accessKey/secretKey config. Either both must be provided, or neither.");
        System.setProperty("aws.accessKeyId", propertyDecryptor.decrypt(accessKey));
        System.setProperty("aws.secretKey", propertyDecryptor.decrypt(secretKey));
    }
    return DefaultAWSCredentialsProviderChain.getInstance();
}

@Bean
public S3Client s3Client(
        AWSCredentialsProvider awsCredentialsProvider,
        @Value("${aws.s3.region.name}") String regionName,
        @Value("${aws.s3.bucket.name}") String bucketName) {
    return new S3Client(awsCredentialsProvider, regionName, bucketName);
}

@Bean
public QueueMessageHandlerFactory queueMessageHandlerFactory() {
    MappingJackson2MessageConverter messageConverter = new MappingJackson2MessageConverter();
    messageConverter.setStrictContentTypeMatch(false);
    QueueMessageHandlerFactory factory = new QueueMessageHandlerFactory();
    factory.setArgumentResolvers(
            Collections.<HandlerMethodArgumentResolver>singletonList(
                    new PayloadArgumentResolver(messageConverter)));
    return factory;
}
One additional thing I noticed is that on application startup, ContextConfigurationUtils.registerCredentialsProvider is called, and unless you specify cloud.aws.credentials.profileName= as empty in your app.properties, this class will add a ProfileCredentialsProvider to the list of awsCredentialsProviders. I figured this might be problematic since I'm not providing credentials on the EC2 instance that way; it should be using the InstanceProfileCredentialsProvider instead. This change did not work either.
Turns out the issue was that the AWS services I was using, such as SQS, had the proper access permissions on them, but the IAM profile itself lacked the permissions to even attempt the service operations the application needed to make.
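For reference, the kind of statement the instance role was missing looked roughly like this (a hedged example, not my exact policy; the queue ARN is a placeholder and the exact actions depend on what your listener does):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl"
      ],
      "Resource": "arn:aws:sqs:us-east-1:123456789012:my-queue-name"
    }
  ]
}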
I created an AWS Lambda package (Java) with a function that reads some files from Amazon S3 and pushes the data to the AWS Elasticsearch Service. Since I'm using AWS Elasticsearch, I can't use the Transport client, so I'm working with the Jest client to push via REST. The issue is with the Jest client.
Here's my Jest client instance:
public JestClient getClient() throws InterruptedException {
    final Supplier<LocalDateTime> clock = () -> LocalDateTime.now(ZoneOffset.UTC);
    DefaultAWSCredentialsProviderChain awsCredentialsProvider = new DefaultAWSCredentialsProviderChain();
    final AWSSigner awsSigner = new AWSSigner(awsCredentialsProvider, REGION, SERVICE, clock);

    JestClientFactory factory = new JestClientFactory() {
        @Override
        protected HttpClientBuilder configureHttpClient(HttpClientBuilder builder) {
            builder.addInterceptorLast(new AWSSigningRequestInterceptor(awsSigner));
            return builder;
        }

        @Override
        protected HttpAsyncClientBuilder configureHttpClient(HttpAsyncClientBuilder builder) {
            builder.addInterceptorLast(new AWSSigningRequestInterceptor(awsSigner));
            return builder;
        }
    };

    factory.setHttpClientConfig(
            new HttpClientConfig.Builder(URL)
                    .discoveryEnabled(true)
                    .multiThreaded(true).build());

    JestClient jestClient = factory.getObject();
    return jestClient;
}
Since the AWS Elasticsearch domain is protected by an IAM access policy, I sign the requests for them to be authorized by AWS (example here). I use POJOs to index documents.
The problem I face is that I am not able to execute more than one action with the Jest client instance. For example, if I create the index first:
client.execute(new CreateIndex.Builder(indexName).build());
and later on I want to, for example, do some bulk indexing:
for (Object object : listOfObjects) {
    bulkIndexBuilder.addAction(
            new Index.Builder(object).index(INDEX_NAME).type(DOC_TYPE).build());
}
client.execute(bulkIndexBuilder.build());
only the first action will be executed and the second will fail. Why is that? Is it possible to execute more than one action?
Moreover, using the provided code, I'm not able to execute more than 20 bulk operations when indexing documents. Around 20 is fine, but beyond that, client.execute(bulkIndexBuilder.build()); simply does not execute and the client shuts down.
Any help or suggestion would be appreciated.
UPDATE:
It seems that AWS Elasticsearch does not allow connecting to individual nodes. Simply turning off node discovery in the Jest client with .discoveryEnabled(false) solved all the problems. This answer helped.
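Concretely, the only change from the factory configuration shown above is the discovery flag (sketch):
// Same factory as above, with node discovery disabled for the AWS-managed domain.
factory.setHttpClientConfig(
        new HttpClientConfig.Builder(URL)
                .discoveryEnabled(false) // AWS Elasticsearch does not expose individual nodes
                .multiThreaded(true)
                .build());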