Rsync with AWS Lambda (Java) between Bucket and Remote Server - java

I want to perform a one-direction rsync between an AWS S3 bucket and a remote FTP server (which accepts FTPS) with a Java Lambda function. So if a file in the bucket is deleted, the Lambda cron should remove it from the remote FTP server as well.
I read that the AWS CLI offers the s3 sync command. Could this be an option?
best regards
Jannik

This would be pretty straightforward. The Lambda would be set up to be triggered on an S3 delete. The basic code (untested) would be something like:
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;
import java.io.IOException;

public class Handler implements RequestHandler<S3Event, String> {

    // Placeholder FTP connection settings -- supply your own values
    private static final String server = "ftp.example.com";
    private static final int port = 21;
    private static final String user = "ftpuser";
    private static final String pass = "ftppassword";

    public String handleRequest(S3Event s3event, Context context) {
        try {
            S3EventNotificationRecord record = s3event.getRecords().get(0);

            // Object key may have spaces or unicode non-ASCII characters.
            String srcKey = record.getS3().getObject().getUrlDecodedKey();

            // Now use Apache Commons Net
            // (https://commons.apache.org/proper/commons-net/)
            // to delete the file on the FTP server
            FTPClient ftpClient = new FTPClient();
            ftpClient.connect(server, port);
            int replyCode = ftpClient.getReplyCode();
            if (!FTPReply.isPositiveCompletion(replyCode)) {
                context.getLogger().log("FTP connect failed");
                return "FTP connect failed";
            }

            boolean success = ftpClient.login(user, pass);
            if (!success) {
                context.getLogger().log("Could not login to the FTP server");
                return "FTP login failed";
            }

            String fileToDelete = "/some/ftp/directory/" + srcKey;
            boolean deleted = ftpClient.deleteFile(fileToDelete);
            if (deleted) {
                context.getLogger().log("The file was deleted successfully.");
            } else {
                context.getLogger().log("Could not delete the file, it may not exist.");
            }

            ftpClient.logout();
            ftpClient.disconnect();
            return "OK";
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
On the S3 side, you will need to enable your S3 bucket to send a delete event to your Lambda. This can be done in the AWS console: select the bucket, go to the Events section, add a notification, select "Permanently deleted" (or "All object delete events"), and add your Lambda as the destination.
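If you prefer to set up the notification programmatically, a minimal sketch with the v1 Java SDK could look like the following (the bucket name and Lambda ARN are placeholder assumptions):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.BucketNotificationConfiguration;
import com.amazonaws.services.s3.model.LambdaConfiguration;
import com.amazonaws.services.s3.model.S3Event;
import java.util.EnumSet;

public class EnableDeleteNotification {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Placeholder bucket name and Lambda function ARN
        String bucketName = "my-sync-bucket";
        String lambdaArn = "arn:aws:lambda:us-east-1:123456789012:function:DeleteFromFtp";

        // Route all object-delete events in the bucket to the Lambda function
        BucketNotificationConfiguration config = new BucketNotificationConfiguration()
                .addConfiguration("ftp-delete-sync",
                        new LambdaConfiguration(lambdaArn, EnumSet.of(S3Event.ObjectRemoved)));

        s3.setBucketNotificationConfiguration(bucketName, config);
    }
}

Note that S3 also needs permission to invoke the Lambda; the console adds this for you, otherwise it has to be granted separately (for example with lambda add-permission).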

Related

How to upload multipart to Amazon S3 asynchronously using the java SDK

In my Java application I need to write data to S3 whose size I don't know in advance, and the sizes are usually big, so as recommended in the AWS S3 documentation I am using the Java AWS SDK (low-level API) to write data to the S3 bucket.
In my application I provide S3BufferedOutputStream, an implementation of OutputStream, which other classes in the app can use to write to the S3 bucket.
I store the data in a buffer and, once the buffered data grows larger than the buffer (part) size, I upload the buffer's contents as a single UploadPartRequest.
Here is the implementation of the write method of S3BufferedOutputStream:
@Override
public void write(byte[] b, int off, int len) throws IOException {
    this.assertOpen();
    int o = off, l = len;
    int size;
    while (l > (size = this.buf.length - position)) {
        System.arraycopy(b, o, this.buf, this.position, size);
        this.position += size;
        flushBufferAndRewind();
        o += size;
        l -= size;
    }
    System.arraycopy(b, o, this.buf, this.position, l);
    this.position += l;
}
The whole implementation is similar to this: code repo
My problem here is that each UploadPartRequest is done synchronously, so we have to wait for one part to be uploaded before we can upload the next part. And because I am using the low-level AWS S3 API, I cannot benefit from the parallel uploading provided by the TransferManager.
Is there a way to achieve parallel uploads using the low-level SDK?
Or are there code changes that would let it operate asynchronously without corrupting the uploaded data, while maintaining the order of the data?
Here's some example code from a class that I have. It submits the parts to an ExecutorService and holds onto the returned Future. This is written for the v1 Java SDK; if you're using the v2 SDK you could use an async client rather than the explicit threadpool:
// WARNING: data must not be updated by caller; make a defensive copy if needed
public synchronized void uploadPart(byte[] data, boolean isLastPart)
{
    partNumber++;
    logger.debug("submitting part {} for s3://{}/{}", partNumber, bucket, key);
    final UploadPartRequest request = new UploadPartRequest()
                                      .withBucketName(bucket)
                                      .withKey(key)
                                      .withUploadId(uploadId)
                                      .withPartNumber(partNumber)
                                      .withPartSize(data.length)
                                      .withInputStream(new ByteArrayInputStream(data))
                                      .withLastPart(isLastPart);

    futures.add(
        executor.submit(new Callable<PartETag>()
        {
            @Override
            public PartETag call() throws Exception
            {
                int localPartNumber = request.getPartNumber();
                logger.debug("uploading part {} for s3://{}/{}", localPartNumber, bucket, key);
                UploadPartResult response = client.uploadPart(request);
                String etag = response.getETag();
                logger.debug("uploaded part {} for s3://{}/{}; etag is {}", localPartNumber, bucket, key, etag);
                return new PartETag(localPartNumber, etag);
            }
        }));
}
Note: this method is synchronized to ensure that parts are not submitted out of order.
Once you've submitted all of the parts, you use this method to wait for them to finish and then complete the upload:
public void complete()
{
    logger.debug("waiting for upload tasks of s3://{}/{}", bucket, key);
    List<PartETag> partTags = new ArrayList<>();
    for (Future<PartETag> future : futures)
    {
        try
        {
            partTags.add(future.get());
        }
        catch (Exception e)
        {
            throw new RuntimeException(
                String.format("failed to complete upload task for s3://%s/%s", bucket, key), e);
        }
    }

    logger.debug("completing multi-part upload for s3://{}/{}", bucket, key);

    CompleteMultipartUploadRequest request = new CompleteMultipartUploadRequest()
                                             .withBucketName(bucket)
                                             .withKey(key)
                                             .withUploadId(uploadId)
                                             .withPartETags(partTags);
    client.completeMultipartUpload(request);

    logger.debug("completed multi-part upload for s3://{}/{}", bucket, key);
}
You'll also need an abort() method that cancels outstanding parts and aborts the upload. This, and the rest of the class, are left as an exercise for the reader.
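As a rough sketch (assuming the same futures, client, bucket, key, and uploadId fields used above), abort() could cancel the outstanding futures and then abort the multipart upload:

public void abort()
{
    // Cancel any part uploads that have not finished yet
    for (Future<PartETag> future : futures)
    {
        future.cancel(true);
    }

    logger.debug("aborting multi-part upload for s3://{}/{}", bucket, key);
    client.abortMultipartUpload(new AbortMultipartUploadRequest(bucket, key, uploadId));
}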
You should look at using the AWS SDK for Java V2. You are referencing V1, not the newest Amazon S3 Java API. If you are not familiar with V2, start here:
Get started with the AWS SDK for Java 2.x
To perform async operations via the Amazon S3 Java API, you use S3AsyncClient.
Now to learn how to upload an object using this client, see this code example:
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectResponse;
import java.nio.file.Paths;
import java.util.concurrent.CompletableFuture;
/**
 * To run this AWS code example, ensure that you have set up your development environment, including your AWS credentials.
 *
 * For information, see this documentation topic:
 *
 * https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
 */
public class S3AsyncOps {

    public static void main(String[] args) {

        final String USAGE = "\n" +
                "Usage:\n" +
                "  S3AsyncOps <bucketName> <key> <path>\n\n" +
                "Where:\n" +
                "  bucketName - the name of the Amazon S3 bucket (for example, bucket1). \n\n" +
                "  key - the name of the object (for example, book.pdf). \n" +
                "  path - the local path to the file (for example, C:/AWS/book.pdf). \n";

        if (args.length != 3) {
            System.out.println(USAGE);
            System.exit(1);
        }

        String bucketName = args[0];
        String key = args[1];
        String path = args[2];

        Region region = Region.US_WEST_2;
        S3AsyncClient client = S3AsyncClient.builder()
                .region(region)
                .build();

        PutObjectRequest objectRequest = PutObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build();

        // Put the object into the bucket
        CompletableFuture<PutObjectResponse> future = client.putObject(objectRequest,
                AsyncRequestBody.fromFile(Paths.get(path))
        );
        future.whenComplete((resp, err) -> {
            try {
                if (resp != null) {
                    System.out.println("Object uploaded. Details: " + resp);
                } else {
                    // Handle error
                    err.printStackTrace();
                }
            } finally {
                // Only close the client when you are completely done with it
                client.close();
            }
        });

        future.join();
    }
}
That is uploading an object using the S3AsyncClient client. To perform a multi-part upload, you need to use this method:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3AsyncClient.html#createMultipartUpload-software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest-
To see an example of a multipart upload using the S3 sync client, see:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/main/javav2/example_code/s3/src/main/java/com/example/s3/S3ObjectOperations.java
That is your solution: use the S3AsyncClient object's createMultipartUpload method.
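As a rough sketch of that approach (the bucket, key, and single in-memory part below are placeholder assumptions; a real upload would loop over parts of at least 5 MB each), the flow with S3AsyncClient could look like this:

import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.*;

public class S3AsyncMultipartSketch {
    public static void main(String[] args) {
        S3AsyncClient client = S3AsyncClient.create();
        String bucket = "my-bucket";   // placeholder
        String key = "big-object.bin"; // placeholder
        byte[] partData = new byte[5 * 1024 * 1024]; // payload for part 1

        // 1. Start the multipart upload and remember the upload id
        String uploadId = client.createMultipartUpload(
                        CreateMultipartUploadRequest.builder().bucket(bucket).key(key).build())
                .join().uploadId();

        // 2. Upload each part asynchronously and keep the returned ETag
        UploadPartResponse partResponse = client.uploadPart(
                        UploadPartRequest.builder()
                                .bucket(bucket).key(key)
                                .uploadId(uploadId).partNumber(1)
                                .build(),
                        AsyncRequestBody.fromBytes(partData))
                .join();

        CompletedPart part1 = CompletedPart.builder()
                .partNumber(1)
                .eTag(partResponse.eTag())
                .build();

        // 3. Complete the upload with the collected parts
        client.completeMultipartUpload(
                        CompleteMultipartUploadRequest.builder()
                                .bucket(bucket).key(key)
                                .uploadId(uploadId)
                                .multipartUpload(CompletedMultipartUpload.builder().parts(part1).build())
                                .build())
                .join();

        client.close();
    }
}

The .join() calls are only there to keep the sketch linear; in real code you would chain the CompletableFutures so that multiple parts upload in parallel.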

Is there any direct way to copy one s3 directory to another in java or scala?

I want to archive all the files and subdirectories in an S3 directory to some other S3 location using Java. Is there any direct way to copy one S3 directory to another in Java or Scala?
There is no API call to operate on whole directories in Amazon S3.
In fact, directories/folders do not exist in Amazon S3. Rather, each object stores the full path in its filename (Key).
If you wish to copy multiple objects that have the same prefix in their Key, your code will need to loop through the objects, copying one object at a time.
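A minimal sketch of that loop with the v1 SDK (the bucket names and prefixes below are placeholder assumptions) might look like this:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class CopyPrefixSketch {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        String sourceBucket = "source-bucket";            // placeholder
        String destinationBucket = "destination-bucket";  // placeholder
        String sourcePrefix = "archive/2023/";             // the "directory" to copy from
        String destinationPrefix = "backup/2023/";         // the "directory" to copy to

        ListObjectsV2Request request = new ListObjectsV2Request()
                .withBucketName(sourceBucket)
                .withPrefix(sourcePrefix);
        ListObjectsV2Result result;
        do {
            result = s3.listObjectsV2(request);
            for (S3ObjectSummary summary : result.getObjectSummaries()) {
                String sourceKey = summary.getKey();
                String destinationKey = destinationPrefix + sourceKey.substring(sourcePrefix.length());
                // Server-side copy; the object data never leaves S3
                s3.copyObject(sourceBucket, sourceKey, destinationBucket, destinationKey);
            }
            // Continue listing when the prefix holds more than 1000 keys
            request.setContinuationToken(result.getNextContinuationToken());
        } while (result.isTruncated());
    }
}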
A bit wordy, but does the job: reasonable logging, multithreading via TransferManager, handling continuation token for "folders" with more than 1000 keys:
/**
 * Copies all content from s3://sourceBucketName/sourceFolder to s3://destinationBucketName/destinationFolder.
 */
public void copyAll(String sourceBucketName, String sourceFolder, String destinationBucketName, String destinationFolder) {
    log.info("Copying data from s3://{}/{} to s3://{}/{}", sourceBucketName, sourceFolder, destinationBucketName, destinationFolder);

    TransferManager transferManager = TransferManagerBuilder.standard()
            .withS3Client(client)
            .build();
    try {
        ListObjectsV2Request request = new ListObjectsV2Request()
                .withBucketName(sourceBucketName)
                .withPrefix(sourceFolder);
        ListObjectsV2Result objects;
        do {
            objects = client.listObjectsV2(request);
            List<Copy> transfers = new ArrayList<>();
            for (S3ObjectSummary object : objects.getObjectSummaries()) {
                String sourceKey = object.getKey();
                String sourceRelativeKey = sourceKey.substring(sourceFolder.length());
                String destinationKey = destinationFolder + sourceRelativeKey;
                transfers.add(transferManager.copy(sourceBucketName, sourceKey, destinationBucketName, destinationKey));
            }
            for (Copy transfer : transfers) {
                log.debug(transfer.getDescription());
                transfer.waitForCompletion();
            }
            log.info("Copied batch of {} objects. Last object: {}", transfers.size(), transfers.isEmpty() ? "None" : transfers.get(transfers.size() - 1).getDescription());
            request.setContinuationToken(objects.getNextContinuationToken());
        } while (objects.isTruncated());
        log.info("Copy operation completed successfully from s3://{}/{} to s3://{}/{}", sourceBucketName, sourceFolder, destinationBucketName, destinationFolder);
    } catch (InterruptedException e) {
        // Resetting interrupt flag and returning control to the caller.
        Thread.currentThread().interrupt();
        throw new RuntimeException(e);
    } finally {
        transferManager.shutdownNow(false);
    }
}

How to create dataset on mainframe using FTP from java

I am trying to FTP a text file to a mainframe using Java. I am able to create a member in a PDS using the code below.
//Function to FTP the report
public void sendReport() throws IOException
{
    FTPSClient ftp = null;
    InputStream in = null;
    String protocol = "TLS";

    //Connecting to mainframe server for ftp transfer
    ftp = new FTPSClient(protocol);
    ftp.connect(hostname);
    ftp.login(user, password);
    ftp.execPBSZ(0);
    ftp.execPROT("P");
    ftp.enterLocalPassiveMode();
    ftp.setFileType(FTP.ASCII_FILE_TYPE);

    int reply = ftp.getReplyCode();
    System.out.println("Received Reply from FTP Connection:" + reply);
    if (FTPReply.isPositiveCompletion(reply))
        System.out.println("Connected To Mainframe");
    else
        System.out.println("Not connected to Mainframe..Check ID or Password");

    //Setting mainframe PDS for reports
    boolean success = ftp.changeWorkingDirectory("***Mainframe Directory***");
    if (success)
        System.out.println("Successfully changed PDS.");
    else
        System.out.println("Failed to change PDS. See Mainframe's reply.");

    //Sending Report to mainframe PDS
    File f1 = new File(dkReportName);
    in = new FileInputStream(f1);
    boolean done = ftp.storeFile("DKI" + dkReportName.substring(14, 18), in);
    in.close();
    if (done)
        System.out.println("FILE FTP SUCCESSFUL");
    else
        System.out.println("FILE FTP NOT SUCCESSFUL");

    ftp.logout();
    ftp.disconnect();
}
The user, password and hostname variables are set in appContext.xml.
However, I want to create a PS dataset.
Could anyone please suggest a way of doing it?
Based on your question, this is for the MVS file space and not USS.
When creating a dataset with FTP you need to give the host some information about file size, attributes, etc.
This page on IBM's website outlines a list of commands that you can execute to set up for the transfer. The basic sequence would be something like:
site cyl
site pri=5
site sec=5
site recfm=fb
and you can combine more than one command on a line:
site lrecl=80 blksize=3120
Execute these commands before the transfer and the file should be allocated with your desired characteristics.
Based on your code example, here is a sample that should work:
ftp.sendCommand("site",
    "cyl pri=5 sec=5 recfm=fb filetype=seq lrecl=80 blksize=3120");

How to copy S3 object from one region to another when vpc endpoint is enabled

Recently I was unable to copy files using s3.copyObject(sourceBucket, sourceKey, destBucket, destKey); for two reasons:
1) The source and destination buckets are in two different regions (us-east-1 and us-east-2 in my case).
2) The server resides in a VPC which has an S3 endpoint enabled. An S3 endpoint is an internal connection to S3, but only within the same region.
Given that we are moving large files, we could not download and then re-upload them, even temporarily. We also wanted to keep the S3 endpoint in place, because the application makes serious use of S3 assets in region.
The solution is to stream the copy, reading the object from one client and writing it to the other. I wrote this simple function which handles it.
ZipException is just a custom exception. Throw whatever you want.
Hopefully this helps somebody.
public static void copyObject(AmazonS3 sourceClient, AmazonS3 destClient, String sourceBucket, String sourceKey, String destBucket, String destKey) throws IOException {
    S3ObjectInputStream inStream = null;
    try {
        GetObjectRequest request = new GetObjectRequest(sourceBucket, sourceKey);
        S3Object object = sourceClient.getObject(request);
        inStream = object.getObjectContent();
        destClient.putObject(destBucket, destKey, inStream, object.getObjectMetadata());
    } catch (SdkClientException e) {
        throw new ZipException("Unable to copy file.", e);
    } finally {
        if (inStream != null) {
            inStream.close();
        }
    }
}
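As a usage sketch (assuming the call site lives in the same class as copyObject above, and that the bucket names and keys are placeholders), the two clients would typically be built for their respective regions:

// One client per region so each request is served from the bucket's own region
AmazonS3 sourceClient = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.US_EAST_1)
        .build();
AmazonS3 destClient = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.US_EAST_2)
        .build();

// Placeholder bucket names and keys
copyObject(sourceClient, destClient, "source-bucket", "reports/big-file.zip",
           "dest-bucket", "reports/big-file.zip");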

Connect to MySQL with Java - JDBC without showing credentials in Java source code

I am trying to learn how you would tackle the task of creating a Java console application that connects to a (in this case) MySQL DB and sends or retrieves data, without showing your username and password in the source code of the Java application. I currently have no trouble creating a connection with the credentials in the code.
// JDBC driver name and database URL
private static final String JDBC_DRIVER = "com.mysql.jdbc.Driver";
private static final String DB_URL = "jdbc:mysql://192.168.1.159:3306/javahelper";

// Database credentials
private static final String USER = "xxxx";
private static final String PASS = "RandomString";

/**
 * @return
 */
public Connection openConnection() {
    Connection connection = null;
    try {
        Class.forName(JDBC_DRIVER);
        // opening connection
        connection = (Connection) DriverManager.getConnection(DB_URL, USER, PASS);
    } catch (ClassNotFoundException e) {
        System.out.println("This is from openConnection method");
        e.printStackTrace();
    } catch (SQLException f) {
        System.out.println("This is from openConnection method");
        f.printStackTrace();
    }
    return connection;
}
From what information I can gather, you always need to show your credentials somewhere in the application. But how do you then achieve a "safe" connection between an application and a DB, so others can't misuse your credentials for malicious reasons?
One way of doing it is to keep your credentials in a properties file (or in an XML file).
Create a properties file like the one below:
// database.properties
DB_URL=jdbc:mysql://localhost:3306/UserDB
DB_USERNAME=user_name
DB_PASSWORD=password
Use this information in your code to get the username and password:
Properties props = new Properties();
try (FileInputStream input = new FileInputStream("database.properties")) {
    props.load(input);
    Connection con = DriverManager.getConnection(
            props.getProperty("DB_URL"),
            props.getProperty("DB_USERNAME"),
            props.getProperty("DB_PASSWORD"));
} catch (IOException | SQLException e) {
    e.printStackTrace();
}
You can also encrypt or hash the username and password. The best open-source option (my personal view) is jBCrypt:
// Hash a password for the first time
String hashed = BCrypt.hashpw(password, BCrypt.gensalt());
// gensalt's log_rounds parameter determines the complexity
// the work factor is 2**log_rounds, and the default is 10
String hashed = BCrypt.hashpw(password, BCrypt.gensalt(12));
// Check that an unencrypted password matches one that has
// previously been hashed
if (BCrypt.checkpw(candidate, hashed))
System.out.println("It matches");
else
System.out.println("It does not match");
Sharing what I found.
Creating and using the properties file
I created a database.properties file (a normal text file) and placed it in the src folder of the Java project.
JDBC_DRIVER=com.mysql.jdbc.Driver
USER=YourUser
PASS=YourPassword
DB_URL=jdbc:mysql://IP:PORT/DB
Afterwards I edited my openConnection() method to use the properties file for loading the credentials of the connection.
public Connection openConnection() {
    Properties properties = new Properties();
    Connection connection = null;
    String path = System.getProperty("user.dir");
    path += "/src/database.properties";

    try (FileInputStream fin = new FileInputStream(path)) {
        properties.load(fin);
        try {
            Class.forName(properties.getProperty("JDBC_DRIVER"));
            // opening connection
            connection = (Connection) DriverManager.getConnection(
                    properties.getProperty("DB_URL"),
                    properties.getProperty("USER"),
                    properties.getProperty("PASS"));
        } catch (ClassNotFoundException e) {
            System.out.println("This is from openConnection method");
            e.printStackTrace();
        } catch (SQLException f) {
            System.out.println("This is from openConnection method");
            f.printStackTrace();
        }
    } catch (IOException io) {
        System.out.println("This is from openConnection method");
        io.printStackTrace();
    }
    return connection;
}
Sending username and password, Java application -> MySQL
From what I can read on the web, it doesn't matter much if you encrypt or hash the password before you send it to the SQL service from your Java application. One example I found is that the SQL service doesn't have a "receive hash and authenticate" method. And even if it did, the hash would need to be in the program somewhere, and when the program has access to it, others also have access to it if they really want it. Also, if the hash is what's needed to authenticate, then you're back to where you could just as well use the clear-text password.
The discussion then ends on "what is the best approach". Some suggest a key server / auth system between the application and the SQL service, using a datastore set up on the server side, using the OS "wallet" (for example the Windows registry), or creating a database user with the minimum permissions to just get the job done / a read-only DB ("read_only=1" in my.cnf).
I tried the third option and created a "DBaccess" user with only the SELECT permission to retrieve data, no administrative rights, and a random password generated by MySQL.
