IFile.getFile is case sensitive - java

Say my workspace has certain files in the root folder like foo.xml, foo1.xml, foo2.xml, foo3.xml.
final List<String> configFiles = new ArrayList<>();
configFiles.add("foo.xml");
configFiles.add("foo1.xml");
configFiles.add("Foo2.xml");

final List<IFile> iFiles = configFiles.stream()
        .map(project::getFile)
        .filter(IFile::exists)
        .collect(Collectors.toList());
When I call getFile on the project, the file name is treated as case-sensitive: if foo2.xml exists in my workspace and I ask for Foo2.xml, I don't get the file.
How can I get the files regardless of case?

I don't think there is a simple way.
You could call members() on the project:
IResource [] members = project.members();
and then match the member names using equalsIgnoreCase:
private IFile findFile(IResource[] members, String name) {
    for (IResource member : members) {
        if (name.equalsIgnoreCase(member.getName())) {
            if (member instanceof IFile) {
                return (IFile) member;
            }
            return null;
        }
    }
    return null;
}
so the stream would be:
final List<IFile> iFiles = configFiles.stream()
        .map(file -> findFile(members, file))
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
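If you are resolving many names, another option is to build a case-insensitive index of the project members once and look each name up in it. A rough sketch along those lines (buildFileIndex is just an illustrative name, not part of the Eclipse API):

private Map<String, IFile> buildFileIndex(IProject project) throws CoreException {
    // TreeMap with CASE_INSENSITIVE_ORDER makes every lookup ignore case.
    Map<String, IFile> index = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    for (IResource member : project.members()) {
        if (member instanceof IFile) {
            index.put(member.getName(), (IFile) member);
        }
    }
    return index;
}

and the stream becomes:

final Map<String, IFile> index = buildFileIndex(project);
final List<IFile> iFiles = configFiles.stream()
        .map(index::get)
        .filter(Objects::nonNull)
        .collect(Collectors.toList());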

Related

Getting latest modified date for a list of files from network

I have a service which fetches a list of files from the network, with each file inside a custom object.
What I want is to find the latest modified date among all of them, which I plan to do as:
val latestModifiedTime = fileService
    .fetchFiles(...)
    .map(_.file)
    .map(file => Files.getLastModifiedTime(file.toPath).toMillis)
    .sorted(Ordering.Long.reverse)
    .head
But I am not sure about this approach because of the toPath usage. Would it work with files fetched from the network?
Secondly, if it does work, can it be mocked in a unit test?
If you are scanning a network-mounted filesystem you can use Files.find to scan the files together with their attributes, and then pick the latest entry with a suitable comparator on the last-modified time:
public static void main(String[] args) throws IOException {
    Path dir = Path.of(args[0]);
    try (var files = find(dir, Integer.MAX_VALUE, (p, a) -> a.isRegularFile())) {
        var latest = files.max(Comparator.comparing(entry -> entry.getValue().lastModifiedTime()));
        if (latest.isPresent())
            System.out.println(latest.get().getKey() + " modified " + latest.get().getValue().lastModifiedTime());
    }
}
To make find() return each Path together with its BasicFileAttributes, use this helper:
public static Stream<Map.Entry<Path, BasicFileAttributes>> find(Path dir, int maxDepth,
        BiPredicate<Path, BasicFileAttributes> matcher, FileVisitOption... options) throws IOException {
    // Using ConcurrentHashMap is safe to use with parallel()
    ConcurrentHashMap<Path, BasicFileAttributes> attrs = new ConcurrentHashMap<>();
    BiPredicate<Path, BasicFileAttributes> predicate =
            (p, a) -> (matcher == null || matcher.test(p, a)) && attrs.put(p, a) == null;
    return Files.find(dir, maxDepth, predicate, options).map(p -> Map.entry(p, attrs.remove(p)));
}

AWS S3 - Bulk deletion via Java SDK v1 [duplicate]

Is it possible to delete a folder (in an S3 bucket) and all its content with a single API request using the Java SDK for AWS? In the browser console we can delete a folder and its content with a single click, and I hope the same behavior is available through the APIs as well.
There is no such thing as folders in S3. There are simply files (objects) with slashes in the filenames (keys).
The S3 browser console will visualize these slashes as folders, but they're not real.
You can delete all files with the same prefix, but first you need to look them up with listObjects(), then you can batch-delete them.
For a code snippet using the Java SDK, please refer to Deleting multiple objects.
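A minimal sketch of that list-then-batch-delete flow with SDK v1 (bucket name and prefix are placeholders; each listing page holds at most 1000 keys, which matches the DeleteObjects per-request limit):

ObjectListing listing = s3Client.listObjects("my-bucket", "some/folder/");
do {
    // Collect the keys of this page and delete them in one request.
    List<DeleteObjectsRequest.KeyVersion> keys = listing.getObjectSummaries().stream()
            .map(summary -> new DeleteObjectsRequest.KeyVersion(summary.getKey()))
            .collect(Collectors.toList());
    if (!keys.isEmpty()) {
        s3Client.deleteObjects(new DeleteObjectsRequest("my-bucket").withKeys(keys));
    }
    listing = listing.isTruncated() ? s3Client.listNextBatchOfObjects(listing) : null;
} while (listing != null);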
You can specify keyPrefix in ListObjectsRequest.
For example, consider a bucket that contains the following keys:
foo/bar/baz
foo/bar/bash
foo/bar/bang
foo/boo
And you want to delete files from foo/bar/baz.
if (s3Client.doesBucketExist(bucketName)) {
    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
            .withBucketName(bucketName)
            .withPrefix("foo/bar/baz");

    ObjectListing objectListing = s3Client.listObjects(listObjectsRequest);

    while (true) {
        for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
            s3Client.deleteObject(bucketName, objectSummary.getKey());
        }
        if (objectListing.isTruncated()) {
            objectListing = s3Client.listNextBatchOfObjects(objectListing);
        } else {
            break;
        }
    }
}
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/ListObjectsRequest.html
There is no option in the Java SDK to delete files by giving a folder name or, more specifically, a prefix. But there is an option of giving an array of keys you want to delete (see DeleteObjectsRequest for details).
Using this, I have written a small method to delete all files corresponding to a prefix.
private AmazonS3 s3client = <Your s3 client>;
private String bucketName = <your bucket name, can be signed or unsigned>;

public void deleteDirectory(String prefix) {
    ObjectListing objectList = this.s3client.listObjects(this.bucketName, prefix);
    List<S3ObjectSummary> objectSummaryList = objectList.getObjectSummaries();
    String[] keysList = new String[objectSummaryList.size()];
    int count = 0;
    for (S3ObjectSummary summary : objectSummaryList) {
        keysList[count++] = summary.getKey();
    }
    DeleteObjectsRequest deleteObjectsRequest = new DeleteObjectsRequest(bucketName).withKeys(keysList);
    this.s3client.deleteObjects(deleteObjectsRequest);
}
You can try the methods below; they handle deletion even across truncated pages, and they recursively delete all the contents under the given directory:
public Set<String> listS3DirFiles(String bucket, String dirPrefix) {
    ListObjectsV2Request s3FileReq = new ListObjectsV2Request()
            .withBucketName(bucket)
            .withPrefix(dirPrefix)
            .withDelimiter("/");

    Set<String> filesList = new HashSet<>();
    ListObjectsV2Result objectsListing;
    try {
        do {
            objectsListing = amazonS3.listObjectsV2(s3FileReq);
            objectsListing.getCommonPrefixes().forEach(folderPrefix -> {
                filesList.add(folderPrefix);
                Set<String> tempPrefix = listS3DirFiles(bucket, folderPrefix);
                filesList.addAll(tempPrefix);
            });
            for (S3ObjectSummary summary : objectsListing.getObjectSummaries()) {
                filesList.add(summary.getKey());
            }
            s3FileReq.setContinuationToken(objectsListing.getNextContinuationToken());
        } while (objectsListing.isTruncated());
    } catch (SdkClientException e) {
        System.out.println(e.getMessage());
        throw e;
    }
    return filesList;
}
public boolean deleteDirectoryContents(String bucket, String directoryPrefix) {
    Set<String> keysSet = listS3DirFiles(bucket, directoryPrefix);
    if (keysSet.isEmpty()) {
        System.out.println("Given directory " + directoryPrefix + " doesn't have any files");
        return false;
    }
    DeleteObjectsRequest deleteObjectsRequest = new DeleteObjectsRequest(bucket)
            .withKeys(keysSet.toArray(new String[0]));
    try {
        amazonS3.deleteObjects(deleteObjectsRequest);
    } catch (SdkClientException e) {
        System.out.println(e.getMessage());
        throw e;
    }
    return true;
}
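Usage would then look something like this (bucket and prefix are placeholders):

// Deletes everything under reports/2020/, including nested "subfolders".
deleteDirectoryContents("my-bucket", "reports/2020/");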
First you need to fetch all object keys starting with the given prefix:
public List<String> list(String keyPrefix) {
    var objectListing = client.listObjects("bucket-name", keyPrefix);
    var paths = objectListing.getObjectSummaries().stream()
            .map(s3ObjectSummary -> s3ObjectSummary.getKey())
            .collect(Collectors.toList());
    while (objectListing.isTruncated()) {
        objectListing = client.listNextBatchOfObjects(objectListing);
        paths.addAll(objectListing.getObjectSummaries().stream()
                .map(s3ObjectSummary -> s3ObjectSummary.getKey())
                .toList());
    }
    return paths.stream().sorted().collect(Collectors.toList());
}
Then call deleteObjects:
client.deleteObjects(new DeleteObjectsRequest("bucket-name").withKeys(list("some-prefix").toArray(new String[0])));
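One caveat: a single DeleteObjects request accepts at most 1000 keys, so for larger prefixes you may want to chunk the list first. A rough sketch:

List<String> keys = list("some-prefix");
for (int i = 0; i < keys.size(); i += 1000) {
    // Delete at most 1000 keys per request.
    List<String> batch = keys.subList(i, Math.min(i + 1000, keys.size()));
    client.deleteObjects(new DeleteObjectsRequest("bucket-name")
            .withKeys(batch.toArray(new String[0])));
}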
You can try this
void deleteS3Folder(String bucketName, String folderPath) {
    for (S3ObjectSummary file : s3.listObjects(bucketName, folderPath).getObjectSummaries()) {
        s3.deleteObject(bucketName, file.getKey());
    }
}

"recursively" grab all files in subfolders in S3

I need help to ‘recursively’ grab files in s3:
For example, I have s3 structure like this:
My-bucket/2018/06/05/10/file1.json
My-bucket/2018/06/05/11/file2.json
My-bucket/2018/06/05/12/file3.json
My-bucket/2018/06/05/13/file5.json
My-bucket/2018/06/05/14/file4.json
My-bucket/2018/06/05/15/file6.json
I need to get all file paths, with the file names, for a given bucket.
I tried the following method, but it didn't work for me (it's not returning the whole path):
public List<String> getObjectsListFromFolder4(String bucketName, String keyPrefix) {
    List<String> paths = new ArrayList<String>();
    String delimiter = "/";
    if (keyPrefix != null && !keyPrefix.isEmpty() && !keyPrefix.endsWith(delimiter)) {
        keyPrefix += delimiter;
    }

    ListObjectsRequest listObjectRequest = new ListObjectsRequest().withBucketName(bucketName)
            .withPrefix(keyPrefix).withDelimiter(delimiter);
    ObjectListing objectListing;
    do {
        objectListing = s3Client.listObjects(listObjectRequest);
        paths.addAll(objectListing.getCommonPrefixes());
        listObjectRequest.setMarker(objectListing.getNextMarker());
    } while (objectListing.isTruncated());

    return paths;
}
There is a new utility class — S3Objects — that provides an easy way to iterate Amazon S3 objects in a "foreach" statement. Use its withPrefix method and then just iterate them. You can use filters and streams as well.
Here is an example (Kotlin):
val s3 = AmazonS3ClientBuilder
    .standard()
    .withCredentials(EnvironmentVariableCredentialsProvider())
    .build()

S3Objects
    .withPrefix(s3, bucket, folder)
    .filter { s3ObjectSummary ->
        s3ObjectSummary.key.endsWith(".gz")
    }
    .parallelStream()
    .forEach { s3ObjectSummary ->
        CSVParser.parse(
            GZIPInputStream(s3.getObject(s3ObjectSummary.bucketName, s3ObjectSummary.key).objectContent),
            StandardCharsets.UTF_8,
            CSVFormat.DEFAULT
        ).use { csvParser ->
            …
        }
    }
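For reference, roughly the same listing in Java, just printing the keys (bucket and prefix are placeholders):

AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();

// S3Objects is an Iterable<S3ObjectSummary> that pages through the results for you.
for (S3ObjectSummary summary : S3Objects.withPrefix(s3, "my-bucket", "2018/06/05/")) {
    System.out.println(summary.getKey());
}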
getCommonPrefixes() only lists the prefixes, not the actual keys. From the documentation:
For example, consider a bucket that contains the following keys:
"foo/bar/baz"
"foo/bar/bash"
"foo/bar/bang"
"foo/boo"
If calling listObjects with the prefix="foo/" and the delimiter="/" on this bucket, the returned S3ObjectListing will contain one entry in the common prefixes list ("foo/bar/") and none of the keys beginning with that common prefix will be included in the object summaries list.
Instead, use getObjectSummaries() to get the keys. You also need to remove withDelimiter(); setting a delimiter causes S3 to only list items in the current 'directory'. This method works for me:
public static List<String> getObjectsListFromS3(AmazonS3 s3, String bucket, String prefix) {
    final String delimiter = "/";
    if (!prefix.endsWith(delimiter)) {
        prefix = prefix + delimiter;
    }

    List<String> paths = new LinkedList<>();
    ListObjectsRequest request = new ListObjectsRequest().withBucketName(bucket).withPrefix(prefix);
    ObjectListing result;
    do {
        result = s3.listObjects(request);
        for (S3ObjectSummary summary : result.getObjectSummaries()) {
            // Make sure we are not adding a 'folder'
            if (!summary.getKey().endsWith(delimiter)) {
                paths.add(summary.getKey());
            }
        }
        request.setMarker(result.getNextMarker());
    } while (result.isTruncated());

    return paths;
}
Consider an S3 bucket that contains the following keys:
particle.fs
test/
test/blur.fs
test/blur.vs
test/subtest/particle.fs
With this driver code:
public static void main(String[] args) {
    String bucket = "playground-us-east-1-1234567890";
    AmazonS3 s3 = AmazonS3ClientBuilder.standard().withRegion("us-east-1").build();
    String prefix = "test";
    for (String key : getObjectsListFromS3(s3, bucket, prefix)) {
        System.out.println(key);
    }
}
produces:
test/blur.fs
test/blur.vs
test/subtest/particle.fs
Here is an example of how to recursively collect all files under a local directory (using java.io.File); I hope it can help you:
public static List<String> getAllFile(String directoryPath, boolean isAddDirectory) {
    List<String> list = new ArrayList<String>();
    File baseFile = new File(directoryPath);
    if (baseFile.isFile() || !baseFile.exists()) {
        return list;
    }
    File[] files = baseFile.listFiles();
    for (File file : files) {
        if (file.isDirectory()) {
            if (isAddDirectory) {
                list.add(file.getAbsolutePath());
            }
            list.addAll(getAllFile(file.getAbsolutePath(), isAddDirectory));
        } else {
            list.add(file.getAbsolutePath());
        }
    }
    return list;
}
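On Java 8+ the same local-filesystem walk can also be written with Files.walk; a sketch (again local files only, not S3):

// Recursively collect absolute paths of regular files under the given directory.
public static List<String> getAllFilePaths(Path directory) throws IOException {
    try (Stream<Path> stream = Files.walk(directory)) {
        return stream.filter(Files::isRegularFile)
                .map(p -> p.toAbsolutePath().toString())
                .collect(Collectors.toList());
    }
}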


Java Iterators and Streams

I am trying to convert a loop that I have made into Java streams, though the code uses iterators and I am finding it hard to convert it into readable code.
private void printKeys() throws IOException {
    ClassLoader classLoader = getClass().getClassLoader();

    // read a json file
    ObjectMapper objectMapper = new ObjectMapper();
    JsonNode root = objectMapper.readTree(classLoader.getResource("AllSets.json"));

    Set<String> names = new HashSet<>();

    // loop through each sub node and store the keys
    for (JsonNode node : root) {
        for (JsonNode cards : node.get("cards")) {
            Iterator<String> i = cards.fieldNames();
            while (i.hasNext()) {
                String name = i.next();
                names.add(name);
            }
        }
    }

    // print each value
    for (String name : names) {
        System.out.println(name);
    }
}
I have tried the following, though I feel like it's not going the right way.
List<JsonNode> nodes = new ArrayList<>();
root.iterator().forEachRemaining(nodes::add);

Set<JsonNode> cards = new HashSet<>();
nodes.stream().map(node -> node.get("cards")).forEach(cards::add);

Stream s = StreamSupport.stream(cards.spliterator(), false);
// .. unfinished and unhappy
You can find the Json file I used here: https://mtgjson.com/json/AllSets.json.zip
Be warned, it's quite large.
You can do most of the things in one swoop, but it's a shame this json api does not support streams better.
List<JsonNode> nodes = new ArrayList<>();
root.iterator().forEachRemaining(nodes::add);

Set<String> names = nodes.stream()
        .flatMap(node -> StreamSupport.stream(
                node.get("cards").spliterator(), false))
        .flatMap(node -> StreamSupport.stream(
                ((Iterable<String>) () -> node.fieldNames()).spliterator(), false))
        .collect(Collectors.toSet());
Or with Patrick's helper method (from the comments):
Set<String> names = stream(root)
        .flatMap(node -> stream(node.get("cards")))
        .flatMap(node -> stream(() -> node.fieldNames()))
        .collect(Collectors.toSet());

...

public static <T> Stream<T> stream(Iterable<T> itor) {
    return StreamSupport.stream(itor.spliterator(), false);
}
And printing:
names.stream().forEach(System.out::println);
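Putting it together, the whole printKeys method could then look roughly like this (a sketch that just combines the pieces above):

private void printKeys() throws IOException {
    ObjectMapper objectMapper = new ObjectMapper();
    JsonNode root = objectMapper.readTree(getClass().getClassLoader().getResource("AllSets.json"));

    // Each set node has a "cards" array; collect the field names of every card object.
    Set<String> names = stream(root)
            .flatMap(node -> stream(node.get("cards")))
            .flatMap(card -> stream(() -> card.fieldNames()))
            .collect(Collectors.toSet());

    names.forEach(System.out::println);
}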
If you provide the JSON file to us, it will be very useful.
For now, at least, I can make some suggestions:
Set<JsonNode> cards = new HashSet<>();
nodes.stream().map(node -> node.get("cards")).forEach(cards::add);
Replace with:
Set<JsonNode> cards = nodes.stream().map(node -> node.get("cards")).collect(Collectors.toSet());
for (String name : names) {
    System.out.println(name);
}
Replace with:
names.forEach(System.out::println);
Replace
Set<JsonNode> cards = new HashSet<>();
with
List<JsonNode> cards = new ArrayList<>();
Remove
Stream s = StreamSupport.stream(cards.spliterator(), false);
Then add below lines
cards.stream().forEach(card -> {
    Iterable<String> iterable = () -> card.fieldNames();
    Stream<String> targetStream = StreamSupport.stream(iterable.spliterator(), false);
    targetStream.forEach(names::add);
});
names.forEach(System.out::println);
