Azure Java SDK v12 is not downloading a file asynchronously - java

I am writing a quick proof-of-concept for downloading images from Azure Blob Storage using the Java 12 Azure Storage SDK. The following code works properly when I convert it to synchronous. However, despite the subscribe() at the bottom of the code, I only see the subscription message. The success and error handlers are not firing. I would appreciate any suggestions or ideas.
Thank you for your time and help.
private fun azureReactorDownload() {
    var startTime = 0L
    var accountName = "abcd"
    var key = "09sd0908sd08f0s&&6^%"
    var endpoint = "https://${accountName}.blob.core.windows.net/$accountName"
    var containerName = "mycontainer"
    var blobName = "animage.jpg"

    // Get the Blob Service client, so we can use it to access blobs, containers, etc.
    BlobServiceClientBuilder()
        // Container URL
        .endpoint(endpoint)
        .credential(
            SharedKeyCredential(
                accountName,
                key
            )
        )
        .buildAsyncClient()
        // Get the container client so we can work with our container and its blobs.
        .getContainerAsyncClient(containerName)
        // Get the block blob client so we can access individual blobs and include the path
        // within the container as part of the filename.
        .getBlockBlobAsyncClient(blobName)
        // Initiate the download of the desired blob.
        .download()
        .map { response ->
            // Drill down to the ByteBuffer.
            response.value()
        }
        .doOnSubscribe {
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Subscription arrived.")
            startTime = System.currentTimeMillis()
        }
        .doOnSuccess { data ->
            data.map { byteBuffer ->
                println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> READY TO WRITE TO THE FILE")
                byteBuffer.writeToFile("/tmp/azrxblobdownload.jpg")
                val elapsedTime = System.currentTimeMillis() - startTime
                println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Finished downloading blob in $elapsedTime ms.")
            }
        }
        .doOnError {
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Failed to download blob: ${it.localizedMessage}")
        }
        .subscribe()
}

fun ByteBuffer.writeToFile(path: String) {
    val fc = FileOutputStream(path).channel
    fc.write(this)
    fc.close()
}

I see someone asking the same question 4 months ago and getting no answer:
Azure Blob Storage Java SDK: Why isn't asynchronous working?
I'm going to conjecture that this part of the SDK just isn't working right now, and I wouldn't recommend using Azure's Java SDK for this.
You should be able to accomplish it another way; perhaps one of these answers will help:
Downloading Multiple Files Parallelly or Asynchronously in Java

I've worked with Microsoft and have a documented solution at the following link: https://github.com/Azure/azure-sdk-for-java/issues/5071. The person who worked with me provided very good background information, so it is more than just some working code.
I have opened a similar query with Microsoft for the downloadToFile() method in the Azure Java SDK v12, which is throwing an exception when saving to a file.
Here is the working code from that posting:
private fun azureReactorDownloadMS() {
    var startTime = 0L
    val chunkCounter = AtomicInteger(0)

    // Get the Blob Service client, so we can use it to access blobs, containers, etc.
    val aa = BlobServiceClientBuilder()
        // Container URL
        .endpoint(kEndpoint)
        .credential(
            SharedKeyCredential(
                kAccountName,
                kAccountKey
            )
        )
        .buildAsyncClient()
        // Get the container client so we can work with our container and its blobs.
        .getContainerAsyncClient(kContainerName)
        // Get the block blob client so we can access individual blobs and include the path
        // within the container as part of the filename.
        .getBlockBlobAsyncClient(kBlobName)
        .download()
        // Response<Flux<ByteBuffer>> to Flux<ByteBuffer>
        .flatMapMany { response ->
            response.value()
        }
        .doOnSubscribe {
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Subscription arrived.")
            startTime = System.currentTimeMillis()
        }
        .doOnNext { byteBuffer ->
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> CHUNK ${chunkCounter.incrementAndGet()} FROM BLOB ARRIVED...")
        }
        .doOnComplete {
            val elapsedTime = System.currentTimeMillis() - startTime
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Finished downloading ${chunkCounter.get()} chunks of data for the blob in $elapsedTime ms.")
        }
        .doOnError {
            println(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Failed to download blob: ${it.localizedMessage}")
        }
        .blockLast()
}
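If the goal is still to land the bytes in a file (as in the original question), here is a minimal sketch of one way to do it. It is written in Java against the same chain as above, assumes azure-core's FluxUtil.writeFile helper is available in the release you are using (package names shifted between previews), and the blockBlobAsyncClient parameter stands for the async client built exactly as in the Kotlin code; treat it as a starting point, not the SDK's official recipe.

import com.azure.core.util.FluxUtil;
import com.azure.storage.blob.BlockBlobAsyncClient; // package may differ between preview releases
import java.io.IOException;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class BlobToFile {

    static void downloadTo(BlockBlobAsyncClient blockBlobAsyncClient, String path) throws IOException {
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                Paths.get(path), StandardOpenOption.CREATE, StandardOpenOption.WRITE);

        blockBlobAsyncClient.download()
                // Response<Flux<ByteBuffer>> -> Flux<ByteBuffer>, exactly as in the Kotlin code above
                .flatMapMany(response -> response.value())
                // Stream every chunk into the channel; the resulting Mono<Void> completes once the blob is written
                .as(flux -> FluxUtil.writeFile(flux, channel))
                .doFinally(signal -> {
                    try {
                        channel.close();
                    } catch (IOException e) {
                        // log and ignore
                    }
                })
                .block(); // or subscribe() if the caller is already reactive
    }
}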

Related

Google Dataflow trigger from Google bucket upload?

I am currently evaluating a proof of concept which uses Google bucket, a java microservice and Dataflow.
The communication flow is like so:
User sends CSV file to third party service
Service uploads CSV file to Google bucket with ID and filename
A create event is triggered and sent as an HTTP request to the Java microservice
Java service triggers a Google Dataflow job
I am starting to think that the Java service is not necessary and that I could call Dataflow directly after the CSV is uploaded to the bucket.
Here is the service. As you can see, it's just a basic controller that validates the request params from the "create" trigger and then delegates to the Dataflow service:
@PostMapping(value = "/dataflow", produces = {MediaType.APPLICATION_JSON_VALUE})
public ResponseEntity<Object> triggerDataFlowJob(@RequestBody Map<String, Object> body) {
    Map<String, String> requestParams = getRequestParams(body);
    log.atInfo().log("Body %s", requestParams);

    String bucket = requestParams.get("bucket");
    String fileName = requestParams.get("name");
    if (Objects.isNull(bucket) || Objects.isNull(fileName)) {
        AuditLogger.log(AuditCode.INVALID_CLOUD_STORAGE_REQUEST.getCode(), AuditCode.INVALID_CLOUD_STORAGE_REQUEST.getAuditText());
        return ResponseEntity.accepted().build();
    }

    log.atInfo().log("Triggering a Dataflow job, using Cloud Storage bucket: %s --> and file %s", bucket, fileName);
    try {
        return DataflowTransport
                .newDataflowClient(options)
                .build()
                .projects()
                .locations()
                .flexTemplates()
                .launch(gcpProjectIdProvider.getProjectId(),
                        dataflowProperties.getRegion(),
                        launchFlexTemplateRequest)
                .execute();
    } catch (Exception ex) {
        if (ex instanceof GoogleJsonResponseException && ((GoogleJsonResponseException) ex).getStatusCode() == 409) {
            log.atInfo().log("Dataflow job already triggered using Cloud Storage bucket: %s --> and file %s", bucket, fileName);
        } else {
            log.atSevere().withCause(ex).log("Error while launching dataflow jobs");
            AuditLogger.log(AuditCode.LAUNCH_DATAFLOW_JOB.getCode(), AuditCode.LAUNCH_DATAFLOW_JOB.getAuditText());
        }
    }
    return ResponseEntity.accepted().build();
}
Is there a way to directly integrate Google bucket triggers with Dataflow?
When a file is uploaded to Cloud Storage, you can trigger a Cloud Function V2 with Eventarc.
Then, in this Cloud Function, you can trigger a Dataflow job.
Deploy the Cloud Function V2 with the object finalized event type:
gcloud functions deploy your_function_name \
    --gen2 \
    --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
    --trigger-event-filters="bucket=YOUR_STORAGE_BUCKET"
In the Cloud Function, you will trigger the Dataflow job with a code sample that looks like this:
def startDataflowProcess(data, context):
    from googleapiclient.discovery import build
    # replace with your projectID
    project = "grounded-pivot-266616"
    job = project + " " + str(data['timeCreated'])
    # path of the dataflow template on google storage bucket
    template = "gs://sample-bucket/sample-template"
    inputFile = "gs://" + str(data['bucket']) + "/" + str(data['name'])
    # user defined parameters to pass to the dataflow pipeline job
    parameters = {
        'inputFile': inputFile,
    }
    # tempLocation is the path on GCS to store temp files generated during the dataflow job
    environment = {'tempLocation': 'gs://sample-bucket/temp-location'}

    service = build('dataflow', 'v1b3', cache_discovery=False)
    # below API is used when we want to pass the location of the dataflow job
    request = service.projects().locations().templates().launch(
        projectId=project,
        gcsPath=template,
        location='europe-west1',
        body={
            'jobName': job,
            'parameters': parameters,
            'environment': environment
        },
    )
    response = request.execute()
    print(str(response))
This Cloud Function shows an example in Python, but you can keep your logic in Java if you prefer.
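If you do want to stay in Java, a rough sketch of the same trigger as a Java Cloud Function (gen2) is below. It assumes the Functions Framework for Java and Gson are on the classpath; the class name and the launchFlexTemplate helper are illustrative, and the actual Dataflow launch would be the same flexTemplates().launch(...) chain already shown in the controller above.

import com.google.cloud.functions.CloudEventsFunction;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import io.cloudevents.CloudEvent;
import java.nio.charset.StandardCharsets;

public class GcsFinalizedToDataflow implements CloudEventsFunction {

    @Override
    public void accept(CloudEvent event) throws Exception {
        // The payload of a google.cloud.storage.object.v1.finalized event is a JSON
        // description of the object; "bucket" and "name" are the fields we need.
        JsonObject data = JsonParser.parseString(
                new String(event.getData().toBytes(), StandardCharsets.UTF_8)).getAsJsonObject();
        String bucket = data.get("bucket").getAsString();
        String name = data.get("name").getAsString();

        // Hand off to Dataflow, exactly as the Spring controller does today.
        launchFlexTemplate(bucket, name);
    }

    private void launchFlexTemplate(String bucket, String name) {
        // DataflowTransport.newDataflowClient(options)...flexTemplates().launch(...)
        // as in the question; omitted here to keep the sketch focused on the trigger.
    }
}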

Azure SAS Token for a specific file to be uploaded? With Read and Expiry Time (JAVA)

I have a BlobServiceAsyncClient.
I used TenantID, ClientID, ClientSecret, and ContainerName to create the blobContainerAsyncClient.
I am uploading the file as
blobContainerAsyncClient.getBlobAsyncClient(fileName).upload(.........);
You can use the code below. It creates a Shared Access Signature with read-only permission that is valid only for the next 10 minutes.
public string CreateSAS(string blobName)
{
    var container = blobClient.GetContainerReference(ContainerName);

    // Create the container if it doesn't already exist
    container.CreateIfNotExists();

    var blob = container.GetBlockBlobReference(blobName);
    var sas = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy()
    {
        Permissions = SharedAccessBlobPermissions.Read,
        SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(10),
    });
    return sas;
}
Please refer to this document for more information: https://tech.trailmax.info/2013/07/upload-files-to-azure-blob-storage-with-using-shared-access-keys/
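Note that the snippet above is from the older .NET storage SDK. A roughly equivalent sketch for the Java v12 async client mentioned in the question might look like the following; it assumes azure-storage-blob 12.x with BlobServiceSasSignatureValues and generateSas available, and it only works if the client was built with a storage key credential that can sign the SAS (with the TenantID/ClientID/ClientSecret setup from the question you would need a user delegation SAS instead).

import com.azure.storage.blob.BlobContainerAsyncClient;
import com.azure.storage.blob.sas.BlobSasPermission;
import com.azure.storage.blob.sas.BlobServiceSasSignatureValues;
import java.time.OffsetDateTime;

public class SasExample {

    // Read-only SAS for a single blob, valid for the next 10 minutes.
    static String createReadSas(BlobContainerAsyncClient containerClient, String blobName) {
        BlobSasPermission permission = new BlobSasPermission().setReadPermission(true);
        BlobServiceSasSignatureValues values =
                new BlobServiceSasSignatureValues(OffsetDateTime.now().plusMinutes(10), permission);
        // generateSas signs with the account key the client was built with.
        return containerClient.getBlobAsyncClient(blobName).generateSas(values);
    }
}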

How to concatenate clips retrieved from Kinesis Video Stream in Java

I'm using the AWS Kinesis Video Stream service to get my video recordings. Due to the Kinesis Video Stream fragment limitation, it turns out I can only retrieve up to ~30 minutes of video per request, and I intend to retrieve a 2-hour video.
So I loop the request and collect all 4 responses into a List of InputStream, then turn them into a SequenceInputStream to chain them all together.
However, when I successfully upload the result to the S3 bucket and try to download it from there, the file is corrupted. I researched SequenceInputStream, and my design seemed okay.
Furthermore, if I extend the video length, say to 24 InputStreams chained into a single SequenceInputStream, I get an SSL Socket Exception: Connection reset when I run the readAllBytes operation on the sequence input stream.
Is there any way to achieve what I want, or is something wrong in my code that causes this?
Here is my source code:
private String downloadMedia(Request request, JSONObject response, JSONObject metaData, Date startDate, Date endDate) throws Exception {
    long duration = endDate.getTime() - startDate.getTime();
    long durationInMinutes = TimeUnit.MILLISECONDS.toMinutes(duration);
    long intervalsCount = durationInMinutes / 30;

    ArrayList<GetClipResult> getClipResults = new ArrayList<>();
    for (int i = 0; i < intervalsCount; i++) {
        Media currentMedia = constructMediaAfterIntervalsBreakdown(metaData, request, startDate, endDate);
        String deviceName = metaData.getString("name") + "_" + request.getId();
        Stream stream = getStreamByName(deviceName, request.getId());
        String endPoint = getDataEndpoint(stream.getStreamName());
        GetClipResult clipResult = downloadMedia(currentMedia, endPoint, stream.getStreamName());
        if (clipResult != null) {
            getClipResults.add(clipResult);
        }
        startDate = currentMedia.getEndTime();
    }

    // Get presigned URL from S3 service response
    String url = response.getJSONArray("data").getJSONObject(0).getJSONArray("parts").getJSONObject(0).getString("url");

    if (getClipResults.size() > 0) {
        Vector<InputStream> inputStreams = new Vector<>();
        for (GetClipResult clipResult : getClipResults) {
            InputStream videoStream = clipResult.getPayload();
            inputStreams.add(videoStream);
        }
        Enumeration<InputStream> inputStreamEnumeration = inputStreams.elements();
        SequenceInputStream sequenceInputStream = new SequenceInputStream(inputStreamEnumeration);
        if (sequenceInputStream.available() > 0) {
            sequenceInputStream.readAllBytes();
            byte[] bytes = sequenceInputStream.readAllBytes();
            String message = uploadFileUsingSecureUrl(url, bytes, metaData);
            return message;
        }
    }
    return "failed";
}
Edited: I came across a couple of packages called Xuggler and FFMPEG; however, most of them read the video file from disk (a path). In my case there isn't any video file on disk because I do not download the clips locally; they exist only at runtime and are uploaded to S3 after concatenation.
I appreciate any help! Thank you!
So in the end I just downloaded the clips, saved them to disk at runtime, merged them using mp4parser, and uploaded the result to S3. Afterwards I deleted the files from my disk.
If anyone is curious about the code, it is taken from https://github.com/sannies/mp4parser/blob/master/examples/src/main/java/com/googlecode/mp4parser/AppendExample.java
Thank you.
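For reference, a condensed sketch of that AppendExample approach (using the classic com.googlecode.mp4parser/isoparser 1.x APIs; the file paths are illustrative) looks roughly like this:

import com.coremedia.iso.boxes.Container;
import com.googlecode.mp4parser.authoring.Movie;
import com.googlecode.mp4parser.authoring.Track;
import com.googlecode.mp4parser.authoring.builder.DefaultMp4Builder;
import com.googlecode.mp4parser.authoring.container.mp4.MovieCreator;
import com.googlecode.mp4parser.authoring.tracks.AppendTrack;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.util.LinkedList;
import java.util.List;

public class ClipMerger {

    // Merge the clips that were saved to disk (clip0.mp4, clip1.mp4, ...) into one MP4,
    // which can then be uploaded to S3 and deleted locally.
    static void merge(List<String> clipPaths, String outputPath) throws IOException {
        List<Track> videoTracks = new LinkedList<>();
        List<Track> audioTracks = new LinkedList<>();
        for (String path : clipPaths) {
            Movie movie = MovieCreator.build(path);
            for (Track track : movie.getTracks()) {
                if ("soun".equals(track.getHandler())) {
                    audioTracks.add(track);
                } else if ("vide".equals(track.getHandler())) {
                    videoTracks.add(track);
                }
            }
        }

        Movie result = new Movie();
        if (!audioTracks.isEmpty()) {
            result.addTrack(new AppendTrack(audioTracks.toArray(new Track[0])));
        }
        if (!videoTracks.isEmpty()) {
            result.addTrack(new AppendTrack(videoTracks.toArray(new Track[0])));
        }

        Container out = new DefaultMp4Builder().build(result);
        try (FileChannel fc = new RandomAccessFile(outputPath, "rw").getChannel()) {
            out.writeContainer(fc);
        }
    }
}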

Pdf Renderer API Android From URL

I am looking into the PDF renderer API native to Google Android development. I see the following code example in the documentation:
// create a new renderer
PdfRenderer renderer = new PdfRenderer(getSeekableFileDescriptor());
// let us just render all pages
final int pageCount = renderer.getPageCount();
for (int i = 0; i < pageCount; i++) {
Page page = renderer.openPage(i);
// say we render for showing on the screen
page.render(mBitmap, null, null, Page.RENDER_MODE_FOR_DISPLAY);
// do stuff with the bitmap
// close the page
page.close();
}
// close the renderer
renderer.close();
I think this example uses a File object. How can I get this API to work with a URL from a web server, such as a document on a website? How can I load a PDF natively in an Android app without downloading the file to local storage? Something like running the Google Docs viewer to open the PDF in a WebView, but I cannot take that approach because the Google Docs viewer is blocked in the environment I am in.
You cannot use PdfRenderer to load a URL. But you can make use of Google Docs in your WebView to load the URL without downloading the file:
webView.loadUrl("https://docs.google.com/gview?embedded=true&url=" + YOUR_URL);
How can I get this API to work with a URL from a webserver?
Download the PDF from the server to a local file. Then, use the local file.
The purpose of what I am trying to learn is how to load a PDF natively in an Android app without downloading the file onto local storage.
AFAIK, you cannot use PdfRenderer that way. It needs a seekable FileDescriptor, and the only way that I know of to create one of those involves a local file.
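To make that concrete, here is a minimal sketch of the download-then-render route, assuming the PDF has already been saved to a local File (for example with an HTTP client, as the next answer shows). The class and method names are illustrative.

import android.graphics.Bitmap;
import android.graphics.pdf.PdfRenderer;
import android.os.ParcelFileDescriptor;
import java.io.File;
import java.io.IOException;

public class LocalPdfRendering {

    // Open a PDF that was downloaded to local storage and render its first page to a Bitmap.
    static Bitmap renderFirstPage(File localPdf) throws IOException {
        ParcelFileDescriptor fd =
                ParcelFileDescriptor.open(localPdf, ParcelFileDescriptor.MODE_READ_ONLY);
        PdfRenderer renderer = new PdfRenderer(fd);
        try {
            PdfRenderer.Page page = renderer.openPage(0);
            Bitmap bitmap = Bitmap.createBitmap(page.getWidth(), page.getHeight(),
                    Bitmap.Config.ARGB_8888);
            page.render(bitmap, null, null, PdfRenderer.Page.RENDER_MODE_FOR_DISPLAY);
            page.close();
            return bitmap;
        } finally {
            renderer.close();
            fd.close();
        }
    }
}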
I would first download the PDF and then show it in a pdfView:
private fun downloadPdf(): File? {
    val client = OkHttpClient()
    val request = Request.Builder().url(urlString)
        .addHeader("Content-Type", "application/json")
        .build()
    val response = client.newCall(request).execute()
    val inputStream: InputStream? = response.body?.byteStream()
    val pdfFile = File.createTempFile("myFile", ".pdf", cacheDir)
    inputStream?.readBytes()?.let { pdfFile.writeBytes(it) }
    return pdfFile
}
and then do something like this:
CoroutineScope(IO).launch {
    val pdfDownloaded = downloadPdf()
    if (pdfDownloaded != null) {
        pdfView.fromFile(pdfDownloaded)
    }
    withContext(Main) {
        pdfView.visibility = View.VISIBLE
        hideProgress()
        pdfView.show()
    }
}

How to set HTTP header in Apache JClouds?

I'm using Apache jclouds to connect to my OpenStack Swift installation. I managed to upload and download objects from Swift. However, I could not see how to upload a dynamic large object to Swift.
To upload a dynamic large object, I need to upload all segments first, which I can do as usual. Then I need to upload a manifest object to combine them logically. The problem is that to tell Swift this is a manifest object, I need to set a special header, and I don't know how to do that using the jclouds API.
Here's a dynamic large object example from the official OpenStack website.
The code I'm using:
public static void main(String[] args) throws IOException {
    BlobStore blobStore = ContextBuilder.newBuilder("swift").endpoint("http://localhost:8080/auth/v1.0")
            .credentials("test:test", "test").buildView(BlobStoreContext.class).getBlobStore();
    blobStore.createContainerInLocation(null, "container");

    ByteSource segment1 = ByteSource.wrap("foo".getBytes(Charsets.UTF_8));
    Blob seg1Blob = blobStore.blobBuilder("/foo/bar/1").payload(segment1).contentLength(segment1.size()).build();
    System.out.println(blobStore.putBlob("container", seg1Blob));

    ByteSource segment2 = ByteSource.wrap("bar".getBytes(Charsets.UTF_8));
    Blob seg2Blob = blobStore.blobBuilder("/foo/bar/2").payload(segment2).contentLength(segment2.size()).build();
    System.out.println(blobStore.putBlob("container", seg2Blob));

    ByteSource manifest = ByteSource.wrap("".getBytes(Charsets.UTF_8));
    // TODO: set manifest header here
    Blob manifestBlob = blobStore.blobBuilder("/foo/bar").payload(manifest).contentLength(manifest.size()).build();
    System.out.println(blobStore.putBlob("container", manifestBlob));

    Blob dloBlob = blobStore.getBlob("container", "/foo/bar");
    InputStream input = dloBlob.getPayload().openStream();
    while (true) {
        int i = input.read();
        if (i < 0) {
            break;
        }
        System.out.print((char) i); // should print "foobar"
    }
}
The "TODO" part is my problem.
Edited:
It has been pointed out to me that jclouds handles large file uploads automatically, which is not so useful in our case. In fact, we do not know how large the file will be, or when the next segment will arrive, at the time we start to upload the first segment. Our API is designed so that clients can upload their files in chunks of their own chosen size, at their own chosen time, and, when done, call a 'commit' to assemble these chunks into a file. That is why we want to upload the manifest ourselves.
According to @Everett Toews's answer, I've got my code running correctly:
public static void main(String[] args) throws IOException {
    CommonSwiftClient swift = ContextBuilder.newBuilder("swift").endpoint("http://localhost:8080/auth/v1.0")
            .credentials("test:test", "test").buildApi(CommonSwiftClient.class);

    SwiftObject segment1 = swift.newSwiftObject();
    segment1.getInfo().setName("foo/bar/1");
    segment1.setPayload("foo");
    swift.putObject("container", segment1);

    SwiftObject segment2 = swift.newSwiftObject();
    segment2.getInfo().setName("foo/bar/2");
    segment2.setPayload("bar");
    swift.putObject("container", segment2);

    swift.putObjectManifest("container", "foo/bar2");

    SwiftObject dlo = swift.getObject("container", "foo/bar", GetOptions.NONE);
    InputStream input = dlo.getPayload().openStream();
    while (true) {
        int i = input.read();
        if (i < 0) {
            break;
        }
        System.out.print((char) i);
    }
}
jclouds handles writing the manifest for you. Here are a couple of examples that might help you, UploadLargeObject and largeblob.MainApp.
Try using
Map<String, String> manifestMetadata = ImmutableMap.of(
"X-Object-Manifest", "<container>/<prefix>");
BlobBuilder.userMetadata(manifestMetadata)
If that doesn't work you might have to use the CommonSwiftClient like in CrossOriginResourceSharingContainer.java.
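Plugged into the blobBuilder chain from the question, that userMetadata suggestion would look roughly like the sketch below (it reuses the blobStore and manifest variables from the question's main method). Note this is hedged: jclouds normally prefixes user metadata with X-Object-Meta-, so if the bare X-Object-Manifest header does not come through on your provider, fall back to CommonSwiftClient.putObjectManifest as in the edited question.

// Replacing the "TODO: set manifest header here" line from the question:
Map<String, String> manifestMetadata = ImmutableMap.of(
        "X-Object-Manifest", "container/foo/bar/");
Blob manifestBlob = blobStore.blobBuilder("/foo/bar")
        .payload(manifest)
        .contentLength(manifest.size())
        .userMetadata(manifestMetadata)
        .build();
System.out.println(blobStore.putBlob("container", manifestBlob));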
