In general, I need to create a Java app that will perform some operations on Azure Storage,
such as uploading a file, appending to a file, renaming, checking whether a file exists, and so on. IMPORTANT: it has to communicate with the DFS endpoint https://xxxx.dfs.core.windows.net.
But I have encountered some problems:
when using BlobContainerClient to upload a file to Azure Storage, the following error appears:
com.azure.storage.blob.models.BlobStorageException: Status code 400,
"{"error":{"code":"MissingRequiredHeader","message":"An HTTP header
that's mandatory for this request is not
specified.\nRequestId:b225d695-201f-00ed-212e-c7c9e8000000\nTime:2021-10-22T10:23:12.4983407Z"}}"
How can I avoid this situation? Which header is required, and how do I set it up?
Afterwards I implemented something similar using DataLakeFileSystemClient, and this time uploading the file worked fine. Unfortunately, not all operations can be performed; e.g. the exists() method internally uses a BlobContainerClient
and performs its call via the blob endpoint https://xxxx.blob.core.windows.net, which is forbidden in my case.
IMO this is caused by BlobContainerClientBuilder.endpoint(String endpoint), which sets the blobContainerClient's
endpoint to the blob endpoint, while the DataLakeFileSystemClient keeps the dfs endpoint.
source code:
public DataLakeFileSystemClientBuilder endpoint(String endpoint) {
    // Ensure endpoint provided is dfs endpoint
    endpoint = DataLakeImplUtils.endpointToDesiredEndpoint(endpoint, "dfs", "blob");
    blobContainerClientBuilder.endpoint(DataLakeImplUtils.endpointToDesiredEndpoint(endpoint, "blob", "dfs"));
So the question is: is this a bug in BlobContainerClientBuilder.endpoint(String endpoint)?
Or how can I fix this problem so that the same endpoint is used for both clients?
Currently, as a workaround, I'm using both clients: DataLakeFileSystemClient to perform actions
like upload, append, etc., and BlobContainerClient to check whether a file exists (a rough sketch of this setup is shown below). I would like to use only one of the clients.
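For illustration, a sketch of that two-client setup; the account name, key, container name and the use of a shared key credential are all placeholders/assumptions:

StorageSharedKeyCredential credential =
        new StorageSharedKeyCredential("xxxx", "<account-key>");

// Data Lake client on the dfs endpoint, used for upload, append, rename, ...
DataLakeFileSystemClient fileSystemClient = new DataLakeFileSystemClientBuilder()
        .endpoint("https://xxxx.dfs.core.windows.net")
        .credential(credential)
        .fileSystemName("my-container")
        .buildClient();

// Blob client on the blob endpoint, used only for exists() checks.
BlobContainerClient blobContainerClient = new BlobContainerClientBuilder()
        .endpoint("https://xxxx.blob.core.windows.net")
        .credential(credential)
        .containerName("my-container")
        .buildClient();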
Could you help me somehow, please?
Azure Blob Storage is designed for storing large amounts of unstructured data. Unstructured data is data that does not adhere to a particular data model or definition, such as text or binary data.
Blob Storage provides three resources: the Storage Account (SA), Containers inside the SA, and Blobs inside a Container. We use different Java classes to interact with these resources.
The BlobContainerClient class allows you to manipulate Azure Storage Containers and their Blobs. This class is mainly used to work on the Containers (file systems), so if you want to work on or manipulate Blobs (files) it's recommended to use the BlobClient.
Check the following snippets for creating a container and uploading a file.
Create a container using a BlobContainerClient.
blobContainerClient.create();
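The blobContainerClient used above has to be built first; one possible way (the connection string and container name below are placeholder values) is:

BlobContainerClient blobContainerClient = new BlobContainerClientBuilder()
        .connectionString("<your-connection-string>")  // placeholder
        .containerName("mycontainer")                  // placeholder
        .buildClient();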
Upload BinaryData to a blob using a BlobClient generated from a BlobContainerClient.
BlobClient blobClient = blobContainerClient.getBlobClient("myblockblob");
String dataSample = "samples";
blobClient.upload(BinaryData.fromString(dataSample));
As for renaming a blob (file), copying it and deleting the original is the only way to rename a blob. For larger blobs, you need to start an asynchronous copy and check periodically for its completion.
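A rough sketch of that copy-then-delete rename (the blob names are just examples):

BlobClient source = blobContainerClient.getBlobClient("old-name.txt");
BlobClient target = blobContainerClient.getBlobClient("new-name.txt");

// Start the server-side copy and poll until it completes (important for larger blobs).
SyncPoller<BlobCopyInfo, Void> poller =
        target.beginCopy(source.getBlobUrl(), Duration.ofSeconds(2));
poller.waitForCompletion();

// Once the copy has finished, delete the original blob.
source.delete();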
Check this Manage blobs with Java v12 SDK and Azure Storage Blob client library for Java document for more information.
Related
Is there a way to determine whether a storage account is blob storage or general purpose storage using the Azure storage Java API
According to the Azure Storage REST API Create Storage Account (only version 2016-01-01 and later), the request body contains a parameter kind which determines what kind of storage account (Storage or BlobStorage) will be created.
When using the Azure Storage Java API, there is an enum class Kind which includes the two kinds of storage account; you can select the one you want via two interfaces (WithGeneralPurposeAccountKind and WithBlobStorageAccountKind) of the StorageAccount.DefinitionStages interface.
Here are the usual usages of them.
Create a default kind storage account via define method, see the completed sample code here.
StorageAccount storageAccount = azure.storageAccounts().define(storageAccountName)
.withRegion(Region.US_EAST)
.withNewResourceGroup(rgName)
.create();
According to the source code of the define method, the default kind of storage account is Storage, via WithGeneralPurposeAccountKind.
Create a storage account of BlobStorage kind.
StorageAccount storageAccount = azure.storageAccounts().define(storageAccountName)
.withBlobStorageAccountKind() // Set the kind as `BlobStorage`
.withRegion(Region.US_EAST)
.withNewResourceGroup(rgName)
.create();
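To address the question title directly, a hedged sketch of reading the kind back from an existing account; the resource group and account names are placeholders:

StorageAccount existing = azure.storageAccounts()
        .getByResourceGroup("my-resource-group", "mystorageaccount");

if (Kind.BLOB_STORAGE.equals(existing.kind())) {
    System.out.println("This is a BlobStorage account.");
} else {
    System.out.println("This is a general purpose (Storage) account.");
}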
From a Web API, I receive the following information about an Amazon S3 Bucket I am allowed to upload a File to:
s3_bucket (the Bucket name)
s3_key (the Bucket key)
s3_policy (the Bucket policy)
s3_signature (the Bucket signature)
Because I am not the owner of the Bucket, I am provided with the s3_policy and s3_signature values, which, according to the AWS Upload Examples, can be used to authenticate a Put request to a Bucket.
However, in AWS's official Java SDK I'm using, I can't seem to find a way to perform this authentication. My code:
PutObjectRequest putObjectRequest = new PutObjectRequest(s3_bucket, s3_key, fileToUpload);
s3Client.putObject(putObjectRequest);
I do understand that I need to use the s3_signature and s3_policy I'm given at some point, but how do I do so to authenticate my PutObjectRequest?
Thanks in advance,
CrushedPixel
I don't think you're going to use the SDK for this operation. It's possible that the SDK will do what you need at this step, but it seems unlikely, since the SDK would typically take the access key and secret as arguments, and generate the signature, rather than accepting the signature as an argument.
What you describe is an upload policy document, not a bucket policy. That policy, the signature, and your file, would all go into an HTTP POST (not PUT) request of type multipart/form-data -- a form post -- as shown in the documentation page you cited. All you should need is an HTTP user agent.
You'd also need to craft the rest of the form, including all of the other fields in the policy, which you should be able to access by base64-decoding it.
The form also requires the AWSAccessKeyId, which looks something like "AKIAWTFBBQEXAMPLE", which is -- maybe -- what you are calling the "s3_key," although in S3 terminology, the "(object) key" refers to the path and filename.
This seems like an odd set of parameters to receive from a web API, particularly if they are expecting you to generate the form yourself.
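If you do end up building the form yourself, a rough sketch with Apache HttpClient might look like the following; the bucket URL, the objectKey and accessKeyId placeholders, and the field mapping are assumptions, not something the web API guarantees:

CloseableHttpClient http = HttpClients.createDefault();
HttpPost post = new HttpPost("https://" + s3_bucket + ".s3.amazonaws.com/"); // assumed bucket URL

HttpEntity form = MultipartEntityBuilder.create()
        .addTextBody("key", objectKey)               // the object's path/filename in the bucket (placeholder)
        .addTextBody("AWSAccessKeyId", accessKeyId)  // possibly what the API calls "s3_key" (see above)
        .addTextBody("policy", s3_policy)            // base64-encoded policy document
        .addTextBody("signature", s3_signature)      // signature over that policy
        .addBinaryBody("file", fileToUpload)         // the file must be the last field
        .build();

post.setEntity(form);
try (CloseableHttpResponse response = http.execute(post)) {
    System.out.println(response.getStatusLine());
}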
I have an app that allows users to save blobs in the blobstore. I have a schema that does so presently, but I am interested in something simpler and less twisted. For context, imagine my app allows users to upload the picture of an animal with a paragraph describing what the animal is doing.
Present schema
User calls my endpoint API to save the paragraph and name of the animal in the entity Animal. Note: The Animal entity actually has 4 fields (name, paragraph, BlobKey, and blobServingUrl as String), but the endpoint API only allows saving the two mentioned.
Within the endpoint method, on the App Engine side, after saving the name and paragraph I make the following call to generate a blob upload URL, which my endpoint method returns to the caller:
@ApiMethod(name = "saveAnimalData", httpMethod = HttpMethod.POST)
public String saveAnimalData(AnimalData request) throws Exception {
...
BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
String url = blobstoreService.createUploadUrl("/upload");
return url;
}
On the android side, I use a normal http call to send the byte[] of the image to the blobstore. I use apache DefaultHttpClient(). Note: the blobstore, after saving the image, calls my app-engine server with the blob key and serving url
I read the response from the blobstore (blobstore called my callback url) using a normal java servlet, i.e. public void doPost(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException. From the servlet, I put the BlobKey and blobServingUrl into the Animal entity for the associated animal. (I had passed some meta data to the blobstore, which I use as markers to identify the associated animal entity).
Desired Schema
This is where your response comes in. Essentially, I would like to eliminate the Java servlet and restrict my entire API to Google Cloud Endpoints. So my question is: how would I use my endpoint to execute steps 3 and 4?
So the idea would be to send the image bytes to the endpoint method saveAnimalData at the same time that I am sending the paragraph and name data. And then within the endpoint method, send the image to the blobstore and then persist the BlobKey and blobServingUrl in my entity Animal.
Your response must be in java. Thanks.
I see two questions in one here:
Can Google Cloud Endpoints handle multipart files? -> I don't know about this TBH
Is there a simpler process to store blobs than using the BlobStoreService?
It depends on the size of your image. If you limit your users to < 1MB files, you could just store your image as a Blob property of your Animal entity. It would allow you to bypass the BlobStoreService plumbing. See: https://developers.google.com/appengine/docs/java/datastore/entities?hl=FR
This solution still depends on how the Cloud Endpoint would handle the multipart file as a raw byte[]...
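For what it's worth, a rough sketch of that approach, assuming Cloud Endpoints can deliver the image as a byte[] field on the request object; the getters on AnimalData are hypothetical:

@ApiMethod(name = "saveAnimalData", httpMethod = HttpMethod.POST)
public void saveAnimalData(AnimalData request) {
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

    Entity animal = new Entity("Animal");
    animal.setProperty("name", request.getName());
    animal.setProperty("paragraph", request.getParagraph());
    // Blob properties count toward the 1 MB entity limit, so this only suits small images.
    animal.setProperty("image", new Blob(request.getImageBytes()));

    datastore.put(animal);
}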
We encountered the same issue with GWT + Google App Engine in 2009, and it was before the BlobStoreService was made available.
GWT RPC and Cloud Endpoints interfaces share some similarities, and for us it was not possible. We had to create a plain HTTP servlet and use a streaming multipart file resolver, because the one from Apache's HTTP Commons used the file system.
My main question is: how can I pass JSON as well as a File in a POST request to a REST API? What is needed in the Spring framework to act as a client that sends a POST with JSON and a File and waits for the response?
Options:
Do I need to use FileRepresentation with ClientResource? If so, how can I pass the file as well as the JSON?
Should I use RestTemplate to pass both the JSON and the File? How can it be used for posting JSON as well as a File?
Is any other option available?
Sounds like an awful resource you're trying to expose. My suggestion is to separate them into 2 different requests. Maybe the JSON has the URI for the file to then be requested…
From a REST(ish) perspective, it sounds like the resource you are passing is a multipart/mixed content-type. One subtype will be application/json, and one will be whatever type the file is. Either or both could be base64 encoded.
You may need to write specific providers to serialize/deserialize this data. Depending on the particular REST framework, this article may help.
An alternative is to create a single class that encapsulates both the json and the file data. Then, write a provider specific to that class. You could optionally create a new content-type for it, such as "application/x-combo-file-json".
You basically have three choices:
Base64 encode the file, at the expense of increasing the data size by around 33%.
Send the file first in a multipart/form-data POST, and return an ID to the client. The client then sends the metadata with the ID, and the server re-associates the file and the metadata.
Send the metadata first, and return an ID to the client. The client then sends the file with the ID, and the server re-associates the file and the metadata (a sketch of this option follows below).
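A sketch of that third choice using Spring's RestTemplate; the URLs, field names and JSON body are assumptions about the server's API:

RestTemplate restTemplate = new RestTemplate();

// 1. Send the JSON metadata and get back an ID for it.
HttpHeaders jsonHeaders = new HttpHeaders();
jsonHeaders.setContentType(MediaType.APPLICATION_JSON);
String id = restTemplate.postForObject(
        "https://example.com/api/metadata",                  // hypothetical endpoint
        new HttpEntity<>("{\"name\":\"example\"}", jsonHeaders),
        String.class);

// 2. Send the file in a multipart/form-data POST, tagged with that ID.
MultiValueMap<String, Object> parts = new LinkedMultiValueMap<>();
parts.add("id", id);
parts.add("file", new FileSystemResource(new File("/tmp/example.bin")));

HttpHeaders formHeaders = new HttpHeaders();
formHeaders.setContentType(MediaType.MULTIPART_FORM_DATA);
restTemplate.postForEntity(
        "https://example.com/api/files",                     // hypothetical endpoint
        new HttpEntity<>(parts, formHeaders),
        String.class);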
I'm planning to develop a web service, and I'd like to try the RESTful architecture. The issue is that I don't know whether the service is a good fit for it, or whether it would be better to use SOAP.
The service is about downloading some data from the server to a device on the local computer. The data will be split into chunks. The service will be run with an ad-hoc client on the local machine that manages the device the file is going to be stored in.
I was thinking of having something like:
/files/{id} --> will inform about the details of the file
/files --> list all the files
The problem is the action. In REST only GET, POST, PUT and DELETE are defined, but I want to have something like download. My idea, although not fully RESTful, is to create:
/files/{id}/download
This will return something like
{ "chunk" : "base64 string with chunk data"
"next" : "http://XXX/file/id/download?chunk=1
}
When next is empty, the whole set of chunks has been downloaded.
What do you think? Is it ok to do it this way or would it be better the traditional way using SOAP and defining functions like getFiles(), getFileChunk(chunkNo, file)?
Any comment is really appreciated.
See you
If using REST, you don't need to define your own "chunking" protocol as the HTTP headers Content-Length, Content-Range and Transfer-Encoding are all used for sending chunked data.
See the RFC for HTTP header fields
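For example, a small client-side sketch (plain HttpURLConnection, hypothetical URL) of fetching a single chunk with a Range request instead of a custom ?chunk= parameter:

URL url = new URL("https://example.com/files/42");          // hypothetical file URL
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestProperty("Range", "bytes=0-65535");          // ask for the first 64 KiB

int status = conn.getResponseCode();                        // 206 Partial Content if ranges are supported
String contentRange = conn.getHeaderField("Content-Range"); // e.g. "bytes 0-65535/1048576"

try (InputStream in = conn.getInputStream()) {
    byte[] chunk = in.readAllBytes();                       // write this chunk to the device
}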
As John already mentioned, you might want to separate your file resources from the file resource metadata (any information about your file). Additionally, a more RESTful way to access your chunks could look like this:
http://url/files/{id}/chunks
{
"complete" : false,
"chunks": [
"http://url/files/<fileid>/chunks/1",
"http://url/files/<fileid>/chunks/2",
"http://url/files/<fileid>/chunks/3",
]
}
Basically, you return a list of RESTful URIs to all of your file chunks, plus the information whether all chunks of the file are already available. I don't see that SOAP would have any advantage here, since you would define the same methods (getFile and getChunks) that are already covered by the REST verb GET.
It sounds like you really have two different resources: file-metadatas and files. What about something like:
/file/{id} // GET: Retrieve this file's data.
/file-metadata/{id} // GET: Metadata about a particular file. Contains link to file:
// {
// ...
// data: "http://.../file/156", // Where to find file's data.
// }
/file-metadata // GET: List metadata for all files.