I currently get notes from Evernote WebClipper API (Java) using the following code. This gets me the notes that have text in them. However, some notes might have images contained in them. I would like to get access to these images (resources). How would I do this?
NotesMetadataList nl = evernoteAccount.getRequestedNotes(words);
for (NoteMetadata note : nl.getNotes()) {
    logger.debug("GUID: " + note.getGuid());
    logger.debug("Title: " + note.getTitle());
    logger.debug("Content: " + note.getContent());
}
Evernote notes can have attachments called resources, which include images. To download the resources you have two options:
Parse the note content for "en-media" tags. Each tag has "type" and "hash" attributes: "type" holds the MIME type of the file attached to the note as a resource, and "hash" is the MD5 hash of that file. If the type is one you want to retrieve, call getResourceByHash on the note store where the note resides, passing the GUID of the note, the hash of the file, and three true/false flags that say whether to include the data, recognition, and alternate data, respectively.
Download the resources via the note's metadata. Each note has a "resources" attribute, which is a list of all the resources attached to the note. Each item in the list represents a resource and has "mime" and "guid" attributes. You can also inspect attributes.fileName for the file name of the resource. If the "mime", file name, or another attribute of the resource matches your criteria for downloading, call getResourceData on the note store that contains the note, passing the GUID of the resource (not the GUID of the note).
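The first option can be sketched with the standard library alone. The snippet below is a minimal illustration, not Evernote SDK code: the class and method names (EnMediaScanner, findMedia, hexToBytes) are made up for this example, and it assumes the ENML writes attributes with double quotes. It pulls the "type" and "hash" attributes out of each en-media tag and converts the hex hash into the raw bytes that a hash-based lookup would need.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Evernote SDK.
class EnMediaScanner {
    /** One en-media reference: MIME type plus the hex-encoded MD5 hash. */
    static final class MediaRef {
        final String type;
        final String hashHex;
        MediaRef(String type, String hashHex) {
            this.type = type;
            this.hashHex = hashHex;
        }
    }

    // Matches the attribute section of every <en-media ...> tag.
    private static final Pattern TAG = Pattern.compile("<en-media\\b([^>]*)>");
    // Matches name="value" pairs (assumption: double-quoted attributes).
    private static final Pattern ATTR = Pattern.compile("([\\w-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns every en-media tag that carries both a type and a hash. */
    static List<MediaRef> findMedia(String enml) {
        List<MediaRef> refs = new ArrayList<>();
        Matcher tag = TAG.matcher(enml);
        while (tag.find()) {
            String type = null;
            String hash = null;
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                if (attr.group(1).equals("type")) type = attr.group(2);
                if (attr.group(1).equals("hash")) hash = attr.group(2);
            }
            if (type != null && hash != null) refs.add(new MediaRef(type, hash));
        }
        return refs;
    }

    /** Converts the hex string from the "hash" attribute into raw bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

For each MediaRef whose type starts with "image/", the bytes from hexToBytes(ref.hashHex) would then be passed, together with the note GUID and the three include flags, to getResourceByHash as described above. The exact argument list depends on which client class of the SDK you use, so check it against the API reference.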
Sources:
Evernote API Reference: https://dev.evernote.com/doc/reference/
Evernote Resources/Attachments: https://dev.evernote.com/doc/articles/resources.php
Evernote Markup Language (ENML): https://dev.evernote.com/doc/articles/enml.php
Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me, but also adding metadata during the upload seems to be hard.
I tried to add the metadata inside the DriveItemUploadableProperties, because I couldn't find any hint about where the right place should be:
DriveItemUploadableProperties value = new DriveItemUploadableProperties();
value.additionalDataManager().put("Client", new JsonPrimitive("Test ABC"));
var driveItemCreateUploadSessionParameterSet = DriveItemCreateUploadSessionParameterSet.newBuilder().withItem(value);
UploadSession uploadSession = graphClient.sites(SPValues.SITE_ID).lists(SPValues.LIST_ID).drive().root().itemWithPath(path).createUploadSession(driveItemCreateUploadSessionParameterSet.build()).buildRequest().post();
LargeFileUploadTask<DriveItem> largeFileUploadTask = new LargeFileUploadTask<>(uploadSession, graphClient, fileStream, streamSize, DriveItem.class);
LargeFileUploadResult<DriveItem> upload = largeFileUploadTask.upload(customConfig);
This results in a 400 : Bad Request response
How can I add metadata on an upload the right way?
AFAIK, you cannot add metadata while uploading to SharePoint. You will have to make two separate requests: one to upload the file, and one to add the additional metadata to the file that you just uploaded.
Before adding your own custom metadata, you must register the facets / schema to OneDrive. Refer to this doc :
https://learn.microsoft.com/en-us/onedrive/developer/rest-api/concepts/custom-metadata-facets?view=odsp-graph-online
But you should be aware that custom facets are a preview feature: at the time of this post you have to literally contact an MS email address and get the custom facet manually approved. Unfortunately, there is no automatic API to do this.
If you somehow manage to get the custom facet approved:
DriveItemUploadableProperties has preset fields such as filename, size, etc., meant to represent the upload task and basic details about the file. There are no options to add additional metadata to it. Refer to the documentation for DriveItemUploadableProperties:
https://learn.microsoft.com/en-us/graph/api/resources/driveitemuploadableproperties?view=graph-rest-1.0
I assume that when you say, "Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me", you are able to successfully upload the file and obtain the item ID of the uploaded file from the response. You can use the item ID to update the metadata of the file via a second request. Specifically, refer to the update driveItem documentation here:
https://learn.microsoft.com/en-us/graph/api/driveitem-update?view=graph-rest-1.0&tabs=http
GraphServiceClient graphClient = GraphServiceClient.builder().authenticationProvider( authProvider ).buildClient();
DriveItem driveItem = new DriveItem();
driveItem.name = "new-file-name.docx";
graphClient.me().drive().items("{item-id}")
    .buildRequest()
    .patch(driveItem);
Edit :
As additional information, you can use a ListItem rather than a DriveItem resource and put custom fields there. However, you should be aware that, unlike the custom facets I mention above, custom metadata stored in these fields is not indexed and is not meant to be queried or filtered on large datasets, which is the most common use case for metadata. When querying for these fields you must include the
Prefer : HonorNonIndexedQueriesWarningMayFailRandomly
in the request header, and as the header says you should be aware that the query may fail randomly in large datasets.
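As a rough sketch of the ListItem route, the two requests can be built with java.net.http against the Graph REST endpoints, without the SDK. The assumptions here are mine, not from the original code: the class and method names (ListItemMetadataRequests, updateFields, queryByField) are invented, a bearer token is already available, the list has a column named "Client" as in the question, and the ID passed in is the list item's ID rather than the DriveItem's. The code only builds the requests and does not send them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; only builds requests, never sends them.
class ListItemMetadataRequests {
    static final String GRAPH = "https://graph.microsoft.com/v1.0";

    /** PATCH .../items/{id}/fields with a JSON object of column values. */
    static HttpRequest updateFields(String siteId, String listId, String listItemId,
                                    String accessToken, String fieldsJson) {
        URI uri = URI.create(GRAPH + "/sites/" + siteId + "/lists/" + listId
                + "/items/" + listItemId + "/fields");
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(fieldsJson))
                .build();
    }

    /** GET list items filtered on a non-indexed column, with the Prefer header. */
    static HttpRequest queryByField(String siteId, String listId, String accessToken,
                                    String column, String value) {
        // Assumption: value contains no single quote (OData would need it doubled).
        String filter = "fields/" + column + " eq '" + value + "'";
        // URLEncoder writes spaces as '+'; use %20 so the filter is read literally.
        String encoded = URLEncoder.encode(filter, StandardCharsets.UTF_8).replace("+", "%20");
        URI uri = URI.create(GRAPH + "/sites/" + siteId + "/lists/" + listId
                + "/items?$expand=fields&$filter=" + encoded);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .header("Prefer", "HonorNonIndexedQueriesWarningMayFailRandomly")
                .GET()
                .build();
    }
}
```

Either request could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). For the question's example, the JSON body would be {"Client": "Test ABC"}.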
I need to check whether PDF tags have the properties required by accessibility guidelines.
Examples:
H1 - validate that an H1 exists in the PDF
Image (Figure tag) - validate that the image/figure has Alt text
Language - validate that the language property is set so that a screen reader will read the document properly. For Spanish and English documents, the respective language codes should be set
Tables - access the table object and validate that the table structure is proper (header columns match the row columns, etc.)
So far I was able to:
Extract the Metadata and validate the document has proper Title, Subject and Producer info by PDDocument.getDocumentInformation().getMetadataKeys();
Validate if PDF is accessible or not by checking PDDocument.getDocumentCatalog().getMarkInfo().isMarked(); flag
To access the Tags, I have tried these options:
getDocumentCatalog().getAcroForm() returns Null
PDDocument.getDocumentCatalog().getPages().get(0).getAnnotations(); returns Null
I tried looping through PDDocument.getDocumentCatalog().getStructureTreeRoot().getKids() but it's returning only 1 StructElem type object
The accessible PDF is created using OpenText, so the dev team doesn't know about PDFBox.
I am lost as to how to get access to the tags/objects (use MarkedContent or something else).
Please suggest how to extract the individual objects (tags) such as P, H1, Table, Figure/Image and validate their properties.
Note: Manual validation of these properties is performed using Adobe Acrobat Pro
Based upon https://issues.apache.org/jira/browse/PDFBOX-7, it appears that you can use PDFMarkedContentExtractor to get the information that you need.
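Separately from the marked-content route, the single StructElem returned by getKids() in the question is typically the top of a nested tree, so the remaining tags can only be reached by recursing into each element's own kids. The sketch below shows just that recursion on a stand-in Tag class; it is not PDFBox code, and all names in it are invented for illustration. In PDFBox the corresponding calls would, as far as I know, be PDStructureElement.getKids(), getStructureType(), and getAlternateDescription(), which are not exercised here.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in model of a tag tree; hypothetical, not a PDFBox type.
class TagTreeWalk {
    static final class Tag {
        final String type;     // e.g. "Document", "H1", "Figure"
        final String altText;  // null when no alternate description is set
        final List<Tag> kids = new ArrayList<>();

        Tag(String type, String altText) {
            this.type = type;
            this.altText = altText;
        }

        Tag add(Tag kid) {
            kids.add(kid);
            return this;
        }
    }

    /** Depth-first walk that collects the node and everything below it. */
    static void collect(Tag node, List<Tag> out) {
        out.add(node);
        for (Tag kid : node.kids) {
            collect(kid, out);
        }
    }

    /** True if any tag in the tree has the given type, e.g. "H1". */
    static boolean hasTag(Tag root, String type) {
        List<Tag> all = new ArrayList<>();
        collect(root, all);
        return all.stream().anyMatch(t -> t.type.equals(type));
    }

    /** Counts Figure tags whose alt text is missing or empty. */
    static long figuresWithoutAlt(Tag root) {
        List<Tag> all = new ArrayList<>();
        collect(root, all);
        return all.stream()
                .filter(t -> t.type.equals("Figure"))
                .filter(t -> t.altText == null || t.altText.trim().isEmpty())
                .count();
    }
}
```

The same shape would apply to the real structure tree: start from the one top-level element, recurse, and run the H1, Figure, and Table checks on the flattened list.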
In our platform, we use a certain format for paths. The Android app receives those paths to load some data or do something.
I want to do all the data handling using a content provider: I want to give the path and get the data. A simple transaction.
When I read into content providers, the documentation and all the tutorials out there always use "content://" at the beginning. However, I want to use our own start of the path which is usually "is-://". Can something like this work?
No, this is how the system recognizes the URI as belonging to a content provider.
It's like replacing file:// with something else.
After referring to Developer.google site
A content URI is a URI that identifies data in a provider. Content URIs include the symbolic name of the entire provider (its authority) and a name that points to a table (a path). When you call a client method to access a table in a provider, the content URI for the table is one of the arguments.
From this I believe you can't set it on your own, as the URI includes the provider's symbolic name.
Also why do you want to change it?
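Since the scheme itself is fixed, one possible workaround (my own suggestion, not something the answers above propose) is to keep the platform paths as they are and translate them to content URIs at the boundary. The sketch below uses plain strings so it runs outside Android. The authority com.example.provider and the class name are hypothetical, and on a device the result would be handed to Uri.parse before calling the ContentResolver.

```java
// Hypothetical translation helper between platform paths and content URIs.
class PathToContentUri {
    static final String PLATFORM_PREFIX = "is-://";
    static final String CONTENT_PREFIX = "content://";
    static final String AUTHORITY = "com.example.provider"; // assumption: your provider's authority

    /** Maps "is-://items/42" to "content://com.example.provider/items/42". */
    static String toContentUri(String platformPath) {
        if (!platformPath.startsWith(PLATFORM_PREFIX)) {
            throw new IllegalArgumentException("Not a platform path: " + platformPath);
        }
        String rest = platformPath.substring(PLATFORM_PREFIX.length());
        return CONTENT_PREFIX + AUTHORITY + "/" + rest;
    }

    /** Reverse mapping, for handing results back to the platform. */
    static String toPlatformPath(String contentUri) {
        String prefix = CONTENT_PREFIX + AUTHORITY + "/";
        if (!contentUri.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a URI of this provider: " + contentUri);
        }
        return PLATFORM_PREFIX + contentUri.substring(prefix.length());
    }
}
```

This keeps the "give the path, get the data" transaction while still letting the system route the request by the content scheme and authority.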
I am trying to update the content of a Google Doc file with the content of another Google Doc file. The reason I don't use the copy method of the API is because that creates another file with another ID. My goal is to keep the current ID of the file. This is a code snippet which unfortunately does nothing:
com.google.api.services.drive.Drive.Files.Get getDraft = service.files().get(draftID);
File draft = driveManager.getFileBackoffExponential(getDraft);
com.google.api.services.drive.Drive.Files.Update updatePublished = service.files().update(publishedID, draft);
driveManager.updateFileBackoffExponential(updatePublished);
The two backoffExponential functions just launch the execute method on the object.
Googling around I found out that the update method offers another overload:
public Update update(java.lang.String fileId, com.google.api.services.drive.model.File content, com.google.api.client.http.AbstractInputStreamContent mediaContent)
Thing is, I have no idea how to retrieve the mediaContent of a Google file such as a Google Doc.
The last resort could be a Google Apps Script but I'd rather avoid that since it's awfully slow and unreliable.
Thank you.
EDIT: I am using Drive API v3.
Try the Google Drive REST update.
Updates a file's metadata and/or content with patch semantics.
This method supports an /upload URI and accepts uploaded media with
the following characteristics:
Maximum file size: 5120GB
Accepted Media MIME types: */*
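To download a Google file in a usable format, you need to specify the MIME type of the export. Since you're working with Google Docs, you can try application/vnd.openxmlformats-officedocument.wordprocessingml.document (the spreadsheet equivalent is application/vnd.openxmlformats-officedocument.spreadsheetml.sheet). See the "Download files" guide for more info.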
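Put together, the idea is an export request against the draft followed by a media upload against the published file, so the published ID never changes. The sketch below only builds the two REST requests with java.net.http and does not send them. The class and method names are invented, the bearer token is assumed to exist, the Word MIME type is my choice because the question is about Google Docs, and whether Drive converts the uploaded content back into the target Google Doc should be verified on a test file first.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; builds Drive v3 REST requests without sending them.
class DriveCopyContentRequests {
    // Assumption: a Word export is an acceptable intermediate format for a Google Doc.
    static final String DOCX =
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    /** GET files/{id}/export: downloads the draft in the given export format. */
    static HttpRequest exportRequest(String draftId, String mimeType, String accessToken) {
        URI uri = URI.create("https://www.googleapis.com/drive/v3/files/" + draftId
                + "/export?mimeType=" + URLEncoder.encode(mimeType, StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
    }

    /** PATCH on the /upload URI: replaces the content of the published file. */
    static HttpRequest uploadRequest(String publishedId, String mimeType,
                                     byte[] content, String accessToken) {
        URI uri = URI.create("https://www.googleapis.com/upload/drive/v3/files/"
                + publishedId + "?uploadType=media");
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", mimeType)
                .method("PATCH", HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }
}
```

With the Java client library, the exported bytes would instead be wrapped in an AbstractInputStreamContent implementation and passed as the mediaContent argument of the three-argument update shown in the question.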
If I've got a link that looks like this from a tweet: https://t.co/xxxxxxxxxxx,
and I know that the link contains an image, how do I extract that image from the post so I can use it on another page? I'm using twitter4j.
Thanks in advance
EDIT:
I thought it worked by doing the following:
public String getImageUrlFromPost(String url) throws TwitterException {
    Query query = new Query(url);
    QueryResult result = this.getTwitter().search(query);
    System.out.println("The tweets found: " + result.getTweets() + " with query " + url);
    for (Status status : result.getTweets()) {
        for (MediaEntity mediaEntity : status.getMediaEntities()) {
            return mediaEntity.getMediaURLHttps();
        }
    }
    return null;
}
Unfortunately result.getTweets() is empty when I pass my t.co link :(
I'm afraid you won't be able to query for or retrieve images behind t.co URLs programmatically via the Twitter4J API that way.
Basically, there are at least two types of URL formats to reference resources in Twitter:
Every URL of the form http://t.co/randomstringhere is a redirecting link to another resource on the web (most likely a web page), and each referenced page may be structured completely differently. Hence, there is no generic way of inferring the (X)HTML structure of the referenced page and consequently no proper way to retrieve what you're looking for.
By contrast, Twitter uses the URL format http://pbs.twimg.com/media/anotherandomstring.png (or .jpg or other formats) to reference images that have been shared in tweets as attached media files. Only in this case can you use status.getMediaEntities() and mediaEntity.getMediaURLHttps() to retrieve the URL of the actual image.
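The distinction between the two URL types can be checked up front by looking at the host. The following is a small standard-library sketch, with an invented class name and enum, that tells a t.co short link apart from a Twitter-hosted media URL. It does not call Twitter4J and does not follow redirects.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical classifier for the two URL formats described above.
class TweetLinkKind {
    enum Kind { SHORT_LINK, HOSTED_MEDIA, OTHER }

    static Kind classify(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return Kind.OTHER;
        }
        host = host.toLowerCase(Locale.ROOT);
        if (host.equals("t.co")) {
            return Kind.SHORT_LINK;   // redirect to an arbitrary web resource
        }
        if (host.equals("pbs.twimg.com")) {
            return Kind.HOSTED_MEDIA; // image attached to a tweet
        }
        return Kind.OTHER;
    }
}
```

Only URLs classified as HOSTED_MEDIA correspond to what getMediaEntities() exposes; a SHORT_LINK says nothing about what sits behind the redirect.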
Conclusion:
Sadly, at least in 2016, there is no generic way to retrieve resources (media files) behind http://t.co/... URLs referenced in tweets via Twitter4J.