I want to store graph data in MarkLogic using semantic triples. I am able to do that when I use a TTL file with a URI such as http://example.org/item/item22.
But I want to store these triples with respect to documents that are already stored in MarkLogic.
For example, I have one document "Java KT" which is related to a Java class, and all of this data is present in MarkLogic. How can I create a TTL file with a URI that points to a document present in the MarkLogic database?
Load your documents, load your triples, and then add extra triples with the document URI as subject or object and a triple entity URI as the other side. You could express those in another TTL file, or create them via code.
The next question, though, would be how you want to use documents and triples together?
HTH!
My question is: what IRI do I write in the TTL file to refer to my document in the database? A TTL file accepts IRIs, so what is the IRI for my document?
@grtjn
It sounds like you want to link from some existing information to your document URI.
If it's item22 from your example, then it should be straightforward.
Let's say item22 is a book. Your TTL data might look like this:
PREFIX item: <http://example.org/item/>
PREFIX domain: <http://example.org/stuff/>
item:item22 a domain:Book ;
    domain:hasTitle "A tale of two cities" ;
    domain:hasAuthor "Charles Dickens" .
Let's say you have that book as a document in MarkLogic. You could simply add another triple:
item:item22 domain:contentsInUri "/books/Dickens/A-tale-of-two-cities.xml" .
Now you can use SPARQL to easily find the URIs related to all the books by Dickens, or to books with the title "A tale of two cities".
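For example, here is a minimal sketch using the MarkLogic Java Client API (assuming the prefixes above; the class name and the SPARQL query text are just illustrations) that finds the document URIs for everything by Dickens:

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.io.JacksonHandle;
import com.marklogic.client.semantics.SPARQLQueryDefinition;
import com.marklogic.client.semantics.SPARQLQueryManager;

public class FindDocumentUris {
    public static void run(DatabaseClient client) {
        String sparql =
            "PREFIX domain: <http://example.org/stuff/> " +
            "SELECT ?uri WHERE { " +
            "  ?book domain:hasAuthor \"Charles Dickens\" ; " +
            "        domain:contentsInUri ?uri . " +
            "}";

        SPARQLQueryManager sparqlMgr = client.newSPARQLQueryManager();
        SPARQLQueryDefinition qdef = sparqlMgr.newQueryDefinition(sparql);

        // The bindings come back as SPARQL JSON results; each ?uri value is a
        // document URI that a document manager can then read.
        JacksonHandle results = sparqlMgr.executeSelect(qdef, new JacksonHandle());
        System.out.println(results.get().toString());
    }
}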
If you are looking for more structure, you could look into semantic ontologies such as RDFS and OWL.
Let's say I have a collection called root.
Can I create a document with its subcollection in one call?
I mean if I do:
db.Collection("root").document("doc1").Collection("sub").document("doc11").Set(data)
Then will that create the structure in one shot?
To be honest, I gave this a try and doc1 had an italicized heading, which I thought was only for deleted docs.
The code you shared doesn't create an actual doc1 document. It merely "reserves" the ID for a document in root and then creates a subcollection under it with an actual doc11 document.
Seeing a document name in italics in the Firestore console indicates that there is no physical document at that location, but that there is data under the location. This is most typical when you've deleted a document that previously existed, but your code is another way of accomplishing the same.
There is no way to create two documents in one call, although you can create multiple documents in a single transaction or batch write if you want.
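If you do want doc1 to exist as a real document, a batch write can create the parent and the child together. Here's a minimal sketch, assuming the Firestore Java server SDK (com.google.cloud.firestore); the data maps are placeholders:

import com.google.cloud.firestore.DocumentReference;
import com.google.cloud.firestore.Firestore;
import com.google.cloud.firestore.WriteBatch;
import java.util.Collections;
import java.util.Map;

public class CreateParentAndChild {
    public static void create(Firestore db) throws Exception {
        DocumentReference doc1 = db.collection("root").document("doc1");
        DocumentReference doc11 = doc1.collection("sub").document("doc11");

        Map<String, Object> parentData = Collections.singletonMap("name", "doc1");
        Map<String, Object> childData = Collections.singletonMap("name", "doc11");

        // Both writes are committed atomically; doc1 becomes a real document,
        // so it no longer shows up in italics in the console.
        WriteBatch batch = db.batch();
        batch.set(doc1, parentData);
        batch.set(doc11, childData);
        batch.commit().get(); // commit() returns an ApiFuture<List<WriteResult>>
    }
}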
I was working on storing data in Azure Tables, and in the meanwhile I found JSON support for Azure Tables here. So, with a change of plan, I now have data in JSON format that I need to store in Azure Tables. I found a few code snippets, but they were all for C#. Can you please guide me?
Thanks in advance.
Azure Table Storage is a key/value pair store, as opposed to a document store (a good example of the latter would be DocumentDB). Essentially a table contains entities (broadly, think of them as rows) and each entity contains some attributes (broadly, think of them as columns). Each attribute consists of 3 things: the attribute name (the key in the key/value pair), the attribute value (the value in the key/value pair) and the attribute data type.
To answer your question: yes, you can store a JSON document in Azure Tables, but it goes in as an attribute, so you need to assign a key to your JSON document. Furthermore, each attribute can't be more than 64KB in size, so you would need to take that into consideration.
If your requirement is to store JSON documents, I would recommend looking into DocumentDB. It is more suitable for storing JSON data and can do many more things that Azure Tables can't.
Regarding your comment about JSON support for Azure Tables, that talks about the format in which data is sent to/retrieved from Azure Tables. In the earlier days, data was transmitted in the AtomPub XML format, which made the request payload/response body really bulky. With the JSON format, the size is considerably reduced. However, no matter which way you go, Azure Tables store the data in key/value pair format.
@AnandDeshmukh Based on my understanding, you want to write Java code similar to the C# sample. I suggest referring to the Javadoc of the Azure Storage SDK for Java to rewrite the sample code in Java.
For example, you can use the Java code below in place of the C# code.
C# code:
CloudTableClient tableClient = new CloudTableClient(baseUri, cred)
{
// Values supported can be AtomPub, Json, JsonFullMetadata or JsonNoMetadata
PayloadFormat = TablePayloadFormat.JsonNoMetadata
};
Java code:
CloudTableClient tableClient = new CloudTableClient(baseUri, cred);
// Same payload formats are supported: AtomPub, Json, JsonFullMetadata or JsonNoMetadata
tableClient.getDefaultRequestOptions().setTablePayloadFormat(TablePayloadFormat.JsonNoMetadata);
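Building on that, here's a sketch of writing a JSON document into a table as a single attribute, as described in the first answer. It assumes the Azure Storage SDK for Java (com.microsoft.azure.storage.table); the table name, partition/row keys, and the jsonString variable are placeholders:

import com.microsoft.azure.storage.table.CloudTable;
import com.microsoft.azure.storage.table.CloudTableClient;
import com.microsoft.azure.storage.table.DynamicTableEntity;
import com.microsoft.azure.storage.table.EntityProperty;
import com.microsoft.azure.storage.table.TableOperation;

public class StoreJsonInTable {
    public static void store(CloudTableClient tableClient, String jsonString) throws Exception {
        CloudTable table = tableClient.getTableReference("mydata");
        table.createIfNotExists();

        // Store the whole JSON document as a single string attribute.
        // Keep in mind a single attribute value is limited to 64KB.
        DynamicTableEntity entity = new DynamicTableEntity("partition1", "row1");
        entity.getProperties().put("jsonPayload", new EntityProperty(jsonString));

        table.execute(TableOperation.insertOrReplace(entity));
    }
}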
I am fairly new to MarkLogic (and NoSQL) and am currently trying to learn the Java client API. My question is about searching, which returns search result snippets/matches: is it possible for the search result to include specific fields from the document?
For example, given this document:
{"id":"1", "type":"classified", "description": "This is a classified type."}
And I search using this:
QueryManager queryMgr = client.newQueryManager();
StringQueryDefinition query = queryMgr.newStringDefinition();
query.setCriteria("classified");
SearchHandle resultsHandle = new SearchHandle();
queryMgr.search(query, resultsHandle);
How can I get the JSON document's 3 defined fields (id, type, description) as part of the search result, so I can display them in my UI table?
Do I need to hit the DB again by loading each document via its URI (so if I have 1000 results, that means hitting the DB another 1000 times)?
You have several options to retrieve specific fields with your search results. You could use the Pojo Data Binding Interface. You could read multiple documents matching a query, which brings back each document in its entirety; you can then get each one as a POJO, a String, or any other handle. Or you can use the same API you're using above, but add search options that extract a portion of each matching document.
If you're bringing back thousands of matches, you're probably not showing all those snippets to end users, so you should probably disable snippets using something like
<transform-results apply="empty-snippet" />
in your options.
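As a sketch of the last approach (assuming the MarkLogic Java Client API, MarkLogic 8+, and a user allowed to write query options; the options name "display-fields" and the JSON paths are illustrative), you could install options that extract just the three properties and suppress snippets:

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.admin.QueryOptionsManager;
import com.marklogic.client.io.Format;
import com.marklogic.client.io.SearchHandle;
import com.marklogic.client.io.StringHandle;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.StringQueryDefinition;

public class ExtractFieldsExample {
    public static SearchHandle search(DatabaseClient client) {
        // Search options: return the three JSON properties and suppress snippets.
        String options =
            "<options xmlns=\"http://marklogic.com/appservices/search\">" +
            "  <extract-document-data selected=\"include\">" +
            "    <extract-path>/id</extract-path>" +
            "    <extract-path>/type</extract-path>" +
            "    <extract-path>/description</extract-path>" +
            "  </extract-document-data>" +
            "  <transform-results apply=\"empty-snippet\"/>" +
            "</options>";

        QueryOptionsManager optionsMgr =
            client.newServerConfigManager().newQueryOptionsManager();
        optionsMgr.writeOptions("display-fields",
            new StringHandle(options).withFormat(Format.XML));

        QueryManager queryMgr = client.newQueryManager();
        StringQueryDefinition query = queryMgr.newStringDefinition("display-fields");
        query.setCriteria("classified");

        // Each match now carries the extracted id/type/description values.
        return queryMgr.search(query, new SearchHandle());
    }
}

The extracted values then travel with each match in the SearchHandle, so the UI table can be filled without a second round trip per document.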
I would like to know how the Search API stores Documents internally. Does it store a Document in the Datastore with a "Document" kind, or something else? Also, where are the indexes stored? In memcache?
Documents and indexes are stored in HR Datastore
Documents and indexes are saved in a separate persistent store optimized for search operations.
The Document class represents documents. Each document has a document identifier and a list of fields.
It's all in Google's documentation
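For context, here is a minimal sketch of how a document reaches that store with the Java Search API (the index name, document id, and fields are placeholders); you hand the document to an index rather than writing a Datastore entity yourself:

import com.google.appengine.api.search.Document;
import com.google.appengine.api.search.Field;
import com.google.appengine.api.search.Index;
import com.google.appengine.api.search.IndexSpec;
import com.google.appengine.api.search.SearchServiceFactory;

public class PutDocumentExample {
    public static void put() {
        // A document is an identifier plus a list of fields.
        Document doc = Document.newBuilder()
                .setId("book-1")
                .addField(Field.newBuilder().setName("title").setText("A tale of two cities"))
                .addField(Field.newBuilder().setName("author").setText("Charles Dickens"))
                .build();

        // The index (backed by the Search API's own persistent store, not a
        // Datastore kind you manage) stores and serves the document.
        Index index = SearchServiceFactory.getSearchService()
                .getIndex(IndexSpec.newBuilder().setName("books").build());
        index.put(doc);
    }
}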
I am planning to use Lucene to index a very large corpus of text documents. I know how an inverted index and all that works.
Question: Does Lucene store the actual source documents in its index (in addition to the terms)? So if I search for a term and want all the documents that contain it, do the documents come out of Lucene, or does Lucene just return pointers (e.g. the file paths of the matched documents)?
This is up to you. Lucene represents documents as collections of fields, and for each field you can configure whether it is stored. Typically, when handling largish documents, you would store the title field but not the body field, and you'd add an identifier field (stored, but not indexed) that can be used to retrieve the actual document.
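As a rough sketch (assuming Lucene 8.x; the index path and field values are placeholders), the per-field choice looks like this:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import java.nio.file.Paths;

public class IndexingSketch {
    public static void main(String[] args) throws Exception {
        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("/tmp/lucene-index")),
                new IndexWriterConfig(new StandardAnalyzer()))) {

            Document doc = new Document();
            // Indexed and stored: comes back with search hits.
            doc.add(new TextField("title", "A Tale of Two Cities", Field.Store.YES));
            // Indexed only: searchable, but not returned from the index.
            doc.add(new TextField("body", "It was the best of times...", Field.Store.NO));
            // Stored only (not indexed): a pointer back to the source file.
            doc.add(new StoredField("path", "/corpus/dickens/two-cities.txt"));

            writer.addDocument(doc);
        }
    }
}

At search time, the stored fields (title, path here) can be read from each hit's Document, while the full body text would have to be fetched from the original file via the stored path.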