Let's say I have a collection called root.
Can I create a document with its subcollection in one call?
I mean if I do:
db.Collection("root").document("doc1").Collection("sub").document("doc11").Set(data)
Then will that create the structure in one shot?
To be honest, I gave this a try and doc1 showed up with an italicized name, which I thought was only used for deleted docs.
The code you shared doesn't create an actual doc1 document. It merely "reserves" the ID for a document in root and then creates a sub collection under it containing an actual doc11 document.
Seeing a document name in italics in the Firestore console indicates that there is no physical document at that location, but that there is data under it. This most typically happens when you've deleted a document that previously existed, but your code is another way of accomplishing the same thing.
There is no way to create two documents in one call, although you can create multiple documents in a single transaction or batch write if you want.
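For example, here is a minimal sketch of a batch write using the Firestore Java server SDK (adapt it to whichever SDK your snippet is using; the field names and values below are made up):

import com.google.cloud.firestore.DocumentReference;
import com.google.cloud.firestore.Firestore;
import com.google.cloud.firestore.WriteBatch;
import java.util.Map;

void createBoth(Firestore db) throws Exception {
    DocumentReference parent = db.collection("root").document("doc1");
    DocumentReference child  = parent.collection("sub").document("doc11");

    // Both writes commit atomically, and doc1 becomes a "real" document
    // (no italics in the console) because it now has fields of its own.
    WriteBatch batch = db.batch();
    batch.set(parent, Map.of("created", true));   // placeholder fields for doc1
    batch.set(child,  Map.of("name", "example")); // the nested doc11 document
    batch.commit().get();
}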
I want to store graph data in MarkLogic using semantic triples. I am able to do that when I use a TTL file with a URI like http://example.org/item/item22.
But I want to relate these triples to documents which are stored in MarkLogic.
For example, I have one document "Java KT" which is related to a Java class, and all this data is present in MarkLogic. How can I create a TTL file with a URI pointing to a document which is present in the MarkLogic DB?
Load your documents, load your triples, and just add extra triples with the document URI as subject or object, and some triple entity URI as the other side. You could express those in another TTL file, or create them via code.
The next question, though, would be how you want to use documents and triples together.
HTH!
My question is: what is the IRI that I should write in the TTL file to refer to my document in the DB? A TTL file accepts IRIs, so what is the IRI for my document?
@grtjn
It sounds like you want to link from some existing information to your document URI.
If it's item22 from your example, then it should be straightforward.
Let's say item22 is a book. Your TTL data might look like this:
PREFIX item: <http://example.org/item/>
PREFIX domain: <http://example.org/stuff/>
item:item22 a domain:Book ;
  domain:hasTitle "A tale of two cities" ;
  domain:hasAuthor "Charles Dickens" .
Let's say you have that book as a document in MarkLogic. You could simply add another triple:
item:item22 domain:contentsInUri "/books/Dickens/A-tale-of-two-cities.xml" .
Now you can use SPARQL to easily find the URIs of all the books by Dickens, or of books with the title "A tale of two cities".
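As a rough sketch of such a lookup (assuming the MarkLogic Java Client API, the prefixes from the TTL above, and an already-configured DatabaseClient):

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.io.JacksonHandle;
import com.marklogic.client.semantics.SPARQLQueryDefinition;
import com.marklogic.client.semantics.SPARQLQueryManager;

// Find the document URIs of all books by Dickens.
void findDickensUris(DatabaseClient client) {
    String sparql =
        "PREFIX domain: <http://example.org/stuff/> " +
        "SELECT ?uri WHERE { " +
        "  ?book domain:hasAuthor \"Charles Dickens\" ; " +
        "        domain:contentsInUri ?uri . " +
        "}";
    SPARQLQueryManager sparqlMgr = client.newSPARQLQueryManager();
    SPARQLQueryDefinition query = sparqlMgr.newQueryDefinition(sparql);
    JacksonHandle results = sparqlMgr.executeSelect(query, new JacksonHandle());
    System.out.println(results.get()); // JSON bindings containing the ?uri values
}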
If you are looking for more structure, you could look into semantic ontologies such as RDFS and OWL.
I'm stuck on one question in my understanding of the Elasticsearch indexing process. I've already read this article, which says that the inverted index stores all tokens of all documents and is immutable. So, to update it, we must remove it and reindex all data to keep every document searchable.
But I've also read about partially updating documents (automatically marking them as "deleted" and inserting+indexing new ones), and that article made no mention of reindexing all the previous data.
So here is what I don't properly understand: when I update a document (a text document with 100,000 words) and already have some other indexed documents in storage, is it true that every UPDATE or INSERT operation will trigger a reindexing of all my documents?
Basically, I rely on the default Elasticsearch settings (5 primary shards with one replica per shard and 2 nodes in the cluster).
You can just update a single document (that is, reindex it, which is basically the same as removing it from the index and adding it again); see: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/update-doc.html Elasticsearch takes care of the rest of the index for you, so you won't need to reindex every other document.
I'm not sure what you mean by a "save" operation; you may want to clarify that with an example.
As for the time required to update a document of 100K words, I suggest you try it out and measure.
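For illustration, a minimal sketch of such a partial update with the older transport-based Java client (the index, type, id and field names here are made up; adjust to your client version). Only the addressed document gets reindexed:

import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.client.Client;

// Partially update a single document; Elasticsearch marks the old version
// of this one document as deleted and indexes the new version of it.
void updateOneDoc(Client client) {
    Map<String, Object> partial = new HashMap<>();
    partial.put("status", "reviewed"); // only the changed fields
    client.prepareUpdate("myindex", "mytype", "42")
          .setDoc(partial)
          .execute()
          .actionGet();
}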
I have been working with Lucene for about a year, and today I suddenly noticed something weird about it.
I was updating my index using the normal Lucene mechanism of fetching the document, deleting the old document, and then reindexing it.
So
1. Fetched the document to update from the Lucene index and kept this doc in a list.
2. Removed the document from the index.
3. Updated some of the fields of the doc from the list and then re-indexed this document.
But then I found that the updated document that was indexed had duplicate values for the original document's fields.
For example, suppose there was a field id:1 which I didn't update; I updated the other content of the document and then indexed this doc.
I found that this id:1 appeared twice in the same document. Going further, if I re-index the same document again, the same field gets created that many more times within the single document.
How should I get rid of this duplication?
I had to make a modification to the document being re-indexed. That is, from the document I fetched from the index I took out all the fields, created a fresh new document, added those fields to it, and then re-indexed this new document, which got indexed properly without any duplication.
I was not able to find the exact cause, but the document fetched from the index already carried a docId, and because of this, when it was re-indexed, some duplication might have been taking place internally, which must have caused the problem.
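In code, the fix looks roughly like this (a sketch assuming Lucene 4.x field classes and a unique "id" field; the other field names are made up):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

// Build a brand-new Document instead of re-adding the one fetched from the
// index, and let updateDocument handle the delete + add in one step.
void reindex(IndexWriter writer, Document stored, String newContent) throws Exception {
    Document fresh = new Document();
    fresh.add(new StringField("id", stored.get("id"), Field.Store.YES));
    fresh.add(new TextField("content", newContent, Field.Store.YES));

    // Deletes every document matching the term, then adds the fresh one,
    // so no fields end up duplicated.
    writer.updateDocument(new Term("id", stored.get("id")), fresh);
    writer.commit();
}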
I'm trying to define Solr data-config.xml and schema.xml files so that I can have multiple independent document and/or root entity nodes which are then linked together. It seems Solr won't index anything but the first root entity definition in data-config.
What I'm trying to achieve is to have indexed documents which are imported from a database. Each row creates one document. This is fine and already working.
Next I want to have a kind of context for the documents. The context can be updated, and in that case I also need to update the Solr index. The issue is that if the context is indexed as a sub-entity of the documents, I would need to delete and re-add all the documents involved.
The goal is that I would like to have the context as a separate entity that the docs link to. Then the previous case would change so that I could delete and re-add only the context, while the link between it and the related docs would remain unchanged and there would be no need to drop the documents during the update.
The number of documents linked to a context can be anything from hundreds to tens of thousands, so I really wouldn't like to recreate all of them whenever the context updates.
I'd like to implement a filter/search feature in my application using Lucene.
Querying the Lucene index gives me a Hits instance, which is nothing more than a list of Documents matching my criteria.
Since I generate the indexed Documents from my objects, what is the best way to find the original object related to a specific Lucene Document?
A better description of my situation:
Three model classes for now: Folder (can have other Folders or Lists as children), List (can have Tasks as children) and Task (can have other Tasks as children). They are all DefaultMutableTreeNode subclasses. I'll add the Tag entity in the future.
Each Task has a text, a start date, a due date, and some boolean flags.
They are displayed in a JTree.
The whole tree is saved in an XML file.
I'd like to do things like these:
Search Tasks with Google-like queries.
Find all Tasks that start today.
Filter Tasks by Tag.
You can't, not with vanilla Lucene. You said yourself that you converted your objects into Documents and then stored the Documents in Lucene; how would you imagine that process would be reversible?
If you want to store and retrieve your own objects in Lucene, I strongly recommend that you use Compass instead. Compass is to Lucene what Hibernate is to JDBC - you define a mapping between your objects and Lucene documents, and Compass takes care of the conversion.
Add a "stored" field that contains an object identifier. For each hit, lookup the original object via the identifier.
Without knowing more context, it's hard to be more specific.
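A rough sketch of that approach (Lucene 4.x syntax; the "objectId" and "text" field names and the taskById lookup are hypothetical placeholders for your own Task model):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;

// Index time: store only an identifier alongside the searchable text.
void indexTask(IndexWriter writer, String taskId, String text) throws Exception {
    Document doc = new Document();
    doc.add(new StringField("objectId", taskId, Field.Store.YES)); // stored for lookup
    doc.add(new TextField("text", text, Field.Store.NO));          // indexed, not stored
    writer.addDocument(doc);
}

// Search time: map each hit back to the original object by its identifier.
List<Task> search(IndexSearcher searcher, Query query, Map<String, Task> taskById) throws Exception {
    List<Task> results = new ArrayList<>();
    for (ScoreDoc hit : searcher.search(query, 50).scoreDocs) {
        Document doc = searcher.doc(hit.doc);
        results.add(taskById.get(doc.get("objectId")));
    }
    return results;
}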