I have Merchant_Id, Merchant_details, and Shop_Id, and each merchant also has a category.
If I create a HashTable with <Merchant_Id, Merchant_details>, retrieving the Merchant_details for a given Merchant_Id will be easy.
The Shop_Id can be encapsulated inside the Merchant_details, so I can retrieve the list of shops for a Merchant_Id.
I also need to list the merchants for a specific category, for example the merchants in 'restaurant' or 'sports', and I don't want to iterate through the entire list of items to find the category.
How can I incorporate all three in a single structure without a database?
A suggestion:
Create a structure with Merchant_Id, Merchant_Details and Shop_Id.
Put them into a std::vector.
Create an empty std::map<Merchant_ID, unsigned int>.
Iterate through the vector, adding elements to the map:
merchant_id_map[database_vector[i].Merchant_Id] = i;
This is called an index table.
Create one for the other fields as well.
When you need to search by Merchant_Id, use the merchant_id_map to get the index into the vector:
unsigned int vector_index = merchant_id_map[merchant_id];
record = database_vector[vector_index];
C++ flavor:
using category_assignment = std::set<merchant_id_type>;
std::map<std::string, category_assignment> categories;
(wherever you place it) will give you cheap access from the category name to the respective merchants.
The problem starts when you need the reverse lookup ("which categories does Merchant N belong to?").
To get that you should encapsulate the map in a singleton-like class X, add a map for the reverse mapping and manage both maps through the interface of X.
I have a java arraylist that is made like this:
{[{},{}], [{},{}], [{},{}], [{},{}]} of around four thousand records.
I have a key that I want to look up in one of the objects in this list, and I want to fetch the array in which that record matches. The search key is a string.
Is there a solution to this that does not traverse the entire list?
It is basically a list that is constructed like this:
List<Object[]> list = new ArrayList<>();
I am using this to fetch the data from two tables using a join. Individual records of each table map to these objects.
Say table1: {a:1,b:2,c:3} and table2: {x:1,y:2,z:3}
the data returned would be
{[{a:1,b:2,c:3}, {x:1,y:2,z:3}],[{a:2,b:3,c:4}, {x:2,y:3,z:4}]}
How do I find which array in the list has a=2?
Thanks
If you do not want to be a victim of the linear search, you should consider using another type of data structure than List.
The use case you described seems like a good match for a Map in general. If you want constant-time key lookup, consider using a HashMap.
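A minimal sketch of that suggestion, assuming each row is an Object[] whose first element is a Map holding the table1 columns. The class name RowIndex, the helper buildIndex, and the use of Map for the row objects are illustrative assumptions, not part of the question. The list is traversed once to build the index; after that each lookup is a single HashMap access.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RowIndex {
    // Builds a lookup from the value of `column` in the first object of each
    // row to the row itself. Assumes (hypothetically) that element 0 is a Map.
    // If two rows share a key value, the later row replaces the earlier one.
    static Map<String, Object[]> buildIndex(List<Object[]> rows, String column) {
        Map<String, Object[]> index = new HashMap<>();
        for (Object[] row : rows) {
            Map<?, ?> first = (Map<?, ?>) row[0];
            index.put(String.valueOf(first.get(column)), row);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Object[]> list = new ArrayList<>();
        list.add(new Object[] { Map.of("a", 1, "b", 2, "c", 3), Map.of("x", 1, "y", 2, "z", 3) });
        list.add(new Object[] { Map.of("a", 2, "b", 3, "c", 4), Map.of("x", 2, "y", 3, "z", 4) });

        Map<String, Object[]> byA = buildIndex(list, "a"); // one pass over the list
        Object[] match = byA.get("2");                     // no traversal here
        System.out.println(match[1]);                      // the table2 half of the matching row
    }
}
```

If several rows can share the same key value, the map value would need to be a List<Object[]> rather than a single row.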
I have the following problem:
There is a Set<C> s of objects of class C. C is defined as follows:
class C {
A a;
B b;
...
}
Given A e, B f, ..., I want to find from s all objects o such that o.a = e, o.b = f, ....
Simplest solution: stream over s, filter, collect, return. But that takes a long time.
Half-assed solution: create a Map<A, Set<C>> indexA, which splits the set by a's value. Stream over indexA.get(e), filter for the other conditions, collect, return.
More-assed solution: create index maps for all fields, select for all criteria from the maps, stream over the shortest list, filter for other criteria, collect, return.
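For illustration, the index-map idea can be sketched roughly as follows. FieldIndex, Item, and intersect are hypothetical names, and Item stands in for C with only two fields; this is a sketch of the approach, not a replacement for a real indexing library.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for class C, with two indexed fields.
final class Item {
    final String a;
    final int b;
    Item(String a, int b) { this.a = a; this.b = b; }
}

// One index per field: maps a field value to all objects carrying that value.
final class FieldIndex<K, T> {
    private final Map<K, Set<T>> buckets = new HashMap<>();
    private final Function<T, K> keyOf;

    FieldIndex(Function<T, K> keyOf) { this.keyOf = keyOf; }

    void add(T item) {
        buckets.computeIfAbsent(keyOf.apply(item), k -> new HashSet<>()).add(item);
    }

    Set<T> get(K key) {
        return buckets.getOrDefault(key, Collections.emptySet());
    }

    // Intersects two candidate sets by iterating over the smaller one.
    static <T> Set<T> intersect(Set<T> x, Set<T> y) {
        Set<T> small = x.size() <= y.size() ? x : y;
        Set<T> large = small == x ? y : x;
        Set<T> out = new HashSet<>();
        for (T item : small) {
            if (large.contains(item)) out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        FieldIndex<String, Item> byA = new FieldIndex<>(item -> item.a);
        FieldIndex<Integer, Item> byB = new FieldIndex<>(item -> item.b);
        Item[] all = { new Item("e", 1), new Item("e", 2), new Item("g", 1) };
        for (Item item : all) { byA.add(item); byB.add(item); }

        // Objects with a = "e" and b = 1, found without scanning `all`.
        Set<Item> hits = intersect(byA.get("e"), byB.get(1));
        System.out.println(hits.size());
    }
}
```

Every index has to be updated whenever an object is added, removed, or has an indexed field changed, which is the bookkeeping a real database does for you.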
You see where this is going: we're accidentally building a database. The thing is that I don't want to serialize my objects. Sure I could grab H2 or HSQLDB and stick my objects in there, but I don't want to persist them. I basically just want indices on my regular old on-the-heap Java objects.
Surely there must be something out there that I can reuse.
Eventually, I found a couple of projects which tackle this problem including CQEngine, which seems like the most complete and mature library for this purpose.
HSQLDB provides the option of storing Java objects directly in an in-memory database without serializing them.
The property sql.live_object=true is used as a property on the connection URL to a mem: database, for example jdbc:hsqldb:mem:test;sql.live_object=true. A table is created with a column of type OTHER to store the object. Extra columns in this table duplicate any fields in the object that need indexing.
For example:
CREATE TABLE OBJECTLIST (ID INTEGER IDENTITY, OBJ OTHER, TS_FIELD TIMESTAMP, INT_FIELD INTEGER)
CREATE INDEX IDX1 ON OBJECTLIST(TS_FIELD)
CREATE INDEX IDX2 ON OBJECTLIST(INT_FIELD)
The object is stored in the OBJ column, and the timestamp and integer values for the indexed fields are stored in the extra columns. SQL queries such as SELECT * FROM OBJECTLIST WHERE INT_FIELD = 1234 return the rows containing the relevant objects.
http://hsqldb.org/doc/2.0/guide/dbproperties-chapt.html#dpc_sql_conformance
I am currently writing code that contains an ArrayList. This ArrayList holds data consisting of name, last name, job and id. I need to separate the data into different ArrayLists. Currently I am using the method shown below.
for (int i = 0; i < details.size(); i = i + 4) {
names.add(details.get(i));
lastname.add(details.get(i + 1));
job.add(details.get(i + 2));
id.add(details.get(i+3));
}
I want to know if there is a better way of doing this. The initial ArrayList can be very long, and I don't know if there are any issues with this method.
You asked: "I want to know if there is a better way of doing this". There is a better way.
You should consider creating a class called Record that contains the data (name, last name, job, and ID), and creating an ArrayList<Record>. Then, instead of using index locations (and potentially grabbing the wrong data item), you can use the Record getter methods to get the data item you need (and perhaps store it in a different list).
Step 1: Create a Record class:
public class Record
{
private String firstName;
private String lastName;
private String job;
private String id;
// TODO add constructor(s), getters and setters
}
Step 2: Create a list of Records. This is a better alternative than a list that holds the information at different index locations: each set of name, last name, job, and ID is self-contained, which is far better than having it scattered across index locations in a list.
ArrayList<Record> records = new ArrayList<Record>();
Step 3: Instead of using index locations (and potentially grabbing the wrong data item), use the Record getter methods to get the data item you need (and perhaps store it in a different list).
ArrayList<String> names = new ArrayList<String>();
ArrayList<String> jobs = new ArrayList<String>();
...
for (Record record : records) { // the getters belong to each Record, not to the list
    names.add(record.getLastName() + ", " + record.getFirstName());
    jobs.add(record.getJob());
}
Alternatively, and maybe a better solution, you could use a Map to store this information. For example, a job ID could be the key in a Map that returns a job description and who has been assigned to perform it. Job IDs have to be unique. A list can end up holding duplicate IDs, because the List interface doesn't prevent duplicate entries. If you use a Map, the keys are guaranteed to be unique. The value returned from the Map could be a Record object (or some other kind) that contains the name of the person and the job the person is responsible for. Since values can be duplicated, you can have one person performing multiple jobs, which is probably what you want. To use a Map:
Map<String, Record> jobs = new HashMap<String, Record>(); //This record class doesn't have ID in it.
jobs.put("ABC123", new Record("John", "Doe", "Fix Drywall"));
jobs.put("321CBA", new Record("Bill", "Smith", "Install Light Fixtures"));
A few things to consider if using a Map. If you try to make a new entry with an existing key, the old one will be overwritten.
jobs.put("ABC123", new Record("John", "Doe", "Fix Drywall"));
jobs.put("ABC123", new Record("Bill", "Smith", "Install Light Fixtures")); //Overwrote the previous entry because key is the same
If you want to change the key for an existing value, you must obtain the value, store temporarily, remove the old record, and make a new entry with the old temp value:
jobs.put("ABC123", new Record("John", "Doe", "Fix Drywall"));
Record rec = jobs.remove("ABC123"); // gets the record and removes old entry
jobs.put("321CBA", rec); // new job ID for old record
The main issue is that your details list can have missing data. For example, if its size is 5, your method will crash with an IndexOutOfBoundsException. Your details list should contain Person objects that hold all the details you want, which you then use to fill the other lists.
The main performance cost will be the add operation, since each list has to grow its backing array over time. Since you know details.size(), you should initialize the other ArrayLists with a capacity of details.size()/4.
You should also check that details.size() % 4 == 0 before the for loop. If it is not, your data is malformed and you are certain to run into an IndexOutOfBoundsException.
Just for correctness, you should write i + 3 < details.size() as your condition, since you access element i+3 in the for body. In general, if you ever access i+x in the body, check i + x < details.size() (for the largest x used in the body).
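Put together, those checks might look like the sketch below. DetailsSplitter and split are made-up names, and returning the four lists as a list of columns is only one possible shape for the result.

```java
import java.util.ArrayList;
import java.util.List;

final class DetailsSplitter {
    static final int FIELDS = 4; // name, last name, job, id

    // Returns four lists in the order: names, last names, jobs, ids.
    static List<List<String>> split(List<String> details) {
        // Reject malformed input up front instead of failing mid-loop.
        if (details.size() % FIELDS != 0) {
            throw new IllegalArgumentException(
                "details.size() must be a multiple of " + FIELDS + " but was " + details.size());
        }
        int rows = details.size() / FIELDS;
        List<List<String>> columns = new ArrayList<>(FIELDS);
        for (int c = 0; c < FIELDS; c++) {
            columns.add(new ArrayList<>(rows)); // presized, so add() never has to regrow
        }
        // The bound covers the largest offset read in the body (i + FIELDS - 1).
        for (int i = 0; i + FIELDS - 1 < details.size(); i += FIELDS) {
            for (int c = 0; c < FIELDS; c++) {
                columns.get(c).add(details.get(i + c));
            }
        }
        return columns;
    }
}
```

Calling split on a list of eight strings yields four lists of two elements each; a list of five strings is rejected with an IllegalArgumentException before any element is read.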
I have an ArrayList of HashMap key-value pairs which looks like
ArrayList<HashMap<String, String>> myList =
new ArrayList<HashMap<String, String>>();
I understand that I can iterate through these items and find a match, but this seems to be an expensive task. Is there any other way to get an element without iterating?
My ArrayList has values like
[{Father Name=a, Mother Name=b, Child Name=c, Reg No=1, Tag ID=1},
{Father Name=p, Mother Name=q, Child Name=r, Reg No=2, Tag ID=2},
{Father Name=x, Mother Name=y, Child Name=z, Reg No=3, Tag ID=3}]
Based on RegNo, I wish to get Father Name, Mother Name and Child Name without iterating individual items.
Without iterating you will need to store your HashMap in another HashMap with key Reg No. Though I'd recommend using a Family object or something similar: HashMap<Integer, Family> registration (that's the beauty of OO-languages :) )
class Family {
String father;
String mother;
String child;
// constructor getters setters
}
Map<Integer, Family> registration = new HashMap<>(); // diamond operator, a JDK 7 feature
//Map<Integer, Family> registration = new HashMap<Integer, Family>(); // the 'old' way
registration.put(regNo, new Family("Jack", "Mary", "Bastard"));
Family family = registration.get(regNo);
String father = family.getFather();
Since you are storing the hashes in a list, their order remains constant. That means you can create another array that stores the Reg No values in the same order, search for the Reg No in that array, and use the index of the match to get the other values.
Iterating is O(n), but you want access to your structure to be faster. This means storing objects in an ordered manner (usually O(log(n))) or using another hash (O(1)).
Either that, or you "hide" the iteration, but this would solve the problem only aesthetically (something like getElementsByTagName in XML).
In any case you'll probably have to alter your structures, especially if you want to be able to have faster access for every field (father/mother/child/tag) and not just 'reg no'.
Another solution might be to store plain data in a hash with a key pair like (primary key, data), duplicating the PK for every field in your HashMap. This not only requires finding a valid primary key first, but the size of the hash could also become a problem.
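The two faster options mentioned above (an ordered structure and another hash) can be sketched as follows. RegLookup and indexByRegNo are hypothetical names, and the sketch assumes every row has a numeric "Reg No" value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegLookup {
    // Fills `target` with one entry per row, keyed by the parsed "Reg No".
    // Assumes each row contains a numeric "Reg No"; otherwise parsing throws.
    static void indexByRegNo(List<? extends Map<String, String>> rows,
                             Map<Integer, Map<String, String>> target) {
        for (Map<String, String> row : rows) {
            target.put(Integer.valueOf(row.get("Reg No")), row);
        }
    }

    public static void main(String[] args) {
        ArrayList<HashMap<String, String>> myList = new ArrayList<>();
        HashMap<String, String> first = new HashMap<>();
        first.put("Father Name", "a");
        first.put("Reg No", "1");
        myList.add(first);
        HashMap<String, String> second = new HashMap<>();
        second.put("Father Name", "p");
        second.put("Reg No", "2");
        myList.add(second);

        Map<Integer, Map<String, String>> hashed = new HashMap<>();      // expected O(1) lookup
        TreeMap<Integer, Map<String, String>> ordered = new TreeMap<>(); // O(log n) lookup, keys kept sorted
        indexByRegNo(myList, hashed);
        indexByRegNo(myList, ordered);

        System.out.println(hashed.get(2).get("Father Name")); // father of Reg No 2
        System.out.println(ordered.firstKey());               // smallest Reg No
    }
}
```

The TreeMap is worth its slower lookup only if you also need range or nearest-key queries on Reg No; for plain lookups the HashMap is enough.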
My use case is an index which holds titles of online media. The provider of the data associates a list of categories with each title. I am using SolrJ to populate the index via an annotated POJO class
e.g.
@Field("title")
private String title;
@Field("categories")
private List<Category> categoryList;
The associated POJO is
public class Category {
private Long id;
private String name;
...
}
My question has two parts:
a) Is this possible via SolrJ? The docs only contain an example of @Field using a List of String, so I assume the serialization/marshalling only supports simple types.
b) how would I set up the schema to hold this. I have a naive assumption I just need to set
multiValued=true on the required field & it will all work by magic.
I'm just starting to implement this so any response would be highly appreciated.
The answer is as you thought:
a) Only simple types are available, so you will have a List of a single simple type, e.g. String. You can't represent complex types inside the Lucene document, so you won't be able to deserialize them either.
b) The problem is that you are trying to apply relational thinking to a "document store". That will probably work only up to a point. If you want to represent categories inside a Lucene document, just use the name string; it is not necessary to store an id as well.
The only reason to store an id as well is if, besides the search, you want to do a lookup in an RDBMS. If you want to do this, you need to make sure that the id and the category name stay linked. This does not work for every 1:n relation. (It is possible for every 1:n relation where the related table consists only of required fields. If you have an optional field, you need to put something like a filler constant for empty values in the field, if possible.)
However, if these 1:n relations are not sparse, it is actually possible, provided you maintain the order in which you add fields to the document. So the category relation can probably be represented if you don't sort the lists.
You may implement a method which returns a Category by instantiating it with the values at position 0...n of each list. So if you want the first category, its values will be at position 0 of every list related to this category.
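As a rough sketch of that last idea in plain Java (no SolrJ calls involved): assume the ids and names were indexed as two separate multi-valued fields and have been read back into two parallel lists in the order they were added. CategoryZipper, zip, and this Category shape are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

final class CategoryZipper {
    static final class Category {
        final long id;
        final String name;
        Category(long id, String name) { this.id = id; this.name = name; }
    }

    // Rebuilds Category objects from two parallel lists: position i of `ids`
    // and position i of `names` are assumed to describe the same category.
    static List<Category> zip(List<Long> ids, List<String> names) {
        if (ids.size() != names.size()) {
            // A sparse relation (a missing id or name) breaks the pairing.
            throw new IllegalArgumentException(
                "lists differ in length: " + ids.size() + " vs " + names.size());
        }
        List<Category> out = new ArrayList<>(ids.size());
        for (int i = 0; i < ids.size(); i++) {
            out.add(new Category(ids.get(i), names.get(i)));
        }
        return out;
    }
}
```

The length check is where the "not sparse" requirement shows up: if an optional value is simply omitted from one list, every later position shifts and the pairing is silently wrong unless a filler value is stored instead.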