Querying by attribute in Riak - java

I'm developing an app with Riak and Java. Basically I want to store news, for which I have an object with these attributes:
public String title;
public String author;
public String URL;
public ArrayList<String> categories;
public String description;
public String release;
It's working properly but now I want to allow users to search for news by keywords.
The problem is that the only queries I can find in the Java client documentation are by primary key, which are done like this:
RiakClient client = RiakClient.newClient(10017, "127.0.0.1");
Location location = new Location(new Namespace("TestBucket"), "TestKey");
FetchValue fv = new FetchValue.Builder(location).build();
FetchValue.Response response = client.execute(fv);
// Fetch object as String
String value = response.getValue(String.class);
System.out.println(value);
client.shutdown();
Is there a way to query by attributes? For example, could you search if a word is in the title?
Because right now the only option I see is to get all the objects from the database and search by hand, which seems very inefficient to me.

Yes, there are two ways to query by attribute: secondary indexes (so-called 2i) or Riak Search. I suggest you start with secondary indexes, which are easy enough. Basically, when you write data, you decide which attributes you want indexed, and you can then query those indexes. Indexes can be numeric or alphanumeric, and you can query ranges.
See https://docs.riak.com/riak/kv/latest/using/reference/secondary-indexes/index.html
An example using 2i with the HTTP API: https://docs.riak.com/riak/kv/latest/developing/api/http/secondary-indexes/index.html
Also check the documentation of the Java client you're using.
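To make the idea concrete, here is a minimal plain-Java sketch of what a secondary index does. It is not the Riak client API: the class name, the method names, and the choice of "author" as the indexed attribute are all assumptions made for illustration. An index of this kind answers exact-match and range lookups on the indexed value, so finding a word inside a title would need either one index entry per keyword or Riak Search.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Plain-Java illustration of a secondary index; NOT the Riak client API. */
final class SecondaryIndexSketch {
    // Primary storage: key -> value, comparable to a bucket.
    private final Map<String, String> objects = new HashMap<>();
    // Sorted index: indexed attribute value -> keys of the objects tagged with it.
    private final TreeMap<String, Set<String>> authorIndex = new TreeMap<>();

    /** Stores an object and, at write time, records the attribute chosen for indexing.
     *  The sketch does not clean up old index entries when a key is overwritten. */
    void put(String key, String value, String author) {
        objects.put(key, value);
        authorIndex.computeIfAbsent(author, a -> new TreeSet<>()).add(key);
    }

    /** Exact-match query: keys whose indexed attribute equals the given value. */
    List<String> keysByAuthor(String author) {
        Set<String> keys = authorIndex.get(author);
        return keys == null ? new ArrayList<String>() : new ArrayList<>(keys);
    }

    /** Range query: keys whose indexed attribute lies in [from, to], both inclusive. */
    List<String> keysByAuthorRange(String from, String to) {
        List<String> result = new ArrayList<>();
        for (Set<String> keys : authorIndex.subMap(from, true, to, true).values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch index = new SecondaryIndexSketch();
        index.put("news1", "first story", "alice");
        index.put("news2", "second story", "bob");
        System.out.println(index.keysByAuthor("alice"));
        System.out.println(index.keysByAuthorRange("alice", "bob"));
    }
}
```

The query returns keys only; the matching objects would then be fetched by key, as in the question's FetchValue snippet.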

Related

What's an efficient way to retrieve chat messages from Realm?

I am using FCM to create a chat app, therefore both tokens and topics are being used. In my application I've created a POJO which extends RealmObject intended for storing the chat messages from different userIDs as well as the ones I've sent, both in private-chats and groups.
But what I can't understand is, how should I frame the Realm Query to retrieve the received messages and the messages I've sent to a UserID.
I'm thinking of trying:
// otherUserId: user ID of the person I'm chatting with; myUserId: my own user ID
RealmResults<TestChatModel> results = mDatabase.where(TestChatModel.class)
        .equalTo("sender", otherUserId)
        .or()
        .equalTo("receiver", otherUserId)
        .and()
        .equalTo("sender", myUserId)
        .or()
        .equalTo("receiver", myUserId)
        .sort("timestamp")
        .findAll();
But that just seems very inefficient and messed up.
My POJO is:
public class TestChatModel extends RealmObject {
private String chatMessage;
private String timestamp;
private String sender;
private String receiver;
private String topicName; // Set to "NA" for a private chat
private int isTopic; // 0 for a private chat, 1 for a group
.
.
.
//Associated constructors and getters and setters
.
.
}
The community's help is much appreciated, thanks in advance !
Your query looks fine. All you can do is write down the query logic and then translate it into Realm query syntax. If your intention is to create a query with the criteria being that:
A specific person is either the sender or receiver, AND
The logged-in user is either the sender or receiver
Then that's probably the best way to do it. This assumes that you have ALL messages to and from everyone in the Realm; if you were to apply some other rule (e.g. that you only have messages including the logged in user in the Realm) then you could ditch clause 2, as this would be implied. An example of this would be if your API only provided messages for a logged in user (which seems like a reasonable scenario). That would improve efficiency and simplify the query.
In terms of other ways to improve efficiency, it's likely (although I have no direct evidence) that using a numerical ID for users rather than a string ID would allow for more efficient comparison and filtering in Realm. This would be preferable, but may depend on your API (again).
Finally, it's probably worth adding 'parentheses' to your query if it remains as above, to ensure the operators are evaluated as you expect (i.e. the AND in the middle of the ORs). This can be accomplished with beginGroup and endGroup in the query (as described in the Realm documentation).
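The two clauses, with the parentheses made explicit, can be modelled in plain Java as below. This is only a sketch of the boolean logic, not Realm's query API; the Message class and the method names are hypothetical stand-ins for the POJO in the question. In a Realm query, each parenthesised half would correspond to one beginGroup/endGroup pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of the grouped condition; a sketch, not Realm's query API. */
final class ConversationFilterSketch {
    /** Minimal stand-in for the chat message POJO from the question. */
    static final class Message {
        final String sender;
        final String receiver;
        final String text;

        Message(String sender, String receiver, String text) {
            this.sender = sender;
            this.receiver = receiver;
            this.text = text;
        }
    }

    /** (sender == other OR receiver == other) AND (sender == me OR receiver == me). */
    static boolean inConversation(Message m, String me, String other) {
        boolean involvesOther = m.sender.equals(other) || m.receiver.equals(other);
        boolean involvesMe = m.sender.equals(me) || m.receiver.equals(me);
        return involvesOther && involvesMe;
    }

    /** Keeps only the messages exchanged between the two users, in input order. */
    static List<Message> conversation(List<Message> all, String me, String other) {
        List<Message> result = new ArrayList<>();
        for (Message m : all) {
            if (inConversation(m, me, other)) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Message> all = Arrays.asList(
                new Message("me", "bob", "hi"),
                new Message("alice", "me", "hello"));
        System.out.println(conversation(all, "me", "bob").size());
    }
}
```

If the store only ever contains messages involving the logged-in user, the involvesMe check is always true and can be dropped, which is the simplification described above.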

Spring boot low performance on reading documents

I have a Spring Boot API linked to a mongodb database.
On a specific route I get the events for a given user (it's querying a big collection with millions of documents); the problem is that I get these documents in more than 20 s, but when I use the mongo shell I get them in 0.5 s.
I've already added an index on the userId (which made it much faster).
I've googled the problem but I haven't found answers about this (or maybe I didn't get the point).
My method does a very basic thing:
public Collection<Event> getEventsForUser(final String tenantId, final String orgId, final String userId) throws EventNotFoundException {
    Collection<MongoEvent> mongoEvents = mongoEventRepository.findByTenantIdAndOrganizationIdAndUserIdIgnoreCase(tenantId, orgId, userId);
    if (mongoEvents != null && !mongoEvents.isEmpty())
        return mongoEvents.stream().map(MongoEvent::getEvent).collect(Collectors.toList());
    throw new EventNotFoundException("Events not found.");
}
Is it normal or is there a solution to optimize the query?
Thanks!

Memory Based Data Storage

I need to load instances of an Account class
class Account {
private String firstName;
private String lastName;
private String email;
...
}
into memory for quick access. I can use a Java collection class to store the data. I also need to search the data. Since the email address needs to be unique, I am thinking of using a map with the email address as the key. That approach won't help if searching by first name and last name is required; for those I could filter the values.
Any better approach?
You can maintain several different Java collections or use in-memory databases with better searching capabilities or Java object databases.
However, see also Coollection. It's interesting.
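As a rough sketch of the "several collections" option, the class below keeps one map keyed by email and a second map from last name to accounts, and updates both on every insert. The class and method names are hypothetical, and the case-insensitive last-name lookup is an assumption about what a search would need.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch: one primary map plus one hand-maintained lookup map. */
final class AccountDirectory {
    static final class Account {
        final String firstName;
        final String lastName;
        final String email;

        Account(String firstName, String lastName, String email) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.email = email;
        }
    }

    // Unique lookup: email -> account.
    private final Map<String, Account> byEmail = new HashMap<>();
    // Non-unique lookup: lower-cased last name -> accounts sharing it.
    private final Map<String, List<Account>> byLastName = new HashMap<>();

    /** Adds the account to both maps; returns false if the email is already taken. */
    boolean add(Account account) {
        if (byEmail.containsKey(account.email)) {
            return false;
        }
        byEmail.put(account.email, account);
        byLastName.computeIfAbsent(normalize(account.lastName), k -> new ArrayList<>()).add(account);
        return true;
    }

    /** Returns the account with this email, or null if there is none. */
    Account findByEmail(String email) {
        return byEmail.get(email);
    }

    /** Returns a copy of the accounts with this last name, ignoring case. */
    List<Account> findByLastName(String lastName) {
        List<Account> found = byLastName.get(normalize(lastName));
        return found == null ? new ArrayList<Account>() : new ArrayList<>(found);
    }

    private static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        AccountDirectory directory = new AccountDirectory();
        directory.add(new Account("Ed", "Smith", "ed.smith@example.com"));
        System.out.println(directory.findByLastName("smith").size());
    }
}
```

The cost of this approach is that every additional searchable field needs its own map, and removals or updates must touch all of them.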
If you want to search for stuff, you want to use an indexed collection like Data Store:
https://github.com/jparams/data-store
Example:
Store<Account> store = new MemoryStore<>() ;
store.add(new Account("Ed", "Smith", "ed.smith@gmail.com"));
store.add(new Account("Fred", "Smith", "fred.johnson@hotmail.com"));
store.add(new Account("Freda", "Pinto", null));
store.index("firstName", Account::getFirstName);
store.index("lastName", Account::getLastName);
Account ed = store.getFirst("firstName", "Ed");
List<Account> smiths = store.get("lastName", "Smith");
With data store you can create case-insensitive indexes and all sorts. If you are doing a lot of lookups, you definitely want to use a library like this.

I am creating a blog app using Java. Do I set Id manually or Auto generate in Mysql DB?

I see that most people recommend letting the DB generate the ID. How do I access the post then? Let's say I click a button to delete a comment: how do I know which post ID this comment belongs to?
public interface PostService {
Long createPost(String title, String content, List categories, Date publishing_date);
Long createComment(String author, String content, Date submission_date);
List getAllPosts();
List getAllCommentsOn(Long post);
boolean existPost(Long post);
}
This is not mandatory, but I suggest using an auto-increment column and making it the PRIMARY KEY. This saves you from maintaining a unique primary column by hand, and the PRIMARY KEY index speeds up your SELECT queries.
I would let it auto generate. Then query for the "most-commented-post"
"select * from Posts order by CommentsCount desc limit 1"
and get it with the ID.
I would prefer the DB to generate it. You can modify your createComment method to accept the post ID:
public interface PostService {
Long createPost(String title, String content, List categories, Date publishing_date);
Long createComment(Long postId,String author, String content, Date submission_date);
List getAllPosts();
List getAllCommentsOn(Long post);
boolean existPost(Long post);
}
When someone adds a comment on a particular post, you should definitely have the post ID available, at least as a hidden field.
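A minimal in-memory sketch of that flow follows. It makes several assumptions: the AtomicLong counters stand in for MySQL's AUTO_INCREMENT, the commentToPost map stands in for a post ID column on the comments table, and the method signatures are simplified compared with the PostService interface above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for a DB with generated IDs; a sketch, not a real DAO. */
final class InMemoryPostService {
    // Counters play the role of AUTO_INCREMENT columns.
    private final AtomicLong nextPostId = new AtomicLong(1);
    private final AtomicLong nextCommentId = new AtomicLong(1);
    private final Map<Long, String> posts = new HashMap<>();
    // commentId -> postId, like a foreign-key column on the comments table.
    private final Map<Long, Long> commentToPost = new HashMap<>();

    /** Creates a post and returns the generated ID so the caller can keep it. */
    Long createPost(String title) {
        long id = nextPostId.getAndIncrement();
        posts.put(id, title);
        return id;
    }

    /** Creates a comment on an existing post; the post ID is stored with it. */
    Long createComment(Long postId, String content) {
        if (!posts.containsKey(postId)) {
            throw new IllegalArgumentException("Unknown post: " + postId);
        }
        long id = nextCommentId.getAndIncrement();
        commentToPost.put(id, postId);
        return id;
    }

    /** Answers "which post does this comment belong to?"; null if unknown. */
    Long postIdOf(Long commentId) {
        return commentToPost.get(commentId);
    }

    /** Deletes a comment by its own ID; returns false if it did not exist. */
    boolean deleteComment(Long commentId) {
        return commentToPost.remove(commentId) != null;
    }

    public static void main(String[] args) {
        InMemoryPostService service = new InMemoryPostService();
        Long postId = service.createPost("Hello");
        Long commentId = service.createComment(postId, "Nice post");
        System.out.println(service.postIdOf(commentId).equals(postId));
    }
}
```

Because each comment carries its post ID, a delete button only needs to submit the comment's own ID; the owning post can be looked up from it.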

Are there alternatives to FileDataModel?

I'm new to Mahout and to the field of big data.
In general data doesn't come as a (long, long, Double) all the time.
So are there alternatives to FileDataModel?
DataModel model = new FileDataModel(new File("Ratings.csv"));
Users and items are identified solely by an ID value in the framework.
Further, this ID value must be numeric; it is a Java long type through
the APIs. A Preference object or PreferenceArray object encapsulates
the relation between user and preferred items (or items and users
preferring them).
I recently faced the same issue. My user IDs were UUIDs, so I had to add an additional table mapping a numeric user ID to the original UUID. Later, checking the documentation, I found the explanation quoted above. As for other implementations of DataModel:
A DataModel is the interface to information about user preferences. An
implementation might draw this data from any source, but a database is
the most likely source. Be sure to wrap this with a
ReloadFromJDBCDataModel to get good performance! Mahout provides
MySQLJDBCDataModel, for example, to access preference data from a
database via JDBC and MySQL. Another exists for PostgreSQL. Mahout
also provides a FileDataModel, which is fine for small applications.
You can build DataModel from Database.
Here is an example for PostgreSQL.
The constructor signature looks like this:
PostgreSQLJDBCDataModel(DataSource dataSource, String preferenceTable, String userIDColumn, String itemIDColumn, String preferenceColumn, String timestampColumn)
Initialization:
PGPoolingDataSource source = new PGPoolingDataSource();
source.setDataSourceName(properties.getProperty("DATABASE_NAME"));
source.setServerName("127.0.0.1");
source.setDatabaseName(properties.getProperty("DATABASE_NAME"));
source.setUser(properties.getProperty("DATABASE_USER"));
source.setPassword(properties.getProperty("DATABASE_PASS"));
source.setMaxConnections(50);
DataModel model = new PostgreSQLJDBCDataModel(
        source,
        "mahout_table",
        "user_id",
        "item_id",
        "preference",
        "timestamp");
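The extra table that maps each UUID to a numeric user ID can be sketched in memory as follows. This is only an illustration of the mapping idea, under the assumption that sequential IDs are acceptable; the class and method names are hypothetical, and a real setup would persist the mapping in the database as described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Sketch of a two-way mapping between UUID user IDs and numeric IDs. */
final class UserIdMapping {
    private final Map<UUID, Long> toNumeric = new HashMap<>();
    private final Map<Long, UUID> toOriginal = new HashMap<>();
    private long nextId = 1;

    /** Returns the numeric ID for this UUID, assigning the next free one on first use. */
    long numericIdFor(UUID uuid) {
        Long existing = toNumeric.get(uuid);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        toNumeric.put(uuid, id);
        toOriginal.put(id, uuid);
        return id;
    }

    /** Returns the original UUID for a numeric ID, or null if none was assigned. */
    UUID originalIdFor(long numericId) {
        return toOriginal.get(numericId);
    }

    public static void main(String[] args) {
        UserIdMapping mapping = new UserIdMapping();
        UUID user = UUID.randomUUID();
        long id = mapping.numericIdFor(user);
        System.out.println(mapping.originalIdFor(id).equals(user));
    }
}
```

The numeric ID is what would go into the preference table, and the reverse lookup turns recommended user IDs back into the application's UUIDs.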
