I'm new to Mahout and to the field of big data.
In general, data doesn't always come as (long, long, double) triples.
So are there alternatives to FileDataModel?
DataModel model = new FileDataModel(new File("Ratings.csv"));
Users and items are identified solely by an ID value in the framework.
Further, this ID value must be numeric; it is a Java long type through
the APIs. A Preference object or PreferenceArray object encapsulates
the relation between user and preferred items (or items and users
preferring them).
I recently faced the same issue. My user IDs were UUIDs, so I had to add an additional table that maps a numeric user ID to the original UUID. Later, while checking the documentation, I found this explanation. Regarding other implementations of DataModel:
A DataModel is the interface to information about user preferences. An
implementation might draw this data from any source, but a database is
the most likely source. Be sure to wrap this with a
ReloadFromJDBCDataModel to get good performance! Mahout provides
MySQLJDBCDataModel, for example, to access preference data from a
database via JDBC and MySQL. Another exists for PostgreSQL. Mahout
also provides a FileDataModel, which is fine for small applications.
You can build a DataModel from a database.
Here is an example for PostgreSQL.
The constructor looks like this:
PostgreSQLJDBCDataModel(DataSource dataSource, String preferenceTable, String userIDColumn, String itemIDColumn, String preferenceColumn, String timestampColumn)
Initialization:
PGPoolingDataSource source = new PGPoolingDataSource();
source.setDataSourceName(properties.getProperty("DATABASE_NAME"));
source.setServerName("127.0.0.1");
source.setDatabaseName(properties.getProperty("DATABASE_NAME"));
source.setUser(properties.getProperty("DATABASE_USER"));
source.setPassword(properties.getProperty("DATABASE_PASS"));
source.setMaxConnections(50);
DataModel model = new PostgreSQLJDBCDataModel(
source,
"mahout_table",
"user_id",
"item_id",
"preference",
"timestamp"
);
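The UUID-to-numeric mapping mentioned at the start of this answer can also be sketched in memory. This is only an illustration, not part of Mahout: the class name UuidIdMapper and its methods are hypothetical, and a real setup would persist the mapping (for example in the extra table described above) so that IDs stay stable across restarts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper (not a Mahout class): assigns each UUID a numeric ID
// that can be used wherever the framework expects a long.
class UuidIdMapper {
    private final Map<UUID, Long> toLong = new HashMap<>();
    private final Map<Long, UUID> toUuid = new HashMap<>();
    private long nextId = 1L;

    /** Returns the numeric ID for the UUID, allocating a new one on first use. */
    synchronized long toLongId(UUID uuid) {
        Long existing = toLong.get(uuid);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        toLong.put(uuid, id);
        toUuid.put(id, uuid);
        return id;
    }

    /** Returns the original UUID, or null if the numeric ID was never allocated. */
    synchronized UUID toUuid(long id) {
        return toUuid.get(id);
    }
}
```

The numeric IDs go into the preference table; recommendations coming back as long values are translated to UUIDs with toUuid().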
I have approximately the following entity:
public class Article {
private String name;
private Long fileId;
}
As you can see, it has a field fileId that contains the id of the associated file, which is also an entity. However, the file does not know anything about the Article, so the only thing that connects them is the fileId field in the Article. Therefore, they must be explicitly linked so as not to get lost. Currently, to get a linked file, I have to make a separate query to the database for each Article. That is, if I want to get a list of 10 Articles, I need to query the database 10 times to get each file by its id. This looks very inefficient. How can this be done better? I use jOOQ rather than JPA, so I can't substitute a file object for the fileId field. Any ideas?
I'm going to make an assumption that your underlying tables are something like this:
create table file (
id bigint primary key,
content blob
);
create table article (
name text,
file_id bigint references file
);
In that case, you can fetch all 10 articles together with their files using a single query like this:
Result<?> result =
ctx.select()
.from(ARTICLE)
.join(FILE).on(ARTICLE.FILE_ID.eq(FILE.ID))
.fetch();
I've been trying to use insert...returning in MySQL with DSL-based table definitions (I'm not using the code generator), and my returned record is always null. From what I've read, I need to specify the identity column in the table definition, but I have no idea how!
Record recordKey = create.insertInto(table("modulerecords"),
field("id"),
field("module_id"),
field("created_date"),
field("created_by"),
field("state"),
field("tag_id"),
field("start_time",Timestamp.class),
field("kill_time", Timestamp.class),
field("feed_guid")
)
.values(null, moduleId, currentTimestamp(),
userId, state, tagId,
new Timestamp(startTime),
new Timestamp(killTime), feedGuid)
.returning(field("id"))
.fetchOne();
The field "id" is auto_increment primary key in the database, but recordKey is always null.
As of jOOQ 3.14, this is possible by specifying the field's datatype as being an identity, which can be done using SQLDataType.INTEGER.identity(true).
So for example, if you had a table with an auto-generating integer id and a string name, you would call:
int id = DSL.using(connection, MYSQL_5_7)
.insertInto(
table("myTable"),
field("name", String.class))
.values("John Smith")
.returning(field("id", SQLDataType.INTEGER.identity(true)))
.fetchAny(field("id", Integer.class));
So for your example, you would do:
Record recordKey = create.insertInto(table("modulerecords"),
field("id"),
field("module_id"),
field("created_date"),
field("created_by"),
field("state"),
field("tag_id"),
field("start_time",Timestamp.class),
field("kill_time", Timestamp.class),
field("feed_guid")
)
.values(null, moduleId, currentTimestamp(),
userId, state, tagId,
new Timestamp(startTime),
new Timestamp(killTime), feedGuid)
.returning(field("id", SQLDataType.INTEGER.identity(true)))
.fetchOne();
See this GitHub comment for more background.
It is highly recommended that you use the code generator to provide all the meta information to the DSL API. You can, of course, skip the code generator and still use the internal APIs that the code generator would otherwise use. Instead of creating your table and field references using the plain SQL API, you'd have to create a TableImpl subclass and override / implement all the relevant methods.
Or, you just use the code generator.
I'm trying to search for a value inside a collection field of unconsumed Corda states.
I'm able to search on a String field using:
Field uniqueAttributeName = MySchema.PersistentIOU.class.getDeclaredField("fieldname");
CriteriaExpression uniqueAttributeEXpression = Builder.equal(uniqueAttributeName, "valueToSearch");
QueryCriteria customCriteria = new QueryCriteria.VaultCustomQueryCriteria(uniqueAttributeEXpression);
result = rpcOps.vaultQueryByCriteria(customCriteria, MyState.class).getStates();
The above works fine when "fieldname" is a String, but I have another field that is a List, and I'm not sure how to search inside a List for a specific value.
Please assist.
After a quick chat with #Roger3cev, we think the best way is to amend your ORM wrapper such that you have a parent - child relationship between the state and the list of fields you want to have in that list. Once you do this, you can use the JDBC connection available to you to query against the child state and then use the relationship to the parent to get the Corda state.
I am new to lotus. I need to get some info from Lotus database with Java. I have database:
Session session = NotesFactory.createSession(host, user, pwd);
Database database = session.getDatabase(server, database);
I have that info:
field - fldContractorCode;
form - form="formAgreement";
For example field is "abcde";
So how can I get info from that database? Do I need to use a search formula? Or which methods do I need to use? Thanks for your help.
UPD
Now I am using such way:
DocumentCollection collection = DATABASE.search("form=\"formAgreement\"");
Document doc = collection.getFirstDocument();
while(doc != null) {
doc.getItemValueString("fldContractorCode");
doc = collection.getNextDocument();
}
It works fine for me, but this approach is not very convenient: to find a document with, for example, field="abcd", I need to iterate over the collection every time.
That is why I am asking for a way to find a document by its field value. I also don't understand what a VIEW in the database is or where to get the VIEW name.
In your existing code, you can just change one line:
DocumentCollection collection = DATABASE.search("form=\"formAgreement\" & fldContractorCode=\"abcd\"");
However, this will be slow if the database contains many documents. For best performance, you should consider using Domino Designer to add a new view to your database and using the getDocumentByKey() method suggested in the other answers. If that is not an option, Simon's suggestion of using the FTSearch() method is faster than the Search() method, but only if a full text index exists for the database. It also has a slightly different syntax for the search string.
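Because the escaped quotes in the search string are easy to get wrong, the formula can be assembled by a small helper. This is a minimal sketch: FormulaBuilder and its methods are hypothetical names, not part of the Notes API, and it assumes that a backslash escapes quotes and backslashes inside a formula text constant.

```java
// Hypothetical helper (not part of the Notes API) that builds the selection
// formula string passed to Database.search().
class FormulaBuilder {
    /** Wraps a value in double quotes, escaping backslashes and quotes (assumed escape rule). */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds: Form = "<form>" & <field> = "<value>" */
    static String formAndField(String form, String field, String value) {
        return "Form = " + quote(form) + " & " + field + " = " + quote(value);
    }
}
```

With it, the call becomes DATABASE.search(FormulaBuilder.formAndField("formAgreement", "fldContractorCode", "abcd")).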
There are a number of ways to get the document.
1. Search for the document from a view, where the first column of the view contains a sorted value of the fldContractorCode.
For example:
String key = "abcde";
View view = db.getView("viewName");
Document doc = view.getDocumentByKey(key, true);
2. You can use the Database FTSearch Method to do a full text search to find the document. You will need the database to have a full text index created.
3. If you know the UNID or notes ID of the document you can use getDocumentByUNID() or getDocumentByID().
Your question is quite broad, so I recommend reading the Infocenter as it details sample code for each method.
http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/topic/com.ibm.designer.domino.main.doc/H_NOTESDATABASE_CLASS_JAVA.html
You will have to drill down to the DOCUMENT (not Form) you want to retrieve the field from.
Lotus Notes has a very easy to understand hierarchical way to get to where you want. You will need to instantiate objects in this sequence:
Session
Database
View
Document
Let's say you have a view called $(sysAgreements) that list all forms "formAgreement".
Its selection formula would be something like this:
SELECT Form="formAgreement"
To get to the document or documents you want you will do something like this:
Session session = NotesFactory.createSession(host, user, pwd);
Database database = session.getDatabase(server, database);
View view = database.getView("$(sysAgreements)");
Document doc = view.getDocumentByKey(VIEW_KEY);
String fieldContent = doc.getItemValueString("fldContractorCode");
There are several ways to retrieve info from a Notes database. This is one of them. Bear in mind that the key used by Notes to search a view with getDocumentByKey is the first sorted column.
If you want to get multiple documents you can use:
DocumentCollection docCol = view.getAllDocumentsByKey(VIEW_KEY);
and then iterate over it.
Avoid using FTSearch because it's slow and a bit painful for Notes. Prefer looking up documents in views.
Another powerful source of help is the Notes help itself. Get the help database from a computer that has the Notes Development Client installed. Pay attention to which help database you pick, though: Notes has three of them (client, development and administration). Development is the one you want.
I have a web application built with Django and Python that interacts with web services (written in Java).
All database management is done by the web services, i.e. all CRUD operations on the actual database go through them.
Now I have to track all user activity on my website in a log table.
For example, if a user posts a new article, the web services create a new row in the Articles table and, alongside that, I need to add a new row to the log table, something like "User: Raman has posted a new article (with ID, title, etc.)".
I have to do this for all objects in my database, such as "Article", "Media", "Comments", etc.
Note: I am using PostgreSQL.
So what is the best way to achieve this? Should I do it in PostgreSQL or in Java, and how?
So, you have UI <-> Web Services <-> DB
Since the web services talk to the DB, and the web services contain the business logic (i.e. I guess you validate stuff there, create your queries and execute them), then the best place to 'log' activities is in the services themselves.
IMO, logging PostgreSQL transactions is a different thing. It's not the same as logging 'user activities' anymore.
EDIT: This still means you create DB schema for 'logs' and write them to DB.
Second EDIT: Catching log worthy events in the UI and then logging them from there might not be the best idea either. You will have to rewrite logging if you ever decide to replace the UI, or for example, write an alternate UI for, say mobile devices, or something else.
For an audit table within the DB itself, have a look at the PL/pgSQL Trigger Audit Example
This logs every INSERT, UPDATE, DELETE into another table.
In your log table you can have various columns, including:
user_id (the user that did the action)
activity_type (the type of activity, such as view or commented_on)
object_id (the actual object that it concerns, such as the Article or Media)
object_type (the type of object; this can be used later, in combination with object_id to lookup the object in the database)
This way, you can keep track of all actions the users do. You'd need to update this table whenever something happens that you wish to track.
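Since another answer suggests doing the logging in the Java web services, the columns listed above could map to a small value class there. This is only a sketch under assumptions: ActivityLogEntry and ActivityLog are hypothetical names, the message format is made up, and the in-memory list stands in for an INSERT into the log table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class mirroring the suggested log table columns.
class ActivityLogEntry {
    final long userId;          // user_id: the user that did the action
    final String activityType;  // activity_type: e.g. "posted" or "commented_on"
    final long objectId;        // object_id: the object the action concerns
    final String objectType;    // object_type: e.g. "Article" or "Media"

    ActivityLogEntry(long userId, String activityType, long objectId, String objectType) {
        this.userId = userId;
        this.activityType = activityType;
        this.objectId = objectId;
        this.objectType = objectType;
    }

    /** Human-readable form, e.g. "user 1 posted Article #7". */
    String describe() {
        return "user " + userId + " " + activityType + " " + objectType + " #" + objectId;
    }
}

// Hypothetical in-memory log; a real service would write each entry to the database.
class ActivityLog {
    private final List<ActivityLogEntry> entries = new ArrayList<>();

    void record(long userId, String activityType, long objectId, String objectType) {
        entries.add(new ActivityLogEntry(userId, activityType, objectId, objectType));
    }

    List<ActivityLogEntry> entries() {
        return entries;
    }
}
```

Each service method that creates or changes an object would call record() right after the change succeeds, so the log row and the tracked row are written in the same place.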
Whenever we had to do this, we overrode signals for every model and possible action.
https://docs.djangoproject.com/en/dev/topics/signals/
You can have the signal do whatever you want, from injecting some HTML into the page, to making an entry in the database. They're an excellent tool to learn to use.
I used django-audit-log and I am very satisfied.
Django-audit-log can track multiple models, each in its own additional table. All of these tables have a fairly uniform structure, so it should be straightforward to create a SQL view that shows data for all models.
Here is what I've done to track a single model ("Pauza"):
class Pauza(models.Model):
    started = models.TimeField(null=True, blank=False)
    ended = models.TimeField(null=True, blank=True)
    #... more fields ...
    audit_log = AuditLog()
If you want changes to show in Django Admin, you can create an unmanaged model (but this is by no means required):
class PauzaAction(models.Model):
    started = models.TimeField(null=True, blank=True)
    ended = models.TimeField(null=True, blank=True)
    #... more fields ...
    # fields added by Audit Trail:
    action_id = models.PositiveIntegerField(primary_key=True, default=1, blank=True)
    action_user = models.ForeignKey(User, null=True, blank=True)
    action_date = models.DateTimeField(null=True, blank=True)
    action_type = models.CharField(max_length=31, choices=(('I', 'create'), ('U', 'update'), ('D', 'delete'),), null=True, blank=True)
    pauza = models.ForeignKey(Pauza, db_column='id', on_delete=models.DO_NOTHING, default=0, null=True, blank=True)

    class Meta:
        db_table = 'testapp_pauzaauditlogentry'
        managed = False
        app_label = 'testapp'
The table testapp_pauzaauditlogentry is created automatically by django-audit-log; the class above merely defines a model for displaying data from it.
It may be a good idea to throw in some crude tamper protection:
class PauzaAction(models.Model):
    # ... all like above, plus:
    def save(self, *args, **kwargs):
        raise Exception('Permission Denied')

    def delete(self, *args, **kwargs):
        raise Exception('Permission Denied')
As I said, I imagine you could create a SQL view with the four action_ fields and an additional 'action_model' field containing a varchar reference to the model itself (maybe just the original table name).