Connect to MongoDB using Apache Mahout - Java

I'm trying to generate recommendations using Apache Mahout, with MongoDB providing the data model via MongoDBDataModel. My code is as follows:
import java.net.UnknownHostException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import com.mongodb.MongoException;
public class usingMongo {
    public static void main(String[] args) throws UnknownHostException, MongoException, TasteException {
        final long startTime = System.nanoTime();
        MongoDBDataModel model = new MongoDBDataModel("AdamsLaptop", 27017,
                "test", "ratings100k", false, false, null);
        System.out.println("connected to mongo ");
        UserSimilarity UserSim = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, UserSim, model);
        UserBasedRecommender UserRecommender = new GenericUserBasedRecommender(model, neighborhood, UserSim);
        List<RecommendedItem> UserRecommendations = UserRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : UserRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " USER");
        }
        ItemSimilarity ItemSim = new PearsonCorrelationSimilarity(model); //LogLikelihoodSimilarity(model);
        GenericItemBasedRecommender ItemRecommender = new GenericItemBasedRecommender(model, ItemSim);
        List<RecommendedItem> ItemRecommendations = ItemRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : ItemRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " ITEM");
        }
        final long duration = System.nanoTime() - startTime;
        System.out.println(duration);
    }
}
I can't see where I've gone wrong, but after numerous changes and lots of trial and error the error message remains the same:
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.getID(MongoDBDataModel.java:743)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.buildModel(MongoDBDataModel.java:570)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.<init>(MongoDBDataModel.java:245)
at recommender.usingMongo.main(usingMongo.java:24)
Any suggestions? Here's an example of my data within MongoDB:
{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"),"userId" : 1, "movieId" : 292, "rating" : 4, "timestamp" : 847116936 }

I successfully integrated MongoDB data with Mahout.
The structure of the data in MongoDB depends on the kind of similarity algorithm you use. For example:
UserSimilarity
MongoDBDataModel datamodel = new MongoDBDataModel("127.0.0.1", 27017, "testing", "ratings", true, true, null);
where user_id and item_id are integer values, preference is a float value, and created_at is a timestamp.
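For example, a rating document in that shape would look something like this (values illustrative, mirroring the sample in the question):
{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"), "user_id" : 1, "item_id" : 292, "preference" : 4.0, "created_at" : 847116936 }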
SVDRecommender
where user_id and item_id are MongoDB ObjectIds, preference is a float value, and created_at is a timestamp.
The obvious troubleshooting step is to check whether the MongoDB server is running; judging by the exception, it is. I think the problem lies in the structure of your data.
Use user_id instead of userId, item_id instead of movieId, and preference instead of rating. I don't know if this will make any difference. I used one of the tutorials online, but can't find it at the moment.
It's working, but it's too slow when I have more than 10,000 users with 1,000 items.

I think the problem is that Mahout assumes default names for some fields that need to exist in your MongoDB collection: the user ID, item ID, and preference fields are expected to be user_id, item_id, and preference. The solution might lie in using another MongoDBDataModel constructor that lets you pass the names of those fields as parameters, or in redesigning your collection schema.
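If you go the constructor route, here is a minimal sketch, assuming the mahout-integration version in use exposes the long MongoDBDataModel constructor that accepts the field names (check the javadoc of your version; the parameter order may differ):
MongoDBDataModel model = new MongoDBDataModel(
        "AdamsLaptop", 27017, "test", "ratings100k",
        false, false, null,             // manage, finalRemove, date format
        null, null,                     // user, password (no authentication)
        "userId", "movieId", "rating",  // field names as they exist in your collection
        null);                          // mapping collection
That way the documents keep userId/movieId/rating and Mahout is told where to look, instead of you renaming every field.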
I hope that makes sense.

How to pass the "values" attribute as an array of strings in the payload object with Google's AutoML Tables?

We are using the Google Cloud service AutoML Tables for online prediction.
We have created, trained, and deployed the model. The model gives predictions through the Google console. We are now trying to integrate this model into our Java code.
We are not able to pass the "values" attribute as an array of strings in the payload object in Java, and we haven't found anything about this in the documentation.
Please find the link we are using for this:
https://cloud.google.com/automl-tables/docs/samples/automl-tables-predict
Please find the JSON object in the screenshot.
How do we pass the "values" attribute as an array of strings in the payload object?
Thanks.
Based on the reference you are following, to populate "values" you need to define it in main(). Refer to the Value.Builder class if you need to set numbers, nulls, etc.
List<Value> values = new ArrayList<>();
values.add(Value.newBuilder().setStringValue("This is test data.").build());
// add more elements in values as needed
This values list is then used in a Row, which accepts an iterable of protobuf values. See Row.newBuilder().addAllValues():
Row row = Row.newBuilder().addAllValues(values).build();
Using these, the payload is complete and a prediction request can be built:
ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();
PredictRequest request =
    PredictRequest.newBuilder()
        .setName(name.toString())
        .setPayload(payload)
        .putParams("feature_importance", "true")
        .build();
PredictResponse response = client.predict(request);
Your full prediction code should look like this:
import com.google.cloud.automl.v1beta1.AnnotationPayload;
import com.google.cloud.automl.v1beta1.ExamplePayload;
import com.google.cloud.automl.v1beta1.ModelName;
import com.google.cloud.automl.v1beta1.PredictRequest;
import com.google.cloud.automl.v1beta1.PredictResponse;
import com.google.cloud.automl.v1beta1.PredictionServiceClient;
import com.google.cloud.automl.v1beta1.Row;
import com.google.cloud.automl.v1beta1.TablesAnnotation;
import com.google.protobuf.Value;
import com.google.protobuf.NullValue;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
class TablesPredict {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String modelId = "TBL9999999999";
    // Values should match the input expected by your model.
    List<Value> values = new ArrayList<>();
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setStringValue("blue-colar").build());
    values.add(Value.newBuilder().setStringValue("married").build());
    values.add(Value.newBuilder().setStringValue("primary").build());
    values.add(Value.newBuilder().setStringValue("no").build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setStringValue("yes").build());
    values.add(Value.newBuilder().setStringValue("yes").build());
    values.add(Value.newBuilder().setStringValue("cellular").build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
    values.add(Value.newBuilder().setStringValue("unknown").build());
    predict(projectId, modelId, values);
  }

  static void predict(String projectId, String modelId, List<Value> values) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient client = PredictionServiceClient.create()) {
      // Get the full path of the model.
      ModelName name = ModelName.of(projectId, "us-central1", modelId);
      Row row = Row.newBuilder().addAllValues(values).build();
      ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();
      // Feature importance gives you visibility into how the features in a specific prediction
      // request informed the resulting prediction. For more info, see:
      // https://cloud.google.com/automl-tables/docs/features#local
      PredictRequest request =
          PredictRequest.newBuilder()
              .setName(name.toString())
              .setPayload(payload)
              .putParams("feature_importance", "true")
              .build();
      PredictResponse response = client.predict(request);
      System.out.println("Prediction results:");
      for (AnnotationPayload annotationPayload : response.getPayloadList()) {
        TablesAnnotation tablesAnnotation = annotationPayload.getTables();
        System.out.format(
            "Classification label: %s%n", tablesAnnotation.getValue().getStringValue());
        System.out.format("Classification score: %.3f%n", tablesAnnotation.getScore());
        // Get features of top importance
        tablesAnnotation
            .getTablesModelColumnInfoList()
            .forEach(
                info ->
                    System.out.format(
                        "\tColumn: %s - Importance: %.2f%n",
                        info.getColumnDisplayName(), info.getFeatureImportance()));
      }
    }
  }
}
For testing purposes I used Google's test dataset (gs://cloud-ml-tables-data/bank-marketing.csv) and ran the code above to send a prediction request.

Text to Speech live Tweets using TTS.lib

I'm trying to get a specific user's tweets into Processing and then have them spoken aloud using the TTS library. So far I've managed to get the tweets into Processing and printed the way I want them. But adding the TTS part is proving problematic, given my novice-level skills.
What happens at the moment is that I receive this error message:
The method speak(String) in the type TTS is not applicable for the arguments (String[])
Anyone have any ideas? Help would be greatly appreciated. Thanks.
import twitter4j.util.*;
import twitter4j.*;
import twitter4j.management.*;
import twitter4j.api.*;
import twitter4j.conf.*;
import twitter4j.json.*;
import twitter4j.auth.*;
import guru.ttslib.*;
import java.util.*;
TTS tts = new TTS();

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey("XXXX");
cb.setOAuthConsumerSecret("XXXX");
cb.setOAuthAccessToken("XXXX");
cb.setOAuthAccessTokenSecret("XXXX");

java.util.List statuses = null;
Twitter twitter = new TwitterFactory(cb.build()).getInstance();
String userName = "#BBC";

try {
  statuses = twitter.getUserTimeline(userName);
}
catch (TwitterException e) {
  // If the fetch fails, statuses stays null and the loop below will throw,
  // so at least log the problem instead of swallowing it silently.
  e.printStackTrace();
}

// Size the array from the number of tweets actually returned.
String[] twArray = new String[statuses.size()];
for (int i = 0; i < statuses.size(); i++) {
  Status status = (Status) statuses.get(i);
  //println(status.getUser().getName() + ": " + status.getText());
  twArray[i] = status.getUser().getName() + ": " + status.getText();
}
println(twArray);
tts.speak(twArray);
The error says it all: the tts.speak() function takes a single String value, but you're giving it a String[] array.
In other words, you should only be passing in a single tweet at a time; instead, you're passing in every tweet at once. The function doesn't know how to handle that, so you get the error.
You need to pass in a single String value. How you do that depends on what exactly your goal is. You might just pass in the first tweet:
tts.speak(twArray[0]);
Or you might pass in each tweet one at a time:
for (String tweet : twArray) {
  tts.speak(tweet);
}
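Or, if you want the whole timeline spoken as one block, you could join the array into a single String first using Processing's built-in join() function (a small sketch; pick whatever separator reads best):
tts.speak(join(twArray, " "));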

Insert text data values (e.g. date, time, pH value) into MongoDB every 3 hours automatically using Java

I have a MongoDB database, and I need to insert text data values (e.g. date, time, pH value) every 3 hours automatically using Java.
I need help with this.
I have made a MongoDB database called project and a collection called WaterMoneteringSystem.
Here is the basic layout of my Java-MongoDB integration:
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;
public class MongoDBJDBC {
   public static void main(String args[]) {
      try {
         // To connect to mongodb server
         MongoClient mongoClient = new MongoClient("localhost", 27017);
         // Now connect to your databases
         DB db = mongoClient.getDB("project");
         System.out.println("Connect to database successfully");
         DBCollection coll = db.getCollection("WaterMoneteringSystem");
         System.out.println("Collection WaterMoneteringSystem selected successfully");
         // Note: the builder chain must end with a semicolon, not a dot
         BasicDBObject doc = new BasicDBObject("title", "Watermoneteringsystem")
            .append("Date", "date")
            .append("time", "time")
            .append("value", "ph");
         coll.insert(doc);
         System.out.println("Document inserted successfully");
      } catch (Exception e) {
         System.err.println(e.getClass().getName() + ": " + e.getMessage());
      }
   }
}
I am not sure how the data (date, time, pH) from the text box can be inserted.
Thank you.
Instead of:
BasicDBObject doc = new BasicDBObject("title", "Watermoneteringsystem")
   .append("Date", "date")
   .append("time", "time")
   .append("value", "ph");
You need something like:
BasicDBObject doc = new BasicDBObject("title", "Watermoneteringsystem")
   .append("Date", new Date())
   .append("value", "ph");
The new Date() call will set today's date and time in there (you don't need separate fields for date and time).
But the "ph" bit needs to come from some other place - where are you getting the ph values from? Is it going to be an argument (i.e. in args[])? Is it coming from a file, or external system?
Assuming it's something you can pass in to the method, you can do something like:
String phValue = args[0];
BasicDBObject doc = new BasicDBObject("title", "Watermoneteringsystem")
   .append("Date", new Date())
   .append("value", phValue);
I'd suggest renaming the field to something like "ph" or "phValue" as well, since "value" is not a helpful field name.
Take a look at the MongoDB Java driver documentation; there are more examples there of how to use MongoDB from Java.
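As for the "every 3 hours automatically" part, which none of the snippets above cover: one simple approach is a ScheduledExecutorService. Below is a minimal sketch using the same legacy driver API as above; readPhValue() is a hypothetical placeholder for however you actually obtain the pH reading (sensor, file, text box, etc.):
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
public class ScheduledPhLogger {
   public static void main(String[] args) {
      MongoClient mongoClient = new MongoClient("localhost", 27017);
      DB db = mongoClient.getDB("project");
      final DBCollection coll = db.getCollection("WaterMoneteringSystem");
      ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
      // Run once immediately, then repeat every 3 hours. The executor's
      // non-daemon thread keeps the JVM alive between runs.
      scheduler.scheduleAtFixedRate(new Runnable() {
         public void run() {
            BasicDBObject doc = new BasicDBObject("title", "Watermoneteringsystem")
               .append("Date", new Date())
               .append("ph", readPhValue());
            coll.insert(doc);
            System.out.println("Inserted: " + doc);
         }
      }, 0, 3, TimeUnit.HOURS);
   }
   // Hypothetical placeholder - replace with however you read the pH value.
   static String readPhValue() {
      return "7.0";
   }
}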

Fetching weather data using java and the wunderground api

I'm trying to fetch some weather data using Java. I am using the following Java API for fetching the data from wunderground.com:
https://code.google.com/p/wunderground-core/
The example code they give on their website works fine for Dortmund in Germany. However, when I change the station ID from Dortmund to Boston in the USA, I get null pointer errors. Any idea what I could be doing wrong? Please try it and leave comments/advice. Thanks!
Code:
import de.mbenning.weather.wunderground.api.domain.DataSet;
import de.mbenning.weather.wunderground.api.domain.WeatherStation;
import de.mbenning.weather.wunderground.api.domain.WeatherStations;
import de.mbenning.weather.wunderground.impl.services.HttpDataReaderService;
public class weather {
   public static void main(String[] args) {
      // create an instance of a wunderground data reader
      HttpDataReaderService dataReader = new HttpDataReaderService();
      // select a wunderground weather station (ID "INORDRHE72" = Dortmund-Mengede)
      WeatherStation weatherStation = WeatherStations.ALL.get("INORDRHE72");
      // KMABOSTO22 is the ID for Boston South End
      //WeatherStation weatherStation = WeatherStations.ALL.get("KMABOSTO32");
      // set selected weather station to data reader
      dataReader.setWeatherStation(weatherStation);
      // get current (last) weather data set from selected station
      DataSet current = dataReader.getCurrentData();
      // print selected weather station ID
      System.out.println(weatherStation.getStationId());
      // print city, state and country of weather station
      System.out.println(weatherStation.getCity() + " " + weatherStation.getState() + " " + weatherStation.getCountry());
      // print datetime of measure and temperature ...
      System.out.println(current.getDateTime() + " " + current.getTemperature());
   }
}
Check out the source code of the wunderground-core API:
svn checkout http://wunderground-core.googlecode.com/svn/trunk/ wunderground-core-read-only
In the package de.mbenning.weather.wunderground.api.domain there is a class called WeatherStations. There you will find all the weather stations you can reference in your code.
Right now there are only a few:
public static final Map<String, WeatherStation> ALL = new HashMap<String, WeatherStation>();
static {
    ALL.put("INRWKLEV2", INRWKLEV2_KLEVE);
    ALL.put("INORDRHE110", INORDRHE110_GOCH);
    ALL.put("IDRENTHE48", IDRENTHE48_COEVORDEN);
    ALL.put("IZEELAND13", IZEELAND13_GOES);
    ALL.put("INORDRHE72", INORDRHE72_DORTMUND);
    ALL.put("INOORDBR35", INOORDBR35_BOXMEER);
}
All others won't work.
It works: you can instantiate any weather station that is registered on Wunderground.
It's possible to set the station ID as a constructor parameter:
WeatherStation aWeatherStation = new WeatherStation("INORDRHE72");
HttpDataReaderService dataReader = new HttpDataReaderService();
dataReader.setWeatherStation(aWeatherStation);
Double currentTemperature = dataReader.getCurrentData().getTemperature();
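Applied to the Boston station from the question, that would look like this (a sketch; it assumes "KMABOSTO22" is a valid, active station ID registered on Wunderground):
// Hypothetical station ID - verify it on wunderground.com first
WeatherStation boston = new WeatherStation("KMABOSTO22");
HttpDataReaderService reader = new HttpDataReaderService();
reader.setWeatherStation(boston);
DataSet current = reader.getCurrentData();
System.out.println(current.getDateTime() + " " + current.getTemperature());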

Easiest way to obtain database metadata in Java?

I'm familiar with the java.sql.DatabaseMetaData interface, but I find it quite clunky to use. For example, in order to find out the table names, you have to call getTables and loop through the returned ResultSet, using well-known literals as the column names, as in the sketch below.
Is there an easier way to obtain database metadata?
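For illustration, this is roughly the boilerplate I mean (a minimal sketch of the plain JDBC approach):
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
public static void printTableNames(Connection connection) throws SQLException {
   DatabaseMetaData metaData = connection.getMetaData();
   // Loop over the ResultSet, pulling columns out by well-known string literals.
   ResultSet tables = metaData.getTables(null, null, "%", new String[] { "TABLE" });
   while (tables.next()) {
      System.out.println(tables.getString("TABLE_NAME"));
   }
   tables.close();
}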
It's easily done using DdlUtils:
import javax.sql.DataSource;
import org.apache.ddlutils.Platform;
import org.apache.ddlutils.PlatformFactory;
import org.apache.ddlutils.model.Database;
import org.apache.ddlutils.platform.hsqldb.HsqlDbPlatform;
public void readMetaData(final DataSource dataSource) {
   final Platform platform = PlatformFactory.createNewPlatformInstance(dataSource);
   final Database database = platform.readModelFromDatabase("someName");
   // Inspect the database as required; it has objects like Table/Column/etc.
}
Take a look at SchemaCrawler (free and open source), which is another API designed for this purpose. Some sample SchemaCrawler code:
// Create the options
final SchemaCrawlerOptions options = new SchemaCrawlerOptions();
// Set what details are required in the schema - this affects the
// time taken to crawl the schema
options.setSchemaInfoLevel(SchemaInfoLevel.standard());
options.setShowStoredProcedures(false);
// Sorting options
options.setAlphabeticalSortForTableColumns(true);
// Get the schema definition
// (the database connection is managed outside of this code snippet)
final Database database = SchemaCrawlerUtility.getDatabase(connection, options);
for (final Catalog catalog : database.getCatalogs())
{
  for (final Schema schema : catalog.getSchemas())
  {
    System.out.println(schema);
    for (final Table table : schema.getTables())
    {
      System.out.print("o--> " + table);
      if (table instanceof View)
      {
        System.out.println(" (VIEW)");
      }
      else
      {
        System.out.println();
      }
      for (final Column column : table.getColumns())
      {
        System.out.println("  o--> " + column + " (" + column.getType() + ")");
      }
    }
  }
}
http://schemacrawler.sourceforge.net/
