Deeplearning4j difficulty using LibSVM files with CNNs - java

I am using the LibSVM record reader to load sparse data into neural networks.
This worked fine with an MLP model, but when I tried to load the data into one of the example CNNs:
ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
.trainingWorkspaceMode(WorkspaceMode.SINGLE).inferenceWorkspaceMode(WorkspaceMode.SINGLE)
//.trainingWorkspaceMode(WorkspaceMode.SEPARATE).inferenceWorkspaceMode(WorkspaceMode.SEPARATE)
.weightInit(WeightInit.RELU)
.activation(Activation.LEAKYRELU)
.updater(Updater.ADAM)
.convolutionMode(ConvolutionMode.Same)
.regularization(true).l2(0.0001)
.learningRate(0.01)
.graphBuilder()
.addInputs("input")
.addLayer("cnn3", new ConvolutionLayer.Builder()
.kernelSize(3, vectorSize)
.stride(1, vectorSize)
.nIn(1)
.nOut(cnnLayerFeatureMaps)
.build(), "input")
.addLayer("cnn4", new ConvolutionLayer.Builder()
.kernelSize(4, vectorSize)
.stride(1, vectorSize)
.nIn(1)
.nOut(cnnLayerFeatureMaps)
.build(), "input")
.addLayer("cnn5", new ConvolutionLayer.Builder()
.kernelSize(5, vectorSize)
.stride(1, vectorSize)
.nIn(1)
.nOut(cnnLayerFeatureMaps)
.build(), "input")
.addVertex("merger", new MergeVertex(), "cnn3", "cnn4", "cnn5")
.addLayer("globalPool", new GlobalPoolingLayer.Builder()
.poolingType(globalPoolingType)
.dropOut(0.5)
.build(), "merger")
.addLayer("out", new OutputLayer.Builder()
.lossFunction(LossFunctions.LossFunction.MCXENT)
.activation(Activation.SOFTMAX)
.nIn(3*cnnLayerFeatureMaps)
.nOut(classes.length)
.build(), "globalPool")
.setOutputs("out")
.setInputTypes(InputType.convolutionalFlat(32,45623,1))
.build();
I got an error that seems to be saying that it was getting 2-dimensional data, but it needs 3-dimensional data (with the third dimension being a trivial one).
Exception in thread "main" java.lang.IllegalArgumentException: Invalid input: expect output columns must be equal to rows 32 x columns 45623 x channels 1 but was instead [32, 45623]
How do I give it the 1 channel dimension?
Failing that, how do I get the CNN to recognize channel-less data, or how do I give a CNN sparse data?
Thank you

The typical problem you run into when setting up CNNs is setting the input type wrong. Deeplearning4j's equivalent of an "input layer" is an input type, which configures the network for common data layouts such as RNN input or flattened CNN input. If you are using CNNs, you typically want the InputType.convolutionalFlat method.
It takes a flat vector and converts it into a proper 1-channel tensor for use with CNNs. If you set an input type, it will also automatically set things like the number of inputs and outputs for you.
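To make the flat-to-tensor conversion concrete, here is a minimal plain-Java sketch of the index arithmetic for a single example. It does not call Deeplearning4j; it assumes a row-major, channels-first layout (channels, height, width), and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class FlatToCnnInput {
    /**
     * Reshapes one flat feature vector into [channels][height][width],
     * assuming row-major order with the channel index varying slowest.
     */
    static double[][][] toChannelsHeightWidth(double[] flat, int height, int width, int channels) {
        if (flat.length != channels * height * width) {
            throw new IllegalArgumentException(
                "Expected " + (channels * height * width) + " values but got " + flat.length);
        }
        double[][][] out = new double[channels][height][width];
        for (int c = 0; c < channels; c++) {
            for (int h = 0; h < height; h++) {
                for (int w = 0; w < width; w++) {
                    out[c][h][w] = flat[(c * height + h) * width + w];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] flat = {1, 2, 3, 4, 5, 6};
        // A 2 x 3 "image" with a single channel.
        double[][][] image = toChannelsHeightWidth(flat, 2, 3, 1);
        System.out.println(Arrays.deepToString(image));
        // prints [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
    }
}
```

The length check mirrors the kind of mismatch reported in the question: the flat vector must hold exactly height x width x channels values per example.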

Related

CNN for Sentiment Analysis using TFLearn model for Android to classify user input

I have a CNN model for text classification that uses a pre-trained GloVe embedding. I have frozen the graph, optimized it for inference, and am using it in Android Studio. The problem occurs when I try to pass the input into the model for inference. I have a JSON file with key-value pairs mapping words to embeddings, which I use to create an input of embeddings from the text that the user types in. I can already get the embeddings from the JSON file, but when I try to feed them into the graph for inference, I get the following error:
java.lang.IllegalArgumentException: indices[0,3891] = -2 is not in [0, 7459)
[[Node: EmbeddingLayer/embedding_lookup = Gather[Tindices=DT_INT32,
Tparams=DT_FLOAT, _class=["loc:#EmbeddingLayer/W"],
validate_indices=false,
_device="/job:localhost/replica:0/task:0/device:CPU:0"]
(EmbeddingLayer/W/read, EmbeddingLayer/Cast)]]
The Android code is on my GitHub:
https://github.com/sushiboo/testNN1
The main code that gives me problems is the classify method:
private void classify(float[] input){
TFInference = new TensorFlowInferenceInterface(getAssets(), MODEL_FILE);
TFInference.feed(INPUT_NODE, input, 1, input.length);
TFInference.run(OUTPUT_NODES);
float[] resu = new float[2];
TFInference.fetch(OUTPUT_NODE, resu);
tvResult.setText("Programmer: " + Float.toString(resu[0]) + "\n Construction" + Float.toString(resu[1]));
Log.e("Result: ", Float.toString(resu[0]));
}
The failure occurs at this call:
TFInference.run(OUTPUT_NODES);
In the error message, the number 7459 is the input dimension of the embedding layer.
I am really confused as to what is happening here, but I know that indices[0,3891] = -2 plays some part in this.
The problem was with the model. I have fixed that one, and now I am stuck on another error.
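The error reports that position [0,3891] of the index tensor holds -2, while the embedding lookup only accepts values in [0, 7459). The node names in the message (EmbeddingLayer/Cast feeding EmbeddingLayer/embedding_lookup) suggest that the fed floats are cast to integer indices, so one way to narrow this kind of error down is to check the input array before calling run. A minimal sketch, assuming the input is meant to hold word indices rather than embedding values; the class and method names are hypothetical.

```java
final class EmbeddingIndexCheck {
    /**
     * Returns the first position whose value, truncated to an int,
     * falls outside [0, vocabSize), or -1 if every value is valid.
     */
    static int firstInvalidPosition(float[] input, int vocabSize) {
        for (int i = 0; i < input.length; i++) {
            int index = (int) input[i]; // truncates toward zero
            if (index < 0 || index >= vocabSize) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        float[] input = {3f, 10f, -2.4f};
        // 7459 is the embedding input dimension taken from the error message.
        System.out.println(firstInvalidPosition(input, 7459)); // prints 2
    }
}
```

Logging the offending position and value before feeding the tensor makes it easier to tell whether the input preparation or the exported model is at fault.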

Passing data to Tensorflow model in Java

I'm trying to use a TensorFlow model that I trained in Python to score data in Scala (using the TF Java API). For the model, I've used this regression example, with the only change being that I dropped asText=True from export_savedmodel.
My snippet of Scala:
val b = SavedModelBundle.load("/tensorflow/tf-estimator-tutorials/trained_models/reg-model-01/export/1531933435/", "serve")
val s = b.session()
// output = predictor_fn({'csv_rows': ["0.5,1,ax01,bx02", "-0.5,-1,ax02,bx02"]})
val input = "0.5,1,ax01,bx02"
val inputTensor = Tensor.create(input.getBytes("UTF-8"))
val result = s.runner()
.feed("csv_rows", inputTensor)
.fetch("dnn/logits/BiasAdd")
.run()
.get(0)
When I run, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Input to reshape is a tensor with 2 values, but the requested shape has 4
[[Node: dnn/input_from_feature_columns/input_layer/alpha_indicator/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _output_shapes=[[?,2]], _device="/job:localhost/replica:0/task:0/device:CPU:0"](dnn/input_from_feature_columns/input_layer/alpha_indicator/Sum, dnn/input_from_feature_columns/input_layer/alpha_indicator/Reshape/shape)]]
at org.tensorflow.Session.run(Native Method)
at org.tensorflow.Session.access$100(Session.java:48)
at org.tensorflow.Session$Runner.runHelper(Session.java:298)
at org.tensorflow.Session$Runner.run(Session.java:248)
I figure that there's a problem with how I've prepared my input Tensor, but I'm stuck on how to best debug this.
The error message suggests that the shape of the input tensor in some operation isn't what is expected.
Looking at the Python notebook you linked to (particularly section 8a and 8c), it seems that the input tensor is expected to be a "batch" of string tensors, not a single string tensor.
You can observe this by comparing the shapes of the tensors in your Scala and Python programs (inputTensor.shape() in Scala vs. the shape of csv_rows provided to predict_fn in the Python notebook).
From that, it seems that inputTensor should be a vector of strings, not a single scalar string. To get that, you'd want to do something like:
val input = Array("0.5,1,ax01,bx02")
val inputTensor = Tensor.create(input.map(x => x.getBytes("UTF-8")))
Hope that helps
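For readers using the Java API directly, the same batch can be prepared as a byte[][], one byte[] per CSV row. The sketch below only builds the array and does not call TensorFlow; it assumes, as in the Scala snippet above, that Tensor.create accepts an array of byte[] as a one-dimensional string tensor. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class CsvRowBatch {
    /** Encodes each CSV row as UTF-8 bytes, giving one byte[] per batch element. */
    static byte[][] encodeRows(String... rows) {
        byte[][] batch = new byte[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            batch[i] = rows[i].getBytes(StandardCharsets.UTF_8);
        }
        return batch;
    }

    public static void main(String[] args) {
        byte[][] batch = encodeRows("0.5,1,ax01,bx02", "-0.5,-1,ax02,bx02");
        // The outer length is the batch size; a single byte[] would be a scalar string.
        System.out.println(batch.length); // prints 2
    }
}
```

Even when scoring a single row, wrapping it in a one-element batch keeps the outer dimension that the exported model appears to expect.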

pmml model created from xgboost in R leads to different result than original model in R

I have a ranking task, where my training data looks like this:
session_id item_id item_features target
---------------------------------------------
session1 item1 ... 1
session1 item2 ... 0
...
sessionN item1 ... 0
sessionN itemX ... 10
sessionN itemY ... 0
...
I am using xgboost in R with the objective "rank:pairwise" for training the model. xgboost expects grouped data (same session_id) to be bunched together in the training and test sets. The lines belonging to the same session_id have to be specified using the function setinfo() (e.g. setinfo(model, 'group', group_info)).
When I evaluate the model in R, applying new data works perfectly. However, I have used the package pmml to convert the model into a pmml file in order to use it in Java.
In Java the pmml file gets parsed and evaluated via the org.jpmml pmml-evaluator dependency (v. 1.3.15). Feeding the same data as in R to the org.jpmml.evaluator.Evaluator yields different results, though. The results are mostly negative values, which are not valid in my setup: all predicted targets should be positive.
I have come up with two possible explanations:
1. There might be a bug in the pmml conversion in my scenario.
2. I have no idea where I can apply the equivalent of setinfo() in Java. Since I am only applying the model to a single session at a time, I was under the impression that I did not need to specify it. But maybe I was wrong.
Please contact me for a fully working example including training and test data; I will send it via mail. But for starters, here is the R code for training the model:
library(xgboost)
example_matrix_train <- xgb.DMatrix(X, label = y)
setinfo(example_matrix_train, 'group', example_train_groupInfo)
example.model <- xgboost(data = example_matrix_train, objective = "rank:pairwise", max.depth = 8, eta = 0.2, nthread = 8, nround = 10, verbose=0)
library(pmml)
library(pmmlTransformations)
xgb.dump(example.model, "example.model.dumped.trees")
logfile <- file(paste0("pmml_example_model",Sys.Date(),".txt"), open="a")
sink(logfile)
pmml(example.model, inputFeatureNames = colnames(example_train), outputLabelName = "prediction1", xgbDumpFile = "example.model.dumped.trees")
sink()
Any help is welcome
I have come up with two possible explanations: There might be a bug in the pmml conversion
This is the true explanation: the pmml package is producing incorrect PMML for XGBoost models. The technical reason is that it uses the XGBoost text dump file as input, but the information contained therein is incomplete (e.g. rounded threshold values).
If you're looking to export XGBoost models into PMML, then you should be using the r2pmml package, which is using XGBoost binary files as input.
In truth, the 'pmml' package currently does not support the 'rank:pairwise' objective function you need. The upcoming release of the 'pmml' package (version 1.5.3) includes a check for unsupported objective functions.

Java - xgboost DMatrix input

When creating a DMatrix in Java with the xgboost4j package, I first succeeded in creating the matrix from a file path:
DMatrix trainMat = new DMatrix("...\\xgb_training_input.csv");
But when I try to train the model:
Booster booster = XGBoost.train(trainMat, params, round, watches, null, null);
I get the following error:
...regression_obj.cc:108: label must be in [0,1] for logistic regression
My data is solid; I've checked it with an xgb model built in Python.
I'm guessing the problem is with the data format somehow.
Currently the format is as follows:
x1,x2,x3,x4,x5,y
where x1 to x5 are real numbers and y is either 0 or 1. The file extension is .csv.
Maybe the separator shouldn't be ','?
DMatrix expects a file in LibSVM format, which can easily be created with Python.
A LibSVM line looks like this:
target 0:column1 1:column2 2:column3 ... and so on
So the target comes first, and every other column (predictor) follows as index:value, with increasing indices.
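The conversion can also be done in Java. The following is a minimal sketch that turns one CSV row in the question's layout (features first, label last) into a line in the format described above. It assumes the file has no header row and no missing values, and it follows the 0-based indices shown in this answer; the class and method names are hypothetical.

```java
import java.util.StringJoiner;

final class CsvToLibSvm {
    /** Converts one CSV row "x1,...,xn,y" into "y 0:x1 1:x2 ... (n-1):xn". */
    static String toLibSvmLine(String csvRow) {
        String[] cells = csvRow.trim().split(",");
        int labelPos = cells.length - 1; // the label is the last column
        StringJoiner out = new StringJoiner(" ");
        out.add(cells[labelPos].trim());
        for (int i = 0; i < labelPos; i++) {
            out.add(i + ":" + cells[i].trim());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLibSvmLine("0.5,1.2,-3.0,4.1,0.0,1"));
        // prints 1 0:0.5 1:1.2 2:-3.0 3:4.1 4:0.0
    }
}
```

Applying this to every line of the CSV and writing the results to a new file gives an input that can be passed to the DMatrix constructor by path.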

How to load libsvm model into Android

I have generated a model file from a model trained in MATLAB, and I would like to load it in an Android app on a mobile device.
The model file looks like this, shown with the parameters (which should be correct) and the first three SVs:
svm_type 0
kernel_type 2
gamma 3.3636
coef0 0
nr_class 2
total_sv 1106
rho -0.7401
label 0 1
nr_sv 754 352
SV
0 1:8.02710 2:8.90538 3:9.56450 4:10.15383
0 1:7.87334 2:8.71629 3:9.41049 4:9.45693
0 1:8.52795 2:9.19652 3:10.17247 4:10.30913 ...
However, when I load this using svm.svm_load_model(), the resulting model is null:
FileReader fIn = new FileReader("mymodel.txt");
BufferedReader bufferedReader = new BufferedReader(fIn);
svm_model model = svm.svm_load_model(bufferedReader);
I can't seem to find the problem, anyone got an answer?
Thx
EDIT: I figured out what the error is. The model file output from MATLAB is apparently not fully compatible with the Android load_model function: the values for the keys svm_type and kernel_type have to be specified as strings instead of numerals (c_svc instead of 0, rbf instead of 2).
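The fix described in the edit can be automated before calling svm_load_model. A minimal sketch, assuming the numeric codes follow LIBSVM's own ordering of SVM and kernel types and that only the svm_type and kernel_type header lines need rewriting; the class and method names are hypothetical.

```java
final class LibSvmHeaderFixer {
    // Names in the order of LIBSVM's numeric codes (0, 1, 2, ...).
    private static final String[] SVM_TYPES =
        {"c_svc", "nu_svc", "one_class", "epsilon_svr", "nu_svr"};
    private static final String[] KERNEL_TYPES =
        {"linear", "polynomial", "rbf", "sigmoid", "precomputed"};

    /** Rewrites "svm_type 0" or "kernel_type 2" to the string form; other lines pass through. */
    static String fixLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length == 2 && parts[1].matches("\\d")) {
            int code = Integer.parseInt(parts[1]);
            if (parts[0].equals("svm_type") && code < SVM_TYPES.length) {
                return "svm_type " + SVM_TYPES[code];
            }
            if (parts[0].equals("kernel_type") && code < KERNEL_TYPES.length) {
                return "kernel_type " + KERNEL_TYPES[code];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(fixLine("svm_type 0"));    // prints svm_type c_svc
        System.out.println(fixLine("kernel_type 2")); // prints kernel_type rbf
        System.out.println(fixLine("gamma 3.3636"));  // prints gamma 3.3636
    }
}
```

Running each line of the MATLAB output through fixLine and passing the rewritten text to a BufferedReader leaves the support vectors and the remaining parameters untouched.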
You can do it because libsvm is written in C/C++, so you can use a wrapper (interfaced via JNI or similar) to access the C-based libsvm library on Android.
For example, you can use this wrapper: https://github.com/yctung/AndroidLibSvm
After you load the project, edit the "AndroidLibSvm/app/src/main/jni/jnilibsvm.cpp" file.
In this file you can load the model file with
model=svm_load_model(modelFile);
You can also access other libsvm functions as needed.
