I am using Java to serve a TensorFlow model trained with Python. That model has two inputs. The code is the following:
def predict(float32InputShape: (Long, Long),
            float32Inputs: Seq[Seq[Float]],
            uint8InputShape: (Long, Long),
            uint8Inputs: Seq[Seq[Byte]]): Array[Float] = {
  val float32Input = Tensor.create(
    Array(float32InputShape._1, float32InputShape._2),
    FloatBuffer.wrap(float32Inputs.flatten.toArray)
  )
  val uint8Input = Tensor.create(
    classOf[UInt8],
    Array(uint8InputShape._1, uint8InputShape._2),
    ByteBuffer.wrap(uint8Inputs.flatten.toArray)
  )
  val tfResult = session
    .runner()
    .feed("serving_default_float32_Input", float32Input)
    .feed("serving_default_uint8_Input", uint8Input)
    .fetch("PartitionedCall")
    .run()
    .get(0)
    .expect(classOf[java.lang.Float])
  tfResult
}
What I would like to do is refactor that method to make it more generic, passing the inputs like with feed_dict in Python. That is, something like:
def predict2(inputs: Map[String, Seq[Seq[Float]]]): Array[Float] = {
  ...
  session
    .runner()
    .feed(inputs)
  ...
}
Where the key of the inputs map would be the name of the input layer. It's not possible to do so with the feed method unless I make a macro (which I want to avoid).
Is there any way to do this with the Java API of TensorFlow (I'm using TF 2.0)?
Edit:
I found the solution (thanks to @geometrikal's answer). The code is in Scala, but it shouldn't be too hard to do the same in Java.
val runnerWithInputLayers = inputs.foldLeft(session.runner()) {
  case (runner, (layerName, array)) =>
    val tensor = createTensor(array)
    runner.feed(layerName, tensor)
}
val output = runnerWithInputLayers
  .fetch(outputLayer)
  .run()
  .get(0)
  .expect(classOf[java.lang.Float])
It's possible because the .feed method returns a Session.Runner with the input layer provided.
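For reference, the createTensor helper isn't shown in the edit; its body below is only a sketch (assuming every input is a non-empty 2-D Seq[Seq[Float]], rows by columns), not the original implementation:

// Hypothetical helper: builds a float32 tensor from a 2-D Seq of values.
// Assumes at least one non-empty row; the shape is (rows, columns).
def createTensor(array: Seq[Seq[Float]]): Tensor[java.lang.Float] =
  Tensor.create(
    Array(array.length.toLong, array.head.length.toLong),
    FloatBuffer.wrap(array.flatten.toArray)
  )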
You can feed each input in a loop. I'm not so familiar with the Java API, but the pseudo-code is something like:
var runner = session.runner()
for ((key, value) <- inputs) {
  runner = runner.feed(key, value)
}
val tfResult = runner
  .fetch("PartitionedCall")
  .run()
  .get(0)
  .expect(classOf[java.lang.Float])
Remember you can break up the function chain at any point, e.g. result = foo.bar().baz().qux() can be written temp = foo.bar().baz(); result = temp.qux()
I just started Kotlin so please be nice :)
I have a class that is responsible for fetching some data and notifying the main activity that it needs to update its UI.
So I have made a function in my DataProvider.kt:
fun getPeople(fromNetwork: Boolean, results: ((persons: Array<Person>, error: MyError?) -> Unit)) {
    // do some stuff
    val map = hashMapOf(
        "John" to "Doe",
        "Jane" to "Smith"
    )
    val p = Person(map)
    val persons: Array<Person> = arrayOf(p)
    results(persons, null)
}
So I want to call this from my activity, but I can't find the right syntax:
DataProvider.getPeople(
true,
results =
)
I have tried many things, but I just want to get my array of persons and my optional error so I can update the UI.
The goal is to perform async code in my data provider so my activity can wait for it.
Any ideas? Thank you very much for any help.
This really depends on how you define the callback method. If you use a standalone function, use the :: operator. First (of course), I should explain the syntax:
( // these parentheses are technically not necessary
    (persons: Array<Person>, error: MyError?) // defines the input arguments: an Array of Person and a nullable MyError
    -> Unit // defines the return type: Unit is the equivalent of void in Java (meaning no return type)
)
So the method is defined as:
fun callback(persons: Array<CustomObject>, error: Exception?) {
    // Do whatever
}
And you call it like:
DataProvider.getPeople(
    true,
    results = this::callback
)
However, if you use anonymous callback functions, it's slightly different. This uses a lambda as well:
getPeople(true, results = { persons, error ->
    // the braces define a function literal; `persons, error` are its input arguments
    // do whatever
})
Yes, Kotlin has a great way of using callback functions; below is an example of how I use them:
fun addMessageToDatabase(message: String, fromId: String, toId: String,
                         addedMessageSuccessHandler: () -> Unit,
                         addedMessageFailureHandler: () -> Unit) {
    val latestMessageRef = mDatabase.getReference("/latest-messages/$fromId/$toId")
    latestMessageRef.setValue(message).addOnSuccessListener {
        addedMessageSuccessHandler.invoke()
    }.addOnFailureListener {
        addedMessageFailureHandler.invoke()
    }
}
And finally you can utilise the new callbacks with the following code:
databaseManager.addMessageToDatabase(message, fromId, toId,
    addedMessageSuccessHandler = {
        // your success action
    },
    addedMessageFailureHandler = {
        // your failure action
    })
So basically when I successfully add a new row to my database I'm invoking a success or a failure response to the caller of the service. Hopefully this will help out someone.
How can I port the following Java inner (anonymous) function, which is fully self-contained, to Scala?
JavaPairRDD<Envelope, HashSet<Point>> castedResult = joinListResultAfterAggregation.mapValues(new Function<HashSet<Geometry>, HashSet<Point>>() {
    @Override
    public HashSet<Point> call(HashSet<Geometry> spatialObjects) throws Exception {
        HashSet<Point> castedSpatialObjects = new HashSet<Point>();
        Iterator spatialObjectIterator = spatialObjects.iterator();
        while (spatialObjectIterator.hasNext()) {
            castedSpatialObjects.add((Point) spatialObjectIterator.next());
        }
        return castedSpatialObjects;
    }
});
return castedResult;
My approach as outlined below would not compile due to a NotInferredU error:
val castedResult = joinListResultAfterAggregation.mapValues(new Function[java.util.HashSet[Geometry], java.util.HashSet[Point]]() {
  def call(spatialObjects: java.util.HashSet[Geometry]): java.util.HashSet[Point] = {
    val castedSpatialObjects = new java.util.HashSet[Point]
    val spatialObjectIterator = spatialObjects.iterator
    while (spatialObjectIterator.hasNext)
      castedSpatialObjects.add(spatialObjectIterator.next.asInstanceOf[Point])
    castedSpatialObjects
  }
})
When asking a question about compilation errors please provide the exact error, especially when your code doesn't stand on its own.
The inner function itself is fine; my guess would be that due to changes above joinListResultAfterAggregation isn't a JavaPairRDD anymore, but a normal RDD[(Envelope, Something)] (where Something could be java.util.HashSet, scala.collection.Set or some subtype), so its mapValues takes a Scala function, not a org.apache.spark.api.java.function.Function. Scala functions are written as lambdas: spatialObjects: Something => ... (the body will depend on what Something actually is, and the argument type can be omitted in some circumstances).
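For illustration only (a sketch based on that assumption, not part of the original answer), the lambda form could look like this, keeping the element-by-element copy from the Java version:

// mapValues on a plain RDD takes a Scala function; the Java HashSet is rebuilt by hand.
val castedResult = joinListResultAfterAggregation.mapValues { spatialObjects =>
  val castedSpatialObjects = new java.util.HashSet[Point]()
  val it = spatialObjects.iterator()
  while (it.hasNext) castedSpatialObjects.add(it.next().asInstanceOf[Point])
  castedSpatialObjects
}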
How about this?
val castedResult = joinListResultAfterAggregation.mapValues(spatialObjects =>
  spatialObjects.map(obj => obj.asInstanceOf[Point])
)
I have Java code calling a Scala method.
Java side code:
List<String> contexts = Arrays.asList(initialContext);
ContextMessage c = ContextMessage.load(contexts);
Scala side code:
def load(contexts: List[String]) = {
  contexts foreach { context => ... }
}
In this case, I get a "scala.collection.immutable.List<String> cannot be applied ..." error message.
I also need to make the type of contexts as general as possible (i.e., Seq) as the load method iterates over the given collection object to process something.
def load(contexts: Seq[String]) = ...
How to solve the two issues?
I would just use JavaConversions and keep my Scala code idiomatic.
// Scala code
object ContextMessage {
  def load(contexts: Seq[String]) = ???
}

// in your Java code
ContextMessage c = ContextMessage.load(JavaConversions.asScalaBuffer(Arrays.asList(initialContext)));
In the case of Scala calling this method, an implicit conversion between java.util.ArrayList and Seq can solve this issue easily:
import scala.collection.mutable.ListBuffer

object ContextMessage extends App {
  implicit def typeConversion(input: java.util.ArrayList[String]) = {
    val res: ListBuffer[String] = new ListBuffer[String]()
    for (i <- 0 to input.size - 1) {
      // println(input.get(i))
      res += input.get(i)
    }
    res
  }

  def load(contexts: Seq[String]) = {
    contexts foreach { c =>
      println(c)
    }
  }

  val x = new java.util.ArrayList[String]()
  x.add("A")
  x.add("B")
  load(x)
}
ContextMessage.main(args)
The result shows:
A
B
First of all, let me be clear that I am very new to Scala and functional programming, so my understanding and implementation may be incorrect or inefficient.
Given a file that looks like this:
type1 param11 param12 ...
type2 param21 param22 ...
type2 param31 param32 ...
type1 param41 param42 ...
...
Basically, each line starts with the type of an object, which can be created from the following parameters on the same line. I'm working on an application that goes through each line, creates an object of the given type, and returns the lists of all the objects.
In Java, my implementation is like this:
public void parse(List<Type1> type1s, List<Type2> type2s, List<String> lines) throws Exception {
    for (String line : lines) {
        if (line.startsWith("type1")) {
            Type1 type1 = Type1.createObj(line);
            type1s.add(type1);
        } else if (line.startsWith("type2")) {
            Type2 type2 = Type2.createObj(line);
            type2s.add(type2);
        } else {
            throw new Exception(String.format("Unknown type %s", line));
        }
    }
}
In order to do the same thing in Scala, I do this:
def parse(lines: List[String]): (List[Type1], List[Type2]) = {
  val type1Lines = lines filter (x => x.startsWith("type1"))
  val type2Lines = lines filter (x => x.startsWith("type2"))
  val type1s = type1Lines map (x => Type1.createObj(x))
  val type2s = type2Lines map (x => Type2.createObj(x))
  (type1s, type2s)
}
As I understand it, while my Java implementation only goes through the list once, the Scala one has to do it three times: to filter type1, to filter type2, and to create objects from them. That means the Scala implementation should be slower than the Java one, right? Moreover, the Java implementation also uses less memory, as it only holds three collections: type1s, type2s and lines. On the other hand, the Scala one holds five: lines, type1Lines, type2Lines, type1s and type2s.
So my questions are:
Is there a better way to rewrite my Scala implementation so that the list is iterated only once?
Using immutable objects means a new object is created every time; does that mean functional programming requires more memory than other paradigms?
Updated: I created a simple test to demonstrate that the Scala program is slower: the program receives a list of Strings with size = 1000000. It iterates through the list and checks each item; if an item starts with "type1", it adds 1 to a list named type1s, otherwise it adds 2 to another list named type2s.
Java implementation:
public static void test(List<String> lines) {
    System.out.println("START");
    List<Integer> type1s = new ArrayList<Integer>();
    List<Integer> type2s = new ArrayList<Integer>();
    long start = System.currentTimeMillis();
    for (String l : lines) {
        if (l.startsWith("type1")) {
            type1s.add(1);
        } else {
            type2s.add(2);
        }
    }
    long end = System.currentTimeMillis();
    System.out.println(String.format("END after %s milliseconds", end - start));
}
Scala implementation:
def test(lines: List[String]) = {
  println("START")
  val start = java.lang.System.currentTimeMillis()
  val type1Lines = lines filter (x => x.startsWith("type1"))
  val type2Lines = lines filter (x => x.startsWith("type2"))
  val type1s = type1Lines map (x => 1)
  val type2s = type2Lines map (x => 2)
  val end = java.lang.System.currentTimeMillis()
  println("END after %s milliseconds".format(end - start))
}
On average, the Java application took 44 milliseconds while the Scala one needed 200 milliseconds.
object ScalaTester extends App {
  val random = new Random

  test((0 until 1000000).toList map { _ => s"type${random nextInt 10}" })

  def test(lines: List[String]) {
    val start = Platform.currentTime
    val m = lines groupBy {
      case s if s startsWith "type1" => "type1"
      case s if s startsWith "type2" => "type2"
      case _ => ""
    }
    println(s"Total type1: ${m("type1").size}; Total type2: ${m("type2").size}; time=${Platform.currentTime - start}")
  }
}
The real advantage of Scala (and functional programming in general) is the ability to process data by transforming one structure into another.
Of course you can combine mappings, flatMaps, filters, groupings and so forth in a single line of code; it results in a single data collection.
You may also do it one step after another, creating a new collection each time. That does produce a little overhead, but does anyone really care? Even though you create extra collections, Scala-style programming helps you design parallel-oriented code (as Niklas already mentioned) and protects you from the very elusive side-effect errors that imperative-style programming is prone to.
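If the single traversal really matters, one option (a minimal sketch using the simplified Int lists from the benchmark above, not code from the original answer) is a foldLeft that routes each line to the right accumulator in one pass:

// One pass over the input: each line goes to one of the two accumulators.
// Note the accumulators end up in reverse input order; reverse them if order matters.
def testSinglePass(lines: List[String]): (List[Int], List[Int]) =
  lines.foldLeft((List.empty[Int], List.empty[Int])) {
    case ((type1s, type2s), line) =>
      if (line.startsWith("type1")) (1 :: type1s, type2s)
      else (type1s, 2 :: type2s)
  }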
I have constructed a query that's essentially a weighted sum of other queries:
val query = new BooleanQuery
for ((subQuery, weight) <- ...) {
  subQuery.setBoost(weight)
  query.add(subQuery, BooleanClause.Occur.MUST)
}
When I query the index, I get back documents with the overall scores. This is good, but I also need to know what the sub-scores for each of the sub-queries were. How can I get those? Here's what I'm doing now:
for (scoreDoc <- searcher.search(query, nHits).scoreDocs) {
  val score = scoreDoc.score
  val subScores = subQueries.map { subQuery =>
    val weight = searcher.createNormalizedWeight(subQuery)
    val scorer = weight.scorer(reader, true, true)
    scorer.advance(scoreDoc.doc)
    scorer.score
  }
}
I think this gives me the right scores, but it seems wasteful to advance to and re-score the document when I know it's already been scored as part of the overall score.
Is there a more efficient way to get those sub-scores?
[My code here is in Scala, but feel free to respond in Java if that's easier.]
EDIT: Here's what things look like after following Robert Muir's suggestion.
The query:
val query = new BooleanQuery
for ((subQuery, weight) <- ...) {
  val weightedQuery = new BoostedQuery(subQuery, new ConstValueSource(weight))
  query.add(weightedQuery, BooleanClause.Occur.MUST)
}
The search:
val collector = new DocScoresCollector(nHits)
searcher.search(query, collector)
for (docScores <- collector.getDocSubScores) {
...
}
The collector:
class DocScoresCollector(maxSize: Int) extends Collector {
  var scorer: Scorer = null
  var subScorers: Seq[Scorer] = null
  val priorityQueue = new DocScoresPriorityQueue(maxSize)

  override def setScorer(scorer: Scorer): Unit = {
    this.scorer = scorer
    // a little reflection hackery is required here because of a bug in
    // BoostedQuery's scorer's getChildren method
    // https://issues.apache.org/jira/browse/LUCENE-4261
    this.subScorers = scorer.getChildren.asScala.map(childScorer =>
      childScorer.child ...some hackery... ).toList
  }

  override def acceptsDocsOutOfOrder: Boolean = false

  override def collect(doc: Int): Unit = {
    this.scorer.advance(doc)
    val score = this.scorer.score
    val subScores = this.subScorers.map(_.score)
    priorityQueue.insertWithOverflow(DocScores(doc, score, subScores))
  }

  override def setNextReader(context: AtomicReaderContext): Unit = {}

  def getDocSubScores: Seq[DocScores] = {
    val buffer = Buffer.empty[DocScores]
    while (this.priorityQueue.size > 0) {
      buffer += this.priorityQueue.pop
    }
    buffer
  }
}
case class DocScores(doc: Int, score: Float, subScores: Seq[Float])

class DocScoresPriorityQueue(maxSize: Int) extends PriorityQueue[DocScores](maxSize) {
  def lessThan(a: DocScores, b: DocScores) = a.score < b.score
}
There is a scorer navigation API: the basic idea is you write a collector and in its setScorer method, where normally you would save a reference to that Scorer to later score() each hit, you can now walk the tree of that Scorer's subscorers and so on.
Note that Scorers have pointers back to the Weight that created them, and the Weight back to the Query.
Using all of this, you can stash away references to the subscorers you care about in your setScorer method, e.g. all the ones created from TermQueries. Then, when scoring hits, you can investigate things like the freq() and score() of those nodes in your collector.
In the 3.x series this is a visitor API limited to boolean relationships, in the 4.x series (as of now only an alpha release), you can just get the child+relationship of each subscorer, so it can work with arbitrary queries (including custom ones you write or whatever).
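As a rough illustration of that 4.x walk (a sketch, not code from the answer), a collector's setScorer could recursively gather the leaf scorers so their score() can be read per hit:

import scala.collection.JavaConverters._
import org.apache.lucene.search.Scorer

// Recursively walk a scorer tree via getChildren and return its leaf scorers.
def leafScorers(scorer: Scorer): Seq[Scorer] = {
  val children = scorer.getChildren.asScala.toSeq
  if (children.isEmpty) Seq(scorer)
  else children.flatMap(child => leafScorers(child.child))
}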
Caveats:
you will need to return false from acceptsDocsOutOfOrder in your collector, as your collector requires this document-at-a-time processing for this to work.
you probably want a bugfix branch of the 3.6 series (http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/) or a snapshot of 4.x (http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/). This is because this functionality generally didn't work, since disjunctions (OR queries) always set their subscorers 'one doc ahead' of the current document until some things were fixed last week, and those fixes didn't make it in time for 3.6.1. See https://issues.apache.org/jira/browse/LUCENE-3505 for more details.
There aren't really any good examples, except some simple tests that sum up the term frequencies of all the leaf nodes (see below).
Tests:
4.x series: http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/search/TestBooleanQueryVisitSubscorers.java
3.x series: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/core/src/test/org/apache/lucene/search/TestBooleanQueryVisitSubscorers.java