Disclaimer: I work on a non-traditional project, so don't be shocked if some assumptions seem absurd.
Context
I wish to create a stream reader for integers, strings, and the other common types in Scala, but to start with I focus only on integers. Also note that I'm not interested in handling exceptions at the moment -- I'll deal with them in due time, this will be reflected in the API, and in the meantime I make the huge assumption that failures won't occur.
The API should be relatively simple, but due to the nature of the project I'm working on, I can't rely on some features of Scala, and the API needs to look something like this (slightly simplified for the purpose of this question):
object FileInputStream {
  def open(filename: String): FileInputStream =
    new FileInputStream(
      try {
        // Check whether the stream can be opened or not
        val out = new java.io.FileReader(filename)
        out.close()
        Some[String](filename)
      } catch {
        case _: Throwable => None[String]
      }
    )
}

case class FileInputStream(var filename: Option[String]) {
  def close: Boolean = {
    filename = None[String]
    true // This implementation never fails
  }

  def isOpen: Boolean = filename.isDefined

  def readInt: Int = nativeReadInt

  private def nativeReadInt: Int = {
    ??? // TODO
  }
}

object StdIn {
  def readInt: Int = nativeReadInt

  private def nativeReadInt: Int = {
    ??? // TODO
  }
}
Please also note that I cannot rely on additional fields in this class, with the exception of Int variables. This (probably) implies that the stream has to be opened and closed for every operation. Hence, it goes without saying that the implementation will not be efficient, but this is not an issue.
The Question
My goal is to implement the two nativeReadInt methods such that each call consumes exactly one integer from the input stream, if one is available straight away. However, if the input (relative to the position of the last read operation) doesn't start with an integer, then nothing should be read and a fixed value can be returned, say -1.
I've explored several high-level Java and Scala standard APIs, but none seemed to offer a trivial way to re-open a stream at a given position. My hope is to avoid implementing low-level parsing based solely on java.io.InputStream and its read() and skip(n) methods.
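For reference, the low-level fallback I'm trying to avoid would look roughly like this (sketch in Java; the Scala port is mechanical). java.io.RandomAccessFile can at least re-open the file and seek to a saved byte offset, with the offset kept in the single Int field my constraints allow; sign handling and overflow are left out of the sketch:

import java.io.IOException;
import java.io.RandomAccessFile;

class FileReadIntSketch {
    private int position = 0; // byte offset of the next unread character

    int nativeReadInt(String filename) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(filename, "r")) {
            raf.seek(position);
            int value = 0;
            boolean sawDigit = false;
            long afterLastDigit = raf.getFilePointer();
            int c;
            while ((c = raf.read()) >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
                sawDigit = true;
                afterLastDigit = raf.getFilePointer();
            }
            if (!sawDigit) {
                return -1; // input does not start with an integer: consume nothing
            }
            position = (int) afterLastDigit; // resume right after the integer next time
            return value;
        }
    }
}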
Additionally, to let the user read from the standard input stream, I need to avoid the scala.io.StdIn.readInt() method, because it reads "an entire line of the default input" and therefore discards potentially useful data.
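For standard input no offset trick exists, so the best fallback I can see is one byte of pushback via java.io.PushbackInputStream, again with hand-rolled digit parsing (sketch; whitespace and sign handling omitted):

import java.io.IOException;
import java.io.PushbackInputStream;

class StdInReadIntSketch {
    // one byte of pushback is enough: we only ever look ahead by one character
    private static final PushbackInputStream in = new PushbackInputStream(System.in);

    static int nativeReadInt() throws IOException {
        int c = in.read();
        if (c < '0' || c > '9') {
            if (c != -1) in.unread(c); // give the non-digit back: consume nothing
            return -1;
        }
        int value = 0;
        while (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            c = in.read();
        }
        if (c != -1) in.unread(c); // stop right before the first non-digit
        return value;
    }
}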
Are you aware of a Java or Scala API that could do the trick here?
Thank you
Related
We need a method to sign messages using signatures that are as short as possible and came across the BLS scheme, which promises rather short-ish signatures. Trying the JPBC implementation, the examples are easy to set up and run, but they lack a rather crucial part: storing and loading the private keys.
The example from the current JPBC BLS website 1 does not contain any storage whatsoever, it just verifies a message using the instances in RAM.
An older example from the same website 2, which is no longer linked there but can be found using search engines, refers to a store method that seems to have since been removed from the library in favour of an implementation without any storage capabilities.
The AsymmetricCipherKeyPair instances (which are what I get from the keygen) are not serializable by themselves, neither are instances of BLS01PublicKeyParameters or BLS01PrivateKeyParameters, with the fields that contain the keys (sk and pk) being private and typed only to the Element interface that doesn't really say much about the contents.
As a workaround, I have implemented a store method, that (stripped of all exception handling) roughly looks like this:
public static void storePrivateKey(AsymmetricCipherKeyPair key, String filename)
        throws FileNotFoundException, IOException {
    // Reflectively grab the private "sk" field of the key parameters
    Field f = key.getPrivate().getClass().getDeclaredField("sk");
    if (f != null) {
        f.setAccessible(true);
        Object fieldContent = f.get(key.getPrivate());
        if (fieldContent != null) {
            byte[] data = null;
            if (fieldContent instanceof ImmutableZrElement) {
                ImmutableZrElement izr = (ImmutableZrElement) fieldContent;
                data = izr.toBytes();
            }
            try (FileOutputStream fos = new FileOutputStream(filename)) {
                fos.write(data);
            }
        }
    }
}
With a similar approach for public keys. That means I'm now down to using reflection to retrieve the contents of a private field in order to store it somewhere. That solution is obviously a hackish collection of all sorts of bad smells, but it's so far the best that I've come up with. I know that writing some bytes to disk shouldn't really be that hard, but I really can't seem to find the proper way to do this. Also, to be blunt, I'm not into crypto: I want to apply this scheme to sign and verify some messages, that is all. I understand that I should dig deeper into the math of the whole approach, but time is limited - which is why I picked a library in the first place.
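The matching load is just as hackish; as far as I can tell, the bytes have to be turned back into an Element through the same pairing parameters used at key generation. A sketch of what I'm doing (curve.properties stands for my parameter file; I'm going by the jPBC javadoc here, so treat the exact calls as assumptions):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import it.unisa.dia.gas.jpbc.Element;
import it.unisa.dia.gas.jpbc.Pairing;
import it.unisa.dia.gas.plaf.jpbc.pairing.PairingFactory;

public static Element loadPrivateKeyElement(String filename) throws IOException {
    // Rebuild the pairing from the same parameters used at key generation.
    Pairing pairing = PairingFactory.getPairing("curve.properties");
    // Element.toBytes() wrote a fixed-length representation of the Zr element;
    // newElementFromBytes() reads it back.
    byte[] data = Files.readAllBytes(Paths.get(filename));
    return pairing.getZr().newElementFromBytes(data).getImmutable();
}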
I've written a program to aid the user in configuring 'mechs for a game. I'm dealing with loading the user's saved data. This data can (and some times does) become partially corrupt (either due to bugs on my side or due to changes in the game data/rules from upstream).
I need to be able to handle this corruption and load as much as possible. To be more specific, the contents of the save file are syntactically correct but semantically corrupt. I can safely parse the file and drop whatever entries that are not semantically OK.
Currently my data parser will just show a modal dialog with an appropriate warning message. However, displaying the warning is not the job of the parser, and I'm looking for a way of passing this information to the caller.
Some code to show approximately what is going on (in reality there is a bit more going on than this, but this highlights the problem):
class Parser {
    // returns the parsed entry so that loadFromFile below can append it
    public Entry parse(XMLNode aNode) {
        // ...
        if (corrupted) {
            JOptionPane.showMessageDialog(null, "Corrupted data found",
                    "error!", JOptionPane.WARNING_MESSAGE);
            // Keep calm and carry on
        }
        // ...
    }
}

class UserData {
    static UserData loadFromFile(File aFile) {
        UserData data = new UserData();
        Parser parser = new Parser();
        XMLDoc doc = fromXml(aFile);
        for (XMLNode entry : doc.allEntries()) {
            data.append(parser.parse(entry));
        }
        return data;
    }
}
The thing here is that, bar an IOException or a syntax error in the XML, loadFromFile will always succeed in loading something, and this is the desired behavior. Somehow I just need to pass the information about what (if anything) went wrong to the caller. I could return a Pair<UserData,String>, but this doesn't look very pretty. Throwing an exception will obviously not work in this case.
Does anyone have any ideas on how to solve this?
Depending on what you are trying to represent, you can use a class, like SQLWarning from the java.sql package. When you have a java.sql.Statement and call executeQuery, you get a java.sql.ResultSet, and you can then call getWarnings on the result set directly, or even on the statement itself.
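For example (stmt being an existing java.sql.Statement; the loop walks the warning chain):

ResultSet rs = stmt.executeQuery("SELECT * FROM users");
// The query itself succeeded; non-fatal problems show up as chained warnings.
for (SQLWarning w = rs.getWarnings(); w != null; w = w.getNextWarning()) {
    System.err.println("warning: " + w.getMessage());
}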
You can use an enum, like RefUpdate.Result, from the JGit project. When you have a org.eclipse.jgit.api.Git you can create a FetchCommand, which will provide you with a FetchResult, which will provide you with a collection of TrackingRefUpdates, which will each contain a RefUpdate.Result enum, which can be one of:
FAST_FORWARD
FORCED
IO_FAILURE
LOCK_FAILURE
NEW
NO_CHANGE
NOT_ATTEMPTED
REJECTED
REJECTED_CURRENT_BRANCH
RENAMED
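For instance (git being an existing org.eclipse.jgit.api.Git instance; call() may throw GitAPIException, and I'm hedging against the JGit API here):

FetchResult result = git.fetch().setRemote("origin").call();
for (TrackingRefUpdate update : result.getTrackingRefUpdates()) {
    RefUpdate.Result outcome = update.getResult();
    System.out.println(update.getLocalName() + " -> " + outcome);
}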
In your case, you could even use a boolean flag:
class UserData {
    public boolean isCorrupt();
}
But since you mentioned there is a bit more than that going on in reality, it really depends on your model of "corrupt". However, you will probably have more options if you have a UserDataReader that you can instantiate, instead of a static utility method.
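For instance, a hypothetical UserDataReader could collect warnings much like the SQLWarning chain does (all names below are made up for the sketch):

import java.io.File;
import java.util.ArrayList;
import java.util.List;

class ParseWarning {
    final String message;
    ParseWarning(String message) { this.message = message; }
}

class UserDataReader {
    private final List<ParseWarning> warnings = new ArrayList<ParseWarning>();

    UserData read(File file) {
        UserData data = new UserData();
        // parse entries; on a semantically corrupt one, record and skip:
        // warnings.add(new ParseWarning("dropped entry X: unknown item id"));
        return data;
    }

    // the caller decides what to do with these (dialog, log, ignore)
    List<ParseWarning> getWarnings() { return warnings; }
}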
Am I correct to suppose that, within the bounds of the same process, having 2 threads reading/writing to a named pipe does not block the reader/writer at all? So with wrong timings it's possible to miss some data?
And in the case of several processes - the reader will wait until some data is available, and the writer will be blocked until the reader has read all the data supplied by the writer?
I am planning to use a named pipe to pass several (tens, hundreds) of files from an external process and consume them in my Java application. Writing simple unit tests to use one thread for writing to the pipe, and another one for reading from the pipe, resulted in sporadic test failures because of missing data chunks.
I think it's because of the threading within the same process, so my test is not correct in general. Is this assumption correct?
Here is some sort of example which illustrates the case:
import java.io.{FileOutputStream, FileInputStream, File}
import java.util.concurrent.Executors
import org.apache.commons.io.IOUtils
import org.junit.runner.RunWith
import org.scalatest.FlatSpec
import org.scalatest.junit.JUnitRunner

@RunWith(classOf[JUnitRunner])
class PipeTest extends FlatSpec {

  def md5sum(data: Array[Byte]) = {
    import java.security.MessageDigest
    MessageDigest.getInstance("MD5").digest(data).map("%02x".format(_)).mkString
  }

  "Pipe" should "block here" in {
    val pipe = new File("/tmp/mypipe")
    val srcData = new File("/tmp/random.10m")
    val md5 = "8e0a24d1d47264919f9d47f5223c913e"
    val executor = Executors.newSingleThreadExecutor()
    executor.execute(new Runnable {
      def run() {
        (1 to 10).foreach {
          id =>
            val fis = new FileInputStream(pipe)
            assert(md5 === md5sum(IOUtils.toByteArray(fis)))
            fis.close()
        }
      }
    })
    (1 to 10).foreach {
      id =>
        val is = new FileInputStream(srcData)
        val os = new FileOutputStream(pipe)
        IOUtils.copyLarge(is, os)
        os.flush()
        os.close()
        is.close()
        Thread.sleep(200)
    }
  }
}
Without Thread.sleep(200) the test fails for one of these reasons:
broken pipe exception
incorrect MD5 sum
With this delay set, it works just great. I am using a file with 10 megabytes of random data.
This is a very simple race condition in your code: you're writing fixed-size messages to the pipe, and assuming that you can read the same messages back. However, you have no idea how much data is available in the pipe for any given read.
If you prefix your writes with the number of bytes written, and ensure that each read only reads that number of bytes, you'll see that pipes work exactly as advertised.
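A sketch of that framing with DataOutputStream/DataInputStream:

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class Framing {
    // Writer side: prefix each message with its byte count.
    static void writeMessage(DataOutputStream out, byte[] message) throws IOException {
        out.writeInt(message.length);
        out.write(message);
        out.flush();
    }

    // Reader side: consume exactly one framed message, no more, no less.
    static byte[] readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();     // blocks until the 4-byte prefix arrives
        byte[] message = new byte[length];
        in.readFully(message);         // blocks until the whole frame arrives
        return message;
    }
}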
If you have a situation with multiple writers and/or multiple readers, I recommend using an actual message queue. Actually, I recommend using a message queue in any case, as it solves the issue of message boundary demarcation; there's little point in reinventing that particular wheel.
Am I correct to suppose that within the bounds of the same process having 2 threads reading/writing to a named pipe does not block reader/writer at all?
Not unless you are using non-blocking I/O, which you aren't.
So with wrong timings it's possible to miss some data?
Not unless you are using non-blocking I/O, which you aren't.
Guys, I am implementing a simple example of a 2-level cache in Java:
1st level is memory
2nd - filesystem
I am new to Java and I am doing this just to understand caching in Java.
And sorry for my English, this language is not native for me :)
I have completed the 1st level by using the LinkedHashMap class and the removeEldestEntry method, and it looks like this:
import java.util.*;

public class level1 {

    private static final int max_cache = 50;

    private Map<String, String> cache =
        new LinkedHashMap<String, String>(max_cache, .75F, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > max_cache;
            }
        };

    public level1() {
        for (int i = 1; i < 52; i++) {
            String string = String.valueOf(i);
            cache.put(string, string);
            System.out.println("\rCache size = " + cache.size() +
                    "\tRecent value = " + i +
                    " \tLast value = " +
                    cache.get(string) + "\tValues in cache=" +
                    cache.values());
        }
    }
}
Now, I am going to code my 2nd level. What code and methods should I write to implement these tasks (a rough sketch of what I mean follows the list)?
1) When the 1st level cache is full, the value shouldn't simply be removed by removeEldestEntry but should be moved to the 2nd level (to a file).
2) When a new value is added to the 1st level, it should first be looked up in the file (2nd level), and if it exists there it should be moved from the 2nd to the 1st level.
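To make the tasks concrete, here is roughly the shape I have in mind (Level2 is a made-up class standing for the file-backed store I still have to write, with put/get/remove methods):

import java.util.LinkedHashMap;
import java.util.Map;

class Level1 {
    private static final int MAX_CACHE = 50;
    private final Level2 level2 = new Level2(); // hypothetical file-backed store

    private final Map<String, String> cache =
        new LinkedHashMap<String, String>(MAX_CACHE, .75F, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > MAX_CACHE) {
                    level2.put(eldest.getKey(), eldest.getValue()); // task 1: demote
                    return true; // now it is safe to evict from level 1
                }
                return false;
            }
        };

    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = level2.get(key); // task 2: promote on a 2nd-level hit
            if (value != null) {
                level2.remove(key);
                cache.put(key, value); // may in turn demote the eldest entry
            }
        }
        return value;
    }
}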
I also tried to use LRUMap to upgrade my 1st level, but the compiler couldn't find the class LRUMap in the library. What's the problem? Maybe special syntax is needed?
You can either use the built-in Java serialization mechanism and just send your stuff to a file by wrapping a FileOutputStream with an ObjectOutputStream and then calling writeObject().
This method is simple but not flexible enough; for example, you will fail to read an old cache from file if your classes have changed.
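Roughly like this (cache stands for the map from your level1 class; readObject also throws ClassNotFoundException):

import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Store a plain copy of the cache map in a file (copying into a HashMap
// avoids serializing the anonymous LinkedHashMap subclass itself)...
try (ObjectOutputStream out =
        new ObjectOutputStream(new FileOutputStream("cache.level2"))) {
    out.writeObject(new HashMap<String, String>(cache));
}

// ...and restore it later.
try (ObjectInputStream in =
        new ObjectInputStream(new FileInputStream("cache.level2"))) {
    Map<String, String> restored = (Map<String, String>) in.readObject();
}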
You can use serialization to XML, e.g. JAXB or XStream. I used XStream in the past and it worked just fine. You can easily store any collection in a file and then restore it.
Obviously you can store stuff in a DB, but it is more complicated.
A remark: you are not taking thread safety into consideration for your cache! By default LinkedHashMap is not thread-safe and you would need to synchronize your access to it. Alternatively you could use ConcurrentHashMap, which deals with synchronization internally and can by default handle 16 separate threads (you can increase this number via one of its constructors).
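Note, though, that ConcurrentHashMap keeps no access order, so it cannot replace the LRU behaviour directly; for the LinkedHashMap-based cache the simplest route is a synchronized wrapper:

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Every call on `cache` is now synchronized on the wrapper.
Map<String, String> cache = Collections.synchronizedMap(
        new LinkedHashMap<String, String>(50, .75F, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > 50;
            }
        });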
I don't know your exact requirements or how complicated you want this to be but have you looked at existing cache implementations like the ehcache library?
I am passing Yaml created with PyYaml to SnakeYaml, and SnakeYaml does not seem to recognize anything beyond the first line, where !! exists and python/object is declared. I already have identical objects set up in Java. Is there an example out there that shows a loadAll into an object array where the object type is asserted or assigned?
Good call... was away from the computer when I originally posted.
Here is the data from PyYaml that I am trying to use SnakeYaml to get into a Java application:
--- !!python/object:dbmethods.Project.Project {dblogin: kirtstrim7900, dbname: 92218kirtstrim_wfrogls, dbpw: 1234567895#froggy, preference1: '', preference2: '', preference3: '', projName: CheckPoint Firewall Audit - imp, projNo: 1295789430544+CheckPoint Firewall Audit - imp, projectowner: kirtcathey@sysrisk.com, result1label: Evidence, result2label: Recommend, result3label: Report, resultlabel: Response, role: owner, workstep1label: Objective, workstep2label: Policy, workstep3label: Guidance, worksteplabel: Procedure}
Not just a single instance of the above, but several objects, so I need to use loadAll in SnakeYaml... unless somebody knows better.
As for the code, this is all I have from SnakeYaml docs:
for (Object data : yaml.loadAll(sb.toString())) {
    System.out.println(data.toString());
}
Then, this error is thrown:
Exception in thread "AWT-EventQueue-0" Can't construct a java object for tag:yaml.org,2002:java/object: ......
Caused by: org.yaml.snakeyaml.error.YAMLException: Class not found: ......
As you can see from the small code snippet, EVEN without all this information supplied, anybody who knows the answer about how to cast an object arbitrarily could PROBABLY answer the question.
Thx.
Parsed off the two exclamation points (!!) at the beginning of each entry and now I get:
mapping values are not allowed here
in "", line 1, column 73:
as an error. The whole point of using YAML was to reduce coding related to parsing. If I have to turn around and parse incoming and outgoing data for whatever reason, then YAML sucks!! And I will gladly revert back to XML or anything else that will allow a Python middleware to talk to a Java application.
To achieve the same result you may:
configure PyYAML to skip the tag (exactly as you did with the comment "Convert objects to a dictionary of their representation")
configure SnakeYAML to create the object you expect (exactly as you did with "projectData = gson.fromJson(mystr, ProjectData[].class); ")
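For the second option, a sketch (Project stands for your Java class; the tag string must match the one PyYAML emits, so double-check it against your document):

import org.yaml.snakeyaml.TypeDescription;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.Constructor;

// Tell SnakeYAML which Java class the PyYAML tag maps to,
// so loadAll can instantiate it instead of failing on the tag.
Constructor constructor = new Constructor(Project.class);
constructor.addTypeDescription(new TypeDescription(Project.class,
        "tag:yaml.org,2002:python/object:dbmethods.Project.Project"));
Yaml yaml = new Yaml(constructor);
for (Object data : yaml.loadAll(sb.toString())) {
    Project project = (Project) data;
    // use project ...
}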
If you are lost (before you say "it sucks") you may ask a question in the corresponding mailing lists. It may help you to find a proper solution in the future.
Fixed. YAML sucks, so don't use it. All kinds of Google results about how SnakeYAML is derived from PyYaml and what-not, but nobody clearly states exactly what dumps format from PyYaml works with what loadAll routines with SnakeYAML.
Also, performance with YAML is horrid; JSON is far simpler and easier to implement. In Python, where our middleware resides (and most crunching occurs), YAML takes almost twice as long to process as JSON!!
If you are using Python 2.6 or greater, just:

import json

def convert_to_builtin_type(obj):
    print 'default(', repr(obj), ')'
    # Convert objects to a dictionary of their representation
    d = { '__class__': obj.__class__.__name__,
          '__module__': obj.__module__,
        }
    d.update(obj.__dict__)
    return d

# The converter must be defined before dumps() is called with it.
json_doc = json.dumps(projects, default=convert_to_builtin_type)
print json_doc
Then on the Java client (loading) side, use GSon -- this took a lot of head-scratching and searching to figure out, because ALL examples on the 'net are virtually useless. Every blogger with 500 ads per page shows you how to convert one single, stupid object, and the last time I created an app, I used lists, arrays, or anything that held more than one object!!
try {
    serverAddress = new URL("http://127.0.0.1:5000/projects/" + ruser.getUserEmail() + "+++++" + ruser.getUserHash());
    // set up our communications stuff
    connection = null;
    // Set up the initial connection
    connection = (HttpURLConnection) serverAddress.openConnection();
    connection.setRequestMethod("GET");
    connection.setDoOutput(true);
    connection.setReadTimeout(100000);
    connection.connect();
    // get the output stream writer and write the output to the server
    // not needed in this example
    rd = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    sb = new StringBuilder();
    while ((line = rd.readLine()) != null) {
        sb.append(line + '\n');
    }
    String mystr = sb.toString();
    // Now do the magic.
    Gson gson = new Gson();
    projectData = gson.fromJson(mystr, ProjectData[].class);
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (ProtocolException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // close the connection, set all objects to null
    connection.disconnect();
    rd = null;
    sb = null;
    connection = null;
}
return projectData;
Done! In a nutshell - YAML sucks, use JSON!! Also, the HTTP connection code is mostly snipped off of this site... now I need to figure out https.