I am writing a server in Java that allows clients to play a game similar to 20 questions. The game itself is basically a binary tree with nodes that are questions about an object and leaves that are guesses at the object's identity. When the game guesses wrong it needs to be able to get the right answer from the player and add it to the tree. This data is then saved to a random access file.
The question is: how do you go about representing a tree within a file so that the data can be read back as a tree at a later time?
If you know where I can find information on keeping data structures like trees organized as such when writing/reading to files then please link it. Thanks a lot.
Thanks for the quick answers everyone. This is a school project so it has some odd requirements like using random access files and telnet.
This data is then saved to a random access file.
That's the hard way to solve your problem (the "random access" bit, I mean).
The problem you are really trying to solve is how to persist a "complicated" data structure. In fact, there are a number of ways that this can be done. Here are some of them ...
Use Java persistence. This is simple to implement; make sure that your data structure is serializable, and then it's just a few lines of code to serialize and a few more lines to deserialize. The downsides are:
Serialized objects can be fragile in the face of code changes.
Serialization is not incremental. You write/read the whole graph each time.
If you have multiple separate serialized graphs, you need some scheme to name and manage them.
Use XML. This is more work to implement than Java persistence, but it has the advantage of being less fragile. And if something does go wrong, there's a chance you can fix it with XSLT or a text editor. (There are XML "binding" libraries that eliminate a lot of the glue coding.)
Use an SQL database. This addresses all of the downsides of Java persistence, but involves more coding ... and using a different computational model to access the persistent data (query versus graph navigation).
Use a database and an Object Relational Mapping technology; e.g. a JPA or JDO implementation. (Hibernate is a popular choice.) These bridge between the database and in-memory views of data in a more or less transparent fashion, and avoid a lot of the glue code that you need to write in the SQL database and XML cases.
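If the random access file from the question is a fixed requirement, one possible layout is to write the tree in pre-order: a flag saying whether the node is a question, the node's text, and then (for questions only) the "yes" subtree followed by the "no" subtree. The sketch below is only an illustration under that assumption; the class name RafTree, the Node type and the roundTrip helper are made up for the example, and it rewrites the whole tree rather than updating records in place.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical names throughout; an illustrative layout, not a standard one.
class RafTree {
    static final class Node {
        final String text;   // a question (inner node) or a guess (leaf)
        Node yes, no;        // both null for a leaf
        Node(String text) { this.text = text; }
        Node(String text, Node yes, Node no) {
            this.text = text;
            this.yes = yes;
            this.no = no;
        }
    }

    // Pre-order layout: flag (true = question), text, then yes- and no-subtree.
    static void write(RandomAccessFile raf, Node n) throws IOException {
        boolean question = n.yes != null && n.no != null;
        raf.writeBoolean(question);
        raf.writeUTF(n.text);
        if (question) {
            write(raf, n.yes);
            write(raf, n.no);
        }
    }

    // Reads the nodes back in the same order they were written.
    static Node read(RandomAccessFile raf) throws IOException {
        boolean question = raf.readBoolean();
        Node n = new Node(raf.readUTF());
        if (question) {
            n.yes = read(raf);
            n.no = read(raf);
        }
        return n;
    }

    // Saves the tree to a temporary file and loads it back again.
    static Node roundTrip(Node root) {
        try {
            File f = File.createTempFile("tree", ".dat");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                write(raf, root);
                raf.seek(0);   // rewind before reading
                return read(raf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because a leaf is marked explicitly, the reader never has to guess where a subtree ends. Adding a new question after a wrong guess would mean rewriting the file in this layout; fixed-size records with child offsets would allow in-place updates at the cost of more bookkeeping.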
I think you're looking for serialization. Try this:
http://java.sun.com/developer/technicalArticles/Programming/serialization/
As mentioned, serialization is what you are looking for. It allows you to write an object to a file, and read it back later with minimal effort. The file will automatically be read back in as your object type. This makes things much easier than trying to store the object yourself using XML.
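As a rough sketch of what that looks like for the question tree, assuming a hypothetical GuessNode class (the names here are illustrative, not from the question): the node type implements Serializable, and ObjectOutputStream / ObjectInputStream write and read the whole tree in one call. The example serializes to a byte array to stay self-contained; wrapping a FileOutputStream or FileInputStream instead would target a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical node type: a question for inner nodes, a guess for leaves.
class GuessNode implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;
    GuessNode yes, no;   // both null for a leaf

    GuessNode(String text, GuessNode yes, GuessNode no) {
        this.text = text;
        this.yes = yes;
        this.no = no;
    }
}

class GuessTreeIO {
    // Serializes the whole tree reachable from root.
    static byte[] toBytes(GuessNode root) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buf.toByteArray();
    }

    // Rebuilds the tree; the cast is needed because readObject returns Object.
    static GuessNode fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (GuessNode) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Declaring serialVersionUID explicitly is a common way to reduce (not eliminate) breakage when the class changes later.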
Java serialization has some pitfalls (like when you update your class). I would serialize in a text format. JSON is my first choice here, but XML and YAML would work as well.
This way you would have a file that doesn't rely on the binary version of your class.
There are several Java libraries: http://www.json.org
Some examples:
http://code.google.com/p/json-simple/wiki/DecodingExamples
http://code.google.com/p/json-simple/wiki/EncodingExamples
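And to save and read from the file you can use Commons IO:
import org.apache.commons.io.FileUtils;
import java.io.File;
import java.nio.charset.StandardCharsets;
...
File dataFile = new File("yourfile.json");
// Read the whole file into a string
String data = FileUtils.readFileToString(dataFile, StandardCharsets.UTF_8);
// Write a string (here the same data) back to the file
FileUtils.writeStringToFile(dataFile, data, StandardCharsets.UTF_8);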
I'm building an automation framework in Selenium using the Page Object Design Pattern.
Following are some of the data that I'm using and where I have stored them:
PageObjects (xpath, id etc) - In the Page Classes itself
Configuration Data (wait-times, browser type , the URL etc) - In a properties file.
Other data - In a class as static variables.
Once the framework starts growing, it will be hard to store and organize all the data. I did some research on how others store data in their frameworks. Here is what I found:
Storing data (mostly page objects) in classes itself
Storing data in JSON
And some even suggested storing data in a database so that it would reduce reading times
Since there are a lot of options out there, I thought of getting some feedback on what the best way to store data is and how everyone else has stored their data.
JSON or any temporary data storage is the best option, as it is a framework and its purpose is to be reused for different projects.
I don't see any problem with the way you have stored your data.
Locators (by POM definition) should be stored in the page objects themselves.
Config data can be stored in some sort of config file... whatever you find convenient. You can use plain text, JSON, XML, etc. We use XML but that really comes down to personal preference.
I think this is fine also.
The framework doesn't really grow, the automation suite does. As long as you keep the data stored in the 3 places above consistently, I think you should be fine. The only issue I've run into with this approach is that sometimes certain pages have a LOT of functionality on them so the page objects grow quite large. In those cases, we found a way to divide the page into smaller chunks, e.g. one page had 22 tabs, each consisting of a different panel. In that case, we broke the page object into 22 different class files to keep the size more manageable and then hooked them all back into the main page as properties, e.g. mainPage.Panel1.someMethodOnPanel1();
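A minimal sketch of that split, with entirely hypothetical class and method names and no Selenium calls (a real version would pass the WebDriver into each panel): each panel gets its own class, and the main page exposes the panels as fields.

```java
// Hypothetical panel classes; each would normally live in its own file.
class DetailsPanel {
    String title() {
        return "Details";
    }
}

class HistoryPanel {
    String title() {
        return "History";
    }
}

// The main page object only wires the panels together.
class MainPage {
    final DetailsPanel details = new DetailsPanel();
    final HistoryPanel history = new HistoryPanel();
}
```

A test would then read as new MainPage().details.title(), which keeps each panel's locators and methods in one small class.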
I advise using interfaces for each device type to store multiple types of selectors, for example:
import org.openqa.selenium.By;
import static org.openqa.selenium.By.cssSelector;
import static org.openqa.selenium.By.id;
import static org.openqa.selenium.By.linkText;
import static org.openqa.selenium.By.xpath;
public interface DesktopMainPageSelector {
By FIRST_ELEMENT = cssSelector("selector_here");
By SECOND_ELEMENT = xpath("selector_here");
By THIRD_ELEMENT = id("selector_here");
}
Then just implement this interface wherever you need the selectors.
You can also use enums for a more complex structure.
I found this to be the best solution, because it's easy to manage large numbers of selectors.
This is my second post and I am getting used to the function of things on here now!
This is more of a theory question for computer science, but my question is: what does this mean?
'Parsing a text file or data stream'
This is an assignment and the books and web sources I have consulted are old or vague. I have implemented a serializable interface on a SinglyLinkedList which saves/loads the file to/from the disk so it can be transferred/edited and accessed later on. Does this qualify for a sufficient achievement of the rather vague requirement?
Things to note when considering this question:
this requirement is one of many for a project I am doing
the Singly Linked List I am using is custom made - I know, the premade Java one is better, but I must show my skills
all the methods work - I have tested them - it's just a matter of documentation
I am using ObjectOutputStream, FileOutputStream, ObjectInputStream and FileInputStream and the respective methods to read/write the Singly linked list object
I would appreciate the feedback
The process of "parsing" can be described as reading in a data stream of some sort and building an in-memory model or representation of the semantic content of that data, in order to facilitate performing some kind of transformation on the data.
Some examples:
A compiler parses your source code to (usually) build an abstract syntax tree of the code, with the objective of generating object- (or byte-) code for execution by a machine.
An interpreter does the same thing but the syntax tree is then directly used to control execution (some interpreters are a mashup of byte-code generators and virtual machines and may generate intermediate byte-code).
A CSV parser reads a stream structured according to the rules of CSV (commas, quoting, etc) to extract the data items represented by each line in the file.
A JSON or XML parser does a similar operation for JSON- or XML-encoded data, building an in-memory representation of the semantic values of the data items and their hierarchical inter-relationships.
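To make the CSV case concrete, here is a small hand-written sketch (the class name CsvLine is made up for the example) that parses a single CSV record into its fields, treating commas inside double quotes as data and a doubled quote as an escaped quote. It deliberately ignores records that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLine {
    // Splits one CSV record into fields.
    // Handles quoted fields and "" as an escaped quote; no multi-line records.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"');   // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote
            } else if (c == ',') {
                fields.add(cur.toString()); // field boundary
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());        // last field
        return fields;
    }
}
```

The input is a flat stream of characters; the output is a list of values, which is the "in-memory model" in the definition above. Reading a serialized object back with ObjectInputStream fits the same description, with the parsing done by the library.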
For my project, I need to store info about protocols (the data sent (most likely integers) and in the order it's sent) and info that might be formatted something like this:
'ID' 'STRING' 'ADDITIONAL INTEGER DATA'
This info will be read by a Java program and stored in memory for processing, but I don't know what would be the most sensible format to store this data in?
EDIT: Here's some extra information:
1)I will be using this data in a game server.
2)Since it is a game server, speed is not the primary concern, since this data will primarily be read and utilized during startup, which shouldn't occur very often.
3)Memory consumption I would like to keep at a minimum, however.
4)The second data "example" will be used as a "dictionary" to look up names of specific in-game items, their stats and other integer data (and therefore might become very large, unlike the first data containing the protocol information, where each file will only note small protocol bites, like a login protocol for instance).
5)And yes, I would like the data to be "human-editable".
EDIT 2: Here's the choices that I've made:
JSON - For the protocol descriptions
CSV - For the dictionaries
There are many factors that could come to weigh--here are things that might help you figure this out:
1) Speed/memory usage: If the data needs to load very quickly or is very large, you'll probably want to consider rolling your own binary format.
2) Portability/compatibility: Balanced against #1 is the consideration that you might want to use the data elsewhere, with programs that won't read a custom binary format. In this case, your heavy hitters are probably going to be CSV, dBase, XML, and my personal favorite, JSON.
3) Simplicity: Delimited formats like CSV are easy to read, write, and edit by hand. Either use double-quoting with proper escaping or choose a delimiter that will not appear in the data.
If you could post more info about your situation and how important these factors are, we might be able to guide you further.
How about XML, JSON or CSV ?
I've written a similar protocol-specification using XML. (Available here.)
I think it is a good match, since it captures the hierarchical nature of specifying messages / network packages / fields etc. The order of fields is well defined and so on.
I even wrote a code-generator that generated the message sending / receiving classes with methods for each message type in XSLT.
The only drawback as I see it is the verbosity. If you have a really simple structure of the specification, I would suggest you use some simple home-brewed format and write a parser for it using a parser-generator of your choice.
In addition to the formats suggested by others here (CSV, XML, JSON, etc.) you might consider storing the info in a Java properties file. (See the java.util.Properties class.) The code is already there for you, so all you have to figure out is the properties names (or name prefixes) you want to use.
The Properties class also provides for storing/loading properties in a simple XML format.
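A minimal sketch of the properties approach for the item dictionary, assuming a made-up key scheme of item.<id>.name and item.<id>.<stat> (the class ItemDictionary and its helpers are hypothetical). It loads from a string to stay self-contained; a FileReader would work the same way.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper around java.util.Properties.
class ItemDictionary {
    // Parses text in the standard key=value properties format.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumed key scheme: item.<id>.name
    static String name(Properties props, int id) {
        return props.getProperty("item." + id + ".name");
    }

    // Assumed key scheme: item.<id>.<stat>; fails if the key is missing.
    static int stat(Properties props, int id, String stat) {
        return Integer.parseInt(props.getProperty("item." + id + "." + stat).trim());
    }
}
```

All values are stored as strings, so integer data has to be parsed on lookup, and every key/value pair is kept in memory, which matters if the dictionary becomes very large.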
I'm writing my own Document Management System (DMS) in Java (the ones available don't satisfy my needs).
The documents shall be described by the Qualified DublinCore Metadata Standard. The easiest way to do this, in my opinion, is to pack the key-value pairs in an RDF model with an XML representation.
To store the metadata for all documents I have two ideas (the document files will be stored in the filesystem):
Store all metadata of all documents in a single XML file
Make an XML file for each document and store it either in the filesystem or in an RDBMS (like the H2 database engine for Java). A key-value database won't solve this because the keys for one document are not unique.
Since (many) documents are linked to each other, the first approach may be better for analysing the data, but the second approach may be much faster.
Which solution would you recommend? Or are there any better solutions?
Stefan
I don't know how your analysis works, but if you need the complete graph in memory to do your analysis then use variant 1 (store all metadata of all documents in a single XML file), because you will get no gain (only extra work) from variant 2 in this scenario.
added
If the extra work for variant 2 is not too much, then I recommend variant 2, because it can be more scalable.
you could update or add document metadata by writing only a small XML file instead of a huge one
it depends on what XML parser you use, but in some cases it is faster to parse several smaller XML files than one huge one (this strongly depends on the amount of data).
Have you considered using MongoDB and GridFS? http://www.mongodb.org/display/DOCS/GridFS+Specification
You can store your documents directly in MongoDB as binary and even store the associated metadata for that particular file in any format you want. It can store documents even if they have the same name, and it will generate its own unique IDs.
BTW, even if it is not strictly part of your question: have a look at a JCR (Java Content Repository) implementation like Jackrabbit. You could use it to store your documents and maybe your metadata too.
I'd look into a NoSQL document solution like CouchDB to see if it could help you.
I don't like the file system solution; there's no abstraction whatsoever to help you there.
If you are always accessing all documents, neither of your approaches would be slower than the other. But I would recommend the second approach. When it comes to analyzing the data, you'll need to read all documents, so there is no difference if they are in different files or in one file...
I'm trying to find the best way to save the state of a simple application.
From a DB point of view there are 4 or 5 tables with date fields and relationships, of course.
Because the app is simple, and I want the user to have the option of moving the data around (usb pen, dropbox, etc), I wanted to put all data in a single file.
What is the best way/lib to do this?
XML usually is the best format for this (readability & openness), but I haven't found any great lib for this without doing SAX/DOM.
If you want to use XML, take a look at XStream for simple serialization of Java objects into XML. Here is "Two minute tutorial".
If you want something simple, standard Java Properties format can be also a way to store/load some small data.
Consider using plain JAXB annotations, which come with the JDK (up to Java 10; from Java 11 on JAXB is a separate dependency):
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Foo {
@XmlAttribute
private String text = "bar";
}
Here's a blog post of mine that gives more details on this simple usage of JAXB (it also mentions a more "classy" JAXB-based approach, in case you need better control over your XML schema, e.g. to guarantee backwards compatibility).
Two other options you might consider:
HSQLDB is a small SQL database written in Java. More relevant for your purposes, it can be configured to simply write to a CSV file as its data store, so you could conceivably use its text output as a portable datastore and still use SQL, if that's what you prefer.
A second option might be to write the datastore to a serialized file, either directly or through a library like Prevayler. This gives very good performance and is simple to implement; the cons are the fragility and opacity of the format.
But if the data is small enough, xml is probably much less bother.
If you don't need to provide semantic meaning to your data then XML is probably a wrong choice. I would recommend using the fat-free alternative JSON, which is much more naturally built for data structures.