multiple clients querying data with no server - java

I'm working on a school project where the client needs to have multiple users querying and writing to a single data source. The users have access to shared network drives and all functionality has to be in the client application, the IT department won't allow a service to run from one of their servers and external server hosting isn't an option.
The amount of data that actually needs to be stored is actually very little, about 144 rows maximum.
I've looked into using embedded databases, sqllite , hsql , objectdb ... etc but they seem over kill for how little data needs to be saved. It also seemed like with hsql if anyone accessed the database it would be completely locked to any other user. Concurrency wouldn't be much of an issue there will be 5-7 people using the system albeit scarcely only a few times a year.
Would using something like XQuery and serializing everything in xml be a viable option or just simply using the java serializable api?

A distributed, client side database writing files to the shared network drive could be a good solution for this use case. Take a look at Cloud DB, it might be what your looking for.

Related

javax persistence *switch* to flat file

I'm writing a Java EE program that manages client data. The original specification was for it to talk to a hosted database that the client would setup, but now (just before deadline) they're saying they can't get that together and want it to talk to local files instead. (The program will be used in contexts where internet access is infeasible. Also, the user will want to email or dropbox the files around and expresses some hostility to the idea of my starting a mysql server on their local machine.)
The persistence layer is implemented with javax.persistence, configured to talk to a mysql database. What I'd love is to keep the program as similar as possible and switch to configuring my EntityManagerFactory to talk to a flat file if it can't connect to mysql. Is there a jdbc driver that will talk to a flat file?
(Searching on this subject turns up a number of options, but none of them look obviously supported/tested/likely to be robust. If there's a standard solution, it's the one I'm looking for.)

Accessing Data across two Databases

I have large set of data(more than 1TB). This will be accessed by more than 1000 people concurrently. Storing it in one database will make the application really slow. So I was planning to store it across different databases. Does mongo DB support routing between different databases? Or should this in our application? I am developing using Java and use Spring framework to interact with mongo.
Given the reason for splitting your data into multiple databases is to improve performance, I would suggest sharding a single database rather than splitting across multiple. If location is granular enough and you would like to split load across servers you could then use tag aware sharding to pin specific locations or location ranges to a specific server. There is a good tutorial on this available here.
Before following this route I would suggest performing load tests on your application with your database on the hardware you plan to use for your system. It is worth confirming that you really do need to shard/split data and if so the # of servers you may need. If your database is going to be read rather than write intensive it could be that a non-sharded database would handle your load giving your working set fits in memory.

copy data from a mysql database to other mysql database with java

I have developed a small swing desktop application. This app needs data from other database, so for that I've created an small process using java that allows to get the info (using jdbc) from remote db and copy (using jpa) it to the local database, the problem is that this process take a lot of time. is there other way to do it in order to make faster this task ?
Please let me know if I am not clear, I'm not a native speaker.
Thanks
Diego
One good option is to use the Replication feature in MySQL. Please refer to the MySQL manual here for more information.
JPA is less suited here, as object-relational-mapping is costly, and this is bulk data transfer. Here you probably also do not need data base replication.
Maybe backup is a solution: several different approaches listed there.
In general one can also do a mysqldump (on a table for instance) on a cron task compress the dump, and retrieve it.

Database distribution

What are the possibilities to distribute data selectively?
I explain my question with an example.
Consider a central database that holds all the data. This database is located in a certain geographical location.
Application A needs a subset of the information present in the central database. Also, application A may be located in a geographical location different (and maybe far) from the one where the central database is located.
So, I thought about creating a new database at the same location of application A that would contain a subset of information of the central database.
Which technology/product allow me to deploy such a configuration?
Thanks
Look for database replication. SQL Server can do this for sure, others (Oracle, MySQL, ...) should have it, too.
The idea is that the other location maintains a (subset) copy. Updates are exchanged incrementally. The way to treat conflicts depends on your application.
Most major database software such as MySql and SQL server can do the job, but it
is not a good model. With the growth of the application (traffic and users),
not only will you create a load on the central database server (which might be serving
other applications),but you will also be abusing your network bandwidth to transfer data
between the far away database and the application server.
A better model is to keep your data close to the application server, and use the far away
database for backup and recovery purposes only. You can use an FC\IP SAN (or any other
storage network architecture) as your storage network model, based on your applications' needs.
One big question that you didn't address is if Application A needs read-only access to the data or if it needs to be read-write.
The immediate concept that comes to mind when reading your requirements is sharding. In MySQL, this can be accomplished with partitioning. That being said, before you jump into partitions, make sure you read up on their pros and cons. There are instances where partitioning can slow things down if your indexes are not well chosen, or your partitioning scheme is not well thought out.
If your needs are read-only, then this should be a fairly simple solution. You can use MySQL in a Master-Slave context, and use App A off a slave. If you need read-write, then this becomes much more complex.
Depending on your write needs, you can split your reads to your slave, and your writes to the master, but that significantly adds complexity to your code structure (need to deal with multiple connections to multiple dbs). The advantage of this kind of layout is that you don't need to have complex DB infrastructure.
On the flip side, you can keep your code as is, and use a Master-Master replication in MySQL. Although not officially supported by Oracle, a lot of people have had success in this. A quick Google search will find you a huge list of blogs, howtos, etc. Just keep in mind that your code has to be properly written to support this (ex: you cannot use auto-increment fields for PKs, etc).
If you have cash to spend, then you can look at some of the more commercial offerings. Oracle DB and SQL Server both support this.
You can also use Block Based data replication, such as DRDB (and Mysql DRDB) to handle the replication between your nodes, but the problem you always will encounter is what happens if your link between the two nodes fails.
The biggest issue you will encounter is how to handle conflicting updates in 2 separate DB nodes. If your data is geographically dependent, then this may not be an issue for you.
Long story short, this is not an easy (or inexpensive) problem to resolve.
It's important to address the possibility of conflicts at the design phase anytime you are talking about replicating databases.
Moving on from that, SAP's Sybase Replication Server will allow you to do just that, either with Sybase database's or 3rd party databases.
In Sybase's world this is frequently called a corporate roll-up environment. There may be multiple geographically seperated databases each with a subset of data which they have primary control over. At the HQ, there is a server that contains all the various subsets in one repository. You can choose to replicate whole tables, or replicate based on values in individual rows/columns.
This keeps the databases in a loosely consistent state. Transaction rates, Geographic separation, and the latency that can be inherent to network will impact how quickly updates move from one database to another. If a network connection is temporarily down, Sybase Replication Server will queue up transaction, and send them as soon as the link comes back up, but the reliability and stability of the replication system will be affected by the stability of the network connection.
Again, as others have stated it's not cheap, but it's relatively straight forward to implement and maintain.
Disclaimer: I have worked for Sybase, and am still part of the SAP family of companies.

How do I put data into the datastore of Google's app engine?

I have a little application written in php+mysql I want to port to AppEngine, but I just can't find the way to port my mysql data to the datastore.
How am I supposed to save the data into my datastore? Is that even possible? I can only see documentation for persistence of Java objects, does that mean I have to port my database to a bunch of fake objects, one per line?
Edit: I say fake objects because I don't want to use them, they're just a way to get over a shortcoming of the GAE design.
I have a 30 megs table I need to check on every GET, by using objects I would need to create an object for every row, so I'd have a java class of maybe 45 megs with thousands upon thousands of lines like:
Row Row23423 = new Row (123,346,75,34,"a cow");
I just can't believe this is the only way.
Here's an idea, what about populating the data store by POST-ing the objects one by one? I mean, like the posts in a blog. You write a class that generates and persists the data, and then you Curl the url with the data, one by one. Slow, but it may work?
How to upload data with the bulk loader is described here. It's not supported directly in Java yet, but that doesn't have to stop you - just do the following:
Create an app.yaml that looks something like this:
application: myapp
version: upload
runtime: python
api_version: 1
handlers:
- url: /remote_api
script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
login: admin
Make sure the application name is the same as your Java app's, and the version is not the same as the version you're using for Java. Upload this 'empty' app using appcfg.py.
Now, follow the directions for bulk loading in the page linked to above. When it comes time to run the tool, specify the server address with --server=upload.latest.myapp.appspot.com .
Since multiple versions of the same app share the same datastore - even across runtimes - the data uploaded with the Python version will be accessible to the Java one.
There is documentation on the datastore here.
I can't see anything about a raw data-porting service but if you can extract the data from your MySQL database into text files, then it should be relatively easy to write a script to import it into the app engine's data store using the persistence frameworks provided by it.
Your script would take your raw data, convert into a (Java) object model and imprt those Java objects into the store.
Migrating an application to Googles App Engine I think would be quite some task. As you have seen the App Engine does not have a relational database instead it uses BigTable. This will likely involve exporting it to Java objects (serialized in some way) and the inserting them.
You say "fake" objects in your post but I as you will have to use Java objects anyway I don't think they would be fake unless you plan on using one set of objects for the migration and a new set for the application.
There is no (good) general answer to the question of how to port a relational application to the GAE datastore, because the notion of "data" is incompatible between the two. Relational databases are all about the schema. GAE doesn't even have one. It's a schemaless persistent object datastore with very specific APIs. The environment is great for certain types of apps if you're developing from scratch, but it's pretty tricky to port to.
That said, you can import CSV files, as Nick explains, which you should be able to export from MySQL fairly easily. GAE supports Java and Python "at the same time" using the versions mechanism. So you can set up your data store in Python, and then run against it for your application in Java. (A Java version of the bulk loader is under development.)

Categories

Resources