I've been reading a little about Google's AppEngine that provides application hosting. I've been trying it out as I think it looks quite interesting but I'm a bit concerned about the database part.
Say I'm developing my Java app locally. I don't want to deploy to Google every time I make a change to the code, so I set up a nice little Servlet container on my development machine to test things easily. With AppEngine you store things using their datastore API, which basically lets you model your data using Java objects - which is nice.
However, it seems like this data is embedded in the application code itself (inside the .war that is deployed to Google). Can I simply use their datastore api locally? How will it be stored on my local machine? Is this all handled by them so that I just have to worry about using the datastore API and when I deploy it to Google the data will just be stored in a different way than how it's stored on my local machine?
I'm just a little confused because I'm used to having the data part layered out of my application code.
I hope I'm clear enough. Thanks.
Development datastore and Production datastore are two different and separated things:
The Development datastore is typically a file-based datastore named local_db.bin that is only useful for storing your data in your testing environment; the data is not replicated to the production environment when you deploy your application.
This kind of datastore is meant to be used with a fairly small number of entities, and its performance has nothing to do with the powerful Production datastore beast based on Bigtable.
All you need to do is to use the Datastore API that creates a level of abstraction between your code and the underlying datastore; in testing your data will be stored in the local datastore file, in production the created data will be saved to the Google App Engine datastore with all the features and limitations that this implies.
Related
I am creating a Java-based application and I want to use Google App Engine for its deployment. But I also want this application to be movable to other servers, such as Tomcat, on my local or other machines. So although I want to use Google App Engine, I want to keep my application independent of anything Google-specific. Can somebody summarize the points I must take care of? I want to keep it independent from both the application and the database layer perspective.
Though I am no expert in Google App Engine, the rule of thumb for making your webapp portable is to use standard specification APIs instead of vendor-specific APIs. For example, if your app uses the Google App Engine UserService (com.google.appengine.api.users.UserService) or the datastore's com.google.appengine.api.datastore.DatastoreService, it is tightly bound to Google App Engine and cannot be migrated to a standalone Tomcat engine.
To loosely couple your database for future migration, you should consider using a MySQL schema in Google App Engine, because in the future you can host your database anywhere just by taking a dump. Also, you should use JDBC APIs or JPA for database operations from your application, using the MySQL JDBC JAR.
To summarize, you should avoid any API call which requires a com.google.appengine* import in your source. Also, you should have your own MySQL schema running in the Google App Engine cloud.
I think you can and it's only a matter of design.
Just an example: if your application needs user authentication, you can create an interface AuthenticationService and several implementations:
GAEAuthenticationService for the Google App Engine
FakeAuthenticationService for local tests running with jetty (for example)
DataSourceAuthenticationService for authentication based on a DataSource
You can do the same thing with persistence, scheduler, etc... the only thing to do is:
define the objects you need and use interfaces when you need different implementations that depend on the platform
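The interface-per-platform idea above can be sketched roughly as follows. This is only a minimal illustration: the method currentUser, the session-token parameter, and the GreetingPage class are hypothetical names invented for the example, and the GAE-backed implementation is described in a comment rather than written out so that the sketch compiles without the App Engine SDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Platform-neutral contract; application code depends only on this interface. */
interface AuthenticationService {
    /** Returns the user name behind a session token, or empty if nobody is logged in. */
    Optional<String> currentUser(String sessionToken);
}

/** Implementation for local tests (for example under Jetty): sessions live in a map. */
class FakeAuthenticationService implements AuthenticationService {
    private final Map<String, String> sessions = new HashMap<>();

    /** Registers a fake logged-in user for the given token. */
    void login(String sessionToken, String userName) {
        sessions.put(sessionToken, userName);
    }

    @Override
    public Optional<String> currentUser(String sessionToken) {
        return Optional.ofNullable(sessions.get(sessionToken));
    }
}

// Assumption: a GAEAuthenticationService would implement the same interface by
// delegating to com.google.appengine.api.users.UserService. It is left out here
// so that this sketch has no dependency on the App Engine SDK.

/** Example caller: it never mentions a concrete, platform-specific class. */
class GreetingPage {
    private final AuthenticationService auth;

    GreetingPage(AuthenticationService auth) {
        this.auth = auth;
    }

    String render(String sessionToken) {
        return auth.currentUser(sessionToken)
                   .map(name -> "Hello, " + name)
                   .orElse("Please log in");
    }
}

class AuthDemo {
    public static void main(String[] args) {
        FakeAuthenticationService auth = new FakeAuthenticationService();
        auth.login("token-1", "alice");
        GreetingPage page = new GreetingPage(auth);
        System.out.println(page.render("token-1")); // prints "Hello, alice"
        System.out.println(page.render("unknown")); // prints "Please log in"
    }
}
```

Only the place where the implementation is constructed would need to change when moving between App Engine and another container; GreetingPage stays untouched.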
I recently made an interesting application using Play Framework and MySQL Connector/MXJ to make a completely portable web server with database, independent of any currently installed software (including Java).
I'm still new to MXJ, and the desktop application realm (as opposed to straight-up webapps), so I'm wondering if there are other, better methods for storing/accessing large amounts of data than embedded MySQL. I would assume so, since it seems not many people use MXJ. It essentially just packs mysqld.exe in its various forms for multiple operating systems and platforms. It runs in its own thread, and stores its data in whatever directory you provide.
For an application that frequently analyzes and searches through data in large chunks (100MB to 5GB), what other (fast) options are there, or am I justified in my webapp-laziness of bringing along MySQL?
Independent of any currently installed software (including Java).
If you are looking for an embedded database for a desktop application, then you can go for SQLite. However, there are pros and cons to using either MySQL or SQLite.
SQLite:
Easier to set up
Great for temporary (testing databases)
Great for rapid development
Great for embedding in an application
Doesn't have user management
Doesn't have many performance features
Doesn't scale well.
MySQL:
Far more difficult/complex to set up
Better options for performance tuning
Fit for a production database
Can scale well if tuned properly
Can manage users, permissions, etc.
You can find more info on when to use SQLite here
UPDATE: I came across HSQLDB and here are its test results. HamsterDb is another option.
Do you really need a database if your app is single-user and desktop-based? Maybe it is faster to simply write large files to the local filesystem than to load them through the network tier. If your app is very complex, you could use an embedded DB just for storing your domain and configuration, but if it's not, maybe you can avoid using a DB + SQL + O/R mapping and so on.
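As a rough illustration of the plain-file approach, here is a minimal sketch that writes records to a local file and searches them by streaming line by line, so the whole file never has to sit in memory. The class FlatFileStore, its methods, and the one-record-per-line layout are assumptions made up for this example, not something from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

/** Hypothetical helper: stores one record per line in a plain text file. */
class FlatFileStore {
    private final Path file;

    FlatFileStore(Path file) {
        this.file = file;
    }

    /** Creates a store backed by a fresh temporary file (handy for experiments). */
    static FlatFileStore inTempFile() {
        try {
            Path tmp = Files.createTempFile("records", ".txt");
            tmp.toFile().deleteOnExit();
            return new FlatFileStore(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the file with the given records, one per line. */
    void write(Iterable<String> records) {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String record : records) {
                out.write(record);
                out.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the lines containing the needle, reading the file lazily. */
    long countMatching(String needle) {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(needle)).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class FlatFileDemo {
    public static void main(String[] args) {
        FlatFileStore store = FlatFileStore.inTempFile();
        store.write(Arrays.asList("alpha,1", "beta,2", "alpha,3"));
        System.out.println(store.countMatching("alpha")); // prints 2
    }
}
```

A linear scan like this has no indexes, so whether it beats an embedded database depends on how selective and how frequent the searches are.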
I have written a relatively simple Java App Engine application which I would like to be able to port to another cloud provider.
I am using the JDO datastore API so I think my data handling should be portable to other backends as listed here: http://www.datanucleus.org/products/accessplatform/index.html
I would ideally like to deploy my application onto EC2 with minimal code changes. What is my best approach?
Note: I am aware of the http://code.google.com/p/appscale/ project but I want to avoid using this as it doesn't look like they are updating very often.
AppScale remains your best option to avoid rewriting any code. They do keep up to date with official App Engine - for instance, they just released preliminary support for Go. Even if they weren't so assiduous at keeping up to date, though, this would only be relevant if some feature you required wasn't yet supported - and it sounds like your needs are fairly basic.
JDO should be trivial. There might be some Google-specific configuration here and there, but generally it should be easy. The storage model Google promotes is not bad for an RDBMS either, but you might need to fine-tune your model depending on the backend you end up with.
If you're not using the low-level Google APIs, you should be pretty much there.
I managed to get my application working on EC2 using the following components.
Tomcat 7
DataNucleus
HBase
I had to manually create a table in HBase for each of my data classes but was able to configure DataNucleus to auto-create the columns.
I also had to change my primary key value generation strategy from identity to increment as per this table of supported features.
http://www.datanucleus.org/products/accessplatform_3_0/datastore_features.html
I am developing a Java application using Google App Engine that depends on a largish dataset to be present. Without getting into specifics of my application, I'll just state that working with a small subset of the data is simply not practical. Unfortunately, at the time of this writing, the Google App Engine for Java development server stores the entire datastore in memory. According to Ikai Lan:
The development server datastore stub is an in memory Map that is persisted to disk.
I simply cannot import my entire dataset into the development datastore without running into memory problems. Once the application is pushed into Google's cloud and uses BigTable, there is no issue. But deployment to the cloud takes a long time making development cycles kind of painful. So developing this way is not practical.
I've noticed the Google App Engine for Python development server has an option to use SQLite as the backend datastore which I presume would solve my problem.
dev_appserver.py --use_sqlite
But the Java development server includes no such option (at least not documented). What is the best way to get a large dataset working with the Google App Engine Java development server?
There's no magic solution - the only datastore stub for the Java API, currently, is an in-memory one. Short of implementing your own disk-based stub, your only options are to find a way to work with a subset of data for testing, or do your development on appspot.
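To make the limitation concrete, here is a toy sketch of what "an in-memory Map that is persisted to disk" means in general. It is not the SDK's actual stub: the class MapBackedStore and all of its methods are hypothetical, and Java serialization is used only because it keeps the example short.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

/** Toy store (hypothetical): every entity lives in a HashMap that is dumped to one file. */
class MapBackedStore {
    private final Path file;
    private HashMap<String, String> entities = new HashMap<>();

    MapBackedStore(Path file) {
        this.file = file;
    }

    /** Creates a store backed by a fresh temporary file. */
    static MapBackedStore inTempFile() {
        try {
            Path tmp = Files.createTempFile("toy_store", ".bin");
            tmp.toFile().deleteOnExit();
            return new MapBackedStore(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path file() {
        return file;
    }

    void put(String key, String value) {
        entities.put(key, value);
    }

    String get(String key) {
        return entities.get(key);
    }

    int size() {
        return entities.size();
    }

    /** Writes the whole map to disk in one go. */
    void flush() {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(entities);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole map back; the entire dataset must fit in the heap. */
    @SuppressWarnings("unchecked")
    void load() {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            entities = (HashMap<String, String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class MapBackedDemo {
    public static void main(String[] args) {
        MapBackedStore first = MapBackedStore.inTempFile();
        first.put("greeting:1", "hello");
        first.flush();

        MapBackedStore second = new MapBackedStore(first.file());
        second.load();
        System.out.println(second.get("greeting:1")); // prints "hello"
    }
}
```

In a design like this the file on disk only provides durability between runs; reads and writes still go through the map, which is why heap size, not disk size, caps the dataset.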
I've been using the mapper api to import data from the blobstore, as described by Ikai Lan in this blog entry - http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/.
I've found it to be much faster and more stable than using the remote api bulkloader - especially when loading medium sized datasets (100k entities) into the local datastore.
I'm in the process of designing a web application based on the Google App Engine (Java) platform. I'm basically from the relational database world and I'm trying to understand how to use the persistence that GAE provides.
So my question is: in an RDBMS, I can easily access my data without going through my application, i.e. I can use an SQL client to connect to my data and manipulate it. Is the same thing possible with GAE?
Both yes and no. You can open https://appengine.google.com/ and go to "Datastore Viewer". There you can write a GQL query. But you will not be able to operate on data sets with more than 500 records and with an offset >1000. Good luck!
Install the AppWrench Tools plugin for Eclipse. I use it. It allows you to create/edit/delete/browse your local and production datastore entities from within Eclipse.