I am looking to implement a way to transfer data from one application to another programmatically in Google App Engine.
I know there is a way to achieve this with the Datastore Admin console, but that process is very time-consuming.
I am currently implementing this with Google Cloud Storage (GCS), but that involves querying the data, saving it to GCS, and then reading it back from GCS in the other app and restoring it.
Please let me know if anyone knows a simpler way of transferring data between two applications programmatically.
Thanks!
Haven't tried this myself, but it sounds like it should work: use the Datastore Admin to back up your objects to GCS from one app, then use your other app to restore that file from GCS. This should be a good method if you only need a one-time sync.
If you need to constantly replicate data from one app to another, introducing REST endpoints on one or both sides could help:
https://code.google.com/p/appengine-rest-server/ (this is in Python, I know, but you can just define a separate version of your app for the REST endpoint)
You just need to make sure your model definitions match on both sides (essentially deploy the same model code to both apps), and have the side that needs the data track the time of the last sync and use the REST endpoints to pull in anything newer. A cron job can drive this; a minimal cron.yaml is sketched below.
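For example, the pulling side might use a minimal cron.yaml along these lines (the /sync URL and the schedule are placeholders I made up, not anything MBS- or GAE-mandated):

cron:
- description: pull new data from the peer app since the last sync
  url: /sync
  schedule: every 10 minutes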
Alternatively, create a post-put callback on all of your models so that a POST call is made to the proper REST endpoint on the other app every time a model is written to your datastore (see the sketch below).
You can batch-update with one method, or keep a constantly updated copy with the other (at the expense of more calls).
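On the Java side, Objectify's @OnSave lifecycle hook is one way to implement that callback; a rough, untested sketch (the peer URL and entity fields are made up, and in production you would probably enqueue a task and URL-encode the payload rather than fetch inline):

import com.google.appengine.api.urlfetch.HTTPMethod;
import com.google.appengine.api.urlfetch.HTTPRequest;
import com.google.appengine.api.urlfetch.URLFetchServiceFactory;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import com.googlecode.objectify.annotation.OnSave;
import java.net.URL;

@Entity
public class Message {
    @Id Long id;
    String text;

    // Invoked whenever this entity is saved; pushes a copy to the peer app
    @OnSave
    void replicateToPeer() {
        try {
            HTTPRequest post = new HTTPRequest(
                    new URL("https://peer-app.appspot.com/rest/message"),
                    HTTPMethod.POST);
            post.setPayload(("id=" + id + "&text=" + text).getBytes("UTF-8"));
            // Fire and forget; replication here is best-effort
            URLFetchServiceFactory.getURLFetchService().fetchAsync(post);
        } catch (Exception e) {
            // Log and move on in real code
        }
    }
}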
Are you trying to move data between two App Engine applications, or trying to export all your data from App Engine so you can move to a different hosting system? Your question doesn't have enough information to understand what you're attempting to do. Based on the vague requirements, I would say this would typically be handled with a web service that you write in one application to expose the data; the other application then calls that service to consume it. I'm not sure why Cloud Endpoints was downvoted, because it provides a nice way to expose your data as a JSON-based web service with a minimum of coding fuss.
I'd recommend adding some more detail to your question, like exactly what you are trying to accomplish, and maybe a mock data sample.
You could create a backup of your data using the bulkloader and then restore it in another application.
Details here:
https://developers.google.com/appengine/docs/python/tools/uploadingdata?csw=1#Python_Downloading_and_uploading_all_data
Refer to this post if you are using Java:
Downloading Google App Engine Database
I don't know if this is suitable for your concrete scenario, but Google Cloud Endpoints are definitely a simple way of transferring data programmatically from Google App Engine.
They are essentially Google's implementation of REST web services, so they allow you to share resources using URLs. This is still an experimental technology, but for as long as I've worked with them, they have worked perfectly. Moreover, they are very well integrated with GAE and the Google Plugin for Eclipse.
You can automatically generate an endpoint from a JDO persistent class, and I think you can automatically generate the client libraries as well (although I haven't tried that).
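As a taste of what that looks like in Java, here is a rough hand-written endpoint (the API name, class, and Record bean are all mine; an endpoint generated from a JDO class looks similar):

import com.google.api.server.spi.config.Api;
import com.google.api.server.spi.config.ApiMethod;
import java.util.ArrayList;
import java.util.List;

// Made-up bean; Endpoints serializes it to JSON automatically
class Record {
    public long id;
    public String payload;
}

@Api(name = "transfer", version = "v1")
public class TransferEndpoint {

    // Exposed as GET /_ah/api/transfer/v1/records
    @ApiMethod(name = "records.list", path = "records", httpMethod = "GET")
    public List<Record> listRecords() {
        // A real endpoint would query the datastore (JDO/JPA) here
        return new ArrayList<Record>();
    }
}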
I am currently attempting to create an automated data export from an existing Google Datastore "Kind" to a JSON output file.
I'm having trouble finding a suitable example that allows me to simply pull specific entities from the Datastore and push them out into an output file.
All the examples and documentation I've found assume I am creating an App Engine project to interface with the Datastore. The program I need to create would have to run locally on a server and query the Datastore to pull down the data.
Is my approach possible? Any advice on how to achieve this would be appreciated.
Yes, it is possible.
But you'll have to use one of the generic Cloud Datastore client libraries, not the GAE-specific one(s). You'll still need a GAE app, but you don't have to run your code in it; see Dependency on App Engine application.
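As a rough sketch of that approach with the generic google-cloud-datastore Java client, run from a machine outside App Engine (the kind name "Task", the "description" property, and the output file are placeholders, and credentials are assumed to be configured in your environment):

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Query;
import com.google.cloud.datastore.QueryResults;
import java.io.PrintWriter;

public class KindExporter {
    public static void main(String[] args) throws Exception {
        // Picks up the project id and credentials from the environment
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

        QueryResults<Entity> results = datastore.run(
                Query.newEntityQueryBuilder().setKind("Task").build());

        try (PrintWriter out = new PrintWriter("tasks.json", "UTF-8")) {
            out.println("[");
            while (results.hasNext()) {
                Entity entity = results.next();
                // Naive JSON with a single string property; adapt to your schema
                out.printf("  {\"key\": \"%s\", \"description\": \"%s\"}%s%n",
                        entity.getKey().getNameOrId(),
                        entity.getString("description"),
                        results.hasNext() ? "," : "");
            }
            out.println("]");
        }
    }
}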
You could also use one of the REST/RPC/GQL APIs, see APIs & Reference.
Some of the referenced docs contain examples as well.
I am developing a server-side app using Java and Couchbase. I am trying to understand the pros and cons of handling cluster and bucket management from the Java code versus using the Couchbase admin web console.
For instance, should I handle creating/removing buckets, indexing, and updating buckets in my Java code?
The reason I want to handle as many Couchbase administration functions as possible in the app is that it is expected to run on-premises, not as a cloud service. I want to avoid our customers having to learn how to administer Couchbase.
The main reason to use the management APIs programmatically, rather than the admin console, is exactly as you say: when you need to handle initialization and maintenance yourself, especially if the application needs to be deployed elsewhere. Generally speaking, you'll want some sort of database initializer or manager module in your code, which handles bootstrapping the correct buckets and indexes if they don't exist.
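For instance, with the Couchbase Java SDK 2.x, a bootstrap module might look roughly like this (host, credentials, bucket name, and quota are placeholders, and a real initializer should also tolerate an "index already exists" error):

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.cluster.ClusterManager;
import com.couchbase.client.java.cluster.DefaultBucketSettings;
import com.couchbase.client.java.query.N1qlQuery;

public class DatabaseInitializer {

    public void bootstrap() {
        Cluster cluster = CouchbaseCluster.create("127.0.0.1");
        ClusterManager manager = cluster.clusterManager("Administrator", "password");

        // Create the bucket only if it doesn't exist yet
        if (!manager.hasBucket("app-data")) {
            manager.insertBucket(DefaultBucketSettings.builder()
                    .name("app-data")
                    .quota(256)   // RAM quota in MB
                    .build());
        }

        // Ensure a primary index so N1QL queries work out of the box
        Bucket bucket = cluster.openBucket("app-data");
        bucket.query(N1qlQuery.simple("CREATE PRIMARY INDEX ON `app-data`"));

        cluster.disconnect();
    }
}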
If all you need is to prepare the DB environment once for your application, you can also use the command-line utilities that come with Couchbase, or send calls to the REST API. A small deployment script would probably be easier than writing code to do the same thing.
I am attempting to build off of the App Engine Mobile Backend Starter (MBS). The client it comes with handles a simple, public, single-thread messaging application through a generic backend. I want to extend the client to build a "real" messaging app, by which I mean that a user should be able to have multiple private conversations.
I'm new to Google Cloud Datastore, but through discussions I concluded that having an entity for each message makes sense, with the message's ancestor set to the conversation it belongs to. A user would then hold a list of ReferenceProperties to the conversations they are involved in.
As I look at the Mobile Backend Starter, I see that the Android client doesn't actually create any entities itself. It uses a shell class, CloudEntity, to hold data and passes it over the Google Cloud Endpoints API to generic server code that builds an Entity and inserts it into the Datastore (with intermediary steps through EntityDto?). My understanding is that the ancestor key needs to be available at the time the Entity is created, since it becomes part of the entity's key, so if there were a way to modify the Android client's CloudEntity code to handle the ancestor, that would be great.
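For reference, this is the kind of thing I mean, using the low-level Datastore API directly (the kind names and key are just examples; MBS wraps all of this inside CloudEntity):

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

public class AncestorExample {
    public void saveMessage() {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

        // The conversation key must exist when the message is constructed,
        // because it becomes part of the message's own key
        Key conversationKey = KeyFactory.createKey("Conversation", "conv-123");
        Entity message = new Entity("Message", conversationKey);
        message.setProperty("text", "hello");
        datastore.put(message);
    }
}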
I've looked at the code for both the client and the server, and it's a lot to take in, so I was hoping I could get some assistance:
1) Is it possible with the MBS client code to set ancestors for entities?
2) Is it even desirable to want to use ancestors in this way?
This is my first question, so thank you for your time and patience.
In my web application I have a part which needs to continuously crawl the Web, process that data, and present it to a user. So I was wondering if it is a good approach to split it into two separate applications, where one would do the crawling and data processing and store the data in the database, and the other would be a web application (mounted on some web server) which presents the data from the database to a user and allows a certain amount of interaction with it.
The reason I think I need this split is that if I make certain changes to my web app (like adding new functionality, changing the interface, etc.), I wouldn't want the crawling to be interrupted.
My application stack is Tapestry (web layer), Spring, Hibernate (over MySQL) and my own implementation of the crawler independent from the others.
Is it good for the integration to be done just by using the same database? This might cause an issue with accessing the database from both applications at the same time. Or can the integration be done at the Hibernate level, so both applications could use the same Hibernate session? But can an app in one JVM instance access objects from another JVM instance?
I would be grateful for any suggestions regarding this matter.
UPDATE
The user (from the web app's interface) would enter the URLs for the crawler to parse. The crawler app would just read the tables of URLs that the web app populates. And vice versa: the data processed by the crawler would just be presented on the user interface. So I think I shouldn't be concerned about any kind of locking, right?
Thanks,
Nikola
I would definitely keep them separated as you are planning. Web crawling is more of a "batch" process than a request-driven web application. The web crawling app will run in its own JVM, and your web app will run in a servlet/Java EE container.
How often will the crawler run or is it a continuously running process? You may want to consider the frequency based on your requirements.
Will the users of the web app be updating the same tables that the crawler posts data to? In that case you will need to take precautions, otherwise deadlocks may arise. If you want your web app to auto-refresh data based on new inserts in the tables, you can create a message-driven bean (using JMS) so the crawler app can asynchronously notify the web app. When a new-data message arrives, you can either do a form submit on your page or use ajax to update the data on the page itself.
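If your web app runs in a container that supports EJBs, the consuming side could be as simple as this sketch (JMS 2.0 style; the queue's JNDI name and the payload format are assumptions):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destinationLookup",
                              propertyValue = "jms/crawlerUpdates")
})
public class CrawlerUpdateListener implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            String payload = ((TextMessage) message).getText();
            // React to the new data here, e.g. invalidate a cache or
            // push an update to the browser
            System.out.println("Crawler inserted new data: " + payload);
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}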
The web app should use connection pooling and the batch app could use DBCP or C3P0. I am not sure you gain much benefit by trying to share the database sessions in this scenario.
This way you have integration between the two apps without either one slowing down while waiting for the other to process.
HTH!
You are right, splitting the application into two could be reasonable in your case.
Disadvantages of separating into two applications:
You cannot use Hibernate's cache (or any other cache of mutable objects) in either application for data that both applications modify, since one app would not see the other's changes. Optimistic locking should work fine with two Hibernate applications, though. I don't see any other problems.
The advantages you have already spelled out in your question.
I have a little application written in PHP + MySQL that I want to port to App Engine, but I just can't find a way to port my MySQL data to the Datastore.
How am I supposed to save the data into the Datastore? Is that even possible? I can only see documentation for persistence of Java objects; does that mean I have to port my database to a bunch of fake objects, one per row?
Edit: I say fake objects because I don't want to use them; they're just a way to get around a shortcoming of the GAE design.
I have a 30 MB table I need to check on every GET; using objects, I would need to create an object for every row, so I'd have a Java class of maybe 45 MB with thousands upon thousands of lines like:
Row row23423 = new Row(123, 346, 75, 34, "a cow");
I just can't believe this is the only way.
Here's an idea: what about populating the datastore by POSTing the objects one by one, like the posts in a blog? You write a class that generates and persists the data, and then you curl the URL with the data, one row at a time. Slow, but it might work?
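Something like this rough sketch is what I have in mind, with one curl POST per row (Row would be a JDO @PersistenceCapable class matching my table; the parameter names and the "transactions-optional" PMF name are the usual GAE/JDO setup):

import java.io.IOException;
import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManager;
import javax.jdo.PersistenceManagerFactory;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RowUploadServlet extends HttpServlet {

    private static final PersistenceManagerFactory PMF =
            JDOHelper.getPersistenceManagerFactory("transactions-optional");

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        PersistenceManager pm = PMF.getPersistenceManager();
        try {
            // One row per request, sent as form fields
            pm.makePersistent(new Row(
                    Integer.parseInt(req.getParameter("a")),
                    Integer.parseInt(req.getParameter("b")),
                    Integer.parseInt(req.getParameter("c")),
                    Integer.parseInt(req.getParameter("d")),
                    req.getParameter("label")));
        } finally {
            pm.close();
        }
        resp.setStatus(HttpServletResponse.SC_CREATED);
    }
}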
How to upload data with the bulk loader is described here. It's not supported directly in Java yet, but that doesn't have to stop you - just do the following:
Create an app.yaml that looks something like this:
application: myapp
version: upload
runtime: python
api_version: 1

handlers:
- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin
Make sure the application name is the same as your Java app's, and the version is not the same as the version you're using for Java. Upload this 'empty' app using appcfg.py.
Now, follow the directions for bulk loading in the page linked to above. When it comes time to run the tool, specify the server address with --server=upload.latest.myapp.appspot.com.
Since multiple versions of the same app share the same datastore - even across runtimes - the data uploaded with the Python version will be accessible to the Java one.
There is documentation on the datastore here.
I can't see anything about a raw data-porting service, but if you can extract the data from your MySQL database into text files, it should be relatively easy to write a script to import it into the App Engine datastore using the persistence frameworks it provides.
Your script would take your raw data, convert it into a (Java) object model, and import those Java objects into the store.
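For example, a sketch of such an import script, assuming a Row persistent class like the one in your question and a simple comma-separated export (this would need to run inside the app, e.g. from a servlet or task, or locally against the remote API):

import java.io.BufferedReader;
import java.io.FileReader;
import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManager;

public class CsvImport {
    public static void importFile(String path) throws Exception {
        PersistenceManager pm = JDOHelper
                .getPersistenceManagerFactory("transactions-optional")
                .getPersistenceManager();
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] f = line.split(",");
                // Row: a @PersistenceCapable class mirroring the MySQL table
                pm.makePersistent(new Row(
                        Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                        Integer.parseInt(f[2]), Integer.parseInt(f[3]),
                        f[4]));
            }
        } finally {
            pm.close();
        }
    }
}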
Migrating an application to Google's App Engine would, I think, be quite a task. As you have seen, App Engine does not have a relational database; instead it uses Bigtable. Migration will likely involve exporting your data to Java objects (serialized in some way) and then inserting them.
You say "fake" objects in your post, but since you will have to use Java objects anyway, I don't think they would be fake, unless you plan on using one set of objects for the migration and a new set for the application.
There is no (good) general answer to the question of how to port a relational application to the GAE datastore, because the notion of "data" is incompatible between the two. Relational databases are all about the schema. GAE doesn't even have one. It's a schemaless persistent object datastore with very specific APIs. The environment is great for certain types of apps if you're developing from scratch, but it's pretty tricky to port to.
That said, you can import CSV files, as Nick explains, which you should be able to export from MySQL fairly easily. GAE supports Java and Python "at the same time" using the versions mechanism. So you can set up your data store in Python, and then run against it for your application in Java. (A Java version of the bulk loader is under development.)