I am currently attempting to create an automated data export from an existing Google Datastore "Kind" to a JSON file.
I'm having trouble finding a suitable example that shows how to simply pull specific entities from the Datastore and write them to an output file.
All the examples and documentation I've found assume I am creating an App Engine project to interface with the Datastore. The program I need to create would have to run locally on a server and query the Datastore to pull down the data.
Is my approach possible? Any advice on how to achieve this would be appreciated.
Yes, it is possible.
But you'll have to use one of the generic Cloud Datastore client libraries, not the GAE-specific one(s). You'll still need a GAE app, but you don't have to run your code in it; see Dependency on App Engine application.
You could also use one of the REST/RPC/GQL APIs, see APIs & Reference.
Some of the referenced docs contain examples as well.
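For illustration, here is a minimal sketch of that approach; it assumes the generic google-cloud-datastore Java client and Gson on the classpath, and the kind name and output path are placeholders:

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Query;
import com.google.cloud.datastore.QueryResults;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class KindExporter {
    public static void main(String[] args) throws IOException {
        // Authenticates via GOOGLE_APPLICATION_CREDENTIALS when run outside GAE.
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();

        // "MyKind" is a placeholder; substitute the Kind you want to export.
        Query<Entity> query = Query.newEntityQueryBuilder().setKind("MyKind").build();
        QueryResults<Entity> results = datastore.run(query);

        JsonArray rows = new JsonArray();
        while (results.hasNext()) {
            Entity entity = results.next();
            JsonObject row = new JsonObject();
            row.addProperty("__key__", entity.getKey().toString());
            // Stringifies every property for brevity; real entities may need
            // per-type handling (longs, timestamps, blobs, ...).
            for (String name : entity.getNames()) {
                row.addProperty(name, String.valueOf(entity.getValue(name).get()));
            }
            rows.add(row);
        }

        try (Writer out = new FileWriter("export.json")) {
            out.write(rows.toString());
        }
    }
}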
Related
I am developing a server-side app using Java and Couchbase. I am trying to understand the pros and cons of handling cluster and bucket management from the Java code versus using the Couchbase admin web console.
For instance, should I handle creating, removing, updating, and indexing buckets in my Java code?
The reason I want to handle as many Couchbase administration functions as possible is that my app is expected to run on-prem, not as a cloud service. I want to avoid our customers needing to learn how to administer Couchbase.
The main reason to use the management APIs programmatically, rather than using the admin console, is exactly as you say: when you need to handle initializing and maintaining yourself, especially if the application needs to be deployed elsewhere. Generally speaking, you'll want to have some sort of database initializer or manager module in your code, which handles bootstrapping the correct buckets and indexes if they don't exist.
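As a rough sketch of such an initializer, assuming the Couchbase Java SDK 3.x (the bucket name and RAM quota below are placeholders):

import com.couchbase.client.core.error.BucketExistsException;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.manager.bucket.BucketSettings;
import com.couchbase.client.java.manager.query.CreatePrimaryQueryIndexOptions;

public class DatabaseInitializer {
    public static Cluster bootstrap(String host, String user, String password) {
        Cluster cluster = Cluster.connect(host, user, password);

        // Create the application bucket if a previous run hasn't already.
        try {
            cluster.buckets().createBucket(
                    BucketSettings.create("app-data").ramQuotaMB(256));
        } catch (BucketExistsException alreadyThere) {
            // Fine: the bucket was created earlier (or by an operator).
        }

        // Ensure a primary index so N1QL queries work out of the box.
        cluster.queryIndexes().createPrimaryIndex("app-data",
                CreatePrimaryQueryIndexOptions.createPrimaryQueryIndexOptions()
                        .ignoreIfExists(true));

        return cluster; // the app keeps using this connection
    }
}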
If all you need to do is handle preparing the DB environment one time for your application, you can also use the command line utilities that come with Couchbase, or send calls to the REST API. A small deployment script would probably be easier than writing code to do the same thing.
Previously I asked a question about building a document management system on top of GAE using Google Cloud Storage (Document management system using Google Cloud Storage), and I think I got appropriate answers for it. This question is just an extension of the same. So my question is: can I handle versioning through my Java code as described in this link (developers.google.com/storage/docs/object-versioning), e.g. listing all versions of an object, retrieving a specific version of an object, etc.?
I found list APIs for listing and deleting objects and doing several other operations on Google Cloud Storage, but can I also handle versions through any of the APIs it provides for Java?
Thanks in advance.
As the Google Cloud Storage docs state (https://developers.google.com/storage/docs/developer-guide), stored objects are immutable.
That is, once an object is stored you can only delete it and store a new one, even under the same name.
So to get versioning you can organize data in pseudo-folders, e.g. bucket/file-name/version-1, bucket/file-name/version-2, etc.
Then you need to add some business logic to handle these versions (access the most recent one when needed, delete outdated ones, etc.). However, in a document management system it's good to think about transactions, conflicts, etc., so you will probably want to manage versions in a DB (on GAE?) and just store the version contents in the cloud as files (e.g. named by hashes of the file content).
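A sketch of that pseudo-folder scheme, assuming the current google-cloud-storage Java client (the answer above predates it); the bucket name and version naming are placeholders:

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class VersionedStore {
    private final Storage storage = StorageOptions.getDefaultInstance().getService();
    private final String bucket = "my-doc-bucket"; // placeholder

    // Store a new version under a pseudo-folder named after the document.
    public void putVersion(String docName, int version, byte[] content) {
        BlobId id = BlobId.of(bucket, docName + "/version-" + version);
        storage.create(BlobInfo.newBuilder(id).build(), content);
    }

    // List every stored version of one document by prefix.
    public Iterable<Blob> listVersions(String docName) {
        return storage.list(bucket,
                Storage.BlobListOption.prefix(docName + "/")).iterateAll();
    }
}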
I am looking to implement a way to transfer data from one application to another programmatically in Google App Engine.
I know there is a way to achieve this with the Datastore Admin console, but that process is very time-inefficient.
I am currently implementing this using Google Cloud Storage (GCS), but that involves querying the data, saving it to GCS, and then reading it from GCS in the other app and restoring it.
Please let me know if anyone knows a simpler way of transferring data between two applications programmatically.
Thanks!
Haven't tried this myself, but it sounds like it should work: use the Datastore Admin to back up your objects to GCS from one app, then use your other app to restore that file from GCS. This should be a good method if you only require a one-time sync.
If you need to constantly replicate data from one app to another, introducing REST endpoints at one or both sides could help:
https://code.google.com/p/appengine-rest-server/ (this is in Python, I know, but just define a version of your app for the REST endpoint)
You just need to make sure your model definitions match on both sides (pretty much deploy the same model code to both apps) and have the side that needs to sync data track the time of the last sync and pull in new data through the REST endpoints. Cron jobs can do this.
Alternatively, create a PostPut callback on all of your models that makes a POST call to the proper REST endpoint on the other app every time a model is written to your datastore.
You can batch-update with the first method, or keep a constantly updated copy with the second (at the expense of more calls).
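A rough sketch of the callback method, assuming GAE/J's datastore callbacks (@PostPut) and URLFetch; the target URL and the naive payload serialization are placeholders:

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.PostPut;
import com.google.appengine.api.datastore.PutContext;
import com.google.appengine.api.urlfetch.HTTPMethod;
import com.google.appengine.api.urlfetch.HTTPRequest;
import com.google.appengine.api.urlfetch.URLFetchServiceFactory;

import java.net.URL;

public class ReplicationCallbacks {

    // Runs after every datastore put; the SDK's annotation processor
    // registers this via META-INF/datastore-callbacks.xml at build time.
    @PostPut
    public void replicate(PutContext context) {
        Entity entity = context.getCurrentElement();
        try {
            // "/sync" is a hypothetical endpoint on the receiving app, and
            // toString() is a stand-in for real serialization (e.g. JSON).
            HTTPRequest request = new HTTPRequest(
                    new URL("https://other-app.appspot.com/sync"),
                    HTTPMethod.POST);
            request.setPayload(entity.getProperties().toString().getBytes("UTF-8"));
            URLFetchServiceFactory.getURLFetchService().fetch(request);
        } catch (Exception e) {
            // Don't fail the original put if replication hiccups.
        }
    }
}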
Are you trying to move data between two App Engine applications, or trying to export all your data from App Engine so you can move to a different hosting system? Your question doesn't have enough information to understand what you're attempting to do. Based on the vague requirements, I would say this would typically be handled with a web service in one application that exposes the data, which the other application calls to consume it. I'm not sure why Cloud Endpoints was downvoted, because it provides a nice way to expose your data as a JSON-based web service with a minimum of coding fuss.
I'd recommend adding some more detail to your question, like exactly what you are trying to accomplish and maybe a mock data sample.
You could create a backup of your data using the bulkloader and then restore it to another application.
Details here:
https://developers.google.com/appengine/docs/python/tools/uploadingdata?csw=1#Python_Downloading_and_uploading_all_data
Refer to this post if you are using Java:
Downloading Google App Engine Database
I don't know if this could be suitable for your concrete scenario, but Google Cloud Endpoints are definitely a simple way of transferring data programmatically from Google App Engine.
They are more or less Google's implementation of REST web services, so they allow you to share resources using URLs. This is still an experimental technology, but for as long as I've worked with them, they have worked perfectly. Moreover, they are very well integrated with GAE and the Google Plugin for Eclipse.
You can automatically generate an endpoint from a JDO persistent class, and I think you can automatically generate the client libraries as well (although I didn't try that).
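For example, a minimal endpoint class might look like this sketch (the API name, method, and payload are hypothetical):

import com.google.api.server.spi.config.Api;
import com.google.api.server.spi.config.ApiMethod;

import java.util.Arrays;
import java.util.List;

// Exposes records to the other application over a JSON/REST interface.
@Api(name = "transfer", version = "v1")
public class TransferEndpoint {

    @ApiMethod(name = "records.list", path = "records")
    public List<String> listRecords() {
        // Placeholder payload; in practice you would query the datastore here.
        return Arrays.asList("record-1", "record-2");
    }
}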
The following questions relate to GAE/J local development setup:
How do I add/edit entities in the local datastore (preferably using some UI)? _ah/admin only allows viewing entities.
In local JUnit test cases, how do I access the same datastore data that my local web application writes to? I wrote my test cases in accordance with http://code.google.com/appengine/docs/java/tools/localunittesting.html, but the test cases don't access the same data that the web application uses.
How do I preserve local datastore data between clean builds? (Right now local_db.bin is written to the target directory, which gets cleaned every now and then.)
Stack being used: Google App Engine for Java (GAE SDK 1.4, Java SDK 6), NetBeans 6.9.1, Maven 2 (maven-gae-plugin 0.7.3).
You can't currently edit entities in the Java local datastore viewer. It's in the todo list, though.
Your unit tests shouldn't rely on the contents of the datastore: unit tests are intended to be self-contained.
You can't do this, either, unless you make a backup of local_db.bin part of your build process. Again, you should ideally design your app with easy reloading of data in mind.
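That said, if you do want a JUnit test to run against a persisted local datastore, the local unit-testing library can be pointed at a backing store on disk. A minimal sketch, where the backing-store path is a placeholder:

import com.google.appengine.tools.development.testing.LocalDatastoreServiceTestConfig;
import com.google.appengine.tools.development.testing.LocalServiceTestHelper;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class DatastoreTest {

    // setNoStorage(false) makes the test datastore persist to disk.
    private final LocalServiceTestHelper helper = new LocalServiceTestHelper(
            new LocalDatastoreServiceTestConfig()
                    .setNoStorage(false)
                    .setBackingStoreLocation("target/test_db.bin"));

    @Before
    public void setUp() { helper.setUp(); }

    @After
    public void tearDown() { helper.tearDown(); }

    @Test
    public void readsSeededData() {
        // Runs against the local datastore configured above.
    }
}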
Now you can save/load entities using a command-line client.
I have a little application written in PHP+MySQL that I want to port to App Engine, but I just can't find a way to port my MySQL data to the datastore.
How am I supposed to save the data into my datastore? Is that even possible? I can only see documentation for persistence of Java objects; does that mean I have to port my database to a bunch of fake objects, one per row?
Edit: I say "fake" objects because I don't want to use them; they're just a way to work around a shortcoming of the GAE design.
I have a 30 MB table I need to check on every GET; by using objects I would need to create an object for every row, so I'd have a Java class of maybe 45 MB with thousands upon thousands of lines like:
Row row23423 = new Row(123, 346, 75, 34, "a cow");
I just can't believe this is the only way.
Here's an idea: what about populating the datastore by POSTing the objects one by one? I mean, like the posts in a blog. You write a class that generates and persists the data, and then you curl the URL with the data, one by one. Slow, but it might work?
How to upload data with the bulk loader is described here. It's not supported directly in Java yet, but that doesn't have to stop you; just do the following:
Create an app.yaml that looks something like this:
application: myapp
version: upload
runtime: python
api_version: 1

handlers:
- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin
Make sure the application name is the same as your Java app's, and the version is not the same as the version you're using for Java. Upload this 'empty' app using appcfg.py.
Now, follow the directions for bulk loading on the page linked above. When it comes time to run the tool, specify the server address with --server=upload.latest.myapp.appspot.com.
Since multiple versions of the same app share the same datastore - even across runtimes - the data uploaded with the Python version will be accessible to the Java one.
There is documentation on the datastore here.
I can't see anything about a raw data-porting service, but if you can extract the data from your MySQL database into text files, it should be relatively easy to write a script to import it into the App Engine datastore using the persistence frameworks it provides.
Your script would take your raw data, convert it into a (Java) object model, and import those Java objects into the store.
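For instance, here is a sketch of such an import using the GAE low-level datastore API; the CSV layout and kind name are assumptions, and this has to run inside the App Engine runtime (e.g. from an upload servlet) or via the Remote API:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;

import java.io.BufferedReader;
import java.io.IOException;

public class CsvImporter {

    // Expects lines exported from MySQL like: 123,346,75,34,a cow
    public static void importRows(BufferedReader source) throws IOException {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        String line;
        while ((line = source.readLine()) != null) {
            String[] cols = line.split(",");
            Entity row = new Entity("Row"); // kind name is a placeholder
            row.setProperty("a", Long.parseLong(cols[0]));
            row.setProperty("b", Long.parseLong(cols[1]));
            row.setProperty("c", Long.parseLong(cols[2]));
            row.setProperty("d", Long.parseLong(cols[3]));
            row.setProperty("label", cols[4]);
            datastore.put(row);
        }
    }
}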
Migrating an application to Google's App Engine would, I think, be quite a task. As you have seen, App Engine does not have a relational database; instead it uses BigTable. Migration will likely involve exporting the data to Java objects (serialized in some way) and then inserting them.
You say "fake" objects in your post, but as you will have to use Java objects anyway, I don't think they would be fake, unless you plan on using one set of objects for the migration and a new set for the application.
There is no (good) general answer to the question of how to port a relational application to the GAE datastore, because the notion of "data" is incompatible between the two. Relational databases are all about the schema. GAE doesn't even have one. It's a schemaless persistent object datastore with very specific APIs. The environment is great for certain types of apps if you're developing from scratch, but it's pretty tricky to port to.
That said, you can import CSV files, as Nick explains, which you should be able to export from MySQL fairly easily. GAE supports Java and Python "at the same time" using the versions mechanism, so you can set up your datastore in Python and then run your Java application against it. (A Java version of the bulk loader is under development.)