I deployed my Java web application in Kubernetes using Deployments and was able to scale it and connect it to a database Pod. I then wanted to scale the database too, but as you know, that is not possible in Kubernetes, and the MySQL replica setup is not recommended for production. So I tried Vitess and was able to scale my database, but I don't know how or where I should create my Java web application Deployments/replicas and connect them to the database through VTGate.
Also, is there another way of scaling a MySQL database in Kubernetes?
It's important to note that Vitess is not a transparent proxy that you can just insert between the app and MySQL at the connection level. Vitess turns a set of MySQL servers into a clustered database, and it requires you to build your app against a Vitess driver instead of the plain MySQL driver.
If you're already using JDBC, you shouldn't need a lot of code changes other than connection management, since there is a Vitess implementation of the JDBC interface. However, some query constructs may not be supported yet by Vitess, so you may need to rewrite them into an equivalent form that is supported.
Once your app is compatible with Vitess, deploying it in Kubernetes will be the same as you did before, except you will point the app pods to connect to the VTGate service via DNS.
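As a rough illustration of pointing the app pods at VTGate via DNS, here is a minimal sketch that builds the JDBC URL from environment-style settings. The service name "vtgate", the port 15991, the keyspace "test_keyspace", and the environment variable names are all assumptions made for this example, and the jdbc:vitess://host:port/keyspace form should be checked against the documentation of the Vitess JDBC driver version you use.

```java
import java.util.Map;

// Sketch only: builds a JDBC URL that points at a VTGate service by DNS name.
// Every default below is a hypothetical value, not something from the question.
class VtgateUrl {
    static String fromEnv(Map<String, String> env) {
        // Kubernetes resolves the service name through cluster DNS.
        String host = env.getOrDefault("VTGATE_HOST", "vtgate");
        String port = env.getOrDefault("VTGATE_PORT", "15991");
        String keyspace = env.getOrDefault("VTGATE_KEYSPACE", "test_keyspace");
        return "jdbc:vitess://" + host + ":" + port + "/" + keyspace;
    }
}
```

In the application you would pass VtgateUrl.fromEnv(System.getenv()) to DriverManager.getConnection, with the Vitess JDBC driver on the classpath, and set the variables in the Deployment's pod spec.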
As for other ways to scale MySQL in Kubernetes without Vitess, there's an important new feature entering Beta in Kubernetes 1.5 called StatefulSet that will help you scale databases like MySQL similar to the way a Deployment can scale stateless Pods. Vitess itself will also become more convenient to scale in Kubernetes by taking advantage of StatefulSet.
However, StatefulSet with pure MySQL will mostly only help you scale read-only traffic by increasing the number of slaves. If you need to scale write traffic, you will likely need to implement application-defined sharding. At that point, the required changes to your app will almost certainly be much more than if you modify it to support Vitess.
My project is looking to deploy a new J2EE application to Amazon's cloud. Elastic Beanstalk supports Tomcat apps, which seems perfect. Are there any particular design considerations to keep in mind when writing the app that differ from running a standalone Tomcat on a server?
For example, I understand that the server is meant to scale automatically. Is this like a cluster? Our application framework tends to like to stick state in the HttpSession, is that a problem? Or when it says it scales automatically, does that just mean memory and CPU?
Automatic scaling on AWS is done via adding more servers, not adding more CPU/RAM. You can add more CPU/RAM manually, but it requires shutting down the server for a minute to make the change, and then configuring any software running on the server to take advantage of the added RAM, so that's not the way automatic scaling is done.
Elastic Beanstalk is basically a management interface for Amazon EC2 servers, Elastic Load Balancers and Auto Scaling Groups. It sets all that up for you and provides a convenient way of deploying new versions of your application easily. Elastic Beanstalk will create EC2 servers behind an Elastic Load Balancer and use an Auto Scaling configuration to add more servers as your application load increases. It handles adding the servers to the load balancer when they are ready to receive traffic, and removing them from the load balancer and deleting the extra servers when they are no longer needed.
For your Java application running on Tomcat you have a few options to handle horizontal scaling well. You can enable sticky sessions on the Load Balancer so that all requests from a specific user will go to the same server, thus keeping the HttpSession tied to the user. The main problem with this is that if a server is removed from the pool you may lose some HttpSessions and cause any users that were "stuck" to that server to be logged out of your application. The solution to this is to configure your Tomcat instances to store sessions in a shared location. There are Tomcat session store implementations out there that work with AWS services like ElastiCache (Redis) and DynamoDB. I would recommend using one of those, probably the Redis implementation if you aren't already familiar with DynamoDB.
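One practical consequence of moving sessions to a shared store: session attributes leave the JVM, so with stores that rely on Java serialization they need to implement Serializable. The sketch below is a quick round-trip check you could run against your own attribute classes. Cart is a hypothetical attribute type invented for the example, and the check says nothing about stores that use their own serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SessionAttributeCheck {
    // Hypothetical object that the framework puts into the HttpSession.
    static class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final String userId;
        final int itemCount;

        Cart(String userId, int itemCount) {
            this.userId = userId;
            this.itemCount = itemCount;
        }
    }

    // Serializes the value to bytes and reads it back, as an external
    // session store based on Java serialization would have to do.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Attribute cannot be stored externally", e);
        }
    }
}
```

If roundTrip throws for one of your attribute classes, that attribute would be a problem once sessions are no longer kept in local memory.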
Another consideration for moving a Java application to AWS is that you cannot use any tools or libraries that rely on multi-cast. You may not be using multi-cast for anything, but in my experience every Java app I've had to migrate to AWS relied on multi-cast for clustering and I had to modify it to use a different clustering method.
Also, for a successful migration to AWS I suggest you read up a bit on VPCs, private IP versus public IP, and Security Groups. A solid understanding of those topics is key to setting up your network so that your web servers can communicate with your DB and cache servers in a secure and performant manner.
I have a typical stateless Java application which provides a REST API and performs updates (CRUD) in a PostgreSQL database.
However the number of clients is growing and I feel the need to
Increase redundancy, so that if one fails another takes place
For this I will probably need a load balancer?
Increase response speed by not flooding the network and the CPU of just one server (however how will the load balancer not get flooded?)
Maybe I will need to distribute the Database?
I want to be able to update my app seamlessly (I have seen a thingy called kubernetes doing this): Kill each redundant node one by one and immediately replace it with an updated version
My app also stores some image files, which grow fast in disk size, I need to be able to distribute them
All of this must be backup-able
This is the diagram of what I have now (both Java app and DB are on the same server):
What is the best/correct way of scaling this?
Thanks!
Web Servers:
Run your app on multiple servers, behind a load balancer. Use AWS Elastic Beanstalk or roll your own solution with EC2 + Autoscaling Groups + ELB.
You mentioned a concern about "flooding" of the load balancer, but if you use Amazon's Elastic Load Balancer service it will scale automatically to handle whatever traffic you get so that you don't need to worry about this concern.
Database Servers:
Move your database to RDS and enable Multi-AZ failover. This will create a hot-standby server that your database will automatically fail over to if there are issues with your primary server. Optionally add read replicas to scale out your database read capacity.
Start caching your database queries in Redis if you aren't already. There are plugins out there to do this with Hibernate fairly easily. This will take a huge load off your database servers if your app performs the same queries regularly. Use AWS ElastiCache or RedisLabs for your Redis server(s).
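The idea behind query caching can be sketched as a small read-through cache: look in the cache first and only run the query on a miss. In this sketch an in-memory ConcurrentHashMap stands in for Redis, and QueryCache and its loader function are hypothetical names, not part of Hibernate or any Redis client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through cache sketch. The map is a local stand-in for Redis;
// a real setup would also need expiry and a shared cache server.
class QueryCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // runs the real database query

    QueryCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, calling the loader only on a cache miss.
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    // Call after an update so that the next read goes back to the database.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

The part that usually needs the most care is invalidation: every write path that changes a cached row has to call invalidate (or rely on a short expiry), otherwise clients see stale data.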
Images:
Stop storing your image files on your web servers! That creates lots of scalability issues. Move them to S3 and serve them directly from there. S3 gives you effectively unlimited, highly durable storage, and serving the images directly from S3 reduces the load on your web servers.
Deployments:
There are so many solutions here that it just becomes a question about which method someone prefers. If you use Elastic Beanstalk then it provides a solution for deployments. If you don't use EB, then there are hundreds of solutions to pick from. I'd recommend designing your environment first, then choosing an automated deployment solution that will work with the environment you have designed.
Backups:
If you do this right you shouldn't have much on your web servers to back up. With Elastic Beanstalk all you will need in order to rebuild your web servers is the code and configuration files you have checked into Git. If you end up having to back up EC2 servers you will want to look into EBS snapshots.
For database backups, RDS will perform a daily backup automatically. If you want backups outside RDS you can schedule those yourself using pg_dump with a cron job.
For images, you can enable S3 versioning and cross-region replication.
CDN:
You didn't mention this, but you should look into a CDN. This will allow your application to be served faster while reducing the load on your servers. AWS provides the CloudFront CDN, and I would also recommend looking at Cloudflare.
I'm working on a plugin for an emulator that is going to allow people to host a control panel through a website to view statistics, etc. Currently I have XAMPP installed, which is running my SQL server, and the HTTP server is being handled through the Netty networking library in Java.
I'm curious whether there is a way to host the SQL server from within Java, similar to the HTTP server. It would also greatly simplify the installation process for the plugin.
The other option was to use ObjectDB, but after looking into it, it seems like it requires Quercus and I don't want to go through that.
The term you're looking for is "embedded database".
There are a number of databases that you can use as an embedded database, and you can choose to use them as an in-memory database (which means the data is gone when your application stops) or make them save data to a file on disk, so that the data is still there when you stop and re-start the application.
Examples of databases that can be used in this way: H2 Database, HSQLDB, Apache Derby.
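The difference between the in-memory and on-disk modes usually comes down to the JDBC URL. Here is a minimal sketch assuming H2 is the database chosen and its jar is on the classpath; the class name and the database names passed in are made up for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch assuming H2 as the embedded database.
class EmbeddedDb {
    // In-memory database: the data is gone when the application stops.
    static String inMemoryUrl(String name) {
        return "jdbc:h2:mem:" + name;
    }

    // File-based database: the data is kept on disk under the given path.
    static String fileUrl(String path) {
        return "jdbc:h2:" + path;
    }

    // Opens a connection; this fails with an SQLException
    // if the H2 driver is not on the classpath.
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }
}
```

For the plugin, something like EmbeddedDb.open(EmbeddedDb.fileUrl("./data/stats")) would create the database files next to the application on first use, so users would not need to install a separate database server.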
Can I use an embedded Derby database as a non-embedded one in the future? In that case, will I need to migrate the data, or will I just need to change the driver in JDBC? If it is more complicated, what will I have to do?
Yes, you can. A Derby database is identical, whether it's accessed by a standalone program using the embedded driver, or by multiple client programs communicating with the Derby network server.
The Derby network server is just some "glue" software which implements the DRDA remote database protocols to implement JDBC-over-the-net and then uses the normal embedded database access to access your database on the server side.
If you wish, there is even a slightly more advanced configuration called the "embedded server" which allows you to have your program which uses the embedded driver to access your database share that access with other networked clients by simultaneously acting as a networked server.
Here's some more information about that last option: http://db.apache.org/derby/docs/10.10/adminguide/radminembeddedserverex.html
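From the client's point of view, the switch from embedded access to the network server mostly shows up in the JDBC URL (and in which Derby driver jar is on the classpath). A small sketch follows; the helper class and the database name are hypothetical, and 1527 is the network server's default port.

```java
// Sketch of the two URL forms for the same Derby database.
class DerbyUrls {
    // Embedded driver: the database runs inside the application's JVM.
    static String embedded(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    // Client driver: talks to a Derby network server over the network.
    static String networkClient(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }
}
```

The rest of the JDBC code (statements, result sets, transactions) should not need to change when you move from one URL to the other.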
I'm a Java EE developer and we typically use Weblogic to deploy our apps. Now I'm faced with a new desktop application which requires logging, database connectivity and mail.
After some investigation I'm realizing that desktop apps are a completely new world to me and I'm not sure if I'm choosing the right libraries to support my app.
These are my questions:
In our Weblogic projects we used Log4j and I want to use it again in my desktop app. Is it a bad idea? Should I use a better logging framework?
In Weblogic we retrieve database connections with JNDI but now it seems impossible to do the same. How do I perform the same action in a desktop application so I can connect with a remote database? Is the combination c3p0 + database driver a good approach for this?
Is there any framework/JAR which provides all this stuff (log + database + mail) as an integrated solution? Workmates told me Spring could help. I also found Warework.
In our Weblogic projects we used Log4j and I want to use it again in
my desktop app. Is it a bad idea? Should I use a better logging
framework?
No, it is not a bad idea, and it works perfectly well. Personally, I'd go with java.util.logging, as it does the job fairly well and reduces your application's footprint (storage). Its configuration is a bit tricky, though.
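To give an idea of what the java.util.logging setup involves, here is a minimal sketch that configures a logger programmatically instead of through a logging.properties file. The class name and logger name are made up for the example, and a real desktop app would probably add a FileHandler as well.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class DesktopLogging {
    // Returns a logger that writes to the console at the given level.
    static Logger configure(String name, Level level) {
        Logger logger = Logger.getLogger(name);
        // Do not also log through the root logger's default handlers.
        logger.setUseParentHandlers(false);
        // Remove handlers left over from an earlier call.
        for (Handler old : logger.getHandlers()) {
            logger.removeHandler(old);
        }
        ConsoleHandler handler = new ConsoleHandler();
        // Both the handler and the logger filter by level, so set both.
        handler.setLevel(level);
        handler.setFormatter(new SimpleFormatter());
        logger.addHandler(handler);
        logger.setLevel(level);
        return logger;
    }
}
```

The tricky part mentioned above is largely the two-level filtering: a message has to pass the logger's level and then the handler's level, and ConsoleHandler defaults to INFO, which is why FINE messages often seem to disappear.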
In Weblogic we retrieve database connections with JNDI but now it
seems impossible to do the same. How do I perform the same action in a
desktop application so I can connect with a remote database? Is the
combination c3p0 + database driver a good approach for this?
You can connect directly to your database using the plain java.sql JDBC API (there are plenty of examples on the internet), but you always have to distribute the vendor's database driver as part of your application (MySQL, Oracle, DB2, etc.). It is also possible to use the connection pools shipped with those drivers via their proprietary APIs (fairly easy to encapsulate). Nevertheless, there are a number of issues:
latency; database protocols are fairly sensitive when it comes to latency (distance between client and database server). Having a database in the UK and desktop clients in US is probably not a good idea.
security 1; you have to distribute database user credentials to each and every desktop client. Be aware of that.
security 2; your database security requirements may demand transport security (packet encryption).
change management; applying non-backward compatible updates to your database requires you to update all desktop clients (believe me - it's not fun).
network; depending on your environment, certain ports and/or protocols may be blocked.
Is there any framework/JAR which provides all this stuff (log + database +
mail) as an integrated solution? Workmates told me Spring could help.
I also found Warework.
Logging and database access are not an issue and work fairly well without any third-party framework. Of course, those frameworks might provide value regarding other aspects (abstraction, DI, JDBC abstraction, etc.), but this is a topic of detailed software design. Sending emails directly from a desktop application might become an issue, regardless of the framework in use. Just some things to keep in mind:
which SMTP relay server do you want to use?
in case of an enterprise environment, your IT operations teams might not allow you to use their SMTP server from each desktop (keep spam in mind).
Conclusion: even in desktop scenarios, an application server is not a bad idea. Have your desktop application communicate only with the application server, e.g. via JSON, XML or SOAP over HTTP/HTTPS, or via RMI. The application server should then be responsible for the complex tasks such as database access, transaction management, fine-grained security, and email.
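As a sketch of that setup, the desktop client could hand a task such as sending mail to the application server as a JSON request over HTTPS. The example below uses the java.net.http API (Java 11 or later); the /api/mail path and the JSON body are hypothetical and would be defined by your own server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

class MailRequestSketch {
    // Builds a POST request asking the application server to send a mail.
    // The endpoint path is an assumption made for this example.
    static HttpRequest sendMailRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/mail"))
                .timeout(Duration.ofSeconds(10))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request would then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). The desktop client never sees database credentials or the SMTP relay; only the application server does.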