I am developing a middleware application using a Java, Spring, Hibernate, SLF4J, Log4j and Oracle DB stack. Currently I log output to a text file. I want to store logs in the database for troubleshooting purposes. I tried using the Log4j DB appender to log directly into the DB, but I found the performance to be too slow. So now, instead, I let Log4j append to a file, and in a separate thread I read the log file line by line and insert the lines into the database. This method is not too slow, and it does not affect the performance of the main application.
My question is: does anyone have a better idea, or is there a better way to do it? I don't want to use tools like Loggly or Splunk, because for my purpose those tools are overkill. I want to know of any homegrown techniques I can use.
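For reference, here is a minimal sketch of the tail-and-batch approach the question describes, assuming a simple LOGS(MESSAGE) table and a JDBC DataSource wired up elsewhere (e.g. by Spring); both are illustrative. Batching the inserts is what keeps this faster than the per-event DB appender:

```java
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Reads the log file in its own thread and batch-inserts lines,
// so the main application never blocks on the database.
public class LogFileShipper implements Runnable {
    private static final int BATCH_SIZE = 500;
    private final DataSource dataSource; // assumed to be configured elsewhere
    private final String logFile;

    public LogFileShipper(DataSource dataSource, String logFile) {
        this.dataSource = dataSource;
        this.logFile = logFile;
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(logFile))) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() >= BATCH_SIZE) {
                    flush(batch);
                }
            }
            flush(batch); // whatever is left at end of file
        } catch (Exception e) {
            e.printStackTrace(); // a real shipper would log and retry here
        }
    }

    private void flush(List<String> batch) throws Exception {
        if (batch.isEmpty()) return;
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement("INSERT INTO LOGS (MESSAGE) VALUES (?)")) {
            for (String msg : batch) {
                ps.setString(1, msg);
                ps.addBatch();
            }
            ps.executeBatch(); // one round trip per batch instead of per line
        }
        batch.clear();
    }
}
```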
I know you have said you don't want to use external tools, but I think that is a mistake. The effort you are putting into creating a bespoke solution for your logging has already been made by others, and their work will provide a better, more efficient and more robust solution than you can build yourself.
For starters, loading the log files into the database simply to make them more searchable is a bad idea. You then have the performance overhead of running the database and loading all the files in, plus having to write and test all your code to do this.
I would recommend looking at the Logstash tool.
If you are determined to go with your solution of loading the logs into a database, then you need to provide some information on what type of database you intend to use.
I'm processing info in Google Cloud Dataflow. We tried to use JPA to insert or update the data in our MySQL database, but those queries shut down our server. So we've decided to change our approach...
I want to generate a MySQL dump or .sql file into which we can write the new info processed through Dataflow. I want to know if there is a built-in way to do so, or do I have to do it myself?
Let me explain a little more: we have an input XML, we process the info into Java classes, and we have a JSON dump of the DB so we can see what is already online without making so many calls. With this in mind, we compare the new info with the info we already have and decide whether each record is new or just an update.
How can I do this via Java/Maven? I need code to generate this file...
Yes, Cloud Dataflow processes data in parallel on many machines. As such, it is not very surprising that other services may not be able to keep up or that some quotas are hit.
Depending on your specific use case, you may be able to slow or throttle Dataflow down without changing your approach. You might limit the number of workers, limit parallelism, use the IntraBundleParallelization API, etc. This might be a better path overall. We are also working on more explicit ways to throttle Dataflow.
Now, it is not really feasible for any system to automatically generate a .sql file for your database. However, it should be pretty straightforward to use primitives like ParDo and TextIO.Write to generate such a file via a Dataflow pipeline.
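As a hedged illustration only (the comma-separated stand-in input, table name and bucket path below are all made up), a Dataflow 1.x pipeline that turns records into INSERT statements and writes them out as text could look roughly like this:

```java
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;

public class SqlFilePipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply(Create.of("1,alice", "2,bob")) // stand-in for your real XML-derived input
     .apply(ParDo.of(new DoFn<String, String>() {
       @Override
       public void processElement(ProcessContext c) {
         String[] f = c.element().split(",");
         // NOTE: values must be properly escaped in real code.
         c.output(String.format(
             "INSERT INTO people (id, name) VALUES (%s, '%s');", f[0], f[1]));
       }
     }))
     // Writes sharded text files with this prefix; concatenate them into one .sql file.
     .apply(TextIO.Write.to("gs://my-bucket/output/inserts"));

    p.run();
  }
}
```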
We have a Java-based system with Postgres as the database. For some reasons we want to propagate certain changes on a timely basis (say, every hour) to a different location. The two broad approaches are:
1. Logging all the changes to a file as and when they happen. However, this approach will scatter the code everywhere.
2. Somehow finding the incremental changes in Postgres between two timestamps in some log files and sending those. However, I am not sure how feasible this approach is.
Does anyone have any thoughts/ideas around this?
Provided that the database is not very large, you could do it quick and dirty (sketched after this answer) by just:
1. Dumping the entire PostgreSQL database to a text file.
2. (If the dump file is not sorted *1) sorting the text file.
3. Creating a diff against the previous dump file.
Of course, I would only advise this for a situation where your database is going to stay relatively small and you are just going to use it for a couple of servers.
*1: I do not know if it is somehow sorted, check the docs.
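A rough sketch of that dump/sort/diff flow driven from Java — the commands, flags and file names are illustrative, pg_dump options vary by version, and authentication is assumed to be handled by .pgpass or PGPASSWORD:

```java
import java.io.File;

// Quick-and-dirty change capture: dump, sort, then diff against the previous dump.
public class DumpAndDiff {
    public static void main(String[] args) throws Exception {
        run("pg_dump", "--data-only", "--dbname=mydb", "--file=current.sql");
        run("sort", "-o", "current.sorted.sql", "current.sql");
        // diff exits with 1 when the files differ, so don't treat that as failure.
        new ProcessBuilder("diff", "previous.sorted.sql", "current.sorted.sql")
                .redirectOutput(new File("changes.diff"))
                .start()
                .waitFor();
    }

    private static void run(String... cmd) throws Exception {
        int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();
        if (exit != 0) {
            throw new IllegalStateException("Command failed: " + String.join(" ", cmd));
        }
    }
}
```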
There are a few different options available:
Depending on the amount of data being written you could give Bucardo a try.
Otherwise it is also possible to do something with PgQ in combination with Londiste.
Or create something yourself using triggers, so you can generate some kind of audit table.
There are many pre-packaged approaches, so you probably don't need to develop your own. Many of the options are summarized and compared on this Wiki page:
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
Many of them are based on the use of triggers to capture the data, with automatic generation of the triggers based on a more user-friendly interface.
Instead of writing your own solution, I would advise leveraging work already done by others. In the case you described I would go for PgQ + Londiste (both part of the Skytools package), which are easy to set up and use. If you do not want streaming replication, you could still use PgQ/Londiste to easily capture DMLs and write them to a file that you can load when needed. This would allow you to expand your setup/processing when new requirements come.
We have a utility spring-mvc application that doesn't use a database, it is just a soap/rest wrapper. We would like to store an arbitrary message for display to users that persists between deployments. The application must be able to both read and write this data. Are there any best practices for this?
Multiple options.
Write something to the file system - Great for persistence. A little slow. The primary drawback is that it would probably have to be a shared file system, as any type of clustering wouldn't deal well with this. Then you get into file-locking issues. Very easy implementation.
Embedded DB - Similar benefits and pitfalls as just writing to the file system, but probably deals better with locking/transactional issues. Somewhat more difficult implementation.
Distributed Cache - Like Memcached - A bit faster than file, though not much. Deals with the clustering and locking issues. However, it's not persistent. Fairly reliable for a short webapp restart, but definitely not 100%. More difficult implementation, plus you need another server.
Why not use an embedded database? Options are:
H2
HSQL
Derby
Just include the JAR file on the webapp's classpath and configure the JDBC URL as normal.
It's perfect for demos and easy to substitute when you want to switch to a bigger database server.
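For instance, a minimal sketch with H2 in embedded mode — the file path, credentials and schema are invented for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Only the h2 JAR needs to be on the classpath; the database lives in ./data/appdb.*
// on disk and therefore survives redeployments of the webapp.
public class EmbeddedDbDemo {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:h2:./data/appdb", "sa", "");
             Statement st = con.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS message (body VARCHAR(4000))");
            st.executeUpdate("DELETE FROM message");
            st.executeUpdate("INSERT INTO message VALUES ('Maintenance window at 22:00 UTC')");
            try (ResultSet rs = st.executeQuery("SELECT body FROM message")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}
```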
I would simply store that in a file on the filesystem. It's possible to use an embedded database or something like that, but for one message, a file will be fine.
I'd recommend you store the file outside of the application directory.
It might be alongside (next to) it, but don't go storing it inside your "webapps/" directory, or anything like that.
You'll probably also need to manage concurrency. A global (static) read/write lock should do fine.
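A minimal sketch of that, assuming the message lives in a single file (the path is illustrative). Note the lock only protects threads within one JVM; a clustered deployment needs more:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Message store guarded by a global (static) read/write lock.
public final class MessageStore {
    private static final ReadWriteLock LOCK = new ReentrantReadWriteLock();
    private static final Path FILE = Paths.get("/var/app-data/message.txt"); // outside webapps/

    public static String read() throws IOException {
        LOCK.readLock().lock();
        try {
            return Files.exists(FILE)
                    ? new String(Files.readAllBytes(FILE), StandardCharsets.UTF_8)
                    : "";
        } finally {
            LOCK.readLock().unlock();
        }
    }

    public static void write(String message) throws IOException {
        LOCK.writeLock().lock();
        try {
            Files.write(FILE, message.getBytes(StandardCharsets.UTF_8));
        } finally {
            LOCK.writeLock().unlock();
        }
    }
}
```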
I would use JNDI. Why over-complicate?
I have completed my Address Book project in core Java, in which the data is stored in a database (MySQL).
I am facing a problem: when I run my program on another computer, the whole database has to be created again.
So please tell me about any alternative for storing my data without using database software like MySQL, SQL Server, etc.
You can use an in-memory database such as HSQLDB, Derby (a.k.a. JavaDB), H2, etc.
All of those can run without any additional software installation and can be made to act like just another library.
I would suggest using an embeddable, lightweight database such as SQLite. Check it out.
From the features page (under the section Suggested Uses For SQLite):
Application File Format. Rather than using fopen() to write XML or some proprietary format into disk files used by your application, use an SQLite database instead. You'll avoid having to write and troubleshoot a parser, your data will be more easily accessible and cross-platform, and your updates will be transactional.
The whole point of StackOverflow was so that you would not have to email around questions/answers :)
You could store data on the filesystem or in memory (using serialisation, etc.), which are simple alternatives to a DB. You can even use HSQLDB, which can run completely in memory.
If your data is not so big, you may use a simple text file and store everything in it, then load it into memory. But this will mean changing the way you modify/query your data.
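A minimal sketch of that idea using plain Java serialisation instead of a hand-rolled text format (the file name and the name-to-phone Map are assumptions):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Persists the whole address book to a single file and loads it back into memory.
public class FileStore {
    @SuppressWarnings("unchecked")
    public static Map<String, String> load(File file) throws IOException, ClassNotFoundException {
        if (!file.exists()) return new HashMap<>();
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (Map<String, String>) in.readObject();
        }
    }

    public static void save(File file, Map<String, String> book) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(book); // HashMap is Serializable, so this just works
        }
    }
}
```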
Database software like MySQL provides an abstraction in terms of implementation effort. If you wish to avoid using it, you can think of having your own database in XML or flat files. XML is still the better choice, as XML parsers and handlers are readily available. Putting your data in a customised database or flat files will not be manageable in the long run, though.
Why don't you explore SQLite? It is file based, which means you don't need to install it separately, and you still have standard SQL to retrieve or interact with the data. I think SQLite would be a better choice.
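As a sketch, assuming the Xerial sqlite-jdbc driver (org.xerial:sqlite-jdbc) is on the classpath — the whole database then travels with the program as a single file:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class SqliteAddressBook {
    public static void main(String[] args) throws Exception {
        Class.forName("org.sqlite.JDBC"); // older driver versions need explicit registration
        try (Connection con = DriverManager.getConnection("jdbc:sqlite:addressbook.db");
             Statement st = con.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS contacts (name TEXT, phone TEXT)");
            try (PreparedStatement ps =
                         con.prepareStatement("INSERT INTO contacts VALUES (?, ?)")) {
                ps.setString(1, "Alice");
                ps.setString(2, "555-0100");
                ps.executeUpdate();
            }
            try (ResultSet rs = st.executeQuery("SELECT name, phone FROM contacts")) {
                while (rs.next()) {
                    System.out.println(rs.getString("name") + " -> " + rs.getString("phone"));
                }
            }
        }
    }
}
```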
Just use Prevayler (prevayler.org). It's faster and simpler than using a database.
I assume from your question that you want some form of persistent storage on the local file system of the machine your application runs on. In addition to that, you need to decide how the data in your application is to be used, and the volume of it. Do you need a database? Are you going to be searching the data by different fields? Do you need a query language? Is the data small enough to fit into a simple data structure in memory? How resilient does it need to be? The answers to these types of questions will help lead to the correct choice of storage. It could be that all you need is a simple CSV file, XML or similar. There are a host of lightweight databases such as SQLite, Berkeley DB, JavaDB etc - but whether or not you need the power of a database is up to your requirements.
A store that I'm using a lot these days is Neo4j. It's a graph database and is not only easy to use but also is completely in Java and is embedded. I much prefer it to a SQL alternative.
In addition to the other answers about embedded databases: I have been working on an object database that directly serialises Java objects without the need for an ORM. Its name is Sofof and I use it in my projects. It has many features, which are described on its website.
I am a .NET developer who is starting to do more and more Java development at work. I have a specific question about caching that I hope you can answer or offer suggestions on. We are starting a Java project that will be deployed on a Linux box running JBoss. We are planning ahead and trying to think about our caching strategy. One thing we would like to do is output-cache the pages, since our content will likely be cacheable for 8 hours or so. I started looking at mod_cache, and it does what we want. The one other requirement I need to meet is that for every request, I need to do some custom logging: I need the basic request URL plus some other business-logic data, and I need to stuff it all into a database. My questions are these:
1) How can I put code at the mod_cache level to kick off the custom logging process?
2) I want to queue these logging messages up somehow, since I don't want to go to the DB for every request. What would be the best way to tackle this?
I would appreciate any suggestions or solutions if you got 'em!
I assume your planned setup is Apache httpd -> mod_cache -> mod_proxy/mod_jk -> JBoss
1) You can't, since mod_cache at the Apache level never even gets down to calling Java. Hence, you would need to check whether mod_cache itself has some logging facility into which you can hook something, or you would need to modify mod_cache and recompile it. This has nothing to do with Java, and I don't think you can do it in Java.
2) Again, this is not a Java question when mod_cache is handling the response on its own without calling JBoss.
JBoss/Catalina/Tomcat are pretty fast when it comes down to delivering pages rendered by JSPs or other web frameworks. Set the cache expiration date and let the browser handle the cache.
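For instance, a hedged sketch of a servlet filter that sets those headers (the class name is made up; the 8-hour lifetime comes from the question):

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Marks every response passing through it as cacheable for 8 hours.
public class CacheHeaderFilter implements Filter {
    private static final int EIGHT_HOURS_SECONDS = 8 * 60 * 60;

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse response = (HttpServletResponse) res;
        response.setHeader("Cache-Control", "public, max-age=" + EIGHT_HOURS_SECONDS);
        response.setDateHeader("Expires",
                System.currentTimeMillis() + EIGHT_HOURS_SECONDS * 1000L);
        chain.doFilter(req, res);
    }

    @Override public void init(FilterConfig cfg) {}
    @Override public void destroy() {}
}
```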
You might look into using memcached instead. I don't know whether there is a Tomcat/JBoss wrapper module for this program or not. It will let you cache just about any data that you can serialize and make a string key for.
You have to write an accessor for what you need, which tries to pull stuff out of the memcached process, and/or calls a generator function upon cache miss (as well as caching the generator's result). Within this accessor, you could log cache hits and misses.
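A sketch of that accessor, assuming the spymemcached client and SLF4J for the hit/miss logging (both assumptions; the logged lines could just as well be queued up for your DB writer):

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;
import net.spy.memcached.MemcachedClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Try the cache first, fall back to a generator on a miss, cache the result,
// and log both outcomes so they can be shipped to the database later.
public class CachingAccessor {
    private static final Logger log = LoggerFactory.getLogger(CachingAccessor.class);
    private static final int TTL_SECONDS = 8 * 60 * 60; // 8 hours, per the question

    private final MemcachedClient client;

    public CachingAccessor() throws Exception {
        this.client = new MemcachedClient(new InetSocketAddress("localhost", 11211));
    }

    public String get(String key, Supplier<String> generator) {
        Object cached = client.get(key);
        if (cached != null) {
            log.info("cache hit: {}", key);
            return (String) cached;
        }
        log.info("cache miss: {}", key);
        String fresh = generator.get();
        client.set(key, TTL_SECONDS, fresh); // cache the generator's result
        return fresh;
    }
}
```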
The memcached daemon is written in C, of course, so it won't "mark and sweep" itself to death holding onto data, unlike Java (or most of the other "modern" language runtimes).
On the other hand, maybe mod_cache has some logging hooks. Maybe you should look into that, as this will lighten the load on your java process.
Is this close to the sort of thing you are looking for?