I'm developing a token-based API gateway. It basically provides a token to authenticated clients, but I'm not sure how to remove expired tokens. For every request I check whether the token is valid or not.
Option 1 is:
Mark the status of the token as expired in the database table row,
and create a scheduler that runs at midnight to delete expired tokens.
Option 2 is:
Delete the token's row as soon as it expires.
In this case there is no need to run a scheduler.
Normally this API gateway will handle around 1,000 requests per second, and this will increase day by day.
So I'm not sure which option I should use.
The technologies I have used are Spring MVC, Spring Data JPA and PostgreSQL. It will be deployed on a Tomcat server.
Neither of the two options is particularly good, as both will modify a table row and therefore generate I/O. At 1,000 requests per second you need a better solution. There is a blog post on 2ndQuadrant about authenticating users through connection pooling in the context of row-level security. The blog post has some issues IMHO, and some non-relevant material as well, so I'll try to redo it here in the right way (or read my comment on the blog post over there).
In Java - as in most other programming languages and/or frameworks - connection pooling is the preferred way to connect to a database server for performance reasons. There is an implicit contract that the application requests a Connection instance from the pool, uses it and then returns the instance to the pool for some other thread. Holding on to a Connection is not an option as it breaks the pooling logic. So proceed as follows:
Connection pool object
Create a connection pool object with database cluster credentials. That role should be GRANTed all necessary privileges on tables and other objects.
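For example (the role name and business tables here are hypothetical):

CREATE ROLE myapp_role LOGIN PASSWORD 'change-me';
-- Privileges on the application's ordinary tables only; app_users and the
-- sessions table below are reached exclusively through SECURITY DEFINER functions.
GRANT SELECT, INSERT, UPDATE, DELETE ON orders, products TO myapp_role;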
Authentication
In the application a user authenticates by calling myapp_login(username, password) or something similar, using a Connection from the pool. In the database the credentials are checked against a table users, or whatever it is called in your setup. If a match is found, create a random token and insert it into a table:
CREATE UNLOGGED TABLE sessions (
    token text DEFAULT uuid_generate_v4()::text, -- requires the uuid-ossp extension
    login_time timestamp DEFAULT CURRENT_TIMESTAMP,
    user_name text,
    ...
);
Add as many fields as you want. I use a uuid here (cast to text, read on) but you could also md5() some data or use some pgcrypto routine.
This table has to be fast so it is UNLOGGED. That means it is not crash-safe and will be truncated after a server crash, but that is not a problem: all database sessions will have been invalidated anyway. Also, do not put any constraints like NOT NULL on the table, because the only access to this table is through the functions that you as a developer design, no ordinary user ever touches this table, and every constraint costs more CPU cycles.
The myapp_login() function looks somewhat like this:
CREATE FUNCTION myapp_login(uname text, password text) RETURNS text AS $$
DECLARE
    t text;
BEGIN
    -- In production, store and compare password hashes (e.g. pgcrypto's crypt())
    -- rather than plaintext passwords.
    PERFORM * FROM app_users WHERE username = uname AND pwd = password;
    IF FOUND THEN
        INSERT INTO sessions(user_name) VALUES (uname) RETURNING token INTO t;
        -- %L quotes the token as a literal; a bare uuid is not valid syntax here
        EXECUTE format('SET SESSION "my_app.session_user" TO %L', t);
        RETURN t;
    END IF;
    SET SESSION "my_app.session_user" = '';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER;
REVOKE EXECUTE ON FUNCTION myapp_login(text, text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION myapp_login(text, text) TO myapp_role;
As you can see, the token is also set in a session-level configuration variable with SET SESSION (which needs a literal text value, hence the uuid::text cast and the EXECUTE of a format() string) and then returned to the caller. That session token should be stored somewhere in your application code on the Java side.
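Any later query in the same database session can read that value back, for instance:

SELECT current_setting('my_app.session_user');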
The function does a lookup on the app_users table and an INSERT on the sessions table. The first is cheap, the second is expensive.
Resume the same session for further queries
If your app user needs further database access after the first queries, then get a Connection instance from the connection pool again, but don't call myapp_login(); call myapp_resume(token) instead. This latter function looks up the token in the sessions table (cheap) and, if found, sets the session variable in this new database session. You can also check that the login_time value is recent, or refresh it with CURRENT_TIMESTAMP to keep the session "alive" (expensive), or do any other necessary business.
The trick is to keep resuming the session as lean as possible, because this is likely to happen multiple times during a session (from the application perspective).
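To make this concrete, here is a minimal sketch of what myapp_resume() could look like, following the same pattern as myapp_login() (the exact freshness checks are up to you):

CREATE FUNCTION myapp_resume(tok text) RETURNS boolean AS $$
BEGIN
    -- cheap lookup; add a login_time check here if you want sessions to expire
    PERFORM * FROM sessions WHERE token = tok;
    IF FOUND THEN
        EXECUTE format('SET SESSION "my_app.session_user" TO %L', tok);
        RETURN true;
    END IF;
    SET SESSION "my_app.session_user" = '';
    RETURN false;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER;

REVOKE EXECUTE ON FUNCTION myapp_resume(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION myapp_resume(text) TO myapp_role;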
Close the session
When your app user is done, call myapp_logout(token), which deletes the row from the sessions table that corresponds to the token.
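Again a sketch, mirroring the functions above:

CREATE FUNCTION myapp_logout(tok text) RETURNS void AS $$
BEGIN
    DELETE FROM sessions WHERE token = tok;
    SET SESSION "my_app.session_user" = '';
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER;

REVOKE EXECUTE ON FUNCTION myapp_logout(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION myapp_logout(text) TO myapp_role;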
Sessions that are not properly closed are not deleted from the sessions table, but I would not worry too much about that. You could schedule a job that runs once a week to delete all rows that are older than 6 hours or so. That would also allow you to figure out where the error comes from, for instance.
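Such a job could be as simple as this (assuming login_time is refreshed on every resume):

DELETE FROM sessions WHERE login_time < CURRENT_TIMESTAMP - INTERVAL '6 hours';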
A final word on the token. A uuid is just a random number, but you could also make a hash of the application user name with some random data and use that, for instance, in RLS or some other row-based access mechanism; the blog post I link to above has good info on that. In an application I have developed myself I link the row from the users table to what the user is allowed to see. In either case you should really weigh the pros and cons: a hash that can be used in RLS sounds nice, but it requires the hash to be re-calculated (which tends to be expensive) and compared to the session hash on every query, while a repeated lookup against a users table is also an overhead. Setting another session variable that can be checked at query time with current_setting() might be a good alternative.
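That last alternative could look roughly like this (the orders table and owner column are hypothetical; the extra SET SESSION line would go inside myapp_login()):

-- inside myapp_login(), next to the existing EXECUTE:
EXECUTE format('SET SESSION "my_app.user_name" TO %L', uname);

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY per_user ON orders
    USING (owner = current_setting('my_app.user_name'));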
I think the easiest way would be like this: when you generate a token in your database, store the time of generation. Then, when a client sends a request, you can check whether the token has expired and delete it at request time.
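A hedged sketch of that check, assuming a tokens(token, created_at) table and a one-hour lifetime:

-- valid only if a row comes back
SELECT token FROM tokens
WHERE token = $1 AND created_at > now() - INTERVAL '1 hour';

-- expired rows can be swept in the same round trip
DELETE FROM tokens WHERE created_at <= now() - INTERVAL '1 hour';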
I have a situation in my Java, Spring-based web app. My server generates coupons (a number mixed with letters, all random but unique); each coupon can be applied or used by one and only one logged-in customer. They are shown on the front end to all users, and then get accepted/selected by the customers. But once accepted by one customer, a coupon gets assigned to him and is not available to anyone else.
I tried synchronizing the code block that checks whether the coupon is already applied/availed. It worked, but in cases where two users click "avail" at exactly the same time, it fails (the coupon gets allocated to both).
Please help.
Do not use synchronization for this. You can store the state of the coupons in a database, and work on this data in a DB transaction, using locks. So:
User tries the coupon, you get the ID
Start a DB transaction, get the coupon row from it, and lock it
Do what you need to, then invalidate the coupon
End the DB transaction, release the lock
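A minimal sketch of that transaction in SQL (PostgreSQL syntax; the coupons(code, customer_id) table and the values are assumptions):

BEGIN;
-- the row lock makes a concurrent transaction wait here until we commit
SELECT * FROM coupons
WHERE code = 'ABC123' AND customer_id IS NULL
FOR UPDATE;
-- if a row came back, assign the coupon; the blocked second transaction
-- will then see customer_id already set and match zero rows
UPDATE coupons SET customer_id = 42
WHERE code = 'ABC123' AND customer_id IS NULL;
COMMIT;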
The database does not necessarily need to be a standalone RDBMS; in a simple case, even SQLite is sufficient. Anyway, DBs most certainly handle race conditions better than you (or most of us) can.
If you prefer to avoid database transactions, you can use a Set with all the generated coupons and a second set referencing only the available coupons. When a user selects a coupon, remove it from the available ones inside a synchronized block. The second user will then fail to obtain it.
I am thinking of setting up a page in an application where each query can return a result set that cannot fit in memory, or where fetching all the results is very expensive. The user will hit "get more" to see more of those results. I wonder if I could use a yielder for Java, something like this (http://benjiweber.co.uk/blog/2015/03/21/yield-return-in-java/), and whether I would need WebSockets, e.g. from Spring (http://docs.spring.io/spring/docs/current/spring-framework-reference/html/websocket.html), so that the client can tell the server to push more results. Also, could you please give an example of the handshake? Would the endpoint URI be based on some session id as well? Also, when databases like OrientDB/Neo4j return Iterables, does that mean we can keep the connection open and fetch the next rows minutes later without problems? Thanks!
You are talking about two different concepts.
Pagination
If you have a large result set and you need to return it piece by piece to avoid long query times or high memory use, you're paginating over the result set.
To do this, the client requests another piece of the set by hitting the "Get More" button. Each time more results are required, the server receives a request from the client and hits the DB with a paginated query.
Example in SQL (page 10, 10 results/page, for instance):
SELECT * FROM Table ORDER BY id LIMIT 10 OFFSET 100; -- a stable ORDER BY keeps pages consistent
Websockets / Yielder
You'll need a websocket / yielder when the server is the one sending data; in other words, the client doesn't request an update, it just keeps the socket open and receives updates from the server when they come.
That's the case of a message service, for example, which avoids constant polling from the client side.
In your case a websocket is absolutely unnecessary. You can also see an example of what I'm saying here -> What's the behavioral difference between HTTP Stay-Alive and Websockets?
However, you can set up a keep-alive connection between your back-end and the database to avoid constantly closing/opening the connection each time the user requests more results.
Finally, your question about Iterable results in Neo4j: Neo4j's result type is an iterable list of Map<String,Object>, which represents a list of key-value pairs. That doesn't keep the connection alive (by default); it only iterates through the returned results of that particular query.
In my code, there is a websocket server that persists information to the database on behalf of the connected client.
I am using Jetty 9, Hibernate, and Postgres.
Essentially, when a client posts a data object via the websocket:
The server deserializes the data
Checks to see if the object already exists in the database, based on content
If a match is found
The server will update the database row, and indicate that the item already exists
Else
The server will create a new database row and indicate that the item was added.
In this system:
An endpoint (i.e., URL) corresponds to a single user principal
Multiple connections to the same endpoint are allowed, meaning multiple connections from different clients to a single user.
Data objects in the database are specific to a user
(This means that the server will only check rows belonging to the user for already-existing items)
This is all working well, except when 2 clients post the same data at precisely the same time. Neither server instance knows about the other, and the database ends up with 2 rows that are the same, except for the server-generated ID.
How can I prevent the server from writing the same data at the same time?
I can't use a UNIQUE constraint in the database, because I actually do have to support having multiple rows with the same data, but posted at different times.
I can get all the other sessions attached to the same endpoint, using "session.getOpenSessions()".
Can I:
Synchronize on the endpoint, somehow?
Synchronize on something attached to the session?
Do something with threading configuration?
Thanks for any suggestions.
When you say:
Checks to see if the object already exists in the database, based on content
you already define a unique constraint, meaning there's a combination of columns that must be unique, so that when you match an existing row you update instead of inserting.
The database already offers a centralized concurrency control mechanism, so you should probably use an index on all the aforementioned columns.
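If that premise really holds for your data, a hedged sketch in PostgreSQL (the items table and its "content" columns user_id and payload are assumptions) lets the database arbitrate the race atomically:

CREATE UNIQUE INDEX items_content_idx ON items (user_id, payload);

INSERT INTO items (user_id, payload, updated_at)
VALUES (42, '...', now())
ON CONFLICT (user_id, payload)
DO UPDATE SET updated_at = now(); -- the "already exists" path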
I am trying to sync multiple databases whose items have GUIDs for IDs, meaning that one item has the same ID on all databases.
My question is:
If I modify or create an item on one database and want to synchronize this change to the other database, should I:
1.) Check if the item is new or just modified; if it's new, use the save() function, and if it's modified, use the update() function,
or
2.) Not check whether it's new or modified and just use the saveOrUpdate() function?
After seeing your use case in the comments, I think the best approach is to track (on both the client and server) when the last updated/last synced time was. In the event that the last sync time is null, or comes before the last updated time, you know that the data needs to be synced.
Now, on to the heart of your question: how to sync it. The client need not know the state of the server when it sends an object to you. In fact, it shouldn't. Consider the case where the client posts an object, your server receives it and processes it, but the connection dies before your client receives the response. This is a very valid scenario and will result in a mismatch of data. As a result, any way that you try to determine whether or not the server has received an object (from the client) is likely to end up in a bad state.
The best solution is really to create an idempotent endpoint on the server (an upsert method, or saveOrUpdate as you referred to it in your question) which is able to determine what to do with the object. The server can query its database by primary key to determine if it has the object or not. If it does, it can update; if not, it can insert.
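In PostgreSQL, for instance, the whole check-and-write can even be a single atomic statement (a hedged sketch; the items table and its columns are assumptions):

INSERT INTO items (id, data, updated_at)
VALUES ($1, $2, now()) -- id is the GUID primary key
ON CONFLICT (id)
DO UPDATE SET data = EXCLUDED.data, updated_at = EXCLUDED.updated_at;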
Understandably, performance is important as well as the data. But stick with primary keys in the database, and the one additional select query you add should be extremely cheap (sub-10 ms). If you really want to squeeze some more performance out, you could always use memcache or redis as a caching layer to determine if you have a certain GUID in your database. This way, you only have to hit memory (not your database) to determine if an object exists or not. The overhead of that would be measured only in the latency between your web server and cache server (since a memory read is incredibly cheap).
tl;dr
Upsert (or saveOrUpdate) is the way to go. Try not to track the state of one machine on another.
I'm looking to use Google's App Engine (Java) to provide the backend to an Android messaging app I'm currently writing. I'm just starting out with GAE but have a little experience with Java (through Android).
The first time someone uses the app it will send some sign-up data to the server; this will be stored in the GAE datastore, and a unique id returned to the phone (or an error message if something broke).
As I can't see something that looks like key = datastore.giveMeAUniqueKey or datastore.hasThisBeenUsedBefore(key), I guess I'm going to have to generate a random key and see if it's been taken (I'm not that sure how to do that, to be honest).
Any ideas (either answers to the specific question, or pointer to useful "getting started" resources)?
Thanks.
If this value is not security sensitive (i.e., it's just a user ID and you have some other method to authenticate the phone), just do an insert and take the key of the newly inserted entity. The datastore will assign a guaranteed-unique key automatically if you insert a new entity without providing one. Alternatively, you can explicitly request an ID with the allocate_ids call.
If the value is security sensitive (it's a session nonce or something used for authentication), use the SecureRandom class to generate a sequence of random bytes. Do not use this as a key for an entity such as a user object; this would preclude changing the session ID if the user's session is compromised. Have a separate user ID used for that purpose, and use this secure nonce only for the authentication step.
Note that simply looping (creating IDs, testing for conflicts, and inserting) is not safe without using a transaction; it's easier (and faster, and cheaper...) just to use App Engine's built-in ID assignment system.