Twitter API rate limit - Java

Here is what I'm trying to do:
I have a list of Twitter user IDs; for each one of them I need to retrieve the complete list of their follower IDs and friend IDs. I don't need anything else, no screen names etc.
I'm using twitter4j, by the way.
Here is how I'm doing it: for each user I execute the following code in order to get the complete list of their follower IDs.
long lCursor = -1;
do {
    IDs response = t.getFollowersIDs(id, lCursor);
    for (long val : response.getIDs()) {
        myIdList.add(val);
    }
    lCursor = response.getNextCursor();
} while (lCursor != 0);
My problem:
According to this page: https://dev.twitter.com/docs/api/1.1/get/followers/ids
the rate limit for getFollowersIDs() is 15 requests, and since this method returns a maximum of 5000 IDs per call, it will only be possible to get 15*5000 IDs per window (or cover 15 users if each has fewer than 5000 followers).
This is really not enough for what I'm trying to do.
Am I doing something wrong? Are there any solutions to improve that? (even slightly)
Thanks for your help :)

The rate limit for that endpoint in v1.1 is 15 calls per 15 minutes per access token. See https://dev.twitter.com/docs/rate-limiting/1.1 for more information about the limits.
With that in mind, if you have an access token for each of your users, you should be able to fetch up to 75,000 (15*5000) follower IDs every 15 minutes for each access token.
If you only have one access token you'll, unfortunately, be limited in the manner you described and will just have to detect when your application hits the rate limit and resume processing once the 15 minutes are up.
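A minimal sketch of that handling with twitter4j, assuming the t, id and myIdList names from the question; TwitterException.exceededRateLimitation() and RateLimitStatus.getSecondsUntilReset() are the relevant twitter4j calls:
import java.util.ArrayList;
import java.util.List;
import twitter4j.IDs;
import twitter4j.Twitter;
import twitter4j.TwitterException;

List<Long> collectFollowerIds(Twitter t, long id)
        throws TwitterException, InterruptedException {
    List<Long> myIdList = new ArrayList<>();
    long lCursor = -1;
    do {
        try {
            IDs response = t.getFollowersIDs(id, lCursor);
            for (long val : response.getIDs()) {
                myIdList.add(val);
            }
            lCursor = response.getNextCursor();
        } catch (TwitterException e) {
            if (e.exceededRateLimitation() && e.getRateLimitStatus() != null) {
                // Wait out the rest of the 15-minute window, then retry the
                // same cursor; re-requesting the page loses nothing.
                Thread.sleep((e.getRateLimitStatus().getSecondsUntilReset() + 1) * 1000L);
            } else {
                throw e;
            }
        }
    } while (lCursor != 0);
    return myIdList;
}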

Related

Loop through multiple API calls with pagination in Talend

I'm trying to retrieve data out of an API using calls with different tokens (credentials); each token represents a different division.
The API has a limit of rows per page, which is why it is paginated. So far my job extracts the data out of the API by looping through the pages until the record count is 0, but it only works for one token, and I have 6 different tokens. If I add another token, it should retrieve all records for token 1 and then continue with token 2, etc.
So the only thing remaining is to loop through each token while making sure all records are retrieved for each one.
Below is my job:
Tjava_1 code:
globalMap.put("page",new Integer("1"));
globalMap.put("enddata", new Boolean(true));
Tloop_1:
tFixedFlowInput_1 (this is where I put all the different tokens; as I explained above, with one token it works perfectly, but with more than one it doesn't):
tJavaRow_1:
globalMap.put("Token", new String(row1.Token));
globalMap.put("Division", new String(row1.Division));
System.out.println("Token: " + globalMap.get("Token"));
System.out.println("Division: " + globalMap.get("Division"));
tFlowToIterate_2:
tREST_2:
tExtractJSONFields_6:
tJavaRow_6:
System.out.println(row21.count);
globalMap.put("page", (Integer) globalMap.get("page") + 1);
if (row21.count == null || row21.count == 0) {
    globalMap.put("enddata", Boolean.FALSE);
}
This is the result I get when I use 1 token in tFixedFlowInput_1, which works as expected.
This is the result if I add 2 or more tokens, which is what I want to resolve. Instead of doing this, it should loop through all pages for token 1 until there are no records, and then continue with token 2, and so on.
Any help will be greatly appreciated!!
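One likely culprit (a sketch, not from the original thread): the page and enddata globals are only initialized once in tJava_1, so the second token starts with the state left over from the first. Resetting them per incoming token row in tJavaRow_1 would make each token paginate from page 1 again:
// tJavaRow_1 (sketch): reset the pagination state for every token/division row
globalMap.put("Token", row1.Token);
globalMap.put("Division", row1.Division);
globalMap.put("page", Integer.valueOf(1));   // restart pagination at page 1
globalMap.put("enddata", Boolean.TRUE);      // re-arm the loop condition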

How to ensure the expiry of a stream data structure in redis is set once only?

I have a function that uses Lettuce to talk to a Redis cluster.
In this function, I insert data into a stream data structure:
import io.lettuce.core.cluster.SlotHash;
...
public void addData(Map<String, String> dataMap) {
    var sync = SlotHash.getSlot(key).sync();
    sync.xadd(key, dataMap);
}
I also want to set the TTL when I insert a record for the first time, because part of the user requirement is to expire the structure after a fixed length of time, in this case 10 hours.
Unfortunately the XADD command does not accept an extra parameter to set the TTL the way the SET command does.
So now I am setting the TTL this way:
public void addData(Map<String, String> dataMap) {
    var sync = SlotHash.getSlot(key).sync();
    sync.xadd(key, dataMap);
    sync.expire(key, 36000 /* 10 hours, in seconds */);
}
What is the best way to ensure that I set the expiry time only once (i.e. when the stream structure is first created)? I should not set the TTL on every call, because each xadd would then be followed by an expire, which effectively postpones the expiry indefinitely.
I know I could always check the number of items in the stream, but that is an overhead. I don't want to keep flags on the Java application side, because the app could be restarted and that information would be lost from memory.
You may want to try a Lua script. The sample below sets the expiry only if none is set yet for the key, and it works with any type of key in Redis:
eval "local ttl = redis.call('ttl', KEYS[1]); if ttl == -1 then redis.call('expire', KEYS[1], ARGV[1]); end; return ttl;" 1 mykey 12
The script also returns the remaining expiry time of the key, in seconds.
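As a sketch of wiring that script into the Java side with Lettuce's sync API (shown against a single node for brevity; the cluster connection exposes the same sync commands, and the stream name and TTL here are illustrative):
import io.lettuce.core.RedisClient;
import io.lettuce.core.ScriptOutputType;
import io.lettuce.core.api.sync.RedisCommands;

public class ExpireOnce {
    private static final String EXPIRE_IF_UNSET =
            "local ttl = redis.call('ttl', KEYS[1]); " +
            "if ttl == -1 then redis.call('expire', KEYS[1], ARGV[1]); end; " +
            "return ttl;";

    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379");
        RedisCommands<String, String> sync = client.connect().sync();

        sync.xadd("mystream", "field", "value");
        // Sets the 10-hour TTL only on the first call; later calls are no-ops.
        Long previousTtl = sync.eval(EXPIRE_IF_UNSET, ScriptOutputType.INTEGER,
                new String[]{"mystream"}, "36000");
        System.out.println("TTL before this call: " + previousTtl);

        client.shutdown();
    }
}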

Time spent on each Chrome tab/website from data stored in the History SQLite file

I am trying to find out the time spent on each tab/website by the user.
For example, if I visited YouTube and watched it for 10 minutes, then I should be able to see something like this:
www.youtube.com ---> 10 minutes
I already made a connection with the SQLite database (the History file present in the Chrome directory) and was able to run the following SQL command to fetch the data:
SELECT urls.id, urls.url, urls.title, urls.visit_count, urls.typed_count,
       urls.last_visit_time, urls.hidden, urls.favicon_id, visits.visit_time,
       visits.from_visit, visits.visit_duration, visits.transition, visit_source.source
FROM urls
JOIN visits ON urls.id = visits.url
LEFT JOIN visit_source ON visits.id = visit_source.id
So can anyone tell me which combination of columns I can use to get the time spent on each website?
Please note: visit_duration is not giving me data I can use directly.
visit_duration stores the duration in microseconds; you need to convert and format that number. Here is one way to show a human-readable visit duration:
SELECT urls.url AS URL,
       (visits.visit_duration / 3600 / 1000000) || ' hours ' ||
       strftime('%M minutes %S seconds', visits.visit_duration / 1000000 / 86400.0) AS Duration
FROM urls LEFT JOIN visits ON urls.id = visits.url
Here is a sample output:
URL                              Duration
http://www.stackoverflow.com/    3 hours 14 minutes 15 seconds
You can also use strftime if you want more formatting options.
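If you end up aggregating in Java instead, a small sketch using the Xerial sqlite-jdbc driver (an assumption; any JDBC SQLite driver would do) that sums visit_duration per URL could look like this. Note that Chrome keeps the live History file locked, so run it against a copy:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ChromeHistoryDuration {
    public static void main(String[] args) throws Exception {
        // Point the JDBC URL at a copy of Chrome's History file.
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:History");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT urls.url, SUM(visits.visit_duration) AS total_us " +
                     "FROM urls JOIN visits ON urls.id = visits.url " +
                     "GROUP BY urls.url ORDER BY total_us DESC")) {
            while (rs.next()) {
                // visit_duration is in microseconds; convert to whole seconds.
                long seconds = rs.getLong("total_us") / 1_000_000L;
                System.out.printf("%s ---> %d minutes %d seconds%n",
                        rs.getString("url"), seconds / 60, seconds % 60);
            }
        }
    }
}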

Best practice for SOLR partial index in order to update attributes that change frequently in Hybris

My scenario is like this.
Solr indexing happens for a product, and then the product's approval status is set to unapproved from the Backoffice. After that, when you search the website for words from the product description, or directly for the product code, you get a server error, since the now-unapproved product is still present in Solr.
If you manually run any type of indexing from the Backoffice, it works again. But that is not a good solution, since there may be lots of products whose status changes, and it does not take effect instantly. Using a cronjob for indexing is not a fast solution either: you get server errors until the cronjob runs.
I would like to update the Solr index instantly for attributes that change frequently, like price, status, etc. For instance, when an attribute changes, is it a good approach to start a partial index immediately in Java code? If so, how (via IndexerService)? Alternatively, is it a better idea to make an HTTP request to Solr for the attribute?
In summary, I am looking for the best solution to perform partial index.
Any ideas?
For this case you need to write two new important pieces of Solr configuration:
1) A new Solr cronjob that triggers the indexing
2) A new SolrIndexerQuery for indexing with your special requirements
When you have a look at the default setup from hybris you see:
INSERT_UPDATE CronJob;code[unique=true];job(code);singleExecutable;sessionLanguage(isocode);active;
;backofficeSolrIndexerUpdateCronJob;backofficeSolrIndexerUpdateJob;false;en;false;
INSERT Trigger;cronJob(code);active;activationTime;year;month;day;hour;minute;second;relative;weekInterval;daysOfWeek;
;backofficeSolrIndexerUpdateCronJob;true;;-1;-1;-1;-1;-1;05;false;0;;
The part above configures when the job should run. You can modify it so that it runs every 5 seconds, for example.
INSERT_UPDATE SolrIndexerQuery; solrIndexedType(identifier)[unique = true]; identifier[unique = true]; type(code); injectCurrentDate[default = true]; injectCurrentTime[default = true]; injectLastIndexTime[default = true]; query; user(uid)
; $solrIndexedType ; $solrIndexedType-updateQuery ; update ; false ; false ; false ; "SELECT DISTINCT {PK} FROM {Product AS p JOIN VariantProduct AS vp ON {p.PK}={vp.baseProduct} } WHERE {p.modifiedtime} >= ?lastStartTimeWithSuccess OR {vp.modifiedtime} >= ?lastStartTimeWithSuccess" ; admin
The second part is the more important one: it defines which products should be indexed. You can see that the update job looks for every product that was modified since the last successful run; here you could write a new FlexibleSearch query with your special requirements.
tl;dr answer: you have to write a new, performant SolrIndexerQuery that can be triggered every 5 seconds.
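If you instead want to trigger a partial update directly from Java when an attribute changes, a rough sketch using hybris' solrfacetsearch services could look like the following. Treat it as an assumption-laden outline: the configuration name is a placeholder, and the exact bean names and IndexerService method signatures vary between hybris versions, so check your version's API.
// Sketch only: re-index a single product after its status/price changes.
import java.util.Collections;
import de.hybris.platform.core.PK;
import de.hybris.platform.solrfacetsearch.config.FacetSearchConfig;
import de.hybris.platform.solrfacetsearch.config.FacetSearchConfigService;
import de.hybris.platform.solrfacetsearch.config.IndexedType;
import de.hybris.platform.solrfacetsearch.config.exceptions.FacetConfigServiceException;
import de.hybris.platform.solrfacetsearch.indexer.IndexerService;
import de.hybris.platform.solrfacetsearch.indexer.exceptions.IndexerException;

public class PartialProductIndexer {
    private FacetSearchConfigService facetSearchConfigService; // injected via Spring
    private IndexerService indexerService;                     // injected via Spring

    public void reindexProduct(final PK productPk)
            throws FacetConfigServiceException, IndexerException {
        // "yourFacetSearchConfigName" is a placeholder for your store's config.
        final FacetSearchConfig config =
                facetSearchConfigService.getConfiguration("yourFacetSearchConfigName");
        final IndexedType indexedType =
                config.getIndexConfig().getIndexedTypes().values().iterator().next();
        // Update only this product's document instead of running a full index.
        indexerService.updateTypeIndex(config, indexedType,
                Collections.singletonList(productPk));
    }
}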

jInstagram API - get complete user list

I am working on a Java application that uses the jInstagram API. I can successfully log in with my application, but when I want to get a list of users that follow a certain user, I can only gather 50 user IDs.
The code I am using to get 50 users that follow a user is this:
String userId = instagram.getCurrentUserInfo().getData().getId();
UserFeed feed = instagram.getUserFollowedByList(userId);
List<UserFeedData> users = feed.getUserList();
// iterate through the list and print each value; the printed value is simply a user ID
I can then iterate through the list and print out 50 user IDs. This is fine, but I need to get a lot more of them.
From my research, in order to get more than 50 user IDs I must use the Pagination class. Here is what I put together:
String userId = instagram.getCurrentUserInfo().getData().getId();
UserFeed feed = instagram.getUserFollowedByList(userId);
UserFeed recentData = instagram.getUserFollowedByListNextPage(feed.getPagination());
int counter = 0;
while (recentData.getPagination() != null && counter < 10) {
    List<UserFeedData> a = recentData.getUserList();
    for (int i = 0; i < a.size(); i++) {
        System.out.println(a.get(i));
    }
    // Advance to the next page; without this the loop reprints the same page.
    recentData = instagram.getUserFollowedByListNextPage(recentData.getPagination());
    counter++;
}
This code works, but for each user it gives output like this:
UserFeedData [id=316470004, profilePictureUrl=https://instagramimages-a.akamaihd.net/profiles/profile_316470004_75sq_1386826158.jpg, userName=thebeautifulwarrior, fullName=Dee Shower, website=http://www.totallifechanges.com/dns2015, bio=2015 the year to Reinvent & Restore! Lose 10 pounds in 10 days!!! Ask me how. Invest in yourself. E-mail: totallifechange2015#gmail.com]
For my program, I only want the id part. I know I could just parse the text and create a substring, but I want to do this more efficiently and retrieve the value from the API call instead. The first snippet of code gives output exactly how I need it; for example, it prints "316470004" rather than the entire user-information set.
Thanks for your help in advance!
What you're seeing is the string representation of the entire UserFeedData object, rather than just the ID. Instead of System.out.println(a.get(i)); use System.out.println(a.get(i).getId()); to pull the user's ID.
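Putting both points together, a sketch of a paginated loop that collects only the IDs (assuming the jInstagram UserFeed/Pagination API used in the question, including Pagination.getNextUrl() as the end-of-pages check):
import java.util.ArrayList;
import java.util.List;
import org.jinstagram.Instagram;
import org.jinstagram.entity.users.feed.UserFeed;
import org.jinstagram.entity.users.feed.UserFeedData;
import org.jinstagram.exceptions.InstagramException;

List<String> collectFollowerIds(Instagram instagram, String userId)
        throws InstagramException {
    List<String> ids = new ArrayList<>();
    UserFeed page = instagram.getUserFollowedByList(userId);
    while (page != null) {
        for (UserFeedData user : page.getUserList()) {
            ids.add(user.getId());  // keep just the ID, not the whole object
        }
        // Stop when there is no further page to fetch.
        if (page.getPagination() == null
                || page.getPagination().getNextUrl() == null) {
            break;
        }
        page = instagram.getUserFollowedByListNextPage(page.getPagination());
    }
    return ids;
}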
