HBase read performance varying abnormally - Java

I've installed HBase 0.94.0. I need to improve my read performance through scans. I've inserted 100,000 random records.
When I set setCaching(100), my performance was 16 secs for 100,000 records.
When I set it to setCaching(50), my performance was 90 secs for 100,000 records.
When I set it to setCaching(10), my performance was 16 secs for 100,000 records.
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class Test {
    public static void main(String[] args) {
        long start, middle, end;
        HTableDescriptor descriptor = new HTableDescriptor("Student7");
        descriptor.addFamily(new HColumnDescriptor("No"));
        descriptor.addFamily(new HColumnDescriptor("Subject"));
        try {
            HBaseConfiguration config = new HBaseConfiguration();
            HBaseAdmin admin = new HBaseAdmin(config);
            admin.createTable(descriptor);
            HTable table = new HTable(config, "Student7");
            System.out.println("Table created!");

            start = System.currentTimeMillis();
            for (int i = 1; i < 100000; i++) {
                String s = Integer.toString(i);
                Put p = new Put(Bytes.toBytes(s));
                // store the computed values, not the literal strings "i+10", "i+20", ...
                p.add(Bytes.toBytes("No"), Bytes.toBytes("IDCARD"), Bytes.toBytes(Integer.toString(i + 10)));
                p.add(Bytes.toBytes("No"), Bytes.toBytes("PHONE"), Bytes.toBytes(Integer.toString(i + 20)));
                p.add(Bytes.toBytes("No"), Bytes.toBytes("PAN"), Bytes.toBytes(Integer.toString(i + 30)));
                p.add(Bytes.toBytes("No"), Bytes.toBytes("ACCT"), Bytes.toBytes(Integer.toString(i + 40)));
                p.add(Bytes.toBytes("Subject"), Bytes.toBytes("English"), Bytes.toBytes("50"));
                p.add(Bytes.toBytes("Subject"), Bytes.toBytes("Science"), Bytes.toBytes("60"));
                p.add(Bytes.toBytes("Subject"), Bytes.toBytes("History"), Bytes.toBytes("70"));
                table.put(p);
            }
            middle = System.currentTimeMillis();

            Scan s = new Scan();
            s.setCaching(100);
            ResultScanner scanner = table.getScanner(s);
            try {
                for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                    System.out.println("Found row: " + rr);
                }
                end = System.currentTimeMillis();
            } finally {
                scanner.close();
            }
            System.out.println("TableCreation-Time: " + (middle - start));
            // note: scan time is end - middle, not middle - end
            System.out.println("Scan-Time: " + (end - middle));
        } catch (IOException e) {
            System.out.println("IOError: cannot create Table.");
            e.printStackTrace();
        }
    }
}
Why is this happening?

Why would you want to return every record in your 100,000-record table? You're doing a full table scan, and just as in any large database that is slow.
Try thinking about a more useful use case in which you would like to return some columns of a record, or a range of rows.
HBase has only one index on its table: the row key. Make use of that. Try defining your row key so that you can get the data you need just by specifying the row key.
Let's say you would like to know the value of Subject:History for the rows with a row key between 80000 and 80100. (Note that setCaching(100) means HBase will fetch 100 records per RPC, which in this case is just one call. Fetching 100 rows obviously requires more memory than fetching, say, one row. Keep that in mind in a large multi-user environment.)
Long start, end;
start = System.currentTimeMillis();

Scan s = new Scan(String.valueOf(80000).getBytes(), String.valueOf(80100).getBytes());
s.setCaching(100);
s.addColumn("Subject".getBytes(), "History".getBytes());

ResultScanner scanner = table.getScanner(s);
try {
    for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
        System.out.println("Found row: " + new String(rr.getRow(), "UTF-8")
                + " value: " + new String(rr.getValue("Subject".getBytes(), "History".getBytes()), "UTF-8"));
    }
    end = System.currentTimeMillis();
} finally {
    scanner.close();
}
System.out.println("Scan: " + (end - start));
This might look stupid because how would you know which rows you need just by an integer? Well, exactly, but that's why you need to design a row key according to what you're about to query instead of just using an incremental value as you would in a traditional database.
Try this example. It should be fast.
Note: I didn't run the example. I just typed it here. Maybe there are some small syntax errors you should correct but I hope the idea is clear.
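To make the row-key advice concrete, here is a rough sketch of what a query-driven key could look like. The composite key layout (subject name, a separator, and a zero-padded student number) and the column qualifier "Marks" are assumptions for illustration only, not part of the question's schema:

// Hypothetical row key "<subject>|<zero-padded student no>", so that
// "History marks for students 80000-80100" becomes a narrow range scan.
String startKey = "History|" + String.format("%08d", 80000);
String stopKey  = "History|" + String.format("%08d", 80101);   // stop row is exclusive
Scan scan = new Scan(Bytes.toBytes(startKey), Bytes.toBytes(stopKey));
scan.setCaching(100);
scan.addColumn(Bytes.toBytes("Subject"), Bytes.toBytes("Marks"));
ResultScanner scanner = table.getScanner(scan);
try {
    for (Result r : scanner) {
        System.out.println(Bytes.toString(r.getRow()));
    }
} finally {
    scanner.close();
}

Zero-padding the numeric part keeps the lexicographic order of the row keys in line with the numeric order, which is what makes such a range scan return exactly the rows you expect.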

Related

Improve performance of loading 100,000 records from database

We created a program to make the use of the database easier in other programs, so the code I'm showing gets used in multiple other programs.
One of those programs gets about 10,000 records from one of our clients and has to check whether these are already in our database. If not, we insert them (they can also change and then have to be updated).
To make this easy we load all the entries from our whole table (currently 120,000), create an object for every entry we get, and put all of them into a HashMap.
Loading the whole table this way takes around 5 minutes. We also sometimes have to restart the program because we run into a GC overhead error, since we work on limited hardware. Do you have an idea of how we can improve the performance?
Here is the code to load all entries (we have a global limit of 10,000 entries per query, so we use a loop):
public Map<String, IMasterDataSet> getAllInformationObjects(ISession session) throws MasterDataException {
IQueryExpression qe;
IQueryParameter qp;
// our main SDP class
Constructor<?> constructorForSDPbaseClass = getStandardConstructor();
SimpleDateFormat itaTimestampFormat = new SimpleDateFormat("yyyyMMddHHmmssSSS");
// search in standard time range (modification date!)
Calendar cal = Calendar.getInstance();
cal.set(2010, Calendar.JANUARY, 1);
Date startDate = cal.getTime();
Date endDate = new Date();
Long startDateL = Long.parseLong(itaTimestampFormat.format(startDate));
Long endDateL = Long.parseLong(itaTimestampFormat.format(endDate));
IDescriptor modDesc = IBVRIDescriptor.ModificationDate.getDescriptor(session);
// count once before to determine initial capacities for hash map/set
IBVRIArchiveClass SDP_ARCHIVECLASS = getMasterDataPropertyBag().getSDP_ARCHIVECLASS();
qe = SDP_ARCHIVECLASS.getQueryExpression(session);
qp = session.getDocumentServer().getClassFactory()
.getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
qp.setExpression(qe);
qp.setHitLimitThreshold(0);
qp.setHitLimit(0);
int nrOfHitsTotal = session.getDocumentServer().queryCount(session, qp, "*");
int initialCapacity = (int) (nrOfHitsTotal / 0.75 + 1);
// MD sets; and objects already done (here: document ID)
HashSet<String> objDone = new HashSet<>(initialCapacity);
HashMap<String, IMasterDataSet> objRes = new HashMap<>(initialCapacity);
qp.close();
// do queries until hit count is smaller than 10.000
// use modification date
boolean keepGoing = true;
while(keepGoing) {
// construct query expression
// - basic part: Modification date & class type
// a. doc. class type
qe = SDP_ARCHIVECLASS.getQueryExpression(session);
// b. ID
qe = SearchUtil.appendQueryExpressionWithANDoperator(session, qe,
new PlainExpression(modDesc.getQueryLiteral() + " BETWEEN " + startDateL + " AND " + endDateL));
// 2. Query Parameter: set database; set expression
qp = session.getDocumentServer().getClassFactory()
.getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
qp.setExpression(qe);
// order by modification date; hitlimit = 0 -> no hitlimit, but the usual 10.000 max
qp.setOrderByExpression(session.getDocumentServer().getClassFactory().getOrderByExpressionInstance(modDesc, true));
qp.setHitLimitThreshold(0);
qp.setHitLimit(0);
// Do not sort by modification date;
qp.setHints("+NoDefaultOrderBy");
keepGoing = false;
IInformationObject[] hits = null;
IDocumentHitList hitList = null;
hitList = session.getDocumentServer().query(qp, session);
IDocument doc;
if (hitList.getTotalHitCount() > 0) {
hits = hitList.getInformationObjects();
for (IInformationObject hit : hits) {
String objID = hit.getID();
if(!objDone.contains(objID)) {
// do something with this object and the class
// here: construct a new SDP sub class object and give it back via interface
doc = (IDocument) hit;
IMasterDataSet mdSet;
try {
mdSet = (IMasterDataSet) constructorForSDPbaseClass.newInstance(session, doc);
} catch (Exception e) {
// cause for this
String cause = (e.getCause() != null) ? e.getCause().toString() : MasterDataException.ERRMSG_PART_UNKNOWN;
throw new MasterDataException(MasterDataException.ERRMSG_NOINSTANCE_POSSIBLE, this.getClass().getSimpleName(), e.toString(), cause);
}
objRes.put(mdSet.getID(), mdSet);
objDone.add(objID);
}
}
doc = (IDocument) hits[hits.length - 1];
Date lastModDate = ((IDateValue) doc.getDescriptor(modDesc).getValues()[0]).getValue();
startDateL = Long.parseLong(itaTimestampFormat.format(lastModDate));
keepGoing = (hits.length >= 10000 || hitList.isResultSetTruncated());
}
qp.close();
}
return objRes;
}
Loading 120,000 rows (and more) each time will not scale well, and your solution may stop working as the number of records grows. Instead, let the database server handle the problem.
Your table needs a primary key or unique key based on the columns of the records. Iterate through the 10,000 incoming records, performing a JDBC SQL UPDATE for each one that sets all field values, with a WHERE clause that exactly matches the primary/unique key.
update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?; // ... AND PKCOL2 = ? ...
This modifies an existing row or does nothing at all, and JDBC executeUpdate() will return 0 or 1, indicating the number of rows changed. If the number of rows changed is zero, you have detected a new record which does not exist yet, so perform an INSERT for that new record only.
insert into BLAH (COL1, COL2, ... PKCOL) values (?,?, ..., ?);
You can decide whether to run 10,000 updates followed by however many inserts are needed, or to do update + optional insert as you go. Remember that JDBC batch statements and turning auto-commit off may help speed things up.
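As a rough sketch of that update-then-insert pattern (the table and column names come from the SQL above, while the connection "conn", the placeholder Record type and the incomingRecords list are assumptions for illustration):

// assumes an open java.sql.Connection "conn" and a list of incoming records
conn.setAutoCommit(false);                            // commit once at the end, not per statement
String updateSql = "update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?";
String insertSql = "insert into BLAH (COL1, COL2, PKCOL) values (?, ?, ?)";
try (PreparedStatement update = conn.prepareStatement(updateSql);
     PreparedStatement insert = conn.prepareStatement(insertSql)) {
    for (Record r : incomingRecords) {                // Record is a placeholder for your entry class
        update.setString(1, r.getCol1());
        update.setString(2, r.getCol2());
        update.setString(3, r.getKey());
        if (update.executeUpdate() == 0) {            // 0 rows changed -> record is new
            insert.setString(1, r.getCol1());
            insert.setString(2, r.getCol2());
            insert.setString(3, r.getKey());
            insert.addBatch();                        // collect the inserts and send them in one batch
        }
    }
    insert.executeBatch();
    conn.commit();
}

This way only the 10,000 incoming records travel over the wire instead of the whole 120,000-row table, and the memory-hungry HashMap disappears entirely.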

How to mass delete multiple rows in hbase?

I have the following rows with these keys in the HBase table "mytable":
user_1
user_2
user_3
...
user_9999999
I want to use the HBase shell to delete the rows from user_500 to user_900.
I know there is no built-in way to delete a range of rows, but is there a way I could use the "BulkDeleteProcessor" to do this?
I see here:
https://github.com/apache/hbase/blob/master/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java
I want to just paste in the imports and then paste this into the shell, but I have no idea how to go about it. Does anyone know how I can use this endpoint from the JRuby HBase shell?
Table ht = TEST_UTIL.getConnection().getTable("my_table");
long noOfDeletedRows = 0L;
Batch.Call<BulkDeleteService, BulkDeleteResponse> callable =
new Batch.Call<BulkDeleteService, BulkDeleteResponse>() {
ServerRpcController controller = new ServerRpcController();
BlockingRpcCallback<BulkDeleteResponse> rpcCallback =
new BlockingRpcCallback<BulkDeleteResponse>();
public BulkDeleteResponse call(BulkDeleteService service) throws IOException {
Builder builder = BulkDeleteRequest.newBuilder();
builder.setScan(ProtobufUtil.toScan(scan));
builder.setDeleteType(deleteType);
builder.setRowBatchSize(rowBatchSize);
if (timeStamp != null) {
builder.setTimestamp(timeStamp);
}
service.delete(controller, builder.build(), rpcCallback);
return rpcCallback.get();
}
};
Map<byte[], BulkDeleteResponse> result = ht.coprocessorService(BulkDeleteService.class, scan
.getStartRow(), scan.getStopRow(), callable);
for (BulkDeleteResponse response : result.values()) {
noOfDeletedRows += response.getRowsDeleted();
}
ht.close();
If there is no way to do this through JRuby, then a Java or other alternative way to quickly delete multiple rows is fine.
Do you really want to do it in the shell? There are various other, better ways. One of them is using the native Java API:
Construct a list of Delete objects
Pass this list to the Table.delete method
Method 1: if you already know the range of keys.
public void massDelete(byte[] tableName) throws IOException {
    HTable table = (HTable) hbasePool.getTable(tableName);
    String tablePrefix = "user_";
    int startRange = 500;
    int endRange = 999;
    List<Delete> listOfBatchDelete = new ArrayList<Delete>();
    for (int i = startRange; i <= endRange; i++) {
        String key = tablePrefix + i;
        Delete d = new Delete(Bytes.toBytes(key));
        listOfBatchDelete.add(d);
    }
    try {
        table.delete(listOfBatchDelete);
    } finally {
        if (hbasePool != null && table != null) {
            hbasePool.putTable(table);
        }
    }
}
Method 2: If you want to do a batch delete on the basis of a scan result.
public void bulkDelete(final HTable table) throws IOException {
    Scan s = new Scan();
    // add your filters to the scan, e.g. s.setFilter(yourFilter);
    List<Delete> listOfBatchDelete = new ArrayList<Delete>();
    ResultScanner scanner = table.getScanner(s);
    try {
        for (Result rr : scanner) {
            Delete d = new Delete(rr.getRow());
            listOfBatchDelete.add(d);
        }
        table.delete(listOfBatchDelete);
    } catch (Exception e) {
        LOGGER.log(e);
    } finally {
        scanner.close();
    }
}
Now, coming to using a coprocessor: only one piece of advice, DON'T USE a coprocessor unless you are an expert in HBase.
Coprocessors have many built-in issues; if you need, I can provide a detailed description.
Secondly, when you delete anything from HBase it is never deleted directly; a tombstone marker gets attached to that record, and it is only removed later during a major compaction. So there is no need to use a coprocessor, which is highly resource intensive.
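If you don't want to wait for the next scheduled major compaction to physically remove the tombstoned rows, you can trigger one yourself. A minimal sketch with the Admin API, using the table name from the question:

// request a major compaction so rows marked with delete tombstones are
// physically removed from the store files (the call itself is asynchronous)
try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
     Admin admin = connection.getAdmin()) {
    admin.majorCompact(TableName.valueOf("mytable"));
}

The HBase shell equivalent is major_compact 'mytable'.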
Modified code to support batch operation.
int batchSize = 50;
int batchCounter = 0;
List<Delete> listOfBatchDelete = new ArrayList<Delete>();
for (int i = startRange; i <= endRange; i++) {
    String key = tablePrefix + i;
    Delete d = new Delete(Bytes.toBytes(key));
    listOfBatchDelete.add(d);
    batchCounter++;
    if (batchCounter == batchSize) {
        table.delete(listOfBatchDelete);
        listOfBatchDelete.clear();
        batchCounter = 0;
    }
}
// flush the last, possibly partial, batch
if (!listOfBatchDelete.isEmpty()) {
    table.delete(listOfBatchDelete);
}
Creating HBase conf and getting table instance.
Configuration hConf = HBaseConfiguration.create(conf);
hConf.set("hbase.zookeeper.quorum", "Zookeeper IP");
hConf.set("hbase.zookeeper.property.clientPort", ZookeeperPort);
HTable hTable = new HTable(hConf, tableName);
If you are already aware of the row keys of the records that you want to delete from the HBase table, then you can use the following approach.
1. First create a list of Delete objects with these row keys:
for (int rowKey = 1; rowKey <= 10; rowKey++) {
deleteList.add(new Delete(Bytes.toBytes(rowKey + "")));
}
2. Then get the Table object by using an HBase Connection:
Table table = connection.getTable(TableName.valueOf(tableName));
3. Once you have the Table object, call delete() passing the list:
table.delete(deleteList);
The complete code will look like below
Configuration config = HBaseConfiguration.create();
config.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
config.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
String tableName = "users";
Connection connection = ConnectionFactory.createConnection(config);
Table table = connection.getTable(TableName.valueOf(tableName));
List<Delete> deleteList = new ArrayList<Delete>();
for (int rowKey = 500; rowKey <= 900; rowKey++) {
deleteList.add(new Delete(Bytes.toBytes("user_" + rowKey)));
}
table.delete(deleteList);

Database insertion synchronization

I have Java code that generates a request number based on data received from the database, and then updates the database with the newly generated request number:
synchronized (this.getClass()) {
counter++;
System.out.println(counter);
System.out.println("start " + System.identityHashCode(this));
certRequest
.setRequestNbr(generateRequestNumber(certInsuranceRequestAddRq
.getAccountInfo().getAccountNumberId()));
System.out.println("outside funcvtion"+certRequest.getRequestNbr());
reqId = Utils.getUniqueId();
certRequest.setRequestId(reqId);
System.out.println(reqId);
ItemIdInfo itemIdInfo = new ItemIdInfo();
itemIdInfo.setInsurerId(certRequest.getRequestId());
certRequest.setItemIdInfo(itemIdInfo);
dao.insert(certRequest);
addAccountRel();
counter++;
System.out.println(counter);
System.out.println("end");
}
The output of the System.out.println() statements is:
1
start 27907101
com.csc.exceed.certificate.domain.CertRequest#a042cb
inside function request number66
outside funcvtion66
AF88172D-C8B0-4DCD-9AC6-12296EF8728D
2
end
3
start 21695531
com.csc.exceed.certificate.domain.CertRequest#f98690
inside function request number66
outside funcvtion66
F3200106-6033-4AEC-8DC3-B23FCD3CA380
4
end
In my case this code gets called from two threads.
If you observe, both threads run independently. However, the generated request number is the same in both cases.
Is it possible that the second thread starts executing before the first thread's database update completes?
The code for generateRequestNumber() is as follows:
public String generateRequestNumber(String accNumber) throws Exception {
String requestNumber = null;
if (accNumber != null) {
String SQL_QUERY = "select CERTREQUEST.requestNbr from CertRequest as CERTREQUEST, "
+ "CertActObjRel as certActObjRel where certActObjRel.certificateObjkeyId=CERTREQUEST.requestId "
+ " and certActObjRel.certObjTypeCd=:certObjTypeCd "
+ " and certActObjRel.certAccountId=:accNumber ";
String[] parameterNames = { "certObjTypeCd", "accNumber" };
Object[] parameterVaues = new Object[] {
Constants.REQUEST_RELATION_CODE, accNumber };
List<?> resultSet = dao.executeNamedQuery(SQL_QUERY,
parameterNames, parameterVaues);
// List<?> resultSet = dao.retrieveTableData(SQL_QUERY);
if (resultSet != null && resultSet.size() > 0) {
requestNumber = (String) resultSet.get(0);
}
int maxRequestNumber = -1;
if (requestNumber != null && requestNumber.length() > 0) {
maxRequestNumber = maxValue(resultSet.toArray());
requestNumber = Integer.toString(maxRequestNumber + 1);
} else {
requestNumber = Integer.toString(1);
}
System.out.println("inside function request number"+requestNumber);
return requestNumber;
}
return null;
}
Databases allow multiple simultaneous connections, so unless you write your code properly you can mess up the data.
Since you only seem to require a unique, growing integer, you can easily and safely generate one inside the database, for example with a sequence (if the database supports it). Databases that do not support sequences usually provide some other mechanism, such as auto-increment columns in MySQL.
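For example, with an auto-increment (or sequence-backed) column the database hands out the number atomically and JDBC can return it to you. A minimal sketch, assuming a hypothetical CERT_REQUEST table whose REQUEST_NBR column is auto-generated (the table and column names are placeholders, not taken from the original code):

// let the database generate the request number; no application-level locking needed
String sql = "insert into CERT_REQUEST (ACCOUNT_ID) values (?)";
try (PreparedStatement ps = conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
    ps.setString(1, accountNumberId);
    ps.executeUpdate();
    try (ResultSet keys = ps.getGeneratedKeys()) {
        if (keys.next()) {
            long requestNbr = keys.getLong(1);        // unique even under concurrent inserts
            certRequest.setRequestNbr(Long.toString(requestNbr));
        }
    }
}

Because the value is assigned by the database at insert time, two threads (or two JVMs, which a synchronized block cannot coordinate) can never read the same "max" value and both end up with 66.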

How to add elements to ConcurrentHashMap using ExecutorService

I have a requirement to read user information from 2 different sources (databases) per userId and store the consolidated information in a Map with the userId as key. The number of users can vary based on the period they have opted for; groups of users may belong to different periods of the year, e.g. daily, weekly, or monthly users.
I used HashMap and LinkedHashMap to get this done. Since that slows down the process, and to make it faster, I thought of using threading here.
After reading some tutorials and examples I am now using ConcurrentHashMap and ExecutorService.
In some cases, based on validation, I want to skip the current iteration and move on to the next user's info. The compiler does not allow the continue keyword inside the Callable. Is there a way to achieve the same thing in multithreaded code?
Moreover, although the code below works, it is not significantly faster than the code without threading, which makes me doubt whether the ExecutorService is implemented correctly.
Also, how do we debug when we get an error in multithreaded code? Execution holds at a breakpoint, but it is not consistent and it does not move to the next line with F6.
Can someone point out if I am missing something in the code? Any other example of a similar use case would also be of great help.
public void getMap() throws UserException
{
long startTime = System.currentTimeMillis();
Map<String, Map<Integer, User>> map = new ConcurrentHashMap<String, Map<Integer, User>>();
//final String key = "";
try
{
final Date todayDate = new Date();
List<String> applyPeriod = db.getPeriods(todayDate);
for (String period : applyPeriod)
{
try
{
final String key = period;
List<UserTable1> eligibleUsers = db.findAllUsers(key);
Map<Integer, User> userIdMap = new ConcurrentHashMap<Integer, User>();
ExecutorService executor = Executors.newFixedThreadPool(eligibleUsers.size());
CompletionService<User> cs = new ExecutorCompletionService<User>(executor);
int userCount=0;
for (UserTable1 eligibleUser : eligibleUsers)
{
try
{
cs.submit(
new Callable<User>()
{
public User call()
{
int userId = eligibleUser.getUserId();
List<EmployeeTable2> empData = db.findByUserId(userId);
EmployeeTable2 emp = null;
if (null != empData && !empData.isEmpty())
{
emp = empData.get(0);
}else{
String errorMsg = "No record found for given User ID in emp table";
logger.error(errorMsg);
//continue;
// conitnue does not work here.
}
User user = new User();
user.setUserId(userId);
user.setFullName(emp.getFullName());
return user;
}
}
);
userCount++;
}
catch(Exception ex)
{
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
}
for (int i = 0; i < userCount ; i++ ) {
try {
User user = cs.take().get();
if (user != null) {
userIdMap.put(user.getUserId(), user);
}
} catch (ExecutionException e) {
} catch (InterruptedException e) {
}
}
executor.shutdown();
map.put(key, userIdMap);
}
catch(Exception ex)
{
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
}
}
catch(Exception ex){
String errorMsg = "Error while creating map :" + ex.getMessage();
logger.error(errorMsg);
}
logger.info("Size of Map : " + map.size());
Set<String> periods = map.keySet();
logger.info("Size of periods : " + periods.size());
for(String period :periods)
{
Map<Integer, User> mapOfuserIds = map.get(period);
Set<Integer> userIds = mapOfuserIds.keySet();
logger.info("Size of Set : " + userIds.size());
for(Integer userId : userIds){
User inf = mapOfuserIds.get(userId);
logger.info("User Id : " + inf.getUserId());
}
}
long endTime = System.currentTimeMillis();
long timeTaken = (endTime - startTime);
logger.info("All threads are completed in " + timeTaken + " milisecond");
logger.info("******END******");
}
You really don't want to create a thread pool with as many threads as the number of users you've read from the db. That doesn't make sense most of the time, because you need to keep in mind that those threads need to run somewhere... There are not many servers out there with 10, 100, or even 1000 cores reserved for your application. A much smaller value, like maybe 5, is often enough, depending on your environment.
And as always for questions about performance: first measure where your actual bottleneck is. Your application may simply not benefit from threading because, for example, you are reading from a db which only allows 5 concurrent connections at a time. In that case all your other 995 threads will simply wait.
Another thing to consider is network latency: reading users one at a time from multiple threads may even increase the total round-trip time needed to get the data. An alternative approach might be to not read one user at a time but the data of all 10,000 of them at once. That way your (maybe available) 10 GBit Ethernet connection to your database might really speed things up, because you have only a small communication overhead with the database, and it might serve you all the data you need in one answer.
So in short, in my opinion your question is about performance optimization of your problem in general, but you don't know enough yet to decide which way to go.
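To illustrate the pool-sizing point, here is a sketch of the same submit/collect pattern as in the question, but with a small fixed pool shared by all users instead of one thread per user (the pool size of 5 is just the example value mentioned above; db, UserTable1, EmployeeTable2, User, userIdMap and logger are the names from the question):

// small fixed pool: the number of worker threads no longer depends on how many users there are
ExecutorService executor = Executors.newFixedThreadPool(5);
CompletionService<User> cs = new ExecutorCompletionService<User>(executor);
int submitted = 0;
for (final UserTable1 eligibleUser : eligibleUsers) {
    cs.submit(new Callable<User>() {
        public User call() {
            List<EmployeeTable2> empData = db.findByUserId(eligibleUser.getUserId());
            if (empData == null || empData.isEmpty()) {
                return null;                          // "skip" instead of continue: return null, filter it out later
            }
            User user = new User();
            user.setUserId(eligibleUser.getUserId());
            user.setFullName(empData.get(0).getFullName());
            return user;
        }
    });
    submitted++;
}
for (int i = 0; i < submitted; i++) {
    try {
        User user = cs.take().get();
        if (user != null) {                           // nulls are the skipped users
            userIdMap.put(user.getUserId(), user);
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
        logger.error("Worker failed: " + e.getCause());
    }
}
executor.shutdown();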
You could try something like this:
List<String> periods = db.getPeriods(todayDate);
Map<String, Map<Integer, User>> hm = new ConcurrentHashMap<>();
periods.parallelStream().forEach(period -> {
    List<UserTable1> eligibleUsers = db.findAllUsers(period);
    hm.put(period, eligibleUsers.parallelStream()
            .collect(Collectors.toMap(UserTable1::getUserId,
                    u -> createUserForId(u.getUserId()))));
});
And in createUserForId you do your db reading:
private User createUserForId(Integer id) {
    // look up the employee data for this user id (db is the same DAO as in the question)
    List<EmployeeTable2> empData = db.findByUserId(id);
    EmployeeTable2 emp = (empData != null && !empData.isEmpty()) ? empData.get(0) : null;
    User user = new User();
    user.setUserId(id);
    if (emp != null) {
        user.setFullName(emp.getFullName());
    }
    return user;
}

How to get 150k follower IDs from a user?

I am trying to get all the follower IDs of a Twitter account with about 150,000 followers. I later want to map their locations, but first I need all those IDs.
At the moment I am using this code:
long lCursorIDs = -1;
long[] fArray = new long[100];
do
{
fArray = twitter.getFollowersIDs(name, lCursorIDs).getIDs();
} while (twitter.getFollowersIDs(name, lCursorIDs).hasNext ());
try
{
PrintWriter pr = new PrintWriter(filenameOutput);
for (int i=0; i<fArray.length ; i++)
{
pr.println(fArray[i]);
}
pr.close();
System.out.println("Follower IDs collected and saved to file: " + filenameOutput );
}
catch (Exception e)
{
e.printStackTrace();
System.out.println("No such file exists.");
}
This works for users with fewer followers, but with that many it always returns an error message: rate limit exceeded.
I was thinking about getting only a certain number of follower IDs per hour, but I am not sure how to do that without starting over from the first follower every hour. Also, I am not sure how many followers I can get with one request. Maybe it is 100, as with the "lookupUser" method, but I am not sure. Any ideas or suggestions?
EDIT: OK, I just tried to get the follower IDs of an account with 2,700 followers and it stored them correctly in the text file. It also only "cost" one request. Then I changed the account name to an account with 15,500 followers and it crashes again with a rate limit exceeded message. I don't get why, since it's only roughly 6 times as many followers, but all the remaining requests get spent. Any ideas on what I'm doing wrong?
The answer:
int numberOfFollowers = user.getFollowersCount();
// create arrays for the follower IDs
long cursor = -1;
long[] fArray = new long[numberOfFollowers];
long[] local = new long[5000];
int j = 0;
int x = 5000;
int durchgang = 1;                               // current pass
int d_anzahl = 1 + numberOfFollowers / 5000;     // total number of passes needed
// store the follower IDs in the array, 5000 per request
IDs ids;
do {
    ids = twitter.getFollowersIDs(name, cursor); // one API request per pass
    local = ids.getIDs();                        // reuse the response instead of making a second call
    System.out.println("Pass: " + durchgang + " / " + d_anzahl);
    System.arraycopy(local, 0, fArray, j * x, local.length);
    j++;
    durchgang++;
    cursor = ids.getNextCursor();
} while (ids.hasNext());
This gets an array with all follower IDs of any Twitter user. It calculates the number of passes needed to fetch all follower IDs and copies each batch of 5,000 IDs into the big array, which holds all IDs at the end.
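The followers/ids endpoint is still rate limited to a handful of requests per 15-minute window, so an account with 150,000 followers (about 30 pages of 5,000 IDs) will eventually hit the limit anyway. A common workaround, sketched here on the assumption that Twitter4J is being used (as the twitter/IDs types suggest) and that the enclosing method declares throws TwitterException, is to catch the rate-limit error, sleep until the window resets, and retry the same cursor so nothing is lost:

long cursor = -1;
do {
    IDs ids;
    try {
        ids = twitter.getFollowersIDs(name, cursor);             // one request, up to 5000 IDs
    } catch (TwitterException e) {
        if (e.exceededRateLimitation() && e.getRateLimitStatus() != null) {
            int wait = e.getRateLimitStatus().getSecondsUntilReset() + 5;   // small safety margin
            System.out.println("Rate limit hit, sleeping " + wait + " seconds");
            try {
                Thread.sleep(wait * 1000L);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
            continue;                                            // retry the same cursor, nothing is skipped
        }
        throw e;                                                 // unrelated error: give up
    }
    // ... copy ids.getIDs() into fArray exactly as in the loop above ...
    cursor = ids.getNextCursor();
} while (cursor != 0);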
