I am trying to list table data from BigQuery using Java, but I cannot find how to configure the API to set the maximum number of rows returned per call.
public class QuickstartSample {
public static void main(String... args) throws Exception {
GoogleCredentials credentials;
File credentialsPath = new File("/Users/gaurang.shah/Downloads/fb3735b731b9.json"); // TODO: update to your key path.
FileInputStream serviceAccountStream = new FileInputStream(credentialsPath);
credentials = ServiceAccountCredentials.fromStream(serviceAccountStream);
BigQuery bigquery = BigQueryOptions.newBuilder().
setCredentials(credentials).
setProjectId("bigquery-public-data").
build().
getService();
Dataset hacker_news = bigquery.getDataset("hacker_news");
Table comments = hacker_news.get("comments");
TableResult result = comments.list();
for (FieldValueList row : result.iterateAll()) {
// do something with the row
System.out.println(row);
}
}
}
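To control the number of rows returned per call, use the listTableData method with the TableDataListOption.pageSize(n) option.
The following example requests 100 rows per page. Note that iterateAll() still walks through every page, so use getValues() if you only want the rows of the current page: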
String datasetName = "my_dataset_name";
String tableName = "my_table_name";
TableId tableIdObject = TableId.of(datasetName, tableName);
TableResult tableData =
bigquery.listTableData(tableIdObject, TableDataListOption.pageSize(100));
for (FieldValueList row : tableData.iterateAll()) {
// do something with the row
}
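A page size caps how many rows come back per request, not how many rows are read overall. The plain-Java sketch below imitates that behaviour without calling BigQuery at all; `PagingSketch`, `fetchPage`, `countRequests`, and the in-memory list of 250 rows are hypothetical stand-ins, not part of the BigQuery client.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    // Hypothetical in-memory "table" of 250 rows standing in for a remote table.
    static List<Integer> tableRows() {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            rows.add(i);
        }
        return rows;
    }

    // Returns one page: at most pageSize rows starting at offset.
    static List<Integer> fetchPage(List<Integer> rows, int offset, int pageSize) {
        int end = Math.min(offset + pageSize, rows.size());
        return new ArrayList<>(rows.subList(offset, end));
    }

    // Walks every page, as an "iterate all" call would, and counts the requests made.
    static int countRequests(List<Integer> rows, int pageSize) {
        int requests = 0;
        for (int offset = 0; offset < rows.size(); offset += pageSize) {
            fetchPage(rows, offset, pageSize);
            requests++;
        }
        return requests;
    }

    public static void main(String[] args) {
        List<Integer> rows = tableRows();
        System.out.println("first page size: " + fetchPage(rows, 0, 100).size());
        System.out.println("requests to read everything: " + countRequests(rows, 100));
    }
}
```

With the real client, stopping after the first page would roughly correspond to reading `getValues()` on the returned `TableResult` instead of calling `iterateAll()`.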
I have a table in DynamoDB with an attribute 'createDate', and I want to scan it with a filter on a specific period of that attribute (for example, 2022-01-01 to 2022-01-31), but I don't know whether this is possible or how to do it. If anyone has done this and can help, I would appreciate it.
One more question: is it possible to write the result to a CSV file?
Here is my code, which scans with a single date:
public class QueryTableResearchAnswers {
static AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
static DynamoDB dynamoDB = new DynamoDB(client);
static String tableName = "research-answers";
public static void main(String[] args) throws Exception {
String researchAnswers = "Amazon DynamoDB";
findAnswersWithinTimePeriod(researchAnswers);
//findRepliesPostedWithinTimePeriod(researchAnswers);
}
private static void findAnswersWithinTimePeriod(String researchAnswers) {
Table table = dynamoDB.getTable(tableName);
Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
expressionAttributeValues.put(":startDate", "2022-01-01T00:00:00.0Z" );
ItemCollection<ScanOutcome> items = table.scan("createDate > :startDate", // FilterExpression
"bizId, accountingsessionid, accounttype, acctsessionid, choicecode, contextname, createDate, document, framedipaddress," +
"macaddress, macaddressnetworkdata, machash, mail, nasgrelocalip, nasidentifier, nasipaddress, nasportid, network, networktype, networkuuid, phone," +
"question, questionanswer, questioncode, realm, relayingmacaddress, remoteipaddress, useragent, username", // ProjectionExpression
null, // ExpressionAttributeNames - not used in this example
expressionAttributeValues);
System.out.println("Scan of " + tableName + " for january answers");
Iterator<Item> iterator = items.iterator();
while (iterator.hasNext()) {
System.out.println(iterator.next().toJSONPretty());
}
}
}
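In general, for an arbitrary date range, use two placeholders in the filter expression:
createDate BETWEEN :date1 AND :date2
BETWEEN is inclusive and compares the strings lexicographically, so :date2 must cover the whole last day (for example "2022-01-31T23:59:59.9Z" rather than "2022-01-31").
In your specific case of 2022-01-01 to 2022-01-31 (the entire month of January), you can simplify this to the following, with :prefix set to "2022-01":
begins_with(createDate, :prefix)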
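To see why a string range works on ISO-8601 timestamps, and as one possible way to approach the CSV part of the question, here is a small plain-Java sketch. It does not call DynamoDB; `DateRangeSketch`, `inRange`, `csvField`, `toCsv`, and the sample records are hypothetical names and data chosen for illustration. It uses a half-open range (start inclusive, end exclusive), which sidesteps the end-of-day problem of an inclusive upper bound.

```java
import java.util.ArrayList;
import java.util.List;

public class DateRangeSketch {
    // True when createDate falls in [start, endExclusive) under plain string comparison.
    static boolean inRange(String createDate, String start, String endExclusive) {
        return createDate.compareTo(start) >= 0 && createDate.compareTo(endExclusive) < 0;
    }

    // Quotes a CSV field and doubles any embedded quotes.
    static String csvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Builds CSV text: a header line plus one line per record inside the range.
    static String toCsv(List<String[]> records, String start, String endExclusive) {
        StringBuilder csv = new StringBuilder("createDate,username\n");
        for (String[] record : records) {
            if (inRange(record[0], start, endExclusive)) {
                csv.append(csvField(record[0])).append(',')
                   .append(csvField(record[1])).append('\n');
            }
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"2021-12-31T23:59:59.0Z", "alice"});
        records.add(new String[] {"2022-01-01T00:00:00.0Z", "bob"});
        records.add(new String[] {"2022-01-31T18:30:00.0Z", "carol"});
        records.add(new String[] {"2022-02-01T00:00:00.0Z", "dave"});
        System.out.print(toCsv(records, "2022-01-01", "2022-02-01"));
    }
}
```

In a real program the same loop would run over the items returned by the scan, and the resulting text could be written to disk with, for example, `java.nio.file.Files`.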
I am trying to get all the records of a custom record type. How can I do this with NetSuite SOAP?
Also, is there a way to search records of that custom record type by record name?
Something like this returns only the first record:
CustomRecordRef customRec = new CustomRecordRef();
customRec.setInternalId("XXX");
customRec.setScriptId("customrecord_lc_mapping");
netsuiteSoapClient.getPort(true).get(customRec);
Here is example code showing how to query all the values of a custom record type using Java SOAP:
/**
* Searches all Cost Template values.
*
* @return map of record name to internal ID
* @throws Exception if the search fails
*/
private Map<String, String> searchCostTemplateValues() throws Exception {
CustomRecordSearchBasic customRecordSearch = new CustomRecordSearchBasic();
RecordRef recordRef = new RecordRef();
if(environment.toLowerCase().equals("test") || environment.equals("default")) {
recordRef.setInternalId("426");
}
else {
recordRef.setInternalId("426");
}
customRecordSearch.setRecType(recordRef);
SearchResult response = netsuiteSoapClient
.getPort(true)
.search(customRecordSearch);
LOGGER.info("Search Result: " + new ObjectMapper()
.writerWithDefaultPrettyPrinter().writeValueAsString(response));
RecordList costTemplateRecordList = response.getRecordList();
Record[] customRecordArray = costTemplateRecordList.getRecord();
Map<String, String> costTemplateMap = new HashMap<>();
for(Record r : customRecordArray) {
CustomRecord cr = (CustomRecord) r;
String name = cr.getName();
String internalId = cr.getInternalId();
costTemplateMap.put(name, internalId);
}
return costTemplateMap;
}
I have the following rows with these keys in hbase table "mytable"
user_1
user_2
user_3
...
user_9999999
I want to use the HBase shell to delete the rows from:
user_500 to user_900
I know the shell has no command to delete a range of rows, but is there a way I could use the "BulkDeleteProcessor" to do this?
I see here:
https://github.com/apache/hbase/blob/master/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java
I want to just paste in imports and then paste this into the shell, but have no idea how to go about this. Does anyone know how I can use this endpoint from the jruby hbase shell?
Table ht = TEST_UTIL.getConnection().getTable("my_table");
long noOfDeletedRows = 0L;
Batch.Call<BulkDeleteService, BulkDeleteResponse> callable =
new Batch.Call<BulkDeleteService, BulkDeleteResponse>() {
ServerRpcController controller = new ServerRpcController();
BlockingRpcCallback<BulkDeleteResponse> rpcCallback =
new BlockingRpcCallback<BulkDeleteResponse>();
public BulkDeleteResponse call(BulkDeleteService service) throws IOException {
Builder builder = BulkDeleteRequest.newBuilder();
builder.setScan(ProtobufUtil.toScan(scan));
builder.setDeleteType(deleteType);
builder.setRowBatchSize(rowBatchSize);
if (timeStamp != null) {
builder.setTimestamp(timeStamp);
}
service.delete(controller, builder.build(), rpcCallback);
return rpcCallback.get();
}
};
Map<byte[], BulkDeleteResponse> result = ht.coprocessorService(BulkDeleteService.class, scan
.getStartRow(), scan.getStopRow(), callable);
for (BulkDeleteResponse response : result.values()) {
noOfDeletedRows += response.getRowsDeleted();
}
ht.close();
If there is no way to do this through JRuby, then Java or another way to quickly delete multiple rows is fine.
Do you really want to do this in the shell? There are better ways. One is the native Java API:
1. Construct a list of Delete objects.
2. Pass this list to the Table.delete method.
Method 1: If you already know the range of keys.
public void massDelete(byte[] tableName) throws IOException {
HTable table=(HTable)hbasePool.getTable(tableName);
String tablePrefix = "user_";
int startRange = 500;
int endRange = 900;
List<Delete> listOfBatchDelete = new ArrayList<Delete>();
for(int i=startRange;i<=endRange;i++){
String key = tablePrefix+i;
Delete d=new Delete(Bytes.toBytes(key));
listOfBatchDelete.add(d);
}
try {
table.delete(listOfBatchDelete);
} finally {
if (hbasePool != null && table != null) {
hbasePool.putTable(table);
}
}
}
Method 2: If you want to do a batch delete on the basis of a scan result.
public void bulkDelete(final HTable table) throws IOException {
Scan s = new Scan();
List<Delete> listOfBatchDelete = new ArrayList<Delete>();
// Add your filter to the scan, for example:
// s.setFilter(new PrefixFilter(Bytes.toBytes("some_prefix")));
ResultScanner scanner = table.getScanner(s);
try {
for (Result rr : scanner) {
Delete d = new Delete(rr.getRow());
listOfBatchDelete.add(d);
}
} finally {
scanner.close(); // release the scanner's server-side resources
}
try {
table.delete(listOfBatchDelete);
} catch (Exception e) {
LOGGER.log(e);
}
}
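One caveat when selecting rows by scan range rather than by explicit keys: HBase orders row keys lexicographically by bytes, so a range from user_500 to user_900 also covers keys such as user_5000 or user_60. The plain-Java sketch below imitates that ordering with a sorted set of strings; it does not use HBase, and `RowKeyOrderSketch` and `scanRange` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RowKeyOrderSketch {
    // Returns the keys in [startKey, stopKey) in lexicographic order,
    // similar to a scan with a start row and an exclusive stop row.
    static List<String> scanRange(TreeSet<String> keys, String startKey, String stopKey) {
        return new ArrayList<>(keys.subSet(startKey, true, stopKey, false));
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>();
        for (int i = 1; i <= 10000; i++) {
            keys.add("user_" + i);
        }
        List<String> hits = scanRange(keys, "user_500", "user_901");
        System.out.println("keys in range: " + hits.size());
        System.out.println("contains user_5000: " + hits.contains("user_5000"));
        System.out.println("contains user_60: " + hits.contains("user_60"));
    }
}
```

This is why Method 1, which builds the exact keys, is the safer choice for numeric ranges unless the keys are zero-padded to a fixed width.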
Now, about using a coprocessor. My only advice: don't use a coprocessor unless you are an expert in HBase.
Coprocessors have many built-in issues; if you need, I can describe them in detail.
Secondly, when you delete anything from HBase it is never removed immediately. A tombstone marker is attached to the record, and the data is physically removed later, during a major compaction. So there is no need to use a coprocessor, which is highly resource intensive.
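Modified code to support batch operation:
int batchSize = 50;
int batchCounter = 0;
for (int i = startRange; i <= endRange; i++) {
String key = tablePrefix + i;
Delete d = new Delete(Bytes.toBytes(key));
listOfBatchDelete.add(d);
batchCounter++;
if (batchCounter == batchSize) {
// Send a full batch, then start collecting the next one.
table.delete(listOfBatchDelete);
listOfBatchDelete.clear();
batchCounter = 0;
}
}
// Send the final, partially filled batch.
if (!listOfBatchDelete.isEmpty()) {
table.delete(listOfBatchDelete);
}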
Creating HBase conf and getting table instance.
Configuration hConf = HBaseConfiguration.create(conf);
hConf.set("hbase.zookeeper.quorum", "Zookeeper IP");
hConf.set("hbase.zookeeper.property.clientPort", ZookeeperPort);
HTable hTable = new HTable(hConf, tableName);
If you already know the row keys of the records that you want to delete from the HBase table, you can use the following approach.
1. First, create a list of Delete objects from these row keys:
for (int rowKey = 1; rowKey <= 10; rowKey++) {
deleteList.add(new Delete(Bytes.toBytes(rowKey + "")));
}
2. Then get the Table object from the HBase Connection:
Table table = connection.getTable(TableName.valueOf(tableName));
3. Once you have the Table object, call delete() and pass it the list:
table.delete(deleteList);
The complete code looks like this:
Configuration config = HBaseConfiguration.create();
config.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
config.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
String tableName = "users";
Connection connection = ConnectionFactory.createConnection(config);
Table table = connection.getTable(TableName.valueOf(tableName));
List<Delete> deleteList = new ArrayList<Delete>();
for (int rowKey = 500; rowKey <= 900; rowKey++) {
deleteList.add(new Delete(Bytes.toBytes("user_" + rowKey)));
}
table.delete(deleteList);
// Release the table and the connection when done.
table.close();
connection.close();
I have read some data from Excel as an array and converted it into a 2-D array in order to supply it through a data provider. But when I pass the data to the @Test method, the parameters come through as null.
Can you please suggest why they are null? Also, my @Test method takes a map as well. How can I convert the data provider's data to a map?
My @Test method is declared as follows:
public void testCategorySearch(String vendor_code, Map<Integer, List<String>> seller_sku , String upload_id,Protocol protocol)
throws InvocationTargetException
My code is:
@DataProvider(name = "valid_parameters")
public Object[][] sendValidParameters() {
List<ArrayList> result = td.getExcelData("C:\\Users\\ashish.gupta02\\QAAutomation\\test.xls", 1);
Object[][] a = new String[result.size()][3];
{
for (int i = 0; i < result.size(); i++) {
Object currentObject = result.get(i);
a[i][0] = currentObject.toString();
System.out.println("COnverted" + a[i][0]);
}
}
System.out.println("Printing data" + a);
//return mapper.getProtocolMappedObject(a);
//return Object ;
return a;
}
@Test(dataProvider = "valid_parameters", groups = {"positive"})
public void testCategorySearch(String vendor_code, Map<Integer, List<String>> seller_sku, String upload_id, Protocol protocol)
throws InvocationTargetException {
//Protocol protocol
//set parameter values to the api
System.out.println("Executing the request");
CreateSellerProductUpdateInfoRequest createReq = setRequest(vendor_code, seller_sku, upload_id, protocol);
CreateSellerProductUpdateInfoResponse createResponse = service.createSellerProductUpdateInfo(createReq);
System.out.print("Response is :" + createResponse);
}
How do I return all timestamped versions of an HBase cell with the Get.setMaxVersions(10) method where 10 is an arbitrary number (could be something else like 20 or 5)? The following is a console main method that creates a table, inserts 10 random integers, and tries to retrieve all of them to print out.
public static void main(String[] args)
throws ZooKeeperConnectionException, MasterNotRunningException, IOException, InterruptedException {
final String HBASE_ZOOKEEPER_QUORUM_IP = "localhost.localdomain"; //set ip in hosts file
final String HBASE_ZOOKEEPER_PROPERTY_CLIENTPORT = "2181";
final String HBASE_MASTER = HBASE_ZOOKEEPER_QUORUM_IP + ":60010";
//identify a data cell with these properties
String tablename = "characters";
String row = "johnsmith";
String family = "capital";
String qualifier = "A";
//config
Configuration config = HBaseConfiguration.create();
config.clear();
config.set("hbase.zookeeper.quorum", HBASE_ZOOKEEPER_QUORUM_IP);
config.set("hbase.zookeeper.property.clientPort", HBASE_ZOOKEEPER_PROPERTY_CLIENTPORT);
config.set("hbase.master", HBASE_MASTER);
//admin
HBaseAdmin hba = new HBaseAdmin(config);
//create a table
HTableDescriptor descriptor = new HTableDescriptor(tablename);
descriptor.addFamily(new HColumnDescriptor(family));
hba.createTable(descriptor);
hba.close();
//get the table
HTable htable = new HTable(config, tablename);
//insert 10 different timestamps into 1 record
for(int i = 0; i < 10; i++) {
String value = Integer.toString(i);
Put put = new Put(Bytes.toBytes(row));
put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), System.currentTimeMillis(), Bytes.toBytes(value));
htable.put(put);
Thread.sleep(200); //make sure each timestamp is different
}
//get 10 timestamp versions of 1 record
final int MAX_VERSIONS = 10;
Get get = new Get(Bytes.toBytes(row));
get.setMaxVersions(MAX_VERSIONS);
Result result = htable.get(get);
byte[] value = result.getValue(Bytes.toBytes(family), Bytes.toBytes(qualifier)); // returns MAX_VERSIONS quantity of values
String output = Bytes.toString(value);
//show me what you got
System.out.println(output); //prints 9 instead of 0 through 9
}
The output is 9 (because the loop ended at i=9), and I don't see multiple versions in Hue's HBase Browser web UI. What can I do to fix the versions so that I get 10 individual results for 0 through 9 instead of a single result of 9?
You should use getColumnCells on Result to get all versions (up to the MAX_VERSIONS you set on the Get). getValue returns only the latest value.
Sample Code:
List<Cell> values = result.getColumnCells(Bytes.toBytes(family), Bytes.toBytes(qualifier));
for ( Cell cell : values )
{
System.out.println( Bytes.toString( CellUtil.cloneValue( cell ) ) );
}
The following is a deprecated approach, but it matches the version of HBase I am currently working with:
List<KeyValue> kvpairs = result.getColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier));
String line = "";
for(KeyValue kv : kvpairs) {
line += Bytes.toString(kv.getValue()) + "\n";
}
System.out.println(line);
Going one step further, it is important to note that setMaxVersions must also be called on the column family at table creation to allow more versions than the default to be kept in a cell (three in older HBase releases, one since 0.96). Here's the updated table creation:
//create a table based on variables from question above
HTableDescriptor tableDescriptor = new HTableDescriptor(tablename);
HColumnDescriptor columnDescriptor = new HColumnDescriptor(family);
columnDescriptor.setMaxVersions(MAX_VERSIONS);
tableDescriptor.addFamily(columnDescriptor);
hba.createTable(tableDescriptor);
hba.close();
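To make the interaction between the two limits easier to picture, here is a toy model of a single versioned cell in plain Java. It is only a sketch: `VersionedCellSketch` and its methods are hypothetical, it does not use HBase, and it simplifies by trimming old versions at write time, whereas HBase physically removes surplus versions later and only hides them on reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class VersionedCellSketch {
    private final int maxVersions;
    // Timestamp -> value, sorted newest first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>((a, b) -> Long.compare(b, a));

    VersionedCellSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Stores a value and drops the oldest versions beyond the limit.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry is the oldest, since order is newest first
        }
    }

    // Newest value only, analogous to reading a single value from the result.
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    // All retained values, newest first, analogous to reading every cell version.
    List<String> all() {
        return new ArrayList<>(versions.values());
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch(3);
        for (int i = 0; i < 10; i++) {
            cell.put(1000L + i, Integer.toString(i));
        }
        System.out.println(cell.latest()); // prints 9
        System.out.println(cell.all());    // prints [9, 8, 7]
    }
}
```

With a limit of 3 only the three newest values survive, which mirrors why the column family limit has to be raised before a Get asking for 10 versions can return all of them.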