Is there a good way to get the first empty cell in a column from Google's spreadsheet service via Java?
I know I can use:
public int[] CheckColumn(int row, int col)
        throws IOException, ServiceException {
    CellQuery query = new CellQuery(cellFeedUrl);
    query.setMinimumRow(row);
    query.setMaximumRow(row);
    query.setMinimumCol(col);
    query.setMaximumCol(col);
    query.setReturnEmpty(true);
    CellFeed feed = service.query(query, CellFeed.class);
    int[] cell_loc = null; // stays null if the feed has no entries
    for (CellEntry entry : feed.getEntries()) {
        cell_loc = CheckIfEmpty(entry);
    }
    return cell_loc;
}
and walk through the entries, but I'd rather not load the entire column at once. It's slow for my users, and walking through the whole column just to find one cell seems wasteful.
Any thoughts?
This small snippet will create a function in Google Spreadsheet with Google Apps Script:
function emptySpace(array) {
// set counter
var counter = 0;
// iterate through values
for (var i in array) {
if (array[i].length > 1) {
throw ("Only single column of data");
} else {
if(array[i][0] != "") {
counter++;
} else {
break;
}
}
}
// return value + 1
return counter + 1;
}
Add this script, via the script editor, to your spreadsheet and the function emptySpace is available throughout the worksheet, like so: =emptySpace(A1:A7).
See example file I've created: empty space
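For callers who want the same logic on the Java side, here is a minimal sketch of the script's counting loop. It assumes the column has already been fetched as a list of single-cell rows; the class and method names are hypothetical, and it does not avoid loading the column.

```java
import java.util.List;

public class FirstEmptyCell {
    /**
     * Returns the 1-based row number of the first empty cell in a single column,
     * mirroring the emptySpace script: count leading non-empty cells, then add 1.
     */
    public static int firstEmptyRow(List<List<Object>> column) {
        int counter = 0;
        for (List<Object> row : column) {
            if (row.size() > 1) {
                throw new IllegalArgumentException("Only a single column of data is supported");
            }
            // A missing, null or blank value counts as empty.
            if (row.isEmpty() || row.get(0) == null || row.get(0).toString().isEmpty()) {
                break;
            }
            counter++;
        }
        return counter + 1;
    }

    public static void main(String[] args) {
        List<List<Object>> column = List.of(
                List.<Object>of("a"), List.<Object>of("b"),
                List.<Object>of(""), List.<Object>of("d"));
        System.out.println(firstEmptyRow(column)); // prints 3
    }
}
```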
I have a .docx template with placeholders to be filled, such as ${programming_language}, ${education}, etc.
The placeholder keywords must be easily distinguished from the other plain words, hence they are enclosed with ${ }.
for (XWPFTable table : doc.getTables()) {
for (XWPFTableRow row : table.getRows()) {
for (XWPFTableCell cell : row.getTableCells()) {
for (XWPFParagraph paragraph : cell.getParagraphs()) {
for (XWPFRun run : paragraph.getRuns()) {
System.out.println("run text: " + run.text());
/** replace text here, etc. */
}
}
}
}
}
I want to extract the placeholders together with the enclosing ${ } characters. The problem is that it seems the enclosing characters are treated as separate runs...
run text: ${
run text: programming_language
run text: }
run text: Some plain text here
run text: ${
run text: education
run text: }
Instead, I would like to achieve the following effect:
run text: ${programming_language}
run text: Some plain text here
run text: ${education}
I have tried using other enclosing characters, such as: { }, < >, # #, etc.
I do not want to do some weird concatenations of runs, etc. I want to have it in a single XWPFRun.
If I cannot find a proper solution, I think I will just name them like so: VAR_PROGRAMMING_LANGUAGE, VAR_EDUCATION.
Apache POI 4.1.2 (current at the time of writing) provides TextSegment to deal with these Word text-run issues. XWPFParagraph.searchText searches for a string in a paragraph and returns a TextSegment. This gives access to the begin run and the end run of that text in the paragraph (BeginRun and EndRun), and to the start character position in the begin run and the end character position in the end run (BeginChar and EndChar).
It additionally gives access to the index of the text element within the text run (BeginText and EndText). This should always be 0, because default text runs have only one text element.
With this, we can do the following:
Replace the found partial string in the begin run with the replacement. To do so, take the text that came before the searched string and append the replacement to it. After that, the begin run fully contains the replacement.
Delete all text runs between the begin run and the end run, as they contain parts of the searched string that are no longer needed.
Keep only the text after the searched string in the end run.
This way we can replace text that is spread across multiple text runs.
The following example shows this.
import java.io.*;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
public class WordReplaceTextSegment {
static public void replaceTextSegment(XWPFParagraph paragraph, String textToFind, String replacement) {
TextSegment foundTextSegment = null;
PositionInParagraph startPos = new PositionInParagraph(0, 0, 0);
while((foundTextSegment = paragraph.searchText(textToFind, startPos)) != null) { // search all text segments having text to find
System.out.println(foundTextSegment.getBeginRun()+":"+foundTextSegment.getBeginText()+":"+foundTextSegment.getBeginChar());
System.out.println(foundTextSegment.getEndRun()+":"+foundTextSegment.getEndText()+":"+foundTextSegment.getEndChar());
// maybe there is text before textToFind in begin run
XWPFRun beginRun = paragraph.getRuns().get(foundTextSegment.getBeginRun());
String textInBeginRun = beginRun.getText(foundTextSegment.getBeginText());
String textBefore = textInBeginRun.substring(0, foundTextSegment.getBeginChar()); // we only need the text before
// maybe there is text after textToFind in end run
XWPFRun endRun = paragraph.getRuns().get(foundTextSegment.getEndRun());
String textInEndRun = endRun.getText(foundTextSegment.getEndText());
String textAfter = textInEndRun.substring(foundTextSegment.getEndChar() + 1); // we only need the text after
if (foundTextSegment.getEndRun() == foundTextSegment.getBeginRun()) {
textInBeginRun = textBefore + replacement + textAfter; // if we have only one run, we need the text before, then the replacement, then the text after in that run
} else {
textInBeginRun = textBefore + replacement; // else we need the text before followed by the replacement in begin run
endRun.setText(textAfter, foundTextSegment.getEndText()); // and the text after in end run
}
beginRun.setText(textInBeginRun, foundTextSegment.getBeginText());
// runs between begin run and end run needs to be removed
for (int runBetween = foundTextSegment.getEndRun() - 1; runBetween > foundTextSegment.getBeginRun(); runBetween--) {
paragraph.removeRun(runBetween); // remove not needed runs
}
}
}
public static void main(String[] args) throws Exception {
XWPFDocument doc = new XWPFDocument(new FileInputStream("source.docx"));
String textToFind = "${This is the text to find}"; // might be in different runs
String replacement = "Replacement text";
for (XWPFParagraph paragraph : doc.getParagraphs()) { //go through all paragraphs
if (paragraph.getText().contains(textToFind)) { // paragraph contains text to find
replaceTextSegment(paragraph, textToFind, replacement);
}
}
FileOutputStream out = new FileOutputStream("result.docx");
doc.write(out);
out.close();
doc.close();
}
}
The code above does not work in all cases because XWPFParagraph.searchText has bugs. So here is a better searchText method:
/**
* This method parses the paragraph and searches for the string searched.
* If it finds the string, it returns a TextSegment describing where the
* string begins and ends; otherwise it returns null.
*
* @param paragraph the paragraph to search in
* @param searched the string to search for
* @param startPos the position to start searching from
*/
static TextSegment searchText(XWPFParagraph paragraph, String searched, PositionInParagraph startPos) {
int startRun = startPos.getRun(),
startText = startPos.getText(),
startChar = startPos.getChar();
int beginRunPos = 0, candCharPos = 0;
boolean newList = false;
//CTR[] rArray = paragraph.getRArray(); //This does not contain all runs. It lacks hyperlink runs for ex.
java.util.List<XWPFRun> runs = paragraph.getRuns();
int beginTextPos = 0, beginCharPos = 0; //must be outside the for loop
//for (int runPos = startRun; runPos < rArray.length; runPos++) {
for (int runPos = startRun; runPos < runs.size(); runPos++) {
//int beginTextPos = 0, beginCharPos = 0, textPos = 0, charPos; //int beginTextPos = 0, beginCharPos = 0 must be outside the for loop
int textPos = 0, charPos;
//CTR ctRun = rArray[runPos];
CTR ctRun = runs.get(runPos).getCTR();
XmlCursor c = ctRun.newCursor();
c.selectPath("./*");
try {
while (c.toNextSelection()) {
XmlObject o = c.getObject();
if (o instanceof CTText) {
if (textPos >= startText) {
String candidate = ((CTText) o).getStringValue();
if (runPos == startRun) {
charPos = startChar;
} else {
charPos = 0;
}
for (; charPos < candidate.length(); charPos++) {
if ((candidate.charAt(charPos) == searched.charAt(0)) && (candCharPos == 0)) {
beginTextPos = textPos;
beginCharPos = charPos;
beginRunPos = runPos;
newList = true;
}
if (candidate.charAt(charPos) == searched.charAt(candCharPos)) {
if (candCharPos + 1 < searched.length()) {
candCharPos++;
} else if (newList) {
TextSegment segment = new TextSegment();
segment.setBeginRun(beginRunPos);
segment.setBeginText(beginTextPos);
segment.setBeginChar(beginCharPos);
segment.setEndRun(runPos);
segment.setEndText(textPos);
segment.setEndChar(charPos);
return segment;
}
} else {
candCharPos = 0;
}
}
}
textPos++;
} else if (o instanceof CTProofErr) {
c.removeXml();
} else if (o instanceof CTRPr) {
//do nothing
} else {
candCharPos = 0;
}
}
} finally {
c.dispose();
}
}
return null;
}
This will be called like:
...
while((foundTextSegment = searchText(paragraph, textToFind, startPos)) != null) {
...
As someone commented on your question, you can't control where or when Word splits a paragraph into runs. If the other answer still doesn't help, here is how I got around it:
First of all, this "solution" has a big problem, but I will post it anyway in case someone can solve it.
public void mainMethod(XWPFParagraph paragraph) {
if (paragraph.getRuns().size() > 1) {
String myRun = unifyRuns(paragraph.getRuns());
// make the verification of placeholders ${...}
paragraph.getRuns().get(0).setText(myRun, 0); // position 0 overwrites the run's text; setText(String) alone would append
while(paragraph.getRuns().size() > 1) {
paragraph.removeRun(1);
}
}
}
private String unifyRuns(List<XWPFRun> runElements) {
StringBuilder unifiedRun = new StringBuilder();
for (XWPFRun run : runElements) {
unifiedRun.append(run);
}
return unifiedRun.toString();
}
The code may contain some errors, since I'm writing it from memory.
The problem here is that Word doesn't split paragraphs into runs for no reason: when text has different formatting (such as font family or font size), Word puts the differently formatted pieces in separate runs.
In the text "Here's my bold text", Word splits the text to separate the bold part from the normal part. So the code above is a bad solution if you are using POI to create large documents with mixed formatting. In that case you would first need to check whether a run is actually bold, and only then process the placeholders.
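One way to respect formatting is to merge only neighbouring runs that share the same formatting. The sketch below illustrates the idea on a hypothetical Run class with a single bold flag; it is not POI's XWPFRun, and a real implementation would have to compare every run property that matters.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeRunsSketch {
    /** Hypothetical stand-in for a Word run: a piece of text plus one formatting flag. */
    public static final class Run {
        final String text;
        final boolean bold;

        public Run(String text, boolean bold) {
            this.text = text;
            this.bold = bold;
        }

        @Override
        public String toString() {
            return (bold ? "[b]" : "") + text;
        }
    }

    /** Merges neighbouring runs only when their formatting matches. */
    public static List<Run> mergeAdjacent(List<Run> runs) {
        List<Run> merged = new ArrayList<>();
        for (Run run : runs) {
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).bold == run.bold) {
                // Same formatting as the previous run: append the text to it.
                merged.set(last, new Run(merged.get(last).text + run.text, run.bold));
            } else {
                merged.add(run);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Run> runs = List.of(
                new Run("${", false), new Run("education", false), new Run("}", false),
                new Run(" bold text", true));
        System.out.println(mergeAdjacent(runs)); // prints [${education}, [b] bold text]
    }
}
```

A placeholder that itself straddles a formatting change would still end up in two runs with this approach.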
Again, this is a "solution" that I found, and it's not complete yet. Sorry for any English errors; I'm using Google Translate to write this answer.
My Java Spring Boot app needs to create a new Excel file based on the contents of my DB. My current solution takes all the data from my DB and inserts it into my Excel sheet, but I want to improve it so that I don't have to state each cell value explicitly. For example, although it works, my data has 34 fields, so I am writing the userRow.createCell line 34 times, once per field, which is repetitive. Ideally I want to say "create cell n" and take all the values from each row in the DB. How can this be done? Another for loop within this for loop? Every example I looked at online seems to state each cell value explicitly.
List<CaseData> cases = (List<CaseData>) model.get("cases");
Sheet sheet = workbook.createSheet("PIE Cases");
int rowCount = 1;
for (CaseData pieCase : cases) {
Row userRow = sheet.createRow(rowCount++);
userRow.createCell(0).setCellValue(pieCase.getCaseId());
userRow.createCell(1).setCellValue(pieCase.getAcknowledgementReceivedDate());
}
Use the Reflection API
Example:
try {
Class<CaseData> caseDataObj = CaseData.class;
// Note: getDeclaredMethods() returns the methods in no guaranteed order
Method[] methods = caseDataObj.getDeclaredMethods();
Sheet sheet = workbook.createSheet("PIE Cases");
int rowCount = 1;
for(CaseData cd : cases) {
int cellIndex = 0;
Row userRow = sheet.createRow(rowCount++);
for (Method method : methods) {
String methodName = method.getName();
if(methodName.startsWith("get")) {
// Assuming all getters return String
userRow.createCell(cellIndex++).setCellValue((String) method.invoke(cd));
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
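Because getDeclaredMethods() makes no promise about ordering, the column order produced by the loop above can vary. Here is a self-contained sketch of one way to stabilise it, by sorting the getters by name. The CaseData class is a hypothetical two-field stand-in for the real bean, and rowValues is an illustrative helper, not part of POI.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GetterRowSketch {
    /** Hypothetical stand-in for the CaseData bean; the real class has many more fields. */
    public static class CaseData {
        public String getCaseId() { return "C-1"; }
        public String getAcknowledgementReceivedDate() { return "2020-01-31"; }
    }

    /** Collects the values of all public no-argument getters, ordered by method name. */
    public static List<String> rowValues(Object bean) {
        Method[] methods = bean.getClass().getDeclaredMethods();
        // Sort so that every row lists its columns in the same order.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        List<String> values = new ArrayList<>();
        for (Method method : methods) {
            boolean isGetter = method.getName().startsWith("get")
                    && method.getParameterCount() == 0
                    && Modifier.isPublic(method.getModifiers());
            if (!isGetter) {
                continue;
            }
            try {
                values.add(String.valueOf(method.invoke(bean)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot call " + method.getName(), e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(rowValues(new CaseData())); // prints [2020-01-31, C-1]
    }
}
```

Each element of the returned list would then be written with userRow.createCell(i).setCellValue(values.get(i)). Alphabetical order is only one choice; an explicit list of column names would give full control over the layout.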
There are probably many ways to do this. You can try something like the following; this is how I usually approach this kind of task.
public enum DATA {
CASE_ID(0),
ACK_RECIEVED(1),
ETC(2);
//ETC(3) and so on
public int index;
DATA(int index) {
this.index = index;
}
public Object parse(CaseData data) {
switch (this) {
case CASE_ID:
return data.getCaseId();
case ACK_RECIEVED:
return data.getAcknowledgementReceivedDate();
case ETC:
return "etc...";
default: return null;
}
}
}
Then, the implementation is:
List<CaseData> cases = (List<CaseData>) model.get("cases");
Sheet sheet = workbook.createSheet("PIE Cases");
int rowCount = 1;
for (CaseData pieCase : cases) {
Row userRow = sheet.createRow(rowCount++);
for (DATA DAT : DATA.values()) {
userRow.createCell(DAT.index).setCellValue(String.valueOf(DAT.parse(pieCase))); // setCellValue has no Object overload
}
}
I have a Google spreadsheet which contains two or more worksheets. I am able to print all tab (worksheet) names using Java. I'm looking for a way to print the data of all worksheets; by default it prints only the first tab. I am attaching my code below. Could someone please help? I am pretty new to this.
You need to specify the range first; the data is then returned in a ValueRange object. The sheet name is part of the range, so repeat the request with each worksheet's name to read every tab. See the example code below.
String range = "UK!A2:E";
ValueRange response = service.spreadsheets().values()
.get(spreadsheetId, range)
.execute();
List<List<Object>> values = response.getValues();
if (values == null || values.isEmpty()) {
System.out.println("No data found.");
} else {
for (List<Object> row : values) {
Iterator<Object> elem = row.iterator();
while (elem.hasNext()) {
System.out.println(elem.next());
}
}
}
Is it possible to parse a delimited file and find the column datatypes? For example:
Delimited file:
Email,FirstName,DOB,Age,CreateDate
test@test1.com,Test User1,20/01/2001,24,23/02/2015 14:06:45
test@test2.com,Test User2,14/02/2001,24,23/02/2015 14:06:45
test@test3.com,Test User3,15/01/2001,24,23/02/2015 14:06:45
test@test4.com,Test User4,23/05/2001,24,23/02/2015 14:06:45
Output:
Email datatype: email
FirstName datatype: Text
DOB datatype: date
Age datatype: int
CreateDate datatype: Timestamp
The purpose of this is to read a delimited file and construct a table creation query on the fly and insert data into that table.
I tried using Apache Commons Validator; I believe we need to parse the complete file in order to determine each column's data type.
EDIT: The code that I've tried:
CSVReader csvReader = new CSVReader(new FileReader(fileName),',');
String[] row = null;
int[] colLength=(int[]) null;
int colCount = 0;
String[] colDataType = null;
String[] colHeaders = null;
String[] header = csvReader.readNext();
if (header != null) {
colCount = header.length;
}
colLength = new int[colCount];
colDataType = new String[colCount];
colHeaders = new String[colCount];
for (int i=0;i<colCount;i++){
colHeaders[i]=header[i];
}
int templength=0;
String tempType = null;
IntegerValidator intValidator = new IntegerValidator();
DateValidator dateValidator = new DateValidator();
TimeValidator timeValidator = new TimeValidator();
while((row = csvReader.readNext()) != null) {
for(int i=0;i<colCount;i++) {
templength = row[i].length();
colLength[i] = templength > colLength[i] ? templength : colLength[i];
if(colHeaders[i].equalsIgnoreCase("email")){
logger.info("Col "+i+" is Email");
} else if(intValidator.isValid(row[i])){
tempType="Integer";
logger.info("Col "+i+" is Integer");
} else if(timeValidator.isValid(row[i])){
tempType="Time";
logger.info("Col "+i+" is Time");
} else if(dateValidator.isValid(row[i])){
tempType="Date";
logger.info("Col "+i+" is Date");
} else {
tempType="Text";
logger.info("Col "+i+" is Text");
}
logger.info(row[i].length()+"");
}
} // closes the while loop over the CSV rows
Not sure if this is the best way of doing this; any pointers in the right direction would be helpful.
If you wish to write this yourself rather than use a third party library then probably the easiest mechanism is to define a regular expression for each data type and then check if all fields satisfy it. Here's some sample code to get you started (using Java 8).
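import java.util.Arrays;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public enum DataType {
    DATETIME("\\d\\d/\\d\\d/\\d\\d\\d\\d \\d\\d:\\d\\d:\\d\\d"),
    DATE("\\d\\d/\\d\\d/\\d\\d\\d\\d"),
    EMAIL("\\w+@\\w+"),
    TEXT(".*");

    private final Predicate<String> tester;

    DataType(String regexp) {
        // asPredicate() matches anywhere in the string (find semantics)
        tester = Pattern.compile(regexp).asPredicate();
    }

    public static Optional<DataType> getTypeOfField(String[] fieldValues) {
        return Arrays.stream(values())
                .filter(dt -> Arrays.stream(fieldValues).allMatch(dt.tester))
                .findFirst();
    }
}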
Note that this relies on the order of the enum values (e.g. testing for datetime before date).
Yes, it is possible, and you do have to parse the entire file first. Define a set of rules for each data type, then iterate over every row in a column. Start with every column being a candidate for every data type, and eliminate a data type as soon as a row in that column violates one of its rules. After iterating over the column, check which data types are left. For example, say we have two data types, integer and text. The rule for integer: it must contain only the digits 0-9 and may begin with '-'. Text can be anything.
Our column:
345
-1ab
123
The integer data type would be eliminated by the second row, so the column would be text. If row two were just -1, you would be left with both integer and text, and the column would be integer: text is never eliminated, because our rule says text can be anything. So you don't have to check for text at all; if no other data type is left, the answer is text. Hope this answers your question.
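The elimination approach described above can be sketched as follows. This is a minimal illustration with only the two types from the example; the class name, enum and regular expressions are assumptions, and real data would need more types and stricter rules.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class ColumnTypeSketch {
    /** Candidate types, most specific first; TEXT accepts anything and is never eliminated. */
    public enum Type {
        INTEGER("-?[0-9]+"),
        TEXT("(?s).*");

        private final Pattern rule;

        Type(String regex) {
            this.rule = Pattern.compile(regex);
        }

        boolean accepts(String value) {
            return rule.matcher(value).matches();
        }
    }

    /** Starts with every type as a candidate and removes each one that some value violates. */
    public static Type infer(List<String> columnValues) {
        Set<Type> candidates = EnumSet.allOf(Type.class);
        for (String value : columnValues) {
            candidates.removeIf(type -> !type.accepts(value));
        }
        // EnumSet iterates in declaration order, so the most specific survivor wins.
        return candidates.iterator().next();
    }

    public static void main(String[] args) {
        System.out.println(infer(List.of("345", "-1ab", "123"))); // prints TEXT
        System.out.println(infer(List.of("345", "-1", "123")));   // prints INTEGER
    }
}
```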
I needed slightly similar logic for my project. I searched a lot but did not find the right solution. In my case I need to pass a String to a method that returns the data type of its content. Finally I found the post from @sprinter; it is close to what I need, but I need to pass a single string instead of a string array.
I modified the code for my needs and posted it below.
public enum DataType {
DATE("\\d\\d/\\d\\d/\\d\\d\\d\\d"),
EMAIL(".*@gmail.*"), // matches() must cover the whole string, hence the .* on both sides
NUMBER("[0-9]+"),
STRING("^[A-Za-z0-9? ,_-]+$");
private final String regEx;
public String getRegEx() {
return regEx;
}
DataType(String regEx) {
this.regEx = regEx;
}
public static Optional<DataType> getTypeOfField(String str) {
return Arrays.stream(DataType.values())
.filter(dt -> {
return Pattern.compile(dt.getRegEx()).matcher(str).matches();
})
.findFirst();
}
}
For example:
Optional<DataType> dataType = DataType.getTypeOfField("Bharathiraja");
System.out.println(dataType);
System.out.println(dataType.get());
Output:
Optional[STRING]
STRING
Please note that the regular expression patterns vary with your requirements, so adapt the patterns to your needs rather than using them as they are.
Happy coding!
I have an HBase table where all keys have the following structure: ID,DATE,OTHER_DETAILS
For example:
10,2012-05-01,"some details"
10,2012-05-02,"some details"
10,2012-05-03,"some details"
10,2012-05-04,"some details"
...
How can I write a scan that gets all the rows that are older than some date?
For example, 2012-05-01 and 2012-05-02 are older than 2012-05-03.
Scan scan = new Scan();
Filter f = ???
scan.setFilter(f);
scan.setCaching(1000);
ResultScanner rs = table.getScanner(scan);
You can create your own Filter and implement the method filterRowKey. To make the scan faster you can also implement the method getNextKeyHint, but this is a bit more complicated. The disadvantage of this approach is that you need to put the jar file with your filter on the HBase classpath and restart the cluster.
Here is an approximate implementation of this filter:
@Override
public void reset() {
this.filterOutRow = false;
}
@Override
public Filter.ReturnCode filterKeyValue(KeyValue v) {
if(this.filterOutRow) {
return ReturnCode.SEEK_NEXT_USING_HINT;
}
return Filter.ReturnCode.INCLUDE;
}
@Override
public boolean filterRowKey(byte[] data, int offset, int length) {
// filter out (skip) rows whose date lies outside the (startDate, endDate) range
if(!(startDate < getDate(data) && endDate > getDate(data))) {
this.filterOutRow = true;
}
return this.filterOutRow;
}
@Override
public KeyValue getNextKeyHint(KeyValue currentKV) {
if(getDate(currentKV) < startDate){
String nextKey = getId(currentKV)+","+startDate.getTime();
return KeyValue.createFirstOnRow(Bytes.toBytes(nextKey));
}
if(getDate(currentKV) > endDate){
String nextKey = (getId(currentKV)+1)+","+startDate.getTime();
return KeyValue.createFirstOnRow(Bytes.toBytes(nextKey));
}
return null;
}
@Override
public boolean filterRow() {
return this.filterOutRow;
}
Store the key of the very first row somewhere. Being the first row, it is older than all other rows (am I correct?), so it will always be in your final result set.
Now take the date you want to filter by and create a RowFilter with a RegexStringComparator for this date. This gives you the row matching the specified date. Then, using this row and the first row you stored earlier, do a range query.
And if you have multiple rows with the same date, say:
10,2012-05-04,"some details"
10,2012-05-04,"some new details"
take the last row returned by the RowFilter, and use the same technique.
HTH
What I was trying to say is that you can use a range query to achieve this. The "startrowkey" will be the first row of your table; being the first row, it is always the oldest, which means it will always be in your result. The "stoprowkey" for the range query will be the row that contains the given date. To find the stoprowkey you can set a "RowFilter" with a "RegexStringComparator".
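byte[] startRowKey = FIRST_ROW_OF_THE_TABLE;
byte[] stopRowKey = null; // declared outside the loop so the second scan can use it

// First scan: find the row key that contains the given date
Scan filterScan = new Scan();
Filter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("YOUR_REGEX"));
filterScan.setFilter(rowFilter);
ResultScanner scanner1 = table.getScanner(filterScan);
for (Result res : scanner1) {
    stopRowKey = res.getRow(); // keeps the last matching row
}
scanner1.close();

// Second scan: range query from the first row up to (not including) the stop row
Scan rangeScan = new Scan();
rangeScan.setStartRow(startRowKey);
rangeScan.setStopRow(stopRowKey);
ResultScanner scanner2 = table.getScanner(rangeScan);
for (Result res : scanner2) {
    // your final result
}
scanner2.close();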
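To see why a start/stop range answers the "older than" question, here is a small model that uses a sorted TreeMap in place of an HBase table. It rests on two assumptions: all rows share the same ID prefix, and the dates are in yyyy-MM-dd form so that they sort lexicographically. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RowKeyRangeSketch {
    /** Returns the keys in [startKey, stopKey): start row inclusive, stop row exclusive. */
    public static List<String> keysInRange(TreeMap<String, String> table,
                                           String startKey, String stopKey) {
        return new ArrayList<>(table.subMap(startKey, stopKey).keySet());
    }

    public static void main(String[] args) {
        // The TreeMap keeps its keys sorted, standing in for the table's sorted row keys.
        TreeMap<String, String> table = new TreeMap<>();
        for (String day : new String[] {"01", "02", "03", "04"}) {
            table.put("10,2012-05-" + day, "some details");
        }
        System.out.println(keysInRange(table, table.firstKey(), "10,2012-05-03"));
        // prints [10,2012-05-01, 10,2012-05-02]
    }
}
```

With several IDs in the table, one such range per ID prefix would be needed, since the date is only the second component of the key.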