Remove Highlight matching String content - java

A few days ago I posted a question about removing highlighted text in a JTextArea:
Removing Highlight from specific word - Java
Back then I wrote code that removes highlights by matching their length, but now my app has many words of the same length, so the wrong highlights get removed.
Does anyone know a library, or a way, to remove a highlight by matching the content of each highlighted string instead?

You could write a method to get the text covered by a given highlight:
private static String highlightedText(Highlighter.Highlight h, Document d) {
    // Highlighter.Highlight exposes start and end offsets into the document
    int start = h.getStartOffset();
    int end = h.getEndOffset();
    try {
        return d.getText(start, end - start);
    } catch (BadLocationException e) {
        // The offsets are no longer valid for this document: treat as no text
        return "";
    }
}
Then your removeHighlights method would look like this:
public void removeHighlights(JTextComponent c, String toBlackOut) {
    Highlighter highlighter = c.getHighlighter();
    Highlighter.Highlight[] highlights = highlighter.getHighlights();
    Document d = c.getDocument();
    for (Highlighter.Highlight h : highlights) {
        // TextHighLighter is presumably your own painter class from the earlier question
        if (highlightedText(h, d).equals(toBlackOut) && h.getPainter() instanceof TextHighLighter)
            highlighter.removeHighlight(h);
    }
}

Related

StringComparison in java

We developed a PDF reader desktop app using iTextSharp, and we are now developing an Android app. This is my C# code; I want to know what can be used for StringComparison in Java:
public final void PDFReferenceGetter(String pSearch, StringComparison SC, String sourceFile, String destinationFile)
{
//
}
The full code:
public final void PDFReferenceGetter(String pSearch, String SC, String sourceFile, String destinationFile)
{
PdfStamper stamper = null;
PdfContentByte contentByte;
Rectangle refRectangle = null;
int refPage = 0;
//this.Cursor = Cursors.WaitCursor;
if ((new java.io.File(sourceFile)).isFile())
{
PdfReader pReader = new PdfReader(sourceFile);
stamper = new PdfStamper(pReader, new FileOutputStream(destinationFile));
for (int page = 1; page <= pReader.getNumberOfPages(); page++)
{
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
contentByte = stamper.getUnderContent(page);
//Send some data contained in PdfContentByte, looks like the first is always cero for me and the second 100, but i'm not sure if this could change in some cases
strategy._UndercontentCharacterSpacing = contentByte.getCharacterSpacing();
strategy._UndercontentHorizontalScaling = contentByte.getHorizontalScaling();
//It's not really needed to get the text back, but we have to call this line ALWAYS,
//because it triggers the process that will get all chunks from PDF into our strategy Object
String currentText = PdfTextExtractor.getTextFromPage(pReader, page, strategy);
//The real getter process starts in the following line
java.util.ArrayList<Rectangle> matchesFound = strategy.GetTextLocations("References", SC);
//Set the fill color of the shapes, I don't use a border because it would make the rect bigger
//but maybe using a thin border could be a solution if you see the currect rect is not big enough to cover all the text it should cover
contentByte.setColorFill(BaseColor.PINK);
//MatchesFound contains all text with locations, so do whatever you want with it, this highlights them using PINK color:s
for (Rectangle rect : matchesFound)
{
refRectangle = rect;
refPage = page;
}
contentByte.fill();
}
for (int page = 1; page <= pReader.getNumberOfPages(); page++)
{
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
contentByte = stamper.getUnderContent(page);
//Send some data contained in PdfContentByte, looks like the first is always cero for me and the second 100, but i'm not sure if this could change in some cases
strategy._UndercontentCharacterSpacing = contentByte.getCharacterSpacing();
strategy._UndercontentHorizontalScaling = contentByte.getHorizontalScaling();
//It's not really needed to get the text back, but we have to call this line ALWAYS,
//because it triggers the process that will get all chunks from PDF into our strategy Object
String currentText = PdfTextExtractor.getTextFromPage(pReader, page, strategy);
String text = currentText;
String patternString = pSearch;
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
if(matches == true)
{
ArrayList<String> mc;
mc.add(text);
//MatchCollection mc = Regex.Matches(currentText, pSearch);
java.util.ArrayList<Rectangle> matchesFound = new java.util.ArrayList<Rectangle>();
for (String m : mc)
{
matchesFound = strategy.getTextLocations(m.toString(), SC);
for (Rectangle rect : matchesFound)
{
contentByte.rectangle(rect.getLeft(), rect.getBottom(), rect.getWidth(), rect.getHeight());
PdfDestination pdfdest = new PdfDestination(PdfDestination.XYZ, refRectangle.LEFT, refRectangle.TOP, 0);
PdfAnnotation annot = PdfAnnotation.createLink(stamper.getWriter(), rect, PdfAnnotation.HIGHLIGHT_INVERT, refPage, pdfdest);
stamper.addAnnotation(annot, page);
}
}
//The real getter process starts in the following line
//Set the fill color of the shapes, I don't use a border because it would make the rect bigger
//but maybe using a thin border could be a solution if you see the currect rect is not big enough to cover all the text it should cover
contentByte.setColorFill(BaseColor.LIGHT_GRAY);
//MatchesFound contains all text with locations, so do whatever you want with it, this highlights them using PINK color:
contentByte.fill();
}
stamper.close();
pReader.close();
}
//this.Cursor = Cursors.Default;
}
}

The StringComparison enum is described in more detail here.
The short answer is that there is, unfortunately, no suitable type in the Java libraries.
The easy solution
Create your own Java enum mirroring the C# one. You also have to write your own string comparison method that takes the StringComparison into account, e.g. ignoring case or not, depending on its value.
The best solution
I would avoid using StringComparison in the method's interface. Instead, search for usages of the method. I'm guessing it is only used to ignore case in some calls and not in others, or that it is completely unused. In the latter case, simply remove it and you're done. In the former case, just pass a boolean to the method instead. Remember to update the C# code to keep the two ports somewhat in sync.
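As a rough illustration of the "easy solution", here is a minimal sketch of such an enum. The name StringComparisonMode and its two methods are hypothetical, and it mirrors only the ordinal members of the C# enum; the culture-sensitive members are left out on purpose.

```java
// Hypothetical stand-in for C#'s StringComparison (ordinal members only).
enum StringComparisonMode {
    ORDINAL,
    ORDINAL_IGNORE_CASE;

    /** Compares two strings for equality according to this mode. */
    boolean equal(String a, String b) {
        if (a == null || b == null) {
            return a == b; // equal only when both are null
        }
        return this == ORDINAL ? a.equals(b) : a.equalsIgnoreCase(b);
    }

    /** Returns the index of the first occurrence of term in text, or -1. */
    int indexOf(String text, String term) {
        if (this == ORDINAL) {
            return text.indexOf(term);
        }
        // regionMatches with ignoreCase = true compares without regard to case
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }
}
```

A method such as PDFReferenceGetter could then take a StringComparisonMode parameter where the C# version takes a StringComparison.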

If you have one string:
String myString = "somestring";
And another one:
String anotherString = "somestringelse";
You can use the built-in equals() method like this:
if(myString.equals(anotherString)) {
//Do code
}

You can use the basic Java string comparison methods.
To compare two strings as a whole:
String string1 = "abcd", string2 = "abcd";
if (string1.equals(string2)) // true, because the strings are equal; otherwise false
To compare two strings while ignoring case:
String string1 = "abcd", string2 = "AbCd";
if (string1.equalsIgnoreCase(string2)) // true, because the strings differ only in case; otherwise false
If you don't want to compare whole strings, see the following link for all the string comparison methods in Java:
http://docs.oracle.com/javase/tutorial/java/data/comparestrings.html

How many times a text appears in webpage - Selenium Webdriver

Hi, I would like to count how many times a text, e.g. "VIM LIQUID MARATHI", appears on a page using Selenium WebDriver (Java). Please help.
I have used the following in the main class to check whether a text appears on the page:
assertEquals(true,isTextPresent("VIM LIQUID MARATHI"));
and a function that returns a boolean:
protected boolean isTextPresent(String text){
try{
boolean b = driver.getPageSource().contains(text);
System.out.println(b);
return b;
}
catch(Exception e){
return false;
}
}
... but I do not know how to count the number of occurrences.

The problem with using getPageSource() is that there could be ids, class names, or other parts of the markup that match your String but don't actually appear on the page. I suggest calling getText() on the body element instead, which returns only the page's content and not the HTML. If I'm understanding your question correctly, I think that is closer to what you are looking for.
// get the text of the body element
WebElement body = driver.findElement(By.tagName("body"));
String bodyText = body.getText();
// count occurrences of the string
int count = 0;
// search for the String within the text
while (bodyText.contains("VIM LIQUID MARATHI")){
// when match is found, increment the count
count++;
// continue searching from where you left off
bodyText = bodyText.substring(bodyText.indexOf("VIM LIQUID MARATHI") + "VIM LIQUID MARATHI".length());
}
System.out.println(count);
The variable count contains the number of occurrences.
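The same counting idea can be written without repeatedly cutting the string, by passing a start index to indexOf. This is only a sketch: the class name OccurrenceCounter is made up for illustration, and the text argument is assumed to be whatever body.getText() returned.

```java
// Hypothetical helper: counts non-overlapping occurrences of term in text.
final class OccurrenceCounter {
    static int count(String text, String term) {
        if (term.isEmpty()) {
            return 0; // an empty term would otherwise match at every position
        }
        int count = 0;
        int from = 0;
        // indexOf(term, from) resumes the search where the last match ended
        while ((from = text.indexOf(term, from)) != -1) {
            count++;
            from += term.length();
        }
        return count;
    }
}
```

With the code above, this would be called as OccurrenceCounter.count(bodyText, "VIM LIQUID MARATHI").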

There are two different ways to do this:
int size = driver.findElements(By.xpath("//*[text()='text to match']")).size();
This tells the driver to find all elements whose text equals the string, and then returns how many were found.
The second way is to search the HTML, like you said.
int size = driver.getPageSource().split(Pattern.quote("text to match"), -1).length - 1;
This gets the page source, splits it at every match, and counts the resulting pieces minus one. Pattern.quote is needed because split takes a regular expression, and the -1 limit keeps trailing empty strings so that a match at the very end of the source is still counted.

You can try executing a JavaScript expression through WebDriver:
((JavascriptExecutor)driver).executeScript("yourScript();");
If the page uses jQuery, you can use jQuery's selectors:
((JavascriptExecutor)driver).executeScript("return jQuery([proper selector]).length");
[proper selector] should be a selector that matches the text you are searching for.

If the text appears as link text, try:
int size = driver.findElements(By.partialLinkText("VIM LIQUID MARATHI")).size();

caret position into the html of JEditorPane

The getCaretPosition method of JEditorPane gives an index into the text-only part of the HTML control. Is it possible to get the index into the HTML text?
To be more specific, suppose I have the following HTML text (where | denotes the caret position):
abcd<img src="1.jpg"/>123|<img src="2.jpg"/>
Now getCaretPosition gives 8 while I would need 25 as a result to read out the filename of the image.

I had mostly the same problem and solved it with the following method (I used JTextPane, but it should be the same for JEditorPane):
public int getCaretPositionHTML(JTextPane pane) {
HTMLDocument document = (HTMLDocument) pane.getDocument();
String text = pane.getText();
String x;
Random RNG = new Random();
while (true) {
x = RNG.nextLong() + "";
if (text.indexOf(x) < 0) break;
}
try {
document.insertString(pane.getCaretPosition(), x, null);
} catch (BadLocationException ex) {
ex.printStackTrace();
return -1;
}
text = pane.getText();
int i = text.indexOf(x);
pane.setText(text.replace(x, ""));
return i;
}
It just assumes your JTextPane won't contain all possible Long values ;)

The underlying model of the JEditorPane (some subclass of StyledDocument, in your case HTMLDocument) doesn't actually hold the HTML text as its internal representation. Instead, it has a tree of Elements containing style attributes. It only becomes HTML once that tree is run through the HTMLWriter. That makes what you're trying to do kinda tricky! I could imagine putting some flag attribute on the character element that you're currently on, and then using a specially crafted subclass of HTMLWriter to write out until that marker and count the characters, but that sounds like something of an epic hack. There is probably an easier way to get what you want there, though it's a bit unclear to me what that actually is.

I had the same problem, and solved it with the following code:
editor.getDocument().insertString(editor.getCaretPosition(),"String to insert", null);

I don't think you can make the caret count tags as characters. If your final aim is to read the image filename, you should use:
(HTMLEditorKit) editorPane.getEditorKitForContentType("text/html");
For more information about usage, see the Oracle HTMLEditorKit documentation and this O'Reilly PDF, which contains interesting examples.
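One way to get at the filename without an HTML-text index is to ask the document model for the element at a position and read its src attribute. The sketch below is an assumption about how that could look, not the answerer's code: the class ImageAtCaret and its methods are hypothetical, and the markup is parsed straight into an HTMLDocument so that no editor pane is needed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper for reading image sources from the document model.
final class ImageAtCaret {
    /** Parses HTML markup into an HTMLDocument. */
    static HTMLDocument parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }

    /** Returns the src of the image at model position pos, or null if there is none. */
    static String imageSourceAt(HTMLDocument doc, int pos) {
        Element leaf = doc.getCharacterElement(pos);
        AttributeSet attrs = leaf.getAttributes();
        // Leaf elements carry their tag in the name attribute
        if (attrs.getAttribute(StyleConstants.NameAttribute) != HTML.Tag.IMG) {
            return null;
        }
        Object src = attrs.getAttribute(HTML.Attribute.SRC);
        return src == null ? null : src.toString();
    }

    /** Collects every image source by probing each model position once. */
    static List<String> imageSources(HTMLDocument doc) {
        List<String> sources = new ArrayList<>();
        for (int pos = 0; pos <= doc.getLength(); pos++) {
            String src = imageSourceAt(doc, pos);
            if (src != null) {
                sources.add(src);
            }
        }
        return sources;
    }
}
```

With an editor pane, the document would come from pane.getDocument(), and imageSourceAt(doc, pane.getCaretPosition()) would look at the element directly after the caret. The image tags in the test markup are written without the trailing slash.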

Why is the size of this vector 1?

When I use System.out.println to show the size of a vector after calling the following method, it shows 1, although it should show 2 because the String parameter is "7455573;photo41.png;photo42.png".
private void getIdClientAndPhotonames(String csvClientPhotos)
{
Vector vListPhotosOfClient = new Vector();
String chainePhotos = "";
String photoName = "";
String photoDirectory = new String(csvClientPhotos.substring(0, csvClientPhotos.indexOf(';')));
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';')+1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
if (chainePhotos.indexOf(';') == -1)
{
vListPhotosOfClient.addElement(new String(chainePhotos));
}
else // aaa;bbb;...
{
for (int i = 0 ; i < chainePhotos.length() ; i++)
{
if (chainePhotos.charAt(i) == ';')
{
vListPhotosOfClient.addElement(new String(photoName));
photoName = "";
continue;
}
photoName = photoName.concat(String.valueOf(chainePhotos.charAt(i)));
}
}
}
So the vector should contain the two Strings photo41.png and photo42.png, but when I print the vector content I get only photo41.png.
So what is wrong in my code?

This answer is no longer valid for this question, because it has been retagged java-me. It still holds for plain Java (as the question was originally tagged): use String#split if you need to handle CSV data.
It would be far easier to split the string:
String[] parts = csvClientPhotos.split(";");
This will give a string array:
{"7455573","photo41.png","photo42.png"}
Then you'd simply copy parts[1] and parts[2] to your vector.
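Put together, that could look like the sketch below. The class and method names are made up for illustration, and it assumes plain Java rather than Java ME, since it relies on String#split and generics.

```java
import java.util.Vector;

// Hypothetical helper: extracts the photo names that follow the client id.
final class PhotoCsv {
    static Vector<String> photoNames(String csvClientPhotos) {
        String[] parts = csvClientPhotos.split(";");
        Vector<String> photos = new Vector<>();
        // parts[0] is the client id, so the photo names start at index 1
        for (int i = 1; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                photos.addElement(parts[i]);
            }
        }
        return photos;
    }
}
```

Because split drops trailing empty strings, an input that ends with a ; yields the same two names.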

You have two immediate problems.
The first is with your initial manipulation of the string. The two lines:
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';')+1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
when applied to 7455573;photo41.png;photo42.png will end up giving you photo41.png.
That's because the first line removes everything up to the first ; (7455573;) and the second strips off everything from the final ; onwards (;photo42.png). If your intent is to just get rid of the 7455573; bit, you don't need the second line.
Note that fixing this issue alone will not solve all your ills, you still need one more change.
Even though your input string (to the loop) is the correct photo41.png;photo42.png, you still only add an item to the vector each time you encounter a delimiting ;. There is no such delimiter at the end of that string, meaning that the final item won't be added.
You can fix this by putting the following immediately after the for loop:
if (! photoName.equals(""))
vListPhotosOfClient.addElement(new String(photoName));
which will catch the case of the final name not being terminated with the ;.

These two lines are the problem:
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';') + 1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
After the first one, chainePhotos contains "photo41.png;photo42.png", but the second one reduces it to photo41.png, which triggers the if and ends the method with only one element in the vector.
EDITED: what a mess.
I ran it with the correct input (as provided by the OP) and made a comment above.
I then fixed it as suggested above, while accidentally changing the input to 7455573;photo41.png;photo42.png; which worked, but is probably incorrect and doesn't match the explanation above input-wise.
I wish someone would un-answer this.

You can split the string manually on the ; separator. Just do it like this:
private void getIdClientAndPhotonames(String csvClientPhotos)
{
Vector vListPhotosOfClient = split(csvClientPhotos);
}
private Vector split(String original) {
Vector nodes = new Vector();
String separator = ";";
// Parse nodes into vector
int index = original.indexOf(separator);
while(index>=0) {
nodes.addElement( original.substring(0, index) );
original = original.substring(index+separator.length());
index = original.indexOf(separator);
}
// Get the last node
nodes.addElement( original );
return nodes;
}

DocumentListener slows down Document.setCharacterAttributes method?

This is my first question on this site, though it is not the first time I have come here to clear up my doubts. Awesome webpage. :)
I'm writing a Java program that highlights code in a JTextPane, and I'm changing the way highlighting is done. I'm using a JTabbedPane to let the user edit more than one file at the same time. I used to perform document highlighting with a Timer; now I've built a highlight queue that runs in a separate thread and implemented a DocumentListener that queues documents as changes take place.
But I have a really big problem: if I add the document via the DocumentListener, the highlight process takes a really long time, while if I add it in the main class by getting the document directly from the JTextPane, it takes just a few milliseconds.
I've run multiple benchmarks on my code and found that what takes so long when the document is added from the DocumentListener is the method Document.setCharacterAttributes().
Here is the method that adds documents via DocumentListener:
// eventType: 0 - insertUpdate / 1- removeUpdate
private void queueChange(javax.swing.event.DocumentEvent e, int eventType){
StyledDocument doc = (StyledDocument) e.getDocument();
int changeLength = e.getLength();
int changeOffset = e.getOffset();
int length = doc.getLength();
String title = (String) doc.getProperty("title");
String text;
try {
text = doc.getText(0, length);
if (changeLength != 1) {
Element element = doc.getDefaultRootElement();
int startLn = element.getElement(element.getElementIndex(changeOffset)).getStartOffset();
int endLn = element.getElement(element.getElementIndex(changeOffset + changeLength)).getEndOffset() - 1;
Engine.addDocument(doc, startLn, endLn, title, text);
} else {
if(eventType == 1){
changeOffset = changeOffset - changeLength;
}
int startLn = text.lastIndexOf("\n", changeOffset) + 1;
int endLn = text.indexOf("\n", changeOffset);
if (endLn < 0) {
if (length != startLn) {
endLn = length;
Engine.addDocument(doc, startLn, endLn, title, text);
}
} else if (startLn != endLn && startLn < endLn) {
Engine.addDocument(doc, startLn, endLn, title, text);
}
}
} catch (BadLocationException ex) {
Engine.crashEngine();
}
}
If I add a document with 2k lines with this method, it takes ~1900 ms to highlight the whole document, while if I add the document to the highlight queue by using a caret listening method it takes ~500 ms.
Here's a part of the caret listening method that is used to highlight whole documents when they're loaded:
if (loadFile == true) {
isKey = false;
doc = edit[currentTab].Editor.getStyledDocument();
try {
Highlight.addDocument(doc, 0, doc.getLength(),
Scripts.getTitleAt(currentTab), doc.getText(0, doc.getLength()));
} catch (BadLocationException ex) {
ex.printStackTrace();
}
loadFile = false;
}
Note: the Highlight/Engine.addDocument() method has five parameters: (StyledDocument doc,int start, int end, String tabTitle, String docText). Start and end both indicate the region where highlighting is needed.
I would appreciate any help with this problem, because I've been trying to solve it for a few days and I can't find anything similar on the Internet. :(
Btw, does anyone know the actual difference between Document.setCharacterAttributes and Document.setParagraphAttributes? :P

Maybe you have some kind of recursion in your code that is causing the problem. With the DocumentEvent you should only worry about insertions and removals. You don't need to worry about changes, since those are attribute changes.
Maybe you add some text, which schedules the highlighting, but then when you change the attributes of the text you schedule another highlighting task.

You can try setting a flag that indicates whether a change comes from the user or from your own API calls. At the beginning of Engine.addDocument(), set the flag to the API state, and reset it once the changes are done.
In your listener, check the flag and skip changes that come from the API.
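A minimal sketch of that flag, under the assumption that the attribute change runs on the same thread as the listener (with a separate highlight thread the flag would need proper synchronization). The class name and its members are hypothetical and not part of the asker's Engine.

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Hypothetical listener that ignores events caused by its own styling calls.
final class GuardedHighlightListener implements DocumentListener {
    private boolean applyingStyles = false;
    int queuedChanges = 0; // stands in for the real highlight queue

    @Override public void insertUpdate(DocumentEvent e) { queue(); }
    @Override public void removeUpdate(DocumentEvent e) { queue(); }
    @Override public void changedUpdate(DocumentEvent e) { /* attribute change: nothing to queue */ }

    private void queue() {
        if (applyingStyles) {
            return; // the change came from our own API call, skip it
        }
        queuedChanges++;
    }

    /** Applies attributes with the flag set, so the listener skips resulting events. */
    void applyStyle(StyledDocument doc, int offset, int length, AttributeSet attrs) {
        applyingStyles = true;
        try {
            doc.setCharacterAttributes(offset, length, attrs, false);
        } finally {
            applyingStyles = false;
        }
    }

    /** Inserts text as a user would, styles part of it, and returns the queue size. */
    static int demo() {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        GuardedHighlightListener listener = new GuardedHighlightListener();
        doc.addDocumentListener(listener);
        try {
            doc.insertString(0, "it's a bold text piece", null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        SimpleAttributeSet bold = new SimpleAttributeSet();
        StyleConstants.setBold(bold, true);
        listener.applyStyle(doc, 7, 4, bold); // the word "bold"
        return listener.queuedChanges;
    }
}
```

The try/finally ensures the flag is reset even if the styling call throws.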
You wrote "I highlight the text by setting the character attributes of a portion of the Document, so the method is not inserting more text". I'm not sure it doesn't insert text. For example, say you have "it's a bold text piece", then you select "bold" and change its attributes to bold. The original element is split and three new elements appear. I didn't test it, but it might call insertUpdate() and removeUpdate().
"Does anyone know the actual difference between Document.setCharacterAttributes and Document.setParagraphAttributes?"
There are paragraph attributes and character attributes. Character attributes are font size, family, style, and colors. Paragraph attributes are alignment, indentation, and line spacing.
Paragraph elements are the parents of character elements.
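The difference can be seen on a small document. This is only an illustrative sketch: the class AttributeScopeDemo and the sample text are made up, and it uses a DefaultStyledDocument rather than the document of a JTextPane.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;

// Hypothetical demo: character attributes vs. paragraph attributes.
final class AttributeScopeDemo {
    static DefaultStyledDocument styledSample() {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        try {
            doc.insertString(0, "first line\nsecond line", null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }

        // Character attributes apply to exactly the given range: the word "first"
        SimpleAttributeSet bold = new SimpleAttributeSet();
        StyleConstants.setBold(bold, true);
        doc.setCharacterAttributes(0, 5, bold, false);

        // Paragraph attributes apply to every paragraph the range touches,
        // here only the first line
        SimpleAttributeSet centered = new SimpleAttributeSet();
        StyleConstants.setAlignment(centered, StyleConstants.ALIGN_CENTER);
        doc.setParagraphAttributes(0, 1, centered, false);
        return doc;
    }
}
```

Afterwards, doc.getCharacterElement(0) carries the bold attribute, doc.getParagraphElement(0) carries the alignment, and the character element's parent is that paragraph element.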
