How to read data from solr/data/index - java

How can I read data from solr/data/index with a simple console Java application? I found a solution,
but maybe there is a simpler way.
Please help with that; I really don't know what to do.

This is my own solution. I took the index files from Solr 4.4 and used the lucene-core-4.4.0.jar library. Maybe it can help someone.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Bits;
public class SomeClass {
    public static void main(String[] args) throws IOException {
        // Path to the index directory of the Solr core
        Directory dirIndex = FSDirectory.open(new File("solr/home/data/index"));
        IndexReader indexReader = DirectoryReader.open(dirIndex);
        try {
            // liveDocs is null when the index contains no deleted documents
            Bits liveDocs = MultiFields.getLiveDocs(indexReader);
            // Document ids run up to maxDoc(); numDocs() only counts live documents
            for (int i = 0; i < indexReader.maxDoc(); i++) {
                if (liveDocs != null && !liveDocs.get(i)) {
                    continue; // skip deleted documents
                }
                Document doc = indexReader.document(i);
                // Only stored fields can be read back this way
                System.out.println(doc);
            }
        } finally {
            indexReader.close();
            dirIndex.close();
        }
    }
}

There is a project called Luke which is a GUI for users to inspect Lucene indices.
Here is a blog post with more detail.

Related

Error occurred when importing the Html file in Jsoup

I am loading an HTML file following the Tutorialspoint example at https://www.tutorialspoint.com/jsoup/jsoup_load_file.htm
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
public class jsoupTester {
    public static void main(String[] args) throws IOException, URISyntaxException {
        URL path = ClassLoader.getSystemResource("test.htm");
        File input = new File(path.toURI());
        Document document = Jsoup.parse(input, "UTF-8", "");
        System.out.println(document.title());
    }
}
I get this error when I run the program:
Exception in thread "main" java.lang.NullPointerException
    at jsoupTester.main(jsoupTester.java:13)
Note: the jsoupTester.java file and temp.htm are in the same location.
May I know how to solve this issue? Your suggestions will be highly appreciated :)
Have you checked the website properly? The code in the tutorial shows this:
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
public class JsoupTester {
    public static void main(String[] args) throws IOException, URISyntaxException {
        URL path = ClassLoader.getSystemResource("test.htm");
        File input = new File(path.toURI());
        Document document = Jsoup.parse(input, "UTF-8"); // Only 2 parameters
        System.out.println(document.title());
    }
}
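Difference
Document document = Jsoup.parse(input, "UTF-8", ""); // 3rd parameter (base URI) is not used in the tutorial
As you can see, the only difference is the extra third argument. jsoup accepts both the two-argument and the three-argument form of Jsoup.parse for a File, so the "" is unlikely to cause a NullPointerException on its own. The more probable cause is that ClassLoader.getSystemResource("test.htm") returned null because test.htm was not found on the classpath, so path.toURI() fails. Check that the file name matches and that the file is on the classpath. Hope that answers your question :)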
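One thing worth checking in code like this: ClassLoader.getSystemResource returns null when the named resource is not on the classpath, and calling toURI() on that null then throws a NullPointerException. Below is a minimal sketch, using only the standard library, that turns the missing resource into a readable error. ResourceFiles and requireResource are hypothetical names made up for this illustration and are not part of jsoup.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.net.URISyntaxException;
import java.net.URL;

final class ResourceFiles {
    private ResourceFiles() {}

    /** Resolves a classpath resource to a File, or fails with a clear message if it is missing. */
    static File requireResource(String name) throws FileNotFoundException, URISyntaxException {
        URL url = ClassLoader.getSystemResource(name);
        if (url == null) {
            // getSystemResource signals "not found" with null, not with an exception
            throw new FileNotFoundException("Resource not found on classpath: " + name);
        }
        return new File(url.toURI());
    }

    public static void main(String[] args) throws Exception {
        try {
            File input = requireResource("test.htm");
            System.out.println("Found " + input.getAbsolutePath());
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The File returned by requireResource can then be passed to Jsoup.parse as in the code above. Note that a file sitting next to the .java source is only found if that directory is actually on the runtime classpath.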

Difficulty Configuring JSON Authentication for Box from Java

I'm working on a simple automated Java program that uses the Box API, and I am trying to configure it through a JSON file. I borrowed the first part of the checkstyle sample code from the GitHub repo's SearchExamplesAsAppUser example, figuring it should work.
When I run it, I get this error:
java.lang.NoClassDefFoundError: org/bouncycastle/operator/OperatorCreationException
The problem seems to be stemming from the statement:
api = BoxDeveloperEditionAPIConnection.getAppUserConnection(USER_ID, boxConfig, accessTokenCache);
The JARs I am using are (aside from the commons ones, all recommended by Box):
bcpkix-jdk15on-1.52.jar
bcprov-jdk15on-1.52.jar
box-java-sdk-2.14.1.jar
jose4j-0.4.4.jar
minimal-json-0.9.1.jar
commons-codec-1.9.jar
commons-httpclient-3.1.jar
commons-logging-1.2.jar
I am using NetBeans, so all of the JARs above are listed under the libraries to use for compilation.
The code is as follows:
package boxapitest;
import com.box.sdk.BoxAPIConnection;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import com.box.sdk.BoxConfig;
import com.box.sdk.BoxDeveloperEditionAPIConnection;
import com.box.sdk.BoxItem;
import com.box.sdk.BoxMetadataFilter;
import com.box.sdk.BoxSearch;
import com.box.sdk.BoxSearchParameters;
import com.box.sdk.BoxUser;
import com.box.sdk.DateRange;
import com.box.sdk.IAccessTokenCache;
import com.box.sdk.InMemoryLRUAccessTokenCache;
import com.box.sdk.PartialCollection;
import com.box.sdk.SizeRange;
public final class BoxAPITest {
    private static final String USER_ID = "***email address removed for privacy***";
    private static final int MAX_DEPTH = 1;
    private static final int MAX_CACHE_ENTRIES = 100;
    private static BoxDeveloperEditionAPIConnection api;
    /**
     * @param args the command line arguments
     * @throws java.io.IOException
     */
    public static void main(String[] args) throws IOException {
        // Turn off logging to prevent polluting the output.
        Logger.getLogger("com.box.sdk").setLevel(Level.SEVERE);
        //It is a best practice to use an access token cache to prevent unneeded requests to Box for access tokens.
        //For production applications it is recommended to use a distributed cache like Memcached or Redis, and to
        //implement IAccessTokenCache to store and retrieve access tokens appropriately for your environment.
        IAccessTokenCache accessTokenCache = new InMemoryLRUAccessTokenCache(MAX_CACHE_ENTRIES);
        Reader reader;
        reader = new FileReader("\\My Path\\file.json");
        BoxConfig boxConfig = BoxConfig.readFrom(reader);
        api = BoxDeveloperEditionAPIConnection.getAppUserConnection(USER_ID, boxConfig, accessTokenCache);
        //api = BoxAPIConnection.getAppUserConnection(USER_ID, boxConfig, accessTokenCache);
        BoxUser.Info userInfo = BoxUser.getCurrentUser(api).getInfo();
        System.out.format("Welcome, %s!\n\n", userInfo.getName());
    }
}
Any assistance would be most appreciated.
Bentaye actually provided the answer. One of my jars was corrupt.
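Since the cause here turned out to be a corrupt jar, a quick check along these lines can show whether a jar opens cleanly and actually lists the class named in the NoClassDefFoundError. This is only a sketch using java.util.jar from the standard library; JarCheck and containsClass are hypothetical names, not part of the Box SDK. It reads the archive index only, so damage inside an entry's data could still go unnoticed.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

final class JarCheck {
    private JarCheck() {}

    /** Returns true if the jar can be opened and lists an entry for the given class name. */
    static boolean containsClass(File jar, String className) {
        // org.example.Foo is stored in the archive as org/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(entryName) != null;
        } catch (IOException e) {
            // Missing, unreadable, or corrupt archive
            return false;
        }
    }

    public static void main(String[] args) {
        String wanted = "org.bouncycastle.operator.OperatorCreationException";
        // Pass the paths of the jars to inspect as command-line arguments
        for (String path : args) {
            System.out.println(path + " -> " + containsClass(new File(path), wanted));
        }
    }
}
```

Running it against each jar on the library list should print true for exactly the jar expected to provide the class; if every jar prints false, the jar that should contain it is either missing or damaged.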

Convert docx file to pdf in java..issue

I am developing a project that needs a docx file to be converted to PDF. I found the same question already posted and used the code provided by "Kishan C S". It uses docx4j 2.8.1.
The code works and the PDF is generated; the only problem is that the docx file contains logo.jpg (an image in the header), and images are not converted. Only the text ends up in the PDF.
I am posting the code I used. Please let me know how I can solve the problem.
P.S.: the question I referred to is "Convert docx file into PDF with Java".
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Collections;
import java.util.List;
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
public class DocxConverter {
    public static void main(String[] args) throws FileNotFoundException, Docx4JException, Exception {
        InputStream is = new FileInputStream(new File("D:\\Test\\C_IN0004_AppointmentLetter.docx"));
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);
        List sections = wordMLPackage.getDocumentModel().getSections();
        for (int i = 0; i < sections.size(); i++) {
            wordMLPackage.getDocumentModel().getSections().get(i).getPageDimensions();
        }
        Mapper fontMapper = new IdentityPlusMapper();
        PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");//set your desired font
        fontMapper.getFontMappings().put("Algerian", font);
        wordMLPackage.setFontMapper(fontMapper);
        PdfSettings pdfSettings = new PdfSettings();
        org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
        //To turn off logger
        List<Logger> loggers = Collections.<Logger> list(LogManager.getCurrentLoggers());
        loggers.add(LogManager.getRootLogger());
        for (Logger logger : loggers) {
            logger.setLevel(Level.OFF);
        }
        OutputStream out = new FileOutputStream(new File("D:\\Test\\C_IN0004_AppointmentLetter.pdf"));
        conversion.output(out, pdfSettings);
        System.out.println("DONE!!");
    }
}

Java + Json - ClassNotFoundDefError [duplicate]

This question already has an answer here:
How to solve "NoClassDefFoundError"?
(1 answer)
Closed 6 years ago.
I'm trying to get a program that I coded to run properly. So far it compiles with javac and starts with java, but then I get a NoClassDefFoundError.
This screenshot shows how I compiled and ran the program, and the command prompt output.
As you can see, I have 3 source files, and therefore 3 classes. PeriodicTable doesn't do anything related to the issue.
Inside the Table class I have...
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.json.JsonArray;
import javax.json.JsonObject;
import java.awt.GridBagLayout;
import java.awt.GridBagConstraints;
import java.awt.Color;
import java.awt.Toolkit;
import java.awt.Dimension;
import java.io.IOException;
class Table {
    //Predefining some global variables
    DataBaseReader dbReader;
    //some methods...
    protected void showLayout() {
        dbReader = new DataBaseReader();
        //A few lines of code
        try {
            JsonArray elements = dbReader.readDataBase(); //Here it enters the DataBaseReader class through dbReader
            //Some more code
        } catch(IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
Here is my DataBaseReader class:
import javax.json.Json;
import javax.json.JsonReader;
import javax.json.JsonObject;
import javax.json.JsonArray;
import java.io.FileReader;
import java.io.IOException;
public class DataBaseReader
{
    public JsonArray readDataBase() throws IOException {
        System.out.println("Check!"); //This check is reached
        JsonReader reader = Json.createReader(new FileReader("C:/projects/PeriodicTable/Elements.JSON"));
        System.out.println("Check!"); //This check is not reached
        JsonObject jsonst = reader.readObject();
        reader.close();
        return jsonst.getJsonArray("Elements");
    }
}
What versions, programs, etc am I using?
Java 8
Command Prompt
Notepad
javax.json-1.0.jar
To state my question clearly: any ideas or explanations about what is causing this error?
I think you are missing the javax.json-api-1.0.jar file.
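A NoClassDefFoundError generally means the class was available when the code was compiled but cannot be loaded at run time, so it helps to check what the running JVM can actually see. Here is a minimal sketch using only the standard library; ClasspathCheck and isLoadable are hypothetical names for this illustration, and javax.json.Json is used as the default class to probe only because the code above imports it.

```java
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show the classpath the JVM was actually started with
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        String[] names = args.length > 0 ? args : new String[] {"javax.json.Json"};
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

If the class is reported as not loadable, the jar has to be put on the runtime classpath as well as the compile classpath, for example with java -cp ".;javax.json-1.0.jar" ClasspathCheck on Windows (the separator is : on Linux and macOS).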

unable to train location.bin using opennlp with java

I am trying to train an en-ner-location.bin model using OpenNLP in Java. My training text file is in the following format:
<START:location> Fontana <END>
<START:location> Palo Verde <END>
<START:location> Picacho <END>
and I trained the model using the following code:
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.Collections;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.Tokenizer;
import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.Span;
public class TrainNames {
    @SuppressWarnings("deprecation")
    public void TrainNames() throws IOException{
        File fileTrainer=new File("citytrain.txt");
        File output=new File("en-ner-location.bin");
        ObjectStream<String> lineStream = new PlainTextByLineStream(new FileInputStream(fileTrainer), "UTF-8");
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        System.out.println("lineStream = " + lineStream);
        TokenNameFinderModel model = NameFinderME.train("en", "location", sampleStream, Collections.<String, Object>emptyMap(), 1, 0);
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(output));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}
I get no errors or warnings, but when I try to extract a city name from a string like cnt="John is planning to specialize in Electrical Engineering in UC Fontana and pursue a career with IBM."; it returns the whole string.
Could anybody tell me why?
Welcome to SO! Looks like you need more context around each location annotation. I believe that right now OpenNLP thinks you are training it to find words (any word), because each line of your training data contains nothing but the location itself. You need to annotate locations within whole sentences, and you will need at least a few hundred samples to start seeing good results.
See this answer as well:
How I train an Named Entity Recognizer identifier in OpenNLP?
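To illustrate what "locations within whole sentences" could look like, here is a small sketch that wraps a known location in the <START:location> ... <END> tags inside an already whitespace-tokenized sentence. It uses only the standard library; LocationAnnotator and annotate are hypothetical helper names, not OpenNLP API, and the sample sentence is made up. Real training data would still need a few hundred such sentences.

```java
final class LocationAnnotator {
    private LocationAnnotator() {}

    /** Wraps every token-level occurrence of the location in <START:location> ... <END>. */
    static String annotate(String tokenizedSentence, String location) {
        String[] tokens = tokenizedSentence.trim().split("\\s+");
        String[] name = location.trim().split("\\s+");
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < tokens.length) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (matchesAt(tokens, i, name)) {
                out.append("<START:location> ").append(String.join(" ", name)).append(" <END>");
                i += name.length; // jump past the whole multi-token name
            } else {
                out.append(tokens[i]);
                i++;
            }
        }
        return out.toString();
    }

    /** True if the name tokens occur in the sentence starting at position start. */
    private static boolean matchesAt(String[] tokens, int start, String[] name) {
        if (start + name.length > tokens.length) {
            return false;
        }
        for (int j = 0; j < name.length; j++) {
            if (!tokens[start + j].equals(name[j])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One training line: a full sentence with the location marked in context
        System.out.println(annotate("John studies at UC Fontana and works in Palo Verde .", "Palo Verde"));
    }
}
```

Each annotated sentence goes on its own line of the training file, so the model sees the words surrounding a location as well as the location itself.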
