JVM crashes after parsing images via Tesseract - java

I have a backend app that uses the Maven dependency tess4j 5.4.0.
The trained data is installed and referenced through an environment variable.
I have an endpoint that creates an entity from an uploaded image. It works correctly when I call it from Postman (image added -> text recognized and added to entity -> entity created -> I can create the next component), but when I invoke the same endpoint via the frontend, the JVM crashes after 4-5 minutes. If I turn this function off, I can create entities and the backend works correctly.
It also works when I run it locally; it only breaks (via the frontend) when I deploy it to the server.
Example of entity Service:
public String storeFileAndGetContentText(MultipartFile file) throws IOException {
    // Assumption: the original snippet left fileName unassigned;
    // the uploaded file's original name is used here so the code compiles
    String fileName = file.getOriginalFilename();
    Path targetLocation = this.fileStorageLocation.resolve(fileName);
    Files.copy(file.getInputStream(), targetLocation, StandardCopyOption.REPLACE_EXISTING);
    File targetFile = targetLocation.toFile();
    String contentText = new RecognitionService().parseImage(targetFile);
    return contentText;
}
Example of recognition service:
public class RecognitionService {
    // Single Tesseract instance shared by every RecognitionService object
    private static ITesseract tesseract;

    static {
        try {
            tesseract = new Tesseract();
        } catch (Exception e) {
            log.warn("Failure during Tesseract initialization", e);
        }
    }

    public String parseImage(File file) {
        try {
            // Collapse the recognized text into a single line
            return tesseract.doOCR(file).replaceAll("\n", " ").trim();
        } catch (TesseractException e) {
            log.warn("Tesseract can't read file {}", file.getName(), e);
            return "";
        }
    }
}
Logs:
java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?
at org.elasticsearch.server#8.4.2/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:285)
at org.elasticsearch.server#8.4.2/org.elasticsearch.node.Node.<init>(Node.java:456)
at org.elasticsearch.server#8.4.2/org.elasticsearch.node.Node.<init>(Node.java:311)
at org.elasticsearch.server#8.4.2/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:214)
at org.elasticsearch.server#8.4.2/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:214)
at org.elasticsearch.server#8.4.2/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data
at org.elasticsearch.server#8.4.2/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:230)
at org.elasticsearch.server#8.4.2/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:198)
at org.elasticsearch.server#8.4.2/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:277)
... 5 more
Caused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:825)
at org.apache.lucene.core#9.3.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)
at org.apache.lucene.core#9.3.0/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)
at org.apache.lucene.core#9.3.0/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)
at org.elasticsearch.server#8.4.2/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:223)
... 7 more
Suppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:218)
at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
at java.base/java.nio.file.Files.createFile(Files.java:658)
at org.apache.lucene.core#9.3.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)
Actually, I see that something is wrong with Elasticsearch, but I don't see how it's connected.
I installed tesseract-ocr on the target machine to make test recognition work.
I also tried installing imagemagick and ghostscript, because I thought tesseract-ocr might expect something from these libraries (found on the Internet).

Related

Unable to use apache-camel for sftp file transfer

Currently I am unable to grab -> archive -> decrypt a file from an SFTP server. The logic works when I test it with local directories, but not with SFTP.
The connection to the server appears to be established, since omitting the private key results in a connection exception. When the key is passed, the route itself throws no exception, but no files are copied. What would be a potential solution, or what next steps would help in troubleshooting this issue?
I am using the absolute directories in which the files are stored on the SFTP server.
CamelContext camelContext = new DefaultCamelContext();
// privateKey.getBytes() returns byte[], so bind it as byte[] rather than Byte[]
camelContext.getRegistry().bind("SFTPPrivateKey", byte[].class, privateKey.getBytes());
String sftpInput = buildURISFTP(input, inputOptions, connectionConfig);
String sfpOutput = buildURISFTP(output, outputOptions, connectionConfig);
String sfpArchive = buildURISFTP(archive, archiveOptions, connectionConfig);
camelContext.addRoutes(new RouteBuilder() {
    @Override
    public void configure() throws Exception {
        PGPDataFormat pgpDataFormat = new PGPDataFormat();
        pgpDataFormat.setKeyFileName(pPgpSecretKey);
        pgpDataFormat.setKeyUserid(pgpUserId);
        pgpDataFormat.setPassword(pgpPassword);
        pgpDataFormat.setArmored(true);
        // Archive the encrypted file, then decrypt it and write it to the output directory.
        // No semicolon after .to(sfpArchive), otherwise the chain ends there.
        from(sftpInput)
            .to(sfpArchive)
            // tested decryption locally with file-to-file routing
            .unmarshal(pgpDataFormat)
            .to(sfpOutput);
    }
});
camelContext.start();
Thread.sleep(timeout);
camelContext.stop();
public String buildURISFTP(String directory, String options, ConnectionConfig connectionConfig) {
    StringBuilder uri = new StringBuilder();
    uri.append("sftp://");
    uri.append(connectionConfig.getSftpHost());
    uri.append(":");
    uri.append(connectionConfig.getSftpPort());
    uri.append(directory);
    uri.append("?username=");
    uri.append(connectionConfig.getSftpUser());
    if (!StringUtils.isEmpty(connectionConfig.getSftpPassword())) {
        uri.append("&password=");
        uri.append(connectionConfig.getSftpPassword());
    }
    uri.append("&privateKey=#SFTPPrivateKey");
    if (!StringUtils.isEmpty(options)) {
        uri.append(options);
    }
    return uri.toString();
}
The issue was due to my lack of knowledge of the FTP component:
https://camel.apache.org/components/3.18.x/ftp-component.html
That page specifies that absolute paths are not supported. Unfortunately, I had not read it and only referenced the SFTP component page, where this is not specified:
https://camel.apache.org/components/3.18.x/sftp-component.html
The issue was resolved by backtracking directories with /../../ before giving the absolute path.
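The backtracking workaround can be sketched as a small helper. This is only an illustration: the class name SftpPathSketch, the method backtrackFromHome, and the example directories are hypothetical, and it assumes the SFTP session starts in the user's home directory, so one ".." per home-directory segment climbs back to the root.

```java
// Hypothetical helper, not part of Camel: builds the directory part of an SFTP URI.
class SftpPathSketch {

    // Prepends one "/.." per segment of the login (home) directory, so that a
    // path resolved relative to the home directory ends up at the absolute one.
    static String backtrackFromHome(String homeDir, String absoluteDir) {
        StringBuilder prefix = new StringBuilder();
        for (String segment : homeDir.split("/")) {
            if (!segment.isEmpty()) {
                prefix.append("/..");
            }
        }
        return prefix + absoluteDir;
    }

    public static void main(String[] args) {
        // Assumed example values: home is /home/user, files live in /data/in
        System.out.println(backtrackFromHome("/home/user", "/data/in")); // prints /../../data/in
    }
}
```

The result would be passed as the directory argument of buildURISFTP in place of the plain absolute path. How many levels are needed depends on where the server places the session after login.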

Upload file to Spring Boot Resource Folder, should not be that hard right

I checked all over the internet and still cannot find the correct answer. I want to upload a file to the resources folder from Spring, so that I can get the file from the Heroku server when I deploy it.
For example applicationname/herokuapp.com/image.jpg
The structure of my app:
I tried and ran into a few problems:
File not found exception
Illegal char <:> at index 2
The file path I get is in the target folder??
Can't find path
I just need to get the correct path to the resources folder but I can't get it.
My controller with the following method looks like this:
@PostMapping(value = "/sheetmusic")
public SheetMusic create(HttpServletRequest request,
        @RequestParam("file") MultipartFile file,
        @RequestParam("title") String title,
        @RequestParam("componist") String componist,
        @RequestParam("key") String key,
        @RequestParam("instrument") String instrument) throws IOException {
    URL s = ResourceUtils.getURL("classpath:static/");
    String path = s.getPath();
    fileService.uploadFile(file, path);
    SheetMusic sheetMusic = new SheetMusic(title, componist, key, instrument, file.getOriginalFilename());
    return sheetMusicRepository.save(sheetMusic);
}
The FileService:
public void uploadFile(MultipartFile file, String uploadDir) {
    try {
        Path copyLocation = Paths
                .get(uploadDir + File.separator + StringUtils.cleanPath(file.getOriginalFilename()));
        Files.copy(file.getInputStream(), copyLocation, StandardCopyOption.REPLACE_EXISTING);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I read something about jar but I don't understand it. I did not think it was this hard to just upload a file to a folder but I hope you guys can help me out!
EDIT :
When I add this :
String filePath = ResourceUtils.getFile("classpath:static").toString();
It will upload to the target folder which is not right.
EDIT 2 : IT IS FIXED
This is the right way to get the correct path :
String path = new File(".").getCanonicalPath() + "/src/main/webapp/WEB-INF/images/";
fileService.uploadFile(file,path);
My folder structure is the following:
main
-java
- webapp
- WEB-INF
- images
Then I had to put this code into my MainApplicationClass
@Override
public void addResourceHandlers(ResourceHandlerRegistry registry) {
    // Register resource handler for images
    registry.addResourceHandler("/images/**").addResourceLocations("/WEB-INF/images/")
            .setCacheControl(CacheControl.maxAge(2, TimeUnit.HOURS).cachePublic());
}
What you are trying to do here (replacing or uploading a file INTO a packaged .jar file) does not work; it is literally impossible.
You need to upload your file somewhere else, be that S3, some network drive, etc., so that your application can reference it.
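As a rough illustration of "somewhere else", the sketch below copies an upload into a directory outside the packaged application. The class name ExternalUploadSketch, the method store, and the UPLOAD_DIR environment variable are assumptions made for this example, not part of Spring or of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: stores uploads in a directory outside the packaged application.
class ExternalUploadSketch {

    // Copies the stream into uploadDir under the given file name and returns the target path.
    static Path store(InputStream content, String fileName, Path uploadDir) throws IOException {
        Files.createDirectories(uploadDir);
        // Keep only the last path element so a name like "../x.jpg" cannot escape uploadDir
        Path target = uploadDir.resolve(Paths.get(fileName).getFileName().toString());
        Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: UPLOAD_DIR is an environment variable you define yourself
        String configured = System.getenv("UPLOAD_DIR");
        Path uploadDir = configured != null ? Paths.get(configured) : Files.createTempDirectory("uploads");
        Path saved = store(new ByteArrayInputStream(new byte[] {1, 2, 3}), "image.jpg", uploadDir);
        System.out.println("Stored " + Files.size(saved) + " bytes at " + saved);
    }
}
```

In a controller, file.getInputStream() and file.getOriginalFilename() would supply the first two arguments. On hosts where the local disk does not persist across restarts, the same idea applies with a remote store such as S3, as the answer suggests.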

Can not read file when run within jar file

I have an Akka HTTP service. It simply returns the API documentation for a GET request. The documentation is in an HTML file.
It all works fine when run within the IDE. When I package it as a jar, I get the error 'resource not found'. I am not sure why it cannot read the HTML file when packaged in a jar but works fine in the IDE.
Here is the code for the route.
private Route topLevelRoute() {
return pathEndOrSingleSlash(() -> getFromResource("asciidoc/html/api.html"));
}
The files are located in the resources path.
I have got this working now.
This is what I am doing:
private Route topLevelRoute() {
    StringBuilder strBuild = new StringBuilder();
    // Read the resource as a stream; try-with-resources closes the readers
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
            getClass().getResourceAsStream("/asciidoc/html/api.html"), StandardCharsets.UTF_8))) {
        // Get the stream input into the string builder, keeping the line breaks
        reader.lines().forEach(s -> strBuild.append(s).append('\n'));
        // Pass the string builder as a string with the content type set to HTML
        return complete(HttpEntities.create(ContentTypes.TEXT_HTML_UTF8, strBuild.toString()));
    } catch (Exception ex) {
        // Catch any exception here and still return a route
        return complete(StatusCodes.INTERNAL_SERVER_ERROR, "Could not load api.html");
    }
}
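The working version reads the documentation as a stream obtained from the class rather than as a file on disk. A minimal standalone sketch of that stream-based read follows, with an explicit check for a missing resource; the class name ResourceTextSketch and its methods are hypothetical names chosen for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical sketch: reads text from a classpath resource via a stream.
class ResourceTextSketch {

    // Reads a whole stream as UTF-8 text, joining lines with '\n'.
    static String readAll(InputStream in) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads a classpath resource; getResourceAsStream returns null when it is missing.
    static String readResource(String absolutePath) {
        InputStream in = ResourceTextSketch.class.getResourceAsStream(absolutePath);
        if (in == null) {
            throw new IllegalArgumentException("Resource not found on classpath: " + absolutePath);
        }
        return readAll(in);
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for the real resource here
        byte[] sample = "<h1>API</h1>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(sample)));
    }
}
```

Checking for null separately makes a wrong resource path show up as a clear message instead of a NullPointerException inside the reader.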

Extracting Text From JPG

I've tried this code and added the needed jar files, but I'm still getting an error message like: Exception in thread "main" java.lang.UnsatisfiedLinkError: Unable to load library 'libtesseract302'.
Is there a complete tutorial on how to extract text, and what should be done to address the error? Any help is appreciated.
import net.sourceforge.tess4j.*;
import java.io.File;

public class ExtractTxtFromImg {
    public static void main(String[] args) {
        File imgFile = new File("C:\\Documents and Settings\\rueca\\Desktop\\sampleImg.jpg");
        Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
        // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
        try {
            String result = instance.doOCR(imgFile);
            System.out.println(result);
        } catch (Exception e) {
            System.err.println(e.getMessage());
        }
    }
}
In addition to adding the jars, you also need to add the native libraries. You can do so with -Djava.library.path="C:\[absolute path to dir containing *.dll files and such]"
Note that you need to provide the directory, not the file itself.
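To check what the JVM actually received for that option, a small diagnostic like the following may help. It is a sketch only: the class name LibraryPathSketch and the method entries are made up for this example, and it merely lists the directories in java.library.path without loading any library.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: lists the directories in java.library.path.
class LibraryPathSketch {

    // Splits a java.library.path value into its directory entries, skipping empty ones.
    static List<String> entries(String libraryPath) {
        List<String> result = new ArrayList<>();
        for (String entry : libraryPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Flags entries that are not directories, e.g. when a file was given by mistake
        for (String dir : entries(System.getProperty("java.library.path", ""))) {
            System.out.println(dir + (new File(dir).isDirectory() ? "" : "  (not a directory)"));
        }
    }
}
```

An entry marked "(not a directory)" would point to the mistake the note above warns about: passing the library file instead of the folder that contains it.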

Caused by: java.lang.ClassNotFoundException: org.springframework.util.StreamUtils in GAE

My project doesn't work on the deployed (global) GAE server, but it works correctly on the local GAE server.
Logs from the global server:
Caused by: java.lang.ClassNotFoundException: org.springframework.util.StreamUtils
I see this exception when I call the getLogin() method, but the setUserINN() method works correctly.
@RequestMapping(value="/getSalesInfo", method = RequestMethod.GET)
public @ResponseBody String getLogin(){
    MasterAccountInfo msi = dataMethods.getMasterAccountInfo();
    ObjectMapper mapper = new ObjectMapper();
    try {
        return mapper.writeValueAsString(msi);
    } catch (JsonProcessingException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
        return "failed transform";
    }
}
@RequestMapping(value="/setuserINN", method = RequestMethod.POST)
public String setUserINN(@RequestParam("INN") String INN){
    Principal pr = SecurityContextHolder.getContext().getAuthentication();
    String str = pr.getName();
    dataMethods.changeUserInfo(str, INN);
    return "redirect:/myaccount";
}
}
I have no idea about this problem. Please help.
StreamUtils was added in Spring 3.2.2. You'll need to upgrade to at least that version of Spring for it to work.
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/util/StreamUtils.html
I don't know how Ant compiled the WAR on my local machine, but
org.springframework.util.StreamUtils exists in spring-core-4.0.jar, while the version 3.1.1 I was using does not contain this class.
I added the v4.0 library to the WAR and everything works.
Upgrade your Spring version to at least 4.2.1.RELEASE, or to the latest version, for it to work.
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/util/StreamUtils.html
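When a class is found locally but not on the server, it can help to check at runtime whether it is loadable and where classes come from. The following is a minimal sketch using only the JDK; the class name ClasspathCheckSketch and its methods are hypothetical, and a restricted environment may not allow every call.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: checks class availability and origin at runtime.
class ClasspathCheckSketch {

    // Returns true when the named class can be loaded by this class's loader.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Describes where a class was loaded from, e.g. the location of its jar.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        String name = "org.springframework.util.StreamUtils";
        System.out.println(name + " present: " + isPresent(name));
        System.out.println("This class loaded from: " + locationOf(ClasspathCheckSketch.class));
    }
}
```

Running the same check locally and on the server, and calling locationOf on a Spring class that does load, would show which spring-core jar each environment actually uses.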
