JESS Userfunction writes "BS" instead of "/home" to a file - java

I'm using JESS for my expert system implementation and I have a userfunction. It writes some strings to a text file.
public Value call(ValueVector vv, Context context) throws JessException {
Rete engine = context.getEngine();
int size = vv.size();
for(i = 0; i < size-1; i++)
params[i] = vv.get(i+1).stringValue(context);
engine.eval("(printout file " + params[2] + ")");
return new Value(params[1], RU.STRING);
}
params[2] has /home/username/folder as content. When it prints out to a file I get the following in the file. BS has black background btw.
BSusername/folder
I'm not sure what's going on here. Any ideas?
In addition, I've never had this problem when I print out from JESS code.

The unquoted text /home/ is being parsed as a regular expression; the printed value is somewhat unpredictable. You need to include double quotes in your built-up command so the path is seen as a quoted string.
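A minimal sketch of that fix, in plain Java so it runs without Jess on the classpath. The class and method names (JessQuoting, quote, printoutCommand) are hypothetical helpers, not part of the Jess API; only the (printout file ...) command comes from the question. Escaping embedded backslashes as well as double quotes is an assumption on my part, so check it against your Jess version if your paths can contain either.

```java
// Hypothetical helper, not part of Jess: wraps a Java string in double quotes
// so that Jess reads it as one string literal instead of tokenizing it.
final class JessQuoting {

    // Escapes backslashes and double quotes (assumed escaping rules),
    // then surrounds the text with double quotes.
    static String quote(String raw) {
        String escaped = raw.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Builds the command text that the question passes to engine.eval(...).
    static String printoutCommand(String router, String text) {
        return "(printout " + router + " " + quote(text) + ")";
    }

    public static void main(String[] args) {
        // Prints: (printout file "/home/username/folder")
        System.out.println(printoutCommand("file", "/home/username/folder"));
    }
}
```

In the userfunction from the question, the call would then look like engine.eval(JessQuoting.printoutCommand("file", params[2])).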

Related

Generating custom text files in java

public class ScriptCreator {
public static void main(String[] args) throws IOException {
// Choose the CSV file that I am importing the data from
String fName = "C:\\Users\\MyUser\\Downloads\\CurrentApplications (1).csv";
String thisLine;
int count = 0;
FileInputStream fis = new FileInputStream(fName);
DataInputStream myInput = new DataInputStream(fis);
int i = 0;
// Prints the list of names in the CSV file
while((thisLine = myInput.readLine()) != null){
String strar[] = thisLine.split(",");
Printer(strar[0]);
}
}
public static void Printer(String arg) throws IOException{
// Want to pull from the String strar[0] from above
// Says that it cannot be resolved to a variable
String name = arg;
String direc = "C:/Users/MyUser/Documents/";
String path = "C:/Users/MyUser/Documents";
Iterable<String> lines = Arrays.asList("LOGIN -acceptssl ServerName","N " + name + " " + direc ,"cd " + name,"import " + path + "*.ppf" + " true","scan", "publishassessase -aseapplication " + name,"removeassess *","del " + name );
Path file = Paths.get(name + ".txt");
Files.write(file, lines, Charset.forName("UTF-8"));
}
}
Hello everyone and thank you in advance for any help that you may be able to give me. I am trying to create a java program that will pull names from a CSV file and take those names to generate custom outputs for text files. I am having a hard time being able to set a variable that I can use to grab the names that are being printed and using them to generate a text file by setting the name variable.
I am also going to need some help in making sure that it creates the amount of scripts for the amount of names in the CSV file. Ex. 7 names in CSV makes 7 custom .txt files, each with its appropriate name.
Any help is greatly appreciated!
Edit: I have updated my code to match the correction that was needed to make the code work.
It looks like you have some scoping issues. Whenever you declare a variable, it only exists within the boundaries of its closest set of braces. By declaring strar in your main method, the only place you can explicitly use it is within your main method. Your Printer() method doesn't have any previous mention of strar, and the only way it can know about it is by passing it as an argument to the function.
i.e.
Printer(String[] args)
Or, better yet:
Printer(String arg)
and then call it in your while loop with
Printer(strar[0]);
Also, your Printer method begins with a "for each" loop called on strar[0], which is not a valid target for a foreach loop anyway, because if I recall correctly, String isn't an Iterable object. If you implemented the Printer function in the way I recommended, you won't need a for each loop anyway, as there will only be one name passed at a time.
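Here is a rough sketch of how the pieces fit together once the name is passed as an argument: one pass over the CSV lines, one output file per name. The class name ScriptPerName, the helper names and the three script lines are made up for illustration; substitute your own script text and paths.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; names and script contents are placeholders.
final class ScriptPerName {

    // Returns the first comma-separated field of every line that has one.
    static List<String> firstColumn(List<String> csvLines) {
        List<String> names = new ArrayList<>();
        for (String line : csvLines) {
            String[] fields = line.split(",");
            if (fields.length > 0 && !fields[0].trim().isEmpty()) {
                names.add(fields[0].trim());
            }
        }
        return names;
    }

    // Writes <name>.txt into outputDir. The name is a parameter, so this
    // method never needs to see the array that was declared in the caller.
    static Path writeScript(Path outputDir, String name) {
        List<String> lines = Arrays.asList("cd " + name, "scan", "del " + name);
        try {
            return Files.write(outputDir.resolve(name + ".txt"), lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One file per name: 7 names in the CSV give 7 .txt files.
    static List<Path> writeAll(Path outputDir, List<String> csvLines) {
        List<Path> written = new ArrayList<>();
        for (String name : firstColumn(csvLines)) {
            written.add(writeScript(outputDir, name));
        }
        return written;
    }
}
```

You would call it with something like ScriptPerName.writeAll(Paths.get("C:/Users/MyUser/Documents"), Files.readAllLines(csvPath, StandardCharsets.UTF_8)), where csvPath points at your CSV file.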

Gujarati text in Java String

I have a Gujarati Bible and am trying to insert each verse into a MySQL database using a parser written in Java. When I assign Gujarati text to a Java String variable, it shows junk in the debugger.
E.g. This is my Gujarati text
હે યહોવા તું મારો દેવ છે;
I assign it to Java String variable as shown below
verse._verseText = "હે યહોવા તું મારો દેવ છે;";
What I see in the debug window is all junk characters. Any help is appreciated. If you need more information, let me know and I will provide it.
UPDATE
Pasting my parser code here
private Boolean Insert(String _text)
{
BibleVerse verse = new BibleVerse();
String[] data = _text.split("\\|");
try
{
if (data[0].equals(bookName) || bookName.equals("All"))
{
verse._Version = "Gujarati";
verse._book = data[0];
verse._chapter = Integer.parseInt(data[1]);
verse._verse = Integer.parseInt(data[2]);
verse._verseText = new String(data[3].getBytes(), "UTF-8");
_bibleDatabase.Insert(verse);
pcs.firePropertyChange("logupdate", null, data[0] + " " + data[1] + "," + data[2] + " - INSERTED.");
}
else
{
pcs.firePropertyChange("logupdate", null, data[0] + " " + data[1] + "," + data[2] + " - SKIPPED.");
}
return true;
}
catch(Exception e)
{
pcs.firePropertyChange("logupdate", null, "ERROR : " + e.getMessage());
return false;
}
}
Here is the sample line from the text file
Isaiah|25|1|હે યહોવા તું મારો દેવ છે; હું તને મોટો માનીશ, હું તારા નામની સ્તુતિ કરીશ; કેમકે તેં અદભુત કાર્યો કર્યાં છે, તેં વિશ્વાસુપણે તથા સત્યતાથી પુરાતન સંકલ્પો પાર પાડ્યા છે.
UPDATE
Here is the code where I open & read file.
try
{
FileReader _file = new FileReader(this._filename);
_bufferedReader = new BufferedReader(_file);
SwingWorker parseWorker = new SwingWorker()
{
@Override
protected Object doInBackground() throws Exception
{
String line;
String[] data;
int lineno=0;
BibleVerse verse = new BibleVerse();
while ((line = _bufferedReader.readLine()) != null)
{
++lineno;
pcs.firePropertyChange("pgbupdate", null, lineno);
Insert(line);
}
_bufferedReader.close();
return null;
}
@Override
protected void done()
{
pcs.firePropertyChange("logupdate", null, "Parsing complete.");
}
};
parseWorker.execute();
}
catch (Exception e)
{
pcs.firePropertyChange("logupdate", null, "ERROR : " + e.getMessage());
}
The problem is this:
FileReader _file = new FileReader(this._filename);
This reads the file using the platform's default charset. If your data file is not encoded in that charset, you will get incorrect characters.
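On Windows, the default charset is almost always a legacy code page such as windows-1252. On most other systems, it's UTF-8. (Since Java 18 the default is UTF-8 on every platform, but it is still safer to name the charset explicitly.)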
The easiest solution is to find out the actual encoding of your data file, so you can specify it explicitly in the code. The encoding of a file can be determined with the file command on Unix and Linux systems. In Windows, you may need to examine it with a binary editor, or install something like Cygwin, which has a file command of its own.
Once you know what it is, you should pass it explicitly to the construction of your Reader:
// Replace "UTF-8" with the actual encoding of your data file (if it's not UTF-8).
Reader _file = new InputStreamReader(new FileInputStream(this._filename), "UTF-8");
Once you've done that, there is no reason for any other part of your code to concern itself with bytes. You should replace this:
verse._verseText = new String(data[3].getBytes(), "UTF-8");
with this:
verse._verseText = data[3];
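To make that concrete, here is a small sketch of reading the pipe-separated lines with an explicit charset. The class name VerseFileReader and the method readVerseTexts are invented for the example; the field layout (book|chapter|verse|text) is taken from the sample line in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; only the '|' layout comes from the question.
final class VerseFileReader {

    // Decodes the stream with the given charset and returns the fourth
    // '|'-separated field (the verse text) of every line that has one.
    static List<String> readVerseTexts(InputStream in, Charset charset) {
        List<String> texts = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] data = line.split("\\|");
                if (data.length >= 4) {
                    texts.add(data[3]); // already a correct String, no getBytes() needed
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return texts;
    }
}
```

For a real file you would pass new FileInputStream(this._filename) and StandardCharsets.UTF_8 (or whatever the file's actual encoding turns out to be). Decoding the same bytes with the wrong charset is exactly what produces the junk characters.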
See also "How to inject Chinese characters using JavaScript?". It is not quite the same problem, but I think the same solution may work in this case:
If the script is inline (in the HTML file), then it's using the encoding of the HTML file and you won't have an issue.
If the script is loaded from another file:
1) Your text editor must save the file in an appropriate encoding such as UTF-8 (it's probably doing this already if you're able to save it, close it, and reopen it with the characters still displaying correctly).
2) Your web server must serve the file with the right HTTP header specifying that it's UTF-8 (or whatever the encoding happens to be, as determined by your text editor settings). Here's an example of how to do this with PHP: "Set HTTP header to UTF-8 PHP".
3) If you can't have your web server do this, try to set the charset attribute on your script tag. I tried to see what the spec said should happen in the case of mismatching charsets defined by the tag and the HTTP headers, but couldn't find anything concrete, so just test and see if it helps.
4) If that doesn't work, place your script inline.
It looks like if you want to store Gujarati text in a Java string, you can also write it with Unicode escapes. See this: http://jrgraphix.net/r/Unicode/0A80-0AFF
So for example, the Gujarati letter A (U+0A85):
char example = '\u0A85';
String result = Character.toString(example);

Java - PDFBox - ReplaceString - Issues with parsed tokens (possibly encoding?)

I've been struggling with an issue related to PDFBox and PDF editing. I have been assigned the task to edit a couple of strings given a PDF file, and to output a mirrored version of the files with the edited strings into it. I've been told that the problem has been solved in the past using this tool, so I have been told to do the same. The function I am using is this :
public void doIt( String inputFile, String outputFile, String strToFind, String message)
throws IOException, COSVisitorException
{
// the document
PDDocument doc = null;
try
{
doc = PDDocument.load( inputFile );
List pages = doc.getDocumentCatalog().getAllPages();
for( int i=0; i<pages.size(); i++ )
{
PDPage page = (PDPage)pages.get( i );
PDStream contents = page.getContents();
PDFStreamParser parser = new PDFStreamParser(contents.getStream() );
parser.parse();
List tokens = parser.getTokens();
for( int j=0; j<tokens.size(); j++ )
{
Object next = tokens.get( j );
if( next instanceof PDFOperator )
{
PDFOperator op = (PDFOperator)next;
//Tj and TJ are the two operators that display
//strings in a PDF
if( op.getOperation().equals( "Tj" ) )
{
//Tj takes one operator and that is the string
//to display so lets update that operator
COSString previous = (COSString)tokens.get( j-1 );
String string = previous.getString();
string = string.replaceFirst( strToFind, message );
previous.reset();
previous.append( string.getBytes("ISO-8859-1") );
}
else if( op.getOperation().equals( "TJ" ) )
{
COSArray previous = (COSArray)tokens.get( j-1 );
for( int k=0; k<previous.size(); k++ )
{
Object arrElement = previous.getObject( k );
if( arrElement instanceof COSString )
{
COSString cosString = (COSString)arrElement;
String string = cosString.getString();
string = string.replaceFirst( strToFind, message );
cosString.reset();
cosString.append( string.getBytes("ISO-8859-1") );
}
}
}
}
}
//now that the tokens are updated we will replace the
//page content stream.
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream();
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens( tokens );
page.setContents( updatedStream );
}
doc.save( outputFile );
}
finally
{
if( doc != null )
{
doc.close();
}
}
}
Which is the code that is being used in a file contained into the PDFBox examples (https://svn.apache.org/repos/asf/pdfbox/tags/1.5.0/pdfbox/src/main/java/org/apache/pdfbox/examples/pdmodel/ReplaceString.java).
The file I have been given, however, is not being modified at all by this function. Nothing happens at all. Upon further inspection, I decided to analyze the sequence of tokens produced by the parser. The file is being parsed correctly in everything other than the COSString elements, which contain gibberish characters that look like they have been wrongly encoded (a bunch of random symbols and numbers). I tried parsing other documents, and the function works with some of them, but not on everything I passed as input (a LaTeX output file was modified correctly and had correctly encoded COSStrings, whereas other automatically generated PDFs produced no results, with gibberish COSString content). I am also fairly sure the rest of the structure is being read correctly, since I rebuild the output in a different file, and the output file looks exactly the same as the input, which seems to mean that the file structure is being analyzed correctly. The file contains Identity-H encoded fonts.
I tried parsing the very same file using the PDFTextStripper (which extracts text from PDFs), and the parsing output from there returns the correct text output, using this:
PDFTextStripper pdfStripper = new PDFTextStripper("UTF-8");
String result = pdfStripper.getText(doc);
System.out.println(result);
Could it be an encoding issue? Can I tell the PDFStreamParser (or whoever holds the responsibility) to force an encoding on read? Is it even an encoding issue, since the text extraction is working correctly?
Thanks in advance for the help.
Some files use font subsets. Let's say that the subset uses only the characters E, G, L, and O. So GOOGLE would appear in the file as hex byte values 2, 4, 4, 2, 3, and 1.
Now if you want to change GOOGLE into APPLE you'll have three problems:
1) your subset doesn't contain the characters A and P
2) the size will be different
3) It is quite possible that the string you're searching for is split into several parts.
Btw the current version is 1.8.10. The ReplaceString utility has been removed in the upcoming 2.0 version to avoid giving the illusion that characters can easily be replaced.
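The subset example above can be played through with a toy model. This is not PDFBox code and does not imitate its API; FontSubsetDemo and its methods are invented purely to show why the bytes in the content stream are not readable text and why a replacement string may have no codes available at all.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a subsetted font; it does not use or imitate the PDFBox API.
final class FontSubsetDemo {

    // Assigns codes 1, 2, 3, ... to the glyphs in the order given.
    static Map<Character, Integer> buildSubset(String glyphs) {
        Map<Character, Integer> codes = new LinkedHashMap<>();
        for (char c : glyphs.toCharArray()) {
            codes.putIfAbsent(c, codes.size() + 1);
        }
        return codes;
    }

    // Encodes text as subset codes; fails if a character has no glyph.
    static List<Integer> encode(Map<Character, Integer> subset, String text) {
        List<Integer> out = new ArrayList<>();
        for (char c : text.toCharArray()) {
            Integer code = subset.get(c);
            if (code == null) {
                throw new IllegalArgumentException("No glyph for '" + c + "' in subset");
            }
            out.add(code);
        }
        return out;
    }

    // Lists the distinct characters of text that the subset cannot represent.
    static List<Character> missing(Map<Character, Integer> subset, String text) {
        List<Character> result = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!subset.containsKey(c) && !result.contains(c)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Character, Integer> subset = buildSubset("EGLO");
        System.out.println(encode(subset, "GOOGLE")); // [2, 4, 4, 2, 3, 1]
        System.out.println(missing(subset, "APPLE")); // [A, P]
    }
}
```

A plain string search for "GOOGLE" never matches the code sequence 2, 4, 4, 2, 3, 1, and even if you found it, there is no code to write for A or P.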
This answer is somewhat speculative, because you haven't linked to a PDF.
Inside a PDF, text can be stored in two places:
Content stream
XObject inside the page resources
Inside the content stream, text is mostly associated with the TJ or Tj operator. But the operands of Tj or TJ are not always readable ASCII; they may be arbitrary byte values. We can extract text from these byte values by mapping character codes to Unicode values using the font's encoding and mapping. For extraction we have that mapping, but we do not have a reverse mapping that tells us which character code a given glyph belongs to. So basically we should replace the character codes of the string to be replaced with the character codes of the new string.
Example:
1. (Text) Tj
2. (12 45 5 3)Tj
Also, we should replace the string in the content stream as well as in the XObject (if present) inside the resources.
So I think this might be helpful.
GoodLuck!

Parsing A Text File, So That Every Line Is Stored As An Array Value

Basically, I want to parse a text file line by line, so that every line is in its own array value.
E.g.
Hi There,
My Name's Aiden,
Not Really.
Array[0] = "Hi There"
Array[1] = "My Name's Aiden"
Array[2] = "Not Really"
But all the examples I have read already just confuse me and lead me to get frustrated. Maybe it's the way I approach it.
I don't know how to go about it, a point in the right direction would be most satisfying.
My suggestion is to use List<String> instead of String[] as arrays have fixed size, and that size is unknown before reading. Afterward one could make an array out of it, but to no real purpose.
For reading one has to know the encoding of the file.
Path path = Paths.get("C:/Users/Me/list.txt");
//Charset encoding = StandardCharsets.UTF_8;
Charset encoding = Charset.defaultCharset();
List<String> lines = Files.readAllLines(path, encoding);
for (String line : lines) {
...
}
for (int i = 0; i < lines.size(); ++i) {
String line = lines.get(i);
lines.set(i, "-- " + line);
}
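If you really do need a String[] at the end, you can convert once the size is known. A small sketch, with made-up names (LinesToArray, readLines, toArray); it takes a Reader so the caller decides the file and encoding.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class for illustration.
final class LinesToArray {

    // Reads every line from the reader into a list, which grows as needed.
    static List<String> readLines(Reader reader) {
        return new BufferedReader(reader).lines().collect(Collectors.toList());
    }

    // Converts the list to an array once the number of lines is known.
    static String[] toArray(List<String> lines) {
        return lines.toArray(new String[0]);
    }
}
```

For a file, open the reader with Files.newBufferedReader(path, encoding) inside a try-with-resources block and pass it to readLines.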

Print all Unicode characters within a specific range

I can't find the right API for this. I tried this:
public static void main(String[] args) {
for (int i = 2309; i < 3000; i++) {
String hex = Integer.toHexString(i);
System.out.println(hex + " = " + (char) i);
}
}
This code only prints like this in Eclipse IDE.
905 = ?
906 = ?
907 = ?
...
How can I make use of these decimal and hex values to get the Unicode characters?
It prints like that because all consoles use a mono spaced font. Try that on a JLabel in a frame and it should display fine.
EDIT:
Try creating a unicode printstream
PrintStream out = new PrintStream(System.out, true, "UTF-8");
And then print to it.
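Put together, that could look like the sketch below. UnicodeRangePrinter, printRange and utf8 are names I made up; the range 2309 to 3000 is the one from the question. Whether the characters actually show up still depends on the console's own encoding and font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical example class for illustration.
final class UnicodeRangePrinter {

    // Prints "hex = character" for every defined code point in [from, to).
    static void printRange(PrintStream out, int from, int to) {
        for (int cp = from; cp < to; cp++) {
            if (Character.isDefined(cp)) {
                out.println(Integer.toHexString(cp) + " = " + new String(Character.toChars(cp)));
            }
        }
    }

    // Wraps a stream in an auto-flushing PrintStream that encodes as UTF-8.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        printRange(utf8(System.out), 2309, 3000);
    }
}
```

Character.toChars is used instead of a (char) cast so the same method also works for code points above U+FFFF.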
Here's the output in CMD window.
I forgot to save it in UTF-8 format by changing it from
File > Properties > Select the text file encoding
This will properly print the right character from the Eclipse console. The default is cp1252 which will print only ? for those characters it does not understand.
