Convert from CL8ISO8859P5 encoding to ASCII in Java - java

probably you may give me some hints what can I do/see in my case:)
There is an Oracle code that converts given hexadecimal input in AMERICAN_AMERICA.CL8ISO8859P5 to ASCII: utl_raw.cast_to_varchar2(utl_raw.convert(hextoraw('31383831303891353080853737303338385A5A'), 'AMERICAN_AMERICA.CL8ISO8859P5', 'AMERICAN_AMERICA.RU8PC866'))
Example input: 31383831303891353080853737303338385A5A, example output: 188108С50АЕ770388ZZ
My pain is to solve how can I do it in Java:) Prerequisite: I have no connection to the database and can't execute prepared SQL statement in order to call this function in Oracle package...
I am able to parsing everything except specific bytes (91 -> 'C1', 8085 -> 'B0B5', 5A -> 'Z') with the following code:
new String(DatatypeConverter.parseHexBinary("31383831303891353080853737303338385A5A"))
I've also tried all standards encodings in String constructor with encoding but there were no positive results:(
Do you know if there are encodings in Java that are identical to AMERICAN_AMERICA.CL8ISO8859P5? Or do you know some libraries or Java functions that are able to make this conversion (AMERICAN_AMERICA.CL8ISO8859P5 to ASCII) ?
Many thanks to you in advance!

AMERICAN_AMERICA.RU8PC866 Oracle encoding is IBM-866 encoding in Java (hint from #kfinity). My issue solved by using
new String(DatatypeConverter.parseHexBinary(input), "IBM-866")
CP866 worked as well.

Related

getting syntax error from nifi ExecuteScript processor inspite of correct python code

I am getting below error inspite of correct python code don't know how to resolve this error. Any help is much appreciated
org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: SyntaxError: no viable alternative at input '*' in <script> at line number 35 at column number 26
python code
def get_match_list(regEx, line):
match = re.search(regEx, line)
print(match)
if match:
match_list = [*match.groups()] # this is the line exception is pointed
return match_list
else:
return []
It looks like jython use python 2.7 and as Unpacking Generalizations is a feature that introduced in python 3.5 you can not use this syntax in jython, so an alternative way to convert a tuple to a list is that use list ( match.groups) it works fine in older versions of python and current version of jython (2.7.2)

Java CSVeed library, option for quote inside unquoted field?

I'm benching several Java libraries to parse csv files. I can't find a solution for the CSVeed library with this line :
af,dekh"iykh'ya,Dekh"iykh'ya,13,,34.60345,69.2405
I have this error :
org.csveed.report.CsvException: Illegal state transition:
Parsing symbol QUOTE_SYMBOL [34] in state INSIDE_FIELD
19970: af,dekh
I understand very well what happen unfortunately I tried different blend of options without succeed. Is there a way?
In fact the perfect line of 7 cols should be :
af,dekh\"iykh\'ya,Dekh\"iykh\'ya,13,,34.60345,69.2405
af,dekh"iykh'ya,Dekh"iykh'ya,13,,34.60345,69.2405
To parse this in the following fields you'll have to turn quoting off in your parser:
af
dekh"iykh'ya
Dekh"iykh'ya
13
<null>
34.60345
69.2405
If quoting can not be turned off, you could use setQuote(char symbol) and provide an unused char as parameter.

ENCOG values in output file incorrectly denormalized?

The following was produced using the most recent version of encog-workbench (3.2.0)
I was wondering if this is a bug or if I do not grasp the purpose of the output file.
When I run the [ sunspot example ][1] in the encog workbench, without seggregation, i expect the output file to have the fitted values from the model. When i create the validation chart it presents me with the figure found in the tutorial so this seems correct.
But when i go to the sunspots_output.csv output file I get the following output:
ssn(t-29) ssn(t+1) Output:ssn(t+1)
... first thirty values have output Null ...
-0.600472813 -0.947202522 null
-0.477541371 -1 8.349050184
-0.528762805 -0.976359338 8.334476431
-0.814814815 -0.986603625 8.314903157
-0.817178881 -0.892040977 8.292847897
...
All the output values are around 8 for the rest of the file.
Now when i go back to the validation chart, there is a tab data, which contains the following columns:
Ideal Result
-0.477541371 -0.52449577
-0.528762805 -0.526507195
-0.814814815 -0.535029097
-0.817178881 -0.653884012
If I denormalize the values in these columns, I get the following.
66.3 60.3414868
59.8 60.08623701
23.5 59.00480764
23.2 43.92211894
These seem to be correct values for the actual(if i compare them with the original data) and thus these should be the predicted values in the output column.
Is this a bug or do the values in the output(t+1) column mean something else.
I copied these values to excel and denormalized by typing in the formula for (-1,1).
I was hoping not to have to do this every time I run an experiment.
I am going to move to code eventually. Just wanted to get some preliminary results with the workbench. Using segregation results in the same problem, btw.
If its a bug I'll report it on the encog website.
Thanks for your answers,
Florian
UPDATE
Hey Jef, I downloaded your zip and reproduced the problem using my workbench.
The problem only arises when i do not seggregate, which i do not want to.
There are some clear differences in the .ega file created by workbench-excecutable3.2.0
When i use your .ega file and remove the seggregate section, it works.
When i use mine it doesn't. That's why i uploaded my project [here][2]:
Maybe you can discover if something new interferes with outputting the correct values.
Hope it helps!
Update 3:
My actual goal is to build a forecaster of which the project can be found here:
http://wikisend.com/download/477372/Myproject.rar
I was wondering if you could tell me if I am doing something definitely wrong, because currently my output is total rubbish.
Thanks again.
I tried to reproduce the error, but when I ran my own sunspots prediction I did get predicted values closer to the expected range. You might try running the zipped version of the example, found here.
http://www.heatonresearch.com/dload/encog/example/workbench/SunspotExample.zip
You should be able to run the EGA file and it will produce an output file. Some of my data are as follows:
"year" "mon" "ssn" "dev" "Output:ssn(t+1)"
1948 5 174.0 69.3 156.3030108771
1948 6 167.8 26.6 168.4791037592
1948 7 142.2 28.3 208.1090604116
1948 8 157.9 35.3 186.0234029962
1948 9 143.3 55.9 131.5008296846
1948 10 136.3 44.9 93.0720770479
1948 11 95.8 21.8 89.8269594386
Perhaps comparing the EGA file for the above zip to your EGA file.

Unsupport characters in command line file path

I would linke to call java app from PHP:
exec('LC_ALL=en_US.utf-8 java -jar /test.jar' . $filepath . ');
But always there are unsupported characters in the file path.
For example: # & ; ? * [SPACE]..., after change them to # \& ... it will be ok.
But a full list of these characters could not be find.
Any ideas to solve this problem?
Take a look at escapeshellarg() and escapeshellcmd().
They will take care of all necessary sanitation for you.
If $filepath comes from the outside (e.g. from user input), running escapeshellarg() is mandatory to prevent injections.
My problem resolved.
Useful url:
http://bugs.php.net/bug.php?id=44945

meaning of #(#) characters

In documentation code I see some things like this:
/*
* #(#)File.java 1.142 09/04/01
what does characters like #(#) meaning?
#(#) is the character string used by the Unix what command to filter strings from binaries to list the components that were used to build that binary. For instance what java on AIX yields:
java:
23 1.4 src/bos/usr/ccs/lib/libpthreads/init.c, libpth, bos520 8/19/99 12:20:14
61 1.14 src/bos/usr/ccs/lib/libc/__threads_init.c, libcthrd, bos520 7/11/00 12:04:14
src/tools/sov/java.c, tool, asdev, 20081128 1.83.1.36
src/misc/sov/copyrght.c, core, asdev, 20081128 1.8
while `strings java | grep '#(#)' yields:
#(#)23 1.4 src/bos/usr/ccs/lib/libpthreads/init.c, libpth, bos520 8/19/99 12:20:14
#(#)61 1.14 src/bos/usr/ccs/lib/libc/__threads_init.c, libcthrd, bos520 7/11/00 12:04:14
#(#)src/tools/sov/java.c, tool, asdev, 20081128 1.83.1.36
#(#)src/misc/sov/copyrght.c, core, asdev, 20081128 1.8
#(#) was chosen as marker because it would not occur elsewhere, source code controls systems typically add a line containing this marker and the description of the file version on synchronisation, expanding keywords with values reflecting the file contents.
For instance, the comment you list would be the result of expanding the SCCS keywords %Z% %M% %R%.%L% %E% where the %Z% translates into #(#).
From (hazy) memory, that was the tag used by SCCS back in the "good old days". Given that (to my knowledge), BitKeeper uses SCCS underneath, it could be BitKeeper.
It is usually something that is added automatically by the version control system.
That construct has no special meaning in Java. It is just some text in a comment.
It looks like something that's inserted by a version control system.

Categories

Resources