Background to my problem
Hi, I am attempting to complete an exercise on Project Euler which states that I must read all names from a ".txt" file and add up the character values for each character within each name. While doing the exercise I realized that the wrong character codes are being displayed.
These are the full details of the problem from Project Euler:
Using names.txt (right click and 'Save Link/Target As...'), a 46K text
file containing over five-thousand first names, begin by sorting it
into alphabetical order. Then working out the alphabetical value for
each name, multiply this value by its alphabetical position in the
list to obtain a name score.
For example, when the list is sorted into alphabetical order, COLIN,
which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the
list. So, COLIN would obtain a score of 938 × 53 = 49714.
What is the total of all the name scores in the file?
My Question
Why is my code displaying the value "67" for the character "C" when I expect the value for "C" to be 3? Thanks in advance.
private static int NameValue(string name)
{
    string StrimName = name.Substring(1, name.Length-2); // name ---> COLIN
    Console.WriteLine(StrimName[0] + 0); // should print 3 because character code for "C" Is 3 but result is 67...
    return 0;
}
It prints the character's code from the ASCII table: http://www.asciitable.com/
You should replace it with:
Console.WriteLine((StrimName[0]-64) + 0);
to get what you want. You want to count 'A' as 1, and its code in the ASCII table is 65, so I subtract 64.
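The same arithmetic can be sketched for a whole name. This is written in Java rather than the question's C#, the class and method names are made up for illustration, and it assumes the name contains only uppercase A to Z with the surrounding quotes already stripped:

```java
class NameScore {
    // Alphabetical value of a name: 'A' counts as 1, 'B' as 2, ... 'Z' as 26.
    // Assumes the input is uppercase A-Z only.
    static int alphabeticalValue(String name) {
        int sum = 0;
        for (char c : name.toCharArray()) {
            sum += c - 'A' + 1; // 'A' has code 65, so this equals c - 64
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println((int) 'C');                  // the raw character code
        System.out.println(alphabeticalValue("COLIN")); // 3 + 15 + 12 + 9 + 14
    }
}
```

Writing `c - 'A' + 1` instead of `c - 64` avoids the magic number and makes the intent visible.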
Every character has a numeric code in ASCII.
The ASCII code for 'C' is 67, which is why you see 67.
You can see a table of the ASCII codes here.
I have a problem where I want to scan the files that are in a certain folder and output them.
The only problem is that the output is (1.jpg, 10.jpg, 11.jpg, 12.jpg, ..., 19.jpg, 2.jpg) when I want it to be (1.jpg, 2.jpg, and so on). Since I use File actual = new File(i.); (i is the loop counter) to scan for images, I don't know how to sort the output.
This is my code for now.
//variables
String htmlHeader = ("<!DOCTYPE html>:\n"
+ "<html lang=\"en\">\n"
+ "<head>\n"
+ "<meta charset=\"UTF-8\">\n"
+ "<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">\n"
+ "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n"
+ "<title>Document</title>\n"
+ "</head>"
+ "<body>;\n");
String mangaName = ("THREE DAYS OF HAPPINESS");
String htmlEnd = ("</body>\n</html>");
String image = ("image-");
//ask for page number
Scanner scan = new Scanner(System.in);
System.out.print("enter a chapter number: ");
int n = scan.nextInt();
//create file for chapter
File creator = new File("manga.html");
//for loop
for (int i = 1; i <= n; ++i) {
    //writing to HTML file
    BufferedWriter bw = null;
    bw = new BufferedWriter(new FileWriter("manga"+i+".html"));
    bw.write(htmlHeader);
    bw.write("<h2><center>" + mangaName + "</center></h2</br>");
    //scanning files
    File actual = new File("Three Days Of Happiness Chapter "+i+" - Manganelo_files.");
    for (File f : actual.listFiles()) {
        String pageName = f.getName();
        //create list
        List<String> list = Arrays.asList(pageName);
        list.sort(Comparator.nullsFirst(Comparator.comparing(String::length).thenComparing(Comparator.naturalOrder())));
        System.out.println("list");
        //for loop
        //writing body to html file
        bw.write("<p><center><img src=\"Three Days Of Happiness Chapter "+i+" - Manganelo_files/" + pageName + "\" <br/></p>\n");
        System.out.println(pageName);
    }
    bw.write(htmlEnd);
    bw.close();
    System.out.println("Process Finished");
}}
}
When you try to sort the names, you'll notice that they are sorted alphanumerically, character by character (e.g. when comparing 9 with 12, 12 comes before 9 because the leftmost digit 1 < 9).
One way to get around this is to use an extended numbering format when naming and storing your files.
This has been working great for me when sorting pictures, for example. I use YYYY-MM-DD for all dates regardless of whether the day contains one digit (e.g. 9) or two digits (e.g. 11). This means that I always type 9 as 09. It also means that every file name in a given folder has the same length, and each digit is compared properly with the corresponding digit of any other file name.
One solution to your problem is to do the same and add zeros to the left of the file names so that they are easily sorted both by the OS and by your Java program. The drawback to this solution is that you'll need to decide beforehand the maximum number of files you'll want to store in a given folder, by choosing the number of digits (e.g. 3 digits would mean a maximum of 1000 uniquely and linearly numbered file names, from 000 to 999). The plus, however, is that this saves you the hassle of having to sort unevenly numbered files: your files are pre-sorted once and are ready to be read quickly at any time.
Generally, file systems do not have an order to the files in a directory. Instead, anything that lists files (be it an ls or dir command on a command line, calling Files.list in Java code, or opening Finder or Explorer) will apply a sorting order.
One common sorting order is 'alphanumerically'. In which case, the order you describe is correct: 2 comes after 1 and also after 10. You can't wave a magic wand and tell the OS or file system driver not to do that; files as a rule don't have an 'ordering' property.
Instead, make your filenames such that they do sort the way you want, when sorting alphanumerically. Thus, the right name for the first file would be 01.jpg. Or possibly even 0001.jpg - you're going to have to make a call about how many digits you're going to use before you start, unfortunately.
String.format("%05d", 1) becomes "00001" - that's pretty useful here.
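A small sketch of that zero-padding idea follows. The helper name and the choice of 3 digits are just for illustration; pick the width that fits your largest chapter:

```java
import java.util.Locale;

class ZeroPad {
    // Builds a file name such as "007.jpg" from a page number.
    // 'digits' is the fixed width you decided on up front.
    static String paddedName(int page, int digits, String extension) {
        // Locale.ROOT keeps the digits plain ASCII regardless of the system locale.
        return String.format(Locale.ROOT, "%0" + digits + "d.%s", page, extension);
    }

    public static void main(String[] args) {
        for (int page : new int[] {1, 2, 10, 19}) {
            System.out.println(paddedName(page, 3, "jpg"));
        }
    }
}
```

With names of equal width, plain alphanumeric sorting and numeric sorting give the same order.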
The same principle applies to reading files - you can't just rely on the OS sorting it for you. Instead, read it all into e.g. a list of some sort and then sort that. You're going to have to write a fairly funky sorting order: Find the dot, strip off the left side, check if it is a number, etc. Quite complicated. It would be a lot simpler if the 'input' is already properly zero-prefixed, then you can just sort them naturally instead of having to write a complex comparator.
That comparator should probably be modal. Comparators work by being handed 2 elements; you must say which one is 'earlier', and you must be consistent (if you say a is before b, and I later ask about b and a, you must indicate that b is after a).
Thus, an algorithm would look something like:
Determine if a is numeric or not (find the dot, parseInt the substring from start to the dot).
Determine if b is numeric or not.
If both are numeric, check ordering of these numbers. If they have an order (i.e. aren't identical), return an answer. Otherwise, compare the stuff after the dot (1.jpg should presumably be sorted before 1.png).
If neither are numeric, just compare alphanum (aName.compareTo(bName)).
If one is numeric and the other one is not, the numeric one always wins, and vice versa.
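The steps above could be sketched roughly like this. The class and method names are made up, and ties between equal numbers fall back to comparing the full names, which for 1.jpg versus 1.png gives the same result as comparing only the part after the dot:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NumericNameOrder {
    // Returns the number before the first dot, or null if that part is not a plain integer.
    static Integer leadingNumber(String name) {
        int dot = name.indexOf('.');
        String stem = dot < 0 ? name : name.substring(0, dot);
        try {
            return Integer.valueOf(stem);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    static final Comparator<String> BY_NUMBER_THEN_NAME = (a, b) -> {
        Integer na = leadingNumber(a);
        Integer nb = leadingNumber(b);
        if (na != null && nb != null) {
            int byNumber = Integer.compare(na, nb);
            // Same number: fall back to the full name so the order stays consistent.
            return byNumber != 0 ? byNumber : a.compareTo(b);
        }
        if (na == null && nb == null) {
            return a.compareTo(b); // neither is numeric: plain alphanumeric order
        }
        return na != null ? -1 : 1; // the numeric name sorts first
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(BY_NUMBER_THEN_NAME);
        return copy;
    }
}
```

For the array returned by File.listFiles(), something like Arrays.sort(files, Comparator.comparing(File::getName, NumericNameOrder.BY_NUMBER_THEN_NAME)) should apply the same order before the pages are written out.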
Making use of an ASCII .DAT file that contains multiple records of a fixed length, I would like to read each record and generate an output based on certain portions of its contents.
So far my program does exactly this, but I was alerted to the fact that the first field in each .DAT file holds the record length and the number of records. The only issue I am having is reading this first field and extracting the data in a usable form, because the data is stored as ASCII chars and not as decimal numbers.
Below is a code snippet in BASIC that reads the same file and extracts the initial data required:
CLS
INPUT "Survey System Data File? : ", survey$
survey$ = "f:\apps\survey\" + survey$
reclen = 3004
OPEN survey$ + ".dat" FOR RANDOM AS 1 LEN = reclen
FIELD #1, 3 AS RL$, 9 AS n$
GET #1, 1
RL = CVI(RL$): n = CVI(n$)
PRINT "Record Length = "; RL
reclen = RL
PRINT "Number of Records = "; n
CLOSE #1
Is there a way of doing something similar in Java?
The initial record and second record are seen below. The second record starts from 0001511
#Å Õ 000151115 2 351228 6 8131720 1121211 12111121121111111112112111 Treat people fairly. Motivated people who go the extra mile should be recognised. Trust employees to make decisions and find out what is best for the business. Examine the workload and the performance and timing of the work. 11 6 5 6 5 2003/10/007:12 21 111 2 1154 1 1 113 1 1 1 1 1 4000100 0 0 0 400 0 0 0 400 4100.0000.0000.0 0 0 10 24 12111none 9 1346
As you can see the initial characters are ASCII chars and not decimals that I'm looking for.
Many thanks in advance for the help.
I have found a way around this issue: the initial record of the file is basically a blank indicator record, so using this initial record's length I was able to find the recurring record length of the others.
Hi all and thank you for the help in advance.
I have scoured the web and have not really turned up anything concrete regarding my initial question.
I have a program I am developing in Java whose primary purpose is to read a .DAT file, extract certain values from it, and then calculate an output based on the extracted values, which it then writes back to the file.
The file is made up of records that are all the same length and format, so it should be fairly straightforward to access. Currently I am using a loop and an if statement to find the first occurrence of a record, and then use user input to determine the length of each record so I can loop through the records.
HOWEVER! The first record of this file is a blank (or so I thought). As it turns out, this first record is the key to the rest of the file: the first few chars are ASCII and give the record length and the number of records contained within the file, respectively.
Below is a list of the ASCII values themselves as found in the files (disregard the quotes; the ASCII is contained within them):
"#¼ ä "
"#g â "
"ÇG # "
"lj ‰ "
"Çò È "
"=¼ "
A friend of mine who used to code in BASIC many years ago reckons the first 3 chars refer to the record length and the following 9 refer to the number of records.
Basically, what I need to do is convert this initial string of ASCII chars to two decimal numbers in order to work out the length of each record and the number of records.
Any assistance will be greatly appreciated.
Edit...
Please find below the BASIC code used to access the file in the past; perhaps this will help?
CLS
INPUT "Survey System Data File? : ", survey$
survey$ = "f:\apps\survey\" + survey$
reclen = 3004
OPEN survey$ + ".dat" FOR RANDOM AS 1 LEN = reclen
FIELD #1, 3 AS RL$, 9 AS n$
GET #1, 1
RL = CVI(RL$): n = CVI(n$)
PRINT "Record Length = "; RL
reclen = RL
PRINT "Number of Records = "; n
CLOSE #1
Basically, what I am looking for is something similar but in Java.
ASCII is a particular way to translate the bit pattern in a byte to a character, which gives each character a numerical value; for the letter 'A' this is 65.
In Java, you can get that numerical value by converting the char to an int (strictly, this gives you the Unicode value, but since the Unicode value of an ASCII character is the same as its ASCII value, this does not matter).
But now you need to know how the length is calculated: do you have to add the values? Or multiply them? Or append them? Or multiply them with 128^p where p is the position, and add the result? And, in the latter case, is the first byte on position 0 or position 3?
Same for the number of records, of course.
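To illustrate just one of those possibilities: the sketch below combines two bytes by position with base 256 (not 128), low byte first. That layout is an assumption; it matches how CVI in BASIC is usually described (a 2-byte string turned into a 16-bit integer), but whether this file really uses it has to be checked against known values. The value 3004 is only borrowed from the reclen in the BASIC code as test data, and the class and method names are made up:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class HeaderBytes {
    // Unsigned little-endian 16-bit value: low + 256 * high.
    // The masks with 0xFF are needed because Java bytes are signed.
    static int littleEndian16(byte low, byte high) {
        return (low & 0xFF) | ((high & 0xFF) << 8);
    }

    // The same idea via ByteBuffer, giving a signed 16-bit result.
    static short signedLittleEndian16(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getShort(offset);
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xBC, 0x0B}; // illustrative bytes, not taken from the real file
        System.out.println(littleEndian16(sample[0], sample[1]));
        System.out.println(signedLittleEndian16(sample, 0));
    }
}
```

Whatever the layout turns out to be, read the header as raw bytes (for example with Files.readAllBytes or a FileInputStream) rather than through a Reader, because decoding to text can alter byte values above 127.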
Another possible interpretation of the data is that the bytes are BCD-encoded numbers. In that case, each nibble (a group of 4 bits) represents a digit from 0 to 9, and you have to do some bit manipulation to extract the digits and concatenate them, from left (highest) to right (lowest). At least you do not have to struggle with the byte order and further interpretation here …
But as BCD would require all 8 bits of each byte, this would not be the right interpretation if the file really contains ASCII, as ASCII is 7-bit.
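If the BCD reading does turn out to apply, the bit manipulation could look roughly like this. It is a sketch assuming packed BCD with the most significant digits first; the class name and the sample bytes are made up:

```java
class Bcd {
    // Decodes packed BCD: each byte holds two decimal digits, high nibble first.
    static long decodePackedBcd(byte[] data) {
        long value = 0;
        for (byte b : data) {
            int high = (b >> 4) & 0x0F; // upper 4 bits
            int low = b & 0x0F;         // lower 4 bits
            if (high > 9 || low > 9) {
                throw new IllegalArgumentException("Not a BCD byte: " + (b & 0xFF));
            }
            value = value * 100 + high * 10 + low;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(decodePackedBcd(new byte[] {0x30, 0x04}));
    }
}
```

The exception doubles as a quick test of the hypothesis: if the header bytes contain nibbles above 9, the data is not BCD.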
I have a flat file of e-mail header data that I'm trying to parse for analysis. The file will always have fields in order as follows: Record Number, 1 or 2 bytes, "From:" followed by the sender's name and "Sent:" followed by the date sent.
1 From: Person.Name Sent: April 12, 2010
2 From:<tab>Person.Name Sent: April 30, 2011
10 From: Person.Name Sent: June 29, 2012
11 From:<tab>Person.Name Sent: July 8, 2012
Using BufferedReader I am reading the file line by line and defining a substring of the Name based on all characters between the indices of "From:" and "Sent:".
String sender = inputLine.substring((inputLine.indexOf("From:")+6),(inputLine.indexOf("Sent:")-1));
In this case, I'm grabbing everything AFTER "From: " (sixth byte excludes the word, the colon, and the space/single byte after the colon) through one LESS than the position of "Sent: " (the space before the S).
However, I'm getting unexpected output when I run the job. Some of my input data appears to have a tab after "From:" and some lines do not. When a tab is present, my output includes the last two or three bytes of "From:" (when the record number is a single digit, I get m:<tab>; for double-digit record numbers it's om:<tab>).
Person.Name
m:<tab>Person.Name <-- single digit record number
Person.Name
om:<tab>Person.Name <-- double digit record number
EDIT: When I amend my substring to
String sender = inputLine.substring((inputLine.indexOf("From:\t")+6),(inputLine.indexOf("Sent:")-1));
ONLY the records with a space (and not a tab) prepend the end of the From: to the output.
Person.Name <-- records with From:<tab>
om: Person.Name <-- records with From:<space>
I'm now wondering if I understand substring correctly. My statement above is based on an understanding of substring(x,y) where x is the start and y is the end of the string. Is that correct?
Since indexOf("From:") is intended to represent an integer value of 2 or 3 (depending on a 1 or 2 byte record number, e.g., 1 From: or 10 From:) I would think that adding a value of 6 would give me an index value that falls AFTER the : in index 8 or 9 from the front of the line. So why does it appear to be viewing this as an index of 5--regardless?
          111111111122222222222 |
0123456789012345678901234567890 + index values
1 From: Person.Name Sent: June
10 From: Person.Name Sent: July
The tab is the only difference in the records, and while I understand that a tab character may need to be counted differently than an ASCII space character, SUBTRACTING from the index seems a little strange.
Even more interesting, if I remove the "adjustments" from the statement,
String sender = inputLine.substring((inputLine.indexOf("From:")),(inputLine.indexOf("Sent:")));
I get a -1 out of range exception.
Can someone please explain what's happening here? I am baffled, and can't find answers this specific in Oracle's Java documentation.
I ended up creating new input fields that replaced \t with a space. Then everything worked fine. What it was about the tab character that threw things off is still a mystery.
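For what it's worth, the outputs in the question are consistent with indexOf not finding the search string. String.indexOf returns -1 when there is no match, so -1 + 6 = 5, which is exactly the "index of 5 regardless" behaviour described, and with the adjustment removed the -1 goes straight into substring and triggers the out of range exception. That would happen if the search string actually contained the trailing space ("From: "), which is a guess on my part since the posted code searches for "From:". A sketch that checks for -1 and tolerates either a space or a tab (the class and method names are made up):

```java
class SenderParser {
    // Returns the text between "From:" and "Sent:", or null if either marker is missing.
    static String sender(String line) {
        int from = line.indexOf("From:"); // -1 when not found
        int sent = line.indexOf("Sent:");
        if (from < 0 || sent < from) {
            return null;
        }
        // substring(begin, end): begin is inclusive, end is exclusive.
        // trim() drops the space or tab after the colon and the space before "Sent:".
        return line.substring(from + "From:".length(), sent).trim();
    }

    public static void main(String[] args) {
        // The tab line contains no "From: " (with a space), so this prints -1.
        System.out.println("2 From:\tPerson.Name".indexOf("From: "));
        System.out.println(sender("1 From: Person.Name Sent: April 12, 2010"));
        System.out.println(sender("11 From:\tPerson.Name Sent: July 8, 2012"));
    }
}
```

Searching for the marker without the whitespace and trimming afterwards means the tab never has to be replaced in the input.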
I have an assignment: the user enters a String, for example ABCD, and the program has to output all the permutations.
I don't want the whole code, just a tip. This is what I've got so far in theory; I have nothing implemented yet.
Taking ABCD as an example:
Get the factorial of the length of the String, in this case 4! = 24.
24/4 = 6, so the first letter has to change after every 6 entries. So far so good.
Then get the factorial of the remaining letters, of which there are three: 3! = 6.
6/3 = 2, so 2 places for each letter. From here I don't know how it can continue to fill 24 places.
With this algorithm all I will have is
ABCD
ABD
AC
AC
AD
AD
B
B
B
B
B
B
.
. (continues with 6 C's and 6 D's)
I think my problem is that I do not have a lot of experience with recursive problems, so if anyone can suggest some programs to write that would help me get to know recursion better, please do.
Thanks! If anything isn't clear, please point it out.
You are right that recursion is the way to go. The example you worked through with the little bit of math was all correct, but kind of indirect.
Here's some pseudocode:
def permute(charSet, soFar):
    if charSet is empty: print soFar //base case
    for each element 'e' of charSet:
        permute(charSet without e, soFar + e) //recurse
Example of a partial recursion tree:
                          permute({A,B,C}, '')
                /                  |                  \
permute({B,C}, 'A')      permute({A,C}, 'B')      permute({A,B}, 'C')
                            /              \
              permute({A}, 'BC')      permute({C}, 'BA')
                      |
              permute({}, 'BCA')<---BASE CASE, print 'BCA'
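In Java, the pseudocode above might be sketched as follows. The remaining characters are kept in a String, and results are collected in a list instead of printed so they are easy to inspect; the class and method names are made up:

```java
import java.util.ArrayList;
import java.util.List;

class Permutations {
    // 'remaining' plays the role of charSet; 'soFar' is the prefix built so far.
    static void permute(String remaining, String soFar, List<String> out) {
        if (remaining.isEmpty()) {
            out.add(soFar); // base case: nothing left to place
            return;
        }
        for (int i = 0; i < remaining.length(); i++) {
            // "charSet without e": drop the character at position i
            String rest = remaining.substring(0, i) + remaining.substring(i + 1);
            permute(rest, soFar + remaining.charAt(i), out); // recurse
        }
    }

    static List<String> permutations(String input) {
        List<String> out = new ArrayList<>();
        permute(input, "", out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(permutations("ABCD").size());
        System.out.println(permutations("ABC"));
    }
}
```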
EDIT
To handle repeated characters without duplicating any permutations, let unique be a function that removes any repeated elements from a collection (or from a string that is being treated like an ordered character collection through indexing).
1) Store results (including dupes), filter them out afterwards
def permuteRec(charSet, soFar):
    if charSet is empty: tempResults.add(soFar) //base case
    for each element 'e' of charSet:
        permuteRec(charSet without e, soFar + e) //recurse
global tempResults[]
def permute(inString):
    permuteRec(inString, '')
    return unique(tempResults)
print permute(inString)
2) Avoid generating duplicates in the first place
def permute(charSet, soFar):
    if charSet is empty: print soFar //base case
    for each element 'e' of unique(charSet):
        permute(charSet without e, soFar + e) //recurse
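A possible Java rendering of option 2 is below. Here unique(charSet) is approximated with a HashSet that remembers which characters have already been tried at the current level; that is one way to do it, not the only one, and the names are made up:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UniquePermutations {
    static void permute(String remaining, String soFar, List<String> out) {
        if (remaining.isEmpty()) {
            out.add(soFar); // base case
            return;
        }
        Set<Character> tried = new HashSet<>(); // characters already used at this level
        for (int i = 0; i < remaining.length(); i++) {
            char c = remaining.charAt(i);
            if (!tried.add(c)) {
                continue; // same character again: it would only repeat earlier results
            }
            String rest = remaining.substring(0, i) + remaining.substring(i + 1);
            permute(rest, soFar + c, out); // recurse
        }
    }

    static List<String> permutations(String input) {
        List<String> out = new ArrayList<>();
        permute(input, "", out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(permutations("AAB"));
    }
}
```

Compared with option 1, this never builds the duplicate strings, so it does less work when the input has many repeated characters.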
Make a method that takes a string.
Pick a letter out of the string and output it.
Create a new string with the input string minus the letter you picked.
Call the above method with the new string if it has at least 1 character.
Do this picking of one letter for each possible letter.