File not getting created during second MR - java

I have a Hadoop implementation of an algorithm.
I am developing it in Eclipse.
When I run it in Eclipse, my algorithm works fine and creates the necessary files and output.
Algorithm
|
|___creates a file0.txt file.
|
|___creates a file1.txt file.
|
|___creates a file3.txt file.
|
|___creates a file4.txt file.
|
|___creates a file5.txt file.
|
|___creates a file6.txt file.
|
|___creates a file7.txt file.
Completes the job.
When I run my program on the Hadoop cluster, all files except file0.txt fail to get created in HDFS from the reducer phase.
Has anyone come across this issue?
Please help.
Source
Output from eclipse
Output from cluster

The output file is specified by the driver code, irrespective of the MR job. Please check your driver code or share it here.
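For reference, the output location of each job is fixed in the driver, so if the second job's driver points at a wrong or already-existing path, its files never appear. Below is a minimal sketch of a driver with placeholder class names and identity mapper/reducer stand-ins, not the asker's actual code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Step2Driver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "step2");
        job.setJarByClass(Step2Driver.class);
        job.setMapperClass(Mapper.class);    // identity placeholder; substitute your own mapper
        job.setReducerClass(Reducer.class);  // identity placeholder; substitute your own reducer
        // With the identity placeholders and the default TextInputFormat,
        // the output key is the line offset and the value is the line text.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // Each MR job needs its own, not-yet-existing output directory in HDFS;
        // pointing a later job at an existing directory makes it fail on the cluster.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}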

Your question is slightly confusing. All I understand is that you have a 413-byte file and you are trying to run 7 MR jobs.
So, are you saying you have 7 pairs of Mapper and Reducer classes that you want to run on that 413-byte file?
Again, you mentioned that your algorithm runs different MR jobs depending on the data sets, so I'm left to assume that a dataset gets used by only one Mapper-Reducer pair. Did you verify that your dataset satisfies the condition for Mapper-Reducer pairs 1, 3, 4, 5, 6, 7?
Are all these Mapper-Reducer pairs using the same output folder? That might also be a big concern.
Please answer these questions, then possibly I can help.
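If those extra files (file1.txt and so on) are supposed to be written by the reducer itself, MultipleOutputs is the usual mechanism for that. A minimal sketch with assumed Text key/value types and a made-up named output name, not taken from the asker's code:
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class MultiFileReducer extends Reducer<Text, Text, Text, Text> {
    private MultipleOutputs<Text, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // "file1" must also be registered in the driver with
            // MultipleOutputs.addNamedOutput(job, "file1", TextOutputFormat.class,
            //                                Text.class, Text.class);
            mos.write("file1", key, value);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Forgetting to close MultipleOutputs is a classic reason why extra files
        // show up in a local run but are missing or empty on the cluster.
        mos.close();
    }
}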

Related

Logical processor does not appear in omnetpp application

I need to run an OMNeT++ application in parallel. I made the configuration in the .ini file as follows:
parallel-simulation = true
parsim-communications-class = cMPICommunications
parsim-synchronization-class = cNullMessageProtocol
The program builds successfully, but during initialization the console shows the following:
cMPICommunications: started as process 0 out of 1.
WARNING: MPI thinks this process is the only one in the session (did you use mpirun to start this program?)
Loading NED files from ..: 4
Loading NED files from ../../src: 8
Loading NED files from ../../../inet/examples: 151
Loading NED files from ../../../inet/src: 492
Loading NED files from ../../../inet/tutorials: 4
Preparing for running configuration General, run #6...
Scenario: $0=500, $repetition=0
Assigned runID=General-6-20220301-23:26:45-19607
Setting up network `Fog'...
<!> Error in module (cModule) Fog (id=1) during network setup: wrong partitioning: value 1 too large for 'Fog.Broker' (total partitions=1).
End.
I think there is some configuration I missed, probably the part that passes the number of LPs to OMNeT++.
In the OMNeT++ manual they state that passing the number of LPs must be done as follows:
./cqn -p0,3 &
./cqn -p1,3 &
./cqn -p2,3 &
but I don't know where to add these lines exactly
Let's assume that your project's name is foo and the Target type is equal to Executable (in Project | Properties | OMNeT++ | Makemake | Options | Target). After compilation an executable file called foo (in Windows: foo.exe) is created.
Open the console (in Windows: mingwenv) and go to the directory where your project is located, for example:
cd somefolder/foo
Then type the command you need.
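For example (assuming the executable is called foo and you want two partitions), the -p form mirrors the cqn example quoted from the manual, which is normally used with the non-MPI communication classes:
./foo -p0,2 &
./foo -p1,2 &
Since your configuration uses cMPICommunications, starting the processes through mpirun, as the warning in your log suggests, is the more likely fix; the number of MPI processes then determines the number of partitions:
mpirun -np 2 ./foo
The error "total partitions=1" in your log indicates the simulation was started as a single process, which is why a partition value of 1 for Fog.Broker is rejected.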

Corrupt extraction with ZipArchive in PHP

I have a PHP script that must unzip some uploads. The uploads are packed folders, basically zip files with a custom extension.
I am having problems with some zip files packed in one machine, but not with the same folder packed in another machine. In both cases, the compression is done with the same Java library.
This is the expected result, which PHP then processes further:
This is the corrupted result, which makes PHP choke:
If I look at their permissions, this is what I see (01_Orig is okay, 02_Modif is corrupted):
If I look at the two packages with unzip -l (the first one is okay, the second one is corrupt):
And this is my PHP function (which is the same in both cases):
$uploads = "uploads_dir/";
$dir = new DirectoryIterator("uploads_dir/");
foreach ($dir as $fileinfo) {
    if (!$fileinfo->isDot()) {
        $filename = $fileinfo->getFilename();
        $zip = new ZipArchive;
        $res = $zip->open($uploads . $filename);
        if ($res === TRUE) {
            $zip->extractTo($uploads . $filename . "_extracted");
            $zip->close();
        } else {
            echo "Unable to unzip";
        }
    }
}
Both uploads look fine when I manually unzip or open them with 7zip in my Windows machine.
If I create two hex dumps of both zip files and compare them, this is what I get: https://gist.github.com/msoutopico/22a9ef647381c2e4d26313f135c526e2
Thanks a lot in advance for any tips.
UPDATE:
In case it's relevant, the zip files are created (saved) on a Linux server, and both machines where this is done (the one that works, and the one that corrupts the package) run Windows 10.
Sorted. Version 2 of the plugin was tweaked to transform path separators from \ to / in filenames. Even though version 3 of the plugin was installed on both machines, the faulty machine also had an older copy (version 1, predating that tweak), and that older copy was the one being used instead of version 3. Just removing the version 1 duplicate fixed the problem. #pmqs was right. Thank you everyone for helping me solve this quickly!
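For reference, since the packing is done from Java, the fix applied in the plugin amounts to normalizing entry names when writing the archive. A minimal sketch (a hypothetical helper, not the plugin's actual code) that packs a directory using forward slashes in the entry names:
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipWithForwardSlashes {
    // Packs a directory into a zip, normalizing '\' to '/' in entry names so
    // that extractors such as PHP's ZipArchive resolve the paths correctly.
    public static void pack(Path sourceDir, Path zipFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile));
             Stream<Path> paths = Files.walk(sourceDir)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                String entryName = sourceDir.relativize(p).toString().replace('\\', '/');
                try {
                    zos.putNextEntry(new ZipEntry(entryName));
                    Files.copy(p, zos);
                    zos.closeEntry();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}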

how to command line result save into textfile

I have the following code, and I want to save the command-line output to a text file as well. How do I do this? Please help.
@echo off
set JAVA_HOME=C:\jdk1.5.0_05
set CLI_HOME=c:\projects\utds\applications\cli
set CLI_LIB=%CLI_HOME%\lib
set CLASSPATH=%CLI_LIB%\commons-logging.jar;%CLI_LIB%\commons-logging-api.jar
set CLASSPATH=%CLASSPATH%;%CLI_LIB%\spring.jar;%CLI_LIB%\spring-core.jar;%CLI_LIB%\spring-support.jar;%CLI_LIB%\spring-remoting.jar
set CLASSPATH=%CLASSPATH%;%CLI_LIB%\utds-infra.jar;%CLI_HOME%\src\conf\spring;%CLI_HOME%\src\conf
set CLASSPATH=%CLASSPATH%;%CLI_LIB%\aopalliance.jar
set CLASSPATH=%CLASSPATH%;%CLI_HOME%\dist\cli.jar;%JAVA_HOME%\jre\lib\ext\comm.jar
set path=%JAVA_HOME%\bin;%path%
java -Dport=COM3 -DbaudRate=19200 -Dparser=panasonicCliParser -DappContext=applicationContext-service.xml com.utds.cli.service.comm.CallerIdListener
I would redirect the output of the batch file into a text file by running the following command in the command prompt:
myBatchFile.bat > output.log
Okay, it looks like you're trying to put the output of the program into a text file. If that is the case, just redirect the java command in your script, i.e. append > log.txt to the end of the java line.
In my opinion, you would be better off using your logging library. I can see from the script above that your application uses Apache's commons-logging, and the output shows it is clearly used.
This library is a wrapper: it can use Log4J or the JDK's logging library under the hood.
Of course, this requires more learning and struggling with configuration files, but the advantage is that you could (depending on the implementation you choose):
Filter logs by severity (debug < info < warning < error...) and/or by the classes emitting them. Some libraries are quite verbose.
Create rolling log files: once a log file reaches a certain size, a new one is created and the old one is backed up (it is also possible to limit the number of backups).
Create a log file per day.
Log into databases if you ever need it, and more.
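For reference, a minimal sketch of what commons-logging calls look like; the class name is taken from the script above, but the method and messages are made up for illustration. Where the messages end up (console, rolling file, one file per day) is decided entirely by the Log4J or JDK logging configuration on the classpath, not by this code:
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class CallerIdListener {
    // commons-logging delegates to Log4J or java.util.logging at runtime.
    private static final Log log = LogFactory.getLog(CallerIdListener.class);

    public void onCall(String number) {   // hypothetical callback for illustration
        log.info("Incoming call from " + number);
        if (number == null || number.isEmpty()) {
            log.warn("Caller id was empty");
        }
    }
}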
I only had to add > mylog.txt. Thanks all!
java -Dport=COM3 -DbaudRate=19200 -Dparser=panasonicCliParser -DappContext=applicationContext-service.xml com.utds.cli.service.comm.CallerIdListener > mylogs.txt

Split HDFS files into multiple local files using Java

I have to copy HDFS files to the local file system using Java code and, before writing to disk, split them into multiple parts. The files are compressed using Snappy/LZO. I have used BufferedReader and FileWriter to read and write the file, but this operation is very slow: 20 minutes for a 30 GB file. I can dump the file using hadoop fs -text in 2 minutes (but cannot split it). Is there anything else I can do to speed up the operation?
Since I had to do two passes, first to get the line count and then the split, and hadoop fs -text was CPU intensive, I took the approach below:
1) Use a line-count Java program as a MapReduce job to get the number of lines in the file. Dividing it by the total number of files I need gave me the number of lines to write to each file.
2) Use the code mentioned in this link together with hadoop fs -text:
https://superuser.com/a/485602/220236
Hope it helps someone else.
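For anyone who only needs a fixed number of lines per part (rather than a fixed number of parts), a single-pass variant is also possible by decoding through Hadoop's CompressionCodecFactory instead of shelling out to hadoop fs -text. A minimal sketch, assuming the relevant codec (e.g. the LZO codec) is on the classpath and registered in the configuration:
import java.io.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class HdfsSplitter {
    // Streams a compressed HDFS file and splits it into local parts of roughly
    // linesPerPart lines each, decoding through the codec matched by file extension.
    public static void split(String hdfsPath, String localPrefix, long linesPerPart)
            throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path in = new Path(hdfsPath);

        CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(in);
        InputStream raw = fs.open(in);
        InputStream stream = (codec == null) ? raw : codec.createInputStream(raw);

        try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
            int part = 0;
            long lines = 0;
            BufferedWriter writer = new BufferedWriter(new FileWriter(localPrefix + part));
            String line;
            while ((line = reader.readLine()) != null) {
                if (lines == linesPerPart) {
                    writer.close();
                    writer = new BufferedWriter(new FileWriter(localPrefix + (++part)));
                    lines = 0;
                }
                writer.write(line);
                writer.newLine();
                lines++;
            }
            writer.close();
        }
    }
}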

slurping /proc/cpuinfo with clojure

(Clojure newbie)
On my linux machine, slurping /proc/cpuinfo raises an error:
user=> (slurp "/proc/cpuinfo")
java.io.IOException: Invalid argument (NO_SOURCE_FILE:0)
Does anybody know why that is? (Is the /proc filesystem some kind of second-class citizen in Java?)
Edit: the following code, adapted from nakkaya.com, works flawlessly:
(with-open [rdr (java.io.BufferedReader.
                  (java.io.FileReader. "/proc/cpuinfo"))]
  (let [seq (line-seq rdr)]
    (apply print seq)))
I wonder why the difference?
I've had a similar problem with files in /proc. The solution is simple though:
(slurp (java.io.FileReader. "/proc/cpuinfo"))
The problem is that Java cannot open a DataInputStream on /proc, so the slurp function isn't going to help you here, sorry :(
/proc/cpuinfo is a little strange because it has a file size of zero but produces bytes when read. This upsets the smarter Java file-handling classes.
ls -l /proc/cpuinfo
-r--r--r-- 1 root root 0 2012-01-20 00:10 /proc/cpuinfo
see this thread for more http://www.velocityreviews.com/forums/t131093-java-cannot-access-proc-filesystem-on-linux.html
You are going to have to open it with a FileReader. I'll add an example in a bit.
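A small sketch of the same point in plain Java: the reported file length is zero, yet a character stream opened through FileReader reads the contents fine. This is the behaviour the Clojure workaround above relies on:
import java.io.*;

public class ProcCpuInfo {
    public static void main(String[] args) throws IOException {
        File f = new File("/proc/cpuinfo");
        // procfs reports a size of 0 even though reads return data
        System.out.println("reported length: " + f.length());

        // plain character-stream reading works fine
        try (BufferedReader r = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}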
