Getting an error importing Spark dependencies in IntelliJ IDEA (Java)

I am using IntelliJ IDEA with Maven integration, but I am getting errors on the following lines:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
I am trying to run the following example:
package com.spark.hello;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
public class Hello {
public static void main(String[] args) {
String logFile = "F:\\Spark\\a.java";
SparkConf conf = new SparkConf().setAppName("Simple Application");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> logData = sc.textFile(logFile).cache();
long numAs = logData.filter(new Function<String, Boolean>() {
public Boolean call(String s) { return s.contains("a"); }
}).count();
long numBs = logData.filter(new Function<String, Boolean>() {
public Boolean call(String s) { return s.contains("b"); }
}).count();
System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
}
}
Please help me solve this issue. Is there any other way to run this kind of project?

Without seeing the error, I'm guessing the IDE is telling you they are unused imports. Be sure to double-check the dependencies and their versions.
Alt + Enter is the shortcut I've used to resolve many of these issues.
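One way to double-check whether the Spark dependency actually reached the classpath is to try loading a class from it at run time. This is only a diagnostic sketch using the JDK, not anything provided by Spark or IntelliJ; the names ClasspathCheck and isOnClasspath are made up for this example. If it reports the Spark classes as missing, the Spark artifact is probably not declared in the Maven build, or the IDE has not re-imported the project.

```java
// Diagnostic sketch: checks whether a class can be loaded from the current classpath.
// ClasspathCheck and isOnClasspath are hypothetical names used only for this example.
final class ClasspathCheck {

    // Returns true if the named class can be found, without running its static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two Spark classes imported in the question.
        String[] needed = {
            "org.apache.spark.SparkConf",
            "org.apache.spark.api.java.JavaSparkContext"
        };
        for (String name : needed) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it from the same module as the failing code so that it sees the same classpath.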

Related

MiniKdc not available from org.springframework.security.kerberos.test.MiniKdc

I am trying to use MiniKdc in my code, as in MiniKdc.main(config), but I am getting the error "cannot resolve symbol 'MiniKdc'".
I am following this example: https://www.baeldung.com/spring-security-kerberos-integration
I have added this dependency to my build.gradle:
implementation 'org.springframework.security.kerberos:spring-security-kerberos-test:1.0.1.RELEASE'
I tried to search for the dependency on Maven Central and I can't find it.
Here is the class I am working on; I want to be able to import MiniKdc in the second import statement.
import org.apache.commons.io.FileUtils;
import org.springframework.security.kerberos.test.MiniKdc;
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
class KerberosMiniKdc {
private static final String KRB_WORK_DIR = ".\\spring-security-sso\\spring-security-sso-kerberos\\krb-test-workdir";
public static void main(String[] args) throws Exception {
String[] config = MiniKdcConfigBuilder.builder()
.workDir(prepareWorkDir())
.confDir("minikdc-krb5.conf")
.keytabName("example.keytab")
.principals("client/localhost", "HTTP/localhost")
.build();
MiniKdc.main(config);
}
private static String prepareWorkDir() throws IOException {
Path dir = Paths.get(KRB_WORK_DIR);
File directory = dir.normalize().toFile();
FileUtils.deleteQuietly(directory);
FileUtils.forceMkdir(directory);
return dir.toString();
}
}
Is there anything I am doing wrong?
As of 2021, spring-security-kerberos is not well maintained.
I suggest using Apache Kerby instead, either directly or via another library like Kerb4J. See an example here.
package com.kerb4j;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.kerby.kerberos.kerb.client.KrbConfig;
import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.BeforeEach;
import java.io.File;
public class KerberosSecurityTestcase {
private static final Log log = LogFactory.getLog(KerberosSecurityTestcase.class);
private static int i = 10000;
protected int kdcPort;
private SimpleKdcServer kdc;
private File workDir;
private KrbConfig conf;
@BeforeAll
public static void debugKerberos() {
System.setProperty("sun.security.krb5.debug", "true");
}
@BeforeEach
public void startMiniKdc() throws Exception {
kdcPort = i++;
createTestDir();
createMiniKdcConf();
log.info("Starting Simple KDC server on port " + kdcPort);
kdc = new SimpleKdcServer(workDir, conf);
kdc.setKdcPort(kdcPort);
kdc.setAllowUdp(false);
kdc.init();
kdc.start();
}
@AfterEach
public void stopMiniKdc() throws Exception {
log.info("Stopping Simple KDC server on port " + kdcPort);
if (kdc != null) {
kdc.stop();
log.info("Stopped Simple KDC server on port " + kdcPort);
}
}
public void createTestDir() {
workDir = new File(System.getProperty("test.dir", "target"));
}
public void createMiniKdcConf() {
conf = new KrbConfig();
}
public SimpleKdcServer getKdc() {
return kdc;
}
public File getWorkDir() {
return workDir;
}
public KrbConfig getConf() {
return conf;
}
}
Disclaimer: I'm the author of Kerb4J

Unable to read file in Java Spark

I am trying to run a Spark program in Java using Eclipse. It runs if I simply print something to the console, but I am not able to read any file using the textFile function.
I have read somewhere that reading a file can only be done using HDFS, but I am not able to do that on my local system.
Please let me know how to read a file and, if HDFS is required, how to install HDFS on my local system so that I can read the text file.
Here is the code I am testing with. The program otherwise works fine, but it is unable to read the file, saying "Input path does not exist".
package spark;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.api.java.function.Function;
public class TestSpark {
public static void main(String args[])
{
String[] jars = {"D:\\customJars\\spark.jar"};
System.setProperty("hadoop.home.dir", "D:\\hadoop-common-2.2.0-bin-master");
SparkConf sparkConf = new SparkConf().setAppName("spark.TestSpark")
.setMaster("spark://10.1.50.165:7077")
.setJars(jars);
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
SQLContext sqlcon = new SQLContext(jsc);
String inputFileName = "./forecaster.txt" ;
JavaRDD<String> logData = jsc.textFile(inputFileName);
long numAs = logData.filter(new Function<String, Boolean>() {
@Override
public Boolean call(String s) throws Exception {
return s.contains("a");
}
}).count();
long numBs = logData.filter(new Function<String, Boolean>() {
public Boolean call(String s) { return s.contains("b"); }
}).count();
System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
System.out.println("sadasdasdf");
jsc.stop();
jsc.close();
}
}
My file structure:
Update: the file on disk has no .txt extension, but your application refers to it with one. Use String inputFileName = "forecaster" ; instead.
If the file is in the same folder as the Java class TestSpark ($APP_HOME):
String inputFileName = "forecaster.txt" ;
If the file is in the Data directory under your Spark project:
String inputFileName = "Data\\forecaster.txt" ;
Or use the fully qualified path, as shown in the log from my test run below:
16/08/03 08:25:25 INFO HadoopRDD: Input split: file:/C:/Users/user123/worksapce/spark-java/forecaster.txt
~~~~~~~
String inputFileName = "file:/C:/Users/user123/worksapce/spark-java/forecaster.txt" ;
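An "Input path does not exist" error with a relative name usually means the name is being resolved against a different working directory than you expect. Here is a small sketch, using only the JDK, that prints what a relative name resolves to and whether the file is there. InputPathCheck, resolve and describe are names invented for this illustration, and the check only tells you about the machine it runs on, so it is mainly useful with a local master.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: shows how a relative file name is resolved by the JVM.
// InputPathCheck, resolve and describe are hypothetical names for this example.
final class InputPathCheck {

    // Resolves a possibly relative file name against the JVM's working directory.
    static Path resolve(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize();
    }

    // Builds a one-line report: absolute path, whether it exists, and its file: URI.
    static String describe(String fileName) {
        Path p = resolve(fileName);
        String state = Files.exists(p) ? " (exists)" : " (does not exist)";
        return p + state + ", URI: " + p.toUri();
    }

    public static void main(String[] args) {
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println(describe("forecaster.txt"));
    }
}
```

If the report says the file does not exist, either move the file into the printed working directory or pass the printed URI form of the real location to textFile.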
For example, I copied your code and ran it in my local environment.
This is how my project is set up, and I run it with:
String inputFileName = "forecaster.txt" ;
Test File:
this is test file
aaa
bbb
ddddaaee
ewwww
aaaa
a
a
aaaa
bb
Code that I used:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
public class TestSpark {
public static void main(String args[])
{
// String[] jars = {"D:\\customJars\\spark.jar"};
// System.setProperty("hadoop.home.dir", "D:\\hadoop-common-2.2.0-bin-master");
SparkConf sparkConf = new SparkConf().setAppName("spark.TestSpark").setMaster("local");
//.setMaster("spark://10.1.50.165:7077")
//.setJars(jars);
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
//SQLContext sqlcon = new SQLContext(jsc);
String inputFileName = "forecaster.txt" ;
JavaRDD<String> logData = jsc.textFile(inputFileName);
long numAs = logData.filter(new Function<String, Boolean>() {
@Override
public Boolean call(String s) throws Exception {
return s.contains("a");
}
}).count();
long numBs = logData.filter(new Function<String, Boolean>() {
public Boolean call(String s) { return s.contains("b"); }
}).count();
System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
System.out.println("sadasdasdf");
jsc.stop();
jsc.close();
}
}
Spark needs a URI scheme and a proper path in order to understand how to read the file. So if you are reading from HDFS, you should use:
jsc.textFile("hdfs://host/path/to/hdfs/file/input.txt");
If you are reading a local file (local to the worker node, not the machine the driver is running on), you should use:
jsc.textFile("file:///path/to/file/input.txt");
To read a Hadoop Archive (HAR) file, you should use:
jsc.textFile("har://archive/path/to/hdfs/file/input.txt");
And so on.
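To see how such strings are split into scheme, authority (host) and path, you can parse them with java.net.URI. This is just a JDK illustration and does not call Spark or Hadoop; SchemeDemo and parts are names made up for the example. Note how a file URI written with only two slashes turns the first path segment into the host, which is why the three-slash form is the safe one for local files.

```java
import java.net.URI;

// Sketch only: splits a URI string into its scheme, authority and path.
// SchemeDemo and parts are hypothetical names for this example.
final class SchemeDemo {

    // Returns "scheme | authority | path" for the given URI string.
    static String parts(String uri) {
        URI u = URI.create(uri);
        return u.getScheme() + " | " + u.getAuthority() + " | " + u.getPath();
    }

    public static void main(String[] args) {
        // prints: hdfs | host | /path/to/hdfs/file/input.txt
        System.out.println(parts("hdfs://host/path/to/hdfs/file/input.txt"));
        // prints: file | null | /path/to/file/input.txt
        System.out.println(parts("file:///path/to/file/input.txt"));
        // prints: file | path | /to/file/input.txt  ("path" is read as the host)
        System.out.println(parts("file://path/to/file/input.txt"));
    }
}
```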

How to run a particular test step of SoapUI in Java

I want to run a particular test step of my test case in SoapUI using Java code. My problem is that when I try to run at the test step level, it needs a TestCaseRunner argument, which is an anonymous inner type, and a TestCaseRunContext, which is an interface. Do I have to implement both to run it? If yes, can anyone please share a sample of how to do that?
Here's my code:
package com.testauto.soaprunner.soap.impl;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.eviware.soapui.SoapUI;
import com.eviware.soapui.StandaloneSoapUICore;
import com.eviware.soapui.impl.wsdl.WsdlProject;
import com.eviware.soapui.impl.wsdl.WsdlTestSuite;
import com.eviware.soapui.impl.wsdl.testcase.WsdlTestCase;
import com.eviware.soapui.impl.wsdl.testcase.WsdlTestCaseRunner;
import com.eviware.soapui.impl.wsdl.teststeps.WsdlTestStep;
import com.eviware.soapui.model.TestPropertyHolder;
import com.eviware.soapui.model.iface.MessageExchange;
import com.eviware.soapui.model.propertyexpansion.PropertyExpansionUtils;
import com.eviware.soapui.model.testsuite.TestCase;
import com.eviware.soapui.model.testsuite.TestCaseRunContext;
import com.eviware.soapui.model.testsuite.TestProperty;
import com.eviware.soapui.model.testsuite.TestStepResult;
import com.eviware.soapui.model.testsuite.TestSuite;
import com.eviware.soapui.support.types.StringToObjectMap;
import com.eviware.soapui.support.types.StringToStringsMap;
import com.testauto.soaprunner.data.InputData;
import com.testauto.soaprunner.data.ReportData;
public class RunTestImpl{
static Logger logger = LoggerFactory.getLogger(RunTestImpl.class);
List<ReportData> reportDatList=new ArrayList<ReportData>();
public List<ReportData> process(Map<String, String> readDataMap, InputData input, Map<List<String>, String> configurationMap, List<String> configuration, WsdlTestSuite testSuite)
{
List<ReportData> report = new ArrayList<ReportData>();
logger.info("Into the Class for running test cases");
try{
report= getTestSuite(readDataMap,input,configurationMap,configuration,testSuite);
}
catch(Exception e)
{
logger.info(e.getMessage());
}
return report;
}
private List<ReportData> getTestSuite(Map<String, String> readDataMap, InputData input, Map<List<String>, String> configurationMap, List<String> configuration, WsdlTestSuite testSuite) throws Exception {
ReportData report=new ReportData();
logger.info("Into the Class for running test cases");
String suiteName = "";
String reportStr = "";
List<String> testCaseNameList= setPropertyValues(readDataMap,input);
WsdlTestCaseRunner runner = null;
List<TestSuite> suiteList = new ArrayList<TestSuite>();
List<TestCase> caseList = new ArrayList<TestCase>();
SoapUI.setSoapUICore(new StandaloneSoapUICore(true));
System.out.println("testcase name "+ configurationMap.get(configuration));
// WsdlTestCase testCase= testSuite.getTestCaseByName(input.getApiName()+"_"+testCaseName+"_TestCase");
WsdlTestCase testCase= testSuite.getTestCaseByName("my_TESTCASE");
WsdlTestStep tesStep=testCase.getTestStepByName(configurationMap.get(testCaseNameList));
System.out.println("test case name:"+testCase.getName());
report.setTestCase(testCase.getName());
suiteList.add(testSuite);
runner= tesStep.run(?,?);
return reportDatList;
}
private List<String> setPropertyValues(Map<String, String> readDataMap, InputData input) {
String testCaseName="";
TestPropertyHolder holder = PropertyExpansionUtils.getGlobalProperties();
List<String> dataConfigurationList=new ArrayList<String>();
Iterator entries = readDataMap.entrySet().iterator();
while (entries.hasNext()) {
Entry thisEntry = (Entry) entries.next();
String key = (String) thisEntry.getKey();
String value = (String) thisEntry.getValue();
testCaseName+=key;
holder.setPropertyValue(key, holder.getPropertyValue(key));
dataConfigurationList.add(key);
}
System.out.println("testCaseName"+testCaseName);
return dataConfigurationList;
}
}
After trying different things, I got something like this:
TestCaseRunContext context = new MockTestRunContext(new MockTestRunner(testStep.getTestCase()), testStep);
MockTestRunner runner = new MockTestRunner(testStep.getTestCase());
TestStepResult testStepResult= testStep.run(runner, context);
I don't know how it works, but this trick worked for me. If someone knows the reason behind it, please share.

MapReduce-Cassandra wordcount compilation error: ConfigHelper not found

I am trying to run a WordCount MapReduce program to read and count data stored in a Cassandra table (column family), but when I compile my program I get the same error repeatedly. Below are my source code and the errors I got. Can anyone help me solve this issue? Thanks in advance.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.hadoop.*;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.*;
import org.apache.cassandra.utils.ByteBufferUtil;
/**
* This sums the word count stored in the input_words_count ColumnFamily for the key "key-if-verse1".
*
* Output is written to a text file.
*/
public class WordCountCounters extends Configured implements Tool
{
private static final Logger logger = LoggerFactory.getLogger(WordCountCounters.class);
static final String COUNTER_COLUMN_FAMILY = "input_words";
private static final String OUTPUT_PATH_PREFIX = "/Users/Deepu/Documents/dse-3.2.4/dse-data/word_count_counters";
public static void main(String[] args) throws Exception
{
// Let ToolRunner handle generic command-line options
ToolRunner.run(new Configuration(), new WordCountCounters(), args);
System.exit(0);
}
public static class SumMapper extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, LongWritable>
{
public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context) throws IOException, InterruptedException
{
long sum = 0;
for (IColumn column : columns.values())
{
logger.debug("read " + key + ":" + column.name() + " from " + context.getInputSplit());
sum += ByteBufferUtil.toLong(column.value());
}
context.write(new Text(ByteBufferUtil.string(key)), new LongWritable(sum));
}
}
public int run(String[] args) throws Exception
{
Job job = new Job(getConf(), "wordcountcounters");
job.setJarByClass(WordCountCounters.class);
job.setMapperClass(SumMapper.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(LongWritable.class);
FileOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH_PREFIX));
job.setInputFormatClass(ColumnFamilyInputFormat.class);
ConfigHelper.setRpcPort(job.getConfiguration(), "9160");
ConfigHelper.setInitialAddress(job.getConfiguration(), "localhost");
ConfigHelper.setPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
ConfigHelper.setInputColumnFamily(job.getConfiguration(), WordCount.KEYSPACE, WordCountCounters.COUNTER_COLUMN_FAMILY);
SlicePredicate predicate = new SlicePredicate().setSlice_range(
new SliceRange().
setStart(ByteBufferUtil.EMPTY_BYTE_BUFFER).
setFinish(ByteBufferUtil.EMPTY_BYTE_BUFFER).
setCount(100));
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);
job.waitForCompletion(true);
return 0;
}
}
Compilation errors are:
Perhaps because you commented out these two lines:
//import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
//import org.apache.cassandra.hadoop.ConfigHelper;

Running a mapreduce class in another Java program

I wrote a MapReduce class and created a jar file from it. Now I want to use this jar in another Java program.
Can anyone please help me with how to do this?
Thanks.
Here is my MapReduce program:
package org.apache.cassandra.com;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.Map.Entry;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class CassandraSumLib extends Configured implements Tool
{
public CassandraSumLib(){
}
static final String KEYSPACE = "weather";
static final String COLUMN_FAMILY = "momentinfo1";
static final String OUTPUT_PATH = "/tmp/OutPut";
private static final Logger logger = LoggerFactory.getLogger(CassandraSum.class);
public int CassandraSum(String[] args) throws Exception
{
return ToolRunner.run(new Configuration(), new CassandraSumLib(), args);
}
///////////////////////////////////////////////////////////
public static class Summap extends Mapper<Map<String, ByteBuffer>, Map<FloatWritable, ByteBuffer>, Text, DoubleWritable>
{
Text word = new Text("SUM");
float temp;
public void map(Map<String, ByteBuffer> keys, Map<FloatWritable, ByteBuffer> columns, Context context) throws IOException, InterruptedException
{
for (Entry<FloatWritable, ByteBuffer> column : columns.entrySet())
{
if (!"column".equals(column.getKey()))
continue;
temp = ByteBufferUtil.toFloat(column.getValue());
//System.out.println(temp);
context.write(word, new DoubleWritable(temp));
//System.out.println(word + " " + temp);
}
}
}
///////////////////////////////////////////////////////////
public static class Sumred extends Reducer<Text, DoubleWritable, Text, DoubleWritable>
{
public void reduce(Text key, Iterable<DoubleWritable> values, Context context) throws IOException, InterruptedException
{
Double sum = 0.0;
for (DoubleWritable val : values){
// System.out.println(val.get());
sum += val.get();}
context.write(key, new DoubleWritable(sum));
}
}
///////////////////////////////////////////////////////////
public int run(String[] args) throws Exception
{
Job job = new Job(getConf(), "SUM");
job.setJarByClass(CassandraSum.class);
job.setMapperClass(Summap.class);
JobConf conf = new JobConf( getConf(), CassandraSum.class);
// conf.setNumMapTasks(1000);
// conf.setNumReduceTasks(900);
job.setOutputFormatClass(TextOutputFormat.class);
job.setCombinerClass(Sumred.class);
job.setReducerClass(Sumred.class);
job.setOutputKeyClass(Text.class);
job.setNumReduceTasks(900);
job.setOutputValueClass(DoubleWritable.class);
FileOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
job.setInputFormatClass(CqlPagingInputFormat.class);
ConfigHelper.setInputRpcPort(job.getConfiguration(), "9160");
ConfigHelper.setInputInitialAddress(job.getConfiguration(), "localhost");
ConfigHelper.setInputColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
ConfigHelper.setInputPartitioner(job.getConfiguration(), "Murmur3Partitioner");
CqlConfigHelper.setInputCQLPageRowSize(job.getConfiguration(), "3");
job.waitForCompletion(true);
return 0;
}
}
I want to call this class from another program. Here is my second program, which calls my first program:
package org.apache.cassandra.com;
import java.util.*;
import org.apache.hadoop.util.RunJar;
import org.apache.cassandra.com.CassandraSumLib;
public class CassandraSum {
public static void main(String[] args) throws Exception{
CassandraSumLib CSL = new CassandraSumLib();
CSL.??? (which method should I write here?)
}
}
thanks
Steps to add a jar file in Eclipse:
1. Right-click on the project.
2. Click Build Path -> Configure Build Path.
3. Click Java Build Path.
4. Click the Libraries tab.
5. Click Add External JARs.
6. Choose the jar file.
7. Click OK.
Add the jar to the classpath of the second program. If you are compiling or running from the command line, use the -cp option.
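If you are not sure whether the jar really ended up on the classpath, you can inspect the java.class.path system property from inside the second program. A minimal sketch, using only the JDK; ClasspathEntries, split and containsJar are names invented here, and cassandra-sum.jar is just a placeholder for whatever your jar is called.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch only: inspects a classpath string for a given jar file name.
// ClasspathEntries, split and containsJar are hypothetical names for this example.
final class ClasspathEntries {

    // Splits a classpath string on the platform separator (":" on Unix, ";" on Windows).
    static List<String> split(String classPath) {
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    // Returns true if any classpath entry's file name equals jarFileName.
    static boolean containsJar(String classPath, String jarFileName) {
        for (String entry : split(classPath)) {
            if (new File(entry).getName().equals(jarFileName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        for (String entry : split(classPath)) {
            System.out.println(entry);
        }
        // "cassandra-sum.jar" is a placeholder name, not taken from the question.
        System.out.println("jar present: " + containsJar(classPath, "cassandra-sum.jar"));
    }
}
```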
