Creating a SynonymFilter programmatically in Java

I am struggling with a SynonymFilter that I am trying to create programmatically. How are you supposed to tell the filter where the synonym list is?
I am using Hibernate Search, but I don't want to use the @AnalyzerDef annotation.
Do I just have to pass a SynonymMap?
private class AllAnalyzer extends Analyzer {
    private SynonymFilterFactory synonymFilterFactory = new SynonymFilterFactory();

    public AllAnalyzer() {
        ClassLoader classLoader = getClass().getClassLoader();
        String filePath = classLoader.getResource("synonyms.txt").getFile();
        HashMap<String, String> stringStringHashMap = new HashMap<String, String>();
        stringStringHashMap.put("synonyms", filePath);
        stringStringHashMap.put("format", "solr");
        stringStringHashMap.put("ignoreCase", "false");
        stringStringHashMap.put("expand", "true");
        stringStringHashMap.put("luceneMatchVersion", Version.LUCENE_36.name());
        synonymFilterFactory.init(stringStringHashMap);
    }

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = null;
        result = new StandardTokenizer(Version.LUCENE_36, reader);
        result = new StandardFilter(Version.LUCENE_36, result);
        result = synonymFilterFactory.create(result);
        return result;
    }
}
I am unable to get it to work. When I debug it, the map is null and I get an NPE. What is wrong?

Yes, you need to pass a SynonymMap to the SynonymFilter.
It sounds like you want to populate it from a file, so you'll likely want to use SolrSynonymParser to generate it. Along the lines of:
SolrSynonymParser parser = new SolrSynonymParser(true, false, analyzer);
Reader synonymFileReader = new FileReader(new File(path));
parser.add(synonymFileReader);
SynonymMap map = parser.build(); // SolrSynonymParser extends SynonymMap.Builder
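Wired into the analyzer from the question, a minimal sketch (assuming the Lucene 3.6 API, with the classes from org.apache.lucene.analysis.synonym; synonyms.txt is the resource name from the question):
private class AllAnalyzer extends Analyzer {
    private final SynonymMap synonymMap;

    public AllAnalyzer() {
        try {
            // dedup = true, expand = false; the analyzer argument is used to
            // tokenize the entries in the synonym file itself
            SolrSynonymParser parser = new SolrSynonymParser(true, false,
                    new StandardAnalyzer(Version.LUCENE_36));
            parser.add(new InputStreamReader(
                    getClass().getClassLoader().getResourceAsStream("synonyms.txt")));
            synonymMap = parser.build();
        } catch (Exception e) {
            throw new RuntimeException("could not load synonyms.txt", e);
        }
    }

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(Version.LUCENE_36, reader);
        result = new StandardFilter(Version.LUCENE_36, result);
        // last argument is ignoreCase, matching "ignoreCase" = "false" above
        return new SynonymFilter(result, synonymMap, false);
    }
}
This sidesteps SynonymFilterFactory entirely, which avoids the NPE: the factory's map stays null unless its inform(ResourceLoader) lifecycle is driven for you, which is what Solr normally does.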


Is there any way to reduce the amount of code?

I have been working on a project for my studies; it works fine, but I want to make it as good as possible. I have two separate JSON files, containing Users and Actions. I need to extract that data and do some work with it, but the question is about getting that data. I have a class called DataReader that has two methods, readUsers and readActions.
public class DataReader {
    Gson gson = new GsonBuilder().setDateFormat("MM.dd").create();

    public ArrayList<Action> readActions(String fileName)
            throws JsonIOException, JsonSyntaxException, FileNotFoundException {
        Type actionsArrayList = new TypeToken<ArrayList<Action>>() {
        }.getType();
        return gson.fromJson(new FileReader(fileName), actionsArrayList);
    }

    public HashMap<Integer, User> readUsers(String fileName)
            throws JsonIOException, JsonSyntaxException, FileNotFoundException {
        Type usersHashMap = new TypeToken<HashMap<Integer, User>>() {
        }.getType();
        return gson.fromJson(new FileReader(fileName), usersHashMap);
    }
}
As you can see, those two methods do pretty much the same thing; the only difference is the type of object they read from the JSON file and return.
So is there any possibility to write a method like readData that takes only the fileName parameter and sorts things out itself, to reduce the amount of code?
You need to close that Reader, specifically the FileReader you're creating. You also don't need to hold the Type in a local variable; just inline it.
Here is readActions rewritten; you can do the same for the other method (see the sketch after this block).
public List<Action> readActionsSimplified(String fileName) throws IOException {
    try (Reader reader = new FileReader(fileName)) {
        return gson.fromJson(reader, new TypeToken<List<Action>>() {}.getType());
    }
}
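For completeness, the same treatment applied to readUsers (the name readUsersSimplified is mine):
public HashMap<Integer, User> readUsersSimplified(String fileName) throws IOException {
    try (Reader reader = new FileReader(fileName)) {
        return gson.fromJson(reader, new TypeToken<HashMap<Integer, User>>() {}.getType());
    }
}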
Maybe you can try this. Note that the constant must be declared as a TypeToken (calling getType() on the right-hand side would not compile against a TypeToken field), and the token's getType() is what gets passed to Gson:
public <T> T readData(String fileName, TypeToken<T> typeRef)
        throws JsonIOException, JsonSyntaxException, IOException {
    try (Reader reader = new FileReader(fileName)) {
        return gson.fromJson(reader, typeRef.getType());
    }
}

// make a type-holder class, e.g. MyGsonTypes
public final class MyGsonTypes {
    public static final TypeToken<HashMap<Integer, User>> usersHashMapType =
            new TypeToken<HashMap<Integer, User>>() {};
}

// when you use it
var data = readData("1.json", MyGsonTypes.usersHashMapType);

SuperCSV nested bean

I have a CSV file:
id,name,description,price,date,name,address
1,SuperCsv,Write csv file,1234.56,28/03/2016,amar,jp nagar
I want to read it and store it to a JSON file.
I have created two beans, Course (id, name, description, price, date) and Person (name, address).
When reading with the bean reader, I'm not able to set the person's address.
The (beautified) output is
Course [id=1,
name=SuperCsv,
description=Write csv file,
price=1234.56,
date=Mon Mar 28 00:00:00 IST 2016,
person=[
Person [name=amar, address=null],
Person [name=null, address=jpnagar]
]
]
I want the address to be set together with the name.
My code:
public static void readCsv(String csvFileName) throws IOException {
    ICsvBeanReader beanReader = null;
    try {
        beanReader = new CsvBeanReader(new FileReader(csvFileName), CsvPreference.STANDARD_PREFERENCE);
        // the header elements are used to map the values to the bean (names must match)
        final String[] header = beanReader.getHeader(true);
        final CellProcessor[] processors = getProcessors();
        final String[] fieldMapping = new String[header.length];
        for (int i = 0; i < header.length; i++) {
            if (i < 5) {
                // normal mappings
                fieldMapping[i] = header[i];
            } else {
                // attribute mappings
                fieldMapping[i] = "addAttribute";
            }
        }
        ObjectMapper mapper = new ObjectMapper();
        Course course;
        List<Course> courseList = new ArrayList<Course>();
        while ((course = beanReader.read(Course.class, fieldMapping, processors)) != null) {
            // process course
            System.out.println(course);
            courseList.add(course);
        }
    } finally {
        if (beanReader != null) {
            beanReader.close();
        }
    }
}

private static CellProcessor[] getProcessors() {
    final CellProcessor parsePerson = new CellProcessorAdaptor() {
        public Object execute(Object value, CsvContext context) {
            return new Person((String) value, null);
        }
    };
    final CellProcessor parsePersonAddress = new CellProcessorAdaptor() {
        public Object execute(Object value, CsvContext context) {
            return new Person(null, (String) value);
        }
    };
    return new CellProcessor[] {
        new ParseInt(),
        new NotNull(),
        new Optional(),
        new ParseDouble(),
        new ParseDate("dd/MM/yyyy"),
        new Optional(parsePerson),
        new Optional(parsePersonAddress)
    };
}
SuperCSV is the first parser I have seen that lets you create an object within an object.
For what you want, you could try Apache Commons CSV or openCSV (CsvToBean) for the mapping, but you would need the setters of the inner class (setName, setAddress) on the outer class for CsvToBean to pick them up. That may or may not work.
What I normally tell people is to have a plain POJO that has all the fields in the CSV (a data transfer object). Let the parser create that, then use a utility/builder class to convert the plain POJO into the nested POJO you want; a sketch follows.
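A minimal sketch of that approach; the CourseRow and CourseBuilder names, and the Course/Person constructors, are hypothetical:
// Flat DTO mirroring the CSV columns one-to-one
public class CourseRow {
    public int id;
    public String name;
    public String description;
    public double price;
    public Date date;
    public String personName;
    public String personAddress;
}

// Utility that converts the flat row into the nested domain object
public final class CourseBuilder {
    public static Course toCourse(CourseRow row) {
        Course course = new Course(row.id, row.name, row.description, row.price, row.date);
        course.setPerson(new Person(row.personName, row.personAddress));
        return course;
    }
}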

Submitting a Spark program from code: job not shown on history server UI

When I submit a Spark job from the command line, it appears on both the Spark history server UI and the Hadoop resource manager UI. But when I submit it from Java code, the job shows up on the Hadoop resource manager UI yet not on the Spark history server UI.
The Spark program is the same; only the way I submit it differs.
Submit code :
public class submitJava {

    public static Map getPropXmlAsMap(String path) throws Exception {
        File file = new File(path);
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build(file);
        Element root = doc.getRootElement();
        List props = root.getChildren("property");
        Map result = new HashMap();
        for (Iterator iter = props.iterator(); iter.hasNext();) {
            Element element = (Element) iter.next();
            String name = element.getChild("name").getText();
            String value = element.getChild("value").getText();
            result.put(name, value);
        }
        return result;
    }

    public static void fillProperties(Configuration conf, Map map) {
        Iterator iter = map.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry entry = (Map.Entry) iter.next();
            conf.set((String) entry.getKey(), (String) entry.getValue());
        }
    }

    public static void main(String[] args) {
        Configuration config = new Configuration();
        SparkConf sparkConf = new SparkConf();
        sparkConf.setMaster("yarn-cluster");
        try {
            fillProperties(config, getPropXmlAsMap("/home/tseg/hadoop-2.6.0/etc/hadoop/core-site.xml"));
            fillProperties(config, getPropXmlAsMap("/home/tseg/hadoop-2.6.0/etc/hadoop/yarn-site.xml"));
        } catch (Exception e) {
            e.printStackTrace();
        }
        List<String> runArgs = Arrays.asList(
                "--class", "org.apache.spark.examples.JavaSparkPi",
                "--addJars", "hdfs://tseg0:9010/user/tseg/hh/spark-assembly-1.4.0-hadoop2.6.0.jar",
                "--jar", "file:////home/tseg/hh/spark-1.4.0-bin-hadoop2.6.0/lib/spark-examples-1.4.0-hadoop2.6.0.jar",
                "--arg", "10");
        System.setProperty("SPARK_YARN_MODE", "true");
        ClientArguments argss = new ClientArguments(runArgs.toArray(new String[runArgs.size()]), sparkConf);
        Client client = new Client(argss, config, sparkConf);
        ApplicationId applicationId = client.submitApplication();
    }
}
Have a look at https://spark.apache.org/docs/1.2.0/running-on-yarn.html
It states that there is a property spark.yarn.historyServer.address, which defaults to empty since the history server is an optional service. It is probably not set when you submit from code, but it is set when you launch via the command line.
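A sketch of setting it on the SparkConf before building the ClientArguments; the host, port and log directory are assumptions that have to match your cluster:
SparkConf sparkConf = new SparkConf();
sparkConf.setMaster("yarn-cluster");
// point the YARN integration at the history server (hypothetical host:port)
sparkConf.set("spark.yarn.historyServer.address", "tseg0:18080");
// event logs are what the history server actually reads, so enable them too
sparkConf.set("spark.eventLog.enabled", "true");
sparkConf.set("spark.eventLog.dir", "hdfs://tseg0:9010/spark-logs");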

MyBatis performance: accessing mappers

I'm using MyBatis in a project that fetches many rows (more than 2M).
I have a simple question about how MyBatis works. Every time I need an action from the mapper, does MyBatis read the XML file and extract the query? Or are the mappers kept in memory and accessed directly?
This is important, because accessing and reading an XML file could affect the performance we are expecting.
Thanks in advance.
In short, MyBatis parses the XML file when you first build your SqlSessionFactory from the configuration XML file. All the properties, mappers and settings are stored in memory after that.
Explanation:
As stated in the docs, you can set up your SqlSessionFactory without XML, directly in Java, as follows (see the last line):
DataSource dataSource = BlogDataSourceFactory.getBlogDataSource();
TransactionFactory transactionFactory = new JdbcTransactionFactory();
Environment environment = new Environment("development", transactionFactory, dataSource);
Configuration configuration = new Configuration(environment);
configuration.addMapper(BlogMapper.class);
SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(configuration);
Actually, when you build your SqlSessionFactory from XML, you will write something like this:
String resource = "org/mybatis/example/mybatis-config.xml";
InputStream inputStream = Resources.getResourceAsStream(resource);
SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(inputStream);
If you trace the source in SqlSessionFactoryBuilder,
public SqlSessionFactory build(Reader reader, String environment, Properties properties) {
    try {
        XMLConfigBuilder parser = new XMLConfigBuilder(reader, environment, properties);
        return build(parser.parse());
        ...
The parse() method returns a Configuration object, which holds all the information you supplied in the XML file.
public class Configuration {
    protected Environment environment;

    protected boolean safeRowBoundsEnabled = false;
    protected boolean safeResultHandlerEnabled = true;
    protected boolean mapUnderscoreToCamelCase = false;
    protected boolean aggressiveLazyLoading = true;
    protected boolean multipleResultSetsEnabled = true;
    protected boolean useGeneratedKeys = false;
    protected boolean useColumnLabel = true;
    protected boolean cacheEnabled = true;
    protected boolean callSettersOnNulls = false;

    protected String logPrefix;
    protected Class <? extends Log> logImpl;
    protected LocalCacheScope localCacheScope = LocalCacheScope.SESSION;
    protected JdbcType jdbcTypeForNull = JdbcType.OTHER;
    protected Set<String> lazyLoadTriggerMethods = new HashSet<String>(Arrays.asList(new String[] { "equals", "clone", "hashCode", "toString" }));
    protected Integer defaultStatementTimeout;
    protected ExecutorType defaultExecutorType = ExecutorType.SIMPLE;
    protected AutoMappingBehavior autoMappingBehavior = AutoMappingBehavior.PARTIAL;
    protected Properties variables = new Properties();
    protected ObjectFactory objectFactory = new DefaultObjectFactory();
    protected ObjectWrapperFactory objectWrapperFactory = new DefaultObjectWrapperFactory();
    protected MapperRegistry mapperRegistry = new MapperRegistry(this);
    protected boolean lazyLoadingEnabled = false;
    protected ProxyFactory proxyFactory;
    protected String databaseId;

    /**
     * Configuration factory class.
     * Used to create Configuration for loading deserialized unread properties.
     *
     * @see <a href='https://code.google.com/p/mybatis/issues/detail?id=300'>Issue 300</a> (google code)
     */
    protected Class<?> configurationFactory;

    protected final InterceptorChain interceptorChain = new InterceptorChain();
    protected final TypeHandlerRegistry typeHandlerRegistry = new TypeHandlerRegistry();
    protected final TypeAliasRegistry typeAliasRegistry = new TypeAliasRegistry();
    protected final LanguageDriverRegistry languageRegistry = new LanguageDriverRegistry();

    protected final Map<String, MappedStatement> mappedStatements = new StrictMap<MappedStatement>("Mapped Statements collection");
    protected final Map<String, Cache> caches = new StrictMap<Cache>("Caches collection");
    protected final Map<String, ResultMap> resultMaps = new StrictMap<ResultMap>("Result Maps collection");
    protected final Map<String, ParameterMap> parameterMaps = new StrictMap<ParameterMap>("Parameter Maps collection");
    protected final Map<String, KeyGenerator> keyGenerators = new StrictMap<KeyGenerator>("Key Generators collection");

    protected final Set<String> loadedResources = new HashSet<String>();
    protected final Map<String, XNode> sqlFragments = new StrictMap<XNode>("XML fragments parsed from previous mappers");

    protected final Collection<XMLStatementBuilder> incompleteStatements = new LinkedList<XMLStatementBuilder>();
    protected final Collection<CacheRefResolver> incompleteCacheRefs = new LinkedList<CacheRefResolver>();
    protected final Collection<ResultMapResolver> incompleteResultMaps = new LinkedList<ResultMapResolver>();
    protected final Collection<MethodResolver> incompleteMethods = new LinkedList<MethodResolver>();
    ...
Just don't worry about this.
After setup, MyBatis has already parsed the mapper XML into an in-memory structure, org.apache.ibatis.session.Configuration.mappedStatements (the XML file is read only that first time).
When you then invoke mapper methods or sqlSession.selectOne() and the like, the parameterMap/resultMap/resultType is fetched from mappedStatements; the XML is not read again.
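You can observe that in-memory lookup directly; a small sketch (the statement id is hypothetical):
Configuration conf = sqlSessionFactory.getConfiguration();
// a pure map lookup into mappedStatements; no XML parsing happens here
MappedStatement ms = conf.getMappedStatement("org.mybatis.example.BlogMapper.selectBlog");
System.out.println(ms.getResource()); // the mapper file it was parsed from, once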
MyBatis also caches the mapper proxies, like this (the proxy instance is created the first time you invoke it):
final String resource = "org/apache/ibatis/builder/MapperConfig.xml";
final Reader reader = Resources.getResourceAsReader(resource);
manager = SqlSessionManager.newInstance(reader);
AuthorMapper mapper = manager.getMapper(AuthorMapper.class);
Author expected = new Author(500, "cbegin", "******", "cbegin@somewhere.com", "Something...", null);
mapper.insertAuthor(expected);
This is how the mapper is obtained:
public class MapperRegistry {
    public <T> T getMapper(Class<T> type, SqlSession sqlSession) {
        final MapperProxyFactory<T> mapperProxyFactory = (MapperProxyFactory<T>) knownMappers.get(type);
        if (mapperProxyFactory == null) {
            throw new BindingException("Type " + type + " is not known to the MapperRegistry.");
        }
        try {
            return mapperProxyFactory.newInstance(sqlSession);
        } catch (Exception e) {
            throw new BindingException("Error getting mapper instance. Cause: " + e, e);
        }
    }
}
You can also enable the query cache in MyBatis 3.x+ like this (note that the second-level cache additionally requires a <cache/> element in each mapper XML):
<configuration>
    <settings>
        <setting name="cacheEnabled" value="true" />
    </settings>
</configuration>

Merge properties into a ResourceBundle from System.getProperties()

I'm building a ResourceBundle from a file; this bundle holds <String, String> values.
InputStream in = getClass().getResourceAsStream("SQL.properties");
properties = new PropertyResourceBundle(in);
in.close();
I would like to add/replace some properties on this bundle, passed from the command line using -Dsome.option.val.NAME1=HiEarth.
I don't mind dumping the old bundle and creating a new one instead.
Could you give me a tip?
I think that what I need to do is:
Create a HashMap<String, String> from the bundle.
Replace values.
Transform the HashMap into an InputStream. // This is the complicated part...
Build the new bundle from that.
This does some of what you want (converts System.getProperties() to a ResourceBundle). Better error handling is left up to you :-)
public static ResourceBundle createBundle()
{
    final ResourceBundle bundle;
    final Properties properties;
    final CharArrayWriter charWriter;
    final PrintWriter printWriter;
    final CharArrayReader charReader;

    charWriter = new CharArrayWriter();
    printWriter = new PrintWriter(charWriter);
    properties = System.getProperties();
    properties.list(printWriter);
    charReader = new CharArrayReader(charWriter.toCharArray());

    try
    {
        bundle = new PropertyResourceBundle(charReader);
        return (bundle);
    }
    catch(final IOException ex)
    {
        // cannot happen
        ex.printStackTrace();
    }

    throw new Error();
}
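One caveat with the listing above: Properties.list() is a debugging aid and truncates values longer than 40 characters. Properties.store() round-trips safely; a sketch:
public static ResourceBundle createBundle() throws IOException {
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    // store() escapes keys and values properly and never truncates
    System.getProperties().store(out, "system properties");
    return new PropertyResourceBundle(new ByteArrayInputStream(out.toByteArray()));
}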
This might not be the best way to do it, but it's the best I can think of: implement a subclass of ResourceBundle that stores the properties you want to add/replace, then set the parent of that bundle to be the PropertyResourceBundle you load from the input stream.
InputStream in = getClass().getResourceAsStream("SQL.properties");
properties = new PropertyResourceBundle(in);
in.close();
MyCLIResourceBundle b = new MyCLIResourceBundle(properties);
// use b as your bundle
where the implementation would be something like
public class MyCLIResourceBundle extends ResourceBundle {
    public MyCLIResourceBundle(ResourceBundle parent) {
        super();
        this.setParent(parent);
        // go on and load your chosen properties from System.getProperties() or wherever
    }
}
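Since ResourceBundle is abstract, the subclass also has to implement handleGetObject and getKeys. A fuller sketch, assuming the overrides come straight from System.getProperties():
public class MyCLIResourceBundle extends ResourceBundle {
    private final Properties overrides = System.getProperties();

    public MyCLIResourceBundle(ResourceBundle parent) {
        setParent(parent);
    }

    @Override
    protected Object handleGetObject(String key) {
        // returning null makes the lookup fall through to the parent bundle
        return overrides.getProperty(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        // keys of this bundle plus those of the parent, as the contract requires
        Set<String> keys = new HashSet<String>(overrides.stringPropertyNames());
        keys.addAll(Collections.list(parent.getKeys()));
        return Collections.enumeration(keys);
    }
}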
