I was asked to work on this back-end scheduled job that exports some customer data (from an e-commerce DB) to a custom-format text file. The code that follows is what I found.
I would just like to delete it all, but I can't. Would it be possible for me to improve it without changing it too much?
public class AConverter implements CustomerConverter {
protected final Logger LOG = LoggerFactory.getLogger(AConverter.class);
private final static String SEPARATOR = ";";
private final static String CR = "\n";
public String create(Customer customer) {
if (customer == null)
return null;
LOG.info("Exporting customer, uidpk: {}, userid: {}", customer.getUidPk(), customer.getUserId());
StringBuilder buf = new StringBuilder();
buf.append("<HEAD>");
buf.append(SEPARATOR);
buf.append(String.valueOf(customer.getUidPk()));
buf.append(SEPARATOR);
byte[] fullName = null;
try {
fullName = customer.getFullName().getBytes("UTF-8");
} catch (UnsupportedEncodingException e1) {
fullName = customer.getFullName().getBytes();
}
String name = null;
try {
name = new String(fullName, "UTF-8");
} catch (UnsupportedEncodingException e) {
name = customer.getFullName();
}
buf.append(limitString(name, 40));
buf.append(SEPARATOR);
final CustomerAddress preferredShippingAddress = customer.getPreferredShippingAddress();
if (preferredShippingAddress != null) {
final String street1 = preferredShippingAddress.getStreet1();
if (street1 != null) {
buf.append(limitString(street1, 40));
}
} else {
buf.append(" ");
}
buf.append(SEPARATOR);
final String addressStr = buildAddressString(customer);
buf.append(limitString(addressStr, 40));
buf.append(SEPARATOR);
buf.append(limitString(customer.getEmail(), 80));
buf.append(SEPARATOR);
if (preferredShippingAddress!=null && preferredShippingAddress.getStreet2() != null) {
buf.append(limitString(preferredShippingAddress.getStreet2(), 40));
} else {
buf.append(" ");
}
buf.append(SEPARATOR);
buf.append(limitString(customer.getPhoneNumber(), 25));
buf.append(SEPARATOR);
if (preferredShippingAddress!=null) {
if(preferredShippingAddress.getCountry()!=null) {
buf.append(preferredShippingAddress.getCountry());
} else {
buf.append(" ");
}
} else {
buf.append(" ");
}
buf.append(SEPARATOR);
if (preferredShippingAddress!=null) {
if(preferredShippingAddress.getCountry()!=null) {
buf.append(preferredShippingAddress.getCountry());
} else {
buf.append(" ");
}
} else {
buf.append(" ");
}
buf.append(SEPARATOR);
String fodselsnummer = " ";
try {
Map<String, AttributeValue> profileValueMap = customer.getProfileValueMap();
AttributeValue attributeValue = profileValueMap.get("CODE");
fodselsnummer = attributeValue.getStringValue();
} catch (Exception e) {
}
buf.append(fodselsnummer);
buf.append(CR);
final String string = buf.toString();
return string;
}
private String buildAddressString(Customer customer) {
final CustomerAddress preferredShippingAddress = customer.getPreferredShippingAddress();
if (preferredShippingAddress != null) {
final String zipOrPostalCode = preferredShippingAddress.getZipOrPostalCode();
final String city = preferredShippingAddress.getCity();
if (zipOrPostalCode != null && city != null) {
return zipOrPostalCode + " " + city;
} else if(zipOrPostalCode == null && city != null) {
return city;
} else if(zipOrPostalCode != null && city == null) {
return zipOrPostalCode;
}
}
return " ";
}
private String limitString(String value, int numOfChars) {
if (value != null && value.length() > numOfChars)
return value.substring(0, numOfChars);
else
return value;
}
}
You say you want to improve it, you'd like to delete it, but you can't. I'm not sure why you can't. I also don't understand why you'd want to delete it. But it sounds to me like the kind of attitude I used to have before I read Refactoring by Martin Fowler. I would strongly suggest you read that book, if you haven't already.
It is certainly possible to improve this code (or any code) without rewriting it all. The most obvious improvements would be to eliminate some of the repetitive code in the create method by creating some utility methods, and then breaking up the create method into several smaller methods à la template methods.
Also, there is a questionable bit of code in the create method that turns the customer's name into a UTF-8 byte stream and then back into a string. I can't imagine what that's for. Finally, it returns null if the customer is null. That is unlikely to be necessary or wise.
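To illustrate why that round trip looks pointless: for any well-formed Java string, encoding to UTF-8 and decoding again gives back an equal string. Below is a minimal sketch. The class and method names are made up for illustration, and the only case I know of where the result differs is a string containing unpaired surrogates, which are replaced during encoding.

```java
import java.nio.charset.StandardCharsets;

final class Utf8RoundTrip {
    // Encodes to UTF-8 bytes and decodes straight back, like the original create() does.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Bjørn Åsgård", written with Unicode escapes so the source file encoding does not matter
        String name = "Bj\u00f8rn \u00c5sg\u00e5rd";
        System.out.println(roundTrip(name).equals(name)); // prints true
    }
}
```

If that holds for your data, the two try/catch blocks can be replaced by customer.getFullName().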
For fun, I decided to do a little refactoring on this code. (Note that proper refactoring involves unit tests; I don't have any tests for this code and have not even compiled the code below, much less tested it.) Here is one possible way you could rewrite this code:
public class AConverter implements CustomerConverter {
protected final Logger LOG = LoggerFactory.getLogger(AConverter.class);
private final static String SEPARATOR = ";";
private final static String CR = "\n";
public String create(Customer customer) {
if (customer == null) throw new IllegalArgumentException("no cust");
LOG.info("Exporting customer, uidpk: {}, userid: {}",
customer.getUidPk(), customer.getUserId());
StringBuilder buf = new StringBuilder();
doHead(buf, customer);
doAddress(buf, customer);
doTail(buf, customer);
return buf.toString();
}
private void doHead(StringBuilder buf, Customer customer) {
append(buf, "<HEAD>");
append(buf, String.valueOf(customer.getUidPk()));
append(buf, limitTo(40, customer.getFullName()));
}
private void doAddress(StringBuilder buf, Customer customer) {
append(buf, limitTo(40, street1of(customer)));
append(buf, limitTo(40, addressOf(customer)));
append(buf, limitTo(80, customer.getEmail()));
append(buf, limitTo(40, street2of(customer)));
append(buf, limitTo(25, customer.getPhoneNumber()));
append(buf, countryOf(customer));
append(buf, countryOf(customer));
}
private void doTail(StringBuilder buf, Customer customer) {
buf.append(fodselsnummerOf(customer));
buf.append(CR);
}
private void append(StringBuilder buf, String s) {
buf.append(s).append(SEPARATOR);
}
private String street1of(Customer customer) {
final CustomerAddress shipto = customer.getPreferredShippingAddress();
if (shipto == null) return " ";
if (shipto.getStreet1() != null) return shipto.getStreet1();
return " ";
}
private String street2of(Customer customer) {
final CustomerAddress shipto = customer.getPreferredShippingAddress();
if (shipto == null) return " ";
if (shipto.getStreet2() != null) return shipto.getStreet2();
return " ";
}
private String addressOf(Customer customer) {
final CustomerAddress shipto = customer.getPreferredShippingAddress();
if (shipto == null) return " ";
final String post = shipto.getZipOrPostalCode();
final String city = shipto.getCity();
if (post != null && city != null) return post + " " + city;
if (post == null && city != null) return city;
if (post != null && city == null) return post;
return " ";
}
private String countryOf(Customer customer) {
final CustomerAddress shipto = customer.getPreferredShippingAddress();
if (shipto == null) return " ";
if (shipto.getCountry() != null) return shipto.getCountry();
return " ";
}
private String limitTo(int numOfChars, String value) {
if (value != null && value.length() > numOfChars)
return value.substring(0, numOfChars);
return value;
}
private String fodselsnummerOf(Customer customer) {
try {
Map<String, AttributeValue> profileValueMap =
customer.getProfileValueMap();
AttributeValue attributeValue = profileValueMap.get("CODE");
return attributeValue.getStringValue();
} catch (Exception e) {
return " ";
}
}
}
I also notice that there is a problem with your format for the custom-format text file if any of the fields of the customer data (email address, for example) happen to have a semicolon in them, because that is your separator character. I trust that is a known issue?
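If it is not already handled, one common fix is CSV-style quoting. The sketch below assumes that whatever reads the file can be changed to understand quoted fields, which I can't know for a custom format. FieldEscaper and escape are hypothetical names, not part of your code.

```java
final class FieldEscaper {
    private static final String SEPARATOR = ";";

    // Hypothetical helper: wraps a field in double quotes when it contains the
    // separator, a double quote, or a line break, and doubles embedded quotes.
    static String escape(String value) {
        if (value == null) {
            return " "; // assumption: same single-space placeholder the converter uses for missing address fields
        }
        boolean needsQuoting = value.contains(SEPARATOR)
                || value.contains("\"")
                || value.contains("\n")
                || value.contains("\r");
        if (!needsQuoting) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("plain@example.com")); // unchanged
        System.out.println(escape("odd;name@example.com")); // wrapped in double quotes
    }
}
```

In the refactored class above, the call would fit inside append(), so every field goes through it before the separator is added.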
I use LanguageTool for some spellchecking and spell correction functionality in my application.
The LanguageTool documentation describes how to exclude words from spell checking (by calling the addIgnoreTokens(...) method of the spell checking rule you're using).
How do you add some words (e.g., from a specific dictionary) to spell checking? That is, can LanguageTool fix words with misspellings and suggest words from my specific dictionary?
Unfortunately, I don't think the API supports this. Without the API, you can add words to spelling.txt to get them accepted and used as suggestions. With the API, you might need to extend MorfologikSpellerRule and change this place of the code. (Disclosure: I'm the maintainer of LanguageTool.)
I had a similar requirement: loading some custom words into the dictionary as "suggested words", not just "ignored words". In the end I extended MorfologikSpellerRule to do this:
Create a class MorfologikSpellerRuleEx that extends MorfologikSpellerRule, override the match() method, and write a custom initSpellerEx() method that creates the spellers.
Then, on the JLanguageTool instance, disable the existing speller rule and add this custom rule in its place.
Code:
Language lang = new AmericanEnglish();
JLanguageTool langTool = new JLanguageTool(lang);
langTool.disableRule("MORFOLOGIK_RULE_EN_US");
try {
MorfologikSpellerRuleEx spellingRule = new MorfologikSpellerRuleEx(JLanguageTool.getMessageBundle(), lang);
spellingRule.setSpellingFilePath(spellingFilePath);
// spellingFilePath points to a file containing my own words plus the words from /hunspell/spelling_en-US.txt
langTool.addRule(spellingRule);
} catch (IOException e) {
e.printStackTrace();
}
The code of my custom MorfologikSpellerRuleEx:
public class MorfologikSpellerRuleEx extends MorfologikSpellerRule {
private String spellingFilePath = null;
private boolean ignoreTaggedWords = false;
public MorfologikSpellerRuleEx(ResourceBundle messages, Language language) throws IOException {
super(messages, language);
}
@Override
public String getFileName() {
return "/en/hunspell/en_US.dict";
}
@Override
public String getId() {
return "MORFOLOGIK_SPELLING_RULE_EX";
}
@Override
public void setIgnoreTaggedWords() {
ignoreTaggedWords = true;
}
public String getSpellingFilePath() {
return spellingFilePath;
}
public void setSpellingFilePath(String spellingFilePath) {
this.spellingFilePath = spellingFilePath;
}
private void initSpellerEx(String binaryDict) throws IOException {
String plainTextDict = null;
if (JLanguageTool.getDataBroker().resourceExists(getSpellingFileName())) {
plainTextDict = getSpellingFileName();
}
if (plainTextDict != null) {
BufferedReader br = null;
if (this.spellingFilePath != null) {
try {
br = new BufferedReader(new FileReader(this.spellingFilePath));
}
catch (Exception e) {
br = null;
}
}
if (br != null) {
speller1 = new MorfologikMultiSpeller(binaryDict, br, plainTextDict, 1);
speller2 = new MorfologikMultiSpeller(binaryDict, br, plainTextDict, 2);
speller3 = new MorfologikMultiSpeller(binaryDict, br, plainTextDict, 3);
br.close();
}
else {
speller1 = new MorfologikMultiSpeller(binaryDict, plainTextDict, 1);
speller2 = new MorfologikMultiSpeller(binaryDict, plainTextDict, 2);
speller3 = new MorfologikMultiSpeller(binaryDict, plainTextDict, 3);
}
setConvertsCase(speller1.convertsCase());
} else {
throw new RuntimeException("Could not find ignore spell file in path: " + getSpellingFileName());
}
}
private boolean canBeIgnored(AnalyzedTokenReadings[] tokens, int idx, AnalyzedTokenReadings token)
throws IOException {
return token.isSentenceStart() || token.isImmunized() || token.isIgnoredBySpeller() || isUrl(token.getToken())
|| isEMail(token.getToken()) || (ignoreTaggedWords && token.isTagged()) || ignoreToken(tokens, idx);
}
@Override
public RuleMatch[] match(AnalyzedSentence sentence) throws IOException {
List<RuleMatch> ruleMatches = new ArrayList<>();
AnalyzedTokenReadings[] tokens = getSentenceWithImmunization(sentence).getTokensWithoutWhitespace();
// lazy init
if (speller1 == null) {
String binaryDict = null;
if (JLanguageTool.getDataBroker().resourceExists(getFileName())) {
binaryDict = getFileName();
}
if (binaryDict != null) {
initSpellerEx(binaryDict); //here's the change
} else {
// should not happen, as we only configure this rule (or rather its subclasses)
// when we have the resources:
return toRuleMatchArray(ruleMatches);
}
}
int idx = -1;
for (AnalyzedTokenReadings token : tokens) {
idx++;
if (canBeIgnored(tokens, idx, token)) {
continue;
}
// if we use token.getToken() we'll get ignored characters inside and speller
// will choke
String word = token.getAnalyzedToken(0).getToken();
if (tokenizingPattern() == null) {
ruleMatches.addAll(getRuleMatches(word, token.getStartPos(), sentence));
} else {
int index = 0;
Matcher m = tokenizingPattern().matcher(word);
while (m.find()) {
String match = word.subSequence(index, m.start()).toString();
ruleMatches.addAll(getRuleMatches(match, token.getStartPos() + index, sentence));
index = m.end();
}
if (index == 0) { // tokenizing char not found
ruleMatches.addAll(getRuleMatches(word, token.getStartPos(), sentence));
} else {
ruleMatches.addAll(getRuleMatches(word.subSequence(index, word.length()).toString(),
token.getStartPos() + index, sentence));
}
}
}
return toRuleMatchArray(ruleMatches);
}
}
I have a list of names in a CSV file, and I want to run a Google search for each name using Java. The problem is that when I first run the code the searches succeed, but partway through the run the code starts throwing 503 exceptions, and when I run it again it throws 503 exceptions from the very beginning. Here is the code I am using.
public class ExtractInformation
{
static String firstname,middlename,lastname;
public static final int PAGE_NUMBERS = 10;
public static void readCSV()
{
boolean first = true;
try
{
String splitBy = ",";
BufferedReader br = new BufferedReader(new FileReader("E:\\KOLDump\\names.csv"));
String line = null;
String site = null;
while((line=br.readLine())!=null)
{
if(first)
{
first = false;
continue;
}
String[] b = line.split(splitBy);
firstname = b[0];
middlename = b[1];
lastname = b[2];
String name = null;
if(middlename == null || middlename.length() == 0)
{
name = firstname+" "+lastname+" OR "+lastname+" "+firstname.charAt(0);
}
else
{
name = firstname+" "+lastname+" OR "+lastname+" "+firstname.charAt(0)+" OR "+firstname+" "+middlename.charAt(0)+". "+lastname;
}
BufferedReader brs = new BufferedReader(new FileReader("E:\\KOLDump\\site.csv"));
while((site = brs.readLine()) != null)
{
if(first)
{
first = false;
continue;
}
String [] s = site.split(splitBy);
String siteName = s[0];
siteName = (siteName.replace("www.", ""));
siteName = (siteName.replace("http://", ""));
getDataFromGoogle(name.trim(), siteName.trim());
}
brs.close();
}
//br.close();
}
catch(Exception e)
{
System.out.println("unable to read file...some problem in the csv");
}
}
public static void main(String[] args)
{
readCSV();
}
private static void getDataFromGoogle(String query,String siteName)
{
Set<String> result = new HashSet<String>();
String request = "http://www.google.co.in/search?q="+query+" "+siteName;
try
{
Document doc = Jsoup.connect(request).userAgent("Chrome").timeout(10000).get();
Element query_results = doc.getElementById("ires");
Elements gees = query_results.getElementsByClass("g");
for(Element gee : gees)
{
Element h3 = gee.getElementsByTag("h3").get(0);
String annotation = h3.getElementsByTag("a").get(0).attr("href");
if(annotation.split("q=",2)[1].contains(siteName))
{
System.out.println(annotation.split("q=",2)[1]);
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
Any suggestions on how to get rid of these exceptions would be really helpful.
If you wait a little, do the 503s go away? If so, then you're probably being rate-limited by Google. https://support.google.com/gsa/answer/2686272?hl=en
You may need to put some kind of delay between requests.
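One way to add such a delay is to retry with an increasing pause. This is a rough sketch, assuming Java 8 or later. Backoff.retry is a hypothetical helper, not part of Jsoup, and the attempt count and delays are arbitrary values you would need to tune.

```java
import java.util.concurrent.Callable;

final class Backoff {
    // Runs the task; after each failure it sleeps, then doubles the delay.
    // Rethrows the last failure once maxAttempts is exhausted.
    static <T> T retry(Callable<T> task, int maxAttempts, long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Stand-in for a request that fails twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("HTTP 503");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```

In the question's code, the Jsoup.connect(request)...get() call could be passed in as the task, ideally with a fixed pause between successive searches as well. If Google keeps answering 503 regardless, a client-side delay may not be enough.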
I have the following snippets of code, and I would like to eliminate their use of sessions and make them RESTful. I have a controller servlet that uses several handlers, which determine the page to return to. For example, here are two of my handlers:
public class ReporterLoginHandler implements ActionHandler {
public String handleIt(Map params, HttpSession session) {
String reporterId = null;
String passwd = null;
String errMsg = null;
ReporterBean reporterBean = null;
String returnPage = "home";
try {
reporterId = ((String[]) params.get("reporterid"))[0];
passwd = ((String[]) params.get("passwd"))[0];
} catch (Exception ex) {
System.out.println("Oops, couldn't parse the parameters for login!");
}
if (reporterId == null || reporterId.length() == 0 || passwd == null || passwd.length() == 0) {
errMsg = "The reporterID or password cannot be empty";
} else if ((reporterBean = ReporterBeanFactory.getReporter(reporterId, passwd)) == null) {
errMsg = "The reporterID or password is not valid";
}
if (errMsg != null) {
session.setAttribute("msg", errMsg); //should be removed and replaced by a RESTful API
} else {
returnPage = "reporter_home";
session.setAttribute("reporterBean", reporterBean);
}
return returnPage;
}
}
public class PostItemHandler implements ActionHandler {
@Override
public String handleIt(Map<String, String[]> params, HttpSession session) {
String title = params.get("title")[0];
String story = params.get("story")[0];
String itemId = null;
String returnPage = "home";
if (params.containsKey("item")) {
itemId = params.get("item")[0];
}
ReporterBean rBean = (ReporterBean) session.getAttribute("reporterBean"); // needs to be replaced by a RESTful API
String msg = "";
int id = 0;
String filename = session.getAttribute("newsfile").toString();
if (title != null && title.length() > 0 && story != null && story.length() > 0) {
if (itemId != null && itemId.length() > 0) {
try {
id = Integer.parseInt(itemId);
} catch (Exception exc) {
msg = "Invalid format for news item ID";
}
if (rBean != null) {
if (msg.equals("") && NewsItemBeanFactory.editNewsItem(id, title, story, rBean.getReporterId())) {
msg = "News item " + id + " successfully edited!";
returnPage = "reporter_home";
try {
NewsItemBeanFactory.saveNewsItems(filename);
} catch (IOException ex) {
Logger.getLogger(PostItemHandler.class.getName()).log(Level.SEVERE, null, ex);
}
} else {
msg = "News item " + id + " could not be edited!";
}
} else {
msg = "Error: please log in before adding or editing an item.";
}
} else {
if (rBean != null) {
NewsItemBeanFactory.addNewsItem(title, story, rBean.getReporterId());
msg = "News item successfully added!";
returnPage = "reporter_home";
try {
NewsItemBeanFactory.saveNewsItems(filename);
} catch (IOException ex) {
Logger.getLogger(PostItemHandler.class.getName()).log(Level.SEVERE, null, ex);
}
} else {
msg = "Error: please log in before adding a new item.";
}
}
}
if (params.get("returnpage") != null) {
if (params.get("returnpage")[0].toString().equals("mynews")) {
Collection<NewsItemBean> newsItems = NewsItemBeanFactory.getAllItems();
ArrayList<NewsItemBean> myNewsItems = new ArrayList<NewsItemBean>();
for (NewsItemBean item : newsItems) {
if (rBean != null && rBean.getReporterId().equals(item.getReporterId())) {
myNewsItems.add(item);
}
}
session.setAttribute("mynews", myNewsItems); //needs to be replaced by a RESTful API
returnPage = "mynews";
}
}
session.setAttribute("msg", msg); //needs to be replaced by a RESTful API
return returnPage;
}
}
Specifically, I would like to eliminate the use of all sessions from my handlers (as well as from my controller servlet) and would like to create a RESTful API where the java beans are represented with JSON.
I would prefer not to use an external REST API creator such as Spring or Jersey, however, I am open to using Google's Gson to convert my beans to and from JSON.
EDIT: Also, I would like the login to return an authorization token when successful.
Could anyone help me here?
I am trying to pull some XHTML out of an RSS feed so I can place it in a WebView. The RSS feed in question has a tag called <content>, and the characters inside it are XHTML. (The site I'm parsing is a Blogger feed.)
What is the best way to pull this content out? The < characters are confusing my parser. I have tried both DOM and SAX, but neither handles this very well.
Here is a sample of the XML, as requested. In this case, I basically want the XHTML inside the content tag as a string: <content> XHTML </content>
Edit: based on ignyhere's suggestion I have tried XPath, but I am still having the same issue. Here is a pastebin sample of my tests.
It's not pretty, but this is (the essence of) what I use to parse an ATOM feed from Blogger using XmlPullParser. The code is pretty icky, but it is from a real app. You can probably get the general flavor of it, anyway.
final String TAG_FEED = "feed";
public int parseXml(Reader reader) {
XmlPullParserFactory factory = null;
StringBuilder out = new StringBuilder();
int entries = 0;
try {
factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(reader);
while (true) {
int eventType = xpp.next();
if (eventType == XmlPullParser.END_DOCUMENT) {
break;
} else if (eventType == XmlPullParser.START_DOCUMENT) {
out.append("Start document\n");
} else if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
// out.append("Start tag " + tag + "\n");
if (TAG_FEED.equalsIgnoreCase(tag)) {
entries = parseFeed(xpp);
}
} else if (eventType == XmlPullParser.END_TAG) {
// out.append("End tag " + xpp.getName() + "\n");
} else if (eventType == XmlPullParser.TEXT) {
// out.append("Text " + xpp.getText() + "\n");
}
}
out.append("End document\n");
} catch (XmlPullParserException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
// return out.toString();
return entries;
}
private int parseFeed(XmlPullParser xpp) throws XmlPullParserException, IOException {
int depth = xpp.getDepth();
assert (depth == 1);
int eventType;
int entries = 0;
xpp.require(XmlPullParser.START_TAG, null, TAG_FEED);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT) && (xpp.getDepth() > depth)) {
// loop invariant: At this point, the parser is not sitting on
// end-of-document, and is at a level deeper than where it started.
if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
// Log.d("parseFeed", "Start tag: " + tag); // Uncomment to debug
if (FeedEntry.TAG_ENTRY.equalsIgnoreCase(tag)) {
FeedEntry feedEntry = new FeedEntry(xpp);
feedEntry.persist(this);
entries++;
// Log.d("FeedEntry", feedEntry.title); // Uncomment to debug
// xpp.require(XmlPullParser.END_TAG, null, tag);
}
}
}
assert (depth == 1);
return entries;
}
class FeedEntry {
String id;
String published;
String updated;
// Timestamp lastRead;
String title;
String subtitle;
String authorName;
int contentType;
String content;
String preview;
String origLink;
String thumbnailUri;
// Media media;
static final String TAG_ENTRY = "entry";
static final String TAG_ENTRY_ID = "id";
static final String TAG_TITLE = "title";
static final String TAG_SUBTITLE = "subtitle";
static final String TAG_UPDATED = "updated";
static final String TAG_PUBLISHED = "published";
static final String TAG_AUTHOR = "author";
static final String TAG_CONTENT = "content";
static final String TAG_TYPE = "type";
static final String TAG_ORIG_LINK = "origLink";
static final String TAG_THUMBNAIL = "thumbnail";
static final String ATTRIBUTE_URL = "url";
/**
* Create a FeedEntry by pulling its bits out of an XML Pull Parser. Side effect: Advances
* XmlPullParser.
*
* @param xpp
*/
public FeedEntry(XmlPullParser xpp) {
int eventType;
int depth = xpp.getDepth();
assert (depth == 2);
try {
xpp.require(XmlPullParser.START_TAG, null, TAG_ENTRY);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT)
&& (xpp.getDepth() > depth)) {
if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
if (TAG_ENTRY_ID.equalsIgnoreCase(tag)) {
id = Util.XmlPullTag(xpp, TAG_ENTRY_ID);
} else if (TAG_TITLE.equalsIgnoreCase(tag)) {
title = Util.XmlPullTag(xpp, TAG_TITLE);
} else if (TAG_SUBTITLE.equalsIgnoreCase(tag)) {
subtitle = Util.XmlPullTag(xpp, TAG_SUBTITLE);
} else if (TAG_UPDATED.equalsIgnoreCase(tag)) {
updated = Util.XmlPullTag(xpp, TAG_UPDATED);
} else if (TAG_PUBLISHED.equalsIgnoreCase(tag)) {
published = Util.XmlPullTag(xpp, TAG_PUBLISHED);
} else if (TAG_CONTENT.equalsIgnoreCase(tag)) {
int attributeCount = xpp.getAttributeCount();
for (int i = 0; i < attributeCount; i++) {
String attributeName = xpp.getAttributeName(i);
if (attributeName.equalsIgnoreCase(TAG_TYPE)) {
String attributeValue = xpp.getAttributeValue(i);
if (attributeValue
.equalsIgnoreCase(FeedReaderContract.FeedEntry.ATTRIBUTE_NAME_HTML)) {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_HTML;
} else if (attributeValue
.equalsIgnoreCase(FeedReaderContract.FeedEntry.ATTRIBUTE_NAME_XHTML)) {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_XHTML;
} else {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_TEXT;
}
break;
}
}
content = Util.XmlPullTag(xpp, TAG_CONTENT);
extractPreview();
} else if (TAG_AUTHOR.equalsIgnoreCase(tag)) {
// Skip author for now -- it is complicated
int authorDepth = xpp.getDepth();
assert (authorDepth == 3);
xpp.require(XmlPullParser.START_TAG, null, TAG_AUTHOR);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT)
&& (xpp.getDepth() > authorDepth)) {
}
assert (xpp.getDepth() == 3);
xpp.require(XmlPullParser.END_TAG, null, TAG_AUTHOR);
} else if (TAG_ORIG_LINK.equalsIgnoreCase(tag)) {
origLink = Util.XmlPullTag(xpp, TAG_ORIG_LINK);
} else if (TAG_THUMBNAIL.equalsIgnoreCase(tag)) {
thumbnailUri = Util.XmlPullAttribute(xpp, tag, null, ATTRIBUTE_URL);
} else {
@SuppressWarnings("unused")
String throwAway = Util.XmlPullTag(xpp, tag);
}
}
} // while
} catch (XmlPullParserException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
assert (xpp.getDepth() == 2);
}
}
public static String XmlPullTag(XmlPullParser xpp, String tag)
throws XmlPullParserException, IOException {
xpp.require(XmlPullParser.START_TAG, null, tag);
String itemText = xpp.nextText();
if (xpp.getEventType() != XmlPullParser.END_TAG) {
xpp.nextTag();
}
xpp.require(XmlPullParser.END_TAG, null, tag);
return itemText;
}
public static String XmlPullAttribute(XmlPullParser xpp,
String tag, String namespace, String name)
throws XmlPullParserException, IOException {
assert (!TextUtils.isEmpty(tag));
assert (!TextUtils.isEmpty(name));
xpp.require(XmlPullParser.START_TAG, null, tag);
String itemText = xpp.getAttributeValue(namespace, name);
if (xpp.getEventType() != XmlPullParser.END_TAG) {
xpp.nextTag();
}
xpp.require(XmlPullParser.END_TAG, null, tag);
return itemText;
}
I'll give you a hint: None of the return values matter. The data is saved into a database by a method (not shown) called at this line:
feedEntry.persist(this);
I would attempt to attack it with XPath. Would something like this work?
public static String parseAtom (InputStream atomIS)
throws Exception {
// Below should yield the second content block
String xpathString = "(//*[starts-with(name(),'content')])[2]";
// or, String xpathString = "//*[name() = 'content'][2]";
// remove the '[2]' to get all content tags or get the count,
// if needed, and then target specific blocks
//String xpathString = "count(//*[starts-with(name(),'content')])";
// note the evaluate expression below returns a glob and not a node set
XPathFactory xpf = XPathFactory.newInstance ();
XPath xpath = xpf.newXPath ();
XPathExpression xpathCompiled = xpath.compile (xpathString);
// use the first to recast and evaluate as NodeList
//Object atomOut = xpathCompiled.evaluate (
// new InputSource (atomIS), XPathConstants.NODESET);
String atomOut = (String) xpathCompiled.evaluate (
new InputSource (atomIS), XPathConstants.STRING);
System.out.println (atomOut);
return atomOut;
}
I can see your problem here: these parsers do not produce the correct result because the contents of your <content> tag are not wrapped in <![CDATA[ ]]>. Until I found a more adequate solution, I would use this quick and dirty trick:
private void parseFile(String fileName) throws IOException {
String line;
BufferedReader br = new BufferedReader(new FileReader(new File(fileName)));
StringBuilder sb = new StringBuilder();
boolean match = false;
while ((line = br.readLine()) != null) {
if(line.contains("<content")){
sb.append(line);
sb.append("\n");
match = true;
continue;
}
if(match){
sb.append(line);
sb.append("\n");
match = false;
}
if(line.contains("</content")){
sb.append(line);
sb.append("\n");
}
}
System.out.println(sb.toString());
}
This will give you all the content as a String. You can optionally separate the pieces by slightly modifying this method, or, if you don't need the actual <content> tags, you can filter those out as well.
I'm writing a simple MessageRenderer.
Its specification:
Render a message based on a Context (a map that contains all the key/value parameter pairs).
Support simple rendering such as: Your username is << username >>. Assuming username in the Context is barcelona, the result will be Your username is Barcelona.
Support function-like objects. Example: Current time is << now() >>, where now() is an object that returns the current date and time as a string. The result will be: Current time is 2011-05-30
Each function parameter can also be templated: Current time is << now( << date_format >> ) >>. This template returns the current date and time as a string, formatted according to the value of the key 'date_format' retrieved from the Context. Assuming date_format in the Context is dd/MM/yyyy, the result will be: Current time is 30/05/2011
Each function parameter can also be templated with a different method call: Time is << now_locale ( << getLocale() >> ) >>. Assuming getLocale() is a function object that returns the locale en_US, the result will be: Time is 2011/05/30 11:20:34 PM
Templates can be nested. Example: Your user name is << << username >> >>. Here the key username has the value param1, and the key param1 has the value barcelona, so the final result will be: Your user name is Barcelona.
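To make the simple and nested cases concrete, here is a minimal sketch that resolves the innermost placeholder first and repeats until none are left. It is not the implementation shown below: it ignores function calls such as now() entirely, uses a plain Map in place of the Context, and SimplePlaceholderRenderer is a made-up name. Missing keys render as ###, and values are inserted exactly as stored.

```java
import java.util.HashMap;
import java.util.Map;

final class SimplePlaceholderRenderer {
    private static final String START = "<<";
    private static final String END = ">>";
    private static final String NULL_TOKEN = "###";
    private static final int MAX_SUBSTITUTIONS = 1000; // guard against values that reference themselves

    static String render(String template, Map<String, String> context) {
        String s = template;
        for (int i = 0; i < MAX_SUBSTITUTIONS; i++) {
            int end = s.indexOf(END);                  // first end token...
            if (end < 0) break;
            int start = s.lastIndexOf(START, end - 1); // ...and the closest start token before it
            if (start < 0) break;                      // unmatched end token: leave the rest untouched
            String key = s.substring(start + START.length(), end).trim();
            String value = context.get(key);
            if (value == null) value = NULL_TOKEN;
            s = s.substring(0, start) + value + s.substring(end + END.length());
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, String> context = new HashMap<>();
        context.put("username", "param1");
        context.put("param1", "barcelona");
        // Inner placeholder yields "param1", which is then looked up again.
        System.out.println(render("Your user name is << << username >> >>.", context));
        // prints "Your user name is barcelona."
    }
}
```

Because each pass rewrites the string and rescans it, a value that itself contains a placeholder is expanded too, which is why the substitution limit is there.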
My classes and interfaces:
RenderContext.java
public interface RenderContext {
public String getParameter(String key);
}
MessageRenderer.java
public interface MessageRenderer {
public String render(String s, RenderContext... context);
}
MethodExpressionEvaluator.java
// Implement this interface to evaluate methods such as now() and now_locale()
public interface MethodExpressionEvaluator {
public String evaluate(String[] methodParams, RenderContext... context);
}
AbstractMessageRenderer.java
public abstract class AbstractMessageRenderer implements MessageRenderer {
public static final String DEFAULT_NULL = "###";
public static final String PLACEHOLDER_START_TOKEN = "<<";
public static final String PLACEHOLDER_END_TOKEN = ">>";
protected int lenPlaceholderStartToken = 0;
protected int lenPlaceholderEndToken = 0;
protected String nullToken;
protected String placeholderStartToken;
protected String placeholderEndToken;
protected boolean escape = true;
public AbstractMessageRenderer() {
placeholderStartToken = PLACEHOLDER_START_TOKEN;
placeholderEndToken = PLACEHOLDER_END_TOKEN;
lenPlaceholderStartToken = placeholderStartToken.length();
lenPlaceholderEndToken = placeholderEndToken.length();
nullToken = DEFAULT_NULL;
}
public String getNullToken() {
return nullToken;
}
public void setNullToken(String defaultNull) {
this.nullToken = defaultNull;
}
public String getPlaceholderStartToken() {
return placeholderStartToken;
}
public void setPlaceholderStartToken(String placeholderStartToken) {
this.placeholderStartToken = placeholderStartToken;
lenPlaceholderStartToken = placeholderStartToken.length();
}
public String getPlaceholderEndToken() {
return placeholderEndToken;
}
public void setPlaceholderEndToken(String placeholderEndToken) {
this.placeholderEndToken = placeholderEndToken;
lenPlaceholderEndToken = placeholderEndToken.length();
}
public boolean isEscape() {
return escape;
}
public boolean getEscape() {
return escape;
}
public void setEscape(boolean escape) {
this.escape = escape;
}
public String getParam(String key, RenderContext... context) {
if(context != null)
{
for(RenderContext param:context)
{
if(param != null)
{
String value = param.getParameter(key);
if(!StringUtil.isEmpty(value))
{
return value;
}
}
}
}
return nullToken;
}
public String render(String s, RenderContext... context) {
// handle trivial cases of empty template or no placeholders
if (s == null)
{
Log4j.app.debug("Message is null in template. Cannot render null message.");
return nullToken;
}
if (context == null)
{
Log4j.app.debug("RenderContext is null. Cannot render message with null RenderContext.");
return nullToken;
}
if (s.indexOf(placeholderStartToken) < 0)
{
return s;
}
String msg = nullToken;
try
{
// private int renderTemplate(Renderable r, String src, StringBuffer dst, String nil, int i, String[] marks, StringBuffer end,boolean escapes)
msg = doRender(s, context);
}
catch (Exception e)
{
Log4j.app.error("Exception in rendering template: " + e.getMessage(), e);
return nullToken;
}
return msg;
}
protected abstract String doRender(String s, RenderContext... context);
}
MethodExpressionRenderer.java
public class MethodExpressionRenderer extends AbstractMessageRenderer {
private boolean inSingleQuote = false;
private boolean inDoubleQuote=false;
private int placeholders;
private Stack<String> methodStack;
private String[] endTokens;
private String marker;
private List<String> methodParams;
private String prefix = "&";
public MethodExpressionRenderer() {
super();
methodStack = new Stack<String>();
marker = ",";
endTokens = new String[] { placeholderEndToken, marker, "(", ")" };
methodParams = new ArrayList<String>();
}
public String getPrefix() {
return prefix;
}
public void setPrefix(String prefix) {
this.prefix = prefix;
}
public String getMarker() {
return marker;
}
public void setMarker(String marker) {
this.marker = marker;
endTokens = new String[] { placeholderEndToken, marker };
}
@Override
public void setPlaceholderEndToken(String placeholderEndToken) {
super.setPlaceholderEndToken(placeholderEndToken);
endTokens = new String[] { placeholderEndToken, marker };
}
protected String doRender(String s, RenderContext... context) {
StringBuffer sb = new StringBuffer();
try
{
renderTemplate(s, sb, nullToken, 0, endTokens, null, context);
}
catch (Exception e)
{
Log4j.app.error("Exception in rendering method expression message template: " + e.getMessage(), e);
return nullToken;
}
return sb.toString();
}
private int renderTemplate(String src, StringBuffer dst, String nil, int i, String[] marks, StringBuffer end, RenderContext... context) {
int len = src.length();
while (i < len)
{
char c = src.charAt(i);
if (escape)
{
if (c=='\\')
{
i++;
char ch = src.charAt(i);
if(inSingleQuote)
{
if(ch=='\'')
{
inSingleQuote=false;
}
}
else if(inDoubleQuote)
{
if(ch=='"')
{
inDoubleQuote=false;
}
}
else
{
if(ch=='\'')
{
inSingleQuote=true;
}
else if(ch=='"')
{
inDoubleQuote=true;
}
}
dst.append(ch);
i++;
continue;
}
}
if(inSingleQuote)
{
if(c=='\'')
{
inSingleQuote=false;
}
}
else if(inDoubleQuote)
{
if(c=='"')
{
inDoubleQuote=false;
}
}
else
{
if(c=='\'')
{
inSingleQuote=true;
}
else if(c=='"')
{
inDoubleQuote=true;
}
}
// check for end marker
if (marks != null && !inSingleQuote && !inDoubleQuote)
{
for (int m = 0; m < marks.length; m++)
{
// If one of markers found
if (src.regionMatches(i, marks[m], 0, marks[m].length()))
{
// return marker if required
if (end != null)
{
end.append(marks[m]);
}
return i+marks[m].length();
}
}
}
// check for start of placeholder
if (src.regionMatches(i, placeholderStartToken, i, lenPlaceholderStartToken))
{
synchronized(this)
{
++placeholders;
}
i = renderPlaceholder(src, dst, nil, i, new ArrayList<String>(), context);
continue;
}
// just add plain character
if(c != '\'' && c!= '"')
{
dst.append(c);
}
i++;
}
return i;
}
private int renderPlaceholder(String src, StringBuffer dst, String nil, int i, List<String> params, RenderContext... context){
StringBuffer token = new StringBuffer(); // placeholder token
StringBuffer end = new StringBuffer(); // placeholder end marker
String value;
i = renderTemplate(src, token, nil, i+lenPlaceholderStartToken, endTokens, end);
String sToken = token.toString().trim();
String sEnd = end.toString().trim();
boolean isFunction = sEnd.equals("(");
// This is method name
if(isFunction && placeholders > methodStack.size())
{ // Method
synchronized(this)
{
methodStack.push(sToken); // put method into stack
}
}
else if(!isFunction && (methodStack.size()==0) && sEnd.equals(placeholderEndToken)) // Single template param such as <<param>>
{
value = getParam(sToken, context);
if(value != null)
{
if(value.trim().startsWith(placeholderStartToken))
{
value = render(src, context);
}
dst.append(value);
return i;
}
}
// TODO: Process method parameters to invoke
//.... ?????????
// Found end method token ')'
// Pop method out of stack to invoke
if ( (methodStack.size() >0) && (sEnd.length() == 0 || sEnd.equals(")")))
{
String method = null;
synchronized(this)
{
// Pop method out of stack to invoke
method = methodStack.pop();
--placeholders;
dst.append(invokeMethodEvaluator(method, methodParams.toArray(new String[0]), context));
methodParams.clear();
}
}
return i;
}
// Currently this method just implement to test so it just printout the method name
// and its parameter
// We can register MethodExpressionEvaluator to process
protected String invokeMethodEvaluator(String method, String[] params, RenderContext... context){
StringBuffer result = new StringBuffer();
result.append("[ ")
.append(method)
.append(" ( ");
if(params != null)
{
for(int i=0; i<params.length; i++)
{
result.append(params[i]);
if(i != params.length-1)
{
result.append(" , ");
}
}
}
result.append(" ) ")
.append(" ] ");
return result.toString();
}
}
We can easily register more methods with the renderer. Each method is an object and can be reused. However, I am having trouble resolving nested method parameters. Can anyone advise me on how to process a nested template inside a method parameter before the method is invoked? The relevant place is the line marked TODO. Is my code on the right track?
When you evaluate something like << count( << getTransId() >> ) >> you can either:
perform direct-evaluation as you parse, and push each function onto a stack, so that once you've evaluated getTransId() you pop the stack and use the return value (from the stack) as an argument for count(), or
you can build a parse tree to represent all the function calls that will be made, and then evaluate the parse tree after building it. (Building a tree probably doesn't buy you anything; since you're writing a template engine, there are probably no high-level tree 'optimizations' that you could perform.)
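The first option can be sketched as a small recursive-descent evaluator. This is only an illustration under stated assumptions: DirectEvalSketch and all of its methods are hypothetical names, functions are modelled as java.util.function.Function objects rather than your MethodExpressionEvaluator, and the Java call stack plays the role of the explicit function stack. It handles keys and function calls only, not the nested-key form << << key >> >>.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: every name here is illustrative, not from the question.
final class DirectEvalSketch {
    private final Map<String, Function<List<String>, String>> functions = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();
    private String src;
    private int pos;

    void function(String name, Function<List<String>, String> fn) { functions.put(name, fn); }

    void value(String key, String value) { values.put(key, value); }

    // template := (plain text | placeholder)*
    String render(String template) {
        src = template;
        pos = 0;
        StringBuilder out = new StringBuilder();
        while (pos < src.length()) {
            if (src.startsWith("<<", pos)) {
                out.append(placeholder());
            } else {
                out.append(src.charAt(pos++));
            }
        }
        return out.toString();
    }

    // placeholder := "<<" name [ "(" [ argument { "," argument } ] ")" ] ">>"
    private String placeholder() {
        expect("<<");
        String name = readName();
        String result;
        skipSpaces();
        if (src.startsWith("(", pos)) {
            pos++; // consume "("
            List<String> args = new ArrayList<>();
            skipSpaces();
            while (!src.startsWith(")", pos)) {
                args.add(argument()); // a nested call is fully evaluated here
                skipSpaces();
                if (src.startsWith(",", pos)) {
                    pos++;
                    skipSpaces();
                }
            }
            pos++; // consume ")"
            Function<List<String>, String> fn = functions.get(name);
            if (fn == null) {
                throw new IllegalArgumentException("unknown function: " + name);
            }
            result = fn.apply(args);
        } else {
            result = values.getOrDefault(name, "###"); // "###" mirrors DEFAULT_NULL
        }
        skipSpaces();
        expect(">>");
        return result;
    }

    // argument := placeholder | literal text up to "," or ")"
    private String argument() {
        if (src.startsWith("<<", pos)) {
            return placeholder();
        }
        int start = pos;
        while (pos < src.length() && src.charAt(pos) != ',' && src.charAt(pos) != ')') {
            pos++;
        }
        if (pos == src.length()) {
            throw new IllegalArgumentException("unterminated argument list");
        }
        return src.substring(start, pos).trim();
    }

    private String readName() {
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        return src.substring(start, pos);
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    private void expect(String token) {
        if (!src.startsWith(token, pos)) {
            throw new IllegalArgumentException("expected " + token + " at index " + pos);
        }
        pos += token.length();
    }
}
```

Because each nested placeholder returns its value to the caller before the outer argument list continues, the return value of getTransId() is already available when count() is invoked, which is the pop-and-use step described in the first option.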
An excellent little book I really enjoyed was Language Implementation Patterns by Parr. He walks through building simple to complex languages, and covers decisions like this in some depth. (Yes, he uses the ANTLR parser generator throughout, but your code looks like you're familiar enough with hand-generated parsers that different tools won't be a distraction for you.)
I found the bug and fixed it.
This is my new source:
// AbstractMethodExpressionRenderer.java
public class AbstractMethodExpressionRenderer extends AbstractMessageRenderer {
private boolean inSingleQuote = false;
private boolean inDoubleQuote=false;
private Stack<MethodExpressionDescriptor> functionStack;
private String[] endTokens;
private String marker;
private String prefix = "~";
public AbstractMethodExpressionRenderer() {
super();
functionStack = new Stack<MethodExpressionDescriptor>();
marker = ",";
endTokens = new String[] { placeholderEndToken, "(", ")", };
}
private class MethodExpressionDescriptor {
public List<String> params;
public String function;
public MethodExpressionDescriptor() {
params = new ArrayList<String>();
}
public MethodExpressionDescriptor(String name) {
this();
this.function = name;
}
}
public String getPrefix() {
return prefix;
}
public void setPrefix(String prefix) {
this.prefix = prefix;
}
public String getMarker() {
return marker;
}
public void setMarker(String marker) {
this.marker = marker;
endTokens = new String[] { placeholderEndToken, marker };
}
@Override
public void setPlaceholderEndToken(String placeholderEndToken) {
super.setPlaceholderEndToken(placeholderEndToken);
endTokens = new String[] { placeholderEndToken, marker };
}
protected String doRender(String s, RenderContext... context) {
StringBuffer sb = new StringBuffer();
try
{
renderTemplate(s, sb, nullToken, 0, endTokens, null, context);
}
catch (Exception e)
{
Log4j.app.error("Exception in rendering method expression message template: " + e.getMessage(), e);
return nullToken;
}
return sb.toString();
}
private int renderTemplate(String src, StringBuffer dst, String nil, int i, String[] marks, StringBuffer end, RenderContext... context) {
int len = src.length();
while (i < len)
{
char c = src.charAt(i);
if (escape)
{
if (c=='\\')
{
i++;
char ch = src.charAt(i);
if(inSingleQuote)
{
if(ch=='\'')
{
inSingleQuote=false;
}
}
else if(inDoubleQuote)
{
if(ch=='"')
{
inDoubleQuote=false;
}
}
else
{
if(ch=='\'')
{
inSingleQuote=true;
}
else if(ch=='"')
{
inDoubleQuote=true;
}
}
dst.append(ch);
i++;
continue;
}
}
if(inSingleQuote)
{
if(c=='\'')
{
inSingleQuote=false;
}
}
else if(inDoubleQuote)
{
if(c=='"')
{
inDoubleQuote=false;
}
}
else
{
if(c=='\'')
{
inSingleQuote=true;
}
else if(c=='"')
{
inDoubleQuote=true;
}
}
// check for end marker
if (marks != null && !inSingleQuote && !inDoubleQuote)
{
for (int m = 0; m < marks.length; m++)
{
// If one of markers found
if (src.regionMatches(i, marks[m], 0, marks[m].length()))
{
// return marker if required
if (end != null)
{
end.append(marks[m]);
}
return i+marks[m].length();
}
}
}
// check for start of placeholder
if (src.regionMatches(i, placeholderStartToken, 0, lenPlaceholderStartToken))
{
i = renderPlaceholder(src, dst, nil, i, new ArrayList<String>(), context);
continue;
}
// just add plain character
if(c != '\'' && c!= '"')
{
dst.append(c);
}
i++;
}
return i;
}
/**
* Render a placeholder as follows:
*
* <<key>>: Simple render, key value map
* <<function(<<param1>>, <<param2>>)>> : Function object render
*
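* @param src the template source
* @param dst the buffer that receives the rendered output
* @param nil the token written when a value cannot be resolved
* @param i the index in src at which the placeholder starts
* @param params currently unused
* @param context the render contexts used to look up values
* @return the index in src at which parsing should continue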
*/
private int renderPlaceholder(String src, StringBuffer dst, String nil, int i, List<String> params, RenderContext... context){
StringBuffer token = new StringBuffer(); // placeholder token
StringBuffer end = new StringBuffer(); // placeholder end marker
String value = null;
// Simple key
i = renderTemplate(src, token, nil, i+lenPlaceholderStartToken, endTokens, end, context);
String sToken = token.toString().trim();
String sEnd = end.toString().trim();
// This is method name
if(sEnd.equals("("))
{ // Method
functionStack.add(new MethodExpressionDescriptor(sToken));
}
else // Try to resolve value
{
if(sToken.startsWith(placeholderStartToken))
{
value = render(sToken, context);
}
else if(sToken.startsWith(prefix))
{
if(functionStack.size() > 0)
{
functionStack.peek().params.add(sToken.substring(prefix.length()));
}
return i;
}
else
{
value = getParam(sToken, context);
}
}
if (sEnd.length() == 0 || sEnd.equals(placeholderEndToken))
{
// No method found but found the end of placeholder token
if(functionStack.size() == 0)
{
if(value != null)
{
dst.append(value);
}
else
{
dst.append(nil);
}
}
else
{
functionStack.peek().params.add(value);
}
}
else
{
if(value != null)
{
value = value.trim();
}
if(end.substring(0, 1).equals("(") ||
end.substring(0, 1).equals(marker))
{
// right hand side is remainder of placeholder
StringBuffer tmp = new StringBuffer();
end = new StringBuffer();
i = renderTemplate(src, tmp, nil, i, endTokens, end, context);
}
if(end.substring(0, 1).equals(")"))
{
if ( functionStack.size() > 0 )
{
// Pop method out of stack to invoke
MethodExpressionDescriptor descriptor = functionStack.pop();
if(functionStack.size() > 0 )
{
functionStack.peek().params.add(invokeMethodEvaluator(descriptor.function, descriptor.params.toArray(new String[0]), context));
}
else
{
dst.append(invokeMethodEvaluator(descriptor.function, descriptor.params.toArray(new String[0]), context));
}
end = new StringBuffer();
StringBuffer tmp = new StringBuffer();
i = renderTemplate(src, tmp, nil, i, endTokens, end, context);
}
}
}
return i;
}
protected String invokeMethodEvaluator(String method, String[] params, RenderContext... context){
StringBuffer result = new StringBuffer();
result.append("[ ")
.append(method)
.append(" ( ");
if(params != null)
{
for(int i=0; i<params.length; i++)
{
result.append(params[i]);
if(i != params.length-1)
{
result.append(" , ");
}
}
}
result.append(" ) ")
.append(" ] ");
return result.toString();
}
}
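invokeMethodEvaluator above is still a stub that only prints the method name and its parameters. One possible way to dispatch to registered MethodExpressionEvaluator objects is sketched here. EvaluatorRegistry and its methods are hypothetical names that do not exist in the code above, and the two interfaces are repeated only so that the sketch compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Repeated from the question so that the sketch is self-contained.
interface RenderContext {
    String getParameter(String key);
}

interface MethodExpressionEvaluator {
    String evaluate(String[] methodParams, RenderContext... context);
}

// Hypothetical helper: maps a method name to the evaluator that handles it.
final class EvaluatorRegistry {
    private final Map<String, MethodExpressionEvaluator> evaluators = new HashMap<>();
    private final String nullToken;

    EvaluatorRegistry(String nullToken) {
        this.nullToken = nullToken;
    }

    void register(String name, MethodExpressionEvaluator evaluator) {
        evaluators.put(name, evaluator);
    }

    /** Returns the evaluator's result, or nullToken for an unknown method or a null result. */
    String invoke(String method, String[] params, RenderContext... context) {
        MethodExpressionEvaluator evaluator = evaluators.get(method);
        if (evaluator == null) {
            return nullToken;
        }
        String result = evaluator.evaluate(params, context);
        return result != null ? result : nullToken;
    }
}
```

With something like this in place, invokeMethodEvaluator could delegate to registry.invoke(method, params, context) instead of building the debug string.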