Show message tree like in livejournal (java) - java

I am writing a forum using Spring MVC + Hibernate. Hibernate uses lazy initialization, and to make it work I use OpenSessionInViewInterceptor; it works, so there shouldn't be any problem with lazy initialization.
I am trying to show a message tree like it is done in LiveJournal replies.
In the DB the relevant columns are id, parentId, and text:
mysql> select * from posts;
+----+----------+----------+----------+--------+------------+----------+
| id | threadId | authorId | parentId | text | created | modified |
+----+----------+----------+----------+--------+------------+----------+
| 1 | 5 | NULL | NULL | fda | 2011-11-24 | NULL |
| 2 | 5 | NULL | NULL | aff | 2011-11-24 | NULL |
| 3 | 5 | NULL | NULL | faee | 2011-11-24 | NULL |
| 13 | 6 | NULL | NULL | f52 | 2011-11-26 | NULL |
| 14 | 6 | NULL | 13 | c431 | 2011-11-26 | NULL |
| 15 | 6 | NULL | NULL | c31c13 | 2011-11-26 | NULL |
| 16 | 6 | NULL | 15 | n754 | 2011-11-26 | NULL |
| 23 | 4 | NULL | NULL | v52 | 2011-11-26 | NULL |
| 24 | 4 | NULL | 23 | v53 | 2011-11-26 | NULL |
| 25 | 4 | NULL | NULL | v423 | 2011-11-26 | NULL |
| 26 | 4 | NULL | 24 | v523 | 2011-11-26 | NULL |
| 27 | 4 | NULL | 23 | v253 | 2011-11-26 | NULL |
+----+----------+----------+----------+--------+------------+----------+
POJO class Post:
@Entity
@Table(name="posts")
public class Post{
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
private Integer id;
@ManyToOne(cascade=CascadeType.REFRESH,fetch=FetchType.LAZY)
@JoinColumn(name="threadId")
private Thread thread;
@Column(name="authorId")
private Integer authorId;
@ManyToOne(cascade=CascadeType.REFRESH,fetch=FetchType.LAZY)
@JoinColumn(name="parentId")
private Post parentPost;
@Column(name="text")
private String text;
@Column(name="created")
private Date created;
@Column(name="modified")
private Date modified;
....Many getters and setters....
}
I have written a JSP custom tag:
<custom:tree postList="${posts}"/>
where posts is the list of messages for this thread.
My customTags.tld:
...
<tag>
<description>message tree</description>
<name>tree</name>
<tag-class>forum.tag.MessageTree</tag-class>
<body-content>empty</body-content>
<attribute>
<name>postList</name>
<required>true</required>
<rtexprvalue>true</rtexprvalue>
</attribute>
</tag>
...
And class for this custom tag:
public class MessageTree extends SimpleTagSupport{
private List<Post> postList;
private StringBuffer output = new StringBuffer("<ul>");
public void setPostList(List<Post> postList){
this.postList = postList;
}
public void doTag()throws JspException,IOException{
retrieveOutput(null);
output.append("</ul>");
getJspContext().getOut().print(output.toString());
}
private void retrieveOutput(Integer parentId){
int j = 0;
while(j<postList.size()){
if(parentId==null && postList.get(j).getParentPost()==null){
output.append("<li>Id: "+postList.get(j).getId());
output.append("<ul>");
//retrieveOutput(postList.get(j).getId());
output.append("</ul></li>");
}else{
if(postList.get(j).getParentPost().getId().equals(parentId)){ // !!!Here it throws java.lang.NullPointerException!!!!
output.append("<li>Id: "+postList.get(j).getId());
output.append("<ul>");
retrieveOutput(postList.get(j).getId());
output.append("</ul></li>");
}
}
j++;
}
}
}
And it throws an exception when it checks if(postList.get(j).getParentPost().getId().equals(parentId)):
java.lang.NullPointerException
forum.tag.MessageTree.retrieveOutput(MessageTree.java:31)
forum.tag.MessageTree.retrieveOutput(MessageTree.java:28)
forum.tag.MessageTree.doTag(MessageTree.java:18)
org.apache.jsp.WEB_002dINF.jsp.showThread_jsp._jspx_meth_custom_005ftree_005f0(showThread_jsp.java:457)
org.apache.jsp.WEB_002dINF.jsp.showThread_jsp._jspService(showThread_jsp.java:239)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:433)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:333)
javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
org.springframework.web.servlet.view.InternalResourceView.renderMergedOutputModel(InternalResourceView.java:238)
org.springframework.web.servlet.view.AbstractView.render(AbstractView.java:250)
org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1047)
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:817)
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:644)
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:549)
javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
But when I try, for example,
public void doTag()throws JspException,IOException{
output.append(postList.get(1).getParentPost().getId());
getJspContext().getOut().print(output.toString());
}
It works and it retrieves 23!
Maybe I am doing it completely wrong? What can you advise?
Yes, I have found the error! I have rewritten it a little bit. The problem was that I didn't check in the else clause whether the post has a parent post or not.
private void retrieveOutput(Integer parentId){
int j = 0;
while(j<postList.size()){
if(parentId==null && postList.get(j).getParentPost()==null){
output.append("<li>Id: "+postList.get(j).getId()+"<br/>ParentId: 0<br/>Text: "+postList.get(j).getText()+"<br/>Posted: "+postList.get(j).getCreated()+"<br/>Delete this shit");
output.append("<ul>");
retrieveOutput(postList.get(j).getId());
output.append("</ul></li>");
}else{
if(postList.get(j).getParentPost()!=null && postList.get(j).getParentPost().getId().equals(parentId)){
output.append("<li>Id: "+postList.get(j).getId()+"<br/>ParentId: "+postList.get(j).getParentPost().getId()+"<br/>Text: "+postList.get(j).getText()+"<br/>Posted: "+postList.get(j).getCreated()+"<br/>Delete this shit");
output.append("<ul>");
retrieveOutput(postList.get(j).getId());
output.append("</ul></li>");
}
}
j++;
}
}
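For what it's worth, a slightly simpler variant (a hedged sketch, not the code above: the helper names byParent and render are mine, and it assumes the usual java.util imports) is to index the posts by parent id once and then recurse over that map, which avoids both the repeated scans and the null check on every pass:

private Map<Integer, List<Post>> byParent(List<Post> posts) {
    // Index: parent id (null for top-level posts) -> direct replies.
    Map<Integer, List<Post>> children = new HashMap<Integer, List<Post>>();
    for (Post p : posts) {
        Integer parentId = (p.getParentPost() == null) ? null : p.getParentPost().getId();
        List<Post> bucket = children.get(parentId);
        if (bucket == null) {
            bucket = new ArrayList<Post>();
            children.put(parentId, bucket);
        }
        bucket.add(p);
    }
    return children;
}

private void render(Map<Integer, List<Post>> children, Integer parentId, StringBuilder out) {
    List<Post> level = children.get(parentId);
    if (level == null) {
        return; // no replies at this level
    }
    out.append("<ul>");
    for (Post p : level) {
        out.append("<li>Id: ").append(p.getId());
        render(children, p.getId(), out); // recurse into the replies to this post
        out.append("</li>");
    }
    out.append("</ul>");
}

doTag() would then just call render(byParent(postList), null, out) and print the result.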

I think your problem is with lazy fetching: you can't access a lazily fetched property outside a transaction. Check this link.
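If lazy loading really were the culprit, one hedged workaround (a sketch assuming an HQL query in the DAO layer, with the mapping from the question; threadId is a placeholder variable) would be to initialize the parents inside the transaction with a fetch join:

// Fetch each post's parent eagerly so the JSP never has to trigger lazy loading.
List<Post> posts = session
    .createQuery("from Post p left join fetch p.parentPost where p.thread.id = :threadId")
    .setParameter("threadId", threadId)
    .list();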

Related

Memory leak with hashMap

I have a memory leak problem that I need to resolve.
I have this file, which may help to find the memory leak:
2'777'369'064 (62.72%) [32] 8 class */planning/canvas/shared/serializable/ActionCycleSZ 0x68759f768
|- 2'777'365'536 (62.72%) [256] 35 org/apache/catalina/loader/WebappClassLoader 0x688ce9df8
| |- 2'775'589'272 (62.68%) [48] 1 java/util/HashMap 0x688ceabe0
| | |- 2'775'589'224 (62.68%) [32'784] 3'533 array of java/util/HashMap$Entry 0x689af74c0
| | |- 2'763'509'944 (62.41%) [24] 2 java/util/HashMap$Entry 0x68a0b1f98
| | | |- 2'763'509'744 (62.41%) [40] 1 org/apache/catalina/loader/ResourceEntry 0x68a0b1fb0
| | | | |- 2'763'509'704 (62.41%) [32] 41 class */gwt/server/servlet/TaProjectsSessionManager 0x68653c8e8
| | | | |- 2'763'047'360 (62.4%) [32] 6 class */selfservice/SelfConfigurator 0x6875922a0
| | | | | |- 2'763'047'328 (62.4%) [16] 2 */gwt/server/servlet/TaProjectsSessionManager$1 0x689aee328
| | | | | | |- 2'154'573'968 (48.66%) [160] 30 */impl/HRSessionImpl 0x689ee49f8
| | | | | | | |- 2'138'350'824 (48.29%) [32] 3 java/util/Collections$SynchronizedMap 0x689ee4c70
| | | | | | | | |- 2'138'350'760 (48.29%) [64] 3 org/apache/commons/collections/map/LRUMap 0x689ee5218
| | | | | | | | | |- 2'134'913'368 (48.21%) [32] 2 org/apache/commons/collections/map/AbstractLinkedMap$LinkEntry 0x68a3573d0
| | | | | | | | | |- 3'437'328 (0.08%) [2'064] 121 array of org/apache/commons/collections/map/AbstractHashedMap$HashEntry 0x68a356bc8
| | | | | | | | | |- 16 (0%) [16] 1 org/apache/commons/collections/map/AbstractHashedMap$KeySet 0x69d443088
| | | | | | | | |- 32 (0%) [16] 2 java/util/Collections$SynchronizedSet 0x69d443098
| | | | | | | | |- 2'138'350'824 (48.29%) [32] 3 java/util/Collections$SynchronizedMap 0x689ee4c70
| | | | | | | |- 16'078'096 (0.36%) [104] 19 */impl/Dictionary 0x689ee4ad8
I conclude that the class ActionCycleSZ produces the memory leak.
This is ActionCycleSZ:
public class ActionCycleSZ extends ActionDTO implements IsSerializable {
private CycleSZ bean;
public ActionCycleSZ() {
}
public ActionCycleSZ(Type actionType, CycleSZ bean ) {
super(actionType);
this.bean = bean;
}
public CycleSZ getBean(){
return bean;
}
public void setBean(CycleSZ bean){
this.bean = bean;
}
}
public class CycleSZ implements Serializable{
/**
*
*/
private static final long serialVersionUID = 1L;
String cycleLabel;
Date startDate;
Date endDate;
String startDateDTO;
String endDateDTO;
Integer numlign;
String accumulatedHours;
List<SiteSZ> listOfSites = new LinkedList<SiteSZ>();
//getter and setter
}
public class SiteSZ implements Serializable {
/**
*
*/
private static final long serialVersionUID = 1L;
int week;
String siteLabel;
Date startDate;
Date endDate;
String startHour;
String endHour;
String site;
String time;
String particularSlotTime;
Integer numlign;
DaySZ dayAttribute;
String accumulatedWeekHours;
Map<Util.WeekDays,DaySZ> mapAttributes = new LinkedHashMap<Util.WeekDays,DaySZ>();
boolean workedDay; //Flag for Exceptional Canevas Entry
boolean reposHebdo;
String contratId; //contratId for Exceptional Canevas Entry
}
In every ActionCycleSZ I have just this map: Map<Util.WeekDays,DaySZ> mapAttributes = new LinkedHashMap<Util.WeekDays,DaySZ>();
I think that is the source of the leak. Am I right? I have looked at the code and I don't see anything that looks like a memory leak.
Can anyone help me detect this memory leak, or give me examples of memory leaks caused by a HashMap?
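On the last question, a classic HashMap leak (a generic, hedged illustration, not taken from your application) is a static map that is only ever written to, so everything reachable from its values stays on the heap forever:

import java.util.HashMap;
import java.util.Map;

public class SessionCache {
    // Grows on every request and is never cleared: every stored value
    // (e.g. a CycleSZ and its listOfSites) remains strongly reachable.
    private static final Map<String, CycleSZ> CACHE = new HashMap<String, CycleSZ>();

    public static void remember(String sessionId, CycleSZ cycle) {
        CACHE.put(sessionId, cycle);
    }
}

Your dominator tree points at a synchronized LRUMap held by HRSessionImpl via the webapp classloader; if the entries in that map retain large CycleSZ graphs and are never evicted (or the classloader itself is kept alive across redeploys), the effect is the same.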

Restriction on @OneToMany mapping

I have a Product class with a @OneToMany association to a list of buyers. What I want is that, when I search for a product, the buyer fetch performed through the association uses a null constraint on the final date column of the Buyer table. How can I do this with a list mapping like the one below?
// it would be something like what I need: cri.createCriteria("listaBuyer", "buyer").add(Restrictions.isNull("finalDate"));
Example
Registered data
product code | initial date | final date |
-------------------------------------------------------
1 | 2016-28-07 | 2017-28-07 |
------------------------------------------------------
2 | 2016-10-08 | 2017-28-07 |
------------------------------------------------------
3 | 2017-28-08 | |
-----------------------------------------------------
4 | 2017-30-08 | |
Product Class
public class Product {
@OneToMany(targetEntity=Buyer.class, orphanRemoval=true, cascade={CascadeType.PERSIST,CascadeType.MERGE}, mappedBy="product")
@LazyCollection(LazyCollectionOption.FALSE)
public List<Buyer> getListaBuyer() {
if (listaBuyer == null) {
listaBuyer = new ArrayList<Buyer>();
}
return listaBuyer;
}
}
The criteria I have built:
Criteria cri = getSession().createCriteria(Product.class);
cri.createCriteria("status", "sta");
cri.add(Restrictions.eq("id", Product.getId()));
return cri.list();
Expected outcome
product code | initial date | final date |
-------------------------------------------------------
3 | 2017-28-08 | |
-----------------------------------------------------
4 | 2017-30-08 | |
Returned result
product code | initial date | final date |
-------------------------------------------------------
1 | 2016-28-07 | 2017-28-07 |
------------------------------------------------------
2 | 2016-10-08 | 2017-28-07 |
------------------------------------------------------
3 | 2017-28-08 | |
-----------------------------------------------------
4 | 2017-30-08 | |

How to communicate with 3 tables

Hi, I have a table parent and its fields are:
mysql> select * from parent;
+----+----------+------------+-----------------------------------+---------+------+
| id | category | is_deleted | name | version | cid |
+----+----------+------------+-----------------------------------+---------+------+
| 1 | default | | Front Office | 0 | NULL |
| 2 | default | | Food And Beverage | 0 | NULL |
| 3 | default | | House Keeping | 0 | NULL |
| 4 | default | | General | 0 | NULL |
| 5 | client | | SPA | 0 | NULL |
| 7 | client | | house | 0 | NULL |
| 8 | client | | test | 0 | NULL |
| 9 | client | | ggg | 0 | 1 |
| 10 | client | | dddd | 0 | 1 |
| 11 | client | | test1 | 0 | 1 |
| 12 | client | | java | 0 | 1 |
| 13 | client | | dcfdcddd | 0 | 1 |
| 14 | client | | qqqq | 0 | 1 |
| 15 | client | | nnnnnn | 0 | 1 |
| 16 | client | | category | 0 | 1 |
| 17 | client | | sukant | 0 | 1 |
| 18 | client | | bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb | 0 | 1 |
I have another table parent_question
mysql> select * from parent_question;
+----+------------+---------+-----+------+
| id | is_deleted | version | pid | qid |
+----+------------+---------+-----+------+
| 1 | | 0 | 1 | 1 |
| 2 | | 0 | 1 | 2 |
| 3 | | 0 | 1 | 3 |
| 4 | | 0 | 1 | 4 |
| 5 | | 0 | 1 | 5 |
| 6 | | 0 | 1 | 6 |
| 7 | | 0 | 2 | 7 |
| 8 | | 0 | 2 | 1 |
| 9 | | 0 | 2 | 2 |
| 10 | | 0 | 2 | 8 |
| 11 | | 0 | 3 | 9 |
| 12 | | 0 | 3 | 1 |
| 13 | | 0 | 3 | 10 |
| 14 | | 0 | 3 | 11 |
| 15 | | 0 | 4 | 12 |
| 16 | | 0 | 1 | 1 |
| 17 | | 0 | 1 | 2 |
| 18 | | 0 | 1 | 3 |
| 19 | | 0 | 5 | 13 |
| 20 | | 0 | 2 | 7 |
| 21 | | 0 | 2 | 2 |
| 22 | | 0 | 1 | 14 |
| 23 | | 0 | 1 | 15 |
| 24 | | 0 | 1 | 16 |
| 25 | | 1 | 1 | 17 |
| 26 | | 0 | 1 | 21 |
| 27 | | 0 | 2 | 22 |
| 28 | | 0 | 13 | 23 |
| 29 | | 0 | 9 | 24 |
| 30 | | 0 | 12 | 25 |
| 31 | | 0 | 12 | 26 |
| 32 | | 0 | 12 | 27 |
| 33 | | 0 | 12 | 28 |
| 34 | | 0 | 14 | 29 |
| 35 | | 0 | 15 | 30 |
| 36 | | 0 | 10 | 31 |
| 37 | | 0 | 4 | 32 |
| 38 | | 0 | 16 | 33 |
| 39 | | 0 | 10 | 34 |
| 40 | | 0 | 3 | 35 |
| 41 | | 0 | 17 | 36 |
| 42 | | 0 | 1 | 37 |
| 43 | | 0 | 1 | 38 |
| 44 | | 0 | 18 | 39 |
| 45 | | 0 | 18 | 40 |
+----+------------+---------+-----+------+
45 rows in set (0.00 sec)
And this is my question table:
mysql> select * from question;
----+----------+------------+------------------------------------------------------------+--
id | category | is_deleted | question | v
----+----------+------------+------------------------------------------------------------+--
1 | default | | Staff Courtesy |
2 | default | | Staff Response |
3 | default | | Check In |
4 | default | | Check Out |
5 | default | | Travel Desk |
6 | default | | Door Man |
7 | default | | Restaurant Ambiance |
8 | default | | Quality Of Food |
9 | default | | Cleanliness Of The Room |
10 | default | | Room Size |
11 | default | | Room Amenities |
12 | default | | Any Other Comments ? |
13 | client | | How is Food? |
14 | client | | test question |
15 | client | | test1 |
16 | client | | test2 |
17 | client | | test2 |
18 | client | | test2 |
19 | client | | working |
20 | client | | sss |
21 | client | | ggggg |
22 | client | | this is new question |
23 | client | | dddddddddddd |
24 | client | | ggggggggggggggggg |
25 | client | | what is a class? |
26 | client | | what is inheritance |
27 | client | | what is an object |
28 | client | | what is an abstract class? |
29 | client | | qqqq |
30 | client | | nnnn question |
31 | client | | add some |
32 | client | | general question |
33 | client | | category question |
34 | client | | hhhhhhhh |
35 | client | | this is hos |
36 | client | | gggg |
37 | client | | dddd |
38 | client | | ddddd |
39 | client | | bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb |
40 | client | | ggg |
----+----------+------------+------------------------------------------------------------+--
What I know: I have the pid from the parent_question table.
What I want: the question from the question table.
For example, if I am asked to find the questions whose pid is 18, then from the parent_question table I can see that qid is 39 and 40, and from the question table 39 refers to bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb and 40 refers to ggg.
What I tried:
String queryString="SELECT distinct q FROM Question q , ParentQuestion pq ,Parent p where pq.qid.id = q.id and p.id = pq.pid.id and p.category = 'default' AND p.id = "+pid;
Query query=entityManagerUtil.getQuery(queryString);
List questionsList = query.getResultList();
return questionsList;
But it did not work; I mean I get nothing in the list. Can anybody point out my mistake?
Question entity class:
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Version;
import javax.validation.constraints.Size;
import org.springframework.beans.factory.annotation.Configurable;
@Configurable
@Entity
public class Question {
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name = "id")
private Long id;
@Version
@Column(name = "version")
private Integer version;
private String question;
private String category;
private boolean isDeleted;
public boolean isDeleted() {
return isDeleted;
}
public void setDeleted(boolean isDeleted) {
this.isDeleted = isDeleted;
}
public Long getId() {
return id;
}
public void setId(Long id) {
this.id = id;
}
public Integer getVersion() {
return version;
}
public void setVersion(Integer version) {
this.version = version;
}
public String getQuestion() {
return question;
}
public void setQuestion(String question) {
this.question = question;
}
public String getCategory() {
return category;
}
public void setCategory(String category) {
this.category = category;
}
}
Parent entity class:
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.Version;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;
import org.springframework.beans.factory.annotation.Configurable;
@Configurable
@Entity
public class Parent {
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name = "id")
private Long id;
@Version
@Column(name = "version")
private Integer version;
@ManyToOne
private Client cid;
public Client getCid() {
return cid;
}
public void setCid(Client cid) {
this.cid = cid;
}
private boolean isDeleted;
/* @ManyToOne
private Client cid;
public Client getCid() {
return cid;
}
public void setCid(Client cid) {
this.cid = cid;
}*/
public boolean isDeleted() {
return isDeleted;
}
public void setDeleted(boolean isDeleted) {
this.isDeleted = isDeleted;
}
private String name;
private String category;
public Long getId() {
return id;
}
public void setId(Long id) {
this.id = id;
}
public Integer getVersion() {
return version;
}
public void setVersion(Integer version) {
this.version = version;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getCategory() {
return category;
}
public void setCategory(String category) {
this.category = category;
}
}
Commenters are confused by your classes, and this is a good indication that you might need to think about your design. suninsky has a good point that you may not need an entity class called ParentQuestion (unless ParentQuestion has extra data about the relationship, of course). Here are some typical questions I would ask.
Does every Question have a Parent? If so, then there should probably be a parent property on your Question class, mapped as @ManyToOne.
Does every Parent object have a set of Questions? If yes, then the Parent object should probably have a property named questions, the type of which is some kind of collection of Question objects (see the sketch below).
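A hedged sketch of that second point, mapping the existing parent_question join table directly so no ParentQuestion entity is needed (the pid/qid column names are taken from the tables shown above; treat this as a sketch rather than a drop-in fix):

@Entity
public class Parent {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    // The questions attached to this parent, read straight from parent_question.
    @ManyToMany
    @JoinTable(name = "parent_question",
               joinColumns = @JoinColumn(name = "pid"),
               inverseJoinColumns = @JoinColumn(name = "qid"))
    private List<Question> questions = new ArrayList<Question>();

    public List<Question> getQuestions() {
        return questions;
    }
}

With that in place, the questions for pid 18 are simply parent.getQuestions(), or in JPQL: select q from Parent p join p.questions q where p.id = :pid.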

Java code in Hadoop

I am running a map-only job in Hadoop. The data set is a set of HTML pages in a single file (returned by a crawler).
The mapper code is written in Java, and I am using JSoup to parse. What I want as my output is a key that has both the content of the title tag and the content of a meta tag. Ideally I should get 1592 map output records; I am getting 3184.
The concatenation I attempt with this line of code is not happening:
String MN_Job = (jobT + "\t" + jobsDetail);
What I get instead is each of these separately, hence double the number of outputs. What am I doing wrong here?
public class JobsDataMapper extends Mapper<LongWritable, Text, Text, Text> {
private Text keytext = new Text();
private Text valuetext = new Text();
@Override
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String line = value.toString();
Document doc = Jsoup.parse(line);
Elements desc = doc.select("head title, meta[name=twitter:description]");
for (Element jobhtml : desc) {
Elements title = jobhtml.select("title");
String jobT = "";
for (Element titlehtml : title) {
jobT = titlehtml.text();
}
Elements meta = jobhtml.select("meta[name=twitter:description]");
String jobsDetail ="";
for (Element metahtml : meta) {
String content = metahtml.attr("content");
String content1 = content.replaceAll("\\p{Punct}+", " ");
jobsDetail = content1.replaceAll(" (?i)a | (?i)able | (?i)about | (?i)across | (?i)after | (?i)all | (?i)almost | (?i)also | (?i)am | (?i)among | (?i)an | (?i)and | (?i)any | (?i)are | (?i)as | (?i)at | (?i)be | (?i)because | (?i)been | (?i)but | (?i)by | (?i)can | (?i)cannot | (?i)could | (?i)dear | (?i)did | (?i)do | (?i)does | (?i)either | (?i)else | (?i)ever | (?i)every | (?i)for | (?i)from | (?i)get | (?i)got | (?i)had | (?i)has | (?i)have | (?i)he | (?i)her | (?i)hers | (?i)him | (?i)his | (?i)how | (?i)however | (?i)i | (?i)if | (?i)in | (?i)into | (?i)is | (?i)it | (?i)its | (?i)just | (?i)least | (?i)let | (?i)like | (?i)likely | (?i)may | (?i)me | (?i)might | (?i)most | (?i)must | (?i)my | (?i)neither | (?i)no | (?i)nor | (?i)not | (?i)nbsp | (?i)of | (?i)off | (?i)often | (?i)on | (?i)only | (?i)or | (?i)other | (?i)our | (?i)own | (?i)rather | (?i)said | (?i)say | (?i)says | (?i)she | (?i)should | (?i)since | (?i)so | (?i)some | (?i)than | (?i)that | (?i)the | (?i)their | (?i)them | (?i)then | (?i)there | (?i)these | (?i)they | (?i)this | (?i)tis | (?i)to | (?i)too | (?i)twas | (?i)us | (?i)wants | (?i)was | (?i)we | (?i)were | (?i)what | (?i)when | (?i)where | (?i)which | (?i)while | (?i)who | (?i)whom | (?i)why | (?i)will | (?i)with | (?i)would | (?i)yet | (?i)you | (?i)your "," ");
}
String IT_Job = (jobT + "\t" + jobsDetail);
keytext.set(IT_Job) ;
valuetext.set("JobDetail");
context.write( keytext, valuetext );
}
}
}
Edit: I know what the problem is. But the thing is that the solution might not be obvious in MapReduce. You might have to write your custom RecordReader. Let me explain the problem.
In your code you read line by line. Then you apply this to the line you read:
Elements desc = doc.select("head title, meta[name=twitter:description]");
But evidently, a given line might contain only a title or only a <meta name=twitter:description> tag. So you read one of those and store it; the other one remains blank. At any one time, only one of your variables, jobT or jobsDetail, has any data. So for the code snippet:
String IT_Job = (jobT + "\t" + jobsDetail);
one time the first one is blank, and the next time the other one is blank. So if you are expecting n records, you get 2n records. Similarly, if you attempt to extract three fields, you should get 3n records. You can test this theory by extracting another field and then checking whether you get three times the expected number of records.
If the theory turns out to be correct, you might want to delimit the webpages you extract with a specific delimiter string. Then you want to write a custom RecordReader which will read one html file at a time according to the delimiter and then process the entire html file at once. That way you'll get the title and the meta tags together.
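One hedged alternative to writing a RecordReader completely from scratch (assuming a reasonably recent Hadoop, and that the crawler can write a marker string such as <!--PAGE--> between pages) is to change the record delimiter of TextInputFormat in the job driver, so that each call to map() receives one whole page instead of one line:

// In the job driver (org.apache.hadoop.mapreduce API); <!--PAGE--> is a made-up marker.
Configuration conf = new Configuration();
conf.set("textinputformat.record.delimiter", "<!--PAGE-->");

Job job = Job.getInstance(conf, "jobs-data");
job.setInputFormatClass(TextInputFormat.class);
// ... the rest of the usual job setup ...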
Just by looking at the numbers: 3184 / 2 = 1592.
I think that your file is just duplicated in the input folder. I can't tell for sure, because you have not shown the code that submits the job, but maybe you can verify it with a simple:
bin/hadoop fs -ls /your/input_path
When submitting, either make sure that there is just the single file in there, or just reference the single file in your submission logic.
I made changes to the original code, removing the loops that were not necessary. What was happening in the older code was that when there was a title in the record, it was output, and later when there was content, it was output as well. So there were two writes per HTML file.
public class JobsDataMapper extends Mapper<LongWritable, Text, Text, Text> {
private Text keytext = new Text();
private Text valuetext = new Text();
private String jobT = new String();
private String jobName= new String();
@Override
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String line = value.toString();
Document doc = Jsoup.parse(line);
Elements desc = doc.select("head title, meta[name=twitter:description]");
for (Element jobhtml : desc){
Elements title = jobhtml.select("title");
String jobTT = title.text();
jobT =jobTT ;
if (jobT.length()> 0){
jobName=jobTT;
}
Elements meta = jobhtml.select("meta[name=twitter:description]");
String jobsDetail ="";
String content = meta.attr("content");
String content1 = content.replaceAll("\\p{Punct}+", " ");
jobsDetail = content1.toLowerCase();
jobsDetail = content1.replaceAll(" a| able | about | across | after | all | almost | also | am | among | an | and | any | are | as | at | be| because | been | but | by | can | cannot | could | dear | did | do | does | either | else | ever | every | for | from | get | got | had | has | have | he | her | hers | him | his | how | however | i | if | in | into | is | it | its | just | least | let | like | likely | may | me | might | most | must | my | neither | no | nor | not | nbsp | of | off | often | on | only | or | other | our | own | rather | said | say | says | she | should | since | so | some | than | that | the | their | them | then | there | these | they | this | tis | to | too | twas | us | wants | was | we | were | what | when | where | which | while | who | whom | why | will | with | would | yet | you | your "," ");
if (jobsDetail.length()>0) {
String MN_Job = (jobName+ "\t" + jobsDetail);
keytext.set(MN_Job) ;
valuetext.set("JobInIT");
context.write( keytext, valuetext );
}
}
}
}

Tomcat Connection pool creating too many connections, stuck in sleep mode

I'm using Tomcat 6.0.29 with Tomcat 7's connection pool and MySQL. Testing my application, it doesn't reuse anything from the pool but ends up creating a new pool each time, until eventually I cannot use the database because there are hundreds of sleeping connections, even though the max active size for the pool is set to 20.
See here for reference:
+----+------+-----------------+--------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------------+--------+---------+------+-------+------------------+
| 2 | root | localhost:51877 | dbname | Sleep | 9 | | NULL |
| 4 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 5 | root | localhost:49213 | dbname | Sleep | 21 | | NULL |
| 6 | root | localhost:53492 | dbname | Sleep | 21 | | NULL |
| 7 | root | localhost:46012 | dbname | Sleep | 21 | | NULL |
| 8 | root | localhost:34964 | dbname | Sleep | 21 | | NULL |
| 9 | root | localhost:52728 | dbname | Sleep | 21 | | NULL |
| 10 | root | localhost:43782 | dbname | Sleep | 21 | | NULL |
| 11 | root | localhost:38468 | dbname | Sleep | 21 | | NULL |
| 12 | root | localhost:48021 | dbname | Sleep | 21 | | NULL |
| 13 | root | localhost:54854 | dbname | Sleep | 21 | | NULL |
| 14 | root | localhost:41520 | dbname | Sleep | 21 | | NULL |
| 15 | root | localhost:38112 | dbname | Sleep | 13 | | NULL |
| 16 | root | localhost:39168 | dbname | Sleep | 13 | | NULL |
| 17 | root | localhost:40427 | dbname | Sleep | 13 | | NULL |
| 18 | root | localhost:58179 | dbname | Sleep | 13 | | NULL |
| 19 | root | localhost:40957 | dbname | Sleep | 13 | | NULL |
| 20 | root | localhost:45567 | dbname | Sleep | 13 | | NULL |
| 21 | root | localhost:48314 | dbname | Sleep | 13 | | NULL |
| 22 | root | localhost:34546 | dbname | Sleep | 13 | | NULL |
| 23 | root | localhost:44928 | dbname | Sleep | 13 | | NULL |
| 24 | root | localhost:57320 | dbname | Sleep | 13 | | NULL |
| 25 | root | localhost:54643 | dbname | Sleep | 29 | | NULL |
| 26 | root | localhost:49809 | dbname | Sleep | 29 | | NULL |
| 27 | root | localhost:60993 | dbname | Sleep | 29 | | NULL |
| 28 | root | localhost:36676 | dbname | Sleep | 29 | | NULL |
| 29 | root | localhost:53574 | dbname | Sleep | 29 | | NULL |
| 30 | root | localhost:45402 | dbname | Sleep | 29 | | NULL |
| 31 | root | localhost:37632 | dbname | Sleep | 29 | | NULL |
| 32 | root | localhost:56561 | dbname | Sleep | 29 | | NULL |
| 33 | root | localhost:34261 | dbname | Sleep | 29 | | NULL |
| 34 | root | localhost:55221 | dbname | Sleep | 29 | | NULL |
| 35 | root | localhost:39613 | dbname | Sleep | 15 | | NULL |
| 36 | root | localhost:52908 | dbname | Sleep | 15 | | NULL |
| 37 | root | localhost:56401 | dbname | Sleep | 15 | | NULL |
| 38 | root | localhost:44446 | dbname | Sleep | 15 | | NULL |
| 39 | root | localhost:57567 | dbname | Sleep | 15 | | NULL |
| 40 | root | localhost:56445 | dbname | Sleep | 15 | | NULL |
| 41 | root | localhost:39616 | dbname | Sleep | 15 | | NULL |
| 42 | root | localhost:49197 | dbname | Sleep | 15 | | NULL |
| 43 | root | localhost:59916 | dbname | Sleep | 15 | | NULL |
| 44 | root | localhost:37165 | dbname | Sleep | 15 | | NULL |
| 45 | root | localhost:45649 | dbname | Sleep | 1 | | NULL |
| 46 | root | localhost:55397 | dbname | Sleep | 1 | | NULL |
| 47 | root | localhost:34322 | dbname | Sleep | 1 | | NULL |
| 48 | root | localhost:54387 | dbname | Sleep | 1 | | NULL |
| 49 | root | localhost:55147 | dbname | Sleep | 1 | | NULL |
| 50 | root | localhost:47280 | dbname | Sleep | 1 | | NULL |
| 51 | root | localhost:56856 | dbname | Sleep | 1 | | NULL |
| 52 | root | localhost:58369 | dbname | Sleep | 1 | | NULL |
| 53 | root | localhost:33712 | dbname | Sleep | 1 | | NULL |
| 54 | root | localhost:44315 | dbname | Sleep | 1 | | NULL |
| 55 | root | localhost:54649 | dbname | Sleep | 14 | | NULL |
| 56 | root | localhost:41202 | dbname | Sleep | 14 | | NULL |
| 57 | root | localhost:59393 | dbname | Sleep | 14 | | NULL |
| 58 | root | localhost:38304 | dbname | Sleep | 14 | | NULL |
| 59 | root | localhost:34548 | dbname | Sleep | 14 | | NULL |
| 60 | root | localhost:49567 | dbname | Sleep | 14 | | NULL |
| 61 | root | localhost:48077 | dbname | Sleep | 14 | | NULL |
| 62 | root | localhost:48586 | dbname | Sleep | 14 | | NULL |
| 63 | root | localhost:45308 | dbname | Sleep | 14 | | NULL |
| 64 | root | localhost:43169 | dbname | Sleep | 14 | | NULL |
It creates exactly 10 connections for each request, which matches the minIdle and initialSize attributes seen below.
Here is the sample test code embedded in a JSP page. It is not the code in my application; it is just used to see whether the issue was with my code, but the problem still persisted.
Context envCtx;
envCtx = (Context) new InitialContext().lookup("java:comp/env");
DataSource datasource = (DataSource) envCtx.lookup("jdbc/dbname");
Connection con = null;
try {
con = datasource.getConnection();
Statement st = con.createStatement();
ResultSet rs = st.executeQuery("select * from UserAccount");
int cnt = 1;
while (rs.next()) {
out.println((cnt++)+". Token:" +rs.getString("UserToken")+
" FirstName:"+rs.getString("FirstName")+" LastName:"+rs.getString("LastName"));
}
rs.close();
st.close();
} finally {
if (con!=null) try {con.close();}catch (Exception ignore) {}
}
Here is my context.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<Context>
<Resource name="jdbc/dbname"
auth="Container"
type="javax.sql.DataSource"
factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
testWhileIdle="true"
testOnBorrow="true"
testOnReturn="false"
validationQuery="SELECT 1"
validationInterval="30000"
timeBetweenEvictionRunsMillis="30000"
maxActive="20"
minIdle="10"
maxWait="10000"
initialSize="10"
removeAbandonedTimeout="60"
removeAbandoned="true"
logAbandoned="true"
minEvictableIdleTimeMillis="30000"
jmxEnabled="true"
jdbcInterceptors=
"org.apache.tomcat.jdbc.pool.interceptor.ConnectionState;org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer"
username=""
password=""
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/dbname?autoReconnect=true&useUnicode=true&characterEncoding=utf8"/>
<WatchedResource>WEB-INF/web.xml</WatchedResource>
<WatchedResource>META-INF/context.xml</WatchedResource>
</Context>
I'm sure I could set removeAbandonedTimeout to a low number and it would purge all these sleeping connections, but that wouldn't fix the real problem, would it? Does anyone know what I'm doing wrong? Thank you very much.
I don't have an environment to test this in at the moment; however, I believe that you should be closing your Connection, Statement, and ResultSet after each query. If any of these leak, it could leave the Connection hanging in an idle (but not necessarily returned-to-the-pool) state.
The Connection object you receive should actually be a sort of proxy from the pooling layer; calling close on it releases your "reservation" on that connection and returns it to the pool. (It will not necessarily close the underlying, actual database connection.)
Because they can remain open (and usually will), unclosed Statements or ResultSets could be interpreted by the pool layer as an indication that the connection is still “busy.”
You may be able to inspect the Connection object at run time (a debugger makes this easy) to identify its state and confirm this.
For simplicity (…) we used the following nasty little routine in the finally blocks after every database connection call: … finally { closeAll (rs, st, con); }, ensuring that they would fall out of context immediately.
/**
* Close a bunch of things carefully, ignoring exceptions. The
* “things” supported, thus far, are:
* <ul>
* <li>JDBC ResultSet</li>
* <li>JDBC Statement</li>
* <li>JDBC Connection</li>
* <li>Lock:s</li>
* </ul>
* <p>
* This is mostly meant for “finally” clauses.
*
* @param things A set of SQL statements, result sets, and database
* connections
*/
public static void closeAll (final Object... things) {
for (final Object thing : things) {
if (null != thing) {
try {
if (thing instanceof ResultSet) {
try {
((ResultSet) thing).close ();
} catch (final SQLException e) {
/* No Op */
}
}
if (thing instanceof Statement) {
try {
((Statement) thing).close ();
} catch (final SQLException e) {
/* No Op */
}
}
if (thing instanceof Connection) {
try {
((Connection) thing).close ();
} catch (final SQLException e) {
/* No Op */
}
}
if (thing instanceof Lock) {
try {
((Lock) thing).unlock ();
} catch (final IllegalMonitorStateException e) {
/* No Op */
}
}
} catch (final RuntimeException e) {
/* No Op */
}
}
}
}
This was just syntactic sugar to ensure that nobody forgot to put in the longer, uglier stanza of if (null != con) { try { con.close () } catch (SQLException e) {} } (usually repeated three times for ResultSet, Statement, and Connection); and removed the "visual noise" of what our formatter would turn into a full screen of incidental cleanup code on every block of code that touched the database.
(The Lock support in there was for some related, but nasty, deadlock states on potential exceptions, that didn't have much to do with the database at all, but we used in a similar way to reduce the line noise in some thread-synchronization code. This is from an MMO server that might have 4,000 active threads at a time trying to manipulate game objects and SQL tables.)
Look into the maxAge property of the connection pool. (I noticed you didn't have it set.)
maxAge is
Time in milliseconds to keep this connection. When a connection is
returned to the pool, the pool will check to see if the now -
time-when-connected > maxAge has been reached, and if so, it closes
the connection rather than returning it to the pool. The default value
is 0, which implies that connections will be left open and no age
check will be done upon returning the connection to the pool. [source]
Basically this allows your sleeping threads to be recovered and should solve your problem.
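In the context.xml above, that would be one extra attribute on the Resource element, for example (a hedged illustration; the 60-second value is arbitrary):

maxAge="60000"

With that set, a connection handed back to the pool after being open for more than 60 seconds is closed instead of being reused, so stale sleeping connections are cleaned up over time.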
Perhaps this note from the DBCP connection pool docs may be the answer:
NOTE: If maxIdle is set too low on heavily loaded systems it is possible you will see connections being closed and almost immediately new connections being opened. This is a result of the active threads momentarily closing connections faster than they are opening them, causing the number of idle connections to rise above maxIdle. The best value for maxIdle for heavily loaded system will vary but the default is a good starting point.
Perhaps maxIdle should equal maxActive + minIdle for your system.
A short note on your code: not only the Connection, but the ResultSet and Statement should be closed in the finally block as well. The method given by BRPocock should work fine.
But that is not the actual reason for your 10 connections per request! The reason you get 10 connections per request is that you have set minIdle to 10, meaning that you force each DataSource to open 10 connections when you create it. (Try setting minIdle to 5 and you will see 5 connections per request.)
The problem in your case is that every time you make a request, you create a new DataSource:
DataSource datasource = (DataSource) envCtx.lookup("jdbc/dbname");
I'm not sure exactly how the lookup works, but given your processlist from MySQL I'm pretty convinced that you create a new datasource for every request. If you have a Java servlet, you should create the DataSource in the init() method of your main servlet and get connections from it there.
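A minimal sketch of that approach (the servlet class name is hypothetical; the JNDI name is the jdbc/dbname resource from the context.xml above):

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.sql.DataSource;

public class MainServlet extends HttpServlet {

    // Looked up once when the servlet is initialized, then reused for every request.
    private DataSource dataSource;

    @Override
    public void init() throws ServletException {
        try {
            Context envCtx = (Context) new InitialContext().lookup("java:comp/env");
            dataSource = (DataSource) envCtx.lookup("jdbc/dbname");
        } catch (NamingException e) {
            throw new ServletException("Could not look up jdbc/dbname", e);
        }
    }

    // Request-handling methods then call dataSource.getConnection() and close
    // the connection (plus Statement and ResultSet) in a finally block, as shown earlier.
}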
In my case I did something else, because I have multiple DataSources (multiple databases) I use the following code to get my datasource:
private DataSource getDataSource(String db, String user, String pass)
{
for(Map.Entry<String, DataSource> entry : datasources.entrySet())
{
DataSource ds = entry.getValue();
if(db.equals(ds.getPoolProperties().getUrl()))
{
return ds;
}
}
System.out.println("NEW DATASOURCE CREATED ON REQUEST: " + db);
DataSource ds = new DataSource(initPoolProperties(db, user, pass));
datasources.put(db, ds);
return ds;
}
The lookup relies on an equals comparison, which is not really fast, but it works. I just keep a global HashMap containing my datasources, and if I request a datasource that does not exist yet, I create a new one. I know this works well because in the logs I see the "NEW DATASOURCE CREATED ON REQUEST: dbname" message only once per database, even when multiple clients use the same datasource.
You should try a connection provider: create a class that holds your datasource, declared as static, instead of looking it up on every call. The same goes for your InitialContext. Maybe the problem is that you create a new instance each time.
I had this problem because I was using Hibernate and failed to annotate some of my methods with @Transactional. The connections were never returned to the pool.
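For example (a hedged sketch; UserAccountService, UserAccountDao and UserAccount are hypothetical names, assuming Spring-managed transactions):

import java.util.List;
import org.springframework.transaction.annotation.Transactional;

public class UserAccountService {

    private final UserAccountDao userDao; // hypothetical DAO

    public UserAccountService(UserAccountDao userDao) {
        this.userDao = userDao;
    }

    // Without @Transactional here, the Hibernate session (and its underlying JDBC
    // connection) was never closed, so connections never went back to the pool.
    @Transactional
    public List<UserAccount> findAll() {
        return userDao.findAll();
    }
}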
This happens when your application is reloaded without its resources being killed, so your application context resource is still alive. There is no way to solve this unless you delete the /Catalina/localhost/.xml file and put it back, or restart the service more often with: service tomcat7 restart
NOTE: Nothing is wrong with your code, and nothing is wrong with your configuration.
Cheers~
