C4.5 uses the "Gain Ratio" measure: Information Gain divided by SplitInfo (GainRATIO = Gain / SplitINFO).
Step 1: Calculate entropy of the target.
Step 2: Split the dataset on each attribute and calculate the GainRATIO of each split.
Step 3: Choose the attribute with the largest GainRATIO as the decision node.
Step 4: Run the C4.5 algorithm recursively on each non-leaf branch until all data is classified.
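The four steps above can be sketched as a small recursive routine. This is a minimal sketch, not C4.5's full implementation: continuous-attribute thresholds, missing values, and pruning are omitted, and the attribute names are placeholders.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """Information gain divided by split information for one attribute."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    gain = entropy(labels) - cond
    return gain / split_info if split_info > 0 else 0.0

def build_tree(rows, labels, attrs):
    """Steps 1-4: pick the attribute with the largest gain ratio, recurse."""
    if len(set(labels)) == 1 or not attrs:           # all data classified
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    node = {}
    for value in {row[best] for row in rows}:
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node[(best, value)] = build_tree([rows[i] for i in idx],
                                         [labels[i] for i in idx],
                                         [a for a in attrs if a != best])
    return node
```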
Step 1:
E(travel promotion) = E(4/10, 6/10) = -(0.4 log2(0.4)) - (0.6 log2(0.6)) = 0.971
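A quick check of this arithmetic in Python:

```python
import math

# Step 1: entropy of the target, E(4/10, 6/10)
E = -(0.4 * math.log2(0.4)) - (0.6 * math.log2(0.6))  # ≈ 0.971
```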
Step 2:
Gender | Yes | No |
Male | 3 | 2 |
Female | 3 | 2 |
E(Interest, Gender) = P(Male)*E(3/5, 2/5) + P(Female)*E(3/5, 2/5)
= 5/10*(-(0.6 log2(0.6)) - (0.4 log2(0.4))) + 5/10*(-(0.6 log2(0.6)) - (0.4 log2(0.4))) = 0.971
Gain(Original, Gender) = 0.971 - 0.971 = 0
SplitINFO(Interest, Gender) = -(5/10 log2(5/10)) - (5/10 log2(5/10)) = 1
GainRATIO(Interest, Gender) = 0/1 = 0
Age | Yes | No |
21-40 | 2 | 1 |
41-60 | 3 | 1 |
> 60 | 2 | 1 |
E(Interest, Age) = P(21-40)*E(2/3, 1/3) + P(41-60)*E(3/4, 1/4) + P(> 60)*E(2/3, 1/3)
= 3/10*0.9183 + 4/10*0.8113 + 3/10*0.9183 = 0.876
Gain(Original, Age) = 0.971 - 0.876 = 0.095
SplitINFO(Interest, Age) = -(3/10 log2(3/10)) - (4/10 log2(4/10)) - (3/10 log2(3/10)) = 1.571
GainRATIO(Interest, Age) = 0.095/1.571 = 0.06
Salary | Yes | No |
10001-30000 | 1 | 3 |
30001-50000 | 3 | 1 |
> 50001 | 2 | 0 |
E(Interest, Salary) = P(10001-30000)*E(1/4, 3/4) + P(30001-50000)*E(3/4, 1/4) + P(> 50001)*E(2/2, 0/2)
= 4/10*0.811 + 4/10*0.811 + 2/10*0 = 0.649
Gain(Original, Salary) = 0.971 - 0.649 = 0.322
SplitINFO(Interest, Salary) = -(4/10 log2(4/10)) - (4/10 log2(4/10)) - (2/10 log2(2/10)) = 1.522
GainRATIO(Interest, Salary) = 0.322/1.522 = 0.212
Status | Yes | No |
Married | 4 | 1 |
Single | 2 | 3 |
E(Interest,Status) = P(Married)*E(4/5, 1/5) + P(Single)*E(2/5, 3/5)
= 5/10*0.722 + 5/10*0.971 = 0.847
Gain(Original,Status) = 0.971 - 0.847 = 0.124
SplitINFO(Interest,Status) = -(5/10 log2(5/10)) - (5/10 log2(5/10)) = 1
GainRATIO(Interest,Status) = 0.124/1 = 0.124
Number of times travel per year | Yes | No |
1-3 | 3 | 4 |
4-5 | 3 | 0 |
E(Interest, travel per year) = P(1-3)*E(3/7, 4/7) + P(4-5)*E(3/3, 0/3)
= 7/10*0.985 + 3/10*0 = 0.69
Gain(Original, travel per year) = 0.971 - 0.69 = 0.281
SplitINFO(Interest, travel per year) = -(7/10 log2(7/10)) - (3/10 log2(3/10)) = 0.881
GainRATIO(Interest, travel per year) = 0.281/0.881 = 0.319
Step 3: Number of times travel per year has the largest GainRATIO, so it becomes the root node.
Step 4: Next we consider only the records where number of times travel per year = 1-3.
Gender | Age | Salary | Status | Interest in travel promotion or not |
Male | 21-40 | 30001-50000 | Single | No |
Female | > 60 | 10001-30000 | Single | No |
Male | 41-60 | > 50001 | Single | Yes |
Female | 21-40 | > 50001 | Single | Yes |
Male | > 60 | 10001-30000 | Single | No |
Female | 41-60 | 10001-30000 | Married | No |
Male | > 60 | 30001-50000 | Married | Yes |
Step 5: Calculate the GainRATIO of each remaining attribute on the subset with travel per year = 1-3.
E(Interest[1-3]) = E(3/7, 4/7) = -(3/7 log2(3/7)) - (4/7 log2(4/7)) = 0.985
Gender | Yes | No |
Male | 2 | 2 |
Female | 1 | 2 |
E(Interest[1-3], Gender) = P(Male)*E(2/4, 2/4) + P(Female)*E(1/3, 2/3)
= 4/7*1 + 3/7*0.918 = 0.965
Gain(Original[1-3], Gender) = 0.985 - 0.965 = 0.02
SplitINFO(Interest[1-3], Gender) = -(4/7 log2(4/7)) - (3/7 log2(3/7)) = 0.985
GainRATIO(Interest[1-3], Gender) = 0.02/0.985 = 0.02
Age | Yes | No |
21-40 | 1 | 1 |
41-60 | 1 | 1 |
> 60 | 1 | 2 |
E(Interest[1-3], Age) = P(21-40)*E(1/2, 1/2) + P(41-60)*E(1/2, 1/2) + P(> 60)*E(1/3, 2/3)
= 2/7*1 + 2/7*1 + 3/7*0.918 = 0.965
Gain(Original[1-3], Age) = 0.985 - 0.965 = 0.02
SplitINFO(Interest[1-3], Age) = -(2/7 log2(2/7)) - (2/7 log2(2/7)) - (3/7 log2(3/7)) = 1.556
GainRATIO(Interest[1-3], Age) = 0.02/1.556 = 0.013
Salary | Yes | No |
10001-30000 | 0 | 3 |
30001-50000 | 1 | 1 |
> 50001 | 2 | 0 |
E(Interest[1-3], Salary) = P(10001-30000)*E(0/3, 3/3) + P(30001-50000)*E(1/2, 1/2) + P(> 50001)*E(2/2, 0/2)
= 3/7*0 + 2/7*1 + 2/7*0 = 0.286
Gain(Original[1-3],Salary) = 0.985 - 0.286 = 0.699
SplitINFO(Interest[1-3],Salary) = -(3/7 log2(3/7)) - (2/7 log2(2/7)) - (2/7 log2(2/7)) = 1.556
GainRATIO(Interest[1-3],Salary) = 0.699/1.556 = 0.449
Status | Yes | No |
Married | 1 | 1 |
Single | 2 | 3 |
E(Interest[1-3], Status) = P(Married)*E(1/2, 1/2) + P(Single)*E(2/5, 3/5)
= 2/7*1 + 5/7*0.971 = 0.979
Gain(Original[1-3], Status) = 0.985 - 0.979 = 0.006
SplitINFO(Interest[1-3], Status) = -(2/7 log2(2/7)) - (5/7 log2(5/7)) = 0.863
GainRATIO(Interest[1-3], Status) = 0.006/0.863 = 0.007
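The winning attribute of Step 5 can be checked the same way the hand calculation proceeds, written out directly in Python:

```python
import math

def entropy(counts):
    """Entropy in bits from a list of class counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)

e_subset = entropy([3, 4])                                    # E(Interest[1-3]) ≈ 0.985

# Salary on the subset: 10001-30000 -> [0,3], 30001-50000 -> [1,1], > 50001 -> [2,0]
cond = 3/7 * entropy([0, 3]) + 2/7 * entropy([1, 1]) + 2/7 * entropy([2, 0])
gain = e_subset - cond                                        # ≈ 0.699
split = -(3/7 * math.log2(3/7) + 2 * (2/7) * math.log2(2/7))  # ≈ 1.556
ratio = gain / split                                          # ≈ 0.449
```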
Step 6: Salary has the largest GainRATIO, so salary becomes the child node of travel per year.
Step 7: Next we consider only the records with travel per year = 1-3 and salary = 30001-50000.
Gender | Age | Status | Interest in travel promotion or not |
Male | 21-40 | Single | No |
Male | > 60 | Married | Yes |
Step 8: Calculate the GainRATIO of each remaining attribute on the subset with travel per year = 1-3 and salary = 30001-50000.
E(Interest[1-3][30001-50000]) = E(1/2, 1/2) = -(1/2 log2(1/2)) - (1/2 log2(1/2)) = 1
Gender | Yes | No |
Male | 1 | 1 |
Female | 0 | 0 |
E(Interest[1-3][30001-50000], Gender) = P(Male)*E(1/2, 1/2) + P(Female)*E(0, 0) = 2/2*1 + 0 = 1
Gain(Original[1-3][30001-50000], Gender) = 1 - 1 = 0
SplitINFO(Interest[1-3][30001-50000], Gender) = -(2/2 log2(2/2)) = 0
GainRATIO(Interest[1-3][30001-50000], Gender) = 0 (gender does not split this subset at all, so its GainRATIO is taken as 0)
Age | Yes | No |
21-40 | 0 | 1 |
41-60 | 0 | 0 |
> 60 | 1 | 0 |
E(Interest[1-3][30001-50000], Age) = P(21-40)*E(0, 1) + P(41-60)*E(0, 0) + P(> 60)*E(1, 0) = 0
Gain(Original[1-3][30001-50000], Age) = 1 - 0 = 1
SplitINFO(Interest[1-3][30001-50000], Age) = -(1/2 log2(1/2)) - (1/2 log2(1/2)) = 1
GainRATIO(Interest[1-3][30001-50000], Age) = 1/1 = 1
Status | Yes | No |
Married | 1 | 0 |
Single | 0 | 1 |
E(Interest[1-3][30001-50000], Status) = P(Married)*E(0, 1) + P(Single)*E(1, 0) = 0
Gain(Original[1-3][30001-50000], Status) = 1 - 0 = 1
SplitINFO(Interest[1-3][30001-50000], Status) = -(1/2 log2(1/2)) - (1/2 log2(1/2)) = 1
GainRATIO(Interest[1-3][30001-50000], Status) = 1
The GainRATIOs of age and status are equal, so either can be the child node of salary, but we should select the attribute with the fewest distinct values, which here is status. This gives us the complete decision tree.
After the decision tree is fully constructed, we can apply post-pruning to discard its unreliable parts.
f is the observed error rate on the training data.
N is the number of instances covered by the leaf.
z is the z-value from the normal distribution for the chosen confidence level.
e is the error estimate for a node.
If the error estimate of the child nodes is greater than that of the parent node, the children are pruned.
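The section does not give the error-estimate formula itself. A common form, used in standard presentations of C4.5's pessimistic pruning (e.g. Witten and Frank), combines f, N, and z as the upper confidence limit on the true error rate; this is a sketch assuming that formulation:

```python
import math

def error_estimate(f, N, z=0.69):
    """Upper confidence limit on a node's true error rate, given the
    observed training error f on N instances and a z-value from the
    normal distribution (z = 0.69 is the value commonly quoted for
    C4.5's default 25% confidence level)."""
    return ((f + z**2 / (2 * N)
             + z * math.sqrt(f / N - f**2 / N + z**2 / (4 * N**2)))
            / (1 + z**2 / N))
```

A subtree is pruned when the instance-weighted error estimate of its children exceeds the estimate of the parent treated as a single leaf.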
The advantages of the C4.5 are:
- Builds models that can be easily interpreted and explained to executives.
- Efficient in use: the decision tree is constructed only once and can then classify any test set.
- Handles both categorical and continuous attributes.
- Tolerates noisy data.
The disadvantages are:
- A small variation in the data can lead to a different decision tree (especially when the variables' values are close to one another).
- Does not work well on a small training set.
- If the initial training set used to construct the tree is small, the tree cannot classify test sets accurately.
Solution 1:
When the decision tree predicts Yes for the class "interest travel promotion", the customer should be interested in the new promotion, so we send an e-mail to him/her. If the customer rejects the offer, the prediction was wrong, which means the current training set does not represent such customers correctly. If the rejections push the ratio (incorrect predictions)/(correct predictions) above a threshold, we combine the test sets with the old training sets into a new training set and reconstruct the decision tree, ready to predict new test sets.
Example
Suppose we define threshold = 0.3. When (incorrect predictions)/(correct predictions) exceeds 0.3, we reconstruct the decision tree immediately.
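A minimal sketch of this retraining rule (the function name and counters are illustrative, not from any library):

```python
def should_retrain(n_incorrect, n_correct, threshold=0.3):
    """Solution 1: rebuild the tree once the ratio of rejected
    (incorrectly predicted) customers to accepted ones exceeds the
    threshold; the test set is then merged into the old training
    set before the tree is reconstructed."""
    return n_correct > 0 and n_incorrect / n_correct > threshold
```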
Solution 2:
Use C5.0 instead of C4.5. The advantages of C5.0 are:
- Multi-threaded: it can take advantage of multiple CPUs or cores, which speeds up tree construction.
- Boosting: a technique for generating and combining multiple classifiers to improve predictive accuracy.
- The generated decision trees are generally smaller.