Lecture Notes in Computer Science- P11 pot


40 X. Wang, F. Yuan, and L. Qi

many kinds of available relations. In this scenario, how to use the explicit semantic relations among the resources in the education portal to make better recommendations becomes a new challenge.

In this paper, we propose a complementary study on using the relations between resources and other entities in an education portal for recommendation. We separate this problem into the following two problems:

1. Formalize the recommendation problem in the education portal;
2. Use the relations between the resources and other typed entities to make better recommendations in the education portal.

For an education portal, recommendation usually happens when a user enters the portal. A user can enter the portal by logging in or by using a guest account (an anonymous viewer of the web page). In both scenarios, recommendation is needed to choose the important learning resources to be displayed on the first web page shown when the user enters the portal. In this paper, we mainly focus on this kind of recommendation. By calculating the importance rank of each learning resource, we can decide which resources should be placed in the recommendation block of the portal to attract users.

The advantages of the proposed approach are as follows:

1. It makes use of the resources' relations in the education portal. The relations can be dynamically added and removed;
2. The approach is easily adapted to recommend other types of entities, for example, categories.

An example of the resources' relation diagram is shown in figure 1.

Fig. 1. An example of the learning resources' relations

Recommendation in Education Portal by Relation Based Importance Ranking 41

The rest of the paper is organized as follows: in section 2, we formalize the recommendation problem in an education portal. In section 3, the proposed importance ranking approach is presented in detail. In section 4, we analyze the experimental results.
Finally, after giving the related work in section 5, we discuss the conclusions in section 6.

2 Problem Statements

Every education portal may have its own structure of entities and relations. This paper proposes a general method that can be easily adapted to the specific structure of different education portals. In the problem statements, we use the scenario in figure 1 as an example. This choice does not affect the generality of the proposed approach.

First, we define the entities of an education portal as follows:

$$\{R, U, C, T, D\} \tag{1}$$

where R indicates a resource, which is the core element for recommendation; U indicates a user; C indicates a category; T indicates a tag; and D indicates a department in the university. In practice, a resource may be a series of web pages for an online course, a web page with some knowledge tips, or even a web page announcing an academic forum. A category is usually used to indicate the topic of a resource, for example, that the resource is about Information Technology. Tags are widely used in Web 2.0 for published resources. A tag can be any keyword that helps users understand the resource or that summarizes the resource. A department is a concept specific to the education portal, for example, the Department of Computer Science.

Besides entities, relations are also important in the education portal. Here we define the relations as follows:

$$\{RC, UT, UR, TR, DU, CD\} \tag{2}$$

where RC indicates the resource-category (belongto) relation; UT indicates the user-tag (creates) relation; UR indicates the user-resource (creates) relation; TR indicates the tag-resource (annotate) relation; DU indicates the department-user (contains) relation; and CD indicates the category-department (managedby) relation.
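As an illustration of definitions (1) and (2), the entity and relation types could be encoded as a small typed edge store. This is a minimal sketch, not the authors' implementation: the names `ENTITY_TYPES`, `RELATION_TYPES`, and `add_edge`, the sample node identifiers, and the edge direction (read off the order of the letters in each relation name) are all assumptions made for this example.

```python
from collections import defaultdict

# Entity types from definition (1): resource, user, category, tag, department.
ENTITY_TYPES = {"R", "U", "C", "T", "D"}

# Relation types from definition (2), mapped to (source type, target type, label).
# The direction source -> target is an assumption based on the relation names.
RELATION_TYPES = {
    "RC": ("R", "C", "belongto"),
    "UT": ("U", "T", "creates"),
    "UR": ("U", "R", "creates"),
    "TR": ("T", "R", "annotate"),
    "DU": ("D", "U", "contains"),
    "CD": ("C", "D", "managedby"),
}


def add_edge(edges, relation, source, target):
    """Record a typed edge; node ids are hypothetical (entity_type, name) pairs."""
    src_type, dst_type, _label = RELATION_TYPES[relation]
    if source[0] != src_type or target[0] != dst_type:
        raise ValueError(f"{relation} must link {src_type} to {dst_type}")
    edges[relation].append((source, target))


# Hypothetical sample data, not taken from the paper.
edges = defaultdict(list)
add_edge(edges, "UR", ("U", "alice"), ("R", "course-1"))
add_edge(edges, "RC", ("R", "course-1"), ("C", "information-technology"))
add_edge(edges, "CD", ("C", "information-technology"), ("D", "computer-science"))
```

Keeping the relation type on every edge is what later allows a different transition probability to be attached to each kind of relation.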
Thus, the education portal network can be represented by nodes and edges as follows:

$$\mathit{network} = V \cup E \tag{3}$$

where V is as follows:

$$V = \{V_r \cup V_u \cup V_t \cup V_c \cup V_d\} \tag{4}$$

where $V_r$ indicates all the resources, $V_u$ indicates all the users, $V_t$ indicates all the tags, $V_c$ indicates all the categories, and $V_d$ indicates all the departments. E is as follows:

$$E = \{E_{rc} \cup E_{ut} \cup E_{ur} \cup E_{tr} \cup E_{du} \cup E_{cd}\} \tag{5}$$

where $E_{rc}$ indicates all the resource-category relations; $E_{ut}$ indicates all the user-tag relations; $E_{ur}$ indicates all the user-resource relations; $E_{tr}$ indicates all the tag-resource relations; $E_{du}$ indicates all the department-user relations; and $E_{cd}$ indicates all the category-department relations. The network model is formalized in figure 2.

Fig. 2. The formalization of education portal network model

We define the transition probabilities λ for the different relations. As in a random walk, the transition probability can be viewed as the probability that a user jumps to another entity while visiting an entity in the network. According to the transition theory, we have the following formula:

$$\begin{cases} \lambda_{ut} + \lambda_{ur} = 1 \\ \lambda_{du} = 1 \\ \lambda_{tr} = 1 \\ \lambda_{rc} = 1 \\ \lambda_{cd} = 1 \\ \lambda_{mn} > 0 \end{cases} \tag{6}$$

Based on the above analysis, the recommendation in an education portal can be formalized as follows:

Recommend the top n important resources in the education portal network by importance ranking.

For other kinds of structures in education portals, V and E can be different. However, other structures can be easily adapted to our formalization by turning entities into nodes and relations into edges.

3 Relation Based Importance Ranking

In this section, we present our approach in detail. First, we introduce the approach briefly; then the details are presented step by step.
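The constraints in formula (6) can be checked mechanically before any ranking is computed. The sketch below is only an illustration under stated assumptions: the function name `check_transition_probabilities` and the dictionary keys are hypothetical, and the reading that every entity type other than the user has a single outgoing relation type is an interpretation of formula (6), not a statement from the paper.

```python
import math


def check_transition_probabilities(lam):
    """Return True if `lam` satisfies the constraints of formula (6)."""
    # Every transition probability must be strictly positive.
    if any(value <= 0 for value in lam.values()):
        return False
    # A user node splits its outgoing probability between tags and resources.
    if not math.isclose(lam["ut"] + lam["ur"], 1.0):
        return False
    # The remaining relation types each carry the full probability of 1.
    return all(math.isclose(lam[key], 1.0) for key in ("du", "tr", "rc", "cd"))


# An equal split between user-tag and user-resource is one admissible choice.
lam = {"ut": 0.5, "ur": 0.5, "du": 1.0, "tr": 1.0, "rc": 1.0, "cd": 1.0}
print(check_transition_probabilities(lam))
```

Only the pair λ_ut, λ_ur leaves any freedom, so a check like this mainly guards against a pair of values that does not sum to 1.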
3.1 Brief Introduction of Our Approach

Our approach mainly focuses on using the relations between entities to calculate the importance of the resources. It is similar to PageRank [5] in web search. The major difference is that PageRank has only one type of entity, the web page, and one type of relation, the hyperlink, whereas an education portal has explicit semantic relations between entities of different types. Our approach can be divided into the following steps:

1) Decide the transition probabilities. In this particular case, only $\lambda_{ut}$ and $\lambda_{ur}$ need to be decided.
2) Importance ranking calculation. The calculation is similar to PageRank in web search, except that we have several types of entities and relations.
3) Recommendation. After the importance ranking calculation, recommendation is simple: the top n ranked resources are selected and recommended to users.

For the first step, a simple method is to give the two probabilities equal values. Another method is to try different pairs of values and use experimental results to select a suitable pair. The second step is the core of our approach, and we mainly describe it in the next section. The third step is straightforward in the current scenario.

3.2 Detailed Introduction of Our Approach

In this section, we present our approach in detail. A random walk [6] based approach is applied to calculate the importance rank of the resources. First, we describe the random walk in this scenario. For example, if you are visiting a user, which is a node of $V_u$ in our network, you have probability $\lambda_{ur}$ of browsing the resources the user has created. Furthermore, if the user has created n resources, you have probability $\lambda_{ur}/n$ of visiting each of them. We can formalize this scenario.
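As a small numeric illustration of this split, consider a hypothetical user who has created 4 resources and 2 tags, with the equal weighting $\lambda_{ur} = \lambda_{ut} = 0.5$. The counts are assumed for illustration only, and the example further assumes that the user's tags share $\lambda_{ut}$ evenly in the same way as the resources share $\lambda_{ur}$.

```latex
\frac{\lambda_{ur}}{4} = \frac{0.5}{4} = 0.125 \quad \text{(per resource)}, \qquad
\frac{\lambda_{ut}}{2} = \frac{0.5}{2} = 0.25 \quad \text{(per tag)}, \qquad
4 \times 0.125 + 2 \times 0.25 = 1
```

Under these assumptions the probabilities of leaving the user node sum to 1, so the walk always moves on to one of the user's resources or tags.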
For any two entities i and j in the network, there is a transition probability $t_{ij}$. For example, if i is a node representing a user and j is a node representing a resource, $t_{ij}$ is as follows:

$$t_{ij} = \frac{1}{\mathrm{count}(i \to V_r)} \times \lambda_{ur} \tag{7}$$

where $\mathrm{count}(i \to V_r)$ is the number of resource nodes with which node i has a user-resource relation. In the same way, we can calculate the transition probability between any two nodes x and y. We thus obtain the transition matrix A, whose element at row i and column j is $t_{ij}$. As in PageRank, there is a random jump parameter α, which means that users can randomly jump to any node in the network, not only to neighbor nodes. We set α = 0.15 in this case.

$$A' = (1 - \alpha)A + \alpha E, \qquad E = (1, \ldots, 1)^T \left(\frac{1}{n}, \ldots, \frac{1}{n}\right) \tag{8}$$

where n is the number of entities in the network, so that E is the n × n matrix whose entries all equal 1/n. The importance scores can be viewed as a vector S as follows:

$$S = [s_1, \ldots, s_n]^T \tag{9}$$

where $s_k$ represents the importance score of entity k, and there are n entities in total in the education portal. S is calculated as follows:

$$S' = (A')^T \times S \tag{10}$$

As in PageRank, S is calculated with an iterative method. In the following, we describe the calculation in detail.

Decide the transition probability. In traditional PageRank on web pages, the transition probability is the same for every link because there is only one type of relation, the hyperlink. In an education portal, however, there may be different types of relations between entities. For example, in our scenario, a user entity has both user-resource and user-tag relations. How to assign transition probabilities to them is a problem. Here we simply assign them equal values, that is, $\lambda_{ut} = \lambda_{ur} = 0.5$. In the experimental evaluation section, we compare the results for different parameter values.

Calculate the importance rank.
According to formulas (7) to (10), the importance rank is calculated as follows:

1) Assign an initial importance to every node in the network. Here we initialize every entity's importance equally to 1/n, where n is the number of entities in the network.
2) Calculate the transition matrix A' using formulas (7) and (8).
3) Iteratively calculate the importance rank of every entity in the network until one of the finish conditions is met. The finish conditions are as follows:
   1. The sum of the variations of all the entities' importance ranks is smaller than a value ε;
   2. The number of iterations has reached m.
4) Output the importance rank of every entity.

The process can be summarized as follows:

1. Initialization. Input all the transition probabilities $\lambda_{ut}$, $\lambda_{ur}$, $\lambda_{du}$, $\lambda_{tr}$, $\lambda_{rc}$, $\lambda_{cd}$; initialize the importance vector S = [1/n, …, 1/n]; initialize the maximum number of iterations as m.
2. Calculate the transition probability matrix A' using formulas (7) and (8);
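The procedure built on formulas (7) to (10) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the toy network, the node names, the function names `transition_matrix` and `importance_rank`, and the default values chosen for ε and m are all hypothetical (the text does not give values for ε or m). The sketch also assumes that every node has at least one outgoing edge for each relation type available to its entity type, so that each row of A sums to 1.

```python
ALPHA = 0.15  # random jump parameter, as set in the text

# Transition probabilities per relation type, using the equal user split.
LAMBDA = {"ut": 0.5, "ur": 0.5, "du": 1.0, "tr": 1.0, "rc": 1.0, "cd": 1.0}

# Hypothetical toy network: (relation type, source node, target node).
EDGES = [
    ("ur", "U:alice", "R:course-1"),
    ("ur", "U:alice", "R:course-2"),
    ("ut", "U:alice", "T:python"),
    ("tr", "T:python", "R:course-1"),
    ("rc", "R:course-1", "C:it"),
    ("rc", "R:course-2", "C:it"),
    ("cd", "C:it", "D:cs"),
    ("du", "D:cs", "U:alice"),
]


def transition_matrix(nodes, edges, lam, alpha):
    """Build A' = (1 - alpha) * A + alpha * E, following formulas (7) and (8)."""
    n = len(nodes)
    index = {node: k for k, node in enumerate(nodes)}
    # Count, per source node and relation type, how many targets it has.
    counts = {}
    for rel, src, _dst in edges:
        counts[(src, rel)] = counts.get((src, rel), 0) + 1
    a = [[0.0] * n for _ in range(n)]
    for rel, src, dst in edges:
        # Formula (7): the relation's probability is split evenly over its targets.
        a[index[src]][index[dst]] += lam[rel] / counts[(src, rel)]
    # Formula (8): every entry of E equals 1/n.
    return [[(1 - alpha) * a[i][j] + alpha / n for j in range(n)] for i in range(n)]


def importance_rank(a_prime, epsilon=1e-8, max_iter=100):
    """Iterate S' = (A')^T S until the total change is below epsilon or max_iter is hit."""
    n = len(a_prime)
    s = [1.0 / n] * n  # equal initial importance
    for _ in range(max_iter):
        s_next = [sum(a_prime[i][j] * s[i] for i in range(n)) for j in range(n)]
        change = sum(abs(x - y) for x, y in zip(s_next, s))
        s = s_next
        if change < epsilon:
            break
    return s


nodes = sorted({node for _rel, src, dst in EDGES for node in (src, dst)})
scores = importance_rank(transition_matrix(nodes, EDGES, LAMBDA, ALPHA))
# Recommendation step: keep resource nodes only and take the top n (here n = 1).
resources = sorted(
    ((score, node) for node, score in zip(nodes, scores) if node.startswith("R:")),
    reverse=True,
)
print(resources[0][1])
```

In this toy network the resource `R:course-1` is reached both from its creator and from a tag, so it ranks above `R:course-2`, which is reached only from its creator. All entity types are scored together; filtering to resources happens only at the recommendation step, which is also where the approach could be pointed at categories or other entity types instead.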

Date posted: 05/07/2014, 09:20
