Unsupervised discovery of domain-specific knowledge from text

Scientific report: "Unsupervised Discovery of Domain-Specific Knowledge from Text" pptx

Scientific report

... Computational Linguistics. Unsupervised Discovery of Domain-Specific Knowledge from Text. Dirk Hovy, Chunliang Zhang, Eduard Hovy, Information Sciences Institute, University of Southern California, 4676 Admiralty ... Research with a Series of Reading Tasks. In Proceedings of LREC 2010. Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. Yago: a core of semantic knowledge. In Proceedings of the 16th international ... 2006. Unsupervised Learning of Verb Argument Structures. Computational Linguistics and Intelligent Text Processing, pages 59–70. Anselmo Peñas and Eduard Hovy. 2010. Semantic enrichment of text...
Document: Scientific report: Biochemical characterization and inhibitor discovery of shikimate dehydrogenase from Helicobacter pylori docx

Scientific report

... characterization of HpSDH demonstrates its activity with kcat of 7.7 s⁻¹ and Km of 0.148 mM toward shikimate, kcat of 7.1 s⁻¹ and Km of 0.182 mM toward NADP, kcat of 5.2 s⁻¹ and Km of 2.9 mM ... a kcat of 7.7 ± 0.9 s⁻¹, Km of 0.148 ± 0.028 mM and kcat/Km of 5.2 × 10⁴ M⁻¹·s⁻¹ toward shikimate, and a kcat of 7.1 ± 0.7 s⁻¹, Km of 0.182 ± 0.027 mM and kcat/Km of 3.9 × ... Shen¹,² ¹Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China; ²School of Pharmacy, East...
Scientific report: "Unsupervised Discovery of Rhyme Schemes" pdf

Scientific report

... is an edited version of the public-domain portion of the corpus used by Sonderegger (2011), and consists of just under 12000 stanzas spanning a range of poets and dates from the 15th to 20th centuries. ... 2011. © 2011 Association for Computational Linguistics. Unsupervised Discovery of Rhyme Schemes. Sravana Reddy, Department of Computer Science, The University of Chicago, Chicago, IL 60637, sravana@cs.uchicago.edu, Kevin ... extremely useful for large-scale statistical analyses of poetic texts. • Historical Linguistics/Study of Dialects: Rhymes of a word in poetry of a given time period or dialect region provide clues...
Scientific report: "A Practical Solution to the Problem of Automatic Part-of-Speech Induction from Text" pdf

Scientific report

... problem of automatic word sense induction. Proceedings of ACL (Companion Volume), Barcelona, 195-198. Schütze, Hinrich (1993). Part-of-speech induction from scratch. Proceedings of ACL, Columbus, ... clustering of global vectors is more adequate (see footnote 1). This finding is of interest when trying to understand the nature of syntax versus semantics if expressed in statistical terms. Acknowledgements ... assignment of the ambiguous words to clusters is not required at this stage, as this is taken care of in the next step. This step involves computing the differential vector of each word from the...
Scientific report: "Unsupervised Discovery of Persian Morphemes" docx

Scientific report

... is the left part of word, RP is the right part of it, Len(p) is the length of part P (number of characters), freq(p) is the frequency of part P in corpus, WN is the number of words (corpus ... definition of the word as a unit has been agreed upon. If effective methods can be devised for the unsupervised discovery of morphemes, they could aid the formulation of a linguistic theory of morphology ... results of a research on unsupervised Persian morpheme discovery. In this paper we present a method for discovering the morphemes of Persian language through automatic analysis of corpora....
Scientific report: "Unsupervised Discovery of Generic Relationships Using Pattern Clusters and its Evaluation by Automatically Generated SAT Analogy Questions" pot

Scientific report

... EMNLP ’06. Hearst, M., 1992. Automatic acquisition of hyponyms from large text corpora. COLING ’92. Lin, D., Pantel, P., 2002. Concept discovery from text. COLING 02. Moldovan, D., Badulescu, A., Tatu, ... unsupervised discovery of word categories using symmetric patterns and high frequency words. COLING-ACL ’06. Davidov, D., Rappoport, A. and Koppel, M., 2007. Fully unsupervised discovery of concept-specific ... corpus, the set of the contexts in which the word appears. Each context is a window containing W words or punctuation characters before and after the hook word. We avoid extracting text from clearly unformatted...
Web Mining and Knowledge Discovery of Usage Patterns

Programming techniques

... Mining From the technique point of view, Web usage mining is the application of data mining techniques to usage logs (secondary Web data) of large Web data repositories. The purpose of it is ... content mining focuses on the discovery/ retrieval of the useful information from the Web contents/data/documents, while the Web structure mining emphasizes to the discovery of how to model the underlying ... patterns and to extract the interesting rules or patterns from the output of the pattern discovery process. The output of Web mining algorithms is often not in the form suitable for direct human consumption,...
Document: Scientific report: "Unsupervised Segmentation of Chinese Text by Use of Branching Entropy" pdf

Scientific report

... [Figure residue: recall and precision plotted against corpus size (KB)] ... Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 428–435, Sydney, July 2006. © 2006...
Document: Scientific report: "Deriving an Ambiguous Word’s Part-of-Speech Distribution from Unannotated Text" doc

Scientific report

... we suggest to look at local contexts instead of global co-occurrence vectors. As can be seen from human performance, in almost all cases the local context of a syntactically ambiguous word ... part of speech. The core assumption underlying our approach, which in the context of cognition and child language has been proposed by Mintz (2003), is that words of a particular part of speech ... the part-of-speech distribution of syntactically ambiguous words without explicitly tagging the underlying text corpus. This is achieved by assuming that the word pair consisting of the left...
Document: Scientific report: "AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT" doc

Scientific report

... about the text). Accuracy in identifying SF occurrences • Simplicity of design and speed. Efficient use of the available text was not a high priority, since it was felt that plenty of text was ... nique based on the Case Filter of Rouvret and Vergnaud (1980). The completeness of the output list increases monotonically with the total number of occurrences of each verb in the corpus. False ... verbs occur in a sentence. Finding some of the verbs in a text reliably is hard enough; finding all of them reliably is well beyond the scope of this work. Finally, any system applied...
Document: Scientific report: "Knowledge Acquisition from Texts: Using an Automatic Clustering Method Based on Noun-Modifier Relationship" pptx

Scientific report

... context". The four syntactic links of LEXTER can be used to define this terminological context. For instance, the "expansion terminological context" (E-terminological context) ... the object LINE. This definition of the context is original compared to the classical context definitions used in Information Retrieval, where the context of a lexical unit is obtained by ... description of the concept "line" 5 Discussion • Evaluation of the quality of the clustering procedure • in the majority of the works using clustering methods, the evaluation of the...
Scientific report: Discovery of a eugenol oxidase from Rhodococcus sp. strain RHA1 ppt

Scientific report

... to stabilize the dimeric form of EUGO. The partially apo form of EUGO became fully flavinylated in vitro by the addition of FAD. The cofactor incorporation resulted in formation of holo dimeric EUGO, ... structure (black) and the modeled apo-EUGO structure (gray). His422 of VAO, linking the FAD cofactor, aligns with His390 of EUGO. Discovery of a eugenol oxidase, J. Jin et al., 2314, FEBS Journal 274 (2007) ... were purchased from Invitrogen (Carlsbad, CA, USA). Expression and purification of recombinant EUGO. DNA from Rhodococcus sp. strain RHA1 was a kind gift from R v.d. Geize (University of Groningen,...
Scientific report: "Automatic construction of a hypernym-labeled noun hierarchy from text" docx

Scientific report

... up of multiple words, rather than just using the head nouns of the noun phrases. Automatic construction of a hypernym-labeled noun hierarchy from text, Sharon A. Caraballo, Dept. of Computer ... cluster of cities that because of sparse data was assigned a poor hypernym. Some of the suggestions in the following section might correct this problem. Of the 50 noise words, a few of them ... two groups of nouns, we define similarity as the average of the cosines between each pair of nouns made up of one noun from each of the two groups. $\mathrm{sim}(A,B) = \frac{\sum_{v,w} \cos(v,w)}{\mathrm{size}(A)\,\mathrm{size}(B)}$...
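The group similarity quoted in the snippet above (the average of the cosines over every pair made up of one noun from each group) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `cosine` and `group_similarity`, the toy vectors, and the choice to score a zero vector as 0 are all assumptions made here, since the snippet does not show how the noun vectors are built.

```python
import math


def cosine(v, w):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    if norm_v == 0.0 or norm_w == 0.0:
        # Assumption: a zero vector is treated as having similarity 0.
        return 0.0
    return dot / (norm_v * norm_w)


def group_similarity(group_a, group_b):
    """Average cosine over all pairs with one vector from each group."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    total = sum(cosine(v, w) for v in group_a for w in group_b)
    return total / (len(group_a) * len(group_b))


# Hypothetical toy vectors, for illustration only.
group_a = [[1.0, 0.0], [0.0, 1.0]]
group_b = [[1.0, 0.0]]
print(group_similarity(group_a, group_b))  # 0.5
```

Dividing by size(A) times size(B) makes the score a mean over all cross-group pairs, so it stays comparable between groups of different sizes.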
Scientific report: "The Development of Lexical Resources for Information Extraction from Text Combining WordNet and Dewey Decimal Classification" potx

Scientific report

... Unfortunately one of the current trends in IE is the progressive reduction of the size of training corpora: e.g., from the 1,000 texts of the MUC-5 (MUC-5, 1993) to the 100 texts in MUC-6 ... Abstract: Lexicon definition is one of the main bottlenecks in the development of new applications in the field of Information Extraction from text. Generic resources (e.g., lexical ... also cover the generic knowledge of the world. The integration consists in marking parts of WordNet's hierarchy, i.e. some synsets, with semantic labels taken from the DDC. 4 The development...
Scientific report: "Unsupervised Relation Extraction by Mining Wikipedia Texts Using Information from the Web" pdf

Scientific report

... similarity of their contexts. Contexts are collected as patterns of two kinds: dependency patterns from dependency analysis of sentences in Wikipedia, and surface patterns generated from highly ... combinations of patterns: dependency patterns from dependency analysis of texts in Wikipedia, and surface patterns generated from highly redundant information related to the Web. Evaluations of the ... clustering pairs of co-occurring entities represented as vectors of context features. They used a simple representation of contexts; the features were words in sentences between the entities of the...
