... Optimal Multi-Paragraph Text Segmentation by Dynamic Programming
Oskari Heinonen
University of Helsinki, Department of Computer ... of successive text
constituents, e.g., paragraphs. Methods for deciding
the locations of fragment boundaries are, however,
scarce. We propose a fragmentation method based
on dynamic programming. ...
fragment size is of importance.
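The general idea of choosing fragment boundaries by dynamic programming can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual formulation: the function name `segment`, the cost of a cut (the similarity between the two adjacent paragraphs), and the length penalty (relative deviation from a preferred fragment size) are all hypothetical choices.

```python
def segment(lengths, similarities, preferred_len, weight=1.0):
    """Return the paragraph indices at which a new fragment starts.

    lengths[k]      -- size of paragraph k (e.g., in words)
    similarities[k] -- similarity between paragraphs k and k+1
    preferred_len   -- assumed preferred fragment size
    """
    n = len(lengths)
    # prefix[j] is the total size of paragraphs 0..j-1.
    prefix = [0]
    for size in lengths:
        prefix.append(prefix[-1] + size)

    # best[j]: minimal cost of fragmenting the first j paragraphs.
    best = [float("inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            frag_len = prefix[j] - prefix[i]
            # Assumed penalty: relative deviation from the preferred size.
            length_cost = weight * abs(frag_len - preferred_len) / preferred_len
            # Assumed penalty: cutting between similar paragraphs is costly.
            cut_cost = similarities[i - 1] if i > 0 else 0.0
            cost = best[i] + cut_cost + length_cost
            if cost < best[j]:
                best[j] = cost
                back[j] = i

    # Follow the back-pointers to recover the chosen boundaries.
    boundaries = []
    j = n
    while j > 0:
        i = back[j]
        if i > 0:
            boundaries.append(i)
        j = i
    return sorted(boundaries)
```

With four paragraphs of equal size, a preferred fragment size of two paragraphs, and a low similarity only between the second and third paragraph, the sketch places a single boundary before the third paragraph.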
1 Introd...
... McClintock PVE (2000) Changes in the
dynamical behavior of nonlinear systems induced by
noise. Phys Report 323, 1–80.
3 Longtin A & L’Heureux I (2001) Dynamical effects of
noise on nonlinear ... such as seen in Fig. 6 dominate the dynamics
rather than escape events. The amplitudes, being dictated
by the sequence of random fluctuations experienced
by the system, vary quite a bit f...
... problem by modeling segmentation
as character classification (Xue, 2003; Gao et al.,
2004). This approach observes that by classifying
characters as word-initial, word-final, penultimate,
etc., word segmentation ... described
in (Chen and Liu, 1992) and still adopted in
many recent works, considers text segmentation as a
tokenization. Segmentation is typically divided into
tw...
... accuracy of text categorization.
For the Naïve Bayes classifier this increase is
significant.
1 Motivation
In the process of automatically classifying documents
into several predefined classes – text categorization
(Sebastiani, ... – text documents are usually seen
as sets or bags of all the words that have appeared
in a document, maybe after removing words in a
stop-list. In this paper we...
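The bag-of-words view described above can be sketched in a few lines. This is only an illustration: the function name `bag_of_words`, the whitespace tokenization, and the tiny stop-list are assumptions, not details from the paper.

```python
from collections import Counter

# Hypothetical stop-list, for illustration only.
STOP_WORDS = {"the", "a", "of", "in", "and"}

def bag_of_words(text, stop_words=STOP_WORDS):
    """Count each word of a document, ignoring word order and stop-list words."""
    tokens = text.lower().split()
    return Counter(token for token in tokens if token not in stop_words)

bag = bag_of_words("the cat and the hat")
```

Here `bag` records one occurrence each of "cat" and "hat"; word order and the stop-list words are discarded.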
... same object
must respect its position by signing the sign at
the same location or by anaphoric pointing at that
location. This form of agreement is achieved by in-
... phonetic descriptions,
as illustrated in Figure 1.
3.1 Syntactic Parsing
English text (Figure 2 top left) is parsed by the
Carnegie Mellon University (CMU) link grammar
parser (Sleator and Temperley, ... English t...
... on the context in which a word is
used, each definition, i.e. each semantic
representation, must include slots to be
filled by that context. The slots will provide
a unique context for each ...
initions of a word and thereby to create a
net which will be capable of discriminating
among all definitions of a word.
The following requirements must be satisfied
by such a parser and i...
... entailment graph.
2.2 Entailment Graph
Recognizing Textual Entailment (RTE) is the task
of deciding, given two text fragments, whether the
meaning of one text can be inferred from another
(Dagan et ... usually produced by an extraction method, such
as TextRunner (Banko et al., 2007) or ReVerb (Fader
et al., 2011). In order to support the exploration
process, the documents are indexed b...
... discovery of <NAME> by <ANSWER>.
0.95 <NAME> was discovered by <ANSWER>
0.91 of <ANSWER> ' s <NAME>
0.9 <NAME> was discovered by <ANSWER> ... words “Mozart” and “1756”.
8. Replace the word for the question term by
the tag “<NAME>” and the word for the
answer term by the term “<ANSWER>”.
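Step 8 can be sketched as a plain string substitution. This is an illustrative sketch only: the function name `tag_pattern` and the example phrase are assumptions; only the tags and the terms “Mozart” and “1756” come from the text.

```python
def tag_pattern(phrase, question_term, answer_term):
    """Replace the question term with <NAME> and the answer term with <ANSWER>."""
    tagged = phrase.replace(question_term, "<NAME>")
    tagged = tagged.replace(answer_term, "<ANSWER>")
    return tagged

# Hypothetical phrase containing both terms.
pattern = tag_pattern("Mozart was born in 1756", "Mozart", "1756")
```

For the hypothetical phrase above, `pattern` is "<NAME> was born in <ANSWER>".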
This procedure i...
... of
human segmentation of our corpus, where speaker
intention is the segmentation criterion. We then use
the subjects' segmentations to evaluate the correlation
of discourse segmentation ... perform discourse
segmentation using speaker intention as a criterion.
We use the segmentations produced by our subjects
to quantify and evaluate the correlation of discourse
segmen...
... combination of text
processing with interactive editing.
We first used straight text processing to
identify synonym references in definitions and reduce
them to triples. Our next essay in the text ... others being nouns,
adjectives, and transitive verbs) with 8,883 texts.
Virtually all verb definition texts begin with to
followed by a head verb, or a set of conjoined head
ver...