... one of
a growing number of corpora with human-to-human
multi-party conversations. In this corpus, recordings
of meetings ranged primarily over three different
recurring meeting types, all of ... error of P_k = 15.79, while the average performance
of the algorithm is P_k = 15.31 on the WSJ test corpus (unknown number of segments).
mean and the variance of the hypothesized...
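The P_k figures quoted above are segmentation error scores. As a minimal sketch, assuming the usual definition of P_k (the fraction of position pairs k units apart on which reference and hypothesis disagree about "same segment" versus "different segment"), the metric could be computed as follows. The function name `pk`, the label-per-unit input format, and the default choice of k (half the mean reference segment length) are assumptions for illustration and are not taken from the excerpt.

```python
def pk(reference, hypothesis, k=None):
    """Illustrative P_k: reference and hypothesis hold one segment label per unit."""
    n = len(reference)
    if n != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    if k is None:
        # Assumed convention: k is half the mean reference segment length.
        num_segments = 1 + sum(
            1 for a, b in zip(reference, reference[1:]) if a != b
        )
        k = max(1, round(n / num_segments / 2))
    if n <= k:
        raise ValueError("sequence is too short for the chosen window size k")
    errors = 0
    for i in range(n - k):
        same_in_reference = reference[i] == reference[i + k]
        same_in_hypothesis = hypothesis[i] == hypothesis[i + k]
        # A window counts as an error when the two segmentations disagree.
        if same_in_reference != same_in_hypothesis:
            errors += 1
    return errors / (n - k)
```

The function returns a fraction between 0 and 1; values such as 15.79 in the text are presumably the same quantity expressed as a percentage.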
... Automatic Segmentation of Multiparty Dialogue
Pei-Yun Hsueh
School of Informatics
University of Edinburgh
Edinburgh, EH8 9LW, GB
p.hsueh@ed.ac.uk
Johanna D. Moore
School of Informatics
University of ... shifts
to the problem of identifying subtopic
boundaries. We then explore the impact
on performance of using ASR output as
opposed to human transcription. Examination
of th...
... compound
nouns in 11 years of the French Le Monde
newspaper. They have been collected with the
INTEX tool of Silberztein (1994). The part of
speech tagger TreeTagger of Schmid (1994) is
applied ...
which is an indicator of the importance
of a term according to its distribution in a text.
It is defined by:
w_ij = tf_ij · log(...)
where tf_ij is the number of occurrences of a...
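The weight defined above combines a term frequency with a logarithmic factor whose argument is lost in this excerpt. As a rough sketch only, assuming the standard tf·idf form (term frequency multiplied by the log of the number of text units divided by the number of units containing the term), the computation might look like this. The function name `tfidf_weights`, the list-of-token-lists input, and the choice of log argument are all assumptions, not the authors' stated definition.

```python
import math
from collections import Counter


def tfidf_weights(units):
    """Illustrative tf.idf: units is a list of token lists, one per text unit."""
    n_units = len(units)
    # Number of units in which each term occurs at least once.
    unit_frequency = Counter()
    for tokens in units:
        unit_frequency.update(set(tokens))
    weights = []
    for tokens in units:
        term_frequency = Counter(tokens)
        # Assumed form: w = tf * log(N / n), with N units and n units containing the term.
        weights.append({
            term: count * math.log(n_units / unit_frequency[term])
            for term, count in term_frequency.items()
        })
    return weights
```

Under this assumed form, a term that occurs in every unit gets weight zero, which matches the idea of scoring a term by how it is distributed across a text.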
... comparison of the performance
of two versions of our discourse processor,
one based on strict TST, and one with our extended
version of TST, demonstrating that our extension
of TST yields ... Implications of this model of Attentional
State are explored more fully in (Rosé 1995).
3 Discourse Processing
We evaluated the effectiveness of our theory of discourse
struc...
... numerator of the multinomial is the factorial of
the total number of morph tokens, N, which equals
the sum of frequencies of every morph type. The denominator
is the product of the factorial of the ...
Figure 2: Expectation of the percentage of recognized
morphemes for English data. (Legend: Probabilistic, Recursive MDL, Linguistica, No segmentation; axis unit: %.)
a baseline of no segment...
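The multinomial term described earlier in this excerpt has N! in the numerator, where N is the total number of morph tokens. The description of the denominator is cut off; the sketch below assumes it is the product of the factorials of the individual morph type frequencies, which is the standard multinomial coefficient. The function name and the use of log-gamma to avoid overflow on large counts are illustrative choices, not details from the source.

```python
import math


def log_multinomial_coefficient(frequencies):
    """Illustrative log of N! / (f_1! * f_2! * ...), where N = sum of frequencies."""
    total_tokens = sum(frequencies)  # N, the total number of morph tokens
    # lgamma(x + 1) equals log(x!), which stays finite for large counts.
    log_numerator = math.lgamma(total_tokens + 1)
    log_denominator = sum(math.lgamma(f + 1) for f in frequencies)
    return log_numerator - log_denominator
```

Working in log space keeps the value usable even when N is in the millions, where the factorials themselves would overflow.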
... parts of the discourse context to extend the
coverage of a dialogue system.
1 Motivation
Most computational models of discourse are based primarily
on an analysis of the intentions of the ... complex set of
motivations for action. In particular, much of one's behavior
arises from a sense of obligation to behave within limits set
by the society that the agent is...
... Background
As part of our research on definite description (DD)
interpretation, we asked 3 subjects to classify the
uses of DDs in a corpus using a taxonomy related
to the proposals of (Hawkins, ... DDs) found a total of 240 relations, distributed
over 107 cases of DDs. There were 54 correct
resolutions (distributed over 34 DDs) and 186
false positives.
Types of
bridg...
... we
obtained TDT document clusters for 2 instances
of airplane crashes, 3 instances of earthquakes, 6
instances of presidential elections and 3 instances
of terrorist attacks. The number of the documents
corresponding ... (from
two documents for one of the earthquakes up to
156 documents for one of the terrorist attacks).
This variation in the number of documents per
topic is t...
... decision of gender had
led to deterioration in MRR performance of the
male names compared to the case where no prior
information was assumed. Soft decision of gender
yielded further gains of 17.1% ... the C-C corpus, out of the total of 4,507
characters, only 776 of them are for surnames. It is
interesting to find that female given names are
represented by a smaller set...