... case
for measuring cost in assessing AL methods.
1 Introduction
Obtaining human annotations for linguistic data is
labor-intensive and typically the costliest part of the
acquisition of an annotated ... applicable to AL for sequential labeling
in general. We make the case for measuring cost in
assessing AL methods by showing that the choice of
a cost function s...
... for inference. In another 28%,
references could optionally support the inference
of the hypothesis. In the remaining 28%, refer-
ences did not contribute towards inference. The
total number of ... generic framework
for modeling semantic inference. TE reduces the
inference requirements of many text understand-
ing applications to the problem of determining
whether...
... the
class of the majority of the items which reached
it during training. The trees were grown using
recursive partitioning; the splitting criterion was
reduction in deviance. Using the Gini ... grave for the
LPE experiments because of the ceiling effect and
the small size of the complete data set; therefore,
we did not rerun the corresponding experiment...
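The deviance-based splitting criterion used to grow the trees can be made concrete. The sketch below is a minimal illustration, not the paper's implementation (the names `deviance` and `deviance_reduction` are ours): the multinomial deviance of a node with class counts n_k out of n items is -2 Σ_k n_k log(n_k/n), and a candidate split is scored by how much it reduces the summed deviance of the two children.

```python
from collections import Counter
from math import log

def deviance(labels):
    """Multinomial deviance of a node: -2 * sum_k n_k * log(n_k / n)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -2.0 * sum(c * log(c / n) for c in Counter(labels).values())

def deviance_reduction(labels, left_idx):
    """Reduction in deviance achieved by splitting the items in
    `labels` into those indexed by `left_idx` and the rest."""
    left = [labels[i] for i in left_idx]
    right = [labels[i] for i in range(len(labels)) if i not in left_idx]
    return deviance(labels) - (deviance(left) + deviance(right))
```

A pure split (each child containing a single class) drives the children's deviance to zero, so the reduction equals the parent's deviance; recursive partitioning greedily picks the split with the largest reduction at each node.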
... another label before starting
the active learning trial, but retain the distribution of
the different labels in the pool data (active learning
with random errors); (Table 1, ALrand, 30%). In
the ... margin sampling is less sensitive to cer-
tain types of noise than entropy sampling (Table 2).
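The two uncertainty measures being compared can be stated concretely. The sketch below is a generic formulation rather than the paper's exact implementation: margin sampling queries the instance with the smallest gap between the two most probable labels, while entropy sampling queries the instance whose predictive distribution has the highest entropy.

```python
from math import log

def margin_score(probs):
    """Gap between the top-two class probabilities; smaller = more uncertain."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def entropy_score(probs):
    """Shannon entropy of the distribution; larger = more uncertain."""
    return -sum(p * log(p) for p in probs if p > 0)

def select_query(pool, score, pick_max):
    """Index of the most uncertain instance in a pool of distributions."""
    indices = range(len(pool))
    key = lambda i: score(pool[i])
    return max(indices, key=key) if pick_max else min(indices, key=key)
```

The two measures can disagree: a two-way tie like [0.5, 0.5, 0.0] has zero margin but lower entropy than a flatter three-way distribution like [0.4, 0.3, 0.3], which is one reason the two strategies react differently to noisy labels.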
Because of space limitations we only show curves
for margin samplin...
... Sarkar. 2009. Active
learning for multilingual statistical machine trans-
lation. In Proceedings of the Joint Conference of
the 47th Annual Meeting of the ACL and the 4th In-
ternational Joint Conference ... for Computational Linguistics.
Katrin Tomanek and Udo Hahn. 2009. Semi-
supervised active learning for sequence labeling. In
Proceedings of the Joint...
... of Japanese dependency parsing
that the algorithm in Figure 4 does not generate
every pair of bunsetsus.
4 Active Learning for Parsing
Most of the methods of active learning for parsing
in ... is
called pool-based active learning. Following their
sequential sampling algorithm, we show in Fig-
ure 1 the basic flow of pool-based active learning.
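The flow in Figure 1 follows the standard pool-based scheme: train on the current labeled set, select the most informative unlabeled instances, query the oracle for their labels, and repeat. A minimal generic sketch (the callables `train`, `select`, and `oracle` are placeholders, not the paper's interface):

```python
def pool_based_active_learning(labeled, pool, train, select, oracle,
                               batch_size, rounds):
    """Generic pool-based active learning loop: repeatedly train on the
    labeled set, pick the highest-ranked unlabeled examples according to
    `select`, query the oracle, and move the answers into the labeled set."""
    for _ in range(rounds):
        model = train(labeled)
        queries = select(model, pool)[:batch_size]
        for x in queries:
            labeled.append((x, oracle(x)))
            pool.remove(x)
    return train(labeled)
```

In sequential sampling, `batch_size` is 1, so the model is retrained after every single query; larger batches trade annotation efficiency for fewer retraining passes.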
V...
... between the original summarizer
ranking and the ranking after excluding topics by one or
two worst assessors in each category.
we should examine the potential impact of incon-
sistent assessors on the ... too.
Therefore, it would be better to look at whether as-
sessors tend to find the same SCUs (information
“nuggets”) in different summaries on the same topic,
and whether...
... because, in many
cases, the previous label of a named entity is “O”,
which indicates a non-named entity. For 98.0% of
the named entities in the training data of the
2004 JNLPBA shared task, the ... than the system
without it (the p-value is less than 1.0 × 10⁻⁴).
This result shows that the preceding entity
information improves performance. On
the...
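The statistic underlying this feature (the fraction of named entities whose previous label is "O") can be computed directly from BIO-tagged training data. The sketch below is illustrative, not the system's code; treating a sentence-initial entity as preceded by the outside label is an assumption of this sketch.

```python
def preceded_by_outside(tags):
    """Fraction of named-entity starts (B- tags in a BIO sequence) whose
    immediately preceding label is "O". Sentence-initial entities are
    counted as preceded by "O" (an assumption of this illustration)."""
    starts = [i for i, t in enumerate(tags) if t.startswith("B-")]
    if not starts:
        return 0.0
    outside = sum(1 for i in starts if i == 0 or tags[i - 1] == "O")
    return outside / len(starts)
```

On the JNLPBA training data this ratio is the 98.0% figure quoted above; when it is that high, the preceding-label feature mostly fires with the value "O", which is why consecutive-entity boundaries are the interesting minority case.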
... derivations, the probability of a tree is
the sum of the probabilities of the derivations
producing that tree. The probability of a derivation
is the product of the subtree probabilities. The
original ...
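Written out, the quantities above are (standard DOP notation, where d ⇒ T ranges over the derivations that produce tree T, and t ∈ d over the subtrees used in derivation d):

```latex
P(T) \;=\; \sum_{d \,\Rightarrow\, T} P(d),
\qquad
P(d) \;=\; \prod_{t \in d} p(t)
```

The sum over derivations is what makes exact computation expensive: a single tree can be produced by exponentially many derivations built from different subtree decompositions.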
the number of subtrees of all trees increases with
the Catalan number, and only ad hoc sampling
could make the method work.
Since U-DOP* compu...
... compute the length of the
intervals between stirring events. The length of a
single stirring event is a default which is part of the
representation of the primitive actions. The number
of stirring ... system (Karlin,
1988). SEAFACT operates in the domain of cooking
tasks. The domain is limited to a mini-world con-
sisting of a small set of verbs chosen...
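The interval computation described above can be sketched as follows. The representation is hypothetical (SEAFACT stores the default length of a single stirring event with the primitive action; here it is simply passed in as a number): given the start times of successive stirring events and that default duration, the gaps between events are the differences between one event's end and the next event's start.

```python
def stirring_intervals(start_times, event_length):
    """Lengths of the gaps between successive stirring events, given their
    start times and the default duration of a single event."""
    return [later - (earlier + event_length)
            for earlier, later in zip(start_times, start_times[1:])]
```

For example, events starting at times 0, 10, and 25 with a default event length of 3 are separated by gaps of 7 and 12 time units.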