... French ones, to a sink, to control the quantity of flow we want to pass through the words.
2.1 Flow networks
We meet here the notion of flow networks that
we can formalise in the following way (we ...
reader to (Ford and Fulkerson, 1962; Klein,
1967).
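Section 2.1 defers the theory to Ford and Fulkerson; as a concrete reference point, a maximum flow on such a network can be computed with the classical BFS-based Ford-Fulkerson method (Edmonds-Karp). The sketch below is illustrative only: the dict-of-dicts residual-capacity encoding and the node names are our assumptions, not the paper's formalisation.

```python
# A minimal Edmonds-Karp sketch (BFS-based Ford-Fulkerson). The graph is a
# dict of dicts of residual capacities, mutated in place while augmenting.
from collections import deque

def max_flow(cap, source, sink):
    flow = 0
    while True:
        # breadth-first search for an augmenting path
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # recover the path and its bottleneck capacity
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        # push flow: lower forward residuals, raise reverse ones
        for u, v in path:
            cap[u][v] -= bottleneck
            cap.setdefault(v, {})
            cap[v][u] = cap[v].get(u, 0) + bottleneck
        flow += bottleneck
```

For word alignment, source-language words would sit on one side of the network and target-language words on the other, with capacities bounding how much flow may pass through each word.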
2.2 Alignment models
Flows and networks define a general framework
in which it is possible to model alignments be-
tween words, and to ... translations for
the English term.
4 Conclusion
We presented a new model for word alignment
based on flow networks. This model allows
us to integrate different types of constraints in
the search for...
... Retrieving Word Alignments
Two word-alignment retrieval schemes are de-
signed for BiTAMs: the uni-direction alignment
(UDA) and the bi-direction alignment (BDA). Both
use the posterior mean of the alignment ... Null word
and Laplace smoothing for the BiTAM models.
We train, for comparison, IBM-1&4 and HMM models with 8 iterations of IBM-1, 7 for HMM and 3 for IBM-4 ($1^8 h^7 4^3$) with Null word ... Incorporation of Word “Null”
Similar to the IBM models, a “Null” word is used for source words that have no translation counterparts in the target language. For example, Chinese words “de” ()...
... are
less than 20 percent.
2 1:n Word Alignment
Our discussion of uni-directional word alignment is limited to IBM Model 4.
Definition 1 (Word alignment task) Let $e_i$ be
the i-th ... two word alignments
as an alignment point, 2) add new alignment points
that exist in the union with the constraint that a
new alignment point connects at least one previ-
ously unaligned word, ... purpose of
the word alignment task is to obtain a lexical translation probability $p(\bar{f}_i \mid \bar{e}_i)$, which is a 1:n uni-directional word alignment. The initial idea underlying the IBM Models, consisting...
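The two-step combination described earlier (keep the intersection of the two directional alignments as alignment points, then add union points that connect at least one previously unaligned word) can be sketched as follows; the function name and the set-of-pairs encoding are our illustrative assumptions, not the paper's notation.

```python
# A minimal sketch of the symmetrization heuristic: start from the
# intersection of the two directional alignments, then repeatedly add
# union points that touch at least one previously unaligned word.
def symmetrize(e2f, f2e):
    """e2f, f2e: sets of (i, j) alignment points from the two directions."""
    alignment = e2f & f2e                  # 1) intersection
    union = e2f | f2e
    aligned_e = {i for i, _ in alignment}
    aligned_f = {j for _, j in alignment}
    added = True
    while added:                           # 2) grow with union points
        added = False
        for i, j in sorted(union - alignment):
            # add only points connecting at least one unaligned word
            if i not in aligned_e or j not in aligned_f:
                alignment.add((i, j))
                aligned_e.add(i)
                aligned_f.add(j)
                added = True
    return alignment
```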
... bilingual word alignment finds word-to-word connections across languages. Originally introduced as a byproduct of training statistical translation models in (Brown et al., 1993), word alignment ... Log-linear models for word alignment. In Meeting of the Association for Computational Linguistics, pages 459–466, Ann Arbor, USA.
I. D. Melamed. 2000. Models of translational equivalence
among words. ... new information resulting in im-
proved alignments.
2 Constrained Alignment
Let an alignment be the complete structure that
connects two parallel sentences, and a link be
one of the word-to-word...
... words to the right and left of the verb, identified using POS tags, represented by has_narrow(snt, word_position, word):
has_narrow(snt1, 1st_word_left, mind).
has_narrow(snt1, 1st_word_right, ... the positions of the words, represented by has_narrow_trns(snt, word_position, portuguese_word):
has_narrow_trns(snt1, 1st_word_right, como).
has_narrow_trns(snt1, 2nd_word_right, um). … ... for the disambiguation of verbs.
We plan to further evaluate our approach on other sets of words, including other parts-of-speech, to allow further comparisons with other approaches. For...
... Automatically-extracted thesauri for
cross-language IR: When better is worse. In Proceed-
ings of COMPUTERM’98.
Eric Gaussier. 1998. Flow network models for word alignment and terminology extraction ... language word.
is expressed as follows: a word qualifies for clus-
tering if
As before, are all the target language words
that cooccur with source language word .
Similarly to the most frequent words, ... clustering. Those words
that are considered for clustering should account
for more than of the cooccurrences of the source language word with any target language word. If a word falls below...
... short words.
Combining Clues for Word Alignment
Jörg Tiedemann
Department of Linguistics
Uppsala University
Box 527
SE-751 20 Uppsala, Sweden
joerg@stp.ling.uu.se
Abstract
In this paper, a word ... bilingual
lexical information. Word alignment approaches
focus on the automatic identification of translation
relations in translated texts. Alignments are usu-
ally represented as a set of links between words
and ... an alignment clue for the cor-
responding word pairs. The likelihood of each
translation alternative can be weighted, e.g., by
frequency (if available).
2.3 Clue Combinations
So far, word alignment...
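One natural way to combine several weighted clues for the same word pair is to treat them as independent evidence and take their probabilistic disjunction, $P(c_1 \vee c_2) = P(c_1) + P(c_2) - P(c_1)P(c_2)$; whether this exact rule is the paper's combination is an assumption made here for illustration.

```python
# Combine clue probabilities for one word pair with the probabilistic
# disjunction (noisy-OR) rule: P(a or b) = P(a) + P(b) - P(a) * P(b),
# applied pairwise over all clues.
def combine_clues(clue_probs):
    combined = 0.0
    for p in clue_probs:
        combined = combined + p - combined * p
    return combined
```

For example, a dictionary clue of 0.6 and a cooccurrence clue of 0.5 combine to 0.8; the combined score is always at least as strong as the strongest single clue.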
... based on word alignment.
In this paper we introduce a confidence mea-
sure for word alignment, which is robust to extra
or missing words in the bilingual sentence pairs,
as well as word alignment ... confidence sentence
alignments and alignment links from mul-
tiple word alignments of the same sen-
tence pair. Additionally, we remove
low confidence alignment links from the
word alignment of a bilingual ... the same word does increase the confusion for word alignment and reduce the link confidence. On the other hand, additional information (such as the distance of the word pair, the alignment...
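The removal of low-confidence links from multiple word alignments of the same sentence pair might be sketched as below; the averaging rule, the threshold value, and all names are illustrative assumptions rather than the paper's definition of link confidence.

```python
# Pool per-link confidences from several alignments of one sentence pair
# and keep only links whose averaged confidence clears a threshold.
# A link missing from an alignment counts as confidence 0 there.
from collections import defaultdict

def filter_links(alignments, threshold=0.5):
    """alignments: list of {(i, j): confidence} dicts for one sentence pair."""
    scores = defaultdict(list)
    for alignment in alignments:
        for link, conf in alignment.items():
            scores[link].append(conf)
    return {link for link, confs in scores.items()
            if sum(confs) / len(alignments) >= threshold}
```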
... two parameters for the distortion probability: one for head words and the other for non-head words.
Distortion Probability for Head Words
The distortion probability for head
words represents ... two word alignment models for language pairs L1-L3 and L2-L3, respectively. Then, with L3 as a pivot language, we can build a word alignment model for L1 and L2 based on the above two models. ... language word similarity sim(c, f; e) of the Chinese word c and the Japanese word f given the English word e.
Figure 1. Similarity Calculation
For the ambiguous English word e, ...
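The pivot construction described above, composing the L1-L3 and L3-L2 alignments through shared pivot positions, can be sketched as a set composition; the link encoding and function name below are our illustrative assumptions, not the paper's model.

```python
# Compose two directional alignments through a pivot language L3:
# an L1 position i links to an L2 position j whenever both align to
# the same L3 position k.
def compose_via_pivot(align_13, align_32):
    """align_13: set of (i, k) links L1->L3; align_32: set of (k, j) links
    L3->L2. Returns induced (i, j) links L1->L2."""
    by_pivot = {}
    for k, j in align_32:
        by_pivot.setdefault(k, set()).add(j)
    induced = set()
    for i, k in align_13:
        for j in by_pivot.get(k, ()):
            induced.add((i, j))
    return induced
```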
... as 1.
In building word alignment models, a special
“NULL” word is usually introduced to address tar-
get words that align to no source words. Since this
physically non-existing word is not in the ... computational
2 Constrained Word Alignment Models
The framework that we propose for incorporating statistical constraints into word alignment models is generic. It can be applied to complicated models such as IBM ... candidate. This information is derived before word alignment model training and will act as soft constraints that need to be respected during training and alignment. For a given word pair, the...
... with
all word space models, which facilitates word
space based applications.
The package is written in Java and defines a
standardized Java interface for word space algorithms. While other word ... July 2010.
© 2010 Association for Computational Linguistics
The S-Space Package: An Open Source Package for Word Space Models
David Jurgens
University of California, Los Angeles,
4732 Boelter ... algorithms,
code documentation and mailing list archives.
2 Word Space Models
Word space models are based on the contextual
distribution in which a word occurs. This ap-
proach has a long history in linguistics,...
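As a minimal illustration of this idea, each word can be represented by counts of the words occurring in a small context window around it, and words compared by cosine similarity. The window size and helper names below are our assumptions, and the S-Space package itself is a Java library; this Python sketch only mirrors the concept.

```python
# Represent each word by its contextual distribution (counts of neighbours
# within a fixed window) and compare two words by cosine similarity.
import math
from collections import Counter, defaultdict

def build_word_space(sentences, window=2):
    space = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            left = sent[max(0, i - window):i]
            right = sent[i + 1:i + 1 + window]
            for context in left + right:
                space[word][context] += 1
    return space

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) \
        * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0
```

Words that occur in similar contexts ("cat" and "dog" below) end up with similar vectors, which is the distributional intuition the section describes.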
... reasonable alignments, word alignment models must constrain the set of alignments considered. In this
section, we discuss and compare alignment fami-
lies used to train our discriminative models.
Initially, ... many-to-one block alignment
potential, and efficient pruning, ITG models can
yield state-of-the-art word alignments, even when
the underlying gold alignments are highly non-
ITG. Our models yielded ... across alignments. Specifically, for each alignment cell (i, j) which is not a possible alignment in $a^*$, we incur a loss of 1 when $a_{ij} \neq a^*_{ij}$; note that if (i, j) is a possible alignment, ...
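Under the reading that $a^*_{ij} = 0$ for every cell that is not a possible alignment, this loss simply counts predicted links that fall outside the set of possible cells. The encoding below, sets of (i, j) links, is an illustrative assumption.

```python
# Count a loss of 1 for every predicted link landing on a cell that is
# not a possible alignment in the gold standard a*.
def alignment_loss(predicted_links, possible_links):
    return sum(1 for cell in predicted_links if cell not in possible_links)
```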
... syntax-based translation framework.
Most word alignment models distinguish translation direction in deriving the word alignment matrix. Given a parallel sentence, word alignments in two directions are ... different word alignment combination methods
4 Conclusions
We presented a simple yet effective method for word alignment symmetrization and combination in general. The problem is formulated ... DARPA
TransTac program for funding and the anonymous
reviewers for their constructive suggestions.
References
N. F. Ayan. 2005. Combining Linguistic and Machine Learning Techniques for Word Alignment Improvement....
...
model for Chinese word segmentation was pro-
posed. Gao et al. (2005) further developed it to a
linear mixture model. In these statistical models,
language models are essential for word segmentation ... bigram probability $P_m(w_y \mid w_x)$ for seen bigram $w_x w_y$ in training corpus, unigram probability $P_m(w)$ and backoff coefficient $\alpha_m(w)$ for any word $w$. For any $w_x$ and $w_y$ in the vocabulary, ... pages 1001–1008,
Sydney, July 2006.
© 2006 Association for Computational Linguistics
Discriminative Pruning of Language Models for
Chinese Word Segmentation
Jianfeng Li Haifeng Wang Dengjun...
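The backoff bigram quantities introduced in the segmentation discussion above (seen-bigram probability $P_m(w_y \mid w_x)$, unigram probability $P_m(w)$, and backoff coefficient $\alpha_m(w)$) combine in the standard backoff way; the sketch below assumes that usual rule, with illustrative names and values, rather than the paper's exact estimator.

```python
# Standard backoff bigram estimate: use the stored probability for a bigram
# seen in training, otherwise back off to alpha(w_x) times the unigram
# probability of w_y.
def backoff_bigram_prob(w_x, w_y, bigram_p, unigram_p, alpha):
    if (w_x, w_y) in bigram_p:
        return bigram_p[(w_x, w_y)]
    return alpha.get(w_x, 1.0) * unigram_p.get(w_y, 0.0)
```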