... is outlined in
Section 3.
2 Statistical Machine Translation
The goal of the translation process in statisti-
cal machine translation can be formulated as fol-
lows: A source language string
is ... shown in Table 5.
6.2 Training and test perplexities
In order to compute the training and test perplex-
ities, we split the whole aligned training corpus
in two parts as shown...
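As a point of reference, perplexity is conventionally defined as the exponential of the negative average log-probability per token. The sketch below illustrates only that generic definition; the excerpt does not show which model or corpus split the authors use, so the function name `perplexity` and its per-token probability input are assumptions made for illustration.

```python
import math


def perplexity(token_probs):
    """Return exp(-(1/N) * sum(log p_i)) for a list of token probabilities.

    token_probs is a hypothetical input: the probability some model
    assigns to each token of a corpus (each value must be > 0).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


# A model that gives every token probability 1/4 is as uncertain as a
# uniform choice among four options.
uniform_pp = perplexity([0.25, 0.25, 0.25, 0.25])
```

Under this definition, the same function would be applied once to the training part and once to the held-out part of the corpus.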
... domains: text news, trained on a large cor-
pus, and spoken travel conversation, trained on a sig-
nificantly smaller corpus. We show that segmenting
the Arabic target in training and decoding ... for re-
combining the segmented Arabic, and compare their
effect on translation. We also report on applying
Factored Translation Models (Koehn and Hoang,
2007) for English-to-Arabic translati...
... results in line
- are obtained by training ’float’ weights only. Here,
the training is carried out by running only once over
% of the training data. The model including the binary
features is trained ... change in perfor-
mance between training on the original training data in
Eq. 2 or on the modified training data in Eq. 10. Line
shows that even when training the float weights on an
eve...
... search algorithm in de-
tail. Finally, experimental results for a bilingual cor-
pus are reported.
1.1 Statistical Machine Translation
In statistical machine translation, the goal of the
search ... parameters of
the model.
1.3 Search in Statistical Machine
Translation
In the last few years, there have been a number of
papers considering the problem of finding an...
... for do-
main adaptation in machine translation.
1 Introduction
Statistical machine translation (SMT) systems re-
quire large parallel corpora in order to be able to
obtain a reasonable translation ... Blvd, Gatineau, QC, Canada
george.foster@nrc.gc.ca
Abstract
Statistical machine translation is often faced
with the problem of combining training data
from many diverse sou...
... seedlings.
The measured values are summarized in Table 2. The total
cytokinin content in the grains was approximately threefold
lower than in young seedlings. The increase was mainly
observed in ... modeling of the flavoprotein catabolizing
plant hormones cytokinins. In Recent Research Developments in
Proteins, vol. 2. (Pandalai, S.G., ed.), pp. 63–81. Transworld
Research Networ...
... Charniak, Kevin Knight, and Kenji Yamada.
2003. Syntax-based language models for statistical
machine translation. In Proceedings of the Ninth Ma-
chine Translation Summit of the International ... Osborne, and Philipp Koehn.
2007. CCG supertags in factored statistical machine
translation. In Proceedings of the Second Workshop
on Statistical Machine Translation, pages 9...
... Decoding
The stack (also called A*) decoding algorithm is
a kind of best-first search which was first intro-
duced in the domain of speech recognition (Je-
linek, 1969). By building solutions incremen-
tally ... to right, but allowing the decoder to consume
its input in any order. This change makes decod-
ing significantly more complex in MT; instead of
knowing the order of the input in...
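The best-first idea described above can be sketched with a priority queue of partial hypotheses that may cover the input in any order. This is a toy illustration, not the decoder from the excerpt: the word-for-word translation table, the `reorder_cost` penalty, and the function name `stack_decode` are all hypothetical, and no heuristic estimate of future cost is used, so the search reduces to uniform-cost search.

```python
import heapq
from itertools import count


def stack_decode(source, table, reorder_cost=0.5):
    """Best-first search over partial translations.

    source: list of source words.
    table: dict mapping a source word to a list of (target_word, cost)
        pairs with cost >= 0 (for example, negative log-probabilities).
    reorder_cost: toy penalty per position jumped (an assumption).
    Returns (total_cost, target_words) for the cheapest full hypothesis,
    or None if no complete hypothesis exists.
    """
    tie = count()  # unique tie-breaker so the heap never compares sets
    # Hypothesis: (cost, tie, covered positions, last position, output)
    heap = [(0.0, next(tie), frozenset(), -1, ())]
    while heap:
        cost, _, covered, last, output = heapq.heappop(heap)
        if len(covered) == len(source):
            # Costs are non-negative, so the first complete hypothesis
            # popped from the queue is the cheapest one.
            return cost, list(output)
        for i, word in enumerate(source):
            if i in covered:
                continue
            # Penalize consuming the input out of left-to-right order.
            jump = abs(i - (last + 1)) * reorder_cost
            for target, t_cost in table.get(word, []):
                heapq.heappush(
                    heap,
                    (cost + t_cost + jump, next(tie),
                     covered | {i}, i, output + (target,)),
                )
    return None


toy_table = {
    "la": [("the", 0.1)],
    "maison": [("house", 0.2), ("home", 0.7)],
}
best_cost, best_output = stack_decode(["la", "maison"], toy_table)
```

Because any uncovered position may be chosen at each step, the number of hypotheses grows much faster than in a strictly left-to-right search, which is the added complexity the passage refers to.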
... advantages of integrating EBMT with
RBMT.
1 INTRODUCTION
Machine Translation requires handcrafted and
complicated large-scale knowledge (Nirenburg 1987).
Conventional machine translation systems ... FLOW
In this section, the basic idea of EBMT,
which is general and applicable to many phenomena
dealt with by machine translation, is shown.
In order to conquer these problems i...
... Minimum Error Rate Training in Statistical Machine Translation
Franz Josef Och
Information Sciences Institute
University of Southern California
4676 Admiralty Way, Suite 1001
Marina del ... sentences.
4 Training Criteria for Minimum Error
Rate Training
In the following, we assume that we can measure
the number of errors in a sentence by comparing it
with a reference sentence using a...