... n is the number of features, α_i is the weight coefficient of the i-th feature and φ_i(C, T) is the evaluation of the i-th feature for context C and tag T. In the averaged perceptron, the values ... iterative training by the averaged perceptron algorithm. The binary features describe the tag being predicted and its context. They can be derived from any information we already have about the text ... data, different for each iteration (in this order). In other words, the training algorithm proper does not change at all: it is the data and their selection (including the selection of the way they are...
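
The weighted sum of binary features described above, together with averaging of the weights over training steps, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`score`, `predict`, `train`), the toy feature strings, and the choice to make several passes over a single fixed data set are all assumptions made for illustration. A feature φ_i(C, T) is represented as a (context feature, tag) pair that evaluates to 1 when the context feature is active and T is the candidate tag, and to 0 otherwise.

```python
def score(weights, features, tag):
    """Sum of alpha_i * phi_i(C, T) over the active binary features.

    `features` lists the context features active in C; each pair
    (feature, tag) stands for one binary feature phi_i with value 1.
    """
    return sum(weights.get((f, tag), 0.0) for f in features)


def predict(weights, features, tags):
    """Return the highest-scoring tag (ties go to the earliest tag)."""
    return max(tags, key=lambda t: score(weights, features, t))


def train(data, tags, epochs):
    """Hypothetical training loop: perceptron updates plus weight averaging.

    `data` is a list of (active_features, gold_tag) pairs. The returned
    weights are the mean of the weight vector taken after every step.
    """
    weights = {}   # current alpha_i values
    totals = {}    # running sum of each alpha_i over all steps
    steps = 0
    for _ in range(epochs):
        for features, gold in data:
            guess = predict(weights, features, tags)
            if guess != gold:
                # Reward the gold tag's features, penalise the wrong guess.
                for f in features:
                    weights[(f, gold)] = weights.get((f, gold), 0.0) + 1.0
                    weights[(f, guess)] = weights.get((f, guess), 0.0) - 1.0
            # Accumulate the current weights so they can be averaged.
            for key, value in weights.items():
                totals[key] = totals.get(key, 0.0) + value
            steps += 1
    return {key: total / steps for key, total in totals.items()}


# Toy data, invented for this example only.
TAGS = ["DET", "NOUN", "VERB"]
DATA = [
    (["word=the"], "DET"),
    (["word=dog"], "NOUN"),
    (["word=barks"], "VERB"),
]

averaged = train(DATA, TAGS, epochs=2)
print(predict(averaged, ["word=dog"], TAGS))
```

The sketch accumulates the full weight vector after every step, which is simple to verify but slow; practical implementations usually track the averages lazily.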