
Machine translation (MT) is “a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.”

In computer-aided translation, or more precisely Machine-Aided Human Translation (MAHT), by contrast, the translation is performed by a human and the computer offers supporting tools.
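The "simple substitution of words" that MT performs at its most basic level can be illustrated with a minimal sketch. The tiny English-to-Spanish lexicon and the function name `substitute_words` are assumptions made up for this example; they do not come from any real MT system.

```python
# Toy English-to-Spanish lexicon; the entries are illustrative assumptions.
LEXICON = {
    "the": "el",
    "cat": "gato",
    "drinks": "bebe",
    "milk": "leche",
}


def substitute_words(sentence, lexicon):
    """Replace each word with its lexicon entry, keeping unknown words as-is."""
    words = sentence.lower().split()
    return " ".join(lexicon.get(word, word) for word in words)


print(substitute_words("The cat drinks milk", LEXICON))  # el gato bebe leche
```

Because every word is translated in isolation, a sketch like this cannot reorder words, recognise phrases or handle idioms, which is where the corpus techniques mentioned in the definition come in.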

Multilingual content management has two main functions: facilitating the creation of content for a website and the presentation of that content. It provides the tools needed across the whole life cycle of the content: creation, management, presentation, maintenance and updating.

Translation is “the action of interpretation of the meaning of a text, and subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the ‘target text.’”


Here we have the same text translated into five different languages:

SPANISH:
Eclecticismo es una especie de estilo mixto en las bellas artes, a las cuales los rasgos son tomados de varias fuentes y estilos. Considerablemente, el eclecticismo casi nunca constituyó un estilo específico en el arte: es caracterizado por el hecho que esto no era un estilo particular.

CATALAN:
Eclecticisme és una espècie d’estil mixt en les belles arts, a les quals els trets són presos de diverses fonts i estils. Considerablement, l’eclecticisme gairebé mai no va constituir un estil específic en l’art: és caracteritzat pel fet que això no era un estil particular.

FRENCH:
Un éclectisme est une espèce de style mixte dans les beaux arts, à lesquels les traits sont pris par quelques fontaines(sources) et styles. Considérablement, l’éclectisme a constitué presque jamais un style spécifique dans l’art : il est caractérisé par le fait que ce n’était pas un style particulier.

ENGLISH:
Eclecticism is a species(kind) of mixed style in the fine arts, to which the features are taken of several sources(fountains) and styles. Considerably, the eclecticism almost never constituted a specific style in the art: it(he) is characterized by the fact that this was not a particular style. 

GERMAN:
Eklektizismus ist eine Art Mischstil in den schönen Künsten, zu denen die Eigenschaften von mehreren Quellen und Stilen genommen werden. Beträchtlich setzte der Eklektizismus fast nie einen spezifischen Stil in der Kunst ein: es wird durch die Tatsache charakterisiert, dass das nicht ein besonderer Stil war.  

 

 

TOOLS: Translendium, Reverso

 

 

 

FEMTI, the Framework for the Evaluation of Machine Translation in ISLE, focuses on the evaluation of MT and other language processing applications. According to this framework, the main characteristics of a translation task are:

  • Assimilation.
    “The ultimate purpose of the assimilation task is to monitor a large volume of texts produced by people outside the organization, usually in several languages.”
  • Dissemination.
    “The ultimate purpose of dissemination is to deliver to others a translation of documents produced inside the organization.”
  • Communication.
    “The ultimate purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.”

 


 

 

 

Grammar induction, also known as grammar inference, is “the process of learning grammars and languages from data”. There are many approaches to this problem; the best known are:

  • Learning Recursive Transition Networks, which works by converting grammatically correct sentences into transition networks that are similar to finite state diagrams.
  • Learning CFG (context-free grammars) using Version Spaces.
  • Learning NPDA (nondeterministic pushdown automata) using Genetic Search.
  • Learning Deterministic CFG using Connectionist Networks.
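To give an idea of what converting correct sentences into a transition network looks like, here is a minimal sketch that builds a prefix-tree network from a few positive examples. It is a deliberately simplified illustration, not the Recursive Transition Network learning algorithm itself: there is no recursion and no merging of states, so the network accepts only the sentences it was built from. The function names and example sentences are assumptions for this sketch.

```python
def build_network(sentences):
    """Build a prefix-tree transition network from positive example sentences.

    States are integers starting at 0; transitions[state][word] is the next state.
    """
    transitions = {0: {}}
    accepting = set()
    for sentence in sentences:
        state = 0
        for word in sentence.split():
            if word not in transitions[state]:
                # Unseen word in this state: create a new state and an arc to it.
                new_state = len(transitions)
                transitions[new_state] = {}
                transitions[state][word] = new_state
            state = transitions[state][word]
        accepting.add(state)  # a sentence may end in this state
    return transitions, accepting


def accepts(network, sentence):
    """Follow the arcs word by word; accept only if we end in an accepting state."""
    transitions, accepting = network
    state = 0
    for word in sentence.split():
        if word not in transitions[state]:
            return False
        state = transitions[state][word]
    return state in accepting


EXAMPLES = ["the dog barks", "the cat sleeps", "the cat barks"]
network = build_network(EXAMPLES)
print(accepts(network, "the cat sleeps"))  # True
print(accepts(network, "the dog sleeps"))  # False
```

A real induction algorithm would go further and generalise, for instance by merging states that behave alike, so that unseen but grammatical sentences such as "the dog sleeps" could also be accepted.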

It should also be mentioned that there are different models of grammar induction, such as learning from examples, learning using examples and queries, incremental versus non-incremental learning, distribution-free models of learning, and learning under various distributional assumptions. Work in this area also covers impossibility results, complexity results and, finally, characterizations of the representational and search biases of grammar induction algorithms.

 


Word Sense Disambiguation (WSD) is one of the topics the Stanford Natural Language Processing Group focuses on, because of its relation to the translation problem.
It can be defined as the task of determining which of a word’s senses fits the context in which the word is used. This is one of the examples that appears in the Wikipedia article:

“Consider the word bass, two distinct senses of which are:

1. a type of fish
2. tones of low frequency

and the sentences:

  1. I went fishing for some sea bass
  2. The bass line of the song is very moving”

For a machine translation system it is difficult to choose the correct word because it cannot tell the two senses apart, and each sense may call for a different word in the target language.
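One simple way to pick a sense for "bass" is to compare the words of the sentence with a short description of each sense and choose the sense that shares the most words. The sketch below follows this idea in the spirit of the simplified Lesk method. The sense descriptions, the stopword list and the function names are hand-written assumptions for this example, not data from a real dictionary.

```python
# Hand-written descriptions of the two senses of "bass"; illustrative assumptions.
SENSES = {
    "fish": "a type of fish found in the sea or in rivers caught by fishing",
    "music": "tones of low frequency in a song or a piece of music",
}

# Very small stopword list, also an assumption for this sketch.
STOPWORDS = {"a", "an", "the", "of", "in", "or", "by", "for", "is", "i", "some"}


def content_words(text):
    """Lowercase the text, strip basic punctuation and drop stopwords."""
    words = {w.strip(".,").lower() for w in text.split()}
    return words - STOPWORDS


def disambiguate(sentence, senses):
    """Return the sense whose description shares the most words with the sentence."""
    context = content_words(sentence)
    return max(senses, key=lambda s: len(context & content_words(senses[s])))


print(disambiguate("I went fishing for some sea bass", SENSES))          # fish
print(disambiguate("The bass line of the song is very moving", SENSES))  # music
```

The first sentence shares "fishing" and "sea" with the fish description, and the second shares "song" with the music description. With no overlap at all the function simply returns the first sense, which shows how fragile this kind of word matching is.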

Several WSD paradigms have been proposed for machine translation (MT):

  • Knowledge-based approaches: depend on manual linguistic knowledge and disambiguation rules.
  • Corpus-based approaches: make use of knowledge taken from text using machine learning techniques.
  • Hybrid approaches: mix characteristics of the two previous ones.

Nowadays, recent work mostly uses the corpus-based and hybrid techniques because they give very good results. Although they help to resolve the problem of ambiguity, the lack of effective disambiguation mechanisms is still one of the main reasons for the unsatisfactory results of machine translation.


According to Wikipedia, CALL (computer-assisted language learning) is “an approach for teaching and learning foreign languages where the computer and computer-based resources such as the Internet are used to present, reinforce and assess material to be learned”.

The use of computers for learning foreign languages is increasingly common among providers of language materials. Nowadays there is a tendency to use CALL teaching materials for independent learning, without the supervision of a teacher.

On the other hand, the gradual integration of technology into classrooms over the last twenty years reflects the development that these technologies are undergoing. The tools used in CALL make teaching and learning more interactive for both teacher and students, because they rely on the Internet and digital platforms where students can practise and assess their own knowledge of, and progress in, the language they are studying.

Lourdes Ortega, professor in the Department of Second Language Studies at the University of Hawai‘i, says: “The instructional use of Local Area Networks (which link computers in a laboratory or a classroom to each other) and the usage of electronic mail, bulletin boards, or discussion lists on worldwide networks such as the Internet as educational tools; have introduced the possibility of real-time, synchronous, many-to-many written discussion by a whole class or by smaller groups within the class. Both technologies underscore a view of learning as a collaborative act that happens in a social and political context, with learners and teacher working together in the new medium of networked interaction.”

Furthermore, Computer Assisted Language Learning (CALL) is well developed in Europe and the USA. Organizations such as CALICO (Computer Assisted Language Instruction Consortium), EUROCALL, IALLT (International Association for Language Learning Technology), APACALL (Asia-Pacific Association for Computer-Assisted Language Learning) and PacCALL (Pacific Association for Computer Assisted Language Learning) centre their research on this topic.

To conclude, this new teaching tool continues to improve the learning of foreign languages and to make them more accessible to those who use it.


In the field of Human Language Technologies, the best-known research centres base their work on a range of projects, described below:

1. Language Technology Lab (DFKI, Germany)

The LTL’s aim is to improve language technologies through novel computational techniques for processing text, speech and knowledge, and through a deeper understanding of human language. Its goal is to create software products that have some knowledge of human language, because the main obstacle in the interaction between human and computer is a communication problem. It is centred on areas such as:

  • exploiting – and automatically extending – ontologies for content processing
  • tighter integration of shallow and deep techniques in processing
  • enriching deep processing with statistical methods
  • combining language checking with structuring tools in document authoring
  • document indexing for German and English
  • automatically associating recognized information with related information and thus building up collective knowledge
  • automatically structuring and visualizing extracted information
  • processing information encoded in multiple languages, among them Chinese and Japanese

2. National Centre for Language Technology (Ireland)

It conducts research into the processing of human language by computers, such as speech recognition and synthesis, machine translation, human-computer interfaces, information retrieval and extraction, the teaching and learning of languages using computers, and software localisation and globalisation. The main topics it focuses on are:

  • CALL (Computer Assisted Language Learning)
  • Corpus Linguistics
  • Machine Translation and Translation Technology
  • Treebank-Based Unification Grammar Acquisition
  • Semantics
  • Speech Technology
  • Multilingual Information Retrieval/Extraction
  • Language Evolution

3. The Stanford Natural Language Processing Group

Their work ranges from basic research in computational linguistics to key applications in human language technology, and covers areas such as sentence understanding, probabilistic parsing and tagging, biomedical information extraction, grammar induction, word sense disambiguation, and automatic question answering.
Here are some of the research areas they focus on:

  • Shallow Semantic Parsing
  • Question Answering (QA)
  • Knowledge Representation from Text
  • Semantic Taxonomy Induction
  • Word Sense Disambiguation (WSD)
  • Grammar Induction
  • Morphology & Phonology Induction

