
Machine translation (MT) is “a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.”
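The "simple substitution of words" mentioned in the definition can be illustrated with a minimal sketch. The lexicon and the function name `substitute_words` are invented for this example and do not come from any real MT system; the point is only to show why pure word-for-word replacement is not enough.

```python
# Toy Spanish -> English lexicon; the entries are illustrative assumptions.
LEXICON = {
    "el": "the",
    "eclecticismo": "eclecticism",
    "es": "is",
    "un": "a",
    "estilo": "style",
    "mixto": "mixed",
}


def substitute_words(sentence: str, lexicon: dict[str, str]) -> str:
    """Translate word by word; unknown words are passed through unchanged."""
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())


print(substitute_words("El eclecticismo es un estilo mixto", LEXICON))
# prints: the eclecticism is a style mixed
```

The output keeps the Spanish word order ("style mixed"), which is the kind of typological difference that the more complex, corpus-based techniques try to handle.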

In computer-aided translation, or more precisely Machine-Aided Human Translation (MAHT), by contrast, the translation is performed by a human and the computer offers supporting tools.

Multilingual Content Management has two main functions: facilitating the creation of content on a web site and the presentation of that content. It provides the tools needed across the whole life cycle of the content: creation, management, presentation, maintenance and updating.

Translation is “the action of interpretation of the meaning of a text, and subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the ‘target text.’”

SOURCES:

Here is the same text translated into five different languages:

SPANISH:
Eclecticismo es una especie de estilo mixto en las bellas artes, a las cuales los rasgos son tomados de varias fuentes y estilos. Considerablemente, el eclecticismo casi nunca constituyó un estilo específico en el arte: es caracterizado por el hecho que esto no era un estilo particular.

CATALAN:
Eclecticisme és una espècie d’estil mixt en les belles arts, a les quals els trets són presos de diverses fonts i estils. Considerablement, l’eclecticisme gairebé mai no va constituir un estil específic en l’art: és caracteritzat pel fet que això no era un estil particular.

FRENCH:
Un éclectisme est une espèce de style mixte dans les beaux arts, à lesquels les traits sont pris par quelques fontaines(sources) et styles. Considérablement, l’éclectisme a constitué presque jamais un style spécifique dans l’art : il est caractérisé par le fait que ce n’était pas un style particulier.

ENGLISH:
Eclecticism is a species(kind) of mixed style in the fine arts, to which the features are taken of several sources(fountains) and styles. Considerably, the eclecticism almost never constituted a specific style in the art: it(he) is characterized by the fact that this was not a particular style. 

GERMAN:
Eklektizismus ist eine Art Mischstil in den schönen Künsten, zu denen die Eigenschaften von mehreren Quellen und Stilen genommen werden. Beträchtlich setzte der Eklektizismus fast nie einen spezifischen Stil in der Kunst ein: es wird durch die Tatsache charakterisiert, dass das nicht ein besonderer Stil war.  

 

 

TOOLS: Translendium, Reverso

 

 

 

The FEMTI report focuses on the evaluation of MT and other language-processing applications. According to FEMTI (Framework for the Evaluation of Machine Translation in ISLE), the main characteristics of a translation task are:

  • Assimilation.
    “The ultimate purpose of the assimilation task is to monitor a large volume of texts produced by people outside the organization, usually in several languages.”
  • Dissemination.
    “The ultimate purpose of dissemination is to deliver to others a translation of documents produced inside the organization.”
  • Communication.
    “The ultimate purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.”

 

SOURCES:

 

 

 

Grammar Induction, also known as Grammar Inference, is “the process of learning grammars and languages from data”. There are many approaches to this topic; the best known are:

  • Learning Recursive Transition Networks, which works by converting grammatically correct sentences into transition networks that are similar to finite-state diagrams
  • Learning context-free grammars (CFG) using Version Spaces
  • Learning nondeterministic pushdown automata (NPDA) using Genetic Search
  • Learning deterministic CFG using Connectionist Networks
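To give an idea of what "converting grammatically correct sentences into transition networks" means, here is a minimal sketch that builds the simplest possible network, a prefix tree, from a few example sentences. It is a simplification made for illustration: the network is not recursive (it has no sub-networks), and the function names `build_network` and `accepts` are hypothetical.

```python
def build_network(sentences):
    """Build a prefix-tree transition network from example sentences.

    States are integers and state 0 is the start state. Returns the
    transition table {state: {word: next_state}} and the accepting states.
    """
    transitions = {0: {}}
    accepting = set()
    for sentence in sentences:
        state = 0
        for word in sentence.split():
            if word not in transitions[state]:
                new_state = len(transitions)      # next unused state number
                transitions[new_state] = {}
                transitions[state][word] = new_state
            state = transitions[state][word]
        accepting.add(state)                      # the sentence ends here
    return transitions, accepting


def accepts(transitions, accepting, sentence):
    """Follow the transitions word by word and check the final state."""
    state = 0
    for word in sentence.split():
        if word not in transitions[state]:
            return False
        state = transitions[state][word]
    return state in accepting


examples = ["the dog barks", "the cat sleeps", "the dog sleeps"]
network, final_states = build_network(examples)
print(accepts(network, final_states, "the dog sleeps"))  # prints: True
print(accepts(network, final_states, "the cat barks"))   # prints: False
```

The network only accepts the sentences it has seen. Induction algorithms typically go further and merge states that behave alike, so that the learned network generalises to unseen sentences such as "the cat barks".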

It should also be mentioned that there are different models of grammar induction, such as learning from examples, learning using examples and queries, incremental versus non-incremental learning, distribution-free models of learning, and learning under various distributional assumptions. Related work covers impossibility results, complexity results and, finally, characterizations of the representational and search biases of grammar induction algorithms.

 

SOURCES:

Word Sense Disambiguation (WSD) is one of the topics the Stanford Natural Language Processing Group focuses on, because of its relation to translation.
It can be defined as the task of determining which sense of a word fits the context in which it is used. This is one of the examples that appears in the Wikipedia article:

“Consider the word bass, two distinct senses of which are:

1. a type of fish
2. tones of low frequency

and the sentences:

  1. I went fishing for some sea bass
  2. The bass line of the song is very moving”

It is difficult for a machine translation system to choose the correct word because it cannot tell the two senses apart, and the two senses often correspond to different words in the target language.

Several WSD paradigms have been proposed for machine translation (MT):

  • Knowledge-based approaches: depend on manual linguistic knowledge and disambiguation rules.
  • Corpus-based approaches: make use of knowledge taken from text using machine learning techniques.
  • Hybrid approaches: mix characteristics of the two previous ones.
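A knowledge-based approach can be sketched with a simplified Lesk-style overlap: the chosen sense is the one whose dictionary gloss shares the most words with the sentence. This is only an illustration applied to the "bass" example; the glosses, the stopword list and the names `content_words` and `disambiguate` are assumptions made up for the sketch, not taken from a real dictionary or system.

```python
STOPWORDS = {"a", "an", "the", "of", "for", "is", "i", "some", "or", "in", "to"}

# Hand-written glosses, invented for this illustration.
SENSES = {
    "bass (fish)": "a type of fish found in the sea or in rivers caught by fishing",
    "bass (music)": "tones of low frequency in a song or piece of music",
}


def content_words(text):
    """Lowercase the text, strip basic punctuation and drop stopwords."""
    return {w.strip(".,").lower() for w in text.split()} - STOPWORDS


def disambiguate(sentence, senses):
    """Return the sense whose gloss overlaps most with the sentence."""
    context = content_words(sentence)
    return max(senses, key=lambda s: len(context & content_words(senses[s])))


print(disambiguate("I went fishing for some sea bass", SENSES))
# prints: bass (fish)
print(disambiguate("The bass line of the song is very moving", SENSES))
# prints: bass (music)
```

The sketch depends entirely on the hand-written glosses, which is exactly the manual linguistic knowledge that knowledge-based approaches require.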

Nowadays, corpus-based and hybrid techniques are the most widely used in recent work because they give very good results. Although they help to resolve ambiguity, the lack of effective disambiguation mechanisms is still one of the main reasons for the unsatisfactory results of machine translation.
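A corpus-based approach replaces hand-written rules with statistics learned from labelled text. The following is a minimal sketch of a naive Bayes classifier with add-one smoothing for the word "bass". The four training sentences are invented for the example and far too few for real use, and the names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Tiny hand-labelled training set, invented for illustration.
TRAINING = [
    ("fish", "he caught a huge bass in the lake"),
    ("fish", "grilled sea bass with lemon"),
    ("music", "she plays bass guitar in a band"),
    ("music", "turn up the bass on the speakers"),
]


def train(examples):
    """Count how often each word appears with each sense."""
    word_counts = defaultdict(Counter)   # sense -> word frequencies
    sense_counts = Counter()             # sense -> number of examples
    for sense, sentence in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(sentence.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, sense_counts, vocabulary


def classify(sentence, word_counts, sense_counts, vocabulary):
    """Pick the sense with the highest smoothed log-probability."""
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense in sense_counts:
        score = math.log(sense_counts[sense] / total)
        denominator = sum(word_counts[sense].values()) + len(vocabulary)
        for word in sentence.lower().split():
            score += math.log((word_counts[sense][word] + 1) / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAINING)
print(classify("I went fishing for some sea bass", *model))   # prints: fish
print(classify("The bass line of the song", *model))          # prints: music
```

With so little data the decision rests on one or two words ("sea", "the"), which shows why corpus-based methods need large amounts of text to give reliable results.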

SOURCES:

According to Wikipedia, computer-assisted language learning (CALL) is “an approach for teaching and learning foreign languages where the computer and computer-based resources such as the Internet are used to present, reinforce and assess material to be learned”.

Computers are increasingly integrated into foreign-language learning, both by providers and in language materials. There is now a tendency to use CALL teaching materials for independent learning, without the supervision of a teacher.

On the other hand, the gradual integration of technology into classrooms over the last twenty years reflects the development of the technologies themselves. The tools used in CALL make teaching and learning more interactive for teachers and students, because they rely on the Internet and digital platforms where students can work on and assess their own knowledge and progress in the language they are studying.

Lourdes Ortega, Professor in the Department of Second Language Studies at the University of Hawaii, says: “The instructional use of Local Area Networks (which link computers in a laboratory or a classroom to each other) and the usage of electronic mail, bulletin boards, or discussion lists on worldwide networks such as the Internet as educational tools have introduced the possibility of real-time, synchronous, many-to-many written discussion by a whole class or by smaller groups within the class. Both technologies underscore a view of learning as a collaborative act that happens in a social and political context, with learners and teacher working together in the new medium of networked interaction.”

Furthermore, CALL is highly developed in Europe and the USA. Organizations such as CALICO (Computer Assisted Language Instruction Consortium), EUROCALL, IALLT (The International Association for Language Learning Technology), APACALL (Asia-Pacific Association for Computer-Assisted Language Learning) and PacCALL (Pacific Association for Computer Assisted Language Learning) centre their research on this topic.

To conclude, this new teaching tool continues to improve the learning of foreign languages and to make them more accessible to those who use it.

SOURCES:

The best-known research centres in Human Language Technologies work on a range of projects, summarised below:

1. Language Technology Lab (DFKI, Germany)

The LTL’s aim is to improve language technologies through novel computational techniques for processing text, speech and knowledge, and through a deeper understanding of human language. Its goal is to create software products that have some knowledge of human language, because the main obstacle in the interaction between humans and computers is a communication problem. It is centred on areas such as:

  • exploiting – and automatically extending – ontologies for content processing
  • tighter integration of shallow and deep techniques in processing
  • enriching deep processing with statistical methods
  • combining language checking with structuring tools in document authoring
  • document indexing for German and English
  • automatically associating recognized information with related information and thus building up collective knowledge
  • automatically structuring and visualizing extracted information
  • processing information encoded in multiple languages, among them Chinese and Japanese

2. National Centre for Language Technology (Ireland)

It conducts research into the processing of human language by computers, such as speech recognition and synthesis, machine translation, human-computer interfaces, information retrieval and extraction, the teaching and learning of languages using computers, and software localisation and globalisation. The main topics it focuses on are:

  • CALL (Computer Assisted Language Learning)
  • Corpus Linguistics
  • Machine Translation and Translation Technology
  • Treebank-Based Unification Grammar Acquisition
  • Semantics
  • Speech Technology
  • Multilingual Information Retrieval/Extraction
  • Language Evolution

3. The Stanford Natural Language Processing Group

Their work ranges from basic research in computational linguistics to key applications in human language technology, and covers areas such as sentence understanding, probabilistic parsing and tagging, biomedical information extraction, grammar induction, word sense disambiguation, and automatic question answering.
Here are some of the research areas they focus on:

  • Shallow Semantic Parsing
  • Question Answering (QA)
  • Knowledge Representation from Text
  • Semantic Taxonomy Induction
  • Word Sense Disambiguation (WSD)
  • Grammar Induction
  • Morphology & Phonology Induction

SOURCES:

Hans Uszkoreit (Q.1)


Hans Uszkoreit is Professor of Computational Linguistics at the Department of Computational Linguistics and Phonetics of Saarland University at Saarbrücken (Germany).
He studied Linguistics and Computer Science at the Technical University of Berlin and the University of Texas. He began working on a translation research project at the Linguistics Research Center, where his successful career started. He also worked as a computer scientist at the Artificial Intelligence Center of SRI International in Menlo Park.

In 1989 he became the head of the newly founded Language Technology Lab, and he is now a member of many associations:

  • International Committee of Computational Linguistics (ICCL)
  • European Academy of Sciences
  • European Association for Logic, Language and Information (former president)
  • Executive Board of the European Network of Language and Speech
  • Board of the European Language Resources Association (ELRA).

He is one of the best-known scholars in the field of language technology, and some of his most relevant works are:

–> Uszkoreit, H. (1987): Word Order and Constituent Structure in German, CSLI Lecture Notes 8, Center for the Study of Language and Information, Stanford University, Stanford, Calif.
–> Uszkoreit, H. (2002): New Chances for Deep Linguistic Processing. In: Proceedings of COLING 2002, Taipei.

SOURCES:

 

According to Viviane Reding (Member of the European Commission responsible for information society and media) and Ján Figel’ (Member of the European Commission responsible for education, training, culture and multilingualism) in their work “Human Language Technologies for Europe”, HLT are changing our habits and becoming increasingly important in our society.

“We are now aware of another computing technology with enormous potential and it is fair to say that we are facing another revolution. It is proceeding rather slowly, and many of the research topics have been addressed for so many years that some have given up hope. It is very hard to teach computers to handle human speech and language – language in both the spoken and the written form – in the various ways that we humans can master: to speak naturally, understand what has been said (and meant), summarize a document or a conversation, find an audio recording given its content, translate from one language to another. We want to be able to interface with machines by voice and language, because we use these communication means, and we want computers to process this form of information in all the ways that we consider useful. The set of technologies which do this are known as human language technologies (HLT). Automatic speech recognition, machine translation and text to speech are the more prominent technologies, but there are many more. As with previous advances in computing, networking and digitalization, HLT has the potential to radically alter how we think about and work with information because we will be able to access and process the information encoded in language in fundamentally new ways.”

Another relevant scholar who has worked in this area is Hans Uszkoreit, who says: “Language technology — sometimes also referred to as human language technology — comprises computational methods, computer programs and electronic devices that are specialized for analyzing, producing or modifying texts and speech. These systems must be based on some knowledge of human language. Therefore language technology defines the engineering branch of computational linguistics.”

 

SOURCES:

 

 

 

· “Human Language Technologies for Europe” (11-04-08/22:01)
http://www.tc-star.org/pubblicazioni/D17_HLT_ENG.pdf

· Viviane Reding, homepage. (11-04-08/22:02)
http://ec.europa.eu/commission_barroso/reding

· Ján Figel’, homepage. (11-04-08/22:06)
http://ec.europa.eu/commission_barroso/figel/index_es.htm

World Wide Web

According to Wikipedia, the World Wide Web is “a system of interlinked, hypertext documents accessed via the Internet. With a Web browser, a user views Web pages that may contain text, images, videos, and other multimedia and navigates between them using hyperlinks.”

The Web emerged in 1989 at CERN, in Geneva (Switzerland). Tim Berners-Lee invented the system while trying to find an effective solution to the proliferation of all kinds of information available on the network.
He developed the basic architecture of what the Web is today, and he describes it like this:
“The WWW is a way of viewing the information available on the Internet seamlessly. Using hypertext jumps and searches, the user navigates through a world of information that is partly created by hand. As with other Internet applications such as email, instant messaging, and voice over IP, the Web would have been impossible to create without the Internet itself operating as an open platform.”

According to Berners-Lee, the main characteristics of the Web are:

  1. Hypermedia: We can find multimedia information and navigate through it.
  2. Heterogeneous: It can bring together older services (Gopher, News, FTP…) and present their information through a single client program.
  3. Collaborative: Anyone can add information to the Web, which others can then look up.
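The hyperlinks that make this navigation possible are ordinary markup inside each page. As a small sketch, the code below uses Python's standard `html.parser` module to list the link targets in a page. The page content and the `example.org` addresses are made up for the illustration, and `LinkCollector` is a hypothetical class name.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> hyperlinks in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A made-up page used only for illustration.
PAGE = """
<html><body>
  <p>See the <a href="https://example.org/history">history</a> page
  or download the <a href="ftp://example.org/files/report.txt">report</a>.</p>
</body></html>
"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
# prints: ['https://example.org/history', 'ftp://example.org/files/report.txt']
```

The second link points to an FTP address, which illustrates the heterogeneous character of the Web: a single page can refer to resources served by older protocols.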


Sources:
