What do they mean?

April 28, 2008 by littlemisssunshines

-Machine translation (MT) is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its most basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, it attempts more complex translations, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
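The "simple substitution of words" mentioned above can be illustrated with a very small sketch. It is only a toy: the glossary, its entries and the function name are made up for this example and do not come from any real MT system. Each word has one fixed translation, so the sketch cannot choose between senses or fix agreement.

```python
# Toy English-to-Spanish glossary (hypothetical; one fixed sense per word).
GLOSSARY = {
    "the": "el",
    "judge": "juez",
    "read": "leyó",
    "sentence": "oración",  # grammatical sense only; the legal sense would be "sentencia"
}


def substitute_words(text: str, glossary: dict[str, str]) -> str:
    """Translate word by word; words missing from the glossary are kept unchanged."""
    return " ".join(glossary.get(word.lower(), word) for word in text.split())


print(substitute_words("The judge read the sentence", GLOSSARY))
# el juez leyó el oración
```

The output picks the wrong sense of "sentence" and the wrong article gender, which is the kind of problem that corpus-based techniques try to reduce.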

-Machine-Aided Translation: here translation proper is performed by a computer, even if a human helps by pre-editing, post-editing, or answering questions to disambiguate the source text. In Computer-Aided Translation, or more precisely Machine-Aided Human Translation (MAHT), by contrast, translation is performed by a human, and the computer offers supporting tools.

-Multilingual Content Management systems contain information, mostly in the form of more or less structured text documents, but potentially also including audio clips, video clips and images. Minimally, such a system provides mechanisms for storage and retrieval of content data, but it may also give support for indexing of documents, distributed document editing, version management, and generation of different views and guided tours. 

Finally…

-Translation technology is the type of technology that offers translation between two languages. Its aim is to provide simultaneous translation of speech from one language into another. Researchers have revealed a directional speaker system that delivers a translated audio feed to just one person in a room, removing the need for them to wear headphones. Another concept device projects translated subtitles along the bottom of one lens of a modified pair of glasses.

 

 

 

 

Sources:

-Machine translation. (2008, April 7). In Wikipedia, The Free Encyclopedia. Retrieved April 9, 2008, 11:50, from http://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=203927830

-MCM Project, Multilingual Content Management, Vaxjo University (WSCC). Retrieved April 10, 12:00, from http://wscc.info/index.php?show=53044_SWE&&page_anchor=http://wscc.info/p53044/p53044_swe.php

-Kitsite, Multilingual Content Management (2007). Retrieved April 10, 12:05, from http://www.kitsite.com/articles/multilingual-content-management.html

-Christian Boitet, 8.4 Machine-aided Human Translation. Retrieved April 12, 13:40, from http://cslu.cse.ogi.edu/HLTsurvey/ch8node6.html

-Will Knight, NewScientist.com news service, Live speech-translation technology unveiled, 18:05, 31 October 2005. Retrieved April 12, 12:38, from http://www.newscientist.com/article.ns?id=dn8241

 

Translation examples between related languages

April 20, 2008 by littlemisssunshines

An example of a translation from Galician into a related language, Spanish:


Oito galegos secuestrados en Somalia

Catro persoas lograron acceder ao atuneiro vasco armados con lanzagranadas e manteñen retida á tripulación do ‘Praia Bakio’, composta por 13 persoas de orixe africana, oito galegos e cinco vascos. As autoridades españolas non teñen constancia de que ningún dos 26 resulte ferido durante o asalto.
A pesar de que o atuneiro sufriu danos materiais durante o asalto, os danos non impiden o seu navegabilidad e gobernabilidade e, segundo o seguimento que se lle está facendo, os primeiros indicios apuntan a que o buque diríxese cara a terra firme.


Ocho gallegos secuestrados en Somalia

Cuatro personas lograron acceder al atunero vasco armados con lanzagranadas y mantienen retenida a la tripulación del ‘Playa Bakio’, compuesta por 13 personas de origen africana, ocho gallegos y cinco vascos. Las autoridades españolas no tienen constancia de que ninguno de los 26 resulte herido durante el asalto.
A pesar de que el atunero sufrió daños materiales durante el asalto, los daños no impiden su navegabilidad y gobernabilidad y, según el seguimiento que se le está haciendo, los primeros indicios apuntan a que el buque se dirige cara a tierra firme.

As can be seen, there is only one mistake in the translation, so it can be said that it is easier to obtain good results when translating between related languages. Translations between less related languages, on the other hand, usually show typical grammatical, syntactical or word-order flaws.
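To give a rough feeling for how close the two languages are on the surface, the headline of the example can be compared in both versions. This is only a sketch using Python's standard `difflib` module. The two helper functions are names invented for this illustration, and surface similarity is a crude proxy for relatedness. It says nothing about how the Comprendium translator actually works.

```python
from difflib import SequenceMatcher

galician = "Oito galegos secuestrados en Somalia"
spanish = "Ocho gallegos secuestrados en Somalia"


def shared_word_positions(source: str, target: str) -> int:
    """Count positions where both sentences have exactly the same word."""
    return sum(s == t for s, t in zip(source.split(), target.split()))


def character_similarity(source: str, target: str) -> float:
    """Similarity ratio between 0 and 1 computed by difflib's SequenceMatcher."""
    return SequenceMatcher(None, source, target).ratio()


print(shared_word_positions(galician, spanish))  # 3 (of the 5 words are identical)
print(round(character_similarity(galician, spanish), 2))
```

Three of the five words are identical and the other two differ by only a few letters, so a system translating between these languages has far fewer decisions to make than one translating between English and Spanish.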

Sources:

-Comprendium translator. Retrieved April 21, 2008, 12:57, from http://www.translendium.net:8080/home/text.do;jsessionid=E9A5ABD30639A486D97FC1824833274F

-A Nosa Terra diario. Retrieved April 21, 2008, 12:30, from http://www.anosaterra.org/nova/oito-galegos-secuestrados-en-somalia-.html



1st translation example applied to less related languages (English-to-Spanish)

April 16, 2008 by littlemisssunshines

The following are the MT systems I have used and the translations I have produced with them, using different web pages that offer quite a good range of options:

Instituto Cervantes. Servicio de traducción automática interactivo. Powered by AutomaticTrans.

Translation result:
Original text:
“Tough sentence for gun possession welcomed(1)

The Leader of Nottingham City Council has welcomed a lengthy jail sentence handed down to a man caught with a loaded gun in Nottingham.(2)

Councillor Jon Collins said he hoped it would act as a deterrent to others considering carrying guns.

Makan Dayil, 28, of Beardsley Gardens, The Meadows, was jailed for ten years following his conviction at Nottingham Crown Court for possessing the gun and ammunition, after police(3) found a revolver loaded with five bullets in his car on Queens Drive, Nottingham, on(4) August 29, last year.”

Translated text:

“Oración resistente para la posesión del arma dada la bienvenida

El líder del consejo de ciudad de Nottingham ha dado la bienvenida a una oración de cárcel muy larga dada abajo a un hombre cogido con un arma cargado en Nottingham.

El concejal Jon Collins dijo él esperaba que actuara como impedimento a otros que consideran el llevar de los armas.

Makan Dayil, 28, de los jardines de Beardsley, los prados, fue encarcelado por diez años que seguían su convicción en la corte de la corona de Nottingham para poseer el arma y la munición, después de que el policía encontrara un revólver cargado con cinco balas en su coche en reinas conduzca, Nottingham, de agosto el 29, el año pasado”.

-There are some obvious errors in the translation:

  1. The headline of the article has been wrongly translated:
    • “sentence” in this case does not mean “oración” but “sentencia”.
    • “tough” in this case does not mean “resistente” but “dura”.
    • “for” has been wrongly translated as “para” instead of “por”.

  2. In the first paragraph:
    • “sentence” in this case does not mean “oración” but “sentencia”.
    • “handed down” has been wrongly translated as “dada abajo” instead of “dada”.
    • “caught” has been wrongly translated as “cogido” instead of “pillado”.
    • the adjective “loaded” has been given masculine agreement, but it should be feminine here: “loaded gun” = “arma cargada”.

  3. In the third paragraph:
    • “police” has been translated as “el policía”, as if it were a singular noun, but it is a collective noun: “la policía”.
    • The preposition “on” has been wrongly translated as “de” instead of “el”.

 

 

 

Sources:

-Nottingham City Council, “Tough sentence for gun possession welcomed”. Retrieved April 16, 12:00, from http://www.nottinghamcity.gov.uk/news_page/news_about_nottingham_-_policing_and_public_safety_/tough_sentence_for_gun_possession_welcomed.htm

-Instituto Cervantes, Servicio de traducción automática interactivo. Retrieved April 16, 2008, 12:00, from http://oesi.cervantes.es/traduccionAutomatica.html

Characteristics of a translation task according to the FEMTI report

April 9, 2008 by littlemisssunshines

The characteristics of the translation task refer to the information flow intended for the output, from the point of view of the agent who receives the translation.

According to FEMTI (Framework for the Evaluation of Machine Translation in ISLE), the three main characteristics of a translation task are:

  1. Assimilation: The ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a relatively large volume of texts produced by people outside the organization, usually in several languages.
  2. Dissemination: The ultimate aim of dissemination is to deliver to others a translation of documents produced inside the organization.
  3. Communication: The purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.

 

Sources:

-FEMTI – a Framework for the Evaluation of Machine Translation in ISLE. Retrieved April 7, 12:10, from http://www.issco.unige.ch:8080/cocoon/femti/st-home.html

Explanation of three of the topics

March 30, 2008 by littlemisssunshines

In this article I’ll give the requested explanation of three of the topics we have spoken about recently:

The first topic I’m going to talk about is “Humaine”, or “Human-machine interaction Network on emotions”, one of the current projects of the German Research Center for Artificial Intelligence.

Humaine aims to lay the foundations for European development of systems that can register, model and influence human emotional and emotion-related states and processes – ‘emotion-oriented systems’. Such systems may be central to future interfaces, but their conceptual underpinnings are not sufficiently advanced to be sure of their real potential or the best way to develop them.

One of the reasons is that relevant knowledge is dispersed across many disciplines. Humaine brings together leading experts from the key disciplines in a programme designed to achieve intellectual integration. It identifies six thematic areas that cut across traditional groupings and offer a framework for an appropriate division of labour – theory of emotion; signal/sign interfaces; the structure of emotionally coloured interactions; emotion in cognition and action; emotion in communication and persuasion; and usability of emotion-oriented systems. Teams linked to each area will run a workshop in it and carry out joint research to define an exemplar embodying guiding principles for future work in their area.

The second topic on which I am going to focus is the one called “Whiteboard”, a completed project of the same research centre. This project focused on “Multilevel annotation for dynamic free text processing”.

The project aimed at designing, implementing, investigating and evaluating a new system architecture that facilitated the combination of different language technologies for a range of practical applications. Language technologies offered numerous means for a partial analysis of texts that could be employed for information retrieval, information extraction, language checking, and many other applications. Processing methods and tools differed along several dimensions, e.g., with respect to levels of linguistic description, depth of analysis, or the way knowledge of language was derived (linguistically or statistically).

Methods often overlapped in their functionality but differed in their strengths and weaknesses. Finding optimal combinations of heterogeneous techniques and processing components was one of the most difficult tasks in language processing, and this was the challenge of the Whiteboard project. The novel architecture to be developed and explored in Whiteboard was based on the concept of an annotated text. The different LT components enriched an XML-encoded text with their annotations. Each component could exploit or disregard previously assigned annotations. The architecture had a single shared data structure, which at the same time was the input, throughput, and output of the system. The envisaged architecture permitted the pragmatic combination of different processing approaches, most notably novel ways of combining shallow and deep methods.
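The idea of a single shared annotated text can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the Whiteboard implementation: it uses plain Python dictionaries instead of XML, and the class, the three components and the layer names are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedText:
    """Single shared structure: the raw text plus every annotation added so far."""
    text: str
    annotations: list[dict] = field(default_factory=list)

    def add(self, layer: str, start: int, end: int, label: str) -> None:
        self.annotations.append({"layer": layer, "start": start, "end": end, "label": label})

    def layer(self, name: str) -> list[dict]:
        return [a for a in self.annotations if a["layer"] == name]


def tokenizer(doc: AnnotatedText) -> None:
    """Shallow component: marks whitespace-separated tokens."""
    position = 0
    for word in doc.text.split():
        start = doc.text.index(word, position)
        end = start + len(word)
        doc.add("token", start, end, word)
        position = end


def capitalised_name_tagger(doc: AnnotatedText) -> None:
    """Exploits the token layer: tags capitalised tokens as candidate names."""
    for token in doc.layer("token"):
        if token["label"][:1].isupper():
            doc.add("name", token["start"], token["end"], "candidate")


def length_counter(doc: AnnotatedText) -> None:
    """Disregards earlier annotations and works on the raw text only."""
    doc.add("stats", 0, len(doc.text), f"{len(doc.text)} characters")


doc = AnnotatedText("Whiteboard was a project at DFKI")
for component in (tokenizer, capitalised_name_tagger, length_counter):
    component(doc)  # the same object is input, throughput and output

print([doc.text[a["start"]:a["end"]] for a in doc.layer("name")])
# ['Whiteboard', 'DFKI']
```

Because every component reads from and writes to the same object, a new component can be added to the chain without changing the others, which is the kind of flexible combination the project description refers to.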

Finally, the last topic I have picked to focus on is “NECA”, or “The net environment for embodied emotional conversational agents”, one of the previous projects of the Austrian Research Institute for Artificial Intelligence.

The objective of the NECA project was to develop a new generation of mixed multi-user / multi-agent virtual spaces populated by affective conversational agents. The agents are able to express themselves through synchronised emotional speech and non-verbal expression, generated from an abstract representation. This is the first time that such expressive capabilities are featured in Internet applications. The agents’ usefulness was evaluated in two concrete application scenarios. From a technical point of view, the NECA platform provides a confederation of dedicated components including an affective reasoner, co-ordinated generation of verbal and nonverbal aspects of communication, and emotional speech synthesis, thus providing a basis for the development of new Internet applications with emotional agents.

Sources:

http://www.dfki.de/pas/f2w.cgi?ltp/humaine-e

http://www.dfki.de/pas/f2w.cgi?ltc/whiteboard-e

http://www.ofai.at/research/nlu/projects/nlproject_neca.html

Recent research topics

March 30, 2008 by littlemisssunshines
In this article recent research topics mentioned on different sites of Human Language technologies will be pointed out.

Among research centres, the following are the most remarkable:

- At the German Research Center for Artificial Intelligence (DFKI), the following themes are the most developed in research:

  • exploiting – and automatically extending – ontologies for content processing.
  • tighter integration of shallow and deep techniques in processing.
  • enriching deep processing with statistical methods.
  • combining language checking with structuring tools in document authoring.
  • document indexing for German and English.
  • automatically associating recognized information with related information and thus building up collective knowledge.
  • automatically structuring and visualizing extracted information.
  • processing information encoded in multiple languages, among them Chinese and Japanese.

- The Stanford Natural Language Processing Group in California works in several areas:

  • Basic research in computational linguistics.
  • Grammar induction.
  • Sentence understanding.
  • Word sense disambiguation.
  • Automatic question answering.

- The Edinburgh Language Technology Group produces research in the following areas:

  • Combining Shallow Semantics and Domain Knowledge.
  • Text Mining for Biomedical Content Curation.
  • Cross-retail Multi-agent Retail Comparison.
  • Smart Qualitative Data: Methods and Community tools for Data Mark-up.
  • Machine Learning for Named Entity Recognition.
  • Integrated Models and Tools for Fine-Grained Prosody in Discourse.
  • Joint Action Science and Technology.
  • AMI Consortium projects, which are developing technologies for meeting browsing and for assisting people who participate in meetings from a remote location.
  • Study of how pairs collaborate when planning a route on a map (Collaborating using diagrams).

Among the most prominent research networks, the most remarkable is perhaps:

- The European Network of Excellence in Human Language Technologies (ELSNET), a platform set up to reach the following goals:

  • Analyse present and future perspectives.
  • Share knowledge and experience.
  • Work out innovative actions.
  • Carry out a joint examination of the environment.
  • Bring together the Human Language Technologies community, making European research and development possible.

Among the associations, the Spanish Society for Natural Language Processing (SEPLN) is also of great importance and works on these themes:

  • Resolution of lexical ambiguity.
  • Retrieval of relevant information.
  • Linguistic techniques for dealing with multilingualism.
  • Linguistic knowledge for handling semantic errors.

Finally, among the latest conferences on Natural Language Processing, I have focused on the 24th Annual Congress of the Spanish Society for Natural Language Processing 2008 (SEPLN ‘08).

-The main thematic areas of this conference were:

    • Linguistic, mathematical and psycholinguistic models of language.
    • Corpus linguistics.
    • Machine translation.
    • Speech recognition.
    • Semantics, pragmatics and discourse.
    • Industrial applications of NLP.
    • Automatic analysis of textual content.

Sources:

http://www-nlp.stanford.edu/

http://www.ltg.ed.ac.uk/projects

http://www.dfki.de/lt/projects.php

http://www.elsnet.org/

http://www.sepln.org/

http://basesdatos.uc3m.es/sepln2008/web/

European research centres for Human Language Technologies

March 12, 2008 by littlemisssunshines

It is worth highlighting the role of the different research centres all over Europe, so that we can go deeper into Human Language Technologies.

These are four of the main European research centres for Human Language Technologies that I have found on the net:

- The National Centre for Language Technology (Ireland): its aim is to conduct research into the processing of human language by computers, with applications such as speech recognition and synthesis, machine translation, human-computer interfaces, information retrieval and extraction, the teaching and learning of languages using computers, and software localisation and globalisation.

- The Edinburgh Language Technology Group is a research and development group that has been working in the area of natural language engineering since the early 1990s. It was originally established as part of the Human Communication Research Centre, and is now based in the Institute for Communicating and Collaborative Systems of the Division of Informatics, University of Edinburgh, one of the largest communities of natural language processing specialists in Europe.

- Language Technology Documentation Centre in Finland: in order to make speech-to-speech translation (SST) a reality, concerted actions by key European actors in SST technologies will be undertaken, organized along the following themes: technology and service development for SST components and SST systems (development of platforms and creation of services); research in SST technologies (performance improvement of speech recognition, speech synthesis and speech-centered translation); language resources (LR) for many languages (making available the speech databases, corpora and lexica needed to develop SST components, to evaluate their performance and to transfer such components to other languages); and technology dissemination (creating infrastructure to support the fast spread of SST technology).

- Language Technology Group: Language Technology (LT) has formed a major research area at the Austrian Research Institute for Artificial Intelligence (OFAI) since its founding in 1984. The group carries out research in modelling and processing human languages, especially German. This includes constructing linguistic resources (such as lexicons, grammars, discourse models), processing algorithms (such as morphological components, parsers, generators, speech synthesizers, discourse processing components), and application prototypes (such as natural language interfaces, advisory systems and concept-to-speech systems).

The Language Technology Group at OFAI is a member of the EU’s European Network of Excellence in Human Language Technologies (ELSNET).

Sources:

  • http://www.dfki.de/lt/lt-general.php
  • http://www.nclt.dcu.ie/
  • http://www.ofai.at/research/nlu/
  • http://www.ltg.ed.ac.uk/
  • http://www.ling.helsinki.fi/filt/projects/index-en.shtml

Hans Uszkoreit

    March 12, 2008 by littlemisssunshines

    Hans Uszkoreit is Professor of Computational Linguistics at Saarland University. At the same time he serves as Scientific Director at the German Research Center for Artificial Intelligence where he heads the Language Technology Lab. By cooptation he is also Professor of the Computer Science Department.

    Uszkoreit studied Linguistics and Computer Science at the Technical University of Berlin and the University of Texas at Austin. During his time in Austin he also worked as a research associate in a large machine translation project at the Linguistics Research Center. In 1984 Uszkoreit received his Ph.D. in linguistics from the University of Texas. From 1982 until 1986, he worked as a computer scientist at the Artificial Intelligence Center of SRI International in Menlo Park, Ca. During this time he was also affiliated with the Center for the Study of Language and Information at Stanford University as a senior researcher and later as a project leader.

    In 1988 Uszkoreit was appointed to a newly created chair of Computational Linguistics at Saarland University and started the Department of Computational Linguistics and Phonetics. In 1989 he became the head of the newly founded Language Technology Lab at DFKI. He has been a co-founder and principal investigator of the Special Collaborative Research Division “Resource-Adaptive Cognitive Processes” of the DFG (German Science Foundation). He is also co-founder and professor of the “European Postgraduate Program Language Technology and Cognitive Systems”, a joint Ph.D. program with the University of Edinburgh.

    Here are some of his recent publications:

    • Uszkoreit, H. (2007) Methods and Applications for Relation Detection. In: Proceedings of the Third IEEE International Conference on Natural Language Processing and Knowledge Engineering, Beijing, 2007.
    • Uszkoreit, H., F. Xu, W. Liu (2007) Challenges and Solutions of Multilingual and Translingual Information Service Systems, To appear in Proceedings of HCI International 2007, 12th International Conference on Human-Computer Interaction, Beijing, 2007.
    • Uszkoreit, H., F. Xu, Weiquan Liu, J. Steffen, I. Aslan, J. Liu, C. Müller, B. Holtkamp, M. Wojciechowski (2007)
    • Xu F., H. Uszkoreit, H. Li (2007) A Seed-driven Bottom-up Machine Learning Framework for Extracting Relations of Various Complexity, To appear in: Proceedings of ACL 07, Annual Meeting of the Association of Computational Linguistics, Prague 2007.
    • Frank, A., H.-U. Krieger, F. Xu, H. Uszkoreit, B. Crysmann, B. Jörg, U. Schäfer (2007): Question Answering from Structured Knowledge Resources. In: Journal of Applied Logic, Volume 5, Issue 1, March 2007, Pages 20-48
    • Jörg, B., M. Jermol, H. Uszkoreit, M. Grobelnik, J. Ferlež (2006) Analytic Information Services for the European Research Area. To appear in Proceedings: eChallenges2006 e-2006 Conference, October 25-27, 2006. Barcelona, Spain.

    Sources:

    http://www.coli.uni-saarland.de/%7Ehansu/

    http://www.coli.uni-saarland.de/%7Ehansu/publ.html

    What is “Language Technology”?

    March 10, 2008 by littlemisssunshines

    Language technology is often called human language technology (HLT) or natural language processing (NLP). It has computational linguistics (CL) and speech technology at its core, but it also includes many application-oriented aspects of them. Language technology is closely connected to computer science and general linguistics.

    Human language has been an object of study for years, from different points of view and in different disciplines. The use of computers accelerated its study and at the same time opened up new fields of investigation. It is precisely around computers that Language Technologies were born.

    Speech technologies and text technologies deal with expressions in these two forms of language. In spite of this division, language has different aspects that are shared between speech and text, such as dictionaries, grammar, etc.

    On the other hand, Language Technologies cannot be reduced to speech and text technologies. Among them we also find, for example, those that link language to knowledge. Human language also includes other means of communication: for example, speech is combined with facial expressions, and digital texts are combined with images and sounds. In this way, Language Technologies include further technologies that allow the processing of multimodal communication and multimedia documents.

    Sources:

      http://en.wikipedia.org/w/index.php?title=Natural_language_processing&oldid=110067975

      http://www.coli.uni-saarland.de/%7Ehansu/

    Anglo-Saxon poetry: Beowulf

    February 5, 2008 by littlemisssunshines

    “Old English Literature was a male-centred literature, one which laid a stress on the virtues of a tribal community, on the ties of loyalty between lord and liegeman, on the significance of individual heroism and on the powerful sway of wyrd.

     

    Old English poetry was designed for public recitation and artful improvisation. The scop or professional poet would be expected to perform at gatherings in the royal, lordly or even monastic halls. The scop’s inherited pattern of poetry-making derived from an art which was essentially oral in its origins and development.”

     

      

    Beowulf is the oldest surviving long poem in Old English. It was orally transmitted until an anonymous author decided to write it down. The only surviving manuscript of Beowulf is now in the British Library in London. It was written around the year A.D. 1000, although its composition is assigned to the 8th century. The manuscript was damaged by a fire in the Cotton Library and some scorched edges have crumbled away. It is written continuously, as if it were prose.

    At first Beowulf was analysed as a historical document and not for its literary value. As nothing in the story is related to England, it was considered part of Danish or Scandinavian literature for many years.

    The main character, Beowulf, is introduced as a strong young man who fights for his people. People are loyal to him, fight with him and support him.

    The poem has two parts according to some scholars:

    • The first part, in which Beowulf, the prince of the Geats (a nation in the south of Sweden), fights Grendel, the monster who is killing the soldiers of the court of Hrothgar (the King of Denmark); Grendel’s mother then returns for revenge and is also killed by the hero.
    • The second part (50 years later), in which Beowulf, now king of the Geats, fights a dragon. The fight ends with the dragon killed and Beowulf mortally wounded.

    The poem is a combination of Christian beliefs (“the soul passed from his breast to reach the glory of the righteous”, the link between the monsters and Cain, etc.) and pagan beliefs (Beowulf was brought to his resting place on a big pyre, and there are references to qualities and beings far from the human ones). Its main characteristics are:

    • Contrasts between the young and the old Beowulf
    • Use of weapons
    • Code of loyalty: if it was broken, the soldiers would have to live in solitude with nowhere to go. “To any warrior death is better than a life of disgrace” (“Beowulf”, M. Swanton).

     

     

     

    Sources:

    • “Literatura inglesa medieval”- Fernando Galván
    • “Beowulf”-M.Swanton
    • Literatura Inglesa-Introducción general-Universidad de Deusto