{"id":100090,"date":"2018-03-11T10:21:28","date_gmt":"2018-03-11T10:21:28","guid":{"rendered":"https:\/\/www.deberes.net\/tesis\/sin-categoria\/eusmt-incorporating-linguistic-information-to-statistical-machine-translation-for-a-morphologically-rich-language-its-use-in-smt-rbmt-ebmt-hybridation\/"},"modified":"2018-03-11T10:21:28","modified_gmt":"2018-03-11T10:21:28","slug":"eusmt-incorporating-linguistic-information-to-statistical-machine-translation-for-a-morphologically-rich-language-its-use-in-smt-rbmt-ebmt-hybridation","status":"publish","type":"post","link":"https:\/\/www.deberes.net\/tesis\/traduccion-automatica\/eusmt-incorporating-linguistic-information-to-statistical-machine-translation-for-a-morphologically-rich-language-its-use-in-smt-rbmt-ebmt-hybridation\/","title":{"rendered":"Eusmt: incorporating linguistic information to statistical machine translation for a morphologically rich language. its use in smt-rbmt-ebmt hybridation"},"content":{"rendered":"<h2>Doctoral thesis by <strong> Gorka Labaka Intxauspe <\/strong><\/h2>\n<p>This thesis is defined in the framework of machine translation for Basque. Having developed a rule-based machine translation (RBMT) system for Basque in the IXA group (Mayor, 2007), we decided to tackle the statistical machine translation (SMT) approach and experiment with how to adapt it to the peculiarities of the Basque language. First, we analyzed the impact of the agglutinative nature of Basque and the best way to deal with it. To that end, we split Basque words into the lemma and a set of tags that represent the morphological information expressed by the inflection. By dividing each Basque word in this way, we aim to reduce the sparseness produced by the agglutinative nature of Basque and the small amount of training data. Similarly, we studied the differences in word order between Spanish and Basque, examining different techniques for dealing with them. 
We confirm the weakness of basic SMT in dealing with large word order differences between the source and target languages. Distance-based reordering, the technique used by the baseline system, does not have enough information to handle large word order differences properly, so every technique tested in this work (both statistical and based on manually written rules) outperforms the baseline. Once we had obtained a more accurate SMT system, we made the first attempts to combine different MT systems into a hybrid one that would allow us to get the best of the different paradigms. The hybridization attempts carried out in this PhD dissertation are preliminary, but even so, this work can help us determine the next steps.<\/p>\n<p>&nbsp;<\/p>\n<h3>Academic details of the doctoral thesis \u00ab<strong>Eusmt: incorporating linguistic information to statistical machine translation for a morphologically rich language. its use in smt-rbmt-ebmt hybridation<\/strong>\u00bb<\/h3>\n<ul>\n<li><strong>Thesis title:<\/strong>\u00a0 Eusmt: incorporating linguistic information to statistical machine translation for a morphologically rich language. 
its use in smt-rbmt-ebmt hybridation <\/li>\n<li><strong>Author:<\/strong>\u00a0 Gorka Labaka Intxauspe <\/li>\n<li><strong>University:<\/strong>\u00a0 Pa\u00eds Vasco\/Euskal Herriko Unibertsitatea<\/li>\n<li><strong>Thesis defense date:<\/strong>\u00a0 29\/03\/2010<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3>Supervision and examination committee<\/h3>\n<ul>\n<li><strong>Thesis supervisor<\/strong>\n<ul>\n<li>Kepa Sarasola Gabiola<\/li>\n<\/ul>\n<\/li>\n<li><strong>Committee<\/strong>\n<ul>\n<li>Committee chair: Mikel Lorenzo Forcada Zubizarreta <\/li>\n<li>Mar\u00eda Victoria Arranz Corzana (member)<\/li>\n<li>Andrew Way (member)<\/li>\n<li>Llu\u00eds M\u00e1rquez Villodre (member)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Doctoral thesis by Gorka Labaka Intxauspe This thesis is defined in the framework of machine translation for Basque. Having developed [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"footnotes":""},"categories":[12909,656],"tags":[204004,204002,29691,63935,204003,53901],"class_list":["post-100090","post","type-post","status-publish","format-standard","hentry","category-pais-vasco-euskal-herriko-unibertsitatea","category-traduccion-automatica","tag-andrew-way","tag-gorka-labaka-intxauspe","tag-kepa-sarasola-gabiola","tag-lluis-marquez-villodre","tag-maria-victoria-arranz-corzana","tag-mikel-lorenzo-forcada-zubizarreta"],"_links":{"self":[{"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/posts\/100090","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/comments?post=100090"}],"version-history":[{"count":0,"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/posts\/100090\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/media?parent=100090"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/categories?post=100090"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.deberes.net\/tesis\/wp-json\/wp\/v2\/tags?post=100090"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}