{"id":1277,"date":"2022-04-27T18:45:42","date_gmt":"2022-04-27T18:45:42","guid":{"rendered":"http:\/\/www.wellformedness.com\/blog\/?p=1277"},"modified":"2022-04-27T18:45:57","modified_gmt":"2022-04-27T18:45:57","slug":"a-star-shortest-string-decoding-for-non-idempotent-semirings-decoding","status":"publish","type":"post","link":"https:\/\/www.wellformedness.com\/blog\/a-star-shortest-string-decoding-for-non-idempotent-semirings-decoding\/","title":{"rendered":"A* shortest string decoding for non-idempotent semirings"},"content":{"rendered":"<p>I recently completed some work, in collaboration with Google&#8217;s Cyril Allauzen, on a new algorithm for computing the shortest string through weighted finite-state automaton. For so-called <em>path semirings,\u00a0<\/em>the shortest string is given by the shortest path, but up until now, there was no general-purpose algorithm for computing the shortest string over non-idempotent semirings (like the log or probability semiring). Such an algorithm would make it much easier to decode with interpolated language models or elaborate channel models in a noisy-channel formalism. In <a href=\"https:\/\/arxiv.org\/abs\/2204.07236\">this preprint<\/a>, we propose such an algorithm using A* search and lazy (&#8220;on-the-fly&#8221;) determinization, and prove that it is correct. The algorithm in question is implemented in my <a href=\"https:\/\/www.opengrm.org\/twiki\/bin\/view\/GRM\/BaumWelch\">OpenGrm-BaumWelch library<\/a> by the <tt>baumwelchdecode<\/tt> command-line tool.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently completed some work, in collaboration with Google&#8217;s Cyril Allauzen, on a new algorithm for computing the shortest string through weighted finite-state automaton. For so-called path semirings,\u00a0the shortest string is given by the shortest path, but up until now, there was no general-purpose algorithm for computing the shortest string over non-idempotent semirings (like the &hellip; <a href=\"https:\/\/www.wellformedness.com\/blog\/a-star-shortest-string-decoding-for-non-idempotent-semirings-decoding\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A* shortest string decoding for non-idempotent semirings&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"link","meta":{"_crdt_document":"","footnotes":""},"categories":[3,5,21],"tags":[],"class_list":["post-1277","post","type-post","status-publish","format-link","hentry","category-dev","category-nlp","category-speech","post_format-post-format-link"],"_links":{"self":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts\/1277","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/comments?post=1277"}],"version-history":[{"count":3,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts\/1277\/revisions"}],"predecessor-version":[{"id":1281,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/posts\/1277\/revisions\/1281"}],"wp:attachment":[{"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/media?parent=1277"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/categories?post=1277"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wellformedness.com\/blog\/wp-json\/wp\/v2\/tags?post=1277"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}