<?xml-stylesheet type="text/xsl" href="http://coco-lab.org/Elgg/timo/weblog/rss/helpme/rssstyles.xsl"?>

<rss version='2.0'   xmlns:dc='http://purl.org/dc/elements/1.1/'>        
    <channel xml:base='http://coco-lab.org/Elgg/timo/weblog/'>
        <title><![CDATA[Timo Baumann : Weblog items tagged with helpme]]></title>
        <description><![CDATA[The weblog for Timo Baumann, hosted on Coco-Lab Weblog.]]></description>
        <link>http://coco-lab.org/Elgg/timo/weblog/</link>        
        <item>
            <title><![CDATA[Generic Minimum Edit Distance of lists in Perl]]></title>
            <link>http://coco-lab.org/Elgg/timo/weblog/77.html</link>
            <guid isPermaLink="true">http://coco-lab.org/Elgg/timo/weblog/77.html</guid>
            <pubDate>Tue, 20 Jan 2009 15:26:23 GMT</pubDate>
		<dc:subject><![CDATA[perl]]></dc:subject>
		<dc:subject><![CDATA[helpme]]></dc:subject>
		<dc:subject><![CDATA[fixme]]></dc:subject>
            <description><![CDATA[<p>The title says it all: I am looking for a generic implementation that computes the edit distance between two lists. The implementations on CPAN all seem to work on string data only, which is fine for finding typos but makes calculating word error rate (WER) tedious.</p><p>So, I want a generic implementation that takes a comparator function (as in sort {$a &lt;=&gt; $b} @list) and two lists, and returns the edit distance. Distance weights would be nice to have, and it would be really nifty if the value returned by the comparator (not just whether it is non-zero, but how far below or above zero it is) were taken into account.</p><p>Luckily I don&#39;t need it right now, so I don&#39;t have to write it. But it would make a great finger exercise for a Perl-in-NLP class.</p><p>EDIT: The obvious module on CPAN, Text::Levenshtein, actually *<a href="http://rt.cpan.org/Public/Bug/Display.html?id=42459">miscalculates</a>* the Levenshtein distance for some inputs. Luckily I wondered what the three bugs listed for the module were about before I just happily used that code... So I ended up slightly modifying an <a href="http://www.merriampark.com/ldperl.htm">implementation by Eli Bendersky</a>, which already uses lists internally. I have left out the comparator interface for now and just calculate standard WER, which is all I need right now.</p>]]></description>
        </item>
        
    </channel>
</rss>
