<?xml-stylesheet type="text/xsl" href="http://coco-lab.org/Elgg/timo/weblog/rss/perl/rssstyles.xsl"?>

<rss version='2.0'   xmlns:dc='http://purl.org/dc/elements/1.1/'>        
    <channel xml:base='http://coco-lab.org/Elgg/timo/weblog/'>
        <title><![CDATA[Timo Baumann : Weblog items tagged with perl]]></title>
        <description><![CDATA[The weblog for Timo Baumann, hosted on Coco-Lab Weblog.]]></description>
        <link>http://coco-lab.org/Elgg/timo/weblog/</link>        
        <item>
            <title><![CDATA[aggregated audio duration]]></title>
            <link>http://coco-lab.org/Elgg/timo/weblog/82.html</link>
            <guid isPermaLink="true">http://coco-lab.org/Elgg/timo/weblog/82.html</guid>
            <pubDate>Sat, 17 Apr 2010 12:55:31 GMT</pubDate>
		<dc:subject><![CDATA[perl]]></dc:subject>
		<dc:subject><![CDATA[audio]]></dc:subject>
            <description><![CDATA[<p>I&#39;ve finally written the one script that was missing from the interwebs and that I have wanted for so long: </p> <div style="background-color: lightgray; font-family: monospace; line-height: 110%"><p>#!/usr/bin/perl<br /># Copyright (C) 2010 Timo Baumann<br /># This program is free software; you can redistribute it and/or modify it <br /># under the terms of the GNU General Public License as published by the <br /># Free Software Foundation; either version 2 of the License, <br /># or (at your option) any later version.<br /># <br /># This program is distributed in the hope that it will be useful, <br /># but WITHOUT ANY WARRANTY; without even the implied warranty of <br /># MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. <br /># See the GNU General Public License for more details.<br /># <br /># You should have received a copy of the GNU General Public License <br /># along with this program; if not, see &lt;<a href="http://www.gnu.org/licenses/">http://www.gnu.org/licenses/</a>&gt;.</p><p>use strict;<br />use warnings;<br />use Audio::Wav;<br />use Audio::Wav::Read;<br /><br />#usage: audio-duration.pl path-or-file1 path-or-file2 ...<br /># without arguments, the current directory is searched<br /><br />@ARGV = (&quot;.&quot;) unless @ARGV;<br />my @files;<br />for my $arg (@ARGV) {<br />&nbsp;&nbsp; &nbsp;my $findresult = `find $arg`;<br />&nbsp;&nbsp; &nbsp;# find prints one path per line; keep only names ending in .wav<br />&nbsp;&nbsp; &nbsp;push @files, grep /\.wav$/, split /\n/, $findresult;<br />}<br />#print join &quot; &quot;, @files;<br />my $duration = 0.0;<br />my $wav = Audio::Wav-&gt;new;<br />for my $file (@files) {<br />&nbsp;&nbsp; &nbsp;my $read = $wav-&gt;read($file);<br />&nbsp;&nbsp; &nbsp;$duration += $read-&gt;length_seconds();<br />}<br /># convert to something readable<br />my $readableDuration = &quot;&quot;;<br />if ($duration &gt; 600) {<br />&nbsp;&nbsp; &nbsp;my $seconds = int($duration + .5);<br />&nbsp;&nbsp; &nbsp;# derive minutes from the rounded seconds so that seconds stay below 60<br />&nbsp;&nbsp; &nbsp;my $minutes = int($seconds / 60);<br />&nbsp;&nbsp; &nbsp;$seconds -= $minutes * 60;<br />&nbsp;&nbsp; &nbsp;my $hours = int($minutes / 60);<br />&nbsp;&nbsp; &nbsp;$minutes -= $hours * 60;<br />&nbsp;&nbsp; &nbsp;$readableDuration = &quot;(&quot; . ($hours &gt; 0 ? &quot;$hours:&quot; : &quot;&quot;) . &quot;${minutes}&#39;${seconds}\&quot;) &quot;;<br />}<br />print &quot;$duration seconds &quot;, $readableDuration, &quot;in &quot;, ($#files + 1), &quot; wave files.\n&quot;;</p></div> <p>Running this in any directory will yield the total duration of all the .wav files in that directory and its subdirectories. If you supply arguments, it will look into the given directories (or files) instead and tell you the summed duration.</p><p>A must-have for any corpus linguist dealing with loads of audio files! </p>]]></description>
        </item>
                
        <item>
            <title><![CDATA[Generic Minimum Edit Distance of lists in Perl]]></title>
            <link>http://coco-lab.org/Elgg/timo/weblog/77.html</link>
            <guid isPermaLink="true">http://coco-lab.org/Elgg/timo/weblog/77.html</guid>
            <pubDate>Tue, 20 Jan 2009 15:26:23 GMT</pubDate>
		<dc:subject><![CDATA[perl]]></dc:subject>
		<dc:subject><![CDATA[helpme]]></dc:subject>
		<dc:subject><![CDATA[fixme]]></dc:subject>
            <description><![CDATA[<p>The title says it all: I am looking for a generic implementation that tells me the edit distance of two lists. The implementations on CPAN all seem to work on string data, which is OK for finding typos but makes calculating word error rate (WER) tedious. </p><p>So, I want a generic implementation that takes a comparator function (as in sort {$a &lt;=&gt; $b} @list) and two lists and outputs the edit distance. Distance weights would be nice to have, and it would be really nifty if the value of the comparator function (not only whether it is non-zero, but how much lower or higher) were taken into account.</p><p>Luckily I don&#39;t need it now, so I don&#39;t have to write it. But it would be a great finger exercise for a Perl-in-NLP class.</p><p>EDIT: The obvious module Text::Levenshtein on CPAN actually *<a href="http://rt.cpan.org/Public/Bug/Display.html?id=42459">miscalculates</a>* the Levenshtein distance for some input. Luckily I wondered what the 3 bugs in the module were about before I just happily used that code... So I ended up slightly modifying an <a href="http://www.merriampark.com/ldperl.htm">implementation by Eli Bendersky</a>, which already uses lists internally. I left out the comparator interface for now and just calculate standard WER, which is all I need right now.</p>]]></description>
        </item>
        
    </channel>
</rss>
