David Schlangen : Home Page > minutes100907

  - present: Michaela, Timo, David
  - more on predicting end-of... turns? utterances?
  - important: need to add certain amount of silence at end of each
    training unit, to see if certainty about turn end is higher in
    reactive model.
  - binning of predictions: end is 3 secs or longer away, between 2 and
    3, between 1.8 and 2, etc. bins get progressively larger at
    higher distances. or, equivalently: prediction has higher temporal
    resolution for closer end-events than for more distant ones.
  - use our infrastructure for this:
    - player tool that reads in file for utterance and plays the
      utterance to the other modules; these modules compute features
      which are either recorded in a file (for training) or are sent
      to classifier (for live mode / online evaluation).
    - should be able to handle various amounts of gold info.
      - e.g., minimal version of file:
        "utterance ID, file, start.time, end.time"
      - version w/ some gold info:
        "utterance ID, file, WORD, POS, F0-summary, start.time, end.time"
  - evaluation:
    - online: factors in time used up by classifier. While classifier
      is computing its prediction, point that it predicts as end may
      already have been reached! So the measure here is "when could
      system have made a decision?"
    - more complicated online version: assume certain time needed to
      compute a response, and another (shorter) time to release
      pre-planned utterance. Given these constraints, how good is
      system at always having something ready when turn is yielded,
      and how responsive is it?
    - a good model of human turn-taking performance could be expected
      to make the same systematic errors, e.g. overlap over optional
      phrases (talking over tag-questions). Should be factored in /
      considered when doing error analysis.
    - offline: can be done on corpus in the same format as for
      training & in the usual way of feeding a vector to classifier
      and comparing its decision to the truth.
  - open question / parameter to play with: what are the increments at
    which
    - gold info is encoded
    - features are created for training and testing
    (doesn't have to be the same, larger increments in gold can be
    hacked into smaller ones by player)
  - each utterance generates (utt_length / increment_length) vectors
    for learning & testing.
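
  sketch of the binning idea from above, as one possible reading of it.
  only the edges at 1.8, 2 and 3 secs come from the notes; all finer
  edges, and the names BIN_EDGES, time_to_end_bin and bin_label, are
  assumptions made up for illustration.

```python
from bisect import bisect_right

# Bin edges in seconds until the end event. Only 1.8, 2.0 and 3.0 are
# taken from the minutes; the edges below 1.8 are assumed values chosen
# so that bins never get narrower as the distance grows.
BIN_EDGES = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 3.0]


def time_to_end_bin(seconds_to_end):
    """Return the index of the bin that contains seconds_to_end.

    Bin 0 covers [0, 0.1); the last bin covers [3.0, infinity).
    """
    if seconds_to_end < 0:
        raise ValueError("time to end must not be negative")
    return bisect_right(BIN_EDGES, seconds_to_end)


def bin_label(index):
    """Return a human-readable label for a bin index."""
    lower = 0.0 if index == 0 else BIN_EDGES[index - 1]
    if index == len(BIN_EDGES):
        return f">={lower}s"
    return f"{lower}-{BIN_EDGES[index]}s"


if __name__ == "__main__":
    for t in (0.05, 1.9, 2.5, 4.0):
        print(t, "->", bin_label(time_to_end_bin(t)))
```

  the classifier would then predict a bin index rather than a raw time,
  which gives the higher resolution close to the end for free.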

  ---> to do: build minimal version with only one feature: how long
       has the utterance already gone on for?
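
  a possible starting point for the minimal version, not the actual
  player tool: it reads one line of the minimal file format and emits
  (utt_length / increment_length) vectors, each holding the single
  feature "time elapsed so far" plus the time remaining as the value to
  predict. assumptions: times are in seconds, fields contain no commas,
  and the function names and the 100 ms default increment are invented
  here.

```python
def parse_minimal_line(line):
    """Split one 'utterance ID, file, start.time, end.time' line."""
    utt_id, path, start, end = [field.strip() for field in line.split(",")]
    return utt_id, path, float(start), float(end)


def training_vectors(start, end, increment_ms=100):
    """Return (elapsed_seconds, seconds_to_end) pairs, one per increment.

    Times are converted to whole milliseconds first so that the number
    of increments is not thrown off by floating-point rounding.
    """
    if increment_ms <= 0:
        raise ValueError("increment must be positive")
    length_ms = round((end - start) * 1000)
    n_vectors = length_ms // increment_ms  # utt_length / increment_length
    vectors = []
    for i in range(1, n_vectors + 1):
        elapsed_ms = i * increment_ms
        remaining_ms = length_ms - elapsed_ms
        vectors.append((elapsed_ms / 1000, remaining_ms / 1000))
    return vectors


if __name__ == "__main__":
    # Hypothetical example line; the ID and file name are made up.
    utt_id, path, start, end = parse_minimal_line("utt001, utt001.wav, 0.5, 2.0")
    for elapsed, remaining in training_vectors(start, end, increment_ms=500):
        print(utt_id, elapsed, remaining)
```

  the remaining time could be binned before training; silence added at
  the end of a unit would extend the loop past the gold end time.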



das, 09/10/07 03:57 (GMT)

Keyword: emmy, inpro, learning, meetings, minutes, predicting TRPs
