• Given a sentence and a designated verb, SRL identifies the arguments of the verb and labels them with semantic roles.
• Steps:
  1. Pruning
  2. Local scoring
  3. Joint scoring
  (4) Fixing common errors
• Pruning filters the set of argument candidates for a given predicate.
  ◦ In principle, any subsequence of words in the sentence is an argument candidate.
• Pruning heuristic: Xue and Palmer (2004)
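To make the pruning step concrete, here is a minimal sketch of the kind of tree-walking heuristic associated with Xue and Palmer (2004): starting at the predicate, collect the sisters of the current node, add the immediate children of any sister that is a PP, then move up to the parent and repeat until the root. The `Node` class and the toy tree are hypothetical, and the sketch leaves out the special handling of coordinated structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A constituent in a parse tree (hypothetical minimal structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        """Attach children and set their parent pointers."""
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def prune_candidates(predicate: Node) -> List[Node]:
    """Collect argument candidates by walking from the predicate to the root."""
    candidates: List[Node] = []
    current = predicate
    while current.parent is not None:
        for sister in current.parent.children:
            if sister is current:
                continue
            candidates.append(sister)
            # For a PP sister, its immediate children are candidates too.
            if sister.label == "PP":
                candidates.extend(sister.children)
        current = current.parent
    return candidates


# Toy tree for "The cat sat on the mat" (assumed bracketing):
# (S (NP The cat) (VP (VBD sat) (PP (IN on) (NP the mat))))
vbd = Node("VBD")
pp = Node("PP").add(Node("IN"), Node("NP"))
vp = Node("VP").add(vbd, pp)
root = Node("S").add(Node("NP"), vp)

print([n.label for n in prune_candidates(vbd)])  # ['PP', 'IN', 'NP', 'NP']
```

Only four constituents survive as candidates, instead of every possible subsequence of words.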
Joint scoring
• Enforces a good structure of labeled arguments:
  ◦ Arguments do not overlap.
  ◦ Core arguments do not repeat, etc.
• Re-ranking
• Probabilistic methods
  ◦ Conditional random fields
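The two structural constraints above can be illustrated with a small sketch. It uses a simple greedy selection over locally scored candidates, which is a stand-in and not the re-ranking or CRF methods listed in the notes. The `Candidate` type, the scores, and the PropBank-style core label set (`A0` to `A5`) are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    start: int    # index of the first token (inclusive)
    end: int      # index of the last token (inclusive)
    label: str    # e.g. "A0", "A1", "AM-TMP"
    score: float  # local classifier score (hypothetical values)


# Assumed core label set, PropBank style.
CORE_LABELS = {"A0", "A1", "A2", "A3", "A4", "A5"}


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True if the two token spans share at least one token."""
    return a.start <= b.end and b.start <= a.end


def greedy_joint_selection(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the best-scoring candidates that respect both constraints."""
    chosen: List[Candidate] = []
    used_core = set()
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        # Constraint 1: arguments do not overlap.
        if any(overlaps(cand, kept) for kept in chosen):
            continue
        # Constraint 2: core arguments do not repeat.
        if cand.label in CORE_LABELS:
            if cand.label in used_core:
                continue
            used_core.add(cand.label)
        chosen.append(cand)
    return sorted(chosen, key=lambda c: c.start)


candidates = [
    Candidate(0, 1, "A0", 0.9),
    Candidate(0, 3, "A1", 0.4),      # overlaps the A0 span
    Candidate(3, 5, "A1", 0.8),
    Candidate(6, 7, "A1", 0.6),      # would repeat core label A1
    Candidate(6, 7, "AM-TMP", 0.5),
]

for cand in greedy_joint_selection(candidates):
    print(cand.start, cand.end, cand.label)
# 0 1 A0
# 3 5 A1
# 6 7 AM-TMP
```

A greedy pass is not guaranteed to find the best-scoring consistent set, which is one reason to prefer re-ranking or probabilistic sequence models in practice.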
SRL architecture:
• Combination of systems and input annotations:
  ◦ Increases robustness
  ◦ Gains coverage
  ◦ Reduces the effects of parse errors
• One may combine:
  ◦ Outputs of independent basic SRL systems
  ◦ Outputs of the same SRL system with different input annotations or parameters
• Gain of 2-3 F1 points.
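One simple way to picture system combination is voting over labeled arguments. The sketch below is an illustrative assumption, not necessarily the scheme behind the reported gains: each system outputs a set of `(start, end, label)` triples, and an argument is kept when at least `min_votes` systems propose it. All names and example outputs are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Argument = Tuple[int, int, str]  # (start, end, label)


def combine_by_voting(outputs: List[Set[Argument]], min_votes: int) -> List[Argument]:
    """Keep every labeled argument proposed by at least `min_votes` systems."""
    votes = Counter(arg for output in outputs for arg in output)
    return sorted(arg for arg, n in votes.items() if n >= min_votes)


# Hypothetical outputs of three SRL systems for the same predicate.
system_a = {(0, 1, "A0"), (3, 5, "A1")}
system_b = {(0, 1, "A0"), (3, 5, "A2")}
system_c = {(0, 1, "A0"), (3, 5, "A1"), (6, 7, "AM-TMP")}

print(combine_by_voting([system_a, system_b, system_c], min_votes=2))
# [(0, 1, 'A0'), (3, 5, 'A1')]
```

Voting alone can still return overlapping spans when the systems disagree on boundaries, so a combiner along these lines would need to re-apply the joint-scoring constraints afterwards.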
• Joint labeling…
• Dependency parsing
• Combine parsing and SRL in a single step. …
• Characterize the candidate argument:
  ◦ phrase type, headword, …
• Characterize the verb predicate and its context:
  ◦ lemma, voice, …
• Characterize the relation between them:
  ◦ Syntactic and semantic.
  ◦ Left/right position of the constituent with respect to the verb…
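The three feature groups above might be turned into a feature dictionary roughly as follows. This is a minimal sketch: the `Constituent` and `Predicate` types and the feature names are hypothetical, and only the features named in the notes are included.

```python
from typing import Dict, NamedTuple


class Constituent(NamedTuple):
    phrase_type: str  # e.g. "NP", "PP"
    headword: str
    start: int        # index of the first token (inclusive)
    end: int          # index of the last token (inclusive)


class Predicate(NamedTuple):
    lemma: str
    voice: str        # "active" or "passive"
    index: int        # token index of the verb


def extract_features(cand: Constituent, pred: Predicate) -> Dict[str, str]:
    """Build features for one (candidate argument, predicate) pair."""
    # Relation feature: where the constituent sits relative to the verb.
    if cand.end < pred.index:
        position = "left"
    elif cand.start > pred.index:
        position = "right"
    else:
        position = "overlap"
    return {
        # Candidate argument features
        "phrase_type": cand.phrase_type,
        "headword": cand.headword,
        # Predicate features
        "lemma": pred.lemma,
        "voice": pred.voice,
        # Relation feature
        "position": position,
    }


# "The cat chased the mouse": candidate "The cat", predicate "chased".
features = extract_features(
    Constituent("NP", "cat", 0, 1),
    Predicate("chase", "active", 2),
)
print(features["position"])  # left
```

A dictionary like this can be passed to any standard classifier after one-hot encoding.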
• Recall:
  ◦ 81% on argument identification
  ◦ 95% on assigning the correct semantic role
• SemEval-2007
  ◦ Disambiguation of 50 verbs.
• FrameNet
  ◦ 40 frames
  ◦ F1 = 92% assigning semantic roles
  ◦ F1 = 83% segmenting and labeling arguments
• Complete analysis of semantic roles on unseen texts:
  ◦ Precision in the 60s
  ◦ Recall in the 30s
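Since the figures above mix precision, recall and F1, a short sketch of how the three relate may help. It scores a predicted set of labeled arguments against a gold set, where an argument counts as correct only if both its span and its label match. The function name and the example data are hypothetical.

```python
from typing import Set, Tuple

Argument = Tuple[int, int, str]  # (start, end, label)


def precision_recall_f1(gold: Set[Argument], predicted: Set[Argument]):
    """Exact-match precision, recall and F1 over labeled arguments."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {(0, 1, "A0"), (3, 5, "A1"), (6, 7, "AM-TMP"), (8, 9, "AM-LOC")}
predicted = {(0, 1, "A0"), (3, 5, "A1"), (6, 9, "AM-TMP")}

p, r, f1 = precision_recall_f1(gold, predicted)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.667 0.5 0.571
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values: with illustrative values of 0.65 precision and 0.35 recall, F1 is 0.455.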
• SRL relies on syntactic structure.
• Statistical parser output: 90% matching.
• It is common to use parser trees.
• Gold-standard trees???
• Most errors are caused by incorrect syntactic constituents.
• CoNLL-2005: annotated Brown corpus
• Performance drops below 70%.
• They claimed the errors are in assigning the semantic roles rather than in identifying argument boundaries.
• Spanish and Catalan: CESS-ECE corpus.
  ◦ 86% disambiguating predicates
  ◦ 83% labeling arguments
• Chinese
  ◦ SemEval-2007: 25K sentences
  ◦ 80% core elements.
• Top-performing teams use machine learning techniques.