Types of language ambiguity in natural language processing

In this article, we present a simplified list of the types of language ambiguity in natural language processing, along with the effects of ambiguity on processing:

  • Word sense ambiguity: The same word can have two or more different meanings, such as the word “bank” in the phrases “Bank of America” and “river bank”.
  • Part of speech ambiguity: Some words can take more than one POS tag. The word “lie”, for example, can be a verb, as in “lie down”, or a noun, as in “don’t tell me lies”, depending on the context.
  • Syntactic or structural ambiguity: In the sentence “I saw a man with a telescope”, is the man carrying the telescope, or was the man seen through the telescope? This type of ambiguity causes problems in building knowledge graphs, relation extraction, and machine translation.
    • Attachment ambiguity: a particular constituent in the sentence can be attached to the parse tree at more than one place. In the example above, “with a telescope” can attach either to the verb “saw” or to the noun phrase “a man”.
    • Coordination ambiguity: the conjunction in the sentence can group different sets of words or phrases, as in “old men and women”, where “old” may describe only “men” or both “men and women”.
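
To make attachment ambiguity concrete, here is a minimal sketch that counts how many parse trees a sentence has under a tiny grammar. The grammar, the lexicon, and the function name count_parses are all assumptions made up for this illustration, not part of any library. The counting is done with a CYK-style chart over a grammar in Chomsky normal form.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left_child, right_child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon (hypothetical): lowercase word -> possible categories.
LEXICON = {
    "i": ["NP"],
    "saw": ["V"],
    "a": ["Det"],
    "man": ["N"],
    "with": ["P"],
    "telescope": ["N"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` rooted in `start`."""
    n = len(words)
    # chart[(i, j)][symbol] = number of trees that derive words[i:j] from symbol
    chart = defaultdict(lambda: defaultdict(int))

    # Fill in the single-word spans from the lexicon.
    for i, word in enumerate(words):
        for category in LEXICON.get(word.lower(), []):
            chart[(i, i + 1)][category] += 1

    # Combine shorter spans into longer ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )

    return chart[(0, n)][start]


if __name__ == "__main__":
    print(count_parses("I saw a man with a telescope".split()))
    print(count_parses("I saw a man".split()))
```

Under this toy grammar the telescope sentence gets two trees, one for each attachment of “with a telescope”, while “I saw a man” gets only one. A real grammar would be far larger, and the number of parses typically grows quickly with each additional prepositional phrase.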

Ambiguity could affect NLP pipeline stages such as:

  • POS tagging: when the probabilities of two different part-of-speech tags for the same word are close, the tagger has little basis for choosing between them.
  • Tokenization: abbreviations such as “vs.” or “et al.” can cause problems when their trailing period is confused with a sentence-ending period.
  • Parsing: with a probabilistic context-free grammar (PCFG), an ambiguous sentence has several candidate parse trees, and the rule probabilities at different levels of the tree determine which one is preferred. This applies whichever parsing strategy is used: top-down, bottom-up, or chart parsing.
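
The tokenization problem can be sketched with a very small sentence splitter. This is only an illustration under simple assumptions: the abbreviation list is a hypothetical, hand-picked set, the function name split_sentences is made up for this example, and tokens are separated on whitespace only.

```python
# Hypothetical abbreviation list; a real system would need a much larger one.
ABBREVIATIONS = frozenset({"vs.", "al.", "etc.", "e.g.", "i.e.", "dr.", "mr."})


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences on ., ! or ?, skipping known abbreviations."""
    sentences = []
    current = []
    for token in text.split():
        current.append(token)
        looks_like_end = token.endswith((".", "!", "?"))
        # A trailing period on a known abbreviation is not treated as a boundary.
        if looks_like_end and token.lower() not in abbreviations:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing text that has no closing punctuation.
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    text = "Smith et al. compared rules vs. statistics. The results were mixed."
    # Naive splitting: every period is treated as a sentence boundary.
    print(split_sentences(text, abbreviations=frozenset()))
    # Abbreviation-aware splitting.
    print(split_sentences(text))
```

With an empty abbreviation set, the example text is cut into four pieces, because the periods in “al.” and “vs.” are mistaken for sentence endings. With the abbreviation list it is split into the two intended sentences. The sketch still fails when an abbreviation really does end a sentence, which is exactly the ambiguity described above.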
