Introduction:
Wikipedia defines an n-gram as a subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words, or base pairs, according to the application.
An n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram"; and size 4 or more is simply called an "n-gram".
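To make the definition concrete, here is a minimal Python sketch that pulls the n-grams out of a sequence. The function name `ngrams` and the sample inputs are my own choices for illustration and are not taken from any particular library. It assumes the items are contiguous, which is the usual reading of "subsequence" in this context.

```python
def ngrams(items, n):
    """Return every contiguous run of n items from the sequence, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L has L - n + 1 n-grams (none if L < n).
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams: the items are words.
words = "serve as the index".split()
print(ngrams(words, 2))
# [('serve', 'as'), ('as', 'the'), ('the', 'index')]

# Letter-level trigrams: the items are characters.
print(ngrams("gram", 3))
# [('g', 'r', 'a'), ('r', 'a', 'm')]
```

The same function covers unigrams, bigrams, trigrams, and larger sizes just by changing `n`, and it works on any sliceable sequence (a string, a list of words, a list of phonemes, and so on).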
Usages:
n-grams are used in various areas of statistical natural language processing and genetic sequence analysis.
Examples:
The following word-level 3-grams and 4-grams come from the Google n-gram corpus. The number in parentheses is how many times each one appeared.
- ceramics collectables collectibles (55)
- ceramics collectables fine (130)
- ceramics collected by (52)
- ceramics collectible pottery (50)
- ceramics collectibles cooking (45)
- serve as the incoming (92)
- serve as the incubator (99)
- serve as the independent (794)
- serve as the index (223)
- serve as the indication (72)
- serve as the indicator (120)
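Counts like the ones above can be produced on a small scale with a few lines of Python. This is only a sketch: the function name `count_ngrams` and the sample sentence are made up for illustration, splitting on whitespace is a simplifying assumption, and this is not a description of how the Google corpus itself was built.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count each word-level n-gram in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(grams)


# Made-up sample text, not data from the Google corpus.
sample = "serve as the index and serve as the indicator"
counts = count_ngrams(sample, 3)

# Print the most frequent 3-grams in the same "phrase (count)" style.
for gram, count in counts.most_common(3):
    print(f"{gram} ({count})")
```

On real text you would normally want a proper tokenizer that handles punctuation, but a `Counter` over sliding windows is the core of the idea.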
1. Wikipedia: N-Gram