Package: ngram 3.2.3

ngram: Fast n-Gram 'Tokenization'

An n-gram is a sequence of n "words" taken, in order, from a body of text. This package is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The tokenization and babbling are handled by efficient C code, which can also be built as its own standalone library. The babbler is a simple Markov chain. The package also includes a vignette with complete example workflows and information about the utilities it offers.
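To make the two ideas in the description concrete, here is a minimal Python sketch of n-gram tokenization and Markov chain babbling. It is not the package's C implementation or its R API. The function names (`ngrams`, `next_words`, `babble_sketch`), the whitespace-based word splitting, and the restart-on-dead-end behaviour are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def ngrams(text, n):
    """Return every run of n consecutive words, in order of appearance."""
    words = text.split()  # assumption: words are separated by whitespace
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def next_words(text, n):
    """Map each n-gram to the words observed immediately after it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - n):
        table[tuple(words[i:i + n])].append(words[i + n])
    return table


def babble_sketch(text, n, genlen, seed=None):
    """Generate genlen words by walking a Markov chain over n-grams."""
    grams = ngrams(text, n)
    if not grams:
        raise ValueError("text has fewer than n words")
    table = next_words(text, n)
    rng = random.Random(seed)
    state = rng.choice(grams)  # random starting n-gram
    out = list(state)
    while len(out) < genlen:
        followers = table.get(state)
        if not followers:
            # Dead end (n-gram only seen at the end of the text):
            # restart from a random n-gram. This is an assumed policy.
            state = rng.choice(grams)
            out.extend(state)
        else:
            word = rng.choice(followers)
            out.append(word)
            state = state[1:] + (word,)  # slide the window by one word
    return " ".join(out[:genlen])
```

Calling `babble_sketch("A B A C A B B", n=2, genlen=10, seed=1)` returns ten words drawn from the input's vocabulary, and repeating the call with the same seed gives the same string. Duplicate followers are kept in the table on purpose, so a word that follows an n-gram more often is proportionally more likely to be chosen.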

Authors: Drew Schmidt [aut, cre], Christian Heckendorf [aut]

ngram_3.2.3.tar.gz
ngram_3.2.3.zip (r-4.5), ngram_3.2.3.zip (r-4.4), ngram_3.2.3.zip (r-4.3)
ngram_3.2.3.tgz (r-4.4-x86_64), ngram_3.2.3.tgz (r-4.4-arm64), ngram_3.2.3.tgz (r-4.3-x86_64), ngram_3.2.3.tgz (r-4.3-arm64)
ngram_3.2.3.tar.gz (r-4.5-noble), ngram_3.2.3.tar.gz (r-4.4-noble)
ngram_3.2.3.tgz (r-4.4-emscripten), ngram_3.2.3.tgz (r-4.3-emscripten)
ngram.pdf | ngram.html
ngram/json (API)

# Install 'ngram' in R:
install.packages('ngram', repos = c('https://wrathematics.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/wrathematics/ngram/issues

Topics: ngram, text, text-mining

18 exports 71 stars 4.15 score 0 dependencies 4 dependents 5 mentions 854 scripts 1.1k downloads

Last updated 9 months ago from: 99ebbc3790. Checks: OK: 9. Indexed: yes.

Target               Result  Date
Doc / Vignettes      OK      Sep 05 2024
R-4.5-win-x86_64     OK      Sep 05 2024
R-4.5-linux-x86_64   OK      Sep 05 2024
R-4.4-win-x86_64     OK      Sep 05 2024
R-4.4-mac-x86_64     OK      Sep 05 2024
R-4.4-mac-aarch64    OK      Sep 05 2024
R-4.3-win-x86_64     OK      Sep 05 2024
R-4.3-mac-x86_64     OK      Sep 05 2024
R-4.3-mac-aarch64    OK      Sep 05 2024

Exports: babble, concatenate, get.nextwords, get.ngrams, get.phrasetable, get.string, getseed, multiread, ng_order, ngram, ngram_asweka, preprocess, print, rcorpus, show, splitter, string.summary, wordcount

Dependencies: none

Guide to the ngram Package

Rendered from ngram-guide.Rnw using utils::Sweave on Sep 05 2024.

Last update: 2022-03-13
Started: 2014-06-16

Readme and manuals

Help Manual

Help page                        Topics
ngram: Fast n-Gram Tokenization  ngram-package
ngram Babbler                    babble babble,ngram-method
Concatenate                      concatenate
getseed                          getseed
ngram Getters                    get.nextwords get.nextwords,ngram-method get.ngrams get.ngrams,ngram-method get.string get.string,ngram-method getters ng_order ng_order,ngram-method
Multiread                        multiread
n-gram Tokenization              ngram tokenize
Class ngram                      ngram-class
ngram printing                   ngram-print print,ngram-method show,ngram-method
Get Phrasetable                  get.phrasetable phrasetable
Basic Text Preprocessor          preprocess
Random Corpus                    rcorpus
Character Splitter               splitter
Text Summary                     string.summary
Weka-like n-gram Tokenization    ngram_asweka Tokenize-AsWeka
wordcount                        wordcount wordcount.character wordcount.ngram