Package: ngram 3.2.3
ngram: Fast n-Gram 'Tokenization'
An n-gram is a sequence of n "words" taken, in order, from a body of text. This package is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The tokenization and babbling are handled by efficient C code, which can also be built as its own standalone library. The babbler is a simple Markov chain. The package also includes a vignette with complete example workflows and information about the utilities it offers.
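As a minimal sketch of the tokenize-then-babble workflow described above, the following R snippet builds 2-grams from a tiny made-up string, tabulates them, and generates text from the resulting Markov chain. The sample text, the seed value, and the variable names are assumptions chosen for illustration; the vignette remains the authoritative walkthrough.

```r
# Assumes the 'ngram' package is already installed.
library(ngram)

# A tiny illustrative corpus (made up for this example): 8 space-separated words.
txt <- "a b a c a b b a"

# Tokenize into 2-grams; words are split on spaces by default.
ng <- ngram(txt, n = 2)

# Frequency table of the distinct n-grams, with counts and proportions.
pt <- get.phrasetable(ng)
print(pt)

# Generate 10 words from the Markov chain.
# Fixing the seed makes the run repeatable.
out <- babble(ng, genlen = 10, seed = 1234)
print(out)
```

Eight words yield seven overlapping 2-grams, so the counts in the phrase table sum to seven. Passing the same `seed` to `babble` again should reproduce the same generated text.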
Authors:
ngram_3.2.3.tar.gz
ngram_3.2.3.zip (r-4.5) | ngram_3.2.3.zip (r-4.4) | ngram_3.2.3.zip (r-4.3)
ngram_3.2.3.tgz (r-4.4-x86_64) | ngram_3.2.3.tgz (r-4.4-arm64) | ngram_3.2.3.tgz (r-4.3-x86_64) | ngram_3.2.3.tgz (r-4.3-arm64)
ngram_3.2.3.tar.gz (r-4.5-noble) | ngram_3.2.3.tar.gz (r-4.4-noble)
ngram_3.2.3.tgz (r-4.4-emscripten) | ngram_3.2.3.tgz (r-4.3-emscripten)
ngram.pdf | ngram.html
ngram/json (API)
# Install 'ngram' in R:
install.packages('ngram', repos = c('https://wrathematics.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/wrathematics/ngram/issues
Last updated 12 months ago from: 99ebbc3790. Checks: OK: 9. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 04 2024 |
R-4.5-win-x86_64 | OK | Nov 04 2024 |
R-4.5-linux-x86_64 | OK | Nov 04 2024 |
R-4.4-win-x86_64 | OK | Nov 04 2024 |
R-4.4-mac-x86_64 | OK | Nov 04 2024 |
R-4.4-mac-aarch64 | OK | Nov 04 2024 |
R-4.3-win-x86_64 | OK | Nov 04 2024 |
R-4.3-mac-x86_64 | OK | Nov 04 2024 |
R-4.3-mac-aarch64 | OK | Nov 04 2024 |
Exports: babble, concatenate, get.nextwords, get.ngrams, get.phrasetable, get.string, getseed, multiread, ng_order, ngram, ngram_asweka, preprocess, print, rcorpus, show, splitter, string.summary, wordcount
Dependencies:
Readme and manuals
Help Manual
Help page | Topics |
---|---|
ngram: Fast n-Gram Tokenization | ngram-package |
ngram Babbler | babble babble,ngram-method |
Concatenate | concatenate |
getseed | getseed |
ngram Getters | get.nextwords get.nextwords,ngram-method get.ngrams get.ngrams,ngram-method get.string get.string,ngram-method getters ng_order ng_order,ngram-method |
Multiread | multiread |
n-gram Tokenization | ngram tokenize |
Class ngram | ngram-class |
ngram printing | ngram-print print,ngram-method show,ngram-method |
Get Phrasetable | get.phrasetable phrasetable |
Basic Text Preprocessor | preprocess |
Random Corpus | rcorpus |
Character Splitter | splitter |
Text Summary | string.summary |
Weka-like n-gram Tokenization | ngram_asweka Tokenize-AsWeka |
wordcount | wordcount wordcount.character wordcount.ngram |
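A few of the helpers listed above can be combined as in the rough sketch below: clean a string with `preprocess`, count its words with `wordcount`, and produce a flat character vector of n-grams with `ngram_asweka`. The input strings and argument choices are assumptions made for illustration; each function's help page documents its full set of options.

```r
# Assumes the 'ngram' package is already installed.
library(ngram)

# Made-up input with mixed case, punctuation, and irregular spacing.
raw <- "Hello, World!  Foo"

# Lowercase the text and strip punctuation before tokenizing.
clean <- preprocess(raw, case = "lower", remove.punct = TRUE)
print(clean)

# Count the space-separated words in the cleaned string.
n_words <- wordcount(clean)
print(n_words)

# Weka-like tokenization: every 1-gram and 2-gram as a character vector.
toks <- ngram_asweka("a b c d", min = 1, max = 2)
print(toks)
```

Unlike `ngram`, which returns an object of class `ngram`, `ngram_asweka` returns a plain character vector, which can be convenient when the tokens feed directly into other R code.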