时过境迁,包也要更新了,中文分词用jiebaR,做词云用wordcloud2,该包是封装了javascript的某个词云包,因此,自带交互效果。
R code:
library(wordcloud2)
library(jiebaR)
### get data
data <- read.csv('C:\\Users\\your_file.csv',
stringsAsFactors = FALSE)
text <- data$包含文本的字段
### load the engine
### add stopword file
cutter <- worker(byline = TRUE,
stop_word = "C:\\Users\\your_dir\\stop_words.txt")
# add new words
new_user_word(cutter, "不稳")
# split the sentences to words
word <- segment(text, cutter)
word_2 <- unlist(word)
# get word frequency
wordFreq <- sort(table(word_2), decreasing = TRUE)
df_freq <- data.frame(word = names(wordFreq),
freq = as.numeric(unname(wordFreq)))
# get wordcloud
wordcloud2(df_freq[1:100, ], fontFamily = 'Microsoft YaHei')
备注:转移自新浪博客,截至2021年11月,原阅读数99,评论0个。