1.text.concordance(word)
这个函数就是用来搜索单词word在text 中出现多的情况,包括出现的那一行,重点强调上下文,实例如下:
>>> text1.concordance("monstrous")
Displaying 11 of 11 matches:
ong the former , one was of a most monstrous size . ... This came towards us ,
ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r
ll over with a heathenish array of monstrous clubs and spears . Some were thick
d as you gazed , and wondered what monstrous cannibal and savage could ever hav
that has survived the flood ;
most monstrous and most mountainous ! That Himmal
they might scout at Moby Dick as a monstrous fable , or still worse and more de
th of Radney .'" CHAPTER 55 Of the monstrous Pictures of Whales . I shall ere l
ing Scenes . In connexion with the monstrous pictures of whales , I am strongly
ere to enter upon those still more monstrous stories of them which are to be fo
ght have been rummaged out of this monstrous cabinet there is no telling . But
of Whale - Bones ;
for Whales of a monstrous size are oftentimes cast up dead u
>>>
**2.text.similar(word)**
这个函数的作用则是根据word 的上下文的单词的情况,来查找具有相似的上下文的单词.similar() 函数会在文本中 搜索具有类似结构的其他单词, 不过貌似这个函数只会考虑一些简单的指标,来作为相似度,比如上下文的词性,更多的完整匹配, 不会涉及到语义.可以看看下面的例子:```python
>>> text1.similar("monstrous")
mean part maddens doleful gamesome subtly uncommon careful untoward
exasperate loving passing mouldy christian few true mystifying
imperial modifies contemptible
>>> text2.similar("monstrous")
very heartily so exceedingly remarkably as vast a great amazingly
extremely good sweet
>>>
这个可以看出的是, text1 和text2 对同一个单词monstrous 的不同使用风格.
3.text.common_contexts([word1,word2…])
这个函数跟simailar() 有点类似,也是在根据上下文搜索的.
不同的是,这个函数是用来搜索 共用 参数中的列表中的所有单词,的上下文.即: word1,word2 相同的上下文.看例子:
>>> text2.common_contexts(["monstrous", "very"])
a_pretty is_pretty am_glad be_glad a_lucky
>>>
4.text.dispersion_plot([word1, word2,])
这个函数是用离散图 表示 语料中word 出现的位置序列表示.
text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"])
文章图片
其中横坐标表示文本的单词位置.纵坐标表示查询的单词, 坐标里面的就是,单词出现的位置.就是 单词的分布情况。
【NLTK入门-常用函数】5.text.generate()
以上述不同风格产生随机文本。虽然文本是随机的,但重复使用了源文本中常见的单词和短语,从而能使我们感受到它的风格和内容。
>>> text3.generate()
In the beginning of his brother is a hairy man , whose top may reach
unto heaven ;
and ye shall sow the land of Egypt there was no bread in
all that he was taken out of the month , upon the earth . So shall thy
wages be ? And they made their father ;
and Isaac was old , and kissed
him : and Laban with his cattle in the midst of the hands of Esau thy
first born , and Phichol the chief butler unto his son Isaac , she
推荐阅读
- 人工智能|hugginface-introduction 案例介绍
- 深度学习|论文阅读(《Deep Interest Evolution Network for Click-Through Rate Prediction》)
- nlp|Keras(十一)梯度带(GradientTape)的基本使用方法,与tf.keras结合使用
- NER|[论文阅读笔记01]Neural Architectures for Nested NER through Linearization
- 深度学习|2019年CS224N课程笔记-Lecture 17:Multitask Learning
- 深度学习|[深度学习] 一篇文章理解 word2vec
- 论文|预训练模型综述2020年三月《Pre-trained Models for Natural Language Processing: A Survey》
- NLP|NLP预训练模型综述
- NLP之文本表示——二值文本表示
- 隐马尔科夫HMM应用于中文分词