Named entity recognition with NLTK in Python: identifying the NE label

I need to classify words by their part of speech: verb, noun, adverb, and so on. I used:

nltk.word_tokenize()  # split a sentence into word tokens
nltk.pos_tag()        # tag each token with its part of speech
nltk.ne_chunk()       # group tagged tokens into named-entity chunks

The output is a tree. For example:

>>> import nltk
>>> sentence = "I am Jhon from America"
>>> sent1 = nltk.word_tokenize(sentence)
>>> sent2 = nltk.pos_tag(sent1)
>>> sent3 = nltk.ne_chunk(sent2, binary=True)
>>> sent3
Tree('S', [('I', 'PRP'), ('am', 'VBP'), Tree('NE', [('Jhon', 'NNP')]), ('from', 'IN'), Tree('NE', [('America', 'NNP')])])

When I access the elements of this tree, I do it like this:

>>> sent3[0]
('I', 'PRP')
>>> sent3[0][0]
'I'
>>> sent3[0][1]
'PRP'

However, when I access a named entity:

>>> sent3[2]
Tree('NE', [('Jhon', 'NNP')])
>>> sent3[2][0]
('Jhon', 'NNP')
>>> sent3[2][1]
Traceback (most recent call last):
  File "<pyshell#121>", line 1, in <module>
    sent3[2][1]
  File "C:\Python26\lib\site-packages\nltk\tree.py", line 139, in __getitem__
    return list.__getitem__(self, index)
IndexError: list index out of range

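I get the error above. What I want is to get 'NE' as the output, just as I got 'PRP' earlier; without it I cannot tell which words are named entities. Is there a way to do this with NLTK in Python? If so, please post the command. Or is there a function in the tree library that does this? I need the node value 'NE'.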


5 answers

掸牛浓疗

This answer may be wrong, in which case I will delete it, since I don't have NLTK installed here to try it. But I think you can do this:

>>> sent3[2].node
'NE'

sent3[2][0] returns the first child of the tree, not the node itself.

Edit: I tried this when I got home, and it does work.
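To make the difference between a subtree's children and its node label concrete, here is a minimal sketch that rebuilds the question's tree by hand, so no tagger or chunker models are needed. It assumes NLTK 3 or later, where the label is read with label(); on the NLTK 2 release used in the question, the same value was the .node attribute.

```python
from nltk.tree import Tree

# Hand-built copy of the tree printed in the question.
sent3 = Tree('S', [
    ('I', 'PRP'),
    ('am', 'VBP'),
    Tree('NE', [('Jhon', 'NNP')]),
    ('from', 'IN'),
    Tree('NE', [('America', 'NNP')]),
])

ne = sent3[2]

# Indexing a Tree walks its children; this subtree has exactly one child,
# which is why ne[1] raises IndexError.
child_count = len(ne)
first_child = ne[0]

# The label belongs to the subtree itself, not to its list of children.
ne_label = ne.label()

print(child_count, first_child, ne_label)
```

The (word, tag) leaves are plain tuples, so index 1 works on them, while on a subtree index 1 would mean a second child.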

你换

Here is my code:

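myNE = []  # collected named-entity strings
chunks = ne_chunk(postags, binary=True)  # postags is the output of pos_tag()
for c in chunks:
    # subtrees have a node attribute; plain (word, tag) leaves do not
    if hasattr(c, 'node'):
        myNE.append(' '.join(i[0] for i in c.leaves()))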

佬棠

This will work:

for sent in chunked_sentences:
    for chunk in sent:
        if hasattr(chunk, "label"):
            print(chunk.label())
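Building on the loops above, a small helper can return each entity together with its label. This is only a sketch: extract_entities is a hypothetical name, the tree is built by hand from the output shown in the question, and label() assumes NLTK 3 or later.

```python
from nltk.tree import Tree


def extract_entities(chunked):
    """Return (label, text) pairs for each top-level subtree of a chunked sentence."""
    entities = []
    for chunk in chunked:
        # Leaves are (word, tag) tuples; chunks are Tree instances.
        if isinstance(chunk, Tree):
            words = [word for word, tag in chunk.leaves()]
            entities.append((chunk.label(), ' '.join(words)))
    return entities


# Hand-built stand-in for the result of nltk.ne_chunk(..., binary=True).
chunked = Tree('S', [
    ('I', 'PRP'),
    ('am', 'VBP'),
    Tree('NE', [('Jhon', 'NNP')]),
    ('from', 'IN'),
    Tree('NE', [('America', 'NNP')]),
])

entities = extract_entities(chunked)
print(entities)
```

Checking isinstance(chunk, Tree) rather than hasattr keeps the test independent of which attribute name the installed NLTK version happens to expose.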

屑凉赦

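I agree with bdk: sent3[2].node gives the output 'NE'. I don't think NLTK has any function that does this other than the solution above, but for reference you can look here. For the loop problem, you can do: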

for i in range(len(sent3)):
    if "NE" in str(sent3[i]):
        print sent3[i].node

I ran it in NLTK and it works well.

目浆搽

sent3[2].node is now deprecated. Use sent3[2].label() instead.
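If the same code has to run against both older and newer NLTK releases, one option is a small compatibility helper. This is a sketch under that assumption; node_label is a hypothetical name, not an NLTK function, and the tree is again built by hand from the question's output.

```python
from nltk.tree import Tree


def node_label(item):
    """Return the label of a subtree, or None for a plain (word, tag) leaf."""
    label = getattr(item, 'label', None)
    if callable(label):
        return label()                    # NLTK 3 and later: label() method
    return getattr(item, 'node', None)    # NLTK 2: .node attribute; None for leaves


sent3 = Tree('S', [
    ('I', 'PRP'),
    ('am', 'VBP'),
    Tree('NE', [('Jhon', 'NNP')]),
    ('from', 'IN'),
    Tree('NE', [('America', 'NNP')]),
])

labels = [node_label(item) for item in sent3]
print(labels)
```

The helper tries label() first, so on current releases the old attribute is never touched for subtrees.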
