I am using Python 2.7 and Django 1.8, served by Apache on Ubuntu Linux. I have a JSON file containing about 23,000 tweets, and I want to classify them into predefined categories. When I run the code, Django raises MissingCorpusError at / and suggests:

To download the necessary data, simply run

python -m textblob.download_corpora

I already have the latest corpora for TextBlob. Still, I get the error.

My views.py is as follows:

def get_tweets(request):
    retweet = 0
    category = ''
    sentiment = ''
    tweets_data_path = STATIC_PATH+'/stream.json'
    tweets_data = []
    tweets_file = open(tweets_data_path, "r")
    for line in tweets_file:
        try:
            tweet = json.loads(line)
            tweets_data.append(tweet)
        except:
            continue
    subs = []
    for l in tweets_data:
        s = re.sub(r"http[\w+]{0,4}://t\.co/[\w]+", "", l)
        subs.append(s)
    for t in subs:
        i = 0
        while i < len(t):
            text = t[i]['tweet_text']
            senti = TextBlob(text)
            category = cl.classify(text)
            if senti.sentiment.polarity > 0:
                sentimen = 'positive'
            elif senti.sentiment.polarity < 0:
                sentimen = 'negative'
            else:
                sentimen = 'neutral'
            if text.startswith('RT'):
                retweet = 1
            else:
                retweet = 0
            twe = Tweet(text=text,category=category,
                sentiment=sentimen, retweet= retweet)
            twe.save()
            i = i+1
    return HttpResponse("done")
Please post the structure of the JSON, and rewrite the while loop as for ti in t. How many tweets are there in each element of subs?
– Pynchia, Sep 15, 2015 at 4:22

Altogether, there are 23689 tweets. Should I post the structure of the JSON file or a particular tweet?
– user5315166, Sep 17, 2015 at 1:04

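I had the same problem. When I downloaded nltk_data, it was placed in /root/nltk_data/. After I copied the nltk_data folder to /var/www/, it worked. This is presumably because Apache runs as a different user (www-data on Ubuntu, whose home directory is /var/www), and NLTK looks for nltk_data in the current user's home directory, among other locations.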

$ sudo cp -avr nltk_data/ /var/www/
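
As an alternative to copying the folder, it may be possible to tell NLTK where the corpora live from inside the Django process. The following is only a sketch under stated assumptions: the directory list and the helper names readable_dirs and register_nltk_data are hypothetical and not part of TextBlob or NLTK; the only library feature relied on is the nltk.data.path list that NLTK searches for corpora. Note that /root is normally not readable by the Apache user, so that entry would usually be skipped.

```python
import os

# Hypothetical candidate locations; adjust to wherever the corpora were downloaded.
CANDIDATE_DIRS = ["/root/nltk_data", "/var/www/nltk_data", "/usr/share/nltk_data"]


def readable_dirs(candidates):
    """Return the candidates that exist and can be read by the current user."""
    return [d for d in candidates
            if os.path.isdir(d) and os.access(d, os.R_OK | os.X_OK)]


def register_nltk_data(candidates=CANDIDATE_DIRS):
    """Append readable candidate directories to NLTK's corpus search path.

    Returns the directories that were added; returns an empty list when
    NLTK is not installed.
    """
    try:
        import nltk
    except ImportError:
        return []
    added = []
    for d in readable_dirs(candidates):
        if d not in nltk.data.path:
            nltk.data.path.append(d)
            added.append(d)
    return added
```

Calling register_nltk_data() early, for example at the top of views.py before the classifier is built, would let the Apache worker find the corpora in a shared location. Setting the NLTK_DATA environment variable for the server process is another option.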
        
