How to analyze sentiment in tweets using Python

Author: Vinicius dos Santos
Published: December 12, 2020
Category: Natural Language Processing with Python course / Courses

In this lesson we will see how to analyze the sentiment of tweets using Python and the sentiment classifier trained in the previous lesson. To do so, we will use the tweepy library to mine (extract) tweets from Twitter. It handles the connection to Twitter and makes the tweets easier for the programmer to work with.

Using the API in Python

The first step is to import tweepy (already installed via pip install tweepy), along with the other libraries used in this project.

      
import tweepy
import re
import pickle

from tweepy import OAuthHandler

As described in the introductory lesson, you need to initialize the tokens. To do that, go to the Twitter developer platform, create a new account, read and accept all the terms, and request a new access key.

# initialize the keys (replace each placeholder with your own value)
consumer_key = '___your consumer key___'
consumer_secret = '___your consumer secret___'
access_token = '___your access token___'
access_secret = '___your access secret___'

Next, pass the tokens to an OAuthHandler object to authenticate with the API. The args list holds the term you want to search for in tweets.

auth = OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_secret)

args = ['league of legends']

api = tweepy.API(auth,timeout=10)
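Hard-coding the four keys in the script makes them easy to leak. As a minimal sketch, assuming you export them as environment variables first, the keys could be read like this. The variable names and the load_credentials helper are hypothetical, chosen only for illustration; they are not part of tweepy.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values can then be passed to OAuthHandler and set_access_token in place of the literal strings.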

You can then load the tweets into a list:


list_tweets = []

query = args[0]
if len(args) == 1:
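    # fetch up to 100 recent English tweets for the term, excluding retweets
    # (tweepy 3.x API; from tweepy 4.0 on, api.search is named api.search_tweets)
    for status in tweepy.Cursor(api.search, q=query + ' -filter:retweets', lang='en', result_type='recent').items(100):
        list_tweets.append(status.text)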
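The collection step can also be kept separate from the network call, which makes it possible to try it offline. The sketch below is an illustration only: collect_texts is a hypothetical helper, and it additionally skips exact duplicate texts, which the loop above does not do. It accepts any iterable of objects with a .text attribute, such as the items yielded by tweepy.Cursor.

```python
from types import SimpleNamespace


def collect_texts(statuses, limit=100):
    """Return up to `limit` distinct tweet texts, keeping first-seen order."""
    seen = set()
    texts = []
    for status in statuses:
        if len(texts) >= limit:
            break
        text = status.text
        if text in seen:
            continue  # skip exact duplicates
        seen.add(text)
        texts.append(text)
    return texts


# Stand-in objects that mimic the `.text` attribute of a tweepy status.
fake_statuses = [SimpleNamespace(text=t) for t in ["a", "b", "a", "c"]]
print(collect_texts(fake_statuses, limit=2))  # ['a', 'b']
```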

To classify a tweet we need a trained classifier; if you do not know how to build one, see our post on how to train classifiers in Python. Keep in mind that, for this code to work, you need an annotated corpus and a classifier able to identify sentiment in texts similar to tweets.

# load our vectorizer
with open('tfidfmodel.pickle', 'rb') as f:
    vectorizer = pickle.load(f)

# load the classifier
with open('classifier.pickle', 'rb') as f:
    clf = pickle.load(f)

# quick check on a sample sentence
resultado = clf.predict(vectorizer.transform(['we are awesome']))
print(resultado)

Finally, we preprocess each tweet and classify it with the trained classifier.

      
# preprocessing

for tweet in list_tweets:
    # remove t.co links at the start, in the middle, and at the end of the tweet
    tweet = re.sub(r"^https://t\.co/[a-zA-Z0-9]*\s", " ", tweet)
    tweet = re.sub(r"\s+https://t\.co/[a-zA-Z0-9]*\s", " ", tweet)
    tweet = re.sub(r"\s+https://t\.co/[a-zA-Z0-9]*$", " ", tweet)
    tweet = tweet.lower()
    # expand common contractions
    tweet = re.sub(r"that's", "that is", tweet)
    tweet = re.sub(r"what's", "what is", tweet)
    tweet = re.sub(r"where's", "where is", tweet)
    tweet = re.sub(r"it's", "it is", tweet)
    tweet = re.sub(r"who's", "who is", tweet)
    tweet = re.sub(r"i'm", "i am", tweet)
    tweet = re.sub(r"she's", "she is", tweet)
    tweet = re.sub(r"he's", "he is", tweet)
    tweet = re.sub(r"they're", "they are", tweet)
    tweet = re.sub(r"who're", "who are", tweet)
    tweet = re.sub(r"ain't", "am not", tweet)
    tweet = re.sub(r"wouldn't", "would not", tweet)
    tweet = re.sub(r"can't", "can not", tweet)
    tweet = re.sub(r"couldn't", "could not", tweet)
    tweet = re.sub(r"won't", "will not", tweet)
    # replace non-word characters and digits with spaces
    tweet = re.sub(r"\W", " ", tweet)
    tweet = re.sub(r"\d", " ", tweet)
    # drop stray single letters in the middle, at the end, and at the start
    tweet = re.sub(r"\s+[a-z]\s+", " ", tweet)
    tweet = re.sub(r"\s+[a-z]$", " ", tweet)
    tweet = re.sub(r"^[a-z]\s+", " ", tweet)
    # turn every whitespace character (tab, newline) into a plain space
    tweet = re.sub(r"\s", " ", tweet)
    sent = clf.predict(vectorizer.transform([tweet]))
    print(tweet, ":", sent, "\n")
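The same cleaning steps can be packaged as a function so they are easy to test on their own. The version below is a sketch, not a line-for-line copy of the loop: clean_tweet, CONTRACTIONS and URL are illustrative names, wouldn't is expanded to would not, links are removed wherever they occur, single letters are matched with word boundaries, and runs of whitespace are collapsed and trimmed.

```python
import re

# (pattern, expansion) pairs for the contractions handled in the lesson.
CONTRACTIONS = [
    (r"that's", "that is"), (r"what's", "what is"), (r"where's", "where is"),
    (r"it's", "it is"), (r"who's", "who is"), (r"i'm", "i am"),
    (r"she's", "she is"), (r"he's", "he is"), (r"they're", "they are"),
    (r"who're", "who are"), (r"ain't", "am not"), (r"wouldn't", "would not"),
    (r"can't", "can not"), (r"couldn't", "could not"), (r"won't", "will not"),
]
URL = re.compile(r"https://t\.co/[a-zA-Z0-9]*")


def clean_tweet(tweet):
    """Lowercase a tweet and strip links, punctuation, digits and stray letters."""
    tweet = URL.sub(" ", tweet).lower()
    for pattern, expansion in CONTRACTIONS:
        tweet = re.sub(pattern, expansion, tweet)
    tweet = re.sub(r"\W", " ", tweet)          # non-word characters -> space
    tweet = re.sub(r"\d", " ", tweet)          # digits -> space
    tweet = re.sub(r"\b[a-z]\b", " ", tweet)   # stray single letters -> space
    return re.sub(r"\s+", " ", tweet).strip()  # collapse and trim whitespace


print(clean_tweet("I can't wait! https://t.co/abc123 It's 2020"))
# can not wait it is
```

With a helper like this, the loop body reduces to calling clean_tweet(tweet) before vectorizer.transform. Note that single-letter words such as "i" and "a" are dropped, as in the original steps.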

At the end we get the following output (where [1] is positive and [0] is negative):

    
added video to youtube playlist zedd ignite worlds league of legends : [0] 
added video to youtube playlist legends never die  ft  against the current    worlds        league of : [0] 
added video to youtube playlist imagine dragons  warriors   worlds  league of legends : [0] 
liked youtube video uzay serüveni  modu     league of legends gameplay : [0] 
live league of legends   mundial        fase de entrada   dia    via youtube : [0] 
liked youtube video  league of legends diamond iv yuhuhuhu facem si noi likeuri : [0] 
shaian we re updating the emblems because they don fit the aesthetic of league of legends or the world of ru   : [1] 
li league of legends makes me really pissed off lot  but the game itself isn the problem  it is assholes and   : [0] 
looking for some people who want to play league of legends with me  just for fun am terrible    but would like to play with group : [1] 
liked youtube video rise  ft  the glitch mob  mako  and the word alive    worlds        league of legends : [0] 
new ryze skin spotlight pre release league of legends thanks riotgames  : [1] 
the permanent ban turned out to be not so permanent  here the story of  loltyler road to league of legends red   : [0] 
dia league of legends : [0] 
liked youtube video pulsefire ezreal  skin spotlight   pre release   league of legends : [0] 
intersting structure developments   : [1]
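Reading the labels one tweet at a time gets tedious, so it can help to tally them. The sketch below is only an illustration: summarize and LABELS are hypothetical names, and the function assumes the predictions have been gathered into a flat list of 0/1 values (for example by appending int(sent[0]) inside the loop). The sample list reproduces the labels of the output shown above.

```python
from collections import Counter

# Label convention from the lesson: 1 is positive, 0 is negative.
LABELS = {1: "positive", 0: "negative"}


def summarize(predictions):
    """Count how many predictions fall into each sentiment class."""
    counts = Counter(LABELS[int(p)] for p in predictions)
    return {name: counts[name] for name in LABELS.values()}


# Labels of the fifteen tweets in the output above, in order.
sample = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1]
print(summarize(sample))  # {'positive': 4, 'negative': 11}
```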

Sentiment analysis is now a widely discussed topic in the data science community, especially among those who study natural language processing. It is not a particularly hard problem, yet it can be a useful tool for managers of large companies and for data scientists.

Keep in mind that how the model is trained strongly influences the results, so pay close attention to the data you use and to how it was annotated.

Tags: NLP Algorithms, Sentiment Analysis, Classifier, Python
