This page is in progress. Suggestions are welcome!
Computational sociolinguistics (general)
Language and Social Identity
Prediction of social variables
A latent variable model for geographic lexical variation. Eisenstein et al., EMNLP 2010.
Automatically categorizing written texts by author gender. Koppel et al., Literary and Linguistic Computing, 2002.
Developing age and gender predictive lexica over social media. Sap et al., EMNLP 2014.
Gender attribution: Tracing stylometric evidence beyond topic and genre. Sarawgi et al., CoNLL 2011.
"How old do you think I am?" A study of language and age in Twitter. Nguyen et al., ICWSM 2013.
Personality, gender, and age in the language of social media: The open-vocabulary approach. Schwartz et al., PLoS ONE, 2013.
Papers that take a critical look at the operationalization of gender in NLP
Improving NLP tools by accounting for language variation
Large-scale analyses
Language and Social Interaction
Automatic identification at the word level
NLP tools for code-switched data
Large-scale analyses of multilingual communication in social media
General resources on NLP code-switching research