Computational Social Science

I am interested in computational methods and the analysis of large social media datasets to investigate various aspects of political communication. I use text mining, natural language processing in multiple languages, machine learning, and data visualization. My primary language is R. I also use Python, SPSS, Ruby, and Mplus when necessary.

Political Communication

Political polarization is one of my major research topics. I treat political polarization as a multidimensional concept: my research approaches it from different perspectives and examines how cognitive, affective, and perceived polarization are related to one another.

Audience Psychology

How do different audience segments process the same information differently? I study how media messages interact with strong partisan identity, especially when partisans encounter information that conflicts with their own viewpoint.



Computational Social Science

Shah, D. V., Hanna, A., Bucy, E. P., Lassen, D. S., Van Thomme, J., Bialik, K., Yang, J., & Pevehouse, J. (2016). Verbal, Tonal, and Visual Influences During Presidential Debates: Assessing Effects on the Volume and Valence of Online Expression in Real Time. American Behavioral Scientist.

Bode, L., Hanna, A., Yang, J., & Shah, D.V. (2015). Candidate Networks, Citizen Clusters, and Political Expression: Strategic Hashtag Use in the 2010 Midterms. The Annals of the American Academy of Political and Social Science.

Political Polarization

Yang, J., Rojas, H., Wojcieszak, M., Coen, S., Curran, Iyengar, S., …, & Tiffen, R. (Accepted for publication). Why Are “Others” So Polarized?: Perceived Political Polarization and Media Use in 10 Countries. Journal of Computer-Mediated Communication.

Political Selective Exposure

Edgerly, S., Vraga, E. K., McLaughlin, B., Alvarez, G., Yang, J., & Kim, Y. M. (2014). Navigational Structure & Information Selection Goals: A Closer Look at Online Selectivity. Journal of Broadcasting and Electronic Media.

Political Socialization

Vraga, E. K., Bode, L., Yang, J., Edgerly, S., Thorson, K., Wells, C., & Shah, D. V. (2014). Political influence across generations: partisanship and candidate evaluations in the 2008 election. Information, Communication & Society, 17(2), 184-202. doi: 10.1080/1369118X.2013.872162

  • Recent scholarship in political socialization has moved beyond traditional transmission models of parent-driven socialization to consider alternative pathways, like trickle-up socialization and its predictors. However, these studies have paid less attention to the diverse ways in which parents and children develop discrete political orientations, especially during a competitive presidential campaign. In this study, we examine various pathways through which influence occurs across generations in terms of partisanship and candidate evaluations. Our results suggest that while harmonious attitudes remain the norm, there are substantial opportunities for youth to demonstrate their independence, particularly when gaining perspectives from schools and digital media sources. Our findings indicate the importance of exploring how youth and their parents come to understand politics and the forces that shape youth socialization.

Effect of Online Expression in Health Communication

McLaughlin, B., Yang, J., Yoo, W. H., Kim, S. Y., Shah, D. V., Shaw, B., & Gustafson, D. H. (2015). The Effects of Expressing Religious Support Online for Breast Cancer Patients. Health Communication. doi: 10.1080/10410236.2015.1007550

Yoo, W., Chih, M.-Y., Kwon, M.-W., Yang, J., Cho, E., McLaughlin, B., . . . Gustafson, D. H. (2013). Predictors of the change in the expression of emotional support within an online breast cancer support group: A longitudinal study. Patient Education and Counseling, 90(1), 88-95.


Computational Social Science

Shah, D. V., Culver, K., Hanna, A., Macafee, T., & Yang, J. (Forthcoming). Everyday Political Talk Online. In Handbook of Digital Politics, edited by S. Coleman & D. Freelon. Edward Elgar: Cheltenham, UK.

Political Socialization

Vraga, E. K., Bode, L., Yang, J., Edgerly, S., Thorson, K., Wells, C., & Shah, D. V. (2014). Political Influence across Generations: Partisanship and Candidate Evaluations in the 2008 US Presidential Election. In The Networked Young Citizen: Social Media, Political Participation and Civic Engagement, edited by B. D. Loader, A. Vromen & M. Xenos. New York, NY: Routledge.

Spencer Foundation. (2014). White paper on Political Influence within Parent-Child Dyads: Partisan Ideology, Candidate Preference, and Political Participation.

Political Polarization

Yang, J. (2011). Polarized attitude or polarized perception?: Political polarization in Colombia. In Comunicacion y Ciudadania, edited by H. Rojas, M. Wojcieszak, H. Gil de Zuniga, and D. Mazorra. Universidad Externado de Colombia Press: Bogota.




Effect of Online Expression in Health Communication

Yoo, W., Yang, J., & Cho, E. (R & R). How Social Media Influence College Students’ Smoking Attitudes and Susceptibility?: Focused on the Influence of Presumed Influence Model. Computers in Human Behavior.

Political Socialization

Bode, L., Vraga, E., Yang, J., Edgerly, S., Thorson, K., Wells, C., & Shah, D. (Invited for publication). Participatory Influence within Parent-Child Dyads: Rethinking the Transmission Model of Socialization. In Resources, Engagement, and Recruitment: New Advances in the Study of Civic Voluntarism, edited by C. A. Klofstad.

Political Selective Exposure

Yang, J., Barnidge, M., & Rojas, H. (Under Review). The Politics of Unfriending. Computers in Human Behavior.

Yang, J., Gunther, A. C., & Wise, D. (Under Preparation). What Comes After First Click?: The Patterns of Selective Exposure.

Computational Social Science

Yang, J., & Kim, Y. M. (R & R). The Million Follower Fallacy? Measuring Candidates’ Political Twitter Activity in the 2010 Midterm Elections. The International Journal of Press/Politics.

Yang, J., Sangari, A., Duncan, M., Zhang, Cao, D., Lukito, J., Bialik, K., Kim, S., Kornfield, R., Wu, Y., & Zhang, W. (2016). Obamacare and Political Polarization on Twitter: An Application of Machine Learning and Social Network Analysis. Paper presented at Communication Crossroads 2016. Madison, WI, USA. [Presentation Slide]

  • This study investigates political polarization in the Twitter conversation about the Affordable Care Act. Using the Twitter Gardenhose API, we collected over 300,000 tweets over three different periods in 2012 when “Obamacare” received national news coverage. This sample ranged from the day the Supreme Court announced it would hear the Obamacare case through the 2012 U.S. presidential election. Using supervised machine learning methods, we classified Twitter users’ political orientation based on text features used in their tweets and profile descriptions. In addition to political orientation, we distinguished members of the public from members of the elite based on Twitter’s verified status and follower count. We assessed retweet networks, which revealed highly polarized clusters of liberals and conservatives. However, levels of polarization were not uniform across time or across Twitter users. In the earlier time periods, conversations between the groups showed substantial cross-cutting, but these cross-cutting links disappeared and the network became more polarized near election day, showing signs of party sorting. Furthermore, the elites and the public interacted differently within the network. Our findings point to the role of the grassroots conservative movement on Twitter in promoting an anti-Obamacare agenda and also highlight the role of mainstream media and journalists in bridging the divide between the two ideological groups. The implications of machine learning and social network analysis for political communication research are discussed.
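The contrast between early cross-cutting retweets and later polarization can be illustrated with a small sketch. This is not the study's actual network analysis: it assumes each retweet is stored as a (retweeter, original author) pair and that every user already carries a classified leaning. The function name `cross_cutting_share` and the toy data are hypothetical.

```python
def cross_cutting_share(retweets, leaning):
    """Fraction of retweet edges that link users with different leanings.

    retweets: iterable of (retweeter, original_author) pairs.
    leaning:  dict mapping user -> label such as "liberal" or "conservative".
    Edges with an unlabeled endpoint are skipped.
    """
    cross = total = 0
    for source, target in retweets:
        if source not in leaning or target not in leaning:
            continue
        total += 1
        if leaning[source] != leaning[target]:
            cross += 1
    return cross / total if total else 0.0


# Hypothetical toy data: four users, two time periods.
leaning = {"a": "liberal", "b": "liberal",
           "c": "conservative", "d": "conservative"}
early = [("a", "b"), ("a", "c"), ("d", "b"), ("c", "d")]
late = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")]

early_share = cross_cutting_share(early, leaning)  # two of four edges cross groups
late_share = cross_cutting_share(late, leaning)    # no edge crosses groups
```

A falling share from one period to the next would be consistent with the pattern described in the abstract, although a real analysis would also need to account for network size and user activity.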

Yang, J., Sangari, A., Zhang, W., & Shah, D. V. (2016). Applying Supervised Machine Learning to Compute Political Ideology Among Twitter Users. Paper submitted to the 2016 International Conference on Computational Social Science. Evanston, IL, USA.

  • Social media have become a new political arena where politicians, journalists, media, activists, and individuals talk about different issues and share information. As social media data become more accessible and are read for traces of public opinion, growing scholarly attention has been paid to identifying the political ideology of users. Relying on Twitter data, we seek to identify the political leaning of Twitter users from the linguistic features of tweets, hashtag use, patterns of retweets and @mentions, and self-described user profiles, using supervised machine learning methods. In particular, this paper discusses several different procedures for training machine classifiers and examines how different methodological choices affect the overall precision of the estimation. First, we suggest a strategy of sorting out highly influential users to reduce errors in the human coding procedure. Since human coding of highly active Twitter accounts results in labeling a large volume of tweets generated by these accounts, this additional step is useful for boosting the size of the training data and enhancing the overall accuracy of the estimation. Second, we propose cross-validation of the training data as an important step in supervised machine learning. Third, we apply N-grams to capture phrases and multi-word expressions and to take word dependencies into consideration. Fourth, we compare the outcome of two-category classification (e.g., conservative vs. liberal) with the outcome of three-category classification (e.g., conservative vs. liberal vs. neutral) and discuss how these methodological choices affect the accuracy of automated classification of Twitter users’ political ideology. We present applications of this method using two different events as case studies: an issue-specific conversation around the Affordable Care Act in 2012 and a general political conversation during the first U.S. presidential debate in 2012. Further applications of the method and implications for social science research are discussed.
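The N-gram and cross-validation steps named in the abstract can be sketched in a few lines. The abstract does not say which classifier was used, so a multinomial naive Bayes model with add-one smoothing stands in here purely as an assumption. The function names, the toy texts, and the labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return all 1-grams up to n_max-grams of a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


def train(docs, labels, n_max=2):
    """Count n-gram features per class (multinomial naive Bayes)."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        feats = ngrams(doc, n_max)
        feat_counts[label].update(feats)
        vocab.update(feats)
    return class_counts, feat_counts, vocab


def predict(model, doc, n_max=2):
    """Return the most probable label, using add-one smoothing."""
    class_counts, feat_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(feat_counts[label].values())
        score = math.log(class_counts[label] / n_docs)
        for feat in ngrams(doc, n_max):
            if feat in vocab:  # n-grams unseen in training are ignored
                score += math.log(
                    (feat_counts[label][feat] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=3, n_max=2):
    """Mean accuracy over k folds; fold i holds out items i, i+k, i+2k, ...

    Assumes len(docs) >= k so that no fold is empty.
    """
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        train_docs = [d for i, d in enumerate(docs) if i not in test_idx]
        train_labels = [l for i, l in enumerate(labels) if i not in test_idx]
        model = train(train_docs, train_labels, n_max)
        hits = sum(predict(model, docs[i], n_max) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


# Hypothetical hand-labeled examples (two-category case).
docs = ["repeal obamacare now", "protect affordable care",
        "repeal the mandate", "affordable care saves lives",
        "obamacare mandate must go", "protect health care access"]
labels = ["conservative", "liberal"] * 3

accuracy = cross_validate(docs, labels, k=3)
model = train(docs, labels)
guess = predict(model, "repeal the mandate now")
```

Because the labels are plain strings, the same functions accept a third category such as "neutral", so two-category and three-category accuracy could be compared by calling `cross_validate` on each labeling of the same texts.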

Kornfield, R., Yang, J., Zhang, Y., Lukito, J., & Wu, Y. How People Talk about Obamacare?: Differences in Linguistic Patterns of the Conservatives and the Liberals.