[PHOTO]

PARAMVEER DHILLON

Assistant Professor
School of Information
University of Michigan

Affiliate Faculty
Electrical Engineering & Computer Science (EECS)
Michigan Institute for Data Science (MIDAS)

Digital Fellow
MIT Initiative on the Digital Economy

Office: 3389 North Quad, 105 S. State Street, Ann Arbor, MI 48109
Phone: 734-764-5876
Email: lastname followed by the letter 'p' at umich dot edu
Twitter: @dhillon_p
Google Scholar: https://goo.gl/FEsnE8


Quick Navigation Links

Use the links below to jump to a specific section of the site, or simply scroll down to view all the content.

Research Interests   Publications   Professional Background   Teaching   Research Group Members   Service   Software  


Research Interests

My research centers on:

  1. Understanding the user-level processes involved in the consumption and production of online news.
  2. Studying the impact of internet technologies (including social media) on users.
  3. Developing new Machine Learning, Text Mining, Network Science, and Causal Inference methods in support of (1) and (2).


Professional Background

Since Fall 2019, I have been an Assistant Professor in the School of Information (SI) at the University of Michigan, where I research and teach various topics in Artificial Intelligence (AI), broadly defined.

I received my A.M. in Statistics and my M.S.E. and Ph.D. in Computer Science from the University of Pennsylvania, where I was advised by Profs. Lyle Ungar, Dean Foster (now at Amazon), and James Gee. During my time at Penn, I also worked closely with Dr. Brian Avants on topics related to Machine Learning for Brain Imaging. My Ph.D. thesis, "Advances in Spectral Learning with Applications to Text Analysis and Brain Imaging," won the Best Computer Science Dissertation Award at Penn (Morris and Dorothy Rubinoff Award). It proposed statistically and computationally efficient methods for learning word embeddings in NLP and for data-driven parcellation/segmentation of human brain images. These methods not only matched or exceeded the predictive accuracy of state-of-the-art statistical methods (circa 2015) but also came with strong theoretical guarantees; see our JMLR 2015 and NeuroImage 2014 papers for details. During my Ph.D. I also worked on establishing connections between PCA and ridge regression (cf. JMLR 2013) and on provably faster row- and column-subsampling algorithms for least squares regression (cf. NeurIPS 2013a,b).
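For readers curious about the PCA-ridge connection mentioned above, the small numerical sketch below illustrates the standard identity involved (it is an illustration of the textbook relationship, not a reproduction of the papers' specific results): in the SVD basis X = U D V^T, ridge regression shrinks the j-th principal direction by the factor d_j^2 / (d_j^2 + lambda), whereas PCA-based regression keeps or drops directions outright.

```python
# Numerical check (illustrative only): ridge regression computed directly
# agrees with its spectral form in the PCA/SVD basis.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = rng.standard_normal(50)
lam = 3.0

# Direct ridge solution: (X^T X + lam * I)^{-1} X^T y
beta_direct = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

# Equivalent spectral form: sum_j d_j / (d_j^2 + lam) * (u_j^T y) * v_j,
# i.e., each principal direction is shrunk by d_j^2 / (d_j^2 + lam).
U, d, Vt = np.linalg.svd(X, full_matrices=False)
beta_spectral = Vt.T @ ((d / (d**2 + lam)) * (U.T @ y))

print(np.allclose(beta_direct, beta_spectral))  # True
```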

Toward the end of my Ph.D., I became interested in computational social science and causal inference. After finishing my Ph.D., I completed a postdoc with Prof. Sinan Aral at MIT, where I worked on several social science problems, e.g., finding influential individuals in a social network under realistic real-world assumptions (cf. Nature Human Behaviour 2018), devising revenue-maximizing price discrimination strategies for newspapers (cf. Management Science 2020), and designing sequential interventions to help news websites maintain sustained user engagement. At MIT, I was also involved with the Initiative on the Digital Economy (IDE), studying the economic and societal impacts of AI, and I remain affiliated with the IDE as a Digital Fellow.

Long before all this, I was a carefree undergrad studying Electronics & Electrical Communication Engineering at PEC in my hometown of Chandigarh, India. I developed my interest in AI/ML, and the desire to pursue a Ph.D., through three memorable pre-Ph.D. summer internships: at the Computer Vision Center @ Barcelona [summer 2006], the Max Planck Institute for Intelligent Systems @ Tuebingen [summer 2008], and the Information Sciences Institute/USC @ Los Angeles [summer 2009].


Teaching

  1. SI 671/721 Data Mining: Methods and Applications [significantly redesigned] @ Fall 2019 (75 students) and Fall 2020 (100 students).
  2. SIADS 642 [online] Introduction to Deep Learning [developed from scratch] @ Fall 2020 (70 students) and Winter 2021 (150 students).
  3. SIADS 532 [online] Data Mining I @ Winter 2021 (158 students).


Research Group Members

Ph.D. Students

  1. Yulin Yu [F20-] Last Stop: MS @ University of Michigan.
  2. Yachuan Liu [F20-] Last Stop: BS @ UC Berkeley.
  3. Ji Eun Kim [F21-] Last Stop: MSc @ London School of Economics.
  4. Sanzeed Anwar [F21-] Last Stop: BS+MEng @ MIT.

Undergrad/Masters Students

  1. Evan Weissburg [Undergrad]
  2. Arya Kumar [Undergrad]
  3. Xingjian Zhang [Undergrad]
  4. Jupiter Zhu [Undergrad]
  5. Houming Chen [Undergrad]
  6. Tianyi Li [Undergrad]
  7. Xianglong Li [Undergrad]
  8. Yingzhuo Yu [Undergrad]
  9. Sida Zhong [Undergrad]
  10. Florence Wu [Undergrad]
  11. Zhengyang Shan [Masters]
  12. Bohan Zhang [Masters]

Alumni

  1. Jiapeng Guo [Undergrad, next Masters in CS at Columbia University]
  2. Zilu Wang [Undergrad, next Masters in MS&E at Stanford University]

I always have openings in my group for strong students at all levels (Postdoctoral, Ph.D., Masters, or Undergrad). I am broadly looking to supervise students interested in working on ML, Computational Social Science, NLP, or Information Systems. Prior research experience in these areas is highly valued, as are strong programming skills and a solid applied math/statistics background.

Process: Masters and Undergrad students (already at the University of Michigan) interested in working with me can email me their CV and transcript. Prospective Postdocs can email me their latest CV and Research Statement directly. Prospective Ph.D. students need not email me directly; instead, they are encouraged to apply to our Ph.D. program here and mention my name as a potential advisor. The deadline is December 1 each year.


Service to the Profession

  1. Editorial Board Reviewer, JMLR [2020-].
  2. Ad-hoc Reviewer: Nature, Nature Human Behaviour, PNAS, JAIR, Machine Learning Journal, Management Science, Marketing Science, IEEE TKDE, IEEE TPAMI.
  3. Reviewer/PC/SPC Member @ Core AI/ML Conferences: [every year since 2013] NeurIPS, ICML, AISTATS, ICLR, AAAI, IJCAI.
  4. Reviewer/PC/SPC Member @ Core NLP/Computational Social Science Conferences: [every year since 2019] EMNLP, NAACL, ICWSM, IC2S2.
  5. Reviewer/PC/SPC Member @ Core Information Systems Conferences: [every year since 2017] ICIS, CIST.


Publications

Acronyms for conferences and journals wherever applicable:

[General Science venues] PNAS: Proceedings of the National Academy of Sciences.

[Statistical Machine Learning/AI/Data Mining venues] JMLR: Journal of Machine Learning Research; NeurIPS: Advances in Neural Information Processing Systems Conference; ICML: International Conference on Machine Learning; AISTATS: International Conference on Artificial Intelligence and Statistics; ECML: European Conference on Machine Learning; ICDM: International Conference on Data Mining.

[NLP/CL venues] EMNLP: Conference on Empirical Methods in Natural Language Processing; ACL: Annual Meeting of the Association for Computational Linguistics; COLING: International Conference on Computational Linguistics.

[Social Media/Web/Computational Social Science/Information Management venues] ICWSM: International Conference on Web and Social Media; CIKM: International Conference on Information and Knowledge Management.

[(Medical, Neuro) Imaging venues] ISBI: IEEE International Symposium on Biomedical Imaging; MICCAI: International Conference on Medical Image Computing and Computer Assisted Intervention.

*indicates alphabetical author listing.

  1. Social Status and Novelty Drove the Spread of Online Information During the Early Stages of COVID-19. new
    Antonis Photiou, Christos Nicolaides, and Paramveer Dhillon.
    Nature Scientific Reports (Forthcoming), 2021.
    [PDF (Coming soon)]
  2. Judging a book by its cover: Predicting the marginal impact of Title on relative post popularity in Social Media. new
    Evan Weissburg, Arya Kumar, and Paramveer Dhillon.
    ICWSM, 2022.
    [PDF (Coming soon)]
  3. What is Novelty? Unpacking Brokers' "Vision Advantage." new
    Sinan Aral, Paramveer Dhillon.
    Management Science (Forthcoming), 2021.
    [PDF (Coming soon)]
  4. Modeling Dynamic User Interests: A Neural Matrix Factorization Approach. new
    Paramveer Dhillon, Sinan Aral.
    Marketing Science, 2021.
    [PDF]
  5. Targeting for long-term outcomes. new
    Jeremy Yang, Dean Eckles, Paramveer Dhillon, and Sinan Aral.
    Management Science (Minor Revision), 2021.
    [ArXiv:PDF]
  6. Interdependence and the Cost of Uncoordinated Responses to COVID-19.
    David Holtz, Michael Zhao, Seth Benzell, Cathy Cao, Amin Rahimian, Jeremy Yang, Jennifer Allen, Avinash Collis, Alex Moehring, Tara Sowrirajan, Dipayan Ghosh, Yunhao Zhang, Paramveer Dhillon, Christos Nicolaides, Dean Eckles, and Sinan Aral.
    PNAS, 2020.
    [PDF] [Supplementary Information]
    Press Coverage: [Michigan News] [OneDetroit PBS Interview (Starts at 14:40)] [Los Angeles Times] [The Washington Post] [MSNBC] [The Boston Globe] [Yahoo Finance] [The Hill] [TechRepublic] [WGBH]
  7. Digital Paywall Design: Implications for Content Demand & Subscriptions.*
    Sinan Aral, Paramveer Dhillon.
    Management Science, 2020.
    [PDF]
    Press Coverage: [Michigan News]
  8. Social Influence Maximization under Empirical Influence Models.*
    Sinan Aral, Paramveer Dhillon.
    Nature Human Behaviour, May 2018.
    [PDF] [Supplementary Information]
  9. Eigenwords: Spectral Word Embeddings.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    JMLR, December 2015.
    [PDF] [Code + Pre-trained Embeddings]
  10. Subject-Specific Functional Parcellation via Prior Based Eigenanatomy.
    Paramveer Dhillon, David Wolk, Sandhitsu Das, Lyle Ungar, James Gee, and Brian Avants.
    NeuroImage, October 2014.
    [PDF] [Code]
  11. New Subsampling Algorithms for Fast Least Squares Regression.
    Paramveer Dhillon, Yichao Lu, Dean Foster, and Lyle Ungar.
    NeurIPS 2013.
    [PDF] [Supplementary Information]
  12. Faster Ridge Regression via the Subsampled Randomized Hadamard Transform.
    Yichao Lu, Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    NeurIPS 2013.
    [PDF] [Supplementary Information]
  13. A Risk Comparison of Ordinary Least Squares vs Ridge Regression.
    Paramveer Dhillon, Dean Foster, Sham Kakade, and Lyle Ungar.
    JMLR, May 2013.
    [PDF]
  14. Two Step CCA: A new spectral method for estimating vector models of words.
    Paramveer Dhillon, Jordan Rodu, Dean Foster, and Lyle Ungar.
    ICML 2012.
    [PDF] [Supplementary Information] [Code + Pre-trained Embeddings] [Note: This paper was superseded by our JMLR 2015 paper.]
  15. Spectral Dependency Parsing with Latent Variables.
    Paramveer Dhillon, Jordan Rodu, Michael Collins, Dean Foster, and Lyle Ungar.
    EMNLP 2012.
    [PDF]
  16. Partial Sparse Canonical Correlation Analysis (PSCCA) for population studies in Medical Imaging.
    Paramveer Dhillon, Brian Avants, Lyle Ungar, and James Gee.
    ISBI 2012.
    [PDF]
  17. Eigenanatomy improves detection power for longitudinal cortical change.
    Brian Avants, Paramveer Dhillon, Benjamin Kandel, Philip Cook, Corey McMillan, Murray Grossman, and James Gee.
    MICCAI 2012.
    [PDF]
  18. Deterministic Annealing for Semi-Supervised Structured Output Learning.
    Paramveer Dhillon, Sathiya Keerthi, Olivier Chapelle, Kedar Bellare, and S. Sundararajan.
    AISTATS 2012.
    [PDF]
  19. Metric Learning for Graph-based Domain Adaptation.
    Paramveer Dhillon, Partha Talukdar, and Koby Crammer.
    COLING 2012.
    [PDF]
  20. Multi-View Learning of Word Embeddings via CCA.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    NeurIPS 2011.
    [PDF] [Supplementary Information] [Code + Pre-trained Embeddings] [Note: This paper was superseded by our JMLR 2015 paper.]
  21. Minimum Description Length Penalization for Group and Multi-Task Sparse Learning.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    JMLR, February 2011.
    [PDF]
  22. Semi-supervised Multi-task Learning of Structured Prediction Models for Web Information Extraction.
    Paramveer Dhillon, S. Sundararajan, and S. Sathiya Keerthi.
    CIKM 2011.
    [PDF]
  23. A New Approach to Lexical Disambiguation of Arabic Text.
    Rushin Shah, Paramveer Dhillon, Mark Liberman, Dean Foster, Mohamed Maamouri, and Lyle Ungar.
    EMNLP 2010.
    [PDF]
  24. Learning Better Data Representation using Inference-Driven Metric Learning (IDML).
    Paramveer Dhillon, Partha Pratim Talukdar, and Koby Crammer.
    ACL 2010.
    [PDF]
  25. Feature Selection using Multiple Streams.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    AISTATS 2010.
    [PDF]
  26. Transfer Learning, Feature Selection and Word Sense Disambiguation.
    Paramveer Dhillon, Lyle Ungar.
    ACL 2009.
    [PDF]
  27. Multi-Task Feature Selection using the Multiple Inclusion Criterion (MIC).
    Paramveer Dhillon, Brian Tomasik, Dean Foster, and Lyle Ungar.
    ECML 2009.
    [PDF]
  28. Efficient Feature Selection in the Presence of Multiple Feature Classes.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    ICDM 2008.
    [PDF]

Other Publications

  29. Is Deep Learning a Game Changer for Marketing Analytics? [Survey Paper for Practitioners]
    Glen Urban, Artem Timoshenko, Paramveer Dhillon, and John Hauser.
    MIT Sloan Management Review, 2020.
    [PDF]
  30. Mapping of pain circuitry in early post-natal development using manganese-enhanced MRI in rats.
    Megan Sperry, Ben Kandel, Suzanne Wehrli, KN Bass, Sandhitsu Das, Paramveer Dhillon, James Gee, and Gordon Barr.
    Neuroscience, 2017.
    [PDF]

Workshop Papers (venues for getting early feedback on research; these often do not have archival proceedings)

Acronyms for workshops:

NBER-SI: National Bureau of Economic Research - Summer Institute; CODE: MIT Conference on Digital Experimentation; WISE: Workshop on Information Systems and Economics; WIN: Workshop on Information in Networks; NSF-ITN: NSF Conference on Information Transmission in Networks at Harvard University; WCBA: Utah Winter Conference on Business Analytics; QME: Quantitative Marketing and Economics; PRNI: International Workshop on Pattern Recognition in Neuroimaging; SSDBM: Scientific and Statistical Database Management Conference; NESCAI: North East Student Colloquium on Artificial Intelligence; ViSU/CVPR: Visual Scene Understanding Workshop at CVPR; CISIS/LNCS: Computational Intelligence in Security for Information Systems Conference/Lecture Notes in Computer Science; GLB: Workshop on Graph Learning Benchmarks; CSEDM: Educational Data Mining in CS; IC2S2: International Conference on Computational Social Science.

  31. Unique in what sense? Heterogeneous relationships between multiple types of uniqueness and popularity in music.
    Yulin Yu, Pui Yin Cheung, Yong-Yeol Ahn, and Paramveer Dhillon.
    IC2S2 2021. [Best poster award]
    [No Archived Proceedings]
  32. Comparing Ebook Student Interactions With Test Scores: A Case Study Using CSAwesome.
    Hisamitsu Maeda, Barbara Ericson, and Paramveer Dhillon.
    CSEDM 2021.
    [No Archived Proceedings]
  33. A New Benchmark of Graph Learning for PM2.5 Forecasting under Distribution Shift.
    Yachuan Liu, Jiaqi Ma, Paramveer Dhillon, and Qiaozhu Mei.
    GLB Workshop @ The Web Conference 2021.
    [No Archived Proceedings]
  34. Targeting for long-term outcomes.
    Jeremy Yang, Dean Eckles, Paramveer Dhillon, and Sinan Aral.
    WISE 2020. [Nominated for Best student paper award]
    INFORMS Annual Conference 2020. [Best paper award]
    QME Conference 2020.
    [Abstract + Talk Only]
  35. Optimizing Targeting Policies via Sequential Experimentation for User Retention.
    Jeremy Yang, Dean Eckles, Paramveer Dhillon, and Sinan Aral.
    NeurIPS Workshop on "Do the right thing": Machine learning and causal inference for improved decision making 2019.
    CODE 2019.
    [Abstract + Talk Only]
  36. Digital Paywall Design: Implications for Content Demand and Subscriptions.
    Sinan Aral, Paramveer Dhillon.
    NBER-SI (Economics of Digitization) 2017.
    CODE 2016.
    WISE 2016. [Runner-up best paper award]
    WCBA 2016.
    [Abstract + Talk Only]
  37. Influence Maximization Revisited.
    Sinan Aral, Paramveer Dhillon.
    WIN 2015.
    NSF-ITN 2015.
    [Abstract + Talk Only]
  38. Anatomically-Constrained PCA for Image Parcellation.
    Paramveer Dhillon, James Gee, Lyle Ungar, and Brian Avants.
    PRNI 2013.
    [PDF]
  39. Learning to Explore Scientific Workflow Repositories.
    Julia Stoyanovich, Paramveer Dhillon, Brian Lyons, and Susan Davidson.
    SSDBM 2013.
    [PDF]
  40. Inference Driven Metric Learning for Graph Construction.
    Paramveer Dhillon, Partha Pratim Talukdar, and Koby Crammer.
    NESCAI 2010.
    [PDF]
  41. Combining Appearance and Motion for Human Action Classification in Videos.
    Paramveer Dhillon, Sebastian Nowozin, and Christoph Lampert.
    ViSU/CVPR 2009.
    [PDF]
  42. Robust Real-Time Face Tracking Using an Active Camera.
    Paramveer Dhillon.
    CISIS/LNCS 2009.
    [PDF]

Software

  1. Code and data for our Nature Human Behaviour 2018 paper is available here.
  2. The ANTsR toolkit for medical image analysis (including the implementation of our NeuroImage 2014 paper) is available here.
  3. The SWELL (Spectral Word Embedding Learning for Language) JAVA toolkit for inducing word embeddings (cf. JMLR 2015, ICML 2012, NeurIPS 2011) is available here.
  4. Various Eigenword (SWELL) embeddings for reproducing the results in our JMLR 2015 paper can be found below. [No additional scaling is required for the embeddings; use them as is.] [Based on our results, the OSCCA and TSCCA embeddings are the most robust and work best on a variety of tasks.]
  5. Generic eigenwords embeddings for various languages [trained on much larger corpora].
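For intuition about what spectral (eigenword-style) embeddings compute, here is a toy sketch in the same spirit: take an SVD of a whitened word-context co-occurrence matrix and use the leading singular directions as embeddings. This is an illustrative simplification, not the SWELL implementation; the corpus, function name, and whitening choice are all made up for the example.

```python
# Toy spectral word embeddings: SVD of a (CCA-style whitened) bigram
# co-occurrence matrix. Illustrative only -- not the SWELL toolkit.
import numpy as np

def spectral_embeddings(corpus, k=2):
    """Return a dict mapping each word to a k-dimensional embedding."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    C = np.zeros((V, V))  # C[i, j]: count of word j immediately following word i
    for sent in corpus:
        for a, b in zip(sent, sent[1:]):
            C[idx[a], idx[b]] += 1.0
    # Scale by inverse sqrt of marginal counts (a crude CCA-style whitening);
    # the small constant guards against division by zero for unseen contexts.
    row = C.sum(axis=1) + 1e-9
    col = C.sum(axis=0) + 1e-9
    M = C / np.sqrt(np.outer(row, col))
    # Left singular vectors, scaled by singular values, give the embeddings.
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return {w: U[idx[w], :k] * S[:k] for w in vocab}

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
emb = spectral_embeddings(corpus, k=2)
print(emb["cat"].shape)  # (2,)
```

The real methods (cf. JMLR 2015) use much larger corpora, richer context windows, and come with theoretical guarantees that this toy version does not.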


Last Modified: 8.6.21