Assistant Professor
School of Information
University of Michigan

Faculty Affiliate
CSE (Computer Science & Engineering)
MIDAS (Michigan Institute for Data Science)

Digital Fellow
MIT Initiative on the Digital Economy

Office: 3389 North Quad, 105 S. State Street, Ann Arbor, MI 48109
Phone: 734-764-5876
Email: lastname followed by the letter 'p' at umich dot edu
Twitter: @dhillon_p
Google Scholar: https://goo.gl/FEsnE8

Quick Navigation Links

Use the links below to jump to a specific section of the site, or simply scroll down to view all the content.

Research Interests   Publications   Professional Background   Teaching   CV   Students   Software  

Research Interests

My research centers on:

  1. Understanding the impact of internet technologies on users by empirically studying their interactions with such systems.
  2. Developing methods in Machine Learning, Natural Language Processing, Network Science, and Causal Inference in service of (1).

Professional Background

I am an Assistant Professor in the School of Information (SI) at the University of Michigan, where I research and teach various topics in Artificial Intelligence (AI), broadly defined.

I received my A.M. in Statistics and Ph.D. in Computer Science from the University of Pennsylvania, where I was advised by Profs. Lyle Ungar, Dean Foster (now at Amazon), and James Gee. During my time at Penn, I also worked closely with Dr. Brian Avants on topics related to Machine Learning in Brain Imaging. My Ph.D. thesis, "Advances in Spectral Learning with Applications to Text Analysis and Brain Imaging," won the Morris and Dorothy Rubinoff Best Dissertation Award. It proposed statistically and computationally efficient methods for learning word embeddings in NLP and for data-driven parcellation/segmentation of human brain images. These methods not only achieved predictive accuracies better than or comparable to the state-of-the-art statistical methods (circa 2015) but also came with strong theoretical guarantees. Please see our JMLR 2015 and NeuroImage 2014 papers for more details. During my Ph.D., I also did other research on establishing connections between PCA and ridge regression (cf. JMLR 2013) and on provably faster row and column subsampling algorithms for least squares regression (cf. NeurIPS 2013a,b).
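For readers unfamiliar with spectral approaches to word embeddings, here is a minimal toy sketch of the general idea; this is a simplified illustration (plain truncated SVD of a co-occurrence matrix), not the actual eigenwords algorithm from the papers above, which is based on CCA between words and their contexts. The corpus, window size, and embedding dimension below are all hypothetical choices for illustration.

```python
import numpy as np

# Toy corpus; real spectral embedding methods use vastly larger corpora.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}

# Word-by-context co-occurrence counts (symmetric window of size 1).
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            C[index[w], index[corpus[j]]] += 1

# A truncated SVD of the co-occurrence matrix yields a low-dimensional
# embedding: one k-dimensional vector per vocabulary word.
U, s, Vt = np.linalg.svd(C, full_matrices=False)
k = 3  # embedding dimension (hypothetical choice for this toy example)
embeddings = U[:, :k] * s[:k]
print(embeddings.shape)  # (7, 3): one 3-dimensional vector per word
```

Words with similar co-occurrence patterns end up with nearby vectors; the spectral methods above refine this basic idea with careful scaling and whitening, and add statistical guarantees.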

Towards the end of my Ph.D., I became interested in computational social science and causal inference. After finishing my Ph.D., I completed a postdoc with Prof. Sinan Aral at MIT. There, I worked on several social science problems, e.g., finding influential individuals in a social network under realistic real-world assumptions (cf. Nature Human Behaviour 2018), devising revenue-maximizing price discrimination strategies for newspapers (cf. Management Science 2020), and designing sequential interventions for news websites to help them maintain sustained user engagement. At MIT, I was also involved with the Initiative on the Digital Economy, studying the economic and societal impacts of AI.

Long before all this, I was a carefree undergrad studying Electronics & Electrical Communication Engineering at PEC in my hometown of Chandigarh, India. I developed my interest in AI/ML, and the desire to pursue a Ph.D., through three memorable pre-Ph.D. summer internships: at the Computer Vision Center @ Barcelona [summer 2006], the Max Planck Institute for Intelligent Systems @ Tuebingen [summer 2008], and the Information Sciences Institute/USC @ Los Angeles [summer 2009].


Teaching

  1. SI 671/721 Data Mining: Methods and Applications @ Fall 2019 [75 students], Fall 2020.
  2. SIADS 642 [online] Introduction to Deep Learning @ Fall 2020, Winter 2021.


Students

  1. Yulin Yu [Ph.D. student, 1st year] Previously MS @ U. Michigan, BS @ Indiana University.
  2. Yachuan Liu [Ph.D. student, 1st year] Previously BS @ UC Berkeley.
  3. Evan Weissburg [Undergrad]
  4. Arya Kumar [Undergrad]
  5. Jiapeng Guo [Undergrad]
  6. Vishal Nayak [Undergrad]

I always have openings for strong students in my group at all levels (postdoctoral, Ph.D., Master's, or undergrad). I am broadly looking to supervise students interested in working on ML, DL, NLP, Social Computing, Information Systems, or Information Economics. Prior research experience in these areas is highly valued, as are strong programming skills and a solid applied math/statistics background.

Process: If you're interested in working with me, please send me an email with your CV and a one-page (11pt font) document summarizing your favorite research paper from the last year, along with two *concrete* ways in which you would extend it. [Prospective Ph.D. students need not email me but are encouraged to apply to our Ph.D. program here. The deadline is typically in early December.]


Publications

Acronyms for conferences and journals, where applicable:

[General Science venues] PNAS: Proceedings of the National Academy of Sciences.

[Statistical Machine Learning/AI venues] JMLR: Journal of Machine Learning Research; NeurIPS: Conference on Neural Information Processing Systems; ICML: International Conference on Machine Learning; AISTATS: International Conference on Artificial Intelligence and Statistics; ECML: European Conference on Machine Learning.

[NLP/CL venues] EMNLP: Conference on Empirical Methods in Natural Language Processing; ACL: Annual Meeting of the Association for Computational Linguistics; COLING: International Conference on Computational Linguistics.

[Data Mining/Information Management venues] ICDM: International Conference on Data Mining; CIKM: International Conference on Information and Knowledge Management.

[(Medical, Neuro) Imaging venues] ISBI: IEEE International Symposium on Biomedical Imaging; MICCAI: International Conference on Medical Image Computing and Computer Assisted Intervention.

Note: The list below contains only published papers. I do not list preprints, working papers, or papers under review, for a variety of reasons. Please get in touch with me if you're interested in seeing them.

*indicates alphabetical author listing.

  1. Interdependence and the Cost of Uncoordinated Responses to COVID-19.
    David Holtz, Michael Zhao, Seth Benzell, Cathy Cao, Amin Rahimian, Jeremy Yang, Jennifer Allen, Avinash Collis, Alex Moehring, Tara Sowrirajan, Dipayan Ghosh, Yunhao Zhang, Paramveer Dhillon, Christos Nicolaides, Dean Eckles, and Sinan Aral.
    PNAS, 2020.
    [PDF] [Supplementary Information]
    Press Coverage: [Michigan News] [OneDetroit PBS Interview (Starts at 14:40)] [Los Angeles Times] [The Washington Post] [MSNBC] [The Boston Globe] [Yahoo Finance] [The Hill] [TechRepublic] [WGBH]
  2. Digital Paywall Design: Implications for Content Demand & Subscriptions.*
    Sinan Aral, Paramveer Dhillon.
    Management Science, 2020.
  3. Social Influence Maximization under Empirical Influence Models.*
    Sinan Aral, Paramveer Dhillon.
    Nature Human Behaviour, May 2018.
    [PDF] [Supplementary Information]
  4. Eigenwords: Spectral Word Embeddings.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    JMLR, December 2015.
    [PDF] [Code + Pre-trained Embeddings]
  5. Subject-Specific Functional Parcellation via Prior Based Eigenanatomy.
    Paramveer Dhillon, David Wolk, Sandhitsu Das, Lyle Ungar, James Gee, and Brian Avants.
    NeuroImage, October 2014.
    [PDF] [Code]
  6. New Subsampling Algorithms for Fast Least Squares Regression.
    Paramveer Dhillon, Yichao Lu, Dean Foster, and Lyle Ungar.
    NeurIPS 2013.
    [PDF] [Supplementary Information]
  7. Faster Ridge Regression via the Subsampled Randomized Hadamard Transform.
    Yichao Lu, Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    NeurIPS 2013.
    [PDF] [Supplementary Information]
  8. A Risk Comparison of Ordinary Least Squares vs Ridge Regression.
    Paramveer Dhillon, Dean Foster, Sham Kakade, and Lyle Ungar.
    JMLR, May 2013.
  9. Two Step CCA: A new spectral method for estimating vector models of words.
    Paramveer Dhillon, Jordan Rodu, Dean Foster, and Lyle Ungar.
    ICML 2012.
    [PDF] [Supplementary Information] [Code + Pre-trained Embeddings] [Note: This paper was superseded by our JMLR 2015 paper.]
  10. Spectral Dependency Parsing with Latent Variables.
    Paramveer Dhillon, Jordan Rodu, Michael Collins, Dean Foster, and Lyle Ungar.
    EMNLP 2012.
  11. Partial Sparse Canonical Correlation Analysis (PSCCA) for population studies in Medical Imaging.
    Paramveer Dhillon, Brian Avants, Lyle Ungar, and James Gee.
    ISBI 2012.
  12. Eigenanatomy improves detection power for longitudinal cortical change.
    Brian Avants, Paramveer Dhillon, Benjamin Kandel, Philip Cook, Corey McMillan, Murray Grossman, and James Gee.
    MICCAI 2012.
  13. Deterministic Annealing for Semi-Supervised Structured Output Learning.
    Paramveer Dhillon, Sathiya Keerthi, Olivier Chapelle, Kedar Bellare, and S. Sundararajan.
    AISTATS 2012.
  14. Metric Learning for Graph-based Domain Adaptation.
    Paramveer Dhillon, Partha Talukdar, and Koby Crammer.
    COLING 2012.
  15. Multi-View Learning of Word Embeddings via CCA.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    NeurIPS 2011.
    [PDF] [Supplementary Information] [Code + Pre-trained Embeddings] [Note: This paper was superseded by our JMLR 2015 paper.]
  16. Minimum Description Length Penalization for Group and Multi-Task Sparse Learning.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    JMLR, February 2011.
  17. Semi-supervised Multi-task Learning of Structured Prediction Models for Web Information Extraction.
    Paramveer Dhillon, S. Sundararajan, and S. Sathiya Keerthi.
    CIKM 2011.
  18. A New Approach to Lexical Disambiguation of Arabic Text.
    Rushin Shah, Paramveer Dhillon, Mark Liberman, Dean Foster, Mohamed Maamouri, and Lyle Ungar.
    EMNLP 2010.
  19. Learning Better Data Representation using Inference-Driven Metric Learning (IDML).
    Paramveer Dhillon, Partha Pratim Talukdar, and Koby Crammer.
    ACL 2010.
  20. Feature Selection using Multiple Streams.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    AISTATS 2010.
  21. Transfer Learning, Feature Selection and Word Sense Disambiguation.
    Paramveer Dhillon, Lyle Ungar.
    ACL 2009.
  22. Multi-Task Feature Selection using the Multiple Inclusion Criterion (MIC).
    Paramveer Dhillon, Brian Tomasik, Dean Foster, and Lyle Ungar.
    ECML 2009.
  23. Efficient Feature Selection in the Presence of Multiple Feature Classes.
    Paramveer Dhillon, Dean Foster, and Lyle Ungar.
    ICDM 2008.
Workshop Papers (Venues for getting initial feedback on research; these often do not have proceedings.)

Acronyms for workshops, where applicable:

NBER-SI: National Bureau of Economic Research - Summer Institute; CODE: MIT Conference on Digital Experimentation; WISE: Workshop on Information Systems and Economics; WIN: Workshop on Information in Networks; NSF-ITN: NSF Conference on Information Transmission in Networks at Harvard University; WCBA: Utah Winter Conference on Business Analytics; PRNI: International Workshop on Pattern Recognition in Neuroimaging; SSDBM: Scientific and Statistical Database Management Conference; NESCAI: North East Student Colloquium on Artificial Intelligence; ViSU/CVPR: Visual Scene Understanding Workshop at CVPR; CISIS/LNCS: Computational Intelligence in Security for Information Systems Conference/Lecture Notes in Computer Science.

  1. Optimizing Targeting Policies via Sequential Experimentation for User Retention.
    Jeremy Yang, Dean Eckles, Paramveer Dhillon, and Sinan Aral.
    NeurIPS 2019 Workshop on "Do the right thing": Machine learning and causal inference for improved decision making.
    [Talk Only]
  2. Optimizing Targeting Policies via Sequential Experimentation for User Retention.
    Jeremy Yang, Dean Eckles, Paramveer Dhillon, and Sinan Aral.
    CODE 2019.
    [Talk Only]
  3. Digital Paywall Design: Implications for Content Demand and Subscriptions.
    Sinan Aral, Paramveer Dhillon.
    NBER-SI (Economics of Digitization) 2017.
    [Abstract + Talk Only]
  4. Digital Paywall Design: Implications for Content Demand and Subscriptions.
    Sinan Aral, Paramveer Dhillon.
    CODE 2016.
    [Abstract + Talk Only]
  5. Digital Paywall Design: Implications for Content Demand and Subscriptions.
    Sinan Aral, Paramveer Dhillon.
    WISE 2016. [Runner-up best paper award]
    [Abstract + Talk Only]
  6. Digital Paywall Design: Implications for Content Demand and Subscriptions.
    Sinan Aral, Paramveer Dhillon.
    WCBA 2016.
    [Abstract + Talk Only]
  7. Influence Maximization Revisited.
    Sinan Aral, Paramveer Dhillon.
    WIN 2015.
    [Abstract + Talk Only]
  8. Influence Maximization Revisited.
    Sinan Aral, Paramveer Dhillon.
    NSF-ITN 2015.
    [Abstract + Talk Only]
  9. Anatomically-Constrained PCA for Image Parcellation.
    Paramveer Dhillon, James Gee, Lyle Ungar, and Brian Avants.
    PRNI 2013.
  10. Learning to Explore Scientific Workflow Repositories.
    Julia Stoyanovich, Paramveer Dhillon, Brian Lyons, and Susan Davidson.
    SSDBM 2013.
  11. Inference Driven Metric Learning for Graph Construction.
    Paramveer Dhillon, Partha Pratim Talukdar, and Koby Crammer.
    NESCAI 2010.
  12. Combining Appearance and Motion for Human Action Classification in Videos.
    Paramveer Dhillon, Sebastian Nowozin, and Christoph Lampert.
    ViSU/CVPR 2009.
  13. Robust Real-Time Face Tracking Using an Active Camera.
    Paramveer Dhillon.
    CISIS/LNCS 2009.


Software

  1. Code and data for our Nature Human Behaviour 2018 paper are available here.
  2. The ANTsR toolkit for medical image analysis (including the implementation of our NeuroImage 2014 paper) is available here.
  3. The SWELL (Spectral Word Embedding Learning for Language) Java toolkit for inducing word embeddings (cf. JMLR 2015, ICML 2012, NeurIPS 2011) is available here.
  4. Various Eigenword (SWELL) embeddings for reproducing the results in our JMLR 2015 paper can be found below. [No additional scaling required; use the embeddings as is.] [Based on our results, OSCCA and TSCCA embeddings are the most robust and work best on a variety of tasks.]
  5. Generic Eigenword embeddings for various languages. [Trained on much larger corpora.]

Last Modified: June 12, 2020