Hedegaard, H., Miniño, A. M., Spencer, M. R. & Warner, M. Drug Overdose Deaths in the United States, 1999–2020. NCHS Data Brief, no. 394 (National Center for Health Statistics, Hyattsville, MD, 2020).
Ciccarone, D. The rise of illicit fentanyls, stimulants and the fourth wave of the opioid overdose crisis. Curr. Opin. Psychiatry 34, 344–350 (2021).
Rigg, K. K., Monnat, S. M. & Chavez, M. N. Opioid-related mortality in rural America: geographic heterogeneity and intervention strategies. Int. J. Drug Policy 57, 119–129 (2018).
Castillo-Carniglia, A. et al. Prescription drug monitoring programs and opioid overdoses: exploring sources of heterogeneity. Epidemiology 30, 212 (2019).
Morrow, J. B. et al. The opioid epidemic: moving toward an integrated, holistic analytical response. J. Anal. Toxicol. 43, 1–9 (2019).
Jones, M. R. et al. Government legislation in response to the opioid epidemic. Curr. Pain Headache Rep. 23, 1–7 (2019).
Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems (eds Guyon, I., von Luxburg, U., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S. & Garnett, R.) 5998–6008 (Curran Associates, Inc., 2017).
Song, H., Rajan, D., Thiagarajan, J. J. & Spanias, A. Attend and diagnose: Clinical time series analysis using attention models. In Proc. 32nd AAAI Conference on Artificial Intelligence (AAAI Press, Palo Alto, California USA, 2018).
Jaidka, K. et al. Estimating geographic subjective well-being from Twitter: a comparison of dictionary and data-driven language methods. Proc. Natl Acad. Sci. USA 117, 10165–10171 (2020).
Curtis, B. et al. Can Twitter be used to predict county excessive alcohol consumption rates? PLoS ONE 13, e0194290 (2018).
Blanco, C., Wall, M. M. & Olfson, M. Data needs and models for the opioid epidemic. Mol. Psychiatry 27, 787–792 (2022).
Centers for Disease Control and Prevention (CDC). Overdose Data to Action. https://www.cdc.gov/drugoverdose/od2a/about.html (2022).
Centers for Disease Control and Prevention (CDC). CDC Launches New Center for Forecasting and Outbreak Analytics. https://www.cdc.gov/media/releases/2022/p0419-forecasting-center.html (2022).
Friedman, J. R. & Hansen, H. Evaluation of increases in drug overdose mortality rates in the US by race and ethnicity before and during the COVID-19 pandemic. JAMA Psychiatry 79, 379–381 (2022).
Flores, L. & Young, S. D. Regional variation in discussion of opioids on social media. J. Addict. Dis. 39, 316–321 (2021).
Barenholtz, E., Fitzgerald, N. D. & Hahn, W. E. Machine-learning approaches to substance-abuse research: emerging trends and their implications. Curr. Opin. Psychiatry 33, 334–342 (2020).
Dong, X. et al. An integrated LSTM-heterodyne model for interpretable opioid overdose risk prediction. Artif. Intell. Med. 135, 102439 (2022).
Sarker, A., Gonzalez-Hernandez, G., Ruan, Y. & Perrone, J. Machine learning and natural language processing for geolocation-centric monitoring and characterization of opioid-related social media chatter. JAMA Netw. Open 2, e1914672 (2019).
Lo-Ciganic, W.-H. et al. Developing and validating a machine-learning algorithm to predict opioid overdose in Medicaid beneficiaries in two US states: a prognostic modelling study. Lancet Digit. Health 4, e455–e465 (2022).
Han, D.-H., Lee, S. & Seo, D.-C. Using machine learning to predict opioid misuse among US adolescents. Prev. Med. 130, 105886 (2020).
Madden, M. et al. Teens, Social Media, and Privacy (Pew Research Center, 2013).
Zamani, M. & Schwartz, H. A. Using Twitter language to predict the real estate market. In Proc. 15th Conference of the European Chapter of the Association for Computational Linguistics Vol. 2, Short Papers 28–33 (Association for Computational Linguistics, 2017).
Giorgi, S. et al. The remarkable benefit of user-level aggregation for lexical-based population-level predictions. In Proc. 2018 Conference on Empirical Methods in Natural Language Processing 1167–1172 (Association for Computational Linguistics, Brussels, Belgium, 2018).
Quercia, D., Ellis, J., Capra, L. & Crowcroft, J. Tracking “gross community happiness” from Tweets. In Proc. ACM 2012 Conference on Computer Supported Cooperative Work 965–968 (Association for Computing Machinery, 2012).
Hassanpour, S., Tomita, N., DeLise, T., Crosier, B. & Marsch, L. A. Identifying substance use risk based on deep neural networks and Instagram social media data. Neuropsychopharmacology 44, 487–494 (2019).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In NIPS 2014 Workshop on Deep Learning (2014).
Lin, T., Wang, Y., Liu, X. & Qiu, X. A survey of transformers. AI Open 3, 111–132 (2022).
Zhao, S., Browning, J., Cui, Y. & Wang, J. Using machine learning to classify patients on opioid use. J. Pharm. Health Serv. Res. 12, 502–508 (2021).
Lo-Ciganic, W.-H. et al. Evaluation of machine-learning algorithms for predicting opioid overdose risk among Medicare beneficiaries with opioid prescriptions. JAMA Netw. Open 2, e190968 (2019).
Ripperger, M. et al. Ensemble learning to predict opioid-related overdose using statewide prescription drug monitoring program and hospital discharge data in the state of Tennessee. J. Am. Med. Inform. Assoc. 29, 22–32 (2022).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. NAACL-HLT 4171–4186 (Association for Computational Linguistics, 2019).
Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
Ganesan, A. V. et al. Empirical evaluation of pre-trained transformers for human-level NLP: the role of sample size and dimensionality. In Proc. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 4515–4532 (Association for Computational Linguistics, 2021).
Sun, C., Qiu, X., Xu, Y. & Huang, X. How to fine-tune BERT for text classification? In Proc. China National Conference on Chinese Computational Linguistics 194–206 (Springer, 2019).
Halder, K., Poddar, L. & Kan, M.-Y. Modeling temporal progression of emotional status in mental health forum: a recurrent neural net approach. In Proc. 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 127–135 (Association for Computational Linguistics, 2017).
Matero, M. & Schwartz, H. A. Autoregressive affective language forecasting: a self-supervised task. In Proc. 28th International Conference on Computational Linguistics 2913–2923 (Association for Computational Linguistics, 2020).
Ragheb, W., Moulahi, B., Azé, J., Bringay, S. & Servajean, M. Temporal mood variation: at the CLEF eRisk-2018 tasks for early risk detection on the internet. In CLEF: Conference and Labs of the Evaluation Forum (HAL open science, 2018).
Centers for Disease Control and Prevention (CDC). Underlying Cause of Death, 1999–2020 Request. https://wonder.cdc.gov/ucd-icd10.html (2022).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodological) 57, 289–300 (1995).
Si, Y., Wang, J., Xu, H. & Roberts, K. Enhancing clinical concept extraction with contextual embeddings. J. Am. Med. Inform. Assoc. 26, 1297–1304 (2019).
Naseem, U., Razzak, I., Eklund, P. & Musial, K. Towards improved deep contextual embedding for the identification of irony and sarcasm. In 2020 International Joint Conference on Neural Networks (IJCNN) 1–7 (IEEE, 2020).
Eichstaedt, J. C. et al. Psychological language on Twitter predicts county-level heart disease mortality. Psychol. Sci. 26, 159–169 (2015).
Giorgi, S. et al. Regional personality assessment through social media language. J. Personal. 90, 405–425 (2022).
Mattson, C. L. et al. Trends and geographic patterns in drug and synthetic opioid overdose deaths—United States, 2013–2019. Morb. Mortal. Wkly Rep. 70, 202 (2021).
Schwartz, H. A. et al. Characterizing geographic variation in well-being using tweets. In Proc. 7th International AAAI Conference on Weblogs and Social Media (AAAI Press, Palo Alto, California USA, 2013).
Févotte, C. & Idier, J. Algorithms for nonnegative matrix factorization with the β-divergence. Neural Comput. 23, 2421–2456 (2011).
Matero, M. et al. Suicide risk assessment with multi-level dual-context language and BERT. In Proc. Sixth Workshop on Computational Linguistics and Clinical Psychology 39–44 (Association for Computational Linguistics, 2019).
Cai, Z. & Tiwari, R. C. Application of a local linear autoregressive model to BOD time series. Environmetrics 11, 341–350 (2000).
Zhang, Y., Liu, B., Ji, X. & Huang, D. Classification of EEG signals based on autoregressive model and wavelet packet decomposition. Neural Process. Lett. 45, 365–378 (2017).
Zubaidi, S. L. et al. Prediction and forecasting of maximum weather temperature using a linear autoregressive model. In IOP Conference Series: Earth and Environmental Science Vol. 877, 012031 (IOP Publishing, 2021).
Michel, P., Levy, O. & Neubig, G. Are sixteen heads really better than one? Adv. Neural Inf. Process. Syst. 32, 14014–14024 (2019).
Sanh, V., Debut, L., Chaumond, J. & Wolf, T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Preprint at https://arxiv.org/abs/1910.01108 (2019).
Guu, K., Lee, K., Tung, Z., Pasupat, P. & Chang, M. Retrieval augmented language model pre-training. In Proc. International Conference on Machine Learning 3929–3938 (PMLR, 2020).
Zhuang, L., Wayne, L., Ya, S. & Jun, Z. A robustly optimized BERT pre-training approach with post-training. In Proc. 20th Chinese National Conference on Computational Linguistics 1218–1227 (Chinese Information Processing Society of China, Huhhot, China, 2021).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems Vol. 32 (eds Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E. & Garnett, R.) 8024–8035 (Curran Associates, Inc., 2019).
Falcon, W. A. et al. PyTorch Lightning. GitHub https://github.com/PyTorchLightning/pytorch-lightning (2019).
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In Proc. ICLR (ICLR, 2019).
Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: a next-generation hyperparameter optimization framework. In Proc. 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, 2019).
Schuster, M. & Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45, 2673–2681 (1997).
Connor, J. T., Martin, R. D. & Atlas, L. E. Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. 5, 240–254 (1994).
Yang, S., Yu, X. & Zhou, Y. LSTM and GRU neural network performance comparison study: taking yelp review dataset as an example. In Proc. 2020 International Workshop on Electronic Communication and Artificial Intelligence (IWECAI) 98–101 (IEEE, 2020).
Luong, M.-T., Pham, H. & Manning, C. D. Effective approaches to attention-based neural machine translation. In Proc. 2015 Conference on Empirical Methods in Natural Language Processing 1412–1421 (Association for Computational Linguistics, 2015).
Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836 (1979).