Arjun Ashok

I am a Student Researcher at Google Cloud AI Research in the San Francisco Bay Area, and a third-year PhD student at MILA-Quebec AI Institute and Université de Montréal advised by Irina Rish and Alexandre Drouin.

My research interests are in time series forecasting, specifically in building flexible models. Some of my notable contributions include making transformer architectures faster and better for forecasting tasks (TACTiS-2), building some of the early foundation models in this space (Lag-Llama), developing methods and benchmarks for textual-context aided forecasting (CiK, Beyond DP), and building post-training data pipelines to adapt LLMs for multimodal forecasting (CAF-7M).

Prior to Google, I spent three years at ServiceNow Research in Montreal, where I worked on forecasting research and published at top-tier venues. Earlier, I interned at IBM Research and dubverse.ai during my undergraduate studies.

To help shape the evolving discourse around foundation models for forecasting, I led the organization of the first Time Series in the Age of Large Models workshop at NeurIPS 2024 (1000+ attendees); a second edition will be held at ICLR 2026.

I also consult for startups in the forecasting space. Reach out if you're interested.

My email address is arjun.ashok.psg [at] gmail [dot] com.



Photo from NeurIPS 2024, Vancouver - opening the first Time Series in the Age of Large Models workshop.

News

This is likely not up-to-date. I tend to post updates more often on LinkedIn these days.

Feb '26 Started as a Student Researcher at Google Cloud AI Research in the San Francisco Bay Area.
Jan '26 Co-organizing the Time Series in the Age of Large Models workshop at ICLR 2026 - second edition overall, first time at ICLR.
Aug '25 One paper out on arXiv, on strategies for improved zero-shot context-aided forecasting with LLMs.
May '25 Context is Key is accepted for publication at ICML 2025. Another paper accepted at the workshop on Foundation Models for Structured Data.
Dec '24 Co-organized the first Time Series in the Age of Large Models workshop at NeurIPS 2024 in Vancouver, Canada, with 1000+ attendees. Check out all the papers and talks from the workshop here.
Oct-Nov '24 Paper on natural-language-based context-aware forecasting, Context is Key: A Benchmark for Forecasting with Essential Textual Information, is out on arXiv. Gave an oral presentation at the FM4TS Workshop at ACM ICAIF 2024, New York, USA.
July '24 Gave an invited talk on Natural Language based Context-Aware Forecasting at the International Symposium on Forecasting (ISF) 2024.
May '24 Presented TACTiS-2 at ICLR 2024. TACTiS-2 is a highly flexible model for multivariate probabilistic time series prediction tasks. Check out the tweet thread and poster here!
Feb '24 The full version of Lag-Llama released with open-source model checkpoints! Check the announcement here!
Jan '24 I gave a talk on our efforts Towards General-Purpose Models for Time-Series Prediction at the Winter 2024 Montreal Time Series Meetup.
Jan '24 TACTiS-2 accepted at ICLR 2024!
Dec '23 I gave a talk on Building Foundation Models for Time Series Data at the 6th workshop on Neural Scaling Laws co-located with NeurIPS 2023.
Oct '23 TACTiS-2 is out on arXiv.
Oct '23 A preliminary version of Lag-Llama is out on arXiv.
Jan '23 One paper on out-of-distribution detection accepted to ICLR 2023. This is work in collaboration with folks at ML Collective mentored by Rosanne Liu.
Jan '23 Started as a Visiting Researcher (Full-Time) at ServiceNow Research, Montreal. Excited to continue working on problems in time series representation learning!
Aug '22 Preliminary work on self-supervised learning objectives for weather time series accepted at the AAAI 2022 Fall Symposium on Climate Change.
Jul '22 One paper on Class-Incremental Learning accepted as a full paper at ECCV 2022.
Jun '22 Started as a Research Intern at IBM Research, India. I'll be working on building self-supervised learning objectives and pre-trained models for geospatial weather time series.
Jun '22 One paper on cross-task generalization in NLP submitted to EMNLP 2022 (Update: Accepted).
Apr '22 One paper on Class-Incremental Learning accepted at the CLVISION Workshop at CVPR 2022 as a non-archival paper (Update: Accepted at ECCV 2022).
Apr '22 One reproducibility report on Self-Supervision and Few-shot Learning accepted at the ML Reproducibility Challenge 2021 (Fall Edition) and published at ReScience-C.
Oct '21 One paper on out-of-distribution generalization accepted at AAAI 2022 as a student abstract.
Jun '21 Started as a Research Assistant at IIT Hyderabad under Prof. Vineeth Balasubramanian.

Selected Papers

Overcoming the Modality Gap in Context-Aided Forecasting
Vincent Zhihao Zheng, Étienne Marcotte, Arjun Ashok, Andrew Robert Williams, Lijun Sun, Alexandre Drouin, Valentina Zantedeschi
Preprint; under review
A method to turn real-world time series datasets into multimodal context-aided forecasting datasets, which we use to generate a dataset containing 7 million windows.
Context-aided forecasting (CAF) holds promise for integrating domain knowledge and forward-looking information, enabling AI systems to surpass traditional statistical methods. However, recent empirical studies reveal a puzzling gap: multimodal models often fail to outperform their unimodal counterparts. We hypothesize that this underperformance stems from poor context quality in existing datasets, as verification is challenging. To address these limitations, we introduce a semi-synthetic data augmentation method that generates contexts both descriptive of temporal dynamics and verifiably complementary to numerical histories. This approach enables massive-scale dataset creation, resulting in CAF-7M, a corpus of 7 million context-augmented time series windows, including a rigorously verified test set. We demonstrate that semi-synthetic pre-training transfers effectively to real-world evaluation, and show clear evidence of context utilization. Our results suggest that dataset quality, rather than architectural limitations, has been the primary bottleneck in context-aided forecasting.
Beyond Naïve Prompting: Strategies for Improved Context-aided Forecasting with LLMs
Arjun Ashok, Andrew Robert Williams, Vincent Zhihao Zheng, Irina Rish, Nicolas Chapados, Étienne Marcotte, Valentina Zantedeschi, Alexandre Drouin
Preprint; under review
We introduce a unified framework for context-aided time series forecasting that bridges the LLM 'execution gap,' boosts prediction accuracy by up to 50%, and significantly reduces inference costs through adaptive model routing.
Real-world forecasting requires models to integrate not only historical data but also relevant contextual information provided in textual form. While large language models (LLMs) show promise for context-aided forecasting, critical challenges remain: we lack diagnostic tools to understand failure modes, performance remains far below their potential, and high computational costs limit practical deployment. We introduce a unified framework of four strategies that address these limitations along three orthogonal dimensions: model diagnostics, accuracy, and efficiency. Through extensive evaluation across model families from small open-source models to frontier models including Gemini, GPT, and Claude, we uncover both fundamental insights and practical solutions. Our findings span three key dimensions: diagnostic strategies reveal the “Execution Gap” where models correctly explain how context affects forecasts but fail to apply this reasoning; accuracy-focused strategies achieve substantial performance improvements of 25-50%; and efficiency-oriented approaches show that adaptive routing between small and large models can approach large model accuracy on average while significantly reducing inference costs. These orthogonal strategies can be flexibly integrated based on deployment constraints, providing practitioners with a comprehensive toolkit for practical LLM-based context-aided forecasting.
Context is Key: A Benchmark for Forecasting with Essential Textual Information
Arjun Ashok*, Andrew Robert Williams*, Étienne Marcotte, Valentina Zantedeschi, Jithendaraa Subramanian, Roland Riachi, James Requeima, Alexandre Lacoste, Irina Rish, Nicolas Chapados, Alexandre Drouin  (* co-first authorship)
A forecasting benchmark with problems that require the combined use of numerical historical data and textual context.
Forecasting is a critical task in decision making across various domains. While numerical data provides a foundation, it often lacks crucial context necessary for accurate predictions. Human forecasters frequently rely on additional information, such as background knowledge or constraints, which can be efficiently communicated through natural language. However, the ability of existing forecasting models to effectively integrate this textual information remains an open question. To address this, we introduce "Context is Key" (CiK), a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context, requiring models to integrate both modalities. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters, and propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings. By presenting this benchmark, we aim to advance multimodal forecasting, promoting models that are both accurate and accessible to decision-makers with varied technical expertise. The benchmark can be visualized at https://servicenow.github.io/context-is-key-forecasting/.
TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series
Arjun Ashok, Étienne Marcotte, Valentina Zantedeschi, Nicolas Chapados, Alexandre Drouin
A flexible model for multivariate probabilistic time series prediction, simplifying the training of attentional copulas, with state-of-the-art accuracy on diverse forecasting tasks, while supporting interpolation and learning from irregular data.
We introduce a new model for multivariate probabilistic time series prediction, designed to flexibly address a range of tasks including forecasting, interpolation, and their combinations. Building on copula theory, we propose a simplified objective for the recently-introduced transformer-based attentional copulas (TACTiS), wherein the number of distributional parameters now scales linearly with the number of variables instead of factorially. The new objective requires the introduction of a training curriculum, which goes hand-in-hand with necessary changes to the original architecture. We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks, while maintaining the flexibility of prior work, such as seamless handling of unaligned and unevenly-sampled time series.
Lag-Llama: Towards Foundation Models for Time Series Forecasting
Arjun Ashok*, Kashif Rasul*, Andrew Robert Williams, Hena Ghonia, Rishika Bhagwatkar, Arian Khorasani, Mohammad Javad Darvishi Bayazi, George Adamopoulos, Roland Riachi, Nadhir Hassen, Marin Biloš, Sahil Garg, Anderson Schneider, Nicolas Chapados, Alexandre Drouin, Valentina Zantedeschi, Yuriy Nevmyvaka, Irina Rish  (* co-first authorship)
Paper · Code · Weights · Demo · Video
A foundation model for probabilistic time series forecasting with strong zero-shot and few-shot capabilities
Over the past years, foundation models have caused a paradigm shift in machine learning due to their unprecedented capabilities for zero-shot and few-shot generalization. However, despite the success of foundation models in modalities such as natural language processing and computer vision, the development of foundation models for time series forecasting has lagged behind. We present Lag-Llama, a general-purpose foundation model for univariate probabilistic time series forecasting based on a decoder-only transformer architecture that uses lags as covariates. Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities compared to a wide range of forecasting models on downstream datasets across domains. Moreover, when fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance, outperforming prior deep learning approaches, emerging as the best general-purpose model on average. Lag-Llama serves as a strong contender to the current state-of-art in time series forecasting and paves the way for future advancements in foundation models tailored to time series data.

Previous Work

In an earlier chapter (in the pre-LLM era), I worked on problems in out-of-distribution generalization, continual learning, and few-shot learning, spanning the domains of computer vision and natural language processing. Please check my Google Scholar for a list of previous publications.

Invited Talks

Academic Service

Misc

On the side, I am a Carnatic musician, and I regularly perform concerts at venues in India, Canada, and the US. Here is a recording of a concert of mine from Dec 2025.

I'm also a hybrid athlete (I enjoy running, biking, and working out), and I enjoy reading non-fiction.

From a concert performed in Dec 2025