Arjun Ashok

I am a research assistant at IIT Hyderabad, supervised by Prof. Vineeth Balasubramanian. I am also an Integrated Master's student at PSG Tech.

I work on developing algorithms that improve the learning efficiency and adaptiveness of machine learning systems. My current research interests are in transfer learning, particularly in pre-training, meta-learning, multi-task learning, continual learning and out-of-distribution generalization.

Education

Integrated B.Sc.-M.Sc., Software Systems
PSG Tech, Coimbatore
Jun '18 - May '23 (Expected)

Experience

Research Intern
IIT Madras
May '20 - Aug '20

Research Assistant
IIT Hyderabad
Jun '21 - Present

Research Engineering Intern
dubverse.ai
Mar '22 - Present

Research Intern
IBM Research, India
Jun '22 - Aug '22

News

Sep '22 One paper on out-of-distribution detection submitted to ICLR 2023. This is work in collaboration with folks at ML Collective mentored by Rosanne Liu.
Aug '22 Preliminary work on self-supervised learning objectives for weather time series accepted at the AAAI 2022 Fall Symposium on Climate Change.
Jul '22 One paper on Class-Incremental Learning accepted as a full paper at ECCV 2022.
Jun '22 Started as a Research Intern at IBM Research, India. I'll be working on building self-supervised learning objectives and pre-trained models for geospatial weather time series.
Jun '22 One paper on cross-task generalization in NLP submitted to EMNLP 2022 (Update: Accepted).
Apr '22 One paper on Class-Incremental Learning accepted at the CLVISION Workshop at CVPR 2022 as a non-archival paper (Update: Accepted at ECCV 2022).
Apr '22 One reproducibility report on Self-Supervision and Few-shot Learning accepted at the ML Reproducibility Challenge 2021 (Fall Edition) and published at ReScience-C.
Mar '22 Started as a Research Engineering Intern at dubverse.ai. Excited to work on problems in lip synchronization here.
Oct '21 One paper on out-of-distribution generalization accepted at AAAI 2022 as a student abstract.
Jun '21 Started as a Research Assistant at IIT Hyderabad. Grateful to be working under Prof. Vineeth Balasubramanian.

Publications

Conference Publications

Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer
Arjun Ashok, K J Joseph, Vineeth Balasubramanian
Accepted at ECCV 2022
Also appeared at the CLVISION Workshop, CVPR 2022

Paper arXiv Project Page Code

We propose two distillation-based objectives for class-incremental learning that leverage the structure of the feature space to maintain accuracy on previous classes, as well as enable learning the new classes.
In class-incremental learning, the model is expected to learn new classes continually while maintaining knowledge on previous classes. The challenge here lies in preserving the model's ability to effectively represent prior classes in the feature space, while adapting it to represent incoming new classes. We propose two distillation-based objectives for class-incremental learning that leverage the structure of the feature space to maintain accuracy on previous classes, as well as enable learning the new classes. In our first objective, termed cross-space clustering (CSC), we propose to use the feature space structure of the previous model to characterize directions of optimization that maximally preserve each class: directions that all instances of a specific class should collectively optimize towards, and those that they should collectively optimize away from. Apart from minimizing forgetting, this indirectly encourages the model to cluster all instances of a class in the current feature space, giving rise to a sense of herd immunity that allows all samples of a class to jointly combat forgetting of the class. Our second objective, termed controlled transfer (CT), tackles incremental learning from an understudied perspective of inter-class transfer. CT explicitly approximates and conditions the current model on the semantic similarities between incrementally arriving classes and prior classes. This allows the model to learn classes in such a way that it maximizes positive forward transfer from similar prior classes, thus increasing plasticity, and minimizes negative backward transfer on dissimilar prior classes, thereby strengthening stability. We perform extensive experiments on two benchmark datasets, adding our method (CSCCT) on top of three prominent class-incremental learning methods. We observe consistent performance improvements across a variety of experimental settings.
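As an illustration of the cross-space clustering idea, here is a rough PyTorch sketch; the formulation and names below are my own simplification for exposition, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def csc_style_loss(feats_new, feats_old, labels):
    """Simplified cross-space-clustering-style objective (illustrative only):
    pull each sample's current feature toward the previous model's class mean
    for its label, and push it away from the old-space means of other classes."""
    classes = labels.unique()
    # Frozen class-wise targets computed in the previous model's feature space.
    old_means = {int(c): feats_old[labels == c].mean(dim=0) for c in classes}

    loss = 0.0
    for i, y in enumerate(labels):
        target = old_means[int(y)]
        # Attraction: cluster the class around its old-space mean.
        attract = 1 - F.cosine_similarity(feats_new[i], target, dim=0)
        # Repulsion: move away from the old-space means of the other classes.
        repel = sum(
            F.cosine_similarity(feats_new[i], old_means[int(c)], dim=0).clamp(min=0)
            for c in classes if int(c) != int(y)
        )
        loss = loss + attract + repel / max(len(classes) - 1, 1)
    return loss / len(labels)
```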
Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks
Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, ..., Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, Daniel Khashabi (40 authors)
Accepted at EMNLP 2022

arXiv Dataset Code

We introduce a benchmark of 1,600+ diverse language tasks and their expert-written instructions, and rigorously benchmark the cross-task generalization of models to unseen tasks. We also introduce Tk-Instruct, an encoder-decoder Transformer trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples), which outperforms existing larger models on our benchmark.
How can we measure the generalization of models to a variety of unseen tasks when provided with their language instructions? To facilitate progress in this goal, we introduce Natural-Instructions v2, a benchmark of 1,600+ diverse language tasks and their expert-written instructions. It covers 70+ distinct task types, such as tagging, in-filling, and rewriting. These tasks are collected with contributions of NLP practitioners in the community and through an iterative peer review process to ensure their quality. With this large and diverse collection of tasks, we are able to rigorously benchmark cross-task generalization of models -- training on a subset of tasks and evaluating on the remaining unseen ones. For instance, we quantify generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances, and model sizes. Based on these insights, we introduce Tk-Instruct, an encoder-decoder Transformer that is trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples) which outperforms existing larger models on our benchmark. We hope this benchmark facilitates future progress toward more general-purpose language understanding models.
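For a concrete picture of what "in-context instructions" look like, here is how a prompt could be assembled from a task definition and k-shot demonstrations; the task and examples below are hypothetical and only illustrate the format.

```python
# Hypothetical instruction-following prompt: a plain-language task definition,
# k positive demonstrations, then the evaluation instance.
definition = ("Definition: In this task, you are given a sentence. "
              "Rewrite it in the passive voice.")
demonstrations = [
    ("The cat chased the mouse.", "The mouse was chased by the cat."),
    ("The committee approved the proposal.", "The proposal was approved by the committee."),
]
query = "The chef prepared the meal."

prompt = definition + "\n"
for inp, out in demonstrations:
    prompt += f"Input: {inp}\nOutput: {out}\n"
prompt += f"Input: {query}\nOutput:"

print(prompt)
# The prompt can then be passed to an encoder-decoder model such as a
# Tk-Instruct checkpoint, which generates the answer for the unseen instance.
```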

Workshop/Symposium Publications

Self-Supervised Representations of Geolocated Weather Time Series - an Evaluation and Analysis
Arjun Ashok, Devyani Lambhate, Jitendra Singh
Accepted at AAAI 2022 Climate Change Symposium

Preprint

We analyse the ability of existing self-supervised multivariate time series learning algorithms to learn representations of weather features, evaluating them on weather-driven downstream applications.
Self-supervised learning (SSL) algorithms are gaining traction in various domains as a general paradigm of learning representations from data, largely outperforming supervised learning algorithms in tasks where labelled data is limited and costly to collect. In this work, we analyse existing self-supervised multivariate time series learning algorithms on their ability to learn representations of weather features, evaluating them on weather-driven downstream applications involving regression, classification and forecasting tasks. We experiment with a two-step protocol. In the first step, we employ an SSL algorithm and learn generic weather representations from multivariate weather data. Then, in the next step, we use these representations and train simple linear models for multiple downstream tasks. Through our experiments on air quality prediction tasks, we highlight the benefits of self-supervised weather representations. The benefits include improved performance across multiple tasks, the ability to generalize with limited in-task data, and a reduction in training time and carbon emissions. We highlight several areas of future work and the potential impact that such algorithms can have on real-world problems. We expect such a direction to be relevant in multiple weather-driven applications supporting climate change mitigation and adaptation efforts.
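As a minimal sketch of the two-step protocol described above (assuming a generic frozen encoder and a scikit-learn linear probe; the names and data below are placeholders, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

# Step 1: obtain frozen embeddings from a pretrained SSL encoder. The flatten
# below is a stand-in; in practice this would be a learned time-series encoder
# applied to (n_windows, n_timesteps, n_weather_features) arrays.
def encode(windows: np.ndarray) -> np.ndarray:
    return windows.reshape(len(windows), -1)

rng = np.random.default_rng(0)
train_x, test_x = rng.normal(size=(256, 24, 6)), rng.normal(size=(64, 24, 6))
train_y, test_y = rng.normal(size=256), rng.normal(size=64)  # e.g. air-quality targets

# Step 2: freeze the representations and train a simple linear model per task.
probe = Ridge(alpha=1.0).fit(encode(train_x), train_y)
print("MAE:", mean_absolute_error(test_y, probe.predict(encode(test_x))))
```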
Learning Modular Structures That Generalize Out-Of-Distribution
Arjun Ashok, Chaitanya TD, Vineeth Balasubramanian
Accepted at AAAI 2022 Student Track

Short Version

We design two regularizers that encourage a network to preserve expert features that are reusable across domains, enabling it to extrapolate better to unseen distributions.
Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only the features in the network that are reused well across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
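To make "probabilistic differentiable binary mask" concrete, here is an illustrative sketch of one common way to implement such a mask (a straight-through estimator over per-neuron keep probabilities); this reflects my own assumptions, not necessarily the paper's exact design.

```python
import torch

class StraightThroughMask(torch.nn.Module):
    """Per-neuron binary mask that stays differentiable: sample a hard 0/1 mask
    from learned keep probabilities, and route gradients through the
    probabilities with a straight-through estimator."""
    def __init__(self, n_units: int):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(n_units))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(self.logits)      # per-neuron keep probability
        hard = torch.bernoulli(probs)           # sampled binary mask
        mask = hard + probs - probs.detach()    # straight-through trick
        return x * mask

    def sparsity_penalty(self) -> torch.Tensor:
        # Encourages dropping units, so only well-reused features survive.
        return torch.sigmoid(self.logits).mean()
```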
Does Self-Supervision Always Improve Few-Shot Learning?
Arjun Ashok, Haswanth Aekula
Accepted at the ReScience-C journal through the Machine Learning Reproducibility Challenge (MLRC) 2021
To be presented at the Journal Showcase Poster Session at NeurIPS 2022

PDF W&B Blog Code

In contrast to prior literature, we show that the effectiveness of self-supervision in improving few-shot learning depends heavily on the architecture and image size used, and that using self-supervision to train models decreases cross-domain few-shot performance.

Preprints

Extremely Simple Activation Shaping for Out-of-Distribution Detection
Andrija Djurisic, Nebojsa Bozanic, Arjun Ashok, Rosanne Liu
Preprint, Under Review at ICLR 2023

arXiv Project Page Code

We develop an extremely simple, post hoc, on-the-fly, and plug-and-play activation shaping method for out-of-distribution detection.
The separation between training and deployment of machine learning models implies that not all scenarios encountered in deployment can be anticipated during training, and therefore relying solely on advancements in training has its limits. Out-of-distribution (OOD) detection is an important area that stress-tests a model's ability to handle unseen situations: Do models know when they don't know? Existing OOD detection methods either incur extra training steps, require additional data, or make nontrivial modifications to the trained network. In contrast, in this work, we propose an extremely simple, post-hoc, on-the-fly activation shaping method, ASH, where a large portion (e.g. 90%) of a sample's activation at a late layer is removed, and the rest (e.g. 10%) simplified or lightly adjusted. The shaping is applied at inference time, and does not require any statistics calculated from training data. Experiments show that such a simple treatment enhances in-distribution and out-of-distribution sample distinction so as to allow state-of-the-art OOD detection on ImageNet, and does not noticeably deteriorate the in-distribution accuracy. We release alongside the paper two calls for explanation and validation, believing in the collective power to further validate and understand the discovery.
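A minimal sketch of the pruning flavour of this activation-shaping step (parameter names are mine; this is not the authors' reference implementation):

```python
import torch

def ash_prune(activation: torch.Tensor, percentile: float = 90.0) -> torch.Tensor:
    """Zero out the bottom `percentile`% of each sample's activations at a late
    layer, at inference time, leaving the remaining values unchanged."""
    flat = activation.flatten(start_dim=1)                 # (batch, features)
    n_keep = max(1, int(flat.shape[1] * (1 - percentile / 100.0)))
    threshold = flat.topk(n_keep, dim=1).values[:, -1:]    # per-sample cutoff
    shaped = torch.where(flat >= threshold, flat, torch.zeros_like(flat))
    return shaped.view_as(activation)

# The shaped activations are passed through the rest of the network as usual,
# and an OOD score (e.g. an energy score over the resulting logits) is computed.
```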

Ongoing Work

Self-Supervised Learning Objectives for Geospatial Weather Time Series
In collaboration with IBM Research, India
Fairness in Neural Ranking with Multiple Sensitive Attributes
At PSG Tech
Analysing Differential Privacy and Fairness in Vision Transformers

Hackathons

Microsoft CodeFundo++ Hackathon 2019-20

Regional Runner-up out of 550 teams for a project on a decentralized, blockchain-powered electoral voting system.

Smart India Hackathon 2020

Placed 3rd out of 900 teams for a project on real-time road congestion prediction.

Stanford TreeHacks 2022

Built a web application to track a user's carbon footprint through grocery receipts.

Check out our work here!

Teaching

Fall 2021 Deep Learning for Computer Vision, NPTEL (Online Course). Instructor: Vineeth Balasubramanian. Taken by 6,426 students.

Music

I am a Carnatic Vocalist and a student of Vidwan Bharat Sundar.

I have performed in multiple venues in India. Here is a news article that covered one of my concerts in Coimbatore.
Here is a SoundCloud playlist with songs from my performances.

Embedded performances: May 2022 and Jan 2019.

Awards

  • Department Rank 1 among 120 students for the past three consecutive academic years.
  • State Rank 4 among 104,000 candidates in the TNHSE examinations in 2018, scoring in the 100th percentile (one of only 10 students among all candidates).
  • Institute Gold Medal and Outstanding Student Award, G.D. Matriculation Higher Secondary School in 2018 - chosen out of 140 students in the graduating batch.
  • PASCH Scholarship to attend a language summer school in Frankfurt, Germany in 2017.
    Awarded to only 80 students worldwide.

Misc.