Soumyajit Karmakar

I am a Project Assistant at the Vision and AI Lab (VAL), Indian Institute of Science, where I work on Multi-Modal Learning and Vision-Language Models with Prof. R. Venkatesh Babu.

Prior to this, I completed my B.Tech in CSE at IIIT Guwahati, where I secured departmental rank 2. I did my Bachelor's thesis at the Centre for Visual Information Technology (CVIT), IIIT Hyderabad, under Prof. C. V. Jawahar, on semantic segmentation for urban road settings using Diffusion Models.

In the summer of 2022, I participated in the Parameterized Algorithms and Computational Experiments (PACE) competition under the guidance of Dr. Srinibas Swain (IIIT Guwahati) and secured Global Rank 1 in the heuristic track.

Email  /  CV  /  Google Scholar  /  GitHub

Research

I am interested in deep learning and computer vision. My current research interests lie at the intersection of vision and language models, with a focus on contrastive-loss-based models like CLIP. I have previously worked on topics such as Latent Diffusion Models, Person Re-identification, and Few-Shot Learning.

News

  • Oct 2023 - Paper on "A Study on ViTs Augmented by Masked Autoencoders" accepted at WACV 2024.
  • Sept 2023 - Joined VAL, IISc as a Project Assistant under Prof. R. Venkatesh Babu.
  • Jan 2023 - Joined CVIT, IIITH as a Research Fellow for my Bachelor's thesis under Prof. C. V. Jawahar.
  • Oct 2022 - Paper on "Convolutional Ensembling based Few-Shot Defect Detection Technique" accepted at ICVGIP 2022.
  • Aug 2022 - Started a project under Dr. Srijan Das at UNCC.
  • July 2022 - Secured Global Rank 1 at the PACE 2022 competition in the student category of the heuristics track.
  • May 2022 - Joined AITG, CSIR-CEERI as a Research Intern under Dr. Sanjay Singh.
  • Aug 2019 - Started B.Tech in CSE at IIIT Guwahati.

Publications

Preprint & 2024
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders
Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, and Michael S. Ryoo.

Winter Conference on Applications of Computer Vision (WACV), 2024
arXiv / code

We developed a joint ViT training pipeline that uses a self-supervised auxiliary task alongside the primary task, for settings where training data is limited.


2022
Convolutional Ensembling based Few-Shot Defect Detection Technique
Soumyajit Karmakar, Abeer Banerjee, Prashant Sadashiv Gidde, Sumeet Saurav, and Sanjay Singh.

Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2022
arXiv / code

We developed a novel convolution-stacking-based ensemble learning technique that can efficiently extract and process the knowledge base of various pretrained CNNs.

Feedback vertex set using Edge Density and REmove Redundant (FEDRER): A heuristic solver for finding a feedback vertex set in a directed graph
Aman Jain, Sachin Agarwal, Nimish Agrawal, Soumyajit Karmakar, and Srinibas Swain.

Poster session of the International Symposium on Parameterized and Exact Computation (IPEC), 2022
arXiv / code

This was our submission for the PACE 2022 competition. We developed a novel divide-and-conquer heuristic algorithm to find a directed feedback vertex set of a graph.

Academic Activities and other Projects

  • Standardized exams: GRE 331/340 (162 in Verbal Reasoning, 169 in Quantitative Reasoning); TOEFL 112/120.
  • Contributed to the open-source project CompilerGym, an open-source library of reinforcement learning environments for compiler tasks maintained by Facebook Research.
  • Served as a reviewer for ICVGIP 2022.


Thanks to Jon Barron for the template.