Phani Krishna Uppala

I am a research associate at the Video Analytics Lab, where I work on deep learning, computer vision, and deep reinforcement learning. In particular, I have worked on unsupervised domain adaptation for depth estimation, crafting image-agnostic adversarial attacks, adversarial training, depth-aware style transfer, feature visualization, and other interesting deep learning problems.

I received my Bachelor's degree in ECE from IIT Guwahati, where I worked on several projects, including hyperspectral image segmentation under Prof. Amit Sethi.

I was a research intern at the Computational Intelligence Lab, where I worked on face recognition and tracking in UAV videos. Prior to that, I worked in the Next Generation Wireless Systems Lab with Prof. Neelesh Mehta on decision rules for Rayleigh-faded channels.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  Blog


  • Our paper AdaDepth has been accepted at CVPR 2018 (Spotlight).
  • AdaDepth was featured in an IISc news article.
  • AdaDepth code and pre-trained models: Project Page.
  • Ask, Acquire and Attack has been published at ECCV 2018.
  • Unsupervised Feature Learning of Action Sequences as Trajectories in Pose Manifold has been accepted at WACV 2019.
  • Semi-supervised Segmentation of Hyperspectral Tissue Images has been accepted in IEEE TMI.


I'm interested in computer vision, deep learning, reinforcement learning, adversarial attacks and defenses, domain adaptation for structured regression tasks, and neural style transfer that preserves 3D information.

AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
J Kundu*, Phani Krishna Uppala*, A Pahuja, V Babu. Computer Vision and Pattern Recognition (CVPR), 2018   (Spotlight) * Equal Contribution.

An unsupervised domain adaptation strategy for the pixel-wise regression task of monocular depth estimation.

Ask, Acquire and Attack
Phani Krishna Uppala, KR Mopuri, V Babu. European Conference on Computer Vision (ECCV), 2018

Crafting image-agnostic adversarial attacks.


Some of the projects I had fun implementing.

Self-driving car simulation using an RL agent.

A Deep Q-Network (DQN) agent is trained to drive a simulated car. The agent is penalized for going off-track or crashing and rewarded for high speeds while on track. The car speed, game screen, and collision state are collected from the emulator, and the available actions are pressing the right, left, brake, and accelerate keys.
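
The reward scheme described above can be sketched roughly as follows. This is an illustrative sketch rather than the project's code: the penalty values, the speed cap, and the function names are all assumptions. The second function is the standard one-step target a DQN regresses its Q-values toward.

```python
# Illustrative reward shaping and one-step Q-learning target for the driving agent.
# All constants and names below are assumptions, not values from the project.

ACTIONS = ["left", "right", "brake", "accelerate"]  # one key press per step

OFF_TRACK_PENALTY = -1.0   # hypothetical
CRASH_PENALTY = -10.0      # hypothetical
MAX_SPEED = 100.0          # hypothetical emulator speed cap


def reward(speed, on_track, crashed):
    """Penalize crashes and leaving the track; reward speed while on track."""
    if crashed:
        return CRASH_PENALTY
    if not on_track:
        return OFF_TRACK_PENALTY
    # Clamp to [0, MAX_SPEED] so the on-track reward lies in [0, 1].
    return max(0.0, min(speed, MAX_SPEED)) / MAX_SPEED


def td_target(r, next_q_values, done, gamma=0.99):
    """Standard DQN target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return r
    return r + gamma * max(next_q_values)
```

Scaling the on-track reward by speed keeps it bounded, which tends to make Q-value regression better behaved than an unbounded reward.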

Graph convolutions for semi-supervised classification.

Graphs are ubiquitous data structures, and the research community has recently taken up the challenge of generalizing well-performing CNNs to graph structures. Taking inspiration from other works, this project involved implementing variants of the regularization loss for GCNs.
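
Which regularizers were implemented is not stated here. As one common example, a graph Laplacian smoothness penalty encourages connected nodes to have similar outputs. The sketch below assumes a symmetric adjacency matrix given as nested lists; the function names and the weight `lam` are hypothetical.

```python
def laplacian_regularizer(adjacency, features):
    """Sum of A[i][j] * ||f_i - f_j||^2, counting each undirected edge once.

    Assumes `adjacency` is symmetric; `features` holds one vector per node.
    """
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if adjacency[i][j]:
                sq_dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                total += adjacency[i][j] * sq_dist
    return total


def semi_supervised_loss(supervised_loss, adjacency, features, lam=0.1):
    """Supervised loss on labeled nodes plus a weighted smoothness penalty."""
    return supervised_loss + lam * laplacian_regularizer(adjacency, features)
```

On a three-node path graph with scalar outputs 0, 1, and 3, the penalty is 1 + 4 = 5.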

Simulated human stick figure trained to walk using an RL agent.

A human stick figure is trained to walk using an actor-critic method, in which the policy is updated with trust region policy optimization (TRPO) and advantages are computed from the value function with generalized advantage estimation (GAE).
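
Generalized advantage estimation computes an exponentially weighted sum of one-step TD errors, A_t = delta_t + gamma * lambda * A_{t+1} with delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). A minimal sketch for a single trajectory with no episode resets (the function name and default hyperparameters are assumptions):

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    `values` must have len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final reward.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Setting `lam=0` reduces this to the one-step TD error, while `lam=1` gives the discounted return minus the value baseline.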

Chatbot built using TensorFlow.

A sequence-to-sequence model is used: an LSTM encoder reads the input sequence, and an LSTM decoder uses the encoded representation to generate the output sequence.

Chatbot built using IBM Watson Conversation.

Leveraging IBM Cloud services and Watson Conversation, which is trained on a huge amount of data, a chatbot is developed using a custom-trained natural language classifier and integrated with the Dialog service to make appropriate decisions.

Blockchain with transaction validation.

A transaction pool is created between imaginary people, and a hash function gives each transaction a fingerprint. A set of state transition rules is then used to create blocks, and mechanisms are implemented for validating a transaction, a block, and the full chain.
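
The pieces described above (fingerprinting, state transitions, and validation of a transaction, a block, and the full chain) can be sketched with the standard library alone. The data layout, the validity rules (amounts sum to zero, no overdrafts), and all function names are assumptions made for illustration.

```python
import hashlib
import json


def fingerprint(obj):
    """SHA-256 hex digest of a JSON-serializable object (keys sorted for stability)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def is_valid_txn(txn, state):
    """A transaction maps account -> signed amount. It must sum to zero
    (no money created or destroyed) and must not overdraw any account."""
    if sum(txn.values()) != 0:
        return False
    return all(state.get(name, 0) + amount >= 0 for name, amount in txn.items())


def apply_txn(txn, state):
    """State transition: return a new state with the transaction applied."""
    new_state = dict(state)
    for name, amount in txn.items():
        new_state[name] = new_state.get(name, 0) + amount
    return new_state


def make_block(txns, parent):
    """Build a block that points at its parent through the parent's hash."""
    contents = {
        "number": parent["contents"]["number"] + 1,
        "parent_hash": parent["hash"],
        "txns": txns,
    }
    return {"hash": fingerprint(contents), "contents": contents}


def check_block(block, parent, state):
    """Validate one block against its parent; return the updated state or raise."""
    contents = block["contents"]
    if block["hash"] != fingerprint(contents):
        raise ValueError("block hash does not match contents")
    if contents["parent_hash"] != parent["hash"]:
        raise ValueError("parent hash mismatch")
    if contents["number"] != parent["contents"]["number"] + 1:
        raise ValueError("block number out of sequence")
    for txn in contents["txns"]:
        if not is_valid_txn(txn, state):
            raise ValueError("invalid transaction")
        state = apply_txn(txn, state)
    return state


def check_chain(chain):
    """Validate the full chain; the genesis block's transactions set the initial state."""
    genesis = chain[0]
    if genesis["hash"] != fingerprint(genesis["contents"]):
        raise ValueError("genesis hash does not match contents")
    state = {}
    for txn in genesis["contents"]["txns"]:
        state = apply_txn(txn, state)
    parent = genesis
    for block in chain[1:]:
        state = check_block(block, parent, state)
        parent = block
    return state
```

Because each block stores its parent's hash, changing any earlier transaction changes that block's fingerprint and invalidates every block after it.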

Feature visualization for convolutional neural networks.

Deep neural networks were previously regarded as black boxes, and the research community has proposed many visualization techniques to alleviate this. To gain further insight, a class-level feature visualization is implemented. These class-level visualizations showed structures that closely resemble those present in real data samples.

Depth-aware style transfer.

Style transfer using Gram matrices opened a new avenue in deep learning, but the 3D perspective of the original image is not preserved in the stylized image. A depth-aware loss has recently been used to overcome this. To improve on it further, a GAN-based discriminator is trained to differentiate between the depth maps of original images and those of stylized images.
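
For reference, the Gram-matrix style loss that this work builds on compares channel-wise feature correlations of the generated and style images. The sketch below uses plain lists in place of network activations; the normalization by the number of spatial positions is one common convention and is an assumption here, as are the function names. The depth-aware and adversarial terms are not shown.

```python
def gram_matrix(features):
    """Channel correlation matrix of a feature map.

    `features` is a list of C channels, each a flat list of N activations
    (height * width). Returns a C x C matrix of inner products divided by N.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def style_loss(generated_features, style_features):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(generated_features)
    s = gram_matrix(style_features)
    c = len(g)
    return sum((g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)
```

The Gram matrix discards spatial layout and keeps only which channels fire together, which is why it captures texture and style but not 3D structure.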

Anomaly detection in human movement patterns.

Anomaly detection has very important security implications. A neural-network-based anomaly detection model is trained on video data.

Top-view estimation of indoor scenes for robotic navigation.

A bird's-eye view of a room makes navigation much easier for robots than a first-person view, but videos collected using cameras are in first-person view. A neural-network-based model is therefore trained to produce the top view from an RGB-D image.

Hyperspectral image segmentation using semi-supervised hierarchical clustering.

In biomedical image segmentation, the relationships among different cells are often important, so a single label per pixel does not convey the full picture, and hierarchical clustering is used instead. However, hierarchical clustering is very expensive in memory and computation. To overcome this, a two-stage clustering is used, which combines the rich information of hierarchical clustering with the computational efficiency of traditional methods.

Hyperspectral image segmentation using convolutional neural networks with spectral and spatial features.

In hyperspectral images, information is encoded in both the spectral and spatial domains. To exploit all the available information, a two-stream CNN with a joint final fully connected layer is used to predict the segmentation.

Face recognition and tracking using surveillance video from a UAV.

From UAV surveillance videos, a cascade of features is extracted, starting with skin color segmentation, followed by Viola-Jones feature extraction and others. Finally, an SVM classifier is used to detect the faces of the subjects.

Optimum decision rule for an ML detector when the transmitted signal is corrupted by a Rayleigh-faded channel.

The optimum decision rule is established for an ML detector when the transmitted signal passes through a Rayleigh-faded channel with AWGN, log-normal, Rayleigh, or Rician interference, for MPSK and MQAM modulation schemes.
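
For the simplest of these cases, AWGN only with the complex channel gain h known at the receiver and equiprobable symbols, the ML rule reduces to minimum distance: choose the symbol s that minimizes |y - h s|^2. Coherent detection is an assumption here, since the description does not say whether the channel is known, and the interference cases need different likelihoods that are not shown. Function names are hypothetical.

```python
import cmath
import math


def mpsk_constellation(m):
    """Unit-energy M-PSK points exp(j * 2 * pi * k / M) for k = 0..M-1."""
    return [cmath.exp(2j * math.pi * k / m) for k in range(m)]


def ml_detect(y, h, constellation):
    """Index of the symbol s minimizing |y - h * s|^2.

    y: received complex sample, h: known complex channel gain.
    """
    return min(range(len(constellation)),
               key=lambda k: abs(y - h * constellation[k]) ** 2)
```

The same `ml_detect` works for MQAM by passing a QAM constellation instead of the PSK one, since the rule depends only on distances to the faded constellation points.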


Some of the products I have built over the years along with my colleagues.

2.4 GHz RF receiver

Design of a 2.4 GHz radio-frequency receiver. The PCB layout, with LNA, mixer, and antenna, was designed in Eagle and fabricated.

Game - Digital Flappy Bird

Developed synthesizable Verilog code for the Flappy Bird game using Quartus software and programmed it onto a Krypton CPLD kit over JTAG. The game and score are displayed on an LED matrix, and the user controls are on the board itself.

Game - Air hockey

Developed an LED simulation of the game Air Hockey on an 8x9 LED grid using an 8085 microprocessor and logic gates. A Dyna-kit, interfaces such as the 8255, interrupts, masks, and memory registers are used in coordination to design the game.

Moving-target-tracking automobile via shortest path using RFID

Tracking is done with the help of RFID tags attached beforehand to the intended targets. The automobile runs an algorithm that takes real-time inputs from the tags and uses particle filtering to estimate the shortest "path of approach" to the target.
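
The particle filtering step can be illustrated with a deliberately simplified sketch: a target on a one-dimensional track whose position is read with Gaussian noise. The measurement model, the one-dimensional setting, and every parameter and name below are assumptions for illustration; path planning toward the estimate is not shown.

```python
import math
import random


def particle_filter(measurements, n_particles=2000, lo=0.0, hi=100.0,
                    meas_sigma=1.0, jitter_sigma=0.5, seed=0):
    """Estimate a target position from a sequence of noisy position readings."""
    rng = random.Random(seed)
    # Start with particles spread uniformly over the track.
    particles = [rng.uniform(lo, hi) for _ in range(n_particles)]
    for z in measurements:
        # Predict: diffuse particles to allow for target motion.
        particles = [p + rng.gauss(0.0, jitter_sigma) for p in particles]
        # Update: weight each particle by the Gaussian likelihood of the reading.
        weights = [math.exp(-0.5 * ((z - p) / meas_sigma) ** 2) for p in particles]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles  # every particle implausible: keep them all
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    # The posterior mean serves as the position estimate.
    return sum(particles) / n_particles
```

For example, feeding it twenty readings scattered around a fixed position should return an estimate close to that position; the jitter term is what lets the filter follow a target that moves between readings.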