The Fall of The Simpsons

PDF version.

This visualization shows how The Simpsons, once among the most beloved TV shows, declined in reputation and popularity over time, and why this happened.

Tools used: Python's matplotlib library for the time series charts, Google Sheets for the heatmap, and Sketch for assembling the visual design.

Interactive big data visualization for app performance monitoring

Since graduating from college, I have worked as a software engineer (focusing on UI) on the core APM team at AppDynamics (now part of Cisco) in downtown San Francisco, California. Application Performance Management (APM) is a technology that provides end-to-end, business-transaction-centric management of complex, distributed software applications. Auto-discovered transactions, dynamic baselining, code-level diagnostics, and Virtual War Room collaboration ensure rapid issue identification and resolution to maintain an ideal user experience. At AppDynamics, I developed a complex yet performant AngularJS-based web application UI that supports rich user interaction with a wealth of APM data at large scale. I have become seasoned in all phases of the software product lifecycle: design, prototyping, development, maintenance, test automation, and shipping useful features to our customers.

Automatic detection of epileptiform events in EEG recordings

An electroencephalogram (EEG) is the most important tool in the diagnosis of seizure disorders. Between seizures, epileptiform neural activity appears in EEG recordings as spikes or spike-and-slow-wave complexes. An automated EEG interpretation algorithm that clinicians accept has been a research goal for decades. As a participant in an NSF-funded Research Experience for Undergraduates (REU) program hosted by the Clemson University School of Computing, I continued this effort by developing an automated system that detects epilepsy-related events in real time from scalp EEG recordings.

To find a suitable algorithm for this purpose, I built a multi-stage processing pipeline. In the first stage, I cleaned the clinical data gathered from 100 epileptic patients and prepared it for cross-validation. Next, I used wavelet transforms to generate features from the EEG signal with a “sliding window” approach. I then applied machine learning algorithms and analyzed their performance in classifying data patterns as epileptiform activity or other activity. At this stage I also explored using a hidden Markov model to fit the time sequence in which epileptiform events occurred. In the final step, I further separated target epileptiform events from noise signals by applying a statistical model locally, and stitched together the outputs from the different signal windows. – source code
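
The feature-extraction stage can be illustrated with a minimal sketch. It is not the project's actual code: the choice of the Haar wavelet, the use of per-level energies as features, and the names `haar_dwt`, `wavelet_energies`, and `sliding_window_features` are all assumptions made for illustration, and the window and step sizes are arbitrary.

```python
import math


def haar_dwt(signal):
    """Single-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energies(window, levels=3):
    """Feature vector: detail energy at each level, then the final approximation energy."""
    if len(window) % (2 ** levels):
        raise ValueError("window length must be divisible by 2**levels")
    features = []
    approx = list(window)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


def sliding_window_features(signal, window_size=8, step=4, levels=3):
    """Return (start_index, feature_vector) for each overlapping window."""
    return [
        (start, wavelet_energies(signal[start:start + window_size], levels))
        for start in range(0, len(signal) - window_size + 1, step)
    ]


# Toy signal (hypothetical data): flat baseline with a single spike at sample 5.
toy_signal = [0.0] * 16
toy_signal[5] = 4.0
toy_features = sliding_window_features(toy_signal)
```

Each feature vector could then be passed to a classifier. Because the Haar transform is orthonormal, the energies in a vector sum to the energy of the raw window, which makes a convenient sanity check.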

The automated detection results were highlighted in real time on the web interface of eegNet, a standardized EEG database developed by Clemson.

Automatic detection of epileptiform events in EEG recordings – poster

The Open Science Investigation

Barriers that keep scientists from practicing open science persist for a range of cultural and technological reasons. This undergraduate thesis, developed under the guidance of the Center for Open Science, seeks to understand the incentive structure for open science from a sociotechnical perspective, and attempts a software solution to facilitate its implementation. The research paper, Incentive structure for Open Science in Web 2.0, explains how the current reward system needs to change to encourage more open science practices: to create incentives for researchers to open up their research materials to the broader community, organizations need to provide researchers with intrinsic rewards, proper credit allocation, and tangible career benefits. In the technical portion of the project, Designing Data Visualizations for Open Science, I prototyped an interactive research exploration and organizing tool for the Open Science Framework. The thesis contributes to the collective effort towards open science by making the creation of incentives an explicit design goal for open science web applications. – thesis cover   |  STS paper

Twitter Sentiment Analysis

Social media, such as Facebook and Twitter, has been a significant focus of Big Data. Using Hadoop and AWS, I performed a sentiment analysis of Twitter data concerning the series of events around the dismissal and subsequent rehiring of U.Va. President Sullivan in the summer of 2012. My data source was approximately 52,000 tweets collected by the U.Va. Library. The results were presented in an infographic and an interactive data explorer showing the interplay of Twitter users centered around Larry Sabato, the most prominent influencer identified during the summer.
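
As a rough illustration of how a sentiment count fits the map/reduce model, here is a small sketch. It is not the method used in the project, which the description above does not specify: the lexicon-based scoring, the tiny `LEXICON`, and the functions `map_tweet` and `reduce_counts` are hypothetical, and the sample tweets are made up.

```python
from collections import Counter

# Hypothetical mini-lexicon; a real analysis would use a much larger word list.
LEXICON = {"great": 1, "support": 1, "good": 1, "bad": -1, "outrage": -1, "wrong": -1}


def map_tweet(tweet):
    """Map step: score one tweet and emit a (label, 1) pair."""
    words = (w.strip(".,!?#@\"'") for w in tweet.lower().split())
    score = sum(LEXICON.get(w, 0) for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return (label, 1)


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each label."""
    totals = Counter()
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Great news, full support!",
    "This is wrong. Outrage!",
    "Board meets on Tuesday",
]
sentiment_totals = reduce_counts(map_tweet(t) for t in sample_tweets)
```

On a cluster, the map and reduce steps would typically run as separate processes over partitions of the tweet collection; here they run in a single process to keep the sketch self-contained.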
