Let me tell you more about my project. I started working on it at the beginning of July. My main task was to create a Generative Adversarial Network (GAN) model to simulate the passage of a particle through matter. Why GANs? I’ll explain that later.
Since I had never heard of GANs, I had to learn about them. First, I read Ian Goodfellow’s paper, and I was really impressed by the results! The generated images looked like real ones. It’s amazing because, to my mind, the idea is pretty simple. However, to get results you need hands-on experience; only practice makes it stick. So, second, I went through Stanford’s CS231n course. I highly recommend this computer vision course! After reading through the notes I was ready to try building my own model.
To do fancy stuff in machine learning you need to be familiar with your dataset. The main difference between our dataset and the usual vanilla examples is that the images are three-dimensional. One dataset includes around 50,000 events. As you can imagine, building such a model requires substantial computational resources, which are limited. The dataset can be regarded as “images” of a shower in a detector, and using GANs we can produce similar images. What is also cool is that the discriminator can not only classify images as fake or real but can also predict the particle type, for example proton or electron. That is more complicated, so we decided to implement a classical GAN model first.
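To get a feeling for why three-dimensional images are demanding, here is a rough back-of-the-envelope estimate of the raw storage alone. The event count comes from the text above; the 25 x 25 x 25 image shape and the 4-byte float per cell are assumptions chosen purely for illustration, not the real detector geometry.

```python
def dataset_bytes(n_events, shape, bytes_per_value=4):
    """Raw size in bytes of n_events images with the given shape."""
    cells = 1
    for dim in shape:
        cells *= dim
    return n_events * cells * bytes_per_value


N_EVENTS = 50_000        # "around 50,000 events", from the text
SHAPE_3D = (25, 25, 25)  # hypothetical image shape, for illustration only
SHAPE_2D = SHAPE_3D[:2]  # the same image with the third axis dropped

size_3d = dataset_bytes(N_EVENTS, SHAPE_3D)
size_2d = dataset_bytes(N_EVENTS, SHAPE_2D)

print(f"3D dataset: {size_3d / 2**30:.2f} GiB")
print(f"2D dataset: {size_2d / 2**30:.2f} GiB")
print(f"ratio 3D/2D: {size_3d // size_2d}")
```

The third axis multiplies both the storage and the work done by every convolution by the depth of the image, which is where most of the extra cost comes from.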
As you can see, we have a really large amount of data, so we decided to implement the model using neon, a new deep learning framework from Intel that is committed to best performance on all hardware. It was new to me, and I had to quickly learn how to work with it.
Our model is based on convolutional and transposed convolutional layers. One more point: we already had some prototypes of the model, but in those the generator consisted of convolutions and upsampling instead of transposed convolutional (so-called deconvolutional) layers. We always have to track each layer’s output dimensions in order to get an image of the desired size. I designed a new model on a piece of paper and was ready to start the implementation.
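The paper-and-pencil bookkeeping of layer output sizes can also be done with a few lines of code. The sketch below uses the standard size formulas for a convolution and a transposed convolution along one axis (no dilation, no output padding). The layer lists are hypothetical and only show the idea; they are not our actual architecture, and the helper names are made up for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride=1, padding=0):
    """Output length of a transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_shapes(shape, layers):
    """Return the 3D shape after every (kind, kernel, stride, padding) layer."""
    shapes = [tuple(shape)]
    for kind, kernel, stride, padding in layers:
        fn = conv_out if kind == "conv" else deconv_out
        shape = tuple(fn(s, kernel, stride, padding) for s in shape)
        if min(shape) < 1:
            raise ValueError(f"{kind} layer shrinks the image to {shape}")
        shapes.append(shape)
    return shapes


# Hypothetical generator: grows a 1x1x1 seed into a 16x16x16 image.
generator = [
    ("deconv", 4, 1, 0),
    ("deconv", 4, 2, 1),
    ("deconv", 4, 2, 1),
]

# Hypothetical discriminator: shrinks the image back to a single value.
discriminator = [
    ("conv", 4, 2, 1),
    ("conv", 4, 2, 1),
    ("conv", 4, 1, 0),
]

for name, layers, start in (("generator", generator, (1, 1, 1)),
                            ("discriminator", discriminator, (16, 16, 16))):
    print(name, " -> ".join(str(s) for s in trace_shapes(start, layers)))
```

Checking the shapes this way before writing any layer code makes it easy to see which kernel, stride and padding values land on the desired image size.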
First, we had to use 3D filters in the convolutional layers. Not every machine learning library has 3D convolutional or deconvolutional layers, but neon does. Software engineers from Intel were also working with us and helping to implement various functions. Second, although the idea of GANs is pretty simple, making your model work is a complicated task. For me, the notes on GAN tricks were really useful; they are a must-read if you are starting to work with GANs. Unfortunately, we ran into so-called mode collapse, where the generator keeps producing only a narrow set of very similar outputs. It’s really painful because it seems to happen almost at random. You can find several ways to fight mode collapse in my other blog post.
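To show how simple the basic idea is, here is a minimal sketch of the two GAN losses in plain Python, working directly on discriminator output probabilities. It uses the non-saturating generator loss and optional one-sided label smoothing, two widely cited training tricks; that these are among the tricks in the notes mentioned above is an assumption, and the function names are made up for this example.

```python
import math


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake, real_label=1.0):
    """d_real / d_fake: discriminator probabilities on real / generated images.

    Setting real_label to e.g. 0.9 gives one-sided label smoothing.
    """
    real_term = sum(bce(p, real_label) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss: the generator wants D(G(z)) close to 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# A confident discriminator: low loss for D, high loss for G.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(generator_loss([0.1, 0.2]))

# A fooled discriminator: the generator loss drops.
print(generator_loss([0.9, 0.8]))
```

In a real model these probabilities come from the discriminator network and the losses are minimized in alternation, which is exactly the part that is hard to keep stable.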
To improve the performance, we decided to change our model to a so-called conditional GAN. This technique helps to get more reasonable results by feeding conditions about the data into “hidden” generator layers. In our case, the condition is the particle type. It’s really interesting, because the conclusions in the article are empirical rather than mathematically rigorous. Unfortunately, here we ran into technical problems with the implementation again, and the Intel developers are in touch with us about them. I hope this will help our research!
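The conditioning idea can be illustrated without any deep learning library. The sketch below shows the simplest variant, where a one-hot encoding of the particle type is appended to the noise vector at the generator input; feeding the condition into a hidden layer, as described above, would do the same concatenation on an intermediate activation instead. The particle list, the latent size and the function names are assumptions made for this example.

```python
import random

# Particle types mentioned in this post; the encoding order is an assumption.
PARTICLE_TYPES = ("electron", "proton")


def one_hot(particle):
    """Encode a particle type as a one-hot list."""
    if particle not in PARTICLE_TYPES:
        raise ValueError(f"unknown particle type: {particle}")
    return [1.0 if p == particle else 0.0 for p in PARTICLE_TYPES]


def generator_input(particle, latent_dim=8, rng=None):
    """Build a conditional generator input: Gaussian noise plus the condition."""
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return noise + one_hot(particle)


rng = random.Random(0)  # fixed seed so the example is reproducible
z_electron = generator_input("electron", rng=rng)
z_proton = generator_input("proton", rng=rng)

print(len(z_electron), z_electron[-2:])
print(len(z_proton), z_proton[-2:])
```

The discriminator receives the same condition alongside the image, so it can penalize a generated shower that does not match the requested particle type.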
That’s a short review of the past weeks. I have a unique opportunity to work with professionals, gain new knowledge and improve my skills! I really want to thank my supervisor Sofia Vallecorsa, Andrea Zanetti and Andrea Luiselli from Intel, and my colleague Gulrukh Khattak for their highly professional assistance and support!