Week of 5/19 – 5/26/2024
This week, what I have been working on for the past few weeks finally came to fruition. I was able to get TensorBoard set up, so now, while the model is training, anyone with the link can log in to the TensorBoard session and watch the training progress for themselves. TensorBoard is tracking Generator and Discriminator loss, along with other loss scores, to show how the training is progressing over the ticks. Right now, the model has trained with the labels for 107 ticks, for a total of 451.1k images, which is still short of the 2,500,000 images that I want to train on before moving from the Coco Processed dataset to the Flickr Faces HQ dataset.
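To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above and assumes every tick covers about the same number of images, which may not hold exactly in practice. The variable names are just for illustration.

```python
import math

# Figures quoted in the post.
ticks_done = 107
kimg_done = 451.1      # thousands of images seen so far
kimg_target = 2500.0   # 2,500,000 images

# Assumption: every tick covers roughly the same number of images.
kimg_per_tick = kimg_done / ticks_done
kimg_left = kimg_target - kimg_done
ticks_left = math.ceil(kimg_left / kimg_per_tick)
percent_done = 100.0 * kimg_done / kimg_target

print(f"{kimg_per_tick:.2f}k images per tick")
print(f"{percent_done:.1f}% of the target reached")
print(f"about {ticks_left} more ticks to go")
```

Under that assumption, this works out to a little over 4k images per tick, with a bit less than 500 ticks left before the switch to the Flickr Faces HQ dataset.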
To get TensorBoard tracking the training correctly, I had to make some changes to the network over the past few weeks. For starters, I saw how the NVIDIA team was packaging their datasets for easy training, and I took a few notes from them, adjusting our image dataset to be a zip file that has the images separated into directories of 1,000 images each, with a labels file that supplies the label for each image ID. Then I went through to make sure that the model was tracking the information needed for logging, like the various loss scores. Once that was done, I made sure the Docker image was run with a port open for traffic in and out of the container, which allows others to tunnel in and view the training live. Once these changes were implemented, we were able to view the training from that point on, and luckily the setup has been stable ever since.
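The zip layout described above can be sketched roughly like this. This is a minimal illustration, not the actual packaging script: the function name `build_dataset_zip`, the five-digit folder names, and the `dataset.json` file with a `labels` list of `[file, label]` pairs are all assumptions made for the example, and the demo uses placeholder bytes instead of real images.

```python
import io
import json
import zipfile

def build_dataset_zip(images, labels, per_dir=1000):
    """Pack images (name -> bytes) into a zip, `per_dir` images per folder.

    `labels` maps image name -> integer class ID. The folder naming and the
    `dataset.json` layout are assumptions made for this illustration.
    """
    buf = io.BytesIO()
    entries = []
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for idx, name in enumerate(sorted(images)):
            # Images 0-999 go in 00000/, 1000-1999 in 00001/, and so on.
            arcname = f"{idx // per_dir:05d}/{name}"
            zf.writestr(arcname, images[name])
            entries.append([arcname, labels[name]])
        # One labels file at the top level, keyed by path inside the zip.
        zf.writestr("dataset.json", json.dumps({"labels": entries}))
    return buf.getvalue()

# Tiny demo with placeholder bytes instead of real image data.
fake_images = {f"img{i:08d}.png": b"\x00" for i in range(2500)}
fake_labels = {name: i % 80 for i, name in enumerate(fake_images)}
archive = build_dataset_zip(fake_images, fake_labels)

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    names = zf.namelist()
    meta = json.loads(zf.read("dataset.json"))
print(len(names), "entries;", len(meta["labels"]), "labels")
```

Keeping the labels in a single file keyed by the path inside the zip means the loader can look up a label from the image path alone, without parsing anything out of the file names.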
I am still getting a concerning error. When the training network is being built, the images are coming in through the tensor correctly as (3, 256, 256). However, when the labels are pulled, they are returning a tensor shape of (0), which leads me to believe that the network is not pulling the label names correctly. I need to spend some time in the upcoming week figuring this out, because we cannot have the model train on the Coco dataset without pulling the labels per image ID for that dataset.
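One place to start is checking the archive itself before blaming the network. The sketch below is only a sanity check under assumptions, not a diagnosis: it assumes the labels live in a top-level `dataset.json` containing a `labels` list of `[file, label]` pairs, and `check_labels` is a hypothetical helper written for this example. If the labels file is missing, empty, or keyed by paths that do not match the image paths, a loader could plausibly end up with no label for any image.

```python
import io
import json
import zipfile

IMAGE_EXTS = (".png", ".jpg", ".jpeg")

def check_labels(zip_file, labels_name="dataset.json"):
    """Return a list of human-readable problems found in the archive."""
    problems = []
    with zipfile.ZipFile(zip_file) as zf:
        names = set(zf.namelist())
        if labels_name not in names:
            return [f"{labels_name} is missing from the archive"]
        labels = json.loads(zf.read(labels_name)).get("labels")
    if not labels:
        return [f"{labels_name} has no 'labels' entries"]
    labelled = {fname for fname, _ in labels}
    images = {n for n in names if n.lower().endswith(IMAGE_EXTS)}
    # Paths must match exactly, including the folder prefix.
    for n in sorted(images - labelled):
        problems.append(f"image without label: {n}")
    for n in sorted(labelled - images):
        problems.append(f"label without image: {n}")
    return problems

# Demo: one image is labelled, one is not.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("00000/a.png", b"\x00")
    zf.writestr("00000/b.png", b"\x00")
    zf.writestr("dataset.json", json.dumps({"labels": [["00000/a.png", 3]]}))
buf.seek(0)
issues = check_labels(buf)
print(issues)
```

If a check like this comes back clean on the real archive, the next thing to look at would be whether the training run is actually configured to load labels at all.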
Thank you for joining me this week. I will see you next week.
-Will Hoover


