Week of 5/5 – 5/12/2024
This week I found myself working from the road while traveling to watch my sister receive her MBA from the University of Illinois Springfield. It was wonderful to see, but it brought a slew of logistical issues that I have little experience with. Traveling with my laptop is not something I'm used to, though it didn't turn out to be very difficult. Finding usable internet during the trip, however, was. This opened my eyes to how hard it is to be productive while traveling, and to how important it is to spend time before the trip making sure you have everything you will need to work on the road.
When I had free time, I kept improving the network so that it accepts inputs through a StyleGAN-T architecture, adapting it to also include the CLIP image encoding class, which in theory would let the user supply an image as a base prompt for the generator. This would further extend what FaceCraft can do. The team is working on various features that depend on the architecture I am building around the StyleGAN network. For training we are going to use the CelebA dataset, which has attributes attached to each image ID, such as smiling, brown hair, and male.
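As a rough sketch of how those CelebA attributes could become text prompts for training: CelebA marks each attribute as 1 (present) or -1 (absent) per image, so the present ones can be joined into a short description. The function name and the prompt wording here are my own assumptions for illustration, not the code we are actually running.

```python
# Hypothetical helper: turn CelebA-style binary attributes into a text prompt.
# CelebA stores each attribute as 1 (present) or -1 (absent) for every image.
def attributes_to_prompt(names, values):
    """Build a prompt from attribute names and their 1 / -1 flags."""
    present = [
        name.replace("_", " ").lower()
        for name, value in zip(names, values)
        if value == 1
    ]
    if not present:
        return "a photo of a face"
    return "a photo of a face, " + ", ".join(present)


# Example row using a handful of real CelebA attribute names.
names = ["Smiling", "Brown_Hair", "Male", "Eyeglasses"]
values = [1, 1, 1, -1]
prompt = attributes_to_prompt(names, values)
print(prompt)
```

The prompt template is arbitrary; the useful part is that each image ID ends up paired with text that the CLIP text encoder can consume.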
One of our team members, Tem, has also taken it upon himself to adapt a pre-trained model to take in an image and describe it. He is going to feed that model the Flickr-Faces-HQ (FFHQ) dataset. This dataset does not come with prompt text or descriptors, so the model will append each image ID and a short description of the image to a text document, which will then be used to further train the generator to produce faces that match the request.
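The caption file described above could look something like the sketch below. This is only an illustration under assumptions: `describe` stands in for whatever pre-trained captioning model Tem ends up using, the image IDs are made up, and the tab-separated layout is my guess at a convenient format.

```python
import csv
import io


def append_captions(image_ids, describe, out_file):
    """Write one 'image_id<TAB>description' row per image to out_file."""
    writer = csv.writer(out_file, delimiter="\t")
    for image_id in image_ids:
        writer.writerow([image_id, describe(image_id)])


# Placeholder for the real captioning model (hypothetical).
def fake_describe(image_id):
    return f"placeholder description for {image_id}"


# An in-memory buffer keeps the example self-contained; a real run would
# open the text document in append mode instead.
buffer = io.StringIO()
append_captions(["00000.png", "00001.png"], fake_describe, buffer)
lines = buffer.getvalue().splitlines()
print(lines)
```

Keeping one row per image makes it easy to join the descriptions back to the images by ID at training time.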
One concern I have is the size of the final model and whether we will be able to host it. Adding more datasets increases the training time, but it also improves the quality of the images and the range of what the model can generate, among other benefits. The problem is that if the model grows too large, we will not be able to run the end product, which would negate the entire project. So far, producing smaller 128×128 images hasn't caused many issues, but anything larger could bring problems ranging from running out of memory to long generation times for a prompted image.
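To put rough numbers on that concern, here is a back-of-the-envelope sketch. It assumes 3-channel images stored as 32-bit floats and a made-up parameter count. It only covers the raw output tensor and the stored weights, not the intermediate activations, which usually dominate memory in practice.

```python
def image_batch_bytes(height, width, channels=3, batch=1, bytes_per_value=4):
    """Bytes needed to hold a batch of images as 32-bit floats."""
    return height * width * channels * batch * bytes_per_value


def weights_megabytes(num_params, bytes_per_param=4):
    """Approximate size of the model weights in megabytes."""
    return num_params * bytes_per_param / 1e6


small = image_batch_bytes(128, 128)
large = image_batch_bytes(256, 256)
ratio = large / small  # doubling each side quadruples the pixel count

# Hypothetical parameter count, used only for illustration.
example_weights_mb = weights_megabytes(100_000_000)

print(small, large, ratio, example_weights_mb)
```

The takeaway is that doubling the side length of the output roughly quadruples the per-image memory, which fits with 128×128 being comfortable while larger sizes start to hurt.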
Thank you for joining me this week!
I’ll see you next week.
-Will Hoover


