Oh no, I forgot to save the image preprocessor config when training the vocabulary
No matter: we're just using the defaults from CLIPFeatureExtractor, so I can copy the preprocessor_config.json from OpenAI CLIP (they're the same, and uncopyrightable)
...Oh no, it's not actually trying to read the file, is it? #PDDiffusion
And now the U-Net training loop is choking because CLIP wants everything on the CPU for some reason...
Might as well just move the CLIP step into dataset loading at this point
Oh no.