If I jump into the actual set of images it read from, a LOT of them are maps. Which explains why the model trained on the from-scratch CLIP likes to draw maps, but not why the one trained on OpenAI CLIP generates pixel nonsense.

Actually, no, it doesn't explain it, because these are all clearly labeled as maps, and the model should be able to distinguish between the maps, portraits, and landscapes in the set.

Hey, remember how was spitting out nothing but maps?

Well, I retrained on OpenAI CLIP, and now it's spitting out nothing but nonsense. The attached image is supposed to be a "landscape painting of a forest".

90k finished today, and the results are...

Uh... it literally forgot how to draw anything that isn't a map. The prompt for this was "forest landscape painting". All the training data from the 29k version is still there.

I'm retraining with OpenAI CLIP instead of my own from-scratch model to try and narrow down the cause of this model forgetting literally everything.

Firefox puts articles on their new tab page, because someone thought it was a good idea to buy Pocket.

I usually don't mind them, but sometimes they trend towards unnerving:

I decided to screw the VAE training for now and just start scraping images again

I have to babysit the scraper because the wikitext parsing still hits corner cases and crashes because, say, this CHEEKY FUCKER decided he was going to be painted on 176X


I thought the X years were only invented in 200X
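
For the curious, here's roughly how a parser could tolerate dates like "176X" instead of crashing. This is a minimal sketch, not the scraper's actual code: `parse_fuzzy_year` is a hypothetical helper, and it assumes four-character years where unknown trailing digits are written as `X` or `?`. It returns an inclusive range, or `None` when the text doesn't fit, so the caller can skip the record rather than die.

```python
import re

# Hypothetical helper: known leading digits followed by up to three
# unknown trailing digits written as X, x, or ?.
_FUZZY_YEAR = re.compile(r"(\d{1,4})([Xx?]{0,3})")

def parse_fuzzy_year(text):
    """Return (earliest, latest) for '1765' or '176X', else None."""
    match = _FUZZY_YEAR.fullmatch(text.strip())
    if match is None:
        return None
    digits, unknown = match.groups()
    if len(digits) + len(unknown) != 4:
        return None  # only four-character years are handled in this sketch
    earliest = int(digits + "0" * len(unknown))
    latest = int(digits + "9" * len(unknown))
    return earliest, latest

print(parse_fuzzy_year("176X"))
print(parse_fuzzy_year("1765"))
print(parse_fuzzy_year("circa"))
```

Returning a range keeps the decade information usable for labeling even when the exact year is unknown.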

I'm a decades-long loyal Dropbox user. I specifically use Dropbox because I have one foot in every tech ecosystem (incl. Linux, which nobody else bothers to support). I do not want to migrate all that data over to another service if I can help it.

No, and stop bugging me about this.

Neural networks are exceptionally good at pattern recognition. Training involves running a forward pass from the input to the output neurons, calculating the loss, and then backpropagating the error to update the weights.
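
To make the order of those steps concrete, here's a minimal sketch in plain Python. It's a single linear "neuron" fit with gradient descent, not a real network, and the names (`train`, `lr`, `epochs`) and the toy data are made up for illustration. The gradients are written out by hand, which is what backpropagation computes automatically for deeper models.

```python
def train(data, lr=0.05, epochs=500):
    """Fit y = w * x + b to (x, y) pairs using mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    loss = 0.0
    for _ in range(epochs):
        loss = 0.0
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            pred = w * x + b             # 1. forward pass
            err = pred - y
            loss += err * err / n        # 2. loss: mean squared error
            grad_w += 2.0 * err * x / n  # 3. backward pass: d(loss)/dw
            grad_b += 2.0 * err / n      #    and d(loss)/db
        w -= lr * grad_w                 # 4. weight update
        b -= lr * grad_b
    return w, b, loss

# Toy data drawn from y = 2x + 1, so training should recover w ~ 2, b ~ 1.
data = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
w, b, loss = train(data)
print(round(w, 3), round(b, 3))
```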

You know big tech has pissed people off when Y Combinator and the Writers Guild are sponsoring the same petition

And also... Type-Moon? Like, the "Visual Novels You Heard About On 4chan In 2006" company?

"painting circa 1800 portrait of a paintress oil on canvas"

So, CLIP isn't broken after all. 's label set is so narrow and with so many specific phrases that prompt engineering is hilariously critical to getting anything useful out of it - even with the improved wikitext parser. Descriptions aren't good enough.

Definitely going to have to build a manual labeling tool at some point, because there's entire styles of things in the dataset that you just can't recall right now.

"a dense forest covered in fog and mountains"

Yes, this is exactly what I wanted, .

So, 's U-Net finished training. I handed it four prompts and it actually spat out vaguely-related images.

- "1800s portrait" - It doesn't quite understand figures yet, but it at least knows chiaroscuro.
- "a historical portrait of king george" - This looks like an illustration of a savannah or grasslands
- "landscape painting of a forest" - A blue cave. Is this the 9th circle of hell?
- "a guinea pig" - A map. I guess it assumes I wanted the country of Guinea?

Of course, it runs slow as heck and it has all sorts of fun graphical glitches
