Looks like there are many improvements to their image gen, like multi-turn generation, text rendering, and better character consistency
That's pretty wild.
Image:
Prompt:
A wide image taken with a phone of a glass whiteboard, in a room overlooking the Bay Bridge. The field of view shows a woman writing, sporting a tshirt wiith a large OpenAI logo. The handwriting looks natural and a bit messy, and we see the photographer's reflection. The text reads: (left) "Transfer between Modalities: Suppose we directly model p(text, pixels, sound) [equation] with one big autoregressive transformer. Pros: * image generation augmented with vast world knowledge * next-level text rendering * native in-context learning * unified post-training stack Cons: * varying bit-rate across modalities * compute not adaptive" (Right) "Fixes: * model compressed representations * compose autoregressive prior with a powerful decoder" On the bottom right of the board, she draws a diagram: "tokens -> [transformer] -> [diffusion] -> pixels"
Not sure, this is what I got with the same prompt...
Ha, you are right, it was listed as an additional option and I missed that. Thanks for pointing that out. Here we go, pretty close to yours... Wow, it is much better...
these are all fuckin mindblowing to me
Try logging out and back in, and make sure 4o is selected so that it says this:
The perfect angle of the bridge in the background is the giveaway
How about this one 😂
Looks legit lol
This is going to be great for storyboards
So, does it make sense for the person taking the photo to be reflected like he is? I'd have to actually set up this scenario in order to figure that out...