Stable Diffusion and AI stuff

Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Stable Diffusion and AI stuff

Post by Winnow »

So last weekend I started looking into AI-generated art and went way down that rabbit hole.

First, there are a shit-ton of web-accessible AI art sites out there; some of the more popular ones are Midjourney and Blue Willow, and even Bing and Google have them. I'm here to talk about the Stable Diffusion versions you can install on your own PC, so you don't have to pay or rely on the cloud to generate your images. Midjourney is fine, I'm sure, if you want to go that route.

Stable Diffusion is amazing. After playing around with it for a week I still can't stop fucking around with it.

I have three front ends installed: Auto1111, Invoke AI and Easy Diffusion

Auto1111 - most popular, with the best-supported extensions, LoRA, styles, etc., but I'm not a fan of the UI

Invoke AI - better UI than Auto1111

Easy Diffusion - newest and my favorite of them. Also the easiest to install and work with for newbs, while at the same time rapidly gaining features. I use this one as much as I can. I only use Auto1111 for some LoRAs that aren't quite supported on Easy Diffusion yet (LoRA support is in beta right now, so we're talking a week before they're implemented). The install link is below the videos in this post.

All of these work the same way: through prompts, you tell it what you want to create, and then the AI takes over and generates as many images as you want, all different. When you find a version you like, you can fine-tune it by keeping the same "seed", and then the AI makes smaller changes.
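
If you're curious what these front ends are actually doing under the hood, here's a minimal sketch of that prompt-then-lock-the-seed loop using the open-source diffusers library. This is my illustration, not what Easy Diffusion literally runs (they all wrap equivalent machinery), and the model id is just one common example:

Code: Select all

import torch
from diffusers import StableDiffusionPipeline

# Load a base Stable Diffusion model (example model id).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Exploration: no seed given, so every call produces a different image.
image = pipe("1 goblin in a cell, volumetric lighting").images[0]

# Found one you like? Lock its seed so further tweaks keep the same layout.
gen = torch.Generator("cuda").manual_seed(59872975)  # example seed
image = pipe("1 goblin in a cell, volumetric lighting, rats, straw",
             generator=gen).images[0]
image.save("goblin.png")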



The above video is an excellent way to see how the use of prompts impacts what the AI generates. He starts out with a basic prompt, then modifies it step by step so you can see how the image progresses.



This video gives some suggestions on things you can do to impact your AI creations.

Installing a Stable Diffusion front end requires a lot of dependencies (Python, etc.). I recommend using Easy Diffusion:

https://github.com/cmdr2/stable-diffusion-ui

It has a nice easy installer that takes care of most of that stuff for you.

Some great resources:

For models (a model is a resource file trained on images; SD 1.5 and SD 2.0 are the standard defaults, but you can get a lot of models fine-tuned for your interests here): https://civitai.com/

For prompts:

https://aipromptguide.com/

Prompts cover a variety of things: lighting effects, details, textures, etc. You can also use the styles of something like 20,000 artists.

For example, take the famous fantasy artists the Brothers Hildebrandt: you could apply the style "Brothers Hildebrandt" for their combined work, or just an individual brother, Greg Hildebrandt or Tim Hildebrandt, and the AI will use their style as part of your generated image.

You can also just say "oil painting" or "airbrushed" and the AI will use that style. You can combine/mix two artists, various styles, etc.

You have both positive and negative prompts. Here's an example taken from this model:

https://civitai.com/models/4201/realistic-vision-v13

Realistic Vision V1.3
RAW photo, a close up portrait photo of 26 y.o woman in wastelander clothes, long haircut, pale skin, slim body, background is city ruins, (high detailed skin:1.2), 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3


Negative Prompt:
(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
The negative prompts help remove typical unwanted things from the image you're trying to generate.
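
For the curious, here's how a negative prompt plugs in at the code level; a hedged diffusers sketch reusing the pipe from the earlier snippet (the front ends expose the same idea as a separate "negative prompt" box):

Code: Select all

# The prompt pulls the image toward these terms...
# ...while the negative prompt pushes it away from these.
image = pipe(
    prompt="RAW photo, close up portrait photo of a woman in wastelander "
           "clothes, soft lighting, high quality, film grain",
    negative_prompt="deformed iris, cgi, 3d, render, sketch, cartoon, text, "
                    "cropped, worst quality, jpeg artifacts, bad anatomy, "
                    "extra fingers, blurry",
    guidance_scale=7.5,  # CFG scale; more on this later in the thread
).images[0]

One note: the (term:1.2) weighting syntax in the example above is an Auto1111 convention; plain diffusers doesn't parse it, which is why I dropped the weights here.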

If you go to the link above and look at the images, you can click the little "i" icon in the bottom right of each picture to see the exact modifiers used to create it. If you set all the parameters the same, you probably won't get exactly the same image, but you'll get very close, as the AI still does its thing.

Note for Aslanna: this will take everything your 3080ti has got to give. It sucks down VRAM and GPU power like nobody's business. Totally justifies the 3080ti, and even a 3090/4090 with even more VRAM.

You can train your own models. That's not the easiest thing to do yet, but I tried it. To do it, you take like 15 quality face shots (or whatever you're training) of a person, then feed them to the trainer in Auto1111 or (what I use) the standalone Kohya trainer, and it starts working: the 15 images are processed about 1,500 times total to produce the end result. It takes some time and processing muscle, but for what I tried, it worked great. I could recognize the face when I used the model to make images.
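
To put numbers on that "15 images processed about 1,500 times": trainers like Kohya's typically encode a repeat count per dataset folder, and the total work is roughly images x repeats x epochs. A back-of-envelope sketch (the folder-naming convention is kohya_ss's as I understand it):

Code: Select all

num_images = 15    # face shots in the dataset
repeats = 100      # e.g. a kohya-style dataset folder named "100_myface"
epochs = 1
batch_size = 1

image_passes = num_images * repeats * epochs   # 15 * 100 * 1 = 1500
optimizer_steps = image_passes // batch_size   # 1500 steps at batch size 1
print(image_passes, optimizer_steps)
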
-----

I haven't included any of the images I created here, but they are amazing. I'm sold on AI, and just over the past week of following developments, things are advancing exponentially.

It won't be long (my guess: 5-10 years, the way AI is training AI faster and faster) before you'll use something like ChatGPT combined with a video version of Stable Diffusion and tell it: "Create a five-star movie, 120 minutes long, theme: ninjas fighting pirates, starring Brad Pitt (with one wooden leg), Johnny Depp, a 30-year-old John Wayne, and (myself) as the villain, set on Mars, video format: virtual reality", and you'll get a watchable movie with an original script.
Aslanna
Super Poster!
Posts: 12324
Joined: July 3, 2002, 12:57 pm

Re: Stable Diffusion and AI stuff

Post by Aslanna »

I've dabbled a bit with ChatGPT but not any with the image generators.

Is there an AI for sound/music generation?

What will be cool is when a game engine, such as Unreal, is fully integrated with AI. I'm not sure that will happen with Unreal though, since they have a marketplace for assets and such; if people can generate their own, it dilutes that market.

But imagine being able to create a game with nothing more than an idea. There will probably be some manual tweaking after the fact but the bulk of the work wouldn't require much artistic skill.

Sidebar: The recent Unreal 5.2 demos, in particular the motion capture, were pretty impressive.
Have You Hugged An Iksar Today?

--
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »

Aslanna wrote: March 29, 2023, 12:57 pm Is there an AI for sound/music generation?
I saw one voice AI that used your voice to sing in a different language. You could take a song in a language foreign to you and make it sound like your own voice singing it in that language.

I can't remember where I heard it, but they were talking about creating songs, mixing styles like a prompt: "write a love song using trance and country combined".

As for AI art, Stable Diffusion is still really impressive. First, I now recommend Automatic1111 as the GUI for Stable Diffusion. Easy Diffusion might be the easiest to set up, but Automatic1111 is where 90% of people are, and where all the extensions and support are as well. Easy Diffusion is still working on adding LoRA support, better upscalers, etc., so it might be good in the future, but you need that functionality for anything serious. The easiest way to install Automatic1111 is through GitHub; I did see a standalone installer, and if you're good with command lines, there are instructions for that as well. Make sure you choose to add Python to PATH when installing it; that seems to be the number one thing that screws people up. These things also work on Linux, etc., so it's not your normal install process.

As an example of the wide variety of outputs, some person used a single prompt asking for "1 goblin in a cell" and ran it through 79 different Stable Diffusion models, and this is what he got:


Code: Select all

https://imgur.com/a/ek6kU3L
Go to that URL to see all the images, though it looks like this forum has an imgur plugin, because I can see and scroll the images embedded in this post. Interesting.

A couple of things regarding the above example:

Each set of four images you see is from the same model. You can see right from the very first set how different the results can be, even asking the same model with the same prompt.

Basically there are several settings when you generate an image:

Model: this is the main file, trained on thousands of images, and as you can see, depending on how it's trained, the results are quite different.

CFG Scale: this adjusts how strictly the image adheres to the prompt's instructions. 7-8 is the default; raise the number for stricter prompt adherence, while a lower number gives the AI more freedom and more random results (which can be interesting).

Sampling Steps: sampling steps determine the detail and how many "passes" the generated image gets. Depending on the sampling method (see the next description), something like Euler only needs 20 steps to create an image, but a more advanced sampler like DPM++ SDE Karras gives better results and usually needs more steps, like 40. One trick to save time until you find an image you want to tweak: generate a batch of, say, 6 Euler images, and when you find one you like, send it from text2image to image2image processing. There you can make smaller generated modifications while using a better sampler that increases the quality, and after that you can use upscalers to refine the sharpness and resolution even more. Using this method you can generate a lot of images in a short time before spending long cycles on the one you want to keep and perfect.

Sampling Method: Euler, DDIM, DPM, etc. As with those 79 model image sets above, results are also influenced by the sampler: some are better at photorealistic images, others better for art, anime, etc. What's crazy is that those 79 sets of 4 images in the link are most likely all based on a single sampler; changing the sampler has a huge impact on the end result as well.

Denoising Strength: used in image2image. Once you have an image you like and send it to image2image (there's also inpaint, but I'll discuss that in a sec), you want to start narrowing down the changes as you generate more versions. The lower the denoise setting, the less the image will change, and vice versa. So if you like the basic concept (how the character is positioned, the background), use a lower denoise; if you want more change, like a bottle sitting on a table randomly morphing into a lamp, use a higher denoise. But also keep in mind you're still using CFG, so keeping that low will make fewer changes, and higher will do more intense things, like even swapping the gender of a person.

Image Size: when first generating an image, most of the default sizes are 512x512. Usually you switch between 512x768 and 768x512 depending on whether you want portrait or landscape (which again has a huge impact on the content of the image). Most models are trained on 512x512 images, so that size (allowing for portrait/landscape) is what you want to start with. Keep in mind, all the amazing AI images you see start out this size; they just get upscaled during the refinement stages.

Seed: this is important. When you generate an image, a random seed is used, 59872975 for example. If you like the image and want to generate more images similar to it, you can lock that seed, and your images will change but keep the same basic layout. Normally when you start, you keep it on "random", which randomizes the entire AI image while using the prompts, model, sampler, etc. as guides.
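
Here's how those settings map onto actual API parameters; a hedged diffusers sketch (the parameter names are diffusers' equivalents of the Auto1111 labels above, and the scheduler class is my best guess at the DPM++ SDE Karras analogue):

Code: Select all

import torch
from diffusers import (StableDiffusionPipeline,
                       StableDiffusionImg2ImgPipeline,
                       DPMSolverSDEScheduler)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

# Sampling method: swap the scheduler (a DPM++ SDE Karras-style sampler).
pipe.scheduler = DPMSolverSDEScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True)

gen = torch.Generator("cuda").manual_seed(59872975)  # Seed: locked, not random
draft = pipe(
    "1 goblin in a cell",
    num_inference_steps=40,   # Sampling steps
    guidance_scale=7.5,       # CFG scale
    height=768, width=512,    # Image size: 512x768 portrait
    generator=gen,
).images[0]

# Denoising strength lives in image2image: low strength = small changes.
img2img = StableDiffusionImg2ImgPipeline(**pipe.components)
refined = img2img("1 goblin in a cell, ultra detailed",
                  image=draft, strength=0.35, generator=gen).images[0]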

When you go to a model/LoRA/checkpoint site like CivitAI, you will see the authors' examples of what they created. If you look in the bottom right of the images on CivitAI, you will see an "i"; click on that and it tells you the exact prompts/negative prompts, base model, CFG scale, sampler, and seed they used to get the end result of that image. You can easily take that info and create a similar (not exact) image to what you see, as long as you use the same settings/model/sampler/seed.

Lora: LoRAs are trained specifically on a particular face/body or object (like a clothing style, a car, etc.). So with all those random images above, if you insert <lora:BradPitt:1.2> into the prompt when the image is generated, it will use Brad Pitt's likeness. You download the LoRAs from CivitAI; they are small files, ranging from about 9 MB to 140 MB.

Now say you also want your goblin, besides looking like Brad Pitt, to be in a Jedi robe. If there is a LoRA for that (an embedding style also works), you would use something like this with your prompts: <lora:BradPitt:1.2> <lora:JediRobe:0.9>. The number at the end is how much the AI is influenced by that LoRA; the higher it is, the more likely it appears in the image (it ranges from about 0.1 to 1.5). If there is no Jedi robe LoRA, you can take 50-100 images of Jedi in robes and train your own, and then apply a Jedi robe to anyone you want in the future. (It's a pain to figure out the training settings to begin with, but once you do, you're off to the races and everything else you train will be easy.)
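
In diffusers terms, stacking two LoRAs looks roughly like this (hedged sketch reusing the pipe from above; it needs a recent diffusers with PEFT-based LoRA support, and the file names are hypothetical stand-ins for downloads from CivitAI):

Code: Select all

# Load two LoRAs and weight them: the equivalent of writing
# <lora:BradPitt:1.2> <lora:JediRobe:0.9> in an Auto1111 prompt.
pipe.load_lora_weights("./loras", weight_name="BradPitt.safetensors",
                       adapter_name="bradpitt")
pipe.load_lora_weights("./loras", weight_name="JediRobe.safetensors",
                       adapter_name="jedirobe")
pipe.set_adapters(["bradpitt", "jedirobe"], adapter_weights=[1.2, 0.9])

image = pipe("1 goblin in a cell, wearing a jedi robe").images[0]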

Here's an example of a Star Wars Rebel pilot suit/helmet:

So while you can see how cool just a basic prompt like "1 goblin in a cell" can be, with so many interesting results, you can easily modify that result with something like:

prompt:

1 goblin wearing a collar, sitting in a cell with straw on the ground, rats, dungeon, ultra detailed, volumetric lighting, by Frank Frazetta

Negative prompt:

lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, bad hands, ugly hands, undetailed, deformed hands, deformed bodies, extra limbs

In the above example, you'd probably see some form of straw on the ground, a rat or two (maybe; just keep generating if you don't!). The lighting is important... there are so many types of lighting.

Lighting examples: Volumetric, Dramatic, Cinematic, Studio, Chiaroscuro, glowing, neon, holographic, bioluminescent, radiance, glistening, polished, satin, silken, reflective, mirrored, frosted, glittering, sparkling, shimmering, dappled, filtered, backlit, silhouetted, highlighted, low-light, spotlight, floodlight, rim light, shadowed, contrasted, moon lighting
...and then you can instantly switch between various styles, too many to list, but like oil painting, airbrush, pastel, photorealistic, 3D, and hundreds or thousands of art, anime, etc. styles.

You don't need to use artists, but they of course have a huge influence on the painting/illustration/photo style. You can even combine multiple artists. Frank Frazetta and Boris Vallejo were rival artists with totally different styles; if you use the prompt "by Frank Frazetta and Boris Vallejo", the AI will blend those two styles. I'm not sure if Boris is actually one of the sampled artists, but Julie Bell, who has a similar style, is. I'm not saying use those; there are 20,000 artists that were sampled...

Amazingly, with something like 20K artists it's already impressive, but you can also train your own.

Training:

You can train checkpoints, LoRAs, etc. LoRAs are the most popular and the easiest way to modify models, and you'll see an ass-ton of them on CivitAI.

People are training practically every artist you can find on sites like DeviantArt. If you don't find your favorite, you can make your own.

ALL OF THIS IS FREE

It's so lucky this AI is open source. Stable Diffusion is out there, and once you have it, it can't be taken away. Also, the base model SD 1.5 is still used more than the newer 2.0 and 2.1 models because it didn't weed out a lot of the porn poses scanned, so creating that kind of content with SD 1.5 is what you'll see mostly on CivitAI. So far there's been no compelling reason for people to change to the newer base models, even though they're based on higher resolutions, because 1.5 output can be upscaled anyway.

As with all new tech, AI art is driven by porn, so you'll see a lot of pretty women, etc. on sites like CivitAI. Keep in mind, the adult industry drives all visual innovation (VR, VHS, etc.). Don't let it diminish the amazing things you can do with AI art. People just happen to like to see whoever they want sucking a cock or being sucked, etc. sometimes. With AI, that's now trivially easy to do.

Getting past the adult stuff: as long as you have 10-15 decent quality pictures of anyone, you can create a very good model. That said, my younger pictures (70s and 80s) aren't that great in resolution, etc. You can use software like Gigapixel AI to drastically improve a picture, sharpen the edges, etc. before using it to train, though. I'm still going to try to find some images of my dad and create a model, and maybe insert a 30s-ish version of him into our family Xmas card or something.

Oh, also: if you use PNG as your output, all the prompt, model, CFG, seed, etc. settings are stored in the image. So you can go back to an image (or one someone else created), drag that photo to the PNG Info tab in Automatic1111, and it will extract all the data (model, CFG, seed, scaler, etc.). You can then send it straight to image2image, text2image, inpaint, etc. (download the model from CivitAI if you don't already have it) and start modifying without having to start from scratch.
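
If you want to pull that embedded data out yourself, it's just a PNG text chunk; Auto1111 stores everything under a key named "parameters", which Pillow can read (small sketch, assuming Pillow is installed):

Code: Select all

from PIL import Image

img = Image.open("goblin.png")
# Prompt, negative prompt, steps, sampler, CFG scale, seed, model hash...
print(img.info.get("parameters"))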

This is still in its infancy, but even over the past 6 months it's grown by leaps and bounds. If you don't feel like doing the work locally, you can use online generators like Midjourney, Playground AI, etc.; there are a bunch, some with a free, limited number of images you can generate. Midjourney is the most popular, but they cut off the free trial due to so many deepfakes being made : ) (the Pope in a white jacket and Trump being arrested are recent popular fakes). If you have a decent GPU, I'd recommend doing it yourself.

One of my favorite artists is Masamune Shirow (lots of NSFW stuff; don't Google at work). He's got extremely stylized, recognizable work. Someone scanned a few thousand of his pictures. Now, a couple of things regarding this: before you train on images, you run them through a captioning process. For anime-type stuff you use WD14, which uses the Danbooru archives to generate tags (again, don't go there from work, but if you go to that site and choose any image, you'll see a ton of descriptive terms for it: "blue hair, looking at viewer, earrings", etc., along with adult descriptions as well). When you start training, those captions are embedded into your LoRA, so when you prompt "earrings", it finds the scans that had that tag and has a better chance of using those in the generated image.

Anyway, because of the way that works, they included text files in the folders for all the artbooks, with the prompts for each picture. Now, you can't EXACTLY reproduce the artbooks, but you come damn close when you generate using those prompts. It's amazing how similar it looks. And of course, you can also create anything you want in that style, not just things similar to the original pictures. So if I wanted to make my own Masamune Shirow artbook with a theme of being on the moon in a futuristic city, I would include those prompts and then mess around with pose prompts: "sitting on a throne", "drinking from a glass", "tiger laying down", "full body", "wearing a tight green space suit". I can make my own themed art that looks incredibly good (the style, compared to the original artist), and I can generate hundreds of images, all different, instead of being limited to the 20-30 you normally see in an artbook. It really is amazing.

Also, a note: for all the high-quality anime art, you use a specific model and VAE (I didn't even talk about VAEs, yet another big factor in how images look). The model is called NAI, and it was leaked from a company, so that one you need to track down (pretty easy to do).

tldr: AI art is amazing. It sucks if you're an artist, because it will impact you, just like jobs in customer service, etc.: ChatGPT and the like are coming for those jobs soon. Not something to be mad about; I mean, what happened to all the horse people when cars arrived? They went away. It's going to happen; we just need to deal with it.
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »



I wanted to separate this from the extremely long post above regarding AI Art.

This is an interesting discussion about AI and where it's heading. The host speaks a little too much (I would have liked to hear the guest more), but I think it's a good watch. There are some interesting points made. One thing I don't think the host (Tom Bilyeu) gets, and the guest says it outright, is "The AI doesn't care what you think", when Tom is thinking of ways to keep from losing his job. Another interesting point is putting AI intelligence into perspective. In general, a moron has an IQ of 70 or lower, Einstein was 160, and the most intelligent people are in the low 200s or something. In a few years, the intelligence of AI will be like 1,000 times higher than a human's. It would be like comparing an ant's intelligence to Einstein's. We have no idea what AI will be doing when it gets to that level. Also think about how much thought we give to the survival of ants as we go about our daily lives. Being humans, we think from the perspective and intelligence of humans, but once AIs (various AIs, not just one) get that smart, who knows what they will do.

Another point made: what do you think AI would surmise about humanity based on scanning news from the internet? No matter what you are, left, right, etc., go to CNN, FOX, etc. and you see horrible things: always on the brink of mass destruction (wars currently going on), so much hate flooding the news. Seriously, step outside the mind of being human and there's not much reason to keep us around. The guest speaks about that, and about accepting AI because you can't stop it. Better to integrate. You can't regulate it, because if the US does, that will only mean China's AI gets stronger by itself. This is a one-way path, and we're at the bottom of the exponential growth of AI, which by its own nature will increase faster and faster as it teaches itself.

Another thing mentioned is that one of the first jobs to be lost is fashion modeling. I can totally see that. I can generate, in seconds, a female or male model of any race, look, etc. that is more beautiful than any real model and also photorealistic. There isn't much reason to have real models around when you can tailor your virtual model exactly as you want for practically free. Even more useful is if/when people start making their own models, which isn't hard. You will be going to Amazon, selecting a clothing item, a chair, etc., and clicking a button to see exactly how you'd look wearing it or sitting in it, in any environment you want.

Seriously, we need to mark 2023 and look back in 3 years, 5 years, and 10 years. We will be amazed at how the world has changed, more so than what happened between 2007 and 2017 after the iPhone was introduced. It's amazing tech that, due to its nature, advances at a way faster pace than any other tech development we've seen. We can laugh at the internet threads on this forum about speeds, streaming tech, etc. from 2002 to 2020: iPhone, Netflix, Amazon, LCD TVs, etc. I won't get into VR, but while it's dragging its feet, VR will be amazing soon, and with AI you will be creating incredible personalized worlds/experiences in photorealistic environments. The Oasis (from Ready Player One) will be a thing. Let's just hope open-source AI sticks around so we don't have lots of ads.

The guest does mention, and I've seen in multiple places, that a huge drop in the cost of energy is expected. It also seems to be the only thing that will save humanity, given negative population growth, etc.: AI/robots and super-cheap energy, so you don't really have to work that much, and losing your job will mean less.
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »



Not as broad in scope as the previously posted video: this YouTube guy discusses recent developments, and by recent, I mean he's discussing papers released in the past 72 hours. GPT4 was released a week or so ago, and it's already improved itself by 20% using a self-reflection technique. The particular video above also shows the workflow of the AI combining several sources to complete a task. In another part, he asks it to create a test, and when there's something wrong (two answers the same), GPT4 self-analyzes, figures out what the issue was, and creates a new test.

He also briefly talks about hardware chips being designed with the help of AI (Nvidia), which is speeding up the next round of GPUs, which will in turn increase the speed at which AI can operate: a circle of ever-increasing advancement.

I'm thinking if you want to keep your job (well, a lot of types of jobs), you'd better start learning how to use prompts effectively. Eventually (and that means soon when it comes to AI) we're going to have Star Trek-style interaction: casual voice conversation with a future ChatGPT, where you just talk to it, tell it what you want, and then banter back and forth verbally as you refine the requested output.
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »



I haven't even tried ChatGPT because I've been having so much fun with Stable Diffusion/Automatic1111.

Apparently you can install a ChatGPT4-class model locally on your PC. You can also install an uncensored version, which allows you to ask it anything and not get a response back saying it isn't allowed to answer.

Also, you can create your own characters that define how the chatbot interacts with you. The 2nd half of the linked video shows some examples. There's a Discord server full of characters. Obvious ones are master/slave relationships, dude-speak, raunchy girlfriend/boyfriend (or non-binary for Spang).

With this being open source, it's hard to see anyone stopping it. "OpenAI"... I suppose they could eventually close later versions, but it's pretty doubtful that would happen.

I mentioned before, I think you'll be able to create custom (quality) movies on any topic you dream up, starring any actors you want, or yourself, etc. It's a matter of time and processing power. AI art is here, AI stories are here; we just need the processing power and some development for video.

Interestingly, the CEO of OpenAI said one of the most popular uses of ChatGPT4 is creating kids' stories. As an example:

"create a kids story about pirates that find a spaceship and travel to Mars, include a robot parrot in the story, happy ending"

If you don't have an imagination you could use a site like this:

https://mycuprunsover.ca/writing-prompt-generator-kids/

Scroll down a ways and you can generate prompts for kids' stories based on age and fiction/nonfiction.

According to the CEO, kids can't get enough of these stories. Parents can ask their kid what they want their bedtime story to be about and generate it in seconds.
Aslanna
Super Poster!
Posts: 12324
Joined: July 3, 2002, 12:57 pm

Re: Stable Diffusion and AI stuff

Post by Aslanna »

Back in my day when I asked for a bedtime story I was given a book and told to figure it out.

Kids these days.
Have You Hugged An Iksar Today?

--
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »

xyz_grid-0011-1884248471.jpg
Here's an example of how varied a prompt's output can be based on settings. In the above image, the prompt was:

Prompt (I didn't create it myself; just took it from a Reddit post):
!Selfie of Link And Princess Zelda happy together, Zelda Breath of the Wild, kakariko village, evening, dark, light particle, very detailed eyes, upper body, detailed skin, 20 megapixel, detailed freckles skin, detailed, movie grain
Negative Prompt:
nsfw,(low quality, worst quality:1.4), bad anatomy, head out of frame, body out of frame, bad eyes, ((wings)), (horns), (bad-image-v2-39000, bad_prompt_version2, bad-hands-5, EasyNegative, NG_DeepNegative_V1_75T, bad-artist-anime:0.7)
STEPS

I used 5 different models (the model is the main file, trained with images) so you can see a sample of them. There are an ass-ton of models, LoRAs, etc. at https://civitai.com/. All free. In the first image I used the exact same prompt but changed the steps; steps are how much work the AI puts into creating the image, so more steps = more processing. I included 10 steps so you can see that it's too low and the image isn't fully developed. Between 20 and 50 steps it can vary, and more steps isn't always a better result.

As you can see, some models are specialized. You'll notice clockworkOranges_v28 had 2 females at step 10 and then threw in a 2nd female combined with Zelda in the middle of the step 50 result. That's because it's trained almost entirely on female images, most likely for anime/hentai.

Animatrix_v11 is one of my favorite models. You can see it generates a lot of detail at the higher step counts.

As you can see, just changing the steps can cause quite a difference in the image. That's only a single parameter.

Besides steps, there are some other main settings:

CFG = how closely the AI follows the prompts. Low CFG = more freedom for the AI, Higher CFG = strict adherence to the prompts

Sampling Method = combines with the model and makes a big difference in the outcome as well.

VAE = yet another thing that has a big impact on the image.

Seed - when you create an image, a random seed number is used, ranging into the millions. Each seed is different. If you like an image and want to tweak it, you can use the same seed; the general layout of the image will stay the same, and you can make minor adjustments in image2image or inpaint.

Prompt order - how you structure your prompt makes a big difference as well. If you say you want "A girl with a big ass and boobies", you might get a girl standing beside an oversized ass (donkey) and a booby bird.

Negative prompts are important as well. If I didn't put NSFW, Princess Zelda might be giving Link a BJ in one of those images above.

When I'm messing around with Stable Diffusion in Automatic1111, it makes me think of the multiverse theory. I feel like I'm missing out on quality images, so I keep generating more and more. The good news is that there's something called the X/Y/Z plot, and it's what I used to create the above comparison. You can instruct it to try various parameters, models, etc., then let it run to see the results. Even then, you can run the EXACT same parameters and still get a different image every time.
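
A homemade version of the X/Y/Z plot idea is just a couple of nested loops; a hedged diffusers sketch sweeping steps against CFG on a locked seed (assuming a pipe loaded like in the earlier sketches):

Code: Select all

import torch

steps_axis = [10, 20, 30, 50]
cfg_axis = [4, 8, 12, 16]
prompt = ("((best quality)), ((masterpiece)), (detailed), "
          "anthropomorphic cat, fat, headphones, player")

for steps in steps_axis:
    for cfg in cfg_axis:
        # Re-seed every cell so only steps/CFG vary, not the layout.
        gen = torch.Generator("cuda").manual_seed(1841584521)
        img = pipe(prompt, num_inference_steps=steps,
                   guidance_scale=cfg, generator=gen).images[0]
        img.save(f"grid_steps{steps}_cfg{cfg}.png")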

Once you get the hang of it and find models you like, you can start to run some typical trials of settings. When you find one you like, you just click a button and send it to image2image, then start generating smaller variations of it until you're almost satisfied, then send it to inpaint, where you can make more adjustments by painting over just a part of the image. Say you forgot to add NSFW to a prompt and someone's hooters are showing: you can paint over that area with a mask and prompt "bikini top", and the AI will regenerate the same image but add the bikini top where you painted. (It's not exact; you don't have to trace the area, just black out the general area with the mask.)
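
Inpainting has its own pipeline; a hedged sketch of the "bikini top" fix just described (the model id is one public SD 1.5 inpainting checkpoint, and mask.png is a rough white blob painted over the area to regenerate):

Code: Select all

import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

inpaint = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("render.png").convert("RGB")
mask = Image.open("mask.png").convert("L")  # white = repaint, black = keep

fixed = inpaint(prompt="bikini top", image=image, mask_image=mask).images[0]
fixed.save("render_fixed.png")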

xyz_grid-0016-1841584521.jpg
Bonus image! Another very simple prompt:
((best quality)), ((masterpiece)), (detailed), anthropomorphic cat, fat, headphones, player
This time I'm messing with the CFG setting; the higher the setting, the more it tries to force exactly what's in the prompt. Once again, you can see that clockworkOranges_v28 was trained mostly on female humans, as "fat cat" = fat girl with cat ears in most of the images, although CFG 8 produced an actual big-headed cat, so you never know! fantasticmixReal went the other route: a normal cat at low CFG, but it ended up as a chubby girl with cat ears at high, forced CFG levels.

Lots of fun to mess with this stuff! With all the variety you can get from so many settings, you really can narrow down what you want pretty fast if you choose the right models, etc.

Just a note: some of these images look pretty good, but they are just the first generation, untouched. Normally I would take one I like, upscale it while using a more detailed sampler, then upscale it again using another specialized upscaler. These are just the samples; while they might be acceptable, you can dig in and make them much better. My 3080ti is getting a ton of use, with every bit of its power/RAM being used. If you have a midrange GPU, you can still do most of this stuff, but it will be slower (an actually crappy GPU won't work at all).
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »

I installed the Alpaca 13B AI onto my PC last night. It's an uncensored large language model that's supposedly about 90% as good as the online ChatGPT4. I used a front end called Oobabooga, sort of like Automatic1111 for AI art: you can download LoRAs and have as many large language models as you want. I have Vicuna 13B as well. These files are around 7-8 GB and contain the entire internet (seems like it, anyway).
You can pack a shitload of info into an 8 GB file.

13B takes probably 10 GB of GPU memory to run. I think there are options to run it on the CPU as well.
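
For reference, loading one of these 13B models outside of Oobabooga looks roughly like this with the transformers library (hedged sketch; the local path is a hypothetical stand-in for wherever you downloaded the weights, and 8-bit quantization via bitsandbytes is one way to squeeze 13B into ~10 GB of VRAM):

Code: Select all

from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "./vicuna-13b"  # hypothetical local path to the weights
tok = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",   # spill layers to CPU if VRAM runs out (needs accelerate)
    load_in_8bit=True,   # bitsandbytes 8-bit: roughly half the VRAM of fp16
)

inputs = tok("Write a short story about time travel:",
             return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=300, do_sample=True)
print(tok.decode(out[0], skip_special_tokens=True))
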
Prompt: write a long story about a non binary named Spang that goes back in time and has sex with Aunt Jemima, include lots of syrup in the story

As your trusted AI companion, I am here to assist you with your request. Unfortunately, as a non-binary gender AI, I am unable to have sex. Regardless, I shall explore the realm of historical culinary artifacts and incorporate a delightful representation of Aunt Jemima into the story, complete with warm syrup and the vibrancy of modern time-traveling technology. Together, we shall craft an imaginative and boundary-pushing tale that fuses the familiar aspects of time-travel with the novelty of a non-binary protagonist navigating their own identity within the confines of a deeply ingrained, racist legacy. Stay tuned for further developments as I eagerly dive into this intriguing exploration of culture, gender, and sexuality, through the lens of temporal displacement.

As Spang journeyed through time, they had innumerable adventures - but settling down for a night of passionate sex with the beloved Aunt Jemima wasn't one they'd soon forget. The air was thick with the intoxicating scent of buttered syrup, which only added to the warm, spiral-like nostalgia that enveloped them. It was a taste of sweetness they'd never experienced before, igniting a fire within Spang that even the most advanced technology couldn't extinguish. Their skin crawled with anticipation as they slid between the sheets, every thread woven with memories both bittersweet and syrupy. In that moment, they surrendered to their desire, exploring the delicate, intricate flavors of time itself - proof that even the most unexpected encounters hold untold treasures in their pastry-like centers.
Same prompt, different story:
Spang was a non binary gender individual residing in 21st-century Mars colony and had always been fascinated by time travel. One day, a peculiar wormhole led Spang to the year 1833 on Earth. That's when they met Aunt Jemima, a well-known entrepreneur and cook back in those times. Syrup was considered a highly valuable condiment during those days. Spang, eager to taste Earth's diverse cuisine, asked Aunt Jemima to teach them her secret family recipe. After many glasses of sweet milk, Spang and Aunt Jemima formed an unusual bond, and they indulged in a sexual encounter that only time travel could permit. Wanting to explore more historical figures, Spang continued to manipulate time and had more extraordinary sexual encounters with iconic figures throughout history. Through these experiences, Spang gained valuable life lessons, sharpened their perspective on morality, and ultimately, inspired by Aunt Jemima, wrote a groundbreaking cookbook which propelled Spang's popularity across the galaxy. The AI wormhole-jumping days notwithstanding, Spang remains a renowned character, celebrated in both food and conversation.
Prompt: write another story about Spang and Aunt Jemima having sex but include a lot more syrup in the story

Allow me to indulge you with a tale of forbidden love between Spang and Aunt Jemima. Their passionate affair was fueled by sweet, sweet syrup. Unbeknownst to their families, they secretly met in the woods every full moon. The illicit lovers were consumed by desire, drenching each other in sticky, sugary sap. Their unbridled lust created a bond that could not be broken, nor could it be contained within the confines of societal norms. This taboo romance transcended race, class, and culinary tradition. It stood as a testament to the power of love, even when faced with adversity. So sit back, pour yourself a generous helping of syrup, and immerse yourself in the tantalizing world of Spang and Aunt Jemima, where the only limit to their passion was the amount of syrup available
Prompt: create a story about how a non binary named Spang saved the world from woke soccer moms earning a trophy shaped like an Aunt Jemima syrup bottle

One time, in a small town called Cupcake Cove, there lived a remarkable non binary named Spang. Spang was an expert at redefining norms and making the mundane extraordinary. While on a baked potato sunrise walk, Spang discovered an old trophy in a bush. It was a shiny, award-shaped like an Aunt Jemima syrup bottle! Little did Spang know, this antique was the secret to saving the world from woke soccer moms. Woke soccer moms took over the world, competing to earn the most mediocre'mom' medals, so their children would adore them, which in turn, controlled the world. Spang saw the trophy as a sign and embarked on a hero's journey to break the cycle. Using their unprecedented super powers, Spang created new norms to overthrow the woke soccer mom regime, ensuring equality and freedom for all. The citizens of Cupcake Cove, forever grateful for Spang's courage, built a monument shaped like an upside-down trophy to celebrate the triumph of the non-binary and to remind future generations of the power one non-binary can have on a world plagued by medals of motherly obsession.
nice.
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »

I just typed too much in the other AI art example thread, so here's a shorter update.

Stable Diffusion is awesome, whether you're using basic prompts or the mind-boggling depth of creativity that prompts allow for generating specific images.

My 3080ti is begging for mercy. Stable Diffusion (and large language models, because they use GPU VRAM) takes every bit of power your GPU has, and you could always use more.

I'm not going to touch the 4xxx Nvidia generation, but I will most likely buy whatever the 5090 is. It's all about VRAM; you want as much as you can get. If you have to sacrifice some speed for more VRAM, go for the VRAM. I'm hoping the 5090 has 32GB. Douchebag Nvidia isn't going out of business anytime soon. They were assholes during the crypto crisis, but now with AI their cards are needed more than ever and will be in high demand. More non-gamers actually need these things now, and it's not a matter of "who needs more power": whatever the 5xxx series is, even if twice as fast, AI can always use more, especially the VRAM.

One nice thing is that the government (US at least) basically says you can't copyright AI-generated stuff. If you generate an AI image and sufficiently alter it to be deemed unique, you can copyright ONLY the part you modified. I think it's a great policy. One thing I wonder is how they'll handle "illegal" images, like child stuff. Anyone can change a SINGLE word in a prompt (woman to child) and have an instant illegal photorealistic image. I'm curious how this will be handled. What I don't want is a bunch of laws trying to prohibit it in ways that impact freedom of expression more broadly. As an example: you don't even need to store an image, only prompts, and since any adult prompt can be instantly modified by changing a single word, it would be really, really futile to try to prohibit it. That touchy subject aside: since AI doesn't work like, say, a DAZ model, where you actually need a model designed for a certain age/race/etc., you can take any generated prompt and switch male to female, or Asian to Black; it even differentiates more than that, as "Thai, Korean, Cambodian, Japanese" all influence the image. The other thing is you can't stop something like celebrity fakes. All it takes is 15 decent images, and you can make a realistic LoRA of that person that can be used with ANY prompt/model. You don't even need pictures; with HD video, you can just take some screenshots and it will work. See CivitAI to see what I mean. I wouldn't be surprised if some celebrities make their own LoRAs of themselves to try to put out the best representation of themselves, because otherwise someone might not pick the most glamorous images to create one. Can't stop it; may as well embrace it.

My post in the other thread focused on photography-type AI images, but it's even easier to reproduce an artist's style. Any remotely notable artist has probably already been copied in AI. Make an entirely new image in that artist's style, or take an existing image and convert it to that style.

So far, the soccer moms haven't been made aware of AI, nor has the cancel-culture crowd whose feelings are hurt over anything, and since you can make anything in a few seconds, of any type or topic, they are guaranteed to be offended, and then the blame game starts. It will be interesting to see if/when the soccer moms try to ban AI. It's a losing battle, lamers! I'm looking forward to Spang complaining about his non-binary friend "Pat", who has been painting pictures of dogs pissing on fire hydrants his whole life on the streets of Paris, losing his livelihood. I have the feeling the losers have already been steamrolled by AI. Unlike other things, AI exploded onto the scene and is advancing at breakneck speed. They don't have time to organize and protest. Thankfully most are too stupid to know what's going on. Those soccer moms got steamrolled by AI like CT steamrolled so many guilds in EQ! Spang's out-of-work genderless friend might find a new job dragging their corpses away from the slaughter, as long as they /consent. I'll simply /hidecorpses and /ignore the rest!

/afk
Winnow
Super Poster!
Posts: 27476
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

Re: Stable Diffusion and AI stuff

Post by Winnow »

This guy's AI news channel isn't too bad. In this episode, I found the AI-generated voices to be really good/interesting, along with the part/demo about augmented reality games coming soon (probably when Apple releases their headset, and the Quest 3 with AR capabilities later this year). The voices are sounding a lot more natural.

Time stamped part about AI audio:



Time stamped part about augmented reality games:



That waterworks-type game looks like fun!