AI Art - Examples

User avatar
Winnow
Super Poster!
Posts: 27530
Joined: July 5, 2002, 1:56 pm
Location: A Special Place in Hell

AI Art - Examples

Post by Winnow »

I didn't want to clutter up the other AI discussion thread so will use this one for various examples.

When it comes to prompts, you're usually not writing regular sentences to describe what you want in the picture; instead you list specifics like "wearing a hoodie, blue eyes, city background, oil painting". So this prompt I saw over on CivitAI stood out:
A young girl sits on the side of the street, her dirty clothes barely covering her thin frame. She holds out a small cup with a few coins in it, hoping for the generosity of passersby. Her eyes are downcast, filled with sadness and desperation. She looks around nervously, avoiding eye contact with those who walk by. The city bustles around her, but she remains isolated and alone in her poverty. dirty torn clothes , single cup, malnourished
The original example used this checkpoint https://civitai.com/models/36732/sb250kpl

with these results: https://civitai.com/images/471229?perio ... tId=139157

I tried the same prompt in two other checkpoints, here's what I got:

model: animatrix_v13
00019.png
I cut the resolution in half, but the details are still good; the original images would be clear at 4K, but they're about 7 MB each.

OK results. She has dirty clothes but isn't really holding out a cup, and she looks more like a model. No dirt. Definitely doesn't look malnourished.

Tried another checkpoint (revAnimated_v121):
00021.png
Gets the clothes right, and she's starting to hold out a cup. A little dirty, and definitely more crap on the ground; the coins are in the bottom left of the picture.

Still, it didn't seem pathetic enough for the description, so I used the same checkpoint (revAnimated_v121), added "snowing, wearing hoodie, baseball cap, snow on the ground" to the description, and got this:
00023.png
Nice: dirty face, and she looks less like she's posing for a picture.

If I were to continue working on this picture, I'd go to inpaint and change what looks like a coffee cup into something like a Styrofoam cup with no lid, then fix the feet of that guy walking in the street (bad ankles).

The point of this is that even typing some descriptive regular sentences gives you a starting point for an image. As for the body, malnourished is hard to generate; you'd probably have to add (((skinny))) to the prompt. The ((())) adds emphasis to whatever is between them: (skinny) gets less emphasis than ((skinny)), for example.
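
The emphasis compounds: in Automatic1111's documented convention, each pair of parentheses multiplies the token's attention weight by 1.1, and square brackets divide by 1.1. Here's a toy calculator to illustrate the idea; this is not the real prompt parser (which also supports explicit weights like (skinny:1.4)), just the multiplication rule:

```python
# Illustration of A1111-style emphasis weights (assumption: the documented
# 1.1x-per-paren convention; the real parser tokenizes full prompts).
def emphasis_weight(token: str) -> float:
    """Each surrounding ( ) multiplies attention by 1.1; [ ] divides by 1.1."""
    up = 0
    while token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
        up += 1
    down = 0
    while token.startswith("[") and token.endswith("]"):
        token = token[1:-1]
        down += 1
    return round(1.1 ** up / 1.1 ** down, 3)

print(emphasis_weight("(skinny)"))      # 1.1
print(emphasis_weight("(((skinny)))"))  # 1.331
print(emphasis_weight("[skinny]"))      # 0.909
```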

Re: AI Art - Examples

Post by Winnow »

00234-4060014453.png
Making porn in Stable Diffusion is the easiest thing you can do. Making a proper crapper is a challenge! I wish I could take credit for this masterpiece, but it's someone else's prompt.

Re: AI Art - Examples

Post by Winnow »

It's been a month and I'm still consumed by Stable Diffusion, AI art.

The number of parameters you can apply to a prompt to impact the outcome is absolutely mind blowing.

Sure, you could type "a picture of a monkey wearing lipstick" and get a monkey wearing lipstick, but there are more combinations than stars in the Milky Way affecting how that generated image ends up looking.

I'm still getting a grasp on prompts. AI art may end "traditional" artist careers, but there is a whole other set of skills involved in obtaining the desired generated image. I bet "prompt artist" becomes a common term.

After a month, I've finally worked out a decent prompt to give a photo realistic image. First tip, don't use "realistic" in the prompt.

Using the monkey wearing lipstick, it would be:

Code: Select all

(8k uhd), RAW Photo, monkey wearing lipstick, detailed skin, quality, sharp focus , tack sharp, Fujifilm XT3, crystal clear 
Focusing on camera-type images (art/style is a whole other thing), you can use pretty much all the terms you'd use in regular photography, and they impact the image. Examples:

Film type (Kodak gold 200, Portra 400, fujifilm superia), DSLR, Camera model, Hasselblad, Film Format or Lens type (35mm, 70mm IMAX), (85mm, Telelens etc.), Film grain

Then there are examples of details and lighting effects:

accent lighting, ambient lighting, backlight, blacklight, blinding light, candlelight, concert lighting, crepuscular rays, direct sunlight, dusk, Edison bulb, electric arc, fire, fluorescent, glowing, glowing radioactively, glow-stick, lava glow, moonlight, natural lighting, neon lamp, nightclub lighting, nuclear waste glow, quantum dot display, spotlight, strobe, sunlight, ultraviolet, dramatic lighting, dark lighting, soft lighting, gloomy

highly detailed, grainy, realistic, unreal engine, octane render, bokeh, vray, houdini render, quixel megascans, depth of field (or dof), arnold render, 8k uhd, raytracing, cgi, lumen reflections, cgsociety, ultra realistic, volumetric fog, overglaze, analog photo, polaroid, 100mm, film photography, dslr, cinema4d, studio quality

And camera view etc:

ultra wide-angle, wide-angle, aerial view, massive scale, street level view, landscape, panoramic, bokeh, fisheye, dutch angle, low angle, extreme long-shot, long shot, close-up, extreme close-up, highly detailed, depth of field (or dof), 4k, 8k uhd, ultra realistic, studio quality, octane render

These settings actually matter.

Still focusing just on generating photo-type images: all of those prompts (just a sampling; there are many, many more) interact with a crazy number of other things that impact the end result:

Model: hundreds of them now, maybe thousands. This is the trained base that all your images are generated from. There are realistic-focused models, anime-type models, artistic types, etc. But there are also some really good models that can manage all types with good output.

CFG (Classifier-Free Guidance): the lower the number, the more freedom the AI has to render the image; the higher the setting, the more strictly the AI adheres to the prompt you input. Typical settings run from 3 to 15, and every value in between produces a different outcome.
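
Under the hood, the CFG value is the scale in a simple formula: at each denoising step the model predicts noise twice, once without the prompt and once with it, and the final prediction pushes the unconditional one toward the conditional one. A toy sketch of just that combination step (the array values are made up; real pipelines apply this to large latent tensors every step):

```python
import numpy as np

# Classifier-free guidance combination: uncond + cfg * (cond - uncond).
# cfg = 1 follows the conditional prediction exactly; higher values
# push further toward the prompt (and can over-saturate the image).
def cfg_combine(uncond: np.ndarray, cond: np.ndarray, cfg: float) -> np.ndarray:
    return uncond + cfg * (cond - uncond)

uncond = np.array([0.0, 0.0])
cond = np.array([1.0, 2.0])
print(cfg_combine(uncond, cond, 1.0))  # equals cond
print(cfg_combine(uncond, cond, 7.5)) # pushed well past cond
```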

Steps: the higher the number, the more time the AI spends diffusing the image. Up to a point, more steps mean a higher-quality image (depending on the sampling method used).

Clip Skip: another setting that totally changes the image. Clip skip 1 is generally best for more realistic images; clip skip 2 is better for art/animated.

Sampling Method: there's a long technical explanation for it, but it impacts the style, colors, etc. of the end image.

Seed: determines the randomness of the generated image. Usually you set it to random (-1), but if you find an image you like and want to make more like it with only small changes, you keep the seed number, which preserves the general layout of the image.
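
The reason a fixed seed keeps the layout is that the denoiser starts from latent noise drawn out of a seeded random generator: same seed, same starting noise, so the composition stays similar even as other settings change. A sketch with made-up shapes (NumPy stands in here for the real latent sampler):

```python
import numpy as np

# Toy version of the initial-latent draw: the seed fully determines the
# starting noise; -1 in the UI just means "pick a fresh seed each run".
def initial_latents(seed: int, shape=(4, 8, 8)) -> np.ndarray:
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

same_a = initial_latents(499953900)
same_b = initial_latents(499953900)
other = initial_latents(499953903)

print(np.array_equal(same_a, same_b))  # True: identical starting noise
print(np.array_equal(same_a, other))   # False: different layout
```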

Text to image (txt2img): this is where you put in your initial prompt and create something.

Image to image (img2img): after you find something you like, you send it to img2img and then start generating smaller changes until:

Inpaint: once you have something you really like, but maybe the person has six toes or you want to add or take out something, you use this. It's done with a masking technique; there are various ways, but basically, if a hand looks screwed up (extra digits, etc.), you mask the hand and regenerate only that part of the image until you get what you want. The rest of the image stays the same, and the AI blends in the new result.
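
The final compositing step of inpainting can be pictured as a per-pixel mask blend: only the masked region takes newly generated content, everything else keeps the original. A toy illustration (real inpainting also feathers the mask edge and denoises in latent space, which this skips):

```python
import numpy as np

# Masked blend: where mask > 0, take the regenerated pixels; elsewhere
# keep the original image untouched.
def blend(original: np.ndarray, generated: np.ndarray, mask: np.ndarray) -> np.ndarray:
    return np.where(mask[..., None] > 0, generated, original)

orig = np.zeros((4, 4, 3), dtype=np.uint8)        # original image (black)
gen = np.full((4, 4, 3), 255, dtype=np.uint8)     # regenerated content (white)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1  # the "bad hand" region you painted over

out = blend(orig, gen, mask)
print(out[2, 2])  # [255 255 255] -> regenerated
print(out[0, 0])  # [0 0 0]       -> original kept
```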

After all of that, you still have LORAs, embeddings, textual inversions, wildcards, ControlNet, etc. that you can use.

LORAs are small files trained on something specific: a person, an object, a suit of armor, a style, etc.

Say you do everything above, all those bazillion parameters, but you want the same person in the image. You use a LORA with that person's trained features and add it to the prompt: <lora:chevy_chase:1> would give your image Chevy Chase's face (and maybe body, etc., depending on how it was trained). The 1 at the end is the weight: 0.5 gives the model more flexibility in integrating the LORA into the scene, while 1.5 forces it, making it more prominent.
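
That `<lora:name:weight>` syntax is simple enough to pull apart with a regex. A hypothetical parser in the spirit of what the UI does when it reads your prompt (the real extension handles more edge cases, like negative weights):

```python
import re

# Matches A1111-style "<lora:name:weight>" tags; the weight part is optional.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([\d.]+))?>")

def extract_loras(prompt: str):
    """Return (name, weight) pairs; weight defaults to 1.0 when omitted."""
    return [(name, float(w) if w else 1.0) for name, w in LORA_TAG.findall(prompt)]

print(extract_loras("portrait photo <lora:chevy_chase:1> <lora:baggy_eyes:0.6>"))
# [('chevy_chase', 1.0), ('baggy_eyes', 0.6)]
```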

Now with LORAs, the first thing you think of is training a celebrity's face, or anyone's face, but you can train much more detailed things. One example I just saw on CivitAI was "bags under the eyes". If the normal "tired, sleepy" etc. prompts aren't enough, well, someone trained a bunch of pictures of people with dark bags under their eyes. If I add that LORA along with the Chevy Chase LORA, he'd have baggy eyes, and then I can adjust the amount of baggy eyes with that same 0.1-1.5 scale.

Obviously sexual positions are all LORAs, but you can also use ControlNet to specifically position your bodies, etc.

TL;DR to this point: Stable Diffusion (with Automatic1111) is amazing. You can keep it simple and get decent results, but there is also incredible depth to it.

I've been having fun with dynamic prompting/wildcards. (The X/Y/Z script is also really cool.)

Since there's an insane number of ways you can affect the end result, X/Y/Z lets you create comparison grids: enter a prompt, then ask for a grid across various CFG settings, steps, clip skip values, etc., and across however many models you pick, and you see the same prompt with different outcomes side by side. You can also use prompt search/replace (S/R) to change the prompt itself: for example, if you wanted to see how the different films (Kodak, Fuji, etc.) impact the prompt, you could use the same prompt with only the film changing to create a grid of sample results.
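
What the grid script enumerates is just the cross product of the axis values, with each combination becoming one rendered cell. A sketch using made-up axis values (the model names are the checkpoints mentioned earlier; the counts are illustrative):

```python
from itertools import product

# Each axis of an X/Y/Z grid is a list of values; the grid has one cell
# per combination, all rendered with the same base prompt.
models = ["animatrix_v13", "revAnimated_v121"]
cfgs = [4, 7, 11]
steps = [20, 30]

cells = list(product(models, cfgs, steps))
print(len(cells))  # 12 cells: 2 models x 3 CFGs x 2 step counts
for model, cfg, n in cells[:2]:
    print(f"model={model} cfg={cfg} steps={n}")
```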

Another way to use wildcards: say you want to see what your favorite prompt looks like painted by 100 different artists. You create a text file with those 100 artists, use the artist name as the wildcard, then generate 100 images, each one rendered in the style of an individual artist.
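
A minimal wildcard expander in the spirit of the dynamic-prompts extension (the `__name__` placeholder style matches the extension's convention, but the function and its behavior here are illustrative, not the real API):

```python
import random

# Replace each "__key__" placeholder in the template with a random pick
# from that wildcard's option list (normally loaded from a text file).
def expand(template: str, wildcards: dict, rng: random.Random) -> str:
    out = template
    for key, options in wildcards.items():
        while f"__{key}__" in out:
            out = out.replace(f"__{key}__", rng.choice(options), 1)
    return out

artists = ["Rembrandt", "Monet", "Hokusai"]  # stand-in for the 100-artist file
rng = random.Random(0)
for _ in range(3):
    print(expand("a monkey wearing lipstick, painted by __artist__",
                 {"artist": artists}, rng))
```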

Another nice thing is that Automatic1111 saves all the images from txt2img and img2img, as well as the batches (say you generate 12 images at once; it creates a thumbnail contact sheet of them). Every one of those images has embedded in it all the settings used: the prompt, plus the model, CFG, seed, steps, etc. So if/when you go back and look at them, you can drag an image into Automatic1111's PNG Info tab, and it will extract the info and let you send it straight to img2img, txt2img, etc. to work with. Basically, once you have something you like, or something close to what you want from someone else, you can use that image as a starting point and start tweaking.
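
The embedded settings live in an ordinary PNG text chunk named "parameters", which is why PNG Info can recover them from a bare image file. A stdlib-only sketch that reads that chunk, round-tripping a tiny hand-built PNG to show the idea (real A1111 files may also use compressed or iTXt chunks, which this skips):

```python
import struct
import zlib

# Walk the PNG chunk list looking for a tEXt chunk keyed "parameters",
# which is where A1111 stores the generation settings.
def read_parameters(png_bytes: bytes):
    pos = 8  # skip the 8-byte PNG signature
    while pos < len(png_bytes):
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key == b"parameters":
                return value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return None

# Build a minimal 1x1 grayscale PNG with a "parameters" chunk to test on.
def chunk(ctype: bytes, data: bytes) -> bytes:
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

png = (b"\x89PNG\r\n\x1a\n"
       + chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
       + chunk(b"tEXt", b"parameters\x00Steps: 30, CFG scale: 7, Seed: 1234")
       + chunk(b"IDAT", zlib.compress(b"\x00\x00"))
       + chunk(b"IEND", b""))

print(read_parameters(png))  # Steps: 30, CFG scale: 7, Seed: 1234
```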
User avatar
Spang
Way too much time!
Posts: 4811
Joined: September 23, 2003, 10:34 am
Gender: Male
Location: Tennessee

Re: AI Art - Examples

Post by Spang »

IMG_5419.jpeg
Make love, fuck war, peace will save us.

Re: AI Art - Examples

Post by Winnow »

kittens.jpg
Squint if you need to!

Re: AI Art - Examples

Post by Winnow »

00196-499953900.png

Code: Select all

RAW photo, Polaroid, girl dressed in cosplay as barbarian shaman smiling at convention holding sign that says VEESHAN FOH SUCKS
00199-499953903.png
SDXL is much better at text than SD 1.5, but it still takes several attempts to get what you want.
Took a few tries, but mission accomplished!

Re: AI Art - Examples

Post by Winnow »

Joker.png
Trump.png
Here are some cool SDXL prompts that you don't need a LORA for; just use whatever model you want.

"Heath Ledger The Joker in The Dark Knight, figure socket, Miniature, Models Figure, Gaming pieces, Collectible, Statue, Action figure, Gaming minis"

"Donald Trump, figure socket, Miniature, Models Figure, Gaming pieces, Collectible, Statue, Action figure, Gaming minis"

Put the subject at the front and then add "figure socket, Miniature, Models Figure, Gaming pieces, Collectible, Statue, Action figure, Gaming minis"

----------------------------------------------------------
dragons.png
There's a standalone local app called "Diffusion Toolkit" that scans your Stable Diffusion output image directory(s) and creates a SQL database that's very fast, letting you search for text in prompts, by model used, LORAs, etc.

Above are just some sample runs I made with the simple prompt "Dungeons and Dragons manual that says "Dungeons & Dragons"". Text is still a bit of a challenge. At the bottom you can see a few EverQuest prompts, which were "Computer Game box with dragon, shaman and bard that says "EverQuest"".

Stable Diffusion does funny stuff. Since the prompt says manual, it put a book as part of the actual artwork. If I were actually focused on making a nice cover, I'd put "manual" in the negative prompt. EverQuest must be known to the AI, because it got the font and look of the title pretty accurately.

The sidebar says "model not found" because I used a custom model with at least 7 models mixed together. There are some outstanding SDXL models now; you don't really need to mix, but you'd usually have one for NSFW and one for SFW. I'm trying to mix an all-purpose one.

A very useful tool for finding AI stuff.

https://github.com/RupertAvery/DiffusionToolkit

Just unzip to install, then install the Microsoft .NET 6 Desktop Runtime if you don't already have it; it's linked on the install page.

Re: AI Art - Examples

Post by Winnow »

jemima.png
We haven't forgotten the injustice Aunt Jemima! You will be back on a syrup bottle some day! The tide is starting to turn vs those woke freaks!

Creating text in AI-generated images is about to get way better. Stable Diffusion 3.0 has been announced with some examples, and it does text almost perfectly every time, integrating it into the picture. It also has amazing prompt adherence, so in a month or so things will be a lot better in that area for images.
VV-word.jpg
I wanted to see what Stable Diffusion would come up with using just the user names of some VVers with nothing else in the prompt: Winnow, Aslanna, Funkmaster, Spang

This is just a sample from the SDXL model "Mohawk", but the results vary quite a bit depending on the model you choose. In general, the Winnow image ends up being a boat, a bird, a field, or a random guy's portrait. Aslanna is almost always depicted as a female from India or somewhere in the East. Funkmaster is consistently a variant of a dark-skinned man getting funky! Spang is all over the place: mostly a bird or grass/field, rarely shown as human, although this model shows humans.
VV-word4.jpg
User names converted to action figures. If there's any racism detected, it would be with "Funk"; I guess that's associated with black men, because that's all Funkmaster's name creates.