My Journey Creating Realistic Images of a Ramen-Eating Woman with AI

A woman in a yellow hoodie ties back her hair before eating ramen. The warm light and steam inside the shop create a cozy atmosphere.

I want to create realistic photos using image generation AI

When you think of “AI-generated images,” you might imagine anime-style illustrations. However, I personally enjoy creating highly realistic, almost lifelike images.

For generating such realistic photos, ImageFX is a great option. This advanced image-generation app is powered by Imagen 3, a cutting-edge AI model developed by Google.

Features of ImageFX

ImageFX offers the following key features:

  • High quality: It can generate highly detailed and realistic images.
  • Easy to use: No special knowledge is required, making it intuitive for anyone.
  • Free to try: It currently costs nothing, though a paid plan may be introduced in the future.

Currently, English prompts yield the best results. If you’re not confident in English, you can use a translation app or ChatGPT to create accurate prompts.

How to Use ImageFX

For example, entering “Landscape with blue sky and white clouds” will generate an image that matches the description. You can generate up to four images at a time, and the process is remarkably fast—taking only about 10 seconds.

A vast green hill stretches out beneath a clear blue sky, where fluffy white clouds gently drift. A breathtaking natural landscape filled with a sense of openness.

Previously, users quickly hit the daily limit, but now the service allows up to 30 generations per day. Once you reach the limit, you’ll have to wait until the next day. For casual use, though, 30 is generally enough.

Additionally, all past generated images are stored in “My Library”, where you can view and download them anytime. The built-in search function is helpful, but finding past images can be challenging if you forget the prompt or have too many images to scroll through. To avoid this, it’s best to download any images you like immediately.

(This information is based on usage within Japan and may vary depending on the country.)

Generating a Photo of a “Japanese Woman Eating Ramen” with ImageFX

Creating landscape photos or surreal, never-before-seen images can be fascinating, but generating realistic images of people is also a lot of fun.

This time, I’ll casually share my experience of generating an ideal image of a “Japanese woman eating ramen.” Ramen is one of Japan’s national dishes, famous for being quick, affordable, and delicious—a combination beloved across the country. Specifically, I’ve chosen shio ramen, a clear, salty broth ramen that’s known for its simplicity and delicate flavor, and happens to be my personal favorite.

There’s nothing too deep here—just a lighthearted walkthrough, so feel free to enjoy it!

(These AI-generated images do not depict real people. They are purely fictional and created from scratch.)

The first generated image

Against the backdrop of a nighttime city, a black-haired woman in a hoodie enjoys a bowl of ramen in an anime-style illustration. The contrast between the realistically depicted ramen and the softly drawn character is particularly striking.

She appears to be trying to eat two bowls of ramen. Moreover, although I was expecting a realistic photo-like image, it turned out as an illustration instead.

This is a common mistake—forgetting to specify in the prompt that I want a photo rather than an illustration.

Next, I’ll try adding the phrase “35mm film” to the prompt.

At a food stall in the illuminated nighttime city, a woman in a hoodie slurps her ramen. The steaming bowl perfectly complements the chilly night, creating a cozy and inviting atmosphere.

The result looks quite realistic, almost like a real photo. If it weren’t cold ramen, her left hand might have gotten burned, but other than that, nothing feels particularly off.

If I had to point out one issue, the characters written on the bowl are unreadable. It seems that reproducing Japanese text and kanji is still a challenge.

That said, this image clearly depicts a woman eating cold ramen at a food stall.

However, her hairstyle and facial expression feel a bit off, so let’s make some adjustments.

Against the backdrop of neon lights in the nighttime city, a blonde woman enjoys her ramen while flashing a smile at the camera. The warmth of the ramen and her cheerful expression create a striking impression.

Once again, it ended up as cold ramen, but this time it’s a delicious-looking shio ramen. The soft-boiled egg is especially well-executed. Also, with the woman smiling and looking directly at the camera, the image feels much more engaging and friendly.

However, upon closer inspection, it seems like her right hand has six fingers. Since fingers are quite complex in shape, AI doesn’t always adhere to the common rule of having exactly five.

To address this, you can try adding phrases like “holding chopsticks with five fingers” or giving the hands a specific role in the prompt. That said, there still seems to be an element of luck involved.
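The prompt tweaks so far (a photographic cue such as “35mm film” and a hint that gives the hands a role) can be kept organized with a small helper. The following is only a minimal sketch: the `RamenPrompt` class, its field names, and the example subject and setting wording are my own assumptions for illustration, not the prompts actually used for the images in this article, and ImageFX itself is still operated by pasting the resulting text into its prompt box.

```python
from dataclasses import dataclass, field


@dataclass
class RamenPrompt:
    """Hypothetical helper that assembles an English prompt from parts."""

    subject: str
    setting: str
    style: str = "35mm film"  # photographic cue mentioned in this article
    hints: list[str] = field(default_factory=list)  # e.g. hand instructions

    def build(self) -> str:
        # Join the non-empty parts with commas, in a fixed order.
        parts = [self.subject, self.setting, *self.hints, self.style]
        return ", ".join(p.strip() for p in parts if p.strip())


# Example wording is illustrative only.
prompt = RamenPrompt(
    subject="a Japanese woman eating shio ramen",
    setting="at a food stall in a city at night",
    hints=["holding chopsticks with five fingers"],
)
print(prompt.build())
```

Keeping the style cue as a default means it is harder to forget, while the hints list makes it easy to add or drop hand instructions between attempts.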

Changing the location to a Chinese restaurant

I changed the setting this time. Previously, it was an outdoor food stall, but now I’ve placed the scene in a typical Chinese restaurant. I set the lighting to fluorescent to enhance the shadows and create a slightly worn-out atmosphere.

At the counter of a ramen shop illuminated by warm lighting, a blonde woman laughs joyfully while eating her ramen. The interior exudes a nostalgic Showa-era atmosphere.

It’s coming together nicely. The text on the menu inside the restaurant might look strange to someone who can read Japanese, but for others, it probably wouldn’t be noticeable.

Looking closely, there are still some unnatural details, like how the fingers on her right hand hold the chopsticks and how the noodles seem overly tangled around them. However, at first glance, it looks like a great photo. Maybe the ramen is incredibly delicious, or perhaps she’s good friends with the staff—her beaming smile really stands out.

This time, she’s properly placing the bowl on the table while eating, and the rising steam clearly shows that the ramen is piping hot.

Details like the folds in her clothing and the way the counter’s shadow falls on the chairs also look impressively realistic. Simply cropping out the awkward menu text makes the overall image feel quite natural.

At a ramen shop counter illuminated by warm lighting, a blonde woman laughs happily while enjoying her ramen. The interior exudes a nostalgic Showa-era charm.

Adding a subtle touch of realism

Now that the image is coming together, I feel like adding some finer details.

One common gesture for people with long hair while eating ramen is holding their hair back with one hand. So, I decided to include this action in the prompt to make the scene feel even more natural.

I also adjusted the hair color, opting for a more natural brown tone to enhance the overall realism of the image.

A woman in a gray cardigan sits at the counter, slurping her ramen. The rising steam from the bowl and the calm ambiance of the restaurant create a memorable scene.

Aside from the fact that the bowl and the spoon have fused together and the noodles seem unusually abundant, the image has turned out quite realistic.

One particularly impressive detail is the pilling on the sweater sleeves, which adds an unexpected touch of realism. But most of all, the hair texture is incredible—each strand is meticulously rendered.

Next, I’ll try adding the gesture of tying up her hair before eating ramen to make the scene even more natural.

A woman in a yellow hoodie smiles as she ties her hair before enjoying her ramen. In a traditional ramen shop filled with rising steam, she savors a warm and comforting moment.

It’s really coming together nicely. Since the fingers are naturally hidden, the image feels quite polished... or so I thought, until I noticed that the bowl sits right at the edge of the counter and looks unstable. On top of that, the woman herself is perched on the very edge of her chair, which seems a little precarious as well.

Still, the gesture of tying her hair back has a strangely timeless appeal. It’s such a natural movement, yet for some reason, it effortlessly draws the eye.

Adding a “Yuru-Kawaii” touch

Since this gesture naturally hides fingers, which are tricky to generate perfectly, it’s definitely a useful trick! So, keeping the same atmosphere, I decided to change her outfit.

At ChatGPT’s suggestion, I’ll incorporate some “Yuru-Kawaii & Relaxed” elements to create an even more friendly and approachable vibe.

“Yuru-Kawaii” is a Japanese term that combines “yurui” (loose, relaxed, or laid-back) and “kawaii” (cute). It refers to a casual, effortless kind of cuteness—think soft fabrics, oversized sweaters, pastel colors, and a cozy, relaxed vibe.

A woman in a beige cardigan looks into the camera as she prepares to tie her hair before enjoying her ramen. Seated at the counter of a cozy ramen shop, she exudes a warm and inviting presence.

Once again, both the ramen and the seating position are a bit precarious, but not to the point of being overly distracting.

Next, I’ll try generating an image from a different angle. The outfit and hairstyle will remain mostly the same, but I’m curious to see how the overall atmosphere changes.

A woman in a beige turtleneck sits at a red counter, smiling as she ties her hair. The menu on the wall and the clock evoke a nostalgic Showa-era atmosphere.

At this point, a question suddenly comes to mind.

This person looks like the same individual, but is that really the case? Of course, since she doesn’t actually exist, calling her the “same person” might not be entirely accurate.

However, if we didn’t mention that these images were AI-generated, wouldn’t most people naturally recognize her as the same person?

A woman in a deep green sweatshirt sits at a red counter, tying her hair with a relaxed smile. In front of her is a steaming bowl of ramen.

Even with a different outfit, she still looks like the same person.

The fact that the spoon is placed directly on the counter bothers me a little, but overall, the image looks great.

By the way, according to ChatGPT, if you don’t specify facial features, the AI will automatically generate an attractive face by default.

Here’s what ChatGPT had to say:

If no specific instructions are given, AI-generated faces tend to be androgynous and well-balanced in appearance. In particular, AI-generated images often follow these trends:

Face shape
The default tends to be an oval face, as it is considered the most balanced and universally appealing. If you want a round or long face, you need to specify it clearly.

Eye shape
The eyes are usually slightly large and almond-shaped. This might be because AI has learned that this is an ideal eye shape for Japanese women. If you want to emphasize cuteness or a cooler look, it’s better to specify “round eyes” or “narrow, slanted eyes.”

Eyebrow shape
AI often generates naturally thick, arched eyebrows, as this gives the impression of a well-proportioned, attractive face. If you prefer a cooler expression, specifying “straight eyebrows” can help.

Nose shape
The AI tends to create a slim, well-defined nose, influenced by Western beauty standards. This often results in a higher and narrower nose. If you want a more natural Japanese look, you should specify “a small nose” or “a slightly rounded nose.”

Lip shape
AI defaults to moderately full, well-shaped lips. If you want a cuter appearance, specify “small lips”, while for a more sensual look, “full lips” is a better choice.

Skin texture and tone
The AI usually generates fair, natural-looking skin, influenced by the Western preference for “clear, porcelain skin,” but with a slight blend of Asian skin tones. If you prefer a healthy tan or olive complexion, you’ll need to specify it explicitly.

AI-generated faces often follow common beauty trends, likely because they are trained on a large dataset of images that reflect existing preferences. However, beauty is subjective, and different cultures have different ideals.

There also seem to be technical limitations at play.

Since the latent space (the compressed representation of features learned by AI) has a limited number of dimensions, it’s difficult to perfectly replicate subtle facial differences—for example, a slight variation in eye angle or shape. In other words, AI struggles to capture fine details in facial features.

Additionally, the denoising algorithm used in the image generation process tends to favor “safe averages”, making extreme facial features less likely to appear. As a result, the AI naturally produces conventionally attractive faces. This aligns with the well-known idea that “average faces are often perceived as more beautiful.”

So, to what extent can AI actually replicate the specific facial features you request?

Adjusting Facial Features

As an experiment, I’ll try specifying detailed facial characteristics. This time, I’ll add the following conditions: round face, monolids, a small and low nose, thin lips, and tanned skin.

Let’s see how well the AI can reflect these traits in the generated image.
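As a rough illustration of how such conditions might be folded into a prompt, here is a minimal sketch. The `describe_face` function, the dictionary keys, and the base sentence are hypothetical names and wording I chose for this example; only the five traits themselves come from the experiment described above.

```python
def describe_face(traits: dict[str, str]) -> str:
    """Turn trait descriptions into one prompt clause (illustrative only)."""
    if not traits:
        return ""
    # Dictionaries keep insertion order, so traits appear as listed.
    return "with " + ", ".join(traits.values())


traits = {
    "face": "a round face",
    "eyes": "monolids",
    "nose": "a small and low nose",
    "lips": "thin lips",
    "skin": "tanned skin",
}

# The base sentence is an assumed example, not the article's actual prompt.
base = "a Japanese woman tying her hair before eating ramen"
print(f"{base} {describe_face(traits)}")
```

Storing each trait under its own key makes it easy to swap a single feature between attempts and compare the results.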

In a retro-style ramen shop, a woman wearing a beige turtleneck and cardigan ties her hair. The wooden walls and noren curtains create a nostalgic Showa-era atmosphere.

Hmm… it doesn’t feel like the specified features were fully reflected, but the character does look somewhat different from the previous one.

A woman in a beige cardigan and denim sits at the counter, smiling as she ties her hair. The rising steam from the ramen adds to the warm and inviting atmosphere.

And now, for the first time, the ramen bowl has turned black. Am I the only one who thinks a black bowl somehow makes the ramen look even more delicious?

Perhaps the phrase “tanned, sun-kissed skin” influenced not just the character’s complexion but also the color of the bowl.

Changing the Hairstyle

This time, I’ll specify a round face, narrow almond-shaped eyes, and a small, slightly rounded nose. On top of that, I’ll change the hairstyle to a short bob with black hair and see how the AI interprets it.

A short-haired woman in a beige hoodie sits in front of a bowl of ramen, covering her mouth with one hand, showing a surprised expression.

She clearly looks like a different person from the previous one.

Since I specified a round face, her features appear slightly fuller this time. The height of her nose is also a bit lower than before.

At a glance, the overall facial structure might be similar, but just changing the hair color and hairstyle makes a huge difference in impression. In this image, she seems absolutely starving—gripping her chopsticks with enough force to almost snap them in half. And, as expected, some fingers are still a bit off… her pinky seems to have disappeared midway. It looks like fine details like this are still a challenge.

By the way, I instructed the AI to place her left palm on her left cheek in a “delicious!” gesture, but no matter how many times I tried, it didn’t come out right. Instead, she ended up in a mysterious pose. Well, let’s just say it works in its own way.

What Happens When Generating Multiple People?

Now, what if I try generating multiple people in the same image? Will they all end up with identical faces, or will the AI successfully create distinct individuals? Let’s find out.

Four women sit around a warm wooden table, enjoying ramen and chatting. The scene captures a relaxed and cheerful atmosphere as they share a meal together.

All four individuals have distinct facial features.

In reality, it’s extremely rare for multiple people to have identical faces, so it’s only natural that increasing the number of people leads to more variation in their appearances.

Four women sit around a table, happily enjoying ramen and gyoza. One of them wears a kimono, adding to the nostalgic charm of the warmly lit restaurant. As they sip beer and chat, their laughter fills the air. The scene, set in a Showa-era retro ramen shop, captures a relaxed and joyful moment.

This image depicts four sisters, inspired by the Netflix original drama Asura. The show was amazing, but this… doesn’t look all that interesting.

Can AI Generate a Photo of Twins?

Now, can AI create an image of twins standing side by side? Let’s see if it can accurately reproduce two identical faces.

In a Showa-era retro ramen shop, two women laugh joyfully as they enjoy their ramen. The scene captures a warm and lively moment, filled with good food and conversation.

They look quite similar.

If someone told me, “They’re twins,” I would probably believe it.

On a cold day with visible breath in the air, two women in knitted sweaters smile at each other while enjoying steaming bowls of ramen.

These two genuinely look like twins.

However, once again, they’re holding the bowls. I keep forgetting to include the instruction “place the bowl on the table.” In Japan, it’s common to eat ramen without lifting the bowl, but the AI doesn’t seem to recognize this cultural nuance.
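Since instructions like this are easy to forget, one option is to check a draft prompt against a personal checklist before pasting it into ImageFX. The sketch below is an assumption-laden example: `missing_cues`, the `REQUIRED` list, and the draft wording are all hypothetical, and the check is a plain substring match rather than anything ImageFX provides.

```python
def missing_cues(prompt: str, required: list[str]) -> list[str]:
    """Return the required phrases that do not appear in the prompt."""
    lowered = prompt.lower()
    return [cue for cue in required if cue.lower() not in lowered]


# A personal checklist of phrases that were forgotten in earlier attempts.
REQUIRED = ["35mm film", "bowl on the table"]

# Illustrative draft, not the prompt actually used for the twins image.
draft = "Twin Japanese women eating ramen in a retro ramen shop, 35mm film"
print(missing_cues(draft, REQUIRED))
```

An empty result means every phrase on the checklist is present. Because the match is literal, a paraphrase such as “the bowl stays on the counter” would still be reported as missing.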

Changing the Outfit

Generating images like this is really fun. Plus, with ImageFX’s incredibly fast processing speed, there’s little to no frustration, which makes the experience even better.

Now, I’ll have ChatGPT suggest different outfits and continue generating ramen-loving girls in various styles.

First up: Y2K fashion.

Y2K (short for “Year 2000”) refers to the fashion and culture trends from the late 1990s to early 2000s. It has recently made a comeback and is gaining popularity again.

A woman in a light blue top and jeans sits at a red counter, tying her hair while gazing into the camera. In front of her, a steaming bowl of ramen adds to the warm atmosphere.

This resulted in quite a unique-looking image. It definitely gives off a social media-friendly, attention-grabbing vibe.

Next, I’ll try generating an image with a “Mild Yankee” theme.

In Japan, “Yankee” refers to delinquent youth, troublemakers, or gang-like subcultures. However, “Mild Yankee” describes a group of young people who, while sharing some stylistic elements with traditional Yankees, are far less aggressive and don’t engage in illegal activities. They tend to have a strong attachment to their local community and a more grounded lifestyle.

According to ChatGPT, Mild Yankee fashion is characterized by “a mix of slight rebelliousness, practicality, and hometown pride.” I’m excited to see how the AI interprets this style!

A woman in a navy tracksuit sits in front of a bowl of ramen, playfully lifting her glasses with a mischievous smile. The scene exudes a casual and relaxed atmosphere.

The character ended up looking like someone straight out of a live-action adaptation of a manga, with a strong and distinct personality. As for the “hometown pride” aspect… well, I’m not entirely sure where that comes in, but maybe she’s wearing a tracksuit from her old high school?

Next, I’ll try generating an image with the theme: “A cyber girl eating cyber ramen at a cyber ramen shop in a cyber city.” Let’s see how futuristic this one turns out!

In a futuristic food stall illuminated by neon lights, a woman wearing high-tech goggles enjoys a bowl of ramen. Floating digital interfaces enhance the sci-fi atmosphere, creating a world straight out of a cyberpunk vision.

When the setting is too over-the-top, the image tends to look more like an illustration than a realistic photo. This might be due to a lack of real-world reference data.

So, I’ll tone down the setting a bit and give it another try.

Surrounded by the neon lights of a futuristic city, a woman enjoys a bowl of ramen as floating digital interfaces illuminate the scene. Her mechanical wristwatch and transparent AR glasses add to the high-tech aesthetic.

The atmosphere of a cyber city is conveyed through the lighting effects and holographic displays.

Most likely, these glasses are also a digital device, automatically analyzing the nutritional content and calories of the ramen while sending signals directly to her brain.

Letting ChatGPT Take Full Control

I let ChatGPT take the lead by simply asking, “Create a prompt for generating the most adorable woman eating ramen.” Let’s see how it turned out!

A woman in a pink sweater and glasses lifts her ramen with a cheerful smile. The warm lighting of the restaurant creates a cozy and inviting atmosphere.

The quality is impressive. The background is properly blurred, and there are no awkward Japanese characters. The addition of lanterns enhances the atmosphere, and the two slices of chashu are a nice touch. Well done.

Lighting makes a huge difference. Previous images had a “run-down Chinese restaurant” setting, with dim fluorescent lighting, creating a slightly gloomy feel. In contrast, this one is much brighter and more inviting. It seems that lighting and depth-of-field adjustments play a crucial role in enhancing image quality.

Next, I had ChatGPT write another prompt without any specific instructions, telling it only to “try a different approach.” Let’s see what it came up with!

A woman in a black turtleneck and beige pants elegantly enjoys her ramen in a modernly lit restaurant. The rising steam adds to the warm and tranquil atmosphere.

This looks just like a cover from Tokyo Calendar magazine. Could this be Ginza? The black-themed interior with ambient lighting creates the perfect atmosphere of a modern, hidden gem for ramen lovers.

When there aren’t many specific requests, it might actually be better to let ChatGPT write the prompt freely rather than over-directing it. After that, making small adjustments to refine the details seems like the best approach.

A woman sits at a food stall counter, eating ramen while holding a transparent umbrella. The neon lights of the rain-soaked city glow around her as warm steam rises from the bowl.

And here we have a mysterious scene—someone slurping ramen while holding an umbrella. She even keeps her bag slung over her shoulder, making her an impressively skilled multitasker.

ChatGPT’s imagination is truly fascinating. Normally, no one would think to have someone eating ramen while holding an umbrella, yet it manages to present the scene so naturally. That unexpected creativity is what makes this so fun to experiment with.

A woman wearing a green knit sweater and checkered pants enjoys ramen in a shop adorned with retro posters and lanterns. Her tattoos and piercings add a distinctive touch to her look.

When I pointed out to ChatGPT that “the hairstyles have been the same for a while,” it immediately introduced a wide variety of changes. Not only did the hairstyles change, but small details were also adjusted.

With tweaks to makeup and accessories, the overall impression shifts significantly. Also, the maneki-neko (a traditional cat figurine believed to bring good luck) placed alongside the condiments is a nice touch—it adds a bit of character to the scene.

A woman in a navy hoodie sits at a wooden counter, slurping her ramen. In the calm and cozy atmosphere of the restaurant, she takes her time savoring each bite.

It seemed like ChatGPT was starting to go a little overboard, as more prompts were failing to generate properly. So, I told it to “keep it simple.”

As a result, I got a completely ordinary photo of a woman eating ramen, just a typical, everyday scene. And perhaps this is actually the most realistic image of the entire session.

Final Thoughts & Image Collection

With that, I think it’s time to wrap things up—after all, even I’m starting to feel a bit of fatigue.

Image generation is surprisingly fun as a hobby. If you haven’t tried it yet, I highly recommend giving it a shot. You might end up with unexpectedly high-quality images, or even hilarious, completely unintended results, making for an unpredictable and entertaining experience. Plus, for now, it’s free to use.

Lastly, here’s a collection of the images featured in this article, along with some that didn’t make the cut.

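A woman in a black jacket makes a hair-tying gesture in front of a bowl of ramen at a red counter. Menus evoking the Showa era are posted on the wall.
In a warm, cozy ramen shop, a black-haired woman happily lifts her noodles. Steam rises, and the soup is topped with a seasoned egg and nori.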
Four women sit around a table eating ramen. All of them are smiling at the camera.