Directed Image Positioning v1, version Image Positioner V1 (ID: 1061880)

Tell it, in natural language, where you want the images placed in your scene. Experimental.

...

After some experimentation, it does seem to improve image coherence and positional adherence. Pics below, with notes in the images.

I have added a public dataset of PNG images with JSON metadata, which I created with Python. The images are not captioned, but the set should be better than anything I found online, if anyone cares to join me in my experiments.


I used GeoShapes that were hand labeled: only the smaller shapes were labeled, with their approximate locations given in various plain-language descriptions rather than any formal scheme. I then let an AI add the descriptive captions after those labels. If I had a more formal dataset of positional images, I could do magic.
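For anyone who wants to build a similar set, here is a minimal sketch of how loose positional labels and JSON metadata could be generated in Python. It is not the script behind the uploaded dataset: the shape and color vocabularies, the field names, the 512x512 canvas, and the helpers `describe_position` and `make_record` are all assumptions made for illustration. It only writes the metadata; drawing the matching PNG is left out.

```python
import json
import random

# Hypothetical vocabulary; the labels used in the real dataset are not documented.
SHAPES = ["circle", "square", "triangle"]
COLORS = ["red", "green", "blue"]


def describe_position(cx, cy, width, height):
    """Map a pixel centre to a loose phrase such as 'upper left'."""
    col = "left" if cx < width / 3 else "right" if cx > 2 * width / 3 else "center"
    row = "upper" if cy < height / 3 else "lower" if cy > 2 * height / 3 else "middle"
    if row == "middle" and col == "center":
        return "center"
    return f"{row} {col}"


def make_record(index, rng, width=512, height=512):
    """Build one metadata record for a single randomly placed shape."""
    size = rng.randint(32, 96)
    # Keep the whole shape inside the canvas.
    cx = rng.randint(size // 2, width - size // 2)
    cy = rng.randint(size // 2, height - size // 2)
    shape = rng.choice(SHAPES)
    color = rng.choice(COLORS)
    where = describe_position(cx, cy, width, height)
    return {
        "file": f"shape_{index:04d}.png",  # assumed naming scheme
        "shape": shape,
        "color": color,
        "center": [cx, cy],
        "size": size,
        "caption": f"a {color} {shape} in the {where}",
    }


rng = random.Random(0)  # fixed seed so the records are reproducible
records = [make_record(i, rng) for i in range(3)]
print(json.dumps(records, indent=2))
```

Splitting the canvas into thirds is just one way to turn coordinates into plain language; varying the phrasing per record would be closer to the informal labeling described above.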

From ChatGPT:

That’s a clever approach! Using GeoShapes with plain language labeling for training is an innovative start, as it provides the AI a loose "positional vocabulary" to work with. This flexibility allows it to generalize placement and context, which is why you’re seeing those natural, somewhat structured compositions.

To elevate this even further, a more formal positional dataset would be incredibly powerful. Here are a few ideas for constructing such a dataset to amplify the precision and adaptability of your model:

1. Gridded Coordinate System: Create images with shapes positioned according to a defined grid (e.g., 3x3 or 5x5). Each grid cell would represent a different placement: top-left, center-right, bottom-middle, etc. Labels would include grid-based coordinates (e.g., (1,1) for top-left), allowing for a finer positional mapping. This structured approach could help the AI learn precise placements.

2. Layered Position Descriptions: Instead of just approximate descriptions, use layered terms. For instance, combine location (e.g., top-left, bottom-center), size (small, medium, large), distance from edges, and angles (rotation, tilt). This would give the model richer metadata, allowing for more specific placements, rotations, and even layering of shapes.

3. Structured Language for Composition Rules: By integrating a formalized language structure that includes rules for composition (e.g., “centered vertically, offset slightly left horizontally”), the model could learn to arrange shapes based on relational positions—how one object should sit relative to others or the background.

4. Dynamic Background and Lighting Contexts: Incorporate variations in backgrounds and lighting to teach the model how positional accuracy changes in context. For example, adding shadows based on hypothetical light sources could make positional interpretation even more robust.

5. Use of Bounding Boxes or Anchor Points: Label images with bounding boxes or anchor points for shapes, with coordinates that specify exact placements. These markers would help the AI learn exact placements rather than just "approximate" positioning, useful for precise layouts.

6. Combined Shapes in Hierarchical Order: Introducing overlapping or adjacent shapes with a hierarchy of importance (e.g., primary, secondary) would let the model understand layered compositions. For example, a small red square within a larger green square can be tagged as "centered within green square," helping the AI learn containment and layering nuances.
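As a rough illustration of ideas 1 and 2 in the list, the sketch below maps a shape's centre to a 1-based grid cell, with (1,1) as top-left, and builds a layered caption from size, cell name, grid coordinates, and rotation. The function names, the size thresholds, and the caption format are assumptions, not part of any existing dataset or tool, and distance from edges is left out.

```python
ROW_NAMES = ["top", "center", "bottom"]
COL_NAMES = ["left", "center", "right"]


def grid_cell(cx, cy, width, height, n=3):
    """Return the 1-based (row, col) of the n x n grid cell containing (cx, cy)."""
    col = min(int(cx * n / width), n - 1) + 1
    row = min(int(cy * n / height), n - 1) + 1
    return row, col


def cell_name(row, col):
    """Name a cell of the 3x3 grid, e.g. (1, 1) -> 'top-left'."""
    r, c = ROW_NAMES[row - 1], COL_NAMES[col - 1]
    return "center" if r == c == "center" else f"{r}-{c}"


def size_word(size, width):
    """Bucket a shape size by its fraction of the canvas width (assumed thresholds)."""
    frac = size / width
    return "small" if frac < 0.15 else "medium" if frac < 0.35 else "large"


def layered_caption(shape, cx, cy, size, rotation_deg, width=512, height=512):
    """Combine size, location, grid coordinates and rotation into one caption."""
    row, col = grid_cell(cx, cy, width, height)
    return (
        f"{size_word(size, width)} {shape}, {cell_name(row, col)}, "
        f"grid ({row},{col}), rotated {rotation_deg} degrees"
    )


print(layered_caption("triangle", 60, 450, 48, 30))
# -> small triangle, bottom-left, grid (3,1), rotated 30 degrees
```

Passing `n=5` to `grid_cell` gives the 5x5 variant, although `cell_name` as written only names cells of the 3x3 grid.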

Once you have this kind of dataset, you could train the AI to handle much more complex layouts, orientations, and compositions. With such a structured dataset, the potential for precise layout and arrangement—whether for graphic design, illustration, or even UI composition—would be almost endless. Let me know if you’d like to discuss any of these ideas further or explore ways to implement them!
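The bounding-box and containment ideas (5 and 6) could be prototyped along these lines. This is only a sketch under assumptions: the `Box` class, the `relation` helper, the 5% centring tolerance, and the output phrases are invented here for illustration, and boxes are axis-aligned with pixel coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box: (x0, y0) is top-left, (x1, y1) is bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)

    def normalized(self, width, height):
        """Return [x0, y0, x1, y1] scaled to the 0..1 range of the canvas."""
        return [self.x0 / width, self.y0 / height,
                self.x1 / width, self.y1 / height]

    def contains(self, other):
        """True if `other` lies entirely inside this box (edges may touch)."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)


def relation(inner_name, inner, outer_name, outer, tolerance=0.05):
    """Describe `inner` relative to `outer` with a short phrase."""
    if not outer.contains(inner):
        return f"{inner_name} not inside {outer_name}"
    (ix, iy), (ox, oy) = inner.center, outer.center
    # Centred if the centres differ by at most `tolerance` of the outer size.
    dx = abs(ix - ox) / (outer.x1 - outer.x0)
    dy = abs(iy - oy) / (outer.y1 - outer.y0)
    if dx <= tolerance and dy <= tolerance:
        return f"{inner_name} centered within {outer_name}"
    return f"{inner_name} inside {outer_name}, off-center"


green = Box(100, 100, 400, 400)
red = Box(225, 225, 275, 275)
print(relation("red square", red, "green square", green))
# -> red square centered within green square
print(red.normalized(512, 512))
```

Storing the normalized box next to the generated phrase would keep both the exact coordinates and the plain-language tag in the same metadata record.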

Description:

Re-uploaded to change the name. Don't be mad if it isn't perfect or doesn't work very well lol, I don't know yet.

Trained words:

Name: Image_Positioner_v1.safetensors

Size (KB): 18809

Type: Model

Pickle scan result: Success

Pickle scan message: No Pickle imports

Virus scan result: Success

Name: myGeoShapesFromCode.zip

Size (KB): 662

Type: Training Data

Pickle scan result: Success

Pickle scan message: No Pickle imports

Virus scan result: Success

Original link: https://1111down.com/1167634.html. Please credit the source when reposting.