ComfyUI With Florence 2 Vision LLM - This Is Not Just A Segmentation Model

8,135 views

Future Thinker @Benji

Days ago

ComfyUI With Florence 2 Vision LLM
In this video, I delve into a new LLM - Florence 2, an extraordinary vision foundation model developed by Microsoft. Join me as I discuss its features, demonstrate its capabilities, and guide you through the installation process.
Florence 2: An Image-to-Text Prompt Large Language Model
Florence 2 is trained on the massive FLD-5B dataset, making it one of the most accurate and detailed text generation models for images. In this video, I'll showcase two sets of custom nodes that connect to Florence 2: the KJ version and the Spacepxl version. These custom nodes enable segmentation, image captioning, and object detection.
Explainer About Florence-2 : • Florence-2 And Deepsee...
Workflows In This Tutorial : / 106792381
ComfyUI-Florence-2
huggingface.co/microsoft/Flor...
arxiv.org/abs/2311.06242
github.com/spacepxl/ComfyUI-F...
github.com/kijai/ComfyUI-Flor...
Installing Florence 2 Custom Nodes in ComfyUI
Before we dive into the demonstrations, we need to install the Florence 2 custom nodes. Don't worry, I'll guide you through the process step by step. Just head to the ComfyUI manager, search for the custom nodes, click install, and wait for the downloads to complete. Once installed, you'll have the powerful Florence 2 custom nodes at your fingertips.
Exploring the KJ Version of Custom Nodes
Let's start our journey by testing the KJ custom nodes. With just two simple custom nodes, you'll be able to perform segmentation, captioning, and bounding-box detection. I'll demonstrate their usage on an example image, giving you a clear understanding of how these custom nodes enhance your workflow.
Unleash the Power of the Spacepxl Version
Next, we'll explore the Spacepxl version of the Florence 2 custom nodes. These custom nodes offer even more features and functions, allowing you to create diverse and versatile workflows. With separate custom nodes for each capability, you'll have the flexibility to incorporate Florence 2 seamlessly into your ComfyUI projects.
The Time is Now: Experience the Future of AI Image Generation
Florence 2 harnesses the potential of large language models, providing accurate and detailed text descriptions for any element in an AI image. Witness the magic of caption-to-phrase grounding, region captions, and object detection as Florence 2 effortlessly combines text and visuals. Prepare to be amazed by the future of AI image and video generation!
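For anyone who wants to try these tasks outside ComfyUI, here is a minimal sketch, assuming the Hugging Face transformers route described on the Florence-2 model card. The task tokens are the ones published there. The helper names (build_prompt, run_florence2), the short task keys, and the default model id microsoft/Florence-2-base are illustrative choices, and nothing here claims to mirror what the KJ or Spacepxl nodes do internally.

```python
# Task tokens as published on the Florence-2 model card.
# The dictionary keys are illustrative names, not part of any API.
TASK_TOKENS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "dense_region_caption": "<DENSE_REGION_CAPTION>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
}

# Tasks whose prompt is the task token followed by user text.
TEXT_TASKS = {"phrase_grounding", "referring_segmentation"}


def build_prompt(task, text=None):
    """Return the prompt string Florence-2 expects for a task."""
    token = TASK_TOKENS[task]
    if task in TEXT_TASKS:
        if not text:
            raise ValueError(f"task '{task}' needs a text input")
        return token + text
    return token


def run_florence2(image_path, task, text=None,
                  model_id="microsoft/Florence-2-base"):
    """Run one task on one image (downloads the model on first use)."""
    # Heavy imports stay local so build_prompt works without them installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    prompt = build_prompt(task, text)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Turns the raw token string into captions, boxes, or polygons.
    return processor.post_process_generation(
        raw, task=TASK_TOKENS[task], image_size=(image.width, image.height)
    )
```

A call such as run_florence2("example.jpg", "phrase_grounding", "a red car") would ask the model to locate that phrase in the image; the file name and phrase are placeholders. Swapping the task key is all it takes to move between captioning, detection, and grounding, which is why one model can stand in for several separate tools.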
If you like tutorials like this, you can support our work on Patreon:
/ aifuturetech
Discord : / discord

Comments: 45
@swannschilling474 17 days ago
Great tutorial! Thanks!!😊
@TheFutureThinker 17 days ago
Glad it was helpful!
@reaperhammer 18 days ago
It will be interesting to see how you integrate this into other workflows as you suggested
@TheFutureThinker 18 days ago
Yes I will make another video about that. It should be interesting.
@Ai-dl2ut 17 days ago
@TheFutureThinker Can't Wait :)
@TheCinefotografiando 17 days ago
I have found myself watching your videos more and more
@x3gxu 17 days ago
Hi. How did you make the talking head in the corner? Looks pretty good. It's not wav2lip. V-express? Hallo? I wasn't able to achieve results like this, so I'm very interested. Can you point me in the right direction?
@pabloapiolazza4353 17 days ago
Also interested!
@TheFutureThinker 8 days ago
This is a short videos done for you guys : kzfaq.info/get/bejne/e8lxgLxe2MCdnY0.html :) hope this help
@crazyleafdesignweb 18 days ago
that is great stuff! We are getting out of Stable Diffusion with more alternatives.
@TheFutureThinker 17 days ago
There's more 😉 well, I think many users are going to keep SD1.5 and SDXL, others from Stupidity AI throw to bin. 🤭
@Rico-nj3vl 18 days ago
Very nice !
@TheFutureThinker 18 days ago
Thank you! Cheers!
@kalakala4803 18 days ago
nice! I just check the Florence-2 LLM video you did. this AI model looks promising. Can you integrate this with AnimateDiff V2V?
@TheFutureThinker 18 days ago
😉👍you got it
@vitalis 17 days ago
So cool
@TheFutureThinker 14 days ago
Yup
@triojakeson116 16 days ago
Hey bro i wanted to ask about the kling ai video, can u tell me how much time it will take for u to get accepted after getting into waitlist, cz am already in waitlist for one day, just wanna make sure, if u know pls reply thanks?
@TheFutureThinker 16 days ago
Depends, some got it few days in waiting.
@triojakeson116 16 days ago
@TheFutureThinker also bro am from india, and here all chinese apps are banned, so i had to use an American vpn and a fake chinese number, do u think they will accept my waitlist request if they see all this, will they check all this info 😢
@TheFutureThinker 16 days ago
@triojakeson116 sorry to hear that. This is more about the company policy, I have no comment.
@triojakeson116 16 days ago
@TheFutureThinker ok bro i will update if something happens 😭
@TheFutureThinker 16 days ago
@triojakeson116 but wish you good luck , i see other people are getting access now. So hopefully it will be okay for ya
@RamonGuthrie 10 days ago
Hey can you prompt the Florence2 model on what parts of an image you want to describe? Example Describe the background only or Describe the person in detail only! or are there better vision models for this?
@TheFutureThinker 9 days ago
Yes this vision model can be segment the background and then do captioning
@jairuskersey8311 18 days ago
Nice vid. Can you also make a tutorial on how you made the talking avatar in this video? Thanks~
@TheFutureThinker 17 days ago
just use Hedra, very easy website no need tutorial :) I believe you can do it
@context_eidolon_music 18 days ago
Holy crap!
@SageGoatKing 17 days ago
I don't miss the right click menu bar at all since getting the sidebar where I can pin my favorite nodes, etc.
@TheFutureThinker 17 days ago
Normally, I use search. I don't have favourite nodes. Cause I use too many
@promptaganda 12 days ago
using spacepxl node, i am getting strange polygons for all images i try to run region to segmentation on. The captioning is working correct. any ideas?
@TheFutureThinker 8 days ago
This might possibly happen, when the node have no condition logic to specific conditions you are setting. Hopefully the developer will do a new update and add something to handle this.
@davidsmith-lv4kq 7 days ago
@TheFutureThinker do u mean the prompt must match a word in the caption or region suggestions? i seemed to try alot of different images and regions and could never get it to do anything other than make the polygons
@patagonia4kvideodrone91 15 days ago
There are other nodes, I don't remember the name now, but what do you say detect me such a thing, and it generates the automatic mask, (but not square) but with its real contour,
@TheFutureThinker 15 days ago
Segment Anything
@promptaganda 13 days ago
@TheFutureThinker every time ive ever tried to add a prompt to a segment anything mode it makes zero mask ....... any suggestions?
@TheFutureThinker 13 days ago
@promptaganda what is your setting?
@nkofr 18 days ago
what's the use of the 'finetuned' versions?
@TheFutureThinker 18 days ago
Finetuned model on a collection of downstream tasks
@nkofr 18 days ago
@TheFutureThinker excuse me but what does that mean?
@riggitywrckd4325 11 days ago
It means that if you have a dataset of stuff you want it to learn you can train those ideas and it will learn to spot them like it does in this one.
@bilalalam1 17 days ago
Hi , I would like to use Florence 2 with LM studio
@TheFutureThinker 14 days ago
No sure, but I use OpenWebUi x Ollama
@robd7724 4 days ago
I'm sorry but the AI voice you are using sounds awful and the video is very creepy.