this post was submitted on 25 Jul 2024

Technology


The new global study, in partnership with The Upwork Research Institute, interviewed 2,500 global C-suite executives, full-time employees and freelancers. Results show that the optimistic expectations about AI's impact are not aligning with the reality faced by many employees. The study identifies a disconnect between the high expectations of managers and the actual experiences of employees using AI.

Despite 96% of C-suite executives expecting AI to boost productivity, the study reveals that 77% of employees using AI say it has added to their workload and created challenges in achieving the expected productivity gains. Not only is AI increasing the workloads of full-time employees, it’s hampering productivity and contributing to employee burnout.

[–] WalnutLum@lemmy.ml 0 points 4 months ago (1 children)

All the models I've used that do TTS/RVC and rotoscoping have definitely not produced professional results.

[–] Hackworth@lemmy.world 0 points 4 months ago* (last edited 4 months ago) (1 children)

What are you using? Cause if you're a professional, and this is your experience, I'd think you'd want to ask me what I'm using.

[–] WalnutLum@lemmy.ml 0 points 4 months ago (1 children)

Coqui for TTS, RVC UI for matching the TTS to the actor's intonation, and DWPose -> controlnet applied to SDXL for rotoscoping

[–] Hackworth@lemmy.world 0 points 4 months ago (1 children)

Full open source, nice! I respect the effort that went into that implementation. I pretty much exclusively use 11 Labs for TTS/RVC, turn up the style, turn down the stability, generate a few, and pick the best. I do find that longer generations tend to lose the thread, so it's better to batch smaller script segments.
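
For anyone who wants to try the batching idea, here's a rough sketch of splitting a script into smaller segments at sentence boundaries before sending each one to a TTS tool. The function name `batch_script` and the 300-character default are made up for illustration, not something any particular TTS service prescribes, and the sentence splitting is a naive regex that will trip on abbreviations like "Dr.".

```python
import re


def batch_script(script: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut
    mid-sentence, so a segment can exceed the limit in that case.
    """
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]

    segments: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    demo = "One two three. Four five six! Seven eight nine? Ten."
    for i, segment in enumerate(batch_script(demo, max_chars=30), start=1):
        print(i, segment)
```

Each segment would then be generated separately (a few takes each, pick the best) and stitched back together in the edit. The right segment length is something to tune by ear.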

Unless I misunderstand ya, your controlnet setup is for what would be rigging and animation rather than roto. I do agree that while I enjoy the outputs of pretty much all the automated animators, they're not ready for prime time yet. Although I'm about to dive into KREA's new key framing feature and see if that's any better for that use case.

[–] WalnutLum@lemmy.ml 0 points 4 months ago

I was never able to get appreciably better results from 11 labs than using some (minorly) trained RVC model :/ The long scripts problem is something pretty much any text-to-something model suffers from. The longer the context the lower the cohesion ends up.

I do rotoscoping with SDXL i2i and controlnet posing together. Without the controlnet posing, I found it tends to smear. Do you just do image2image?