There’s an AI for that now: Generative AI and the future of work

Table of Contents

Talk about generative AI in the fields of

  • Art: AI paintings and image and video reconstruction (DALL-E)
  • Music: AI-generated songs (SymphonyNet)
  • Voices: AI-generated voices (Resemble/Dubverse)
  • Writing: AI-generated copywriting (ChatGPT)

The new age of work

Talk about how human needs will evolve from creating to curating. We’ll need to become better at giving AI the right prompts than generating work from scratch.

Imagine a world where you can create an entire Metallica-themed album from scratch by typing the word ‘Metal’ into a computer program, or design a custom Renaissance painting of your pet by simply describing its core features and letting AI do the rest. Sounds like something out of a sci-fi show, right? In 2023, the technology is here, and it’s called generative AI. There are so many different types of generative AI that an entire database exists solely to track them. But how do these AI systems work?

Generative AI uses machine learning algorithms to analyze and mimic patterns in existing data to create entirely original content, often indistinguishable from human-created works. 

Generative AI has the potential to revolutionize the way we not only create but also consume media. It has made it possible for anyone to participate in the creator economy. Whether you’re a professional artist looking to up your game or a total novice looking to try your hand at creating something new, the tools of generative AI can be your leg up in the game. Let’s consider the field of writing.

ChatGPT’s lyricism

For copywriters, authors, and college students all over the world, OpenAI’s ChatGPT is both a blessing and a curse. ChatGPT is one of the most capable generative AI tools made publicly available so far. It can write thousand-word college essays from a single prompt, generate snappy advertisement scripts in seconds, or compose poems about current global events in the voices of deceased writers like Yeats or Woolf.

In his article The Brilliance and Weirdness of ChatGPT, Kevin Roose describes how ChatGPT has also made successful guesses at medical diagnoses, created text-based Harry Potter games, and broken down scientific concepts at varying levels of complexity. Perhaps the scariest or coolest (depending on how you want to look at it) thing is that this AI is free for the world to get its hands on.

Before ChatGPT, there was GPT-2, or Generative Pre-trained Transformer 2. Like ChatGPT, GPT-2 could be trained on datasets to produce the kind of writing you wanted to generate. Here is an example of GPT-2 generating a Rick and Morty episode from scratch. The result? As with most generative AI: eerily believable.

Now we have you.com and perplexity.ai, which take in a search query, fetch the relevant pages from Google, and then feed them to GPT-3 to summarise. These AI-based question-answering machines choose the most relevant information and present it as a response to your query.
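
The fetch-then-summarise flow described above can be sketched roughly as follows. This is a minimal illustration, not how you.com or perplexity.ai are actually built: the `Page` class and the `rank_pages` and `build_summary_prompt` functions are hypothetical names, simple word overlap stands in for whatever ranking a real service uses, and the final call to a language model is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """A fetched web page (hypothetical structure for this sketch)."""
    url: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_pages(query: str, pages: list[Page], top_k: int = 2) -> list[Page]:
    """Keep the top_k pages that share the most words with the query.

    Word overlap is an assumption made for illustration only.
    """
    query_words = tokenize(query)
    ranked = sorted(
        pages,
        key=lambda page: len(query_words & tokenize(page.text)),
        reverse=True,
    )
    return ranked[:top_k]


def build_summary_prompt(query: str, pages: list[Page]) -> str:
    """Assemble the text that would be sent to a language model."""
    sources = "\n\n".join(
        f"[{i}] {page.url}\n{page.text}" for i, page in enumerate(pages, start=1)
    )
    return (
        "Answer the question using only the sources below.\n\n"
        f"Question: {query}\n\n"
        f"Sources:\n{sources}\n\n"
        "Answer:"
    )


# Example usage with made-up pages.
example_pages = [
    Page("https://example.com/a", "Cats sleep most of the day."),
    Page("https://example.com/b", "Generative AI creates images from text prompts."),
]
best = rank_pages("how does generative AI use text prompts", example_pages, top_k=1)
print(build_summary_prompt("how does generative AI use text prompts", best))
```

The prompt returned by `build_summary_prompt` is what would be handed to the summarising model; the model then writes the answer shown to the user.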

DALL-E 2’s versatility

But writing isn’t the only field AI is spreading its tendrils into. Let’s move to art, where we see the likes of DALL-E 2 and Stable Diffusion. The viral DALL-E 2 is driven by prompts in the same way ChatGPT is, but the output, in this case, is images. You can give DALL-E 2 prompts like:

  • Darth Vader learning to fly in a McDonald’s, a Renaissance painting
  • Garfield playing baseball and losing in a temple, detailed scientific diagram
  • A gorilla holding a camera and eating asparagus in a school, an oil painting
  • A jam-packed McDonald’s in the style of an impressionist painting

But how does DALL-E 2 function? At a high level, the answer is surprisingly straightforward. Like other generative AI, it works by using a neural network trained on a dataset of images and their associated textual descriptions. When given a textual description, DALL-E 2 uses what it learned from this dataset to generate an image that corresponds to the description. For example, if the description is “a two-story pink house with a white fence and a red door,” DALL-E 2 will generate an image of a house that matches this description.
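
To make the idea of learning from image-text pairs concrete, here is a deliberately tiny toy. It is not how DALL-E 2 is implemented: there is no neural network, each "image" is reduced to a single RGB colour, and the `train` and `generate` functions and the sample data are invented for illustration. It only shows the general pattern of learning associations from captioned examples and then producing output for a new description.

```python
from collections import defaultdict

# Hypothetical training pairs: (caption, "image"), where each image
# is simplified to one RGB colour.
PAIRS = [
    ("red door", (200, 30, 30)),
    ("red car", (220, 20, 20)),
    ("pink house", (250, 180, 200)),
    ("white fence", (245, 245, 245)),
]


def train(pairs):
    """Learn the average colour seen alongside each caption word."""
    sums = defaultdict(lambda: [0, 0, 0])
    counts = defaultdict(int)
    for caption, rgb in pairs:
        for word in caption.lower().split():
            for channel in range(3):
                sums[word][channel] += rgb[channel]
            counts[word] += 1
    return {
        word: tuple(total / counts[word] for total in sums[word])
        for word in sums
    }


def generate(model, description):
    """Produce a colour for a description by averaging its known words."""
    known = [model[w] for w in description.lower().split() if w in model]
    if not known:
        raise ValueError("no known words in description")
    return tuple(
        round(sum(colour[channel] for colour in known) / len(known))
        for channel in range(3)
    )


model = train(PAIRS)
print(generate(model, "a pink house"))
```

A real model replaces the per-word averages with millions of learned parameters and the single colour with a full image, but the inputs and outputs have the same shape: captioned examples go in during training, and a description goes in at generation time.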

Like DALL-E 2, Stable Diffusion generates images from text descriptions, but it has one additional advantage: depth. This means it can be used to generate architectural models from a single prompt. And now we can witness AI generating entire videos from text prompts! One of the unique features of these AI systems, as with ChatGPT, is their ability to generate highly detailed and diverse results, even for descriptions that are relatively vague or abstract. This is because they are trained on large and varied datasets, which allow them to understand a wide range of concepts and generate corresponding output.

SymphonyNet’s fluidity

But words and pictures aren’t all that AI can manipulate. When it comes to music-generating AI, we have the likes of SymphonyNet, which can generate entire symphonies even if all you give it is a starting point. Not only can one use SymphonyNet to create original compositions, but it can also help generate backing tracks for live performances. It also helps with music production tasks such as arrangement and orchestration. Musicians may well use it as a source of inspiration, taking the generated tracks as a starting point and building upon them, as writers would with ChatGPT and painters with DALL-E.

Resemble’s utility

AI’s ability to generate audio doesn’t apply solely to instrumentals; it can do so with voices too. When most people think of AI voices, Siri and Alexa come to mind, but the field has evolved well past these two. The likes of Resemble and us here at Dubverse create AI voices using neural network models trained on datasets, as described earlier.

Using speech cloning, AI can ‘sample’ voices to create complete speeches. Imagine Joe Biden’s voice in a video game announcing the outbreak of a zombie virus in the U.S. Or a docuseries about the serial killer Ted Bundy that can simulate his conversations in real time without any prior recordings of them. At Dubverse, we let you dub any piece of content in any language you want using just a sample of your own voice or any voice you select. Voice cloning is powerful.

To clone a voice, the process typically begins by collecting a dataset of audio samples recorded by the person whose voice is to be cloned. This dataset is then used to train the neural network. Once the network is trained, it can be used to generate new audio in the style of the person whose voice was cloned by providing it with a transcription of the desired text and letting it predict the corresponding audio waveform. The resulting audio can be quite realistic and may be indistinguishable from the original person’s voice to the listener.

Content creators, public speakers, and anybody needing to voice a video now simply have to furnish a sample of their voice, and AI will generate the rest. When it comes to fresh pieces of writing, music, and art, prompts are all we need. So are these jobs obsolete?

The New Age of Work

So, where do we go from here? Do we still need writers, painters, and musicians? Do advertisements, speeches, songs, and movies need to be scripted by us? Or has AI taken over all of it? The short answer: kind of.

Fine-tuning > Creating

The skill set of creating anything from scratch can be left in the hands of software, but the skill of fine-tuning is still very human. A prototype is never perfect. It requires editing, fine-tuning, and making micro-adjustments to match your long-term goal. The future of work might favor humans for being fine-tuners rather than creators. After all, knowing how to give the precise prompt to these creative AIs is a skill in the first place. 

The better the prompts, the more targeted and efficient the use of generative AI becomes. Each model, whether it be DALL-E 2 for art or Jukebox for music, can focus on generating relevant content aligned with a specific goal or purpose. This can be more effective than creating content by hand, as it allows the model to build on existing information and knowledge rather than a person having to start from scratch.
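
One way to treat prompting as a repeatable skill is to assemble prompts from named parts instead of typing them freehand. The helper below is a small sketch under that assumption; `build_prompt` and its parameters are hypothetical and not part of any tool mentioned in this post.

```python
def build_prompt(subject, style="", details=(), avoid=()):
    """Combine a subject, optional details, a style, and exclusions
    into a single prompt string."""
    parts = [subject.strip()]
    parts.extend(d.strip() for d in details if d.strip())
    prompt = ", ".join(parts)
    if style.strip():
        prompt += f", in the style of {style.strip()}"
    if avoid:
        prompt += ". Avoid: " + ", ".join(avoid)
    return prompt


print(build_prompt("a jam-packed McDonald's", style="an impressionist painting"))
print(
    build_prompt(
        "a gorilla holding a camera",
        style="an oil painting",
        details=["eating asparagus", "in a school"],
        avoid=["text", "watermarks"],
    )
)
```

Keeping the subject, details, and style separate makes it easy to change one element at a time and compare results, which is the fine-tuning loop described above.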

Build with a preset foundation

Think of generative AI like a LEGO set. It gives you the building blocks in the shapes and sizes you need to create something stable and accurate. You get to build off of a pre-set foundation made for you by various generative AI. This pre-set foundation can be used in a variety of ways to make work easier for people. For example:

  1. In the field of software development, a generative AI system could be used to create the initial codebase for a new project. This would save developers the time and effort of starting from scratch and allow them to focus on building out the specific features and functionality they need.
  2. In the field of design, a generative AI system could be used to create initial design concepts or layouts for a project, saving designers the time and effort of coming up with ideas from scratch.
  3. In the field of research, a generative AI system could be used to generate initial hypotheses or ideas for investigation, which researchers could then test and verify through experimentation.

Overall, the use of a pre-set foundation created by generative AI can help to streamline the work process and allow people to focus on the most important and creative aspects of their jobs.

Creativity takes center stage

The frequent use of generative AI can enhance collective creativity in several ways:

  1. By automating tedious or time-consuming tasks, generative AI can free up time and mental energy for more creative endeavors. For example, if an architect is using a generative AI system to generate initial design concepts, they can spend more time refining and iterating on those concepts, rather than starting from scratch every time.
  2. Generative AI can also serve as a source of inspiration or a starting point for creative projects. By presenting a wide range of options or ideas, generative AI can spark new ideas or approaches that a person may not have considered on their own. Many authors can come up with story plots by adding a prompt into ChatGPT and then using their creativity to take the plot generated by the AI further. 
  3. Generative AI can also be used to help people explore creative possibilities that would be otherwise impractical or impossible to achieve manually. For example, a generative AI system like DALL-E 2 could be used to create unique and complex designs that would be too time-consuming or difficult for a human to create by hand.

In other words, while generative AI is not a replacement for human creativity, it can be a useful tool to help people unleash their creative potential and explore new ideas.

As generative AI becomes more advanced and widespread, it will likely have a significant impact on the creator economy. Creators of all types (artists, designers, writers, musicians, and more) may find that they can produce more and higher-quality work with the aid of generative AI tools. All the same, the use of generative AI may also lead to new challenges and opportunities for these creators.

For example, creators may need to adapt to new ways of working with generative AI, or they may need to find new ways to differentiate their work in a market that increasingly relies on AI-generated content. There are also ethical and legal implications to consider. If DALL-E 2’s dataset is based on artwork that exists on the internet (a huge portion of which is created and owned by artists), are the AI’s creations truly unique? To what extent are these renditions ‘inspired’ versus plagiarised? And what does the future hold for creators financially if all the grunt work can now be handed off to AI?

Ultimately, the relationship between generative AI and the creator economy will be a complex and evolving one, and it will be important for creators to not just stay informed but also adaptable as this field continues to grow.

Author

Harshmeet

Building a better impact-focused marketing pillar at Dubverse. I like to hack and create solutions the Mar-Tech way. While working I love to listen to Slash, RHCP & occasionally EDM while sipping good espresso.
