How to Do Stuff with AI: A Practical Guide to Using Today’s Best Tools

The world of Artificial Intelligence is evolving at breakneck speed. Just recently, we witnessed the launch of Claude 2, arguably the second most powerful AI accessible to the public. Prior to that, OpenAI introduced Code Interpreter, a highly sophisticated AI mode. And even before that, AI systems gained the ability to “see” images, marking a continuous stream of rapid advancements.

Despite these groundbreaking developments, a curious absence persists: user-friendly documentation from AI labs themselves. Instead, guidance on utilizing these powerful tools seems to be primarily disseminated through informal channels like Twitter threads and online rumors. This “documentation-by-rumor” approach is perplexing, particularly from organizations that emphasize responsible technology use.

Consider this guide an orientation to the current AI landscape. It draws on my ongoing efforts to create a “Getting Started Guide to AI” for my students and readers, and this version reflects the significant shifts of recent months. The AI realm is in constant flux, so the guide needs frequent updates to stay current.

This guide offers an opinionated perspective, grounded in practical experience, on how to do specific tasks effectively by choosing the right AI tool. For a broader view of potential AI applications, you might find it helpful to first read my article on tasks you may want AI to do.

Understanding the Major Large Language Models (LLMs)

When we talk about AI in 2023, the conversation largely revolves around Large Language Models (LLMs). These models are the engine behind most contemporary AI applications. A select few organizations develop these “Foundation Models,” and they typically offer direct access through chatbots. Key players include:

  • OpenAI: Creators of GPT-3.5 and GPT-4, powering ChatGPT and Microsoft’s Bing AI (accessible via Edge browser).
  • Google: Offers various models under the Bard umbrella.
  • Anthropic: Developers of Claude and Claude 2.

While other LLMs exist, like Pi (focused on conversational AI) and numerous open-source models, this guide will concentrate on these primary, readily accessible options. Open-source models, while promising, are generally less user-friendly for casual users at present.

Here’s a quick comparison table to give you an overview of the current LLM landscape:


A comparison chart of Large Language Models, highlighting key features and providers like OpenAI’s GPT models, Google’s Bard, and Anthropic’s Claude.

The first four entries, including Bing AI, are based on OpenAI technology. OpenAI currently operates two major models: GPT-3.5 and GPT-4. GPT-3.5 sparked the current AI surge in late 2022, while GPT-4, released in spring 2023, represents a significant leap in capability. Free ChatGPT users typically get GPT-3.5. GPT-4 also comes in variants: one with plugins that connect it to the internet and other apps, and Code Interpreter, a particularly powerful version that can run Python code. Apart from the plugin versions and a browsing-enabled GPT-4 that is temporarily suspended, none of these models has direct internet access.

Microsoft’s Bing AI uses a blend of GPT-4 and GPT-3.5 and is often the first to roll out new features in the GPT-4 family, such as creating and viewing images and reading documents in the browser, along with internet connectivity. Bing AI can be unconventional to use but offers substantial power.

Google’s Bard, powered by models like PaLM 2, is the company’s consumer AI offering. Despite Google’s foundational role in LLM technology, Bard initially underperformed its competitors. Recent improvements, including code execution and image interpretation, show that development is ongoing, but for most tasks it is advisable to try other options first.

Anthropic’s Claude 2 stands out with its exceptionally large context window, essentially its working memory. Claude 2 can process vast amounts of text, such as entire books or numerous PDFs, at once. Furthermore, it is engineered to be less prone to generating harmful or inappropriate content compared to other LLMs, which sometimes manifests as a more cautious or even slightly scolding tone in its responses.
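
To make the idea of a context window concrete, here is a minimal sketch of a pre-flight check you might run before pasting a long document into a chatbot. Everything specific in it is an assumption for illustration: the figure of roughly 4 characters per token is a coarse rule of thumb for English text rather than a real tokenizer, the 100,000-token default is an illustrative window size and not a quoted specification, and the function names are hypothetical.

```python
# Rough check of whether a text is likely to fit in a model's context window.
# ASSUMPTIONS (not from any vendor documentation):
#   - ~4 characters per token is a coarse heuristic for English text.
#   - The 100_000-token default window is an illustrative figure only.

CHARS_PER_TOKEN = 4  # heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate: text length divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, context_window: int = 100_000,
                    reserve: int = 2_000) -> bool:
    """True if the estimate leaves `reserve` tokens of room for the reply."""
    return estimate_tokens(text) + reserve <= context_window


sample = "word " * 50_000  # 250,000 characters of filler text
print(estimate_tokens(sample), fits_in_context(sample))
```

If the check fails, splitting the document into parts and summarizing each one separately is a reasonable fallback. For an exact count you would need the model's own tokenizer.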

Now, let’s delve into practical applications and explore how to do specific tasks with these AI tools.

How to Write Stuff with AI

Best Free Options: Bing AI and Claude 2
Paid Option: ChatGPT 4.0/ChatGPT with plugins

Currently, GPT-4 remains the leading AI for writing tasks. You can access it for free through Bing AI (in “creative mode”) or via a $20/month ChatGPT Plus subscription. Claude 2 is a strong second choice, offering a useful free tier.

These AI writing tools are increasingly being integrated into familiar office software. Microsoft Office will feature a GPT-powered Copilot, and Google Docs will incorporate suggestions from Bard. These integrations promise significant changes in how we approach writing.

Here are some effective ways to handle writing tasks with AI:


  • Drafting Content: Provide a topic or outline, and AI can generate a first draft of articles, reports, emails, or even creative writing pieces.
  • Improving Existing Text: Paste in your writing and ask the AI to refine it for clarity, conciseness, tone, or style. You can specify improvements like “make this more professional” or “rewrite this to be more persuasive.”
  • Summarizing Long Documents: Feed lengthy articles or documents to AI and request concise summaries, extracting key points and arguments.
  • Brainstorming and Idea Generation: Use AI to overcome writer’s block by asking it to generate ideas for topics, headlines, or different angles on your subject matter.
  • Translation: While accuracy should be verified, AI can provide rapid translations of text between languages.

Things to be mindful of when writing with AI: AI models are prone to “hallucinations,” fabricating plausible-sounding but false information. AI lies convincingly and consistently. Always verify every fact and detail it provides. This is particularly critical when requesting references, quotes, citations, or information sourced from the internet (for models without live web access). Bing AI, leveraging GPT-4 and its internet connection, generally exhibits fewer hallucinations. Consult this guide for strategies on minimizing AI hallucinations, but complete elimination is currently impossible.

Furthermore, AI explanations of its own reasoning are often fabricated. When asked to explain its writing process, it generates plausible-sounding justifications that are not actual reflections of its internal operations. This makes identifying and mitigating biases within these systems exceptionally challenging, despite their likely existence.

Finally, be aware of the ethical implications. AI can be misused for manipulation or academic dishonesty. Users are ultimately responsible for the ethical application of these tools and the output they generate.

How to Make Images with AI

Most Transparent Option: Adobe Firefly
Open Source Option: Stable Diffusion
Best Free Option: Bing AI or Bing Image Creator (using DALL-E), Playground AI (supports multiple models)
Best Quality Images: Midjourney

Four primary image generators are widely available:

  1. Stable Diffusion: An open-source option, runnable on powerful computers. It requires a steeper learning curve for prompt engineering but yields excellent results, particularly for combining AI with existing images. This guide offers a comprehensive overview of Stable Diffusion.
  2. DALL-E (OpenAI): Integrated into Bing AI (creative mode) and Bing Image Creator. A reliable system, but generally less advanced than Midjourney.
  3. Midjourney: Considered the leading image generator in mid-2023, and also the easiest to learn. Simply input “thing-you-want-to-see --v 5.2” (the --v 5.2 is crucial for using the latest model) for impressive results. Midjourney operates through Discord. Refer to this guide on using Discord.
  4. Adobe Firefly: Integrated into Adobe products. While currently lagging behind DALL-E and Midjourney in image quality, Adobe emphasizes its ethical training data, using only licensed images.
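
The Midjourney prompt pattern described in the list above (a description followed by the --v 5.2 version flag) can be sketched as a small helper. This only builds the text you would paste into Midjourney's /imagine command in Discord. It does not call any Midjourney API, and the function name is hypothetical.

```python
def build_midjourney_prompt(description: str, version: str = "5.2") -> str:
    """Return a prompt string of the form '<description> --v <version>'.

    Hypothetical helper for illustration: it only formats text and does
    not talk to Midjourney or Discord.
    """
    # Collapse runs of whitespace so the prompt stays on one clean line.
    description = " ".join(description.split())
    if not description:
        raise ValueError("description must not be empty")
    return f"{description} --v {version}"


print(build_midjourney_prompt(
    "Fashion photoshoot of sneakers inspired by Van Gogh"
))
```

Keeping the version flag in one place makes it easy to update when a newer model is released.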

Here’s a visual comparison of these models, each generating an image from the same prompt:


Prompt: “Fashion photoshoot of sneakers inspired by Van Gogh” – the first images generated by each model.

Considerations for AI image generation: These systems inherit biases from their training data, primarily sourced from the internet. For example, prompting for an “entrepreneur” image is likely to generate more images of men than women unless “female entrepreneur” is specified. This tool allows you to explore these biases.

Furthermore, the training datasets often include copyrighted artwork, raising legal and ethical concerns regarding AI art generation. Copyright ownership of generated images is also legally ambiguous.

Currently, AI image generators struggle with rendering realistic text within images, often producing gibberish. However, Midjourney excels at depicting hands, a historically challenging element for AI image generation.

How to Come Up with Ideas Using AI

Best Free Option: Bing AI
Paid Option: ChatGPT 4.0, though Bing AI’s internet connectivity often makes Bing the better choice.

Paradoxically, AI’s constraints and quirks make it remarkably effective for idea generation. Generating a high volume of ideas is often key to discovering truly innovative ones, and AI excels at quantity. With effective prompting, you can also push AI towards highly creative outputs.

Utilize Bing AI in creative mode to explore unconventional idea-generation techniques. Request it to apply methods like Brian Eno’s Oblique Strategies or Marshall McLuhan’s Tetrads. Alternatively, ask for unconventional ideas inspired by random patents or your favorite superheroes.

Example prompts for idea generation:

  • “Brainstorm 10 marketing campaign ideas for a new eco-friendly water bottle, targeting Gen Z.”
  • “Generate 5 product ideas that combine virtual reality and fitness.”
  • “Using the principles of biomimicry, suggest 3 innovative solutions for urban traffic congestion.”
  • “Develop 7 story ideas for a science fiction novel set on Mars, incorporating themes of artificial intelligence and resource scarcity.”
  • “Create a list of 10 potential names for a new coffee shop that evokes a sense of community and sustainability.”
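
The example prompts above share a pattern: a verb, a quantity, a kind of idea, a subject, and an optional constraint. As a minimal sketch, assuming you want to reuse that pattern across many topics, a small template function can assemble such prompts. The function and parameter names are hypothetical, and the output is plain text to paste into whichever chatbot you prefer.

```python
def idea_prompt(count: int, kind: str, subject: str,
                constraint: str = "", verb: str = "Brainstorm") -> str:
    """Assemble an idea-generation prompt from its parts.

    Hypothetical helper: it only formats a string and does not call
    any AI service.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    prompt = f"{verb} {count} {kind} for {subject}"
    if constraint:
        prompt += f", {constraint}"
    return prompt + "."


print(idea_prompt(10, "marketing campaign ideas",
                  "a new eco-friendly water bottle", "targeting Gen Z"))
```

Because asking for a high volume of ideas is the point, raising `count` or looping over several constraints is a cheap way to get more variety.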

How to Make Videos with AI

Best Animation Tool: D-ID for animating faces in videos. Runway v2 for text-to-video creation.
Best Voice Cloning: ElevenLabs

Creating videos with AI is becoming increasingly accessible. You can now generate complete videos featuring AI-generated characters, AI-written scripts, AI-synthesized voices, and AI-driven animation. AI can even create deepfakes, as demonstrated in this example of a self-deepfake. Instructions and details are available here. Exercise caution with deepfakes, but these tools are excellent for explainer videos and introductions.

Runway v2, the first commercially available text-to-video tool, recently launched. While it currently produces short, 4-second clips, it offers a glimpse into the future of text-to-video technology and is worth exploring to understand the direction of this field.

Ethical considerations for AI video generation: Deepfakes raise serious ethical concerns, so use these technologies responsibly.

How to Work with Documents and Data Using AI

For Data (and code-related tasks): Code Interpreter
For Documents: Claude 2 for large or multiple documents, Bing AI Sidebar for smaller documents and web pages.

Code Interpreter was discussed in detail in a previous post. This GPT-4 mode allows you to upload files, enables AI to write and execute code, and provides downloadable results. It facilitates program execution, data analysis (though statistical knowledge is needed to verify results), and file creation, including web pages and even games. While debates exist regarding the risks of untrained users performing data analysis with it, experts have been impressed by Code Interpreter’s capabilities, with some suggesting it will reshape data science education. Refer to the previous article for detailed usage instructions. A starting prompt for setting up Code Interpreter for effective data visualizations is available here, providing basic chart design principles and reminding the AI of its diverse output file format capabilities.
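
Since the results of an AI-run analysis still need verifying, one lightweight habit is to recompute a headline number yourself. The sketch below is not output from Code Interpreter. It is a hypothetical, standard-library-only example of spot-checking a mean that a tool reported for one column of a CSV file, with made-up sample data and function names.

```python
import csv
import io
import statistics


def column_summary(csv_text: str, column: str) -> dict:
    """Compute basic statistics for one numeric column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows where the cell is missing or blank.
    values = [float(row[column]) for row in reader
              if (row[column] or "").strip()]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def matches_reported(summary: dict, reported_mean: float,
                     tolerance: float = 0.01) -> bool:
    """True if a reported mean agrees with our own within `tolerance`."""
    return abs(summary["mean"] - reported_mean) <= tolerance


# Made-up sample data for illustration only.
data = "region,sales\nnorth,10\nsouth,20\neast,30\n"
summary = column_summary(data, "sales")
print(summary, matches_reported(summary, 20.0))
```

A matching mean does not prove the whole analysis is right, but a mismatch is a quick signal to look more closely.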

Claude 2 excels at processing text, especially PDFs. Previous versions of Claude have successfully processed entire books, and Claude 2 is even more powerful. Past experiences and useful prompts are documented here. It effectively summarizes complex academic articles and allows for in-depth interrogation of the material through follow-up questions about evidence, author conclusions, and more.


Data and document processing caveats: Hallucinations persist, although potentially in more limited forms. Accuracy verification remains crucial for ensuring reliable results.

How to Get Information and Learn with AI

Best Free Option: Bing AI
Paid Option: Bing AI generally remains optimal. For children’s education, Khanmigo from Khan Academy offers excellent AI-powered tutoring based on GPT-4.

Using AI as a direct search engine replacement is currently discouraged due to hallucination risks and limited internet connectivity in most models (making Bing AI a notable exception). Google’s Bard, in particular, tends to hallucinate more frequently. However, emerging research suggests that, when used judiciously, AI can provide more valuable answers than traditional search engines, as indicated by a recent pilot study. In areas where search engines are less effective, such as tech support, dining recommendations, or advice seeking, Bing AI often serves as a superior starting point compared to Google. This area is rapidly evolving, but caution remains advisable for now. Legal repercussions for relying on AI-generated information have already emerged.

The educational potential of AI is particularly exciting. AI can enhance teaching methodologies and assist teachers in lesson planning and effectiveness. Furthermore, AI supports self-directed learning. You can ask AI to explain complex concepts effectively. This prompt creates an automated AI tutor, and a direct link to activate this tutor in ChatGPT is available here. Given the potential for AI hallucinations, critical data should always be cross-referenced with reliable sources.

Beyond the Basics: Exploring Further AI Capabilities

Due to the rapid pace of AI development, the tools discussed here likely represent the least sophisticated AI we will use in the future. The advancements of recent months underscore this point, and future guides will undoubtedly be necessary to keep pace.

Two fundamental principles regarding AI remain constant:

  • AI is a Tool, Not a Panacea: It is not universally applicable. Carefully evaluate whether AI is the appropriate tool for your intended purpose, considering its inherent limitations.
  • Ethical Considerations are Paramount: AI raises numerous ethical concerns, including copyright infringement, academic dishonesty, plagiarism, and manipulation. The development and deployment of AI models also involve complex ethical questions regarding benefits and access. Ultimately, users bear the responsibility for ethical AI utilization.

We are in the nascent stages of a transformative technological revolution. Are there other AI applications you’re interested in exploring or how-to questions you have? Share your thoughts in the comments below.


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
