With ChatGPT being down, I guess I’ll take a break and share my thoughts.
I am by no means an expert, nor have I dived deep enough into these tools to understand their true power.
This is my surface-level, gut reaction to all of the AI that has taken the world by storm over the last few weeks.
First of all, they are fun, they are addictive, and they get better almost every time I circle back. The two main ones I have explored are ChatGPT (Text-Based) and MidJourney (Image-Based).
For the text-based options, I see it as a simple search alternative. You ask a question, give it a prompt, or make a request, and you get a result. It's very easy to use and, in my experience, it provides pretty good feedback. What I love about it is that you can learn almost anything from it. Next time you're exploring, ask how to make a website accessible. Then pick one piece of the response and use that as the next prompt. You can easily dive down rabbit holes and get nearly instant information. No noise, no ads, just the information you requested.
For the image options, I won't lie, I spent way too much time playing with this one and exploring what others were creating with it. While some videos have surfaced of it reproducing protected works, I have a hard time believing it has the capability at this point to create everything from scratch 100 percent of the time. From my experience, the images it creates are only as good as your description. The more detail and structure you set for the parameters, the closer the outcome will be to the vision in your head. I admit I struggled sometimes to get what I wanted, but it does a fairly good job of taking its own approach if the outline is too vague.
Below are some of the images I created and screen recordings of prompting the text-based options.
Mars from the moon, an AI-generated image
The first image used the prompt, “looking at mars as if you were standing on the moon” with the final version coming after generating two variations and upscaling two different versions.
The second image used the prompt, “a dog riding a magic carpet over a city at night” with the final version coming after creating two sets of variations and upscaling one of them.
The first video shows the text response to the prompt, "how do I make a website accessible?" The response shared seven things that can help make a website more accessible: clear and consistent navigation, keyboard accessibility, use of color and contrast, alternative text for images, descriptive link text, captions and transcripts, and testing with assistive technology.
The second video shows the text response to the prompt, "Write a blog about website accessibility". The blog writes about people with disabilities who are impacted by inaccessible experiences and the importance of providing an equivalent experience for everyone. It lightly walks the line of legal requirements, mentioning that accessibility is important "in addition to being the right thing to do", and makes reference to the Americans with Disabilities Act (ADA).
From an accessibility perspective, I did not see any glaring issues that would prevent someone using assistive technologies from using these tools. The main issues I found centered around labels, instructions, feedback, and status messages. For example, there were thumbs up and thumbs down buttons that NVDA announced only as "button". This isn't a huge deal, but it limits feedback opportunities for anyone who may not be able to see the small outline icons off to the far right side of the screen, and it makes the two buttons impossible to tell apart for anyone using a screen reader. Other options that I explored lived inside fairly accessible environments such as Discord and other web-based applications.
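The unlabeled feedback buttons come down to a missing accessible name. As a minimal sketch (the helper name and markup are my own illustration, not the actual code behind those buttons), adding an `aria-label` gives an icon-only button something meaningful for NVDA to announce:

```javascript
// Hypothetical helper: render an icon-only feedback button with an
// accessible name via aria-label. Without it, screen readers announce
// the control only as "button"; with it, NVDA can read e.g.
// "Thumbs up, button" and the two controls are distinguishable.
function feedbackButton(name, icon) {
  return `<button type="button" aria-label="${name}">${icon}</button>`;
}

const thumbsUp = feedbackButton("Thumbs up", "👍");
const thumbsDown = feedbackButton("Thumbs down", "👎");
```

Visible-text buttons get their name for free; it's icon-only controls like these that need the explicit label.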
From the perspective of someone looking to learn or implement accessibility in their websites, documents, or applications, these are great tools to add to the stack. For example, you can go from as high level as "10 ways to make a website more accessible" to as detailed as "steps for adding alt text to images in Squarespace". With this much flexibility, you get nearly instant feedback that can be acted on. What I like most is the potential to dive down rabbit holes, just like in the early days of YouTube, where one video led to another and before you knew it, three hours had passed. The main difference is the ability to put all this new knowledge and information to use.
Another interesting perspective that I have not seen discussed yet is that of alternative text for AI-generated images. Typically we create the image and write the alt text after. But how long should alt text be? How descriptive should it be? Should it be a literal description or a description based on the content adjacent to it? All of these questions come up in conversation, and I think we can all learn something from the AI here. The more descriptive our prompt, the better the result, right? What if we applied this same mindset to alt text, long descriptions, or captions? In a way, it reverse engineers the process by putting the text alternative first, before the image is even created. If we challenge ourselves to write text that accurately describes the image and its purpose, our alternative text will become significantly better and provide a more equivalent experience.
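One way to picture this prompt-first idea: the same descriptive text that drives the image generator can seed the image's alt text. A small sketch, using the magic-carpet prompt from earlier (the file name and helper are hypothetical):

```javascript
// Hypothetical sketch of the prompt-first workflow: reuse the
// generation prompt as the starting point for the image's alt text,
// then refine it by hand for context and purpose if needed.
const prompt = "a dog riding a magic carpet over a city at night";

function imageWithAlt(src, description) {
  return `<img src="${src}" alt="${description}">`;
}

const tag = imageWithAlt("magic-carpet-dog.png", prompt);
```

The point is less the code than the habit: if the description was good enough to generate the image, it is usually a strong first draft of the text alternative.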
With all of that said, I certainly plan to continue learning, exploring, and playing with AI as much as possible. I am not entirely sure what the future holds in terms of making technology and communication more accessible, but I do know it makes it much easier to learn and create actionable steps toward a more inclusive experience.