Image: Sofy x josefkubes, Shutterstock

Chat GPT test automation: What’s the verdict?

It's impossible to overstate the hype around ChatGPT. People are even attempting Chat GPT test automation. What are the results?

It seems like you can’t go 10 minutes without seeing or hearing something about OpenAI’s ChatGPT. From LinkedIn posts to eavesdropped coffee shop banter, people are making out like ChatGPT is the most exciting thing to come out of the internet since, well, the internet. There’s a rumbling that Chat GPT test automation might even be the next big thing for mobile app testing.

But is it really?

There’s discussion of how ChatGPT changes everything for application development. It can turn a use case into Java and test cases into Python. It can find mistakes in your code and pull from libraries of libraries. It’ll replace developers and testers completely!

But can it really?

In this post, I cut through the hype and examine a variety of claims and issues surrounding ChatGPT, particularly as it relates to testing automation. 

What’s the deal with ChatGPT?

Developed by the research organization OpenAI, ChatGPT is an AI-driven chatbot. It is built on the GPT-3 family of language models (specifically the GPT-3.5 series), also developed by OpenAI, which use a deep neural network trained on huge sets of data (albeit of unclear origin) to generate human-like responses.

The underlying GPT-3 model has 175 billion parameters, making it one of the largest and most powerful AI language models to date.

From my informal assessment, most people seem to be finding ChatGPT useful for things like:

  • Chatbots for customer service or information retrieval
  • Generating entertaining creative writing samples and prompts
  • Content creation and marketing copy like blogs, product descriptions, and landing pages
  • Summarizing large amounts of information
  • Language translation and text-to-speech synthesis

Yet of all the uses I’ve seen discussed so far, perhaps the most promising is its potential application for developers and testers. 

ChatGPT can generate sample code and test cases based on simple prompts. LambdaTest recently performed a deep dive, instructing ChatGPT to write detailed test scripts using Selenium with Java.

LambdaTest found that while the output was impressive, it had considerable flaws. For starters, ChatGPT didn’t understand the purpose behind the test. Rather, it relied on underlying assumptions and statistical patterns that may or may not be correct.

In addition, it frequently provided incomplete code and needed to be prompted and guided several times to finish.

All this and one of the most important lessons goes over our heads: At the end of the day—for this purpose—it doesn’t save any time. AI is supposed to free up resources, right? The tester had to continually prompt and correct the assumptions of ChatGPT.

It was more like training a junior developer than having a robot assistant.

AI: The not-so-intelligent side

Despite all of the claims of a revolution, something is still amiss with this variety of AI. Especially when left to its own devices.

Now quite notoriously, CNET found itself under intense scrutiny after a large portion of its AI content was found to be plagiarized. And in many cases, just flat-out incorrect.

It may come as a surprise that ChatGPT doesn’t know what the truth is. Instead, it generates text that is intended to sound plausible, based on its (notably mysterious) data sets. That’s exactly the reason that, in many cases, the code generated by ChatGPT was wrong.

As the initial hype around ChatGPT begins to quiet, a variety of industry experts and scholars have increasingly highlighted these issues. For example, Princeton University Computer Science professor Arvind Narayanan characterizes ChatGPT as more of a bullshit generator than truly revolutionary tech:

[ChatGPT] is trained to produce plausible text. It is very good at being persuasive, but it’s not trained to produce true statements. It often produces true statements as a side effect of being plausible and persuasive, but that is not the goal.

So what does this mean for tasks like mobile app development and test automation work? 

There is an equally harsh criticism for that as well.

Test automation coach and award-winning software developer Zhimin Zhan performed an experiment similar to LambdaTest’s. He breaks down three common arguments for using ChatGPT to help write test cases.

His conclusion: ChatGPT is Useless for Real Test Automation. According to Zhan:

In summary, don’t use ChatGPT for test automation. ChatGPT cannot learn test automation, you can. After all, developing test scripts in test automation is only a minor effort, the majority is on stabilizing and ongoing maintenance.

Even if it can help nudge testers in the right direction, there are so many errors that you’re likely just better off doing it yourself. And if you can use assistance, there is no shortage of superior tools for your tasks.

Chat GPT test automation: Is this a thing?

Let’s now turn to the topic of Chat GPT test automation. Can ChatGPT assist in the composition of test automation? The answer is quite simple: Yes, but not very well right now. Currently, ChatGPT misses the mark in some big ways. It’s terrible at making assumptions because it has no real insight into your goals. For this reason, it can be the kiss of death for junior developers or testers who can’t identify glaring inaccuracies.

Beyond that, in many cases it’s working from old and outdated versions of a library, which leaves the generated code full of syntax mistakes and calls that no longer work.
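To make the outdated-library problem concrete, here is a minimal sketch of a check you could run over a generated script before trusting it. It relies on one fact about Selenium's Python bindings: the `find_element_by_*` helpers were removed in version 4.3.0 in favor of `find_element(By.ID, ...)`. Everything else is an assumption for illustration: the function name `find_outdated_calls`, the sample script, and the URL are hypothetical and are not output from any of the experiments mentioned in this post.

```python
import re

# Selenium's Python bindings removed the find_element_by_* / find_elements_by_*
# helpers in 4.3.0; the supported form is find_element(By.ID, "...").
OUTDATED_CALL = re.compile(r"\bfind_elements?_by_\w+(?=\s*\()")


def find_outdated_calls(script: str) -> list[tuple[int, str]]:
    """Return (line_number, call_name) pairs for removed locator helpers."""
    hits = []
    for line_number, line in enumerate(script.splitlines(), start=1):
        for match in OUTDATED_CALL.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Hypothetical snippet of the kind a chatbot might produce (illustration only).
generated_script = """\
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://example.com/login")
driver.find_element_by_id("username").send_keys("demo")
driver.find_element(By.ID, "password").send_keys("secret")
driver.find_elements_by_class_name("error")
"""

for line_number, call in find_outdated_calls(generated_script):
    print(f"line {line_number}: {call} was removed in Selenium 4.3")
```

A scan like this only catches one known class of stale call. It says nothing about whether the test checks the right thing, so the script still needs a human review.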

I think we’re asking ourselves the wrong question here. We’re missing a bigger point. Rather, we should ask if ChatGPT should help write test automation.

Why are we asking ChatGPT to write code-based test scripts? Shouldn’t AI make things easier? Even if ChatGPT wrote perfect, error-free code, it still doesn’t help alleviate the true contemporary pain points around code-based test automation in the first place. For example:

  • More time writing lines of code: Even if it’s written by a robot, it’s still tedious, and must be thoroughly reviewed.
  • Harder to find development resources: You can’t bet on the available skillset to write code-based test cases.
  • Challenges of managing changes: From source control to dependencies, there’s always a web that needs untangling.
  • Environment configuration and setup overhead: These are major time sinks that reduce efficiency and promote corner cutting, potentially leading to drops in quality.

It’s like asking ChatGPT to translate your texts into Morse code.

Frankly, in its current state, ChatGPT doesn’t sound like the revolution I’d like to be a part of. Especially when there are real testing revolutions occurring in our midst.

The real future: Scriptless test automation

If your goal is to prepare for the mobile app testing of the future, then it’s best to simply go code-free: Join the No-Code Revolution. With modern no-code platforms, instead of prodding ChatGPT to write Selenium test classes in Java, Python, or whatever OOP language, you just interact with a declarative UI to build your test automation. 

Stop writing and meticulously managing your test cases.

Stop setting up test data and environments, mocking and stubbing services, and dealing with the complex DevOps that comes with managing change. 

Scriptless test automation saves you headaches by abstracting the tedious and technical details so you can focus on building test automation that scales.

Sofy’s no-code testing automation platform allows you to provision a real device, record a manual test, and automate it. Then, run it across hundreds of real mobile devices in minutes—and from anywhere. Sofy also employs AI and Natural Language Processing, but it doesn’t make things harder. For example:

  • Intelligent regression suggests changes to your test case when updated and new features are detected
  • Sofy’s advanced no-code engine understands the deeper context behind your test. So even if an element name changes or a layout is different, it makes the right assumption. No prodding or hand-holding necessary.
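To give a feel for what "making the right assumption" when an element name changes can look like, here is a small sketch of a fallback locator. This is not Sofy's engine, whose internals this post does not describe. It is a hypothetical illustration of one common idea: try the recorded ID first, then fall back to other recorded attributes. The `Element` class, the `locate` function, and the sample screen are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    resource_id: str  # developer-assigned name, which may change between builds
    text: str         # visible label
    kind: str         # widget type, e.g. "button"


def locate(screen: list[Element], recorded: Element) -> Optional[Element]:
    """Find the recorded element on a new screen, tolerating a renamed ID."""
    # First choice: the ID captured when the test was recorded.
    for candidate in screen:
        if candidate.resource_id == recorded.resource_id:
            return candidate
    # Fallback: same widget type and same visible text.
    matches = [c for c in screen
               if c.kind == recorded.kind and c.text == recorded.text]
    # Only accept an unambiguous match; otherwise report "not found".
    return matches[0] if len(matches) == 1 else None


# Recorded against an older build, where the button was named "btn_login".
recorded = Element("btn_login", "Log in", "button")

# In the new build, the button has been renamed to "btn_sign_in".
new_screen = [
    Element("title", "Welcome", "label"),
    Element("btn_sign_in", "Log in", "button"),
]

found = locate(new_screen, recorded)
print(found.resource_id if found else "not found")  # prints "btn_sign_in"
```

Returning nothing on an ambiguous match is a deliberate choice in this sketch: a test that silently taps the wrong button is worse than one that fails loudly.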

Automation is both natural and good. AI can in fact be helpful. Sofy just does it a little better.

ChatGPT or AI isn’t the natural progression of testing automation. Scriptless automation is.

Splash or ripple?

There’s no question that ChatGPT is making a big splash. But after taking a step back, an increasing number of voices are highlighting that its impact may well be less of a wave and more of a ripple, particularly for mobile app testing—at least in its current state.

There are two important questions we should ask ourselves when we broach the efficacy of AI: Can we? and Should we? One is a question of capability, the other of ethics.

But there’s one more question that I think we should ask: Why are we? Specifically, why are we asking ChatGPT to write code-based test automation when we can do it better by utilizing the No-Code Revolution? There’s no doubt a place for something like Chat GPT test automation around the corner, but you can be sure no-code will be a major component of any such solutions.

Disclaimer: The views and opinions expressed above are those of the contributor and do not necessarily represent or reflect the official beliefs or positions of Sofy.