- IBM trained its watsonx.ai large language model to ingest COBOL code and rewrite business applications in Java, the company announced Tuesday.
- The generative AI solution is designed to ease mainframe modernization, assisting developers in the arduous process of analyzing, refactoring and transforming legacy code and validating the results, Skyla Loomis, VP of IBM Z Software, said during a demonstration.
- IBM intends to deploy the AI-enabled coding assistant’s new capabilities by the end of the year, the company said.
The specter of technical debt haunts organizations, often leaving critical business functions perched perilously atop layers of arcane code. Despite modernization efforts, businesses still run on-prem applications architected with COBOL, a programming language created in 1959.
IBM estimates that individual enterprise clients may each have tens of millions of lines of COBOL running in production. Globally, enterprise production systems run more than 800 billion lines of COBOL daily, according to a Vanson Bourne study commissioned last year by software company Micro Focus.
Several generative AI companies, including Anthropic and OpenAI, recently introduced coding assistants. In February, Microsoft released GitHub Copilot for Business, an AI-enabled developer tool for the enterprise, and saw its user base double in the first half of the year.
While human language contains nuances and tonal variations that can outwit the best commercially available models, computer code consists of straightforward machine instructions with clearly articulated semantics.
Errors and hallucinations can occur in coding translations, but they are relatively easy to identify and resolve, said Kyle Charlet, IBM fellow and CTO of IBM Z Software.
“Code doesn't lie, so we can immediately highlight any hallucinations that have worked their way into the code and correct them,” said Charlet.
The company trained the LLM on its COBOL data and tested it against IBM's CodeNet, a database of 14 million code samples in more than 55 common programming languages, including C++, Java, Python, FORTRAN and COBOL. IBM used CodeNet to gauge the model's accuracy in COBOL-to-Java translation.
The model, which has been trained on more than 80 programming languages and 1.5 trillion tokens of data, was then tuned for two specific use cases: a coding assistant for Red Hat's Ansible automation toolkit and the new COBOL solution, according to the company.
To mitigate risk, all of the model’s training data originated from licensed open source software, Charlet said.
The solution has four functional phases, which Loomis detailed:
- Auto-discovery, in which the tool analyzes the original script, identifies its data dependencies and provides a metadata overview of the application.
- Refactor, in which the tool identifies the application’s business function and suggests modernization updates.
- Transform, in which the user triggers the generative AI’s COBOL-to-Java translation capabilities.
- Validate, in which the tool tests the results to ensure that the new service is semantically and logically equivalent to the original script.
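The validate phase amounts to a behavioral-equivalence check: the legacy routine and its translation should produce the same outputs for the same inputs. A minimal sketch of that idea is below; the interest-calculation routine, its name and the test inputs are all invented stand-ins for illustration, not details of IBM's tooling.

```java
// Hypothetical sketch of differential validation: run the legacy routine's
// behavior and the translated routine on the same inputs and compare.
// All names and logic here are invented for illustration.
public class ValidateSketch {

    // Stand-in for the legacy COBOL routine's observed behavior
    // (e.g. 5% simple interest on a principal held in cents).
    static int legacyInterestCents(int principalCents) {
        return principalCents * 5 / 100;
    }

    // Stand-in for the generated Java translation under test.
    static int translatedInterestCents(int principalCents) {
        return (int) ((long) principalCents * 5 / 100);
    }

    public static void main(String[] args) {
        int[] inputs = {0, 100, 12_345, 1_000_000};
        boolean equivalent = true;
        for (int in : inputs) {
            // Any divergence on any input flags the translation for review.
            equivalent &= legacyInterestCents(in) == translatedInterestCents(in);
        }
        System.out.println(equivalent ? "EQUIVALENT" : "DIVERGENT");
    }
}
```

In practice such checks would draw inputs from recorded production traffic rather than a hand-picked list, but the comparison principle is the same.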
“What we are not doing is a line-by-line COBOL syntax translation to Java,” Loomis said. “When that happens, what you end up with is COBOL syntax expressed in largely unreadable, largely unmaintainable Java.”
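Loomis's distinction can be sketched with a hypothetical example. The business rule, variable names and COBOL-flavored structure below are invented for illustration; neither version comes from IBM's tool.

```java
// Hypothetical sketch: the same invented business rule rendered two ways.
public class TranslationStyles {

    // A line-by-line translation tends to carry COBOL's flat structure
    // into Java: WORKING-STORAGE fields become globals, and PERFORM
    // VARYING loops become index-juggling while loops.
    static double wsOrderTotal;
    static double wsDiscount;

    static double mechanicalTranslation(double[] lineAmounts) {
        wsOrderTotal = 0;
        int idx = 1;                        // COBOL tables are 1-indexed
        while (idx <= lineAmounts.length) { // mirrors PERFORM VARYING
            wsOrderTotal = wsOrderTotal + lineAmounts[idx - 1];
            idx = idx + 1;
        }
        if (wsOrderTotal > 100.0) {
            wsDiscount = wsOrderTotal * 0.05;
        } else {
            wsDiscount = 0;
        }
        return wsOrderTotal - wsDiscount;
    }

    // An idiomatic translation expresses the same rule in the target
    // language's own terms: local state, streams, a conditional expression.
    static double idiomaticTranslation(double[] lineAmounts) {
        double total = java.util.Arrays.stream(lineAmounts).sum();
        double discount = total > 100.0 ? total * 0.05 : 0.0;
        return total - discount;
    }

    public static void main(String[] args) {
        double[] order = {60.0, 50.0};
        System.out.println(mechanicalTranslation(order)); // 104.5
        System.out.println(idiomaticTranslation(order));  // 104.5
    }
}
```

Both versions compute the same result, but only the second reads and maintains like Java, which is the gap IBM says its generative approach is meant to close.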
Expanding the watsonx toolkit is part of IBM's broader business integration strategy, built around hybrid cloud, mainframe modernization, emerging AI capabilities and IT consulting services.
IBM previously partnered with Microsoft to ease mainframe modernization and deploy enterprise-grade generative AI solutions. The two companies introduced the IBM Z and Cloud Modernization Stack on the Microsoft Azure Marketplace in June and launched a generative AI managed service last week.
Correction: This article has been updated to reflect that IBM trained the LLM using its COBOL data. The company used CodeNet to test it for accuracy.