Machine learning tool trains on old code to spot bugs in new code

Microsoft and Altran release Code Defect AI to identify potential problems in software development and suggest fixes.

Altran has released a new tool that uses artificial intelligence (AI) to help software engineers spot bugs during the coding process instead of at the end.

Available on GitHub, Code Defect AI uses machine learning (ML) to analyze existing code, spot potential problems in new code, and suggest tests to diagnose and fix the errors.

Walid Negm, group chief innovation officer at Altran, said that this new tool will help developers release quality code quickly.

“The software release cycle needs algorithms that can help make strategic judgments, especially as code gets more complex,” he said in a press release.

Code Defect AI uses several ML techniques, including random decision forests, support vector machines, multilayer perceptrons (MLP) and logistic regression. The platform extracts, processes and labels historical data to train the algorithm and build a reliable decision model. Code Defect AI then gives developers a confidence score that predicts whether new code is compliant or buggy.
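To make the idea of a confidence score concrete, here is a minimal sketch of how a logistic regression model could turn a few commit features into a probability that a commit is buggy. The feature names, weights, bias and threshold are hypothetical values chosen for illustration; they are not taken from Code Defect AI, which would learn its own parameters from historical data.

```python
import math

# Hypothetical weights for illustration only; a trained model would learn these.
WEIGHTS = {"lines_changed": 0.01, "files_touched": 0.2, "author_prior_bugs": 0.5}
BIAS = -3.0


def bug_confidence(features):
    """Return a score between 0 and 1 estimating how likely a commit is buggy."""
    z = BIAS + sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def classify(features, threshold=0.5):
    """Label a commit 'buggy' or 'compliant' and return the score alongside it."""
    score = bug_confidence(features)
    return ("buggy" if score >= threshold else "compliant", score)


small_commit = {"lines_changed": 10, "files_touched": 1, "author_prior_bugs": 0}
large_commit = {"lines_changed": 200, "files_touched": 5, "author_prior_bugs": 2}
print(classify(small_commit))
print(classify(large_commit))
```

In this sketch the threshold is the knob a team would tune: lowering it flags more commits for review, raising it flags fewer.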

Here is how Code Defect AI works:

  1. For an open source GitHub project, historical data is collected using RESTful interfaces and the Git CLI. This data includes the complete commit history and the complete bug history.
  2. Preprocessing techniques like feature identification, label encoding, one hot encoding, data scaling and normalization are applied to the collected historical commit data.
  3. Labelling is performed on the preprocessed data. The labelling process involves understanding the pattern by which fix commits (those that close a bug) are tagged for each closed issue. After the fix commits are collected, the commits that introduced the bugs are identified by backtracking through the historical changes to each file in a fix commit.
  4. If a data set contains far fewer buggy records than clean records, synthetic data is also generated to avoid bias toward the majority class.
  5. Multiple modelling algorithms are trained on the prepared data.
  6. Once a model reaches acceptable precision and recall values, it is selected and deployed for prediction on new commits.
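Steps 3, 4 and 6 above can be sketched in a few lines of Python. This is a simplified illustration, not Altran's implementation: the commit dictionaries, the "Fixes #N" message convention and every function name here are assumptions, and random duplication of minority rows stands in for whatever synthetic data generation the tool actually uses.

```python
import random
import re

# Assumed convention: a fix commit names the issue it closes, e.g. "Fixes #7".
FIX_PATTERN = re.compile(r"\b(?:fix(?:es|ed)?|close[sd]?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE)


def find_fix_commits(commits, closed_issues):
    """Map the sha of each commit that references a closed issue to that issue number."""
    fixes = {}
    for commit in commits:
        match = FIX_PATTERN.search(commit["message"])
        if match and int(match.group(1)) in closed_issues:
            fixes[commit["sha"]] = int(match.group(1))
    return fixes


def label_bug_introducers(commits, fix_shas):
    """For each file in a fix commit, flag the most recent earlier commit that touched it.

    `commits` must be ordered oldest first.
    """
    buggy = set()
    for index, commit in enumerate(commits):
        if commit["sha"] not in fix_shas:
            continue
        for path in commit["files"]:
            for earlier in reversed(commits[:index]):
                if path in earlier["files"]:
                    buggy.add(earlier["sha"])
                    break
    return buggy


def oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    buggy = [row for row, label in zip(rows, labels) if label == 1]
    clean = [row for row, label in zip(rows, labels) if label == 0]
    minority, minority_label = (buggy, 1) if len(buggy) < len(clean) else (clean, 0)
    deficit = abs(len(buggy) - len(clean))
    if not minority:  # nothing to copy from, so return the data unchanged
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority_label] * deficit


def precision_recall(actual, predicted):
    """Compute precision and recall for the 'buggy' class (label 1)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A team could run `precision_recall` on held-out commits for each trained model and deploy only a model whose two values clear thresholds it has agreed on in advance.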

Code Defect AI supports integration with third-party analysis tools and can help identify bugs in a given program's code. The tool also lets developers assess which features in the code should take higher priority for bug fixes.

“Microsoft and Altran have been working together to improve the software development cycle, and Code Defect AI, powered by Microsoft Azure, is an innovative tool that can help software developers through the use of machine learning,” said David Carmona, general manager of AI marketing at Microsoft, in a press release.

Code Defect AI can be hosted on premises as well as on cloud computing platforms such as Microsoft Azure. The solution can be integrated with other source-code management tools as needed.

AI employee joins the dev team

In a new report about artificial intelligence and software development, Deloitte predicts that more and more companies will use AI-assisted coding tools. From January 2018 to September 2019, software vendors launched dozens of AI-powered software development tools, and startups working in this space raised $704 million over a similar timeframe.

The biggest benefit from these platforms is efficiency, according to Deloitte analysts David Schatsky and Sourabh Bumb, the authors of “AI is helping to make better software”:

“The benefits of AI-assisted coding are numerous. However, the principal benefit for companies is efficiency. Many of the new AI-powered tools work in a similar way to spell- and grammar-checkers, enabling coders to reduce the number of keystrokes they need to type by around 50%. They can also spot bugs while code is being written, while they can also automate as many as half of the tests needed to confirm the quality of software.”

This capability is even more important as companies continue to rely on open-source code.

The Deloitte report says these tools can speed up the coding process significantly by “reducing the number of keystrokes developers need to type by half, catching bugs even prior to code review or testing, and automatically generating half of the tests needed for quality assurance.”

According to the report, these tools are best suited for these elements of the software development process:

  1. Project requirements
  2. Coding, review and bug detection, and resolution
  3. More thorough testing
  4. Deployment
  5. Project management
