Generative AI in software development
For automotive companies dealing with today’s challenges of networked vehicles, autonomous driving, shared mobility and electrification, AI coding assistants offer invaluable support. They enable developers to react faster to changing requirements and implement innovative software solutions more efficiently, which is crucial in these fast-moving areas.
In the fast-paced world of software development, the automation of coding tasks is no longer a dream of the future, but a reality. One of the most exciting advances in this area is the use of generative artificial intelligence to automatically generate code. GitHub Copilot has caused quite a stir as a commercial solution in this context, promising to assist developers as a 'co-pilot' when programming. But what about open source alternatives? Are they just as efficient, cost-effective or even better? In this article, we take you on a journey through our own evaluation processes at msg to answer these questions. We compare commercial offerings with open source alternatives and draw conclusions that could be decisive for both small and large software organizations.
Dragan Sunjka
Lead IT Consultant
Automotive & Manufacturing
Code generation: Pros and cons of existing commercial solutions
The market for AI-powered programming tools is growing rapidly. Microsoft’s GitHub Copilot is leading the way with context-sensitive code suggestions. According to a recent Stack Overflow survey, Microsoft’s solution is highly popular among developers [1]. But there are alternatives: Codeium focuses on high code quality, CodeGeex AI on error detection, Amazon CodeWhisperer on AWS integration and TabNine on broad language support.
Despite the notable benefits offered by commercial AI-supported programming tools, there are also a number of limitations and cost considerations that should not be overlooked:
- License costs: For large companies with many developers, licensing costs can be considerable. Providers charge a fixed amount per user, so the overall cost rises with every additional developer [2].
- Questions about data sovereignty and intellectual property: With commercial offerings, it is often unclear which training data has been incorporated into the model and under what license terms. It is therefore questionable how the generated code snippets are to be treated from a legal perspective and whether the results belong to the respective providers or to the company itself (or even third parties).
- Lack of customizability: Tools trained on public code bases cannot reproduce company-specific and project-specific code patterns (coding guidelines, internally developed libraries, established unit test patterns, etc.). As a result, generated code often has to be reworked.
In this context, open source alternatives can be an attractive solution. Apart from the obvious cost advantages and a legally more solid training data basis, they offer a high level of flexibility and customizability. Open source tools can be modified according to a company's specific requirements, enabling seamless integration into existing processes and system landscapes. In addition, the use of open source avoids dependency on individual providers and minimizes the risk of a vendor lock-in situation. This can be an advantage especially for large software companies that manage a wide range of projects and technologies and thus require a high degree of flexibility and independence.
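The cost argument can be made concrete with a rough break-even calculation. The following is a minimal sketch, assuming that per-seat licensing grows linearly with team size while self-hosting causes a roughly fixed monthly cost. The function names and all figures in the usage example are hypothetical placeholders, not prices quoted by any provider and not numbers from our evaluation.

```python
import math


def annual_license_cost(developers: int, price_per_seat_per_month: float) -> float:
    """Per-seat licensing: the cost grows linearly with the number of developers."""
    return developers * price_per_seat_per_month * 12


def annual_self_hosting_cost(monthly_infrastructure: float, monthly_operations: float) -> float:
    """Self-hosting: simplified here to a fixed cost that does not depend on team size."""
    return (monthly_infrastructure + monthly_operations) * 12


def break_even_developers(
    price_per_seat_per_month: float,
    monthly_infrastructure: float,
    monthly_operations: float,
) -> int:
    """Smallest team size at which licensing costs at least as much as self-hosting."""
    fixed_monthly_cost = monthly_infrastructure + monthly_operations
    return math.ceil(fixed_monthly_cost / price_per_seat_per_month)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    seat_price = 20.0
    infrastructure = 3000.0
    operations = 1000.0

    print(break_even_developers(seat_price, infrastructure, operations))
    print(annual_license_cost(500, seat_price))
    print(annual_self_hosting_cost(infrastructure, operations))
```

In practice, self-hosting costs also scale with load, so the fixed-cost assumption is only a first approximation for an individual assessment.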
Open source alternatives
But before we think about customizability, how good is the code generation of open source alternatives in the first place? How well do base models score compared to prominent players such as GitHub Copilot? These questions led us at msg to conduct an in-depth evaluation. On the one hand, we were able to rely on various published benchmarks that provide an initial overview of the performance of different open source models. On the other hand, we also wanted to gain initial hands-on experience with hosting and using such models.
To do this, we hosted one of the best open source models available at the time (July 2023), Salesforce Codegen, on AWS and used it in an internal msg development project. The integration was done using specially developed plug-ins for common development environments such as IntelliJ IDEA and Visual Studio Code. After a test phase, we conducted a qualitative survey among developers to better understand the perceived benefits of these tools. For comparison, we conducted the same survey in a customer project using GitHub Copilot. Due to the small number of participants, we make no scientific claims regarding the validity of this study.
Evaluation and results
The figure below shows an excerpt of questions we asked the developers and the respective average of the responses from both projects (use of Codegen versus GitHub Copilot). The response scale ranges from 1 to 5, where 1 corresponds to "strongly disagree" and 5 to "strongly agree".
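For readers who want to run a similar survey, the per-question averages can be computed along the following lines. This is a minimal sketch, not the tooling we used: the function name, the question keys and the scores in the usage example are hypothetical placeholders and do not reproduce our survey data.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # 1 = "strongly disagree", 5 = "strongly agree"


def average_by_question(responses: list[dict[str, int]]) -> dict[str, float]:
    """Return the mean score per question, rounded to two decimals.

    Each response maps a question key to a score on the 1-5 scale.
    Participants may skip questions; skipped questions are simply absent.
    """
    scores_per_question: dict[str, list[int]] = {}
    for response in responses:
        for question, score in response.items():
            if not SCALE_MIN <= score <= SCALE_MAX:
                raise ValueError(f"score {score} for '{question}' is outside the scale")
            scores_per_question.setdefault(question, []).append(score)
    return {q: round(mean(s), 2) for q, s in scores_per_question.items()}


if __name__ == "__main__":
    # Placeholder responses for illustration only.
    placeholder_responses = [
        {"recommend": 4, "saves_time": 3},
        {"recommend": 3, "saves_time": 4},
        {"recommend": 3},
    ]
    print(average_by_question(placeholder_responses))
```

With only a handful of participants per project, such averages indicate trends at best, which is why the results should be read qualitatively.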
In total and on the basis of a differentiated analysis of the responses, we were able to observe the following trends, among others:
- Overall, both tools achieved solid, average recommendation scores: on average, the participants agreed "moderately" with the statement "Would you recommend it to a colleague", without being strongly for or against a recommendation. This restrained but clear agreement suggests that the tools are not yet perfect but can already be put to good use.
- GitHub Copilot is ahead, but its lead over the open source model is (subjectively) smaller than we expected. It is therefore worth looking beyond the media hype surrounding Copilot and considering alternatives.
Irrespective of the type (open source or GitHub Copilot), we also found the following:
- Highly experienced developers and architects are less euphoric about AI-based assistants than less experienced developers. This indicates that, when introducing AI-based assistants, careful consideration should be given to the situations and types of tasks for which they can best be used, especially considering the potential challenges and limitations of these technologies.
- The ideal area of application of such tools seems – at least at present – to be in repetitive, routine coding tasks. This is likely to change in the coming years as the models become more powerful and the input context (e.g., in the form of other data sources such as requirements documentation) becomes more comprehensive.
As part of this evaluation, we were able to gain valuable experience regarding the technical setup and the provision of open source AI coding assistants. To meet our requirements in terms of cost efficiency, scalability and customizability, we had to rely on basic AWS services such as EC2, EFS and Application Load Balancer. The managed services on offer, such as SageMaker, either did not provide the required flexibility for deploying the latest open source models or came with limitations in the supported security mechanisms. Nevertheless, in specific cases, such managed services can also be the first, quick step into the world of open source coding assistants.
Conclusion and recommendations
The AI (r)evolution in the area of software development is undeniable. Tools such as GitHub Copilot already provide valuable support for code generation and have shown impressive results in many contexts. But as our survey suggests, open source alternatives are not marginal or inferior substitutes but a serious option for companies looking for flexibility, customizability and cost efficiency.
Especially in terms of data control, intellectual property and the ability to implement one's own specific coding standards and patterns, open source solutions can be of great value. Although commercial solutions score better in certain benchmarks and surveys, our survey suggests that the gap between commercial and open source alternatives may be narrower than expected in actual use. At the time of writing, other, more powerful open source code generation models have been released (e.g., Code Llama and WizardCoder) that, at least in public benchmarks, reach the GitHub Copilot level. It is not yet clear to what extent the commercial approaches will prevail in the long term and secure decisive competitive advantages (and thus market share).
For highly experienced developers, dependency on such tools may initially appear as an unnecessary crutch. However, they too could benefit from the automation of routine tasks and thus focus their abilities on more complex and more creative problem solving. This differentiation by experience, task and skill requires a methodological and process-related adaptation of software development processes. Only in this way can companies realize the full benefits of these tools while mitigating the potential risks. This means that selecting the right tool, training employees, and continuously monitoring and adapting processes are all crucial to ensuring a balance between automation and human expertise.
For every organization in the automotive sector, it is crucial to make individual assessments that consider both technical and economic aspects. We should not forget that the technology landscape in the automotive industry is especially dynamic: What appears to be a limitation in vehicle software development today may be overtaken by new developments tomorrow. In a sector that is grappling with networked vehicles (Software Defined Vehicle, SDV), autonomous driving and electrification, it is of utmost importance to always remain open and adaptable to new technology trends and to regularly review the current options. At msg, we combine our many years of experience in software engineering with modern know-how in the areas of artificial intelligence and cloud architectures, specially adapted to the needs and challenges of the automotive industry. We are happy to support and advise your teams when it comes to effectively evaluating and introducing AI coding assistants and adapting them to the specific requirements in the automotive sector.