1. Motivation

Free and Open Source Software (FOSS) is a cornerstone of modern science and technology, fueling innovation by granting users the freedom to access, enhance, and redistribute source code [95, 209]. Its impact spans diverse fields—from computational sciences and engineering to healthcare, education, and cybersecurity [103, 115, 119, 201, 226]. FOSS promotes collaboration, security, and transparency while reducing costs and dependency on proprietary software. This inclusive model not only supports economic efficiency, but is also a vital element in shaping a more open and sustainable technological future.

In contrast to the software industry where each task is carried out by specialized teams, the entire responsibility of FOSS development is typically in the hands of small groups of developers [80, 257, 263] with little time and exposure to modern software engineering methodologies [36, 82, 138, 198, 202, 262]. As software development is a complex and resource-intensive task [31], FOSS is often faced with challenges regarding funding, time, staffing, and technical expertise [37, 93, 140, 184]. Therefore, the amount of effort and skills required to produce high-quality software in accordance with the latest software engineering best practices often far exceeds the capabilities of FOSS development teams [17, 103, 108, 130, 222, 225, 236]. This can lead to FOSS lacking in areas like accessibility, ease of installation and use, documentation, interoperability, extensibility, and maintainability, significantly hampering scientific and technological advancements [32, 33, 73, 106, 107, 131, 157, 169, 170, 178, 195, 240].

Acknowledging the importance and challenges of FOSS development, efforts have been made to improve the status quo. These include the introduction of research software engineering as a new academic discipline [29, 57, 114, 203, 267], and the establishment of various guidelines [8, 9, 15, 18, 53, 72, 83, 124, 129, 135, 137, 153, 156, 161, 182, 235, 264, 265] and workshops [38, 260, 261] to promote software engineering best practices among developers. However, widespread adoption of such initiatives is often hindered by increased production costs [37, 48, 136, 184, 257]. For example, employing engineering best practices can be challenging due to a lack of supporting tools and limited time and knowledge [166]. To improve this situation, we need solutions that are readily accessible and adoptable by all developers, empowering them to employ software engineering best practices with ease and minimal overhead [130, 140].

Software engineering involves multiple phases including planning, development, and operations, requiring a well-coordinated workflow using various tools and technologies [143, 256]. By far, the most common problems faced by FOSS developers are technical issues regarding management, tooling, testing, documentation, deployment, and maintenance of software [16, 93, 108, 126, 178, 195, 257]. Thus, automation tools that streamline such repetitive engineering tasks can significantly accelerate development, improve quality, and lower production costs at the same time [130, 222, 264]. One example, proven successful in large-scale scientific initiatives [113], is the use of project skeletons that provide basic infrastructure for software development [100, 127, 210, 223]. While these are great automation tools for project initiation, the bulk of repetitive engineering activities is carried out throughout the development process with increasing complexity and frequency [143, 166, 256]. Although existing general-purpose tools can help streamline individual tasks, comprehensive solutions to automate the entire software development process are lacking.

In the following, we outline key requirements and challenges in FOSS development, highlighting PyPackIT’s solutions to them.

1.1. Cloud-Native Automation

The requirements and specifications of FOSS often evolve during development, so the exact requirements of the end product usually cannot be determined in advance [37, 225]. Consequently, as traditional development methodologies may not effectively accommodate the experimental nature of FOSS development [130, 218, 226], Agile development and cloud-native practices such as Continuous software engineering and DevOps are recommended to reduce variance, complexity, cost, and risk in the development process and produce higher quality software more rapidly, efficiently, and reliably [37, 108, 222, 224, 232, 264].

While Agile and cloud-native methodologies are considered crucial in collaborative software development [181, 228, 233] and are well-established in industry [148, 190, 191, 206] and some large research institutions [133, 134, 199, 266], their adoption in FOSS projects presents opportunities for growth [36, 202, 222, 225, 226]. A major barrier to adoption is implementing Continuous pipelines, which is a complex task [186, 187] faced with challenges such as lack of consensus on a single well-defined standard and limited availability of tools, technologies, instructions, and resources [78, 228, 238]. For example, GitHub offers public repositories free integration with GitHub Actions (GHA), which can be used for automation well beyond conventional CI/CD practices [219, 255]. However, implementing GHA workflows involves challenges related to tooling, configuration, testability, debugging, maintenance, and security [22, 70, 141, 246]. Action reuse is also low, due to issues with compatibility, functionality, and findability [219]. Consequently, most projects do not make use of these advanced features that can greatly improve the software development process [45, 65]. As free and ready-to-use solutions are scarce [44, 141, 219, 228, 233], many FOSS projects do not follow Continuous practices or use outdated pipelines that can compromise the development process or introduce security vulnerabilities into the project [22, 70, 84, 188].

PyPackIT’s Solution

PyPackIT exploits the full potential of GHA to enable a cloud-native Agile development process by providing a comprehensive set of ready-to-use and highly customizable automation pipelines for Continuous configuration automation, Continuous integration and deployment, and Continuous maintenance, refactoring, and testing, designed according to the latest guidelines and engineering best practices [19, 21, 75, 86, 87, 117, 151, 155, 162, 211, 228, 233, 238, 249, 251, 259]. These pipelines fully integrate with various repository components to automate numerous repetitive engineering and management tasks throughout the entire software life cycle.

1.2. Collaborative Workflow

Software development has become a highly collaborative and distributed process [54, 111]. The additional social aspects increase project complexity, requiring high degrees of communication and coordination [39, 120, 176, 245], as well as a robust workflow to orchestrate the development process [143, 256]. Consequently, effective collaboration and project management are major challenges in FOSS development [166, 257], where lacking workflows result in using non-standard and error-prone development processes [108, 225, 226].

Cloud-based social coding platforms (SCPs) address these challenges by providing essential software engineering tools in a transparent mutual environment [7, 60, 242], including distributed version control systems (VCSs) like Git [269]. GitHub, currently the largest SCP, is especially recommended for FOSS projects [161, 197, 264] as it provides special features like software citation and free upgrades for academic use [90]. A GitHub repository serves as much more than a code-hosting platform; it functions as the central hub for a project. It is where contributors meet to discuss issues and ideas, review work, and plan for future development. Additionally, the repository acts as the project’s public face, allowing users to learn about the software, follow its progress, provide feedback, and contribute. A well-structured GitHub repository is crucial for a software project’s success, directly impacting adoption, growth, and long-term sustainability. However, setting up a robust repository is a non-trivial task involving multiple steps such as configuration of various features and customization with project metadata.

GitHub’s pull-based development model offers an effective solution for collaboration by enabling community contributions through issuing tickets and pull requests (PRs), while maintainers review and integrate changes. This accelerates development and enhances code quality through reviews, but also requires careful management [18, 242]. For example, projects need a well-defined governance model to facilitate task assignment [83, 129]. Another crucial aspect is documenting the development process to record a clear overview of the project evolution and ensure that the implementation matches the expected design [12, 137, 156, 218, 264].

Issue tracking systems (ITSs) like GitHub Issues (GHI) help document and organize tasks, but need significant setup to function effectively [8, 108, 216, 221]. By default, GHI only offers a single option for opening unstructured issue tickets, which can lead to problems such as missing crucial information, complicating issue triage [216]. Another problem is maintaining links between issue tickets and the corresponding commits resolving the issues in the VCS, which is important for tracing changes back to their associated tickets and the accompanying documentation and discussion [166]. To facilitate issue management, GitHub offers labeling features to help categorize and prioritize tickets [59, 160]. However, labeling and issue–commit linkage tasks must be done manually, which is time-consuming and prone to errors [26, 35, 81, 212], resulting in the loss of a large portion of the project’s evolution history [11]. Such problems have even motivated the development of machine-learning tools for automatic ticket classification [49, 132] and issue–commit link recovery [215, 239]. In 2021, GitHub introduced issue forms, allowing projects to provide multiple issue submission options using structured web forms that enable the collection of machine-readable user inputs [158, 159, 216]. While these can be used in conjunction with GHA to automate a variety of issue management tasks without the need for machine-learning tools, such capabilities are often not exploited due to the initial implementation barrier.
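To illustrate why structured form inputs lend themselves to automation, the following is a minimal sketch, not PyPackIT's actual implementation. It assumes that a submitted issue form reaches the workflow as a Markdown body in which each field appears as a level-3 heading followed by the user's answer. The field names ("Issue type", "Affected version") and the label names are hypothetical.

```python
import re

# Hypothetical mapping from a form answer to a repository label.
LABELS_BY_TYPE = {
    "Bug report": "type: bug",
    "Feature request": "type: feature",
}


def parse_issue_form(body: str) -> dict[str, str]:
    """Split a rendered issue-form body into {field heading: answer}."""
    fields: dict[str, str] = {}
    # Each field is assumed to be a level-3 heading followed by the answer.
    for section in re.split(r"^### ", body, flags=re.MULTILINE)[1:]:
        heading, _, answer = section.partition("\n")
        fields[heading.strip()] = answer.strip()
    return fields


def labels_for(fields: dict[str, str]) -> list[str]:
    """Derive labels from the parsed fields (field names are assumptions)."""
    labels = []
    issue_type = fields.get("Issue type", "")
    if issue_type in LABELS_BY_TYPE:
        labels.append(LABELS_BY_TYPE[issue_type])
    version = fields.get("Affected version", "")
    # Skip empty answers and the placeholder assumed for unanswered fields.
    if version and version != "_No response_":
        labels.append(f"version: {version}")
    return labels


example_body = """### Issue type

Bug report

### Affected version

1.2.0

### Description

The installer fails on Windows.
"""

parsed = parse_issue_form(example_body)
print(labels_for(parsed))
```

Because every ticket arrives with the same fields, a simple deterministic parser of this kind can stand in for the classification step that would otherwise need manual triage or a trained model.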

PyPackIT’s Solution

PyPackIT establishes a comprehensive development workflow for collaborative and distributed cloud development using a well-tested pull-based strategy [166]. It provides dynamically-maintained type-specific issue forms designed according to best practices to collect machine-readable user inputs [24, 25, 30]. PyPackIT then uses these inputs to automate issue management activities on GHA, including ticket labeling and organization, task assignment, documentation, and creating issue–commit links.

1.3. FAIRness

FOSS is a valuable asset for technological innovations and scientific advancements, but often lacks findability, accessibility, interoperability, and reusability [106, 108, 126, 170, 178]—key aspects of the FAIR principles [15]. Findability requires that software is searchable by its functionalities and attributes. This necessitates distribution to permanent public indexing repositories along with comprehensive metadata and unique global identifiers like DOIs to enable reliable citations [8, 9, 12, 83, 129, 153, 204].

Accessibility involves adopting an open-source model under a permissive license [109], ideally from the start [16, 93, 167], to enable transparent peer reviews, facilitate progress tracking, and promote trust, adoption, and collaboration [126, 200, 204]. The license determines the legal status of the project and defines the terms and conditions under which the software can be used, modified, and distributed. Unlike traditional copyright licenses that restrict the rights of others to use, modify, and distribute a work, open-source licenses grant users the freedom to access, modify, and redistribute the work and its derivative works. Copyleft licenses additionally require that derivative works remain under the same license, which ensures that the work and all its derivatives will always remain free to use. Licensing is thus an important aspect of software projects and can have a significant impact on their adoption and growth. A suitable license protects the rights of the creator while encouraging use and contribution from others. Therefore, it is crucial for developers to carefully choose a license that best suits their needs, and correctly add it to their project so that it can be automatically detected by other services and indexing repositories. This makes it clear to users and collaborators under which terms they can use and contribute to the project.

For interoperability, a key factor is using a well-suited and popular programming language in the target community [37, 119]. While low-level languages like C still dominate legacy high-performance computing (HPC) communities due to their speed and hardware integration [17, 82, 130], their complexity can obstruct software extension and maintenance [82, 130, 205]. Therefore, higher-level languages are commonly advised to improve development, collaboration, and productivity [130, 264]. Python is now the most popular and recommended programming language due to its simplicity, versatility, extensive ecosystem of performance-optimized libraries, and the ability to quickly implement complex tasks that are hard to address in low-level languages [13, 14, 74, 173, 180, 185, 205].

Lastly, reusability is enabled by employing DRY (Don’t Repeat Yourself) principles and modularizing code into applications with clear programming and user interfaces [15, 119, 137, 264]. Applications must then be packaged into as many distribution formats as possible, to ensure compatibility with different hardware and software environments. This can also greatly simplify the setup process for users [8, 73, 161], which is a common problem in FOSS [169, 257]. On the other hand, to ensure the reproducibility of results, consistent execution and predictable outcomes must be guaranteed regardless of the runtime environment. This is achieved through containerization—a cloud-native approach using technologies like Docker to encapsulate applications and all their dependencies into isolated, portable images [8, 161, 222].

Ensuring FAIRness is a non-trivial task, requiring developers to be familiar with the specific requirements and nuances of packaging and distribution for different formats and platforms (cf. packaging in Python vs Conda). Consequently, despite its importance, FAIRness is often overlooked in FOSS projects [16, 80, 169, 236], leading to unsustainable prototypes unfit for production environments [16, 83, 161]. This hampers FOSS adoption [106, 108] and forces projects to reimplement algorithms from scratch [93, 225], which can lead to errors and redundancy issues [130, 202]. In scientific fields that rely heavily on research software, FAIRness issues have resulted in many controversies and paper retractions [168, 231]. Thus, there is a growing call for a FAIR and open research culture to enhance transparency and reproducibility [131, 183, 195, 235], and many journals now mandate source code submissions for peer-review and public access [4, 104, 145, 194, 234]. This highlights the need for efficient tools and mechanisms for licensing, packaging, containerization, distribution, indexing, and maintenance—key challenges in publishing FAIR software [16, 93, 108, 126, 153, 195].

PyPackIT’s Solution

PyPackIT specializes in the production of FAIR Python applications, and provides a comprehensive package infrastructure and automated solutions based on the latest guidelines and best practices for licensing, building, containerization, and distribution of software to multiple indexing repositories with comprehensive metadata and identifiers.

1.4. Quality Assurance and Testing

Code quality assurance and testing are crucial aspects of every software development process, ensuring that the application is functional, correct, secure, and maintainable [8, 108, 137, 156, 169, 221, 264]. As projects grow in complexity over time, it becomes increasingly challenging to ensure that the software functions as expected in all scenarios and that changes do not introduce new bugs or disrupt existing functionality. To prevent the accumulation of errors into complex problems, it is highly recommended to use test-driven development (TDD) methodologies [18, 108, 124]. This involves early and frequent unit and regression testing to validate new code components and ensure existing features remain functional after changes [108, 124, 153, 264]. To ensure testing effectiveness, coverage metrics must be frequently monitored to identify untested components [124, 221]. Users should also be able to run tests locally to verify software functionality and performance on their machines [8, 221], necessitating the tests to be packaged and distributed along with the software [9, 153, 156].
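As a minimal sketch of the unit and regression tests described above, the example below uses plain assertions in test functions named so that a runner such as Pytest would discover them. The function under test, `normalize_name`, is a hypothetical stand-in and not part of PyPackIT.

```python
def normalize_name(name: str) -> str:
    """Hypothetical function under test: normalize a package name."""
    unified = name.lower().replace("_", "-").replace(".", "-")
    # Drop empty parts so that repeated separators collapse into one.
    return "-".join(part for part in unified.split("-") if part)


def test_lowercases_and_unifies_separators():
    # Unit test: validates the behavior of a newly added component.
    assert normalize_name("My_Package.Core") == "my-package-core"


def test_collapses_repeated_separators():
    # Regression test: pins down a previously fixed (hypothetical) bug so
    # that later changes cannot silently reintroduce it.
    assert normalize_name("my__package") == "my-package"


if __name__ == "__main__":
    # Allows running the checks directly, without a test runner.
    test_lowercases_and_unifies_separators()
    test_collapses_repeated_separators()
    print("all tests passed")
```

Keeping such tests in an installable package is what lets users run the same checks on their own machines.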

Other crucial quality assurance routines include [72, 108, 124, 221, 264]:

  • Code Formatting: Python imposes few restrictions on the formatting of source code, leaving developers free to decide how to structure their programs. This flexibility can lead to inconsistent styles, making codebases harder to read, review, and maintain, particularly in collaborative projects. Python’s official style guide, PEP 8, addresses these issues by outlining best practices for naming conventions, indentation, line length, whitespace usage, and other layout rules. Compliant code formatting tools such as Black and YAPF can thus be used to establish a consistent code style by automatically reformatting files.

  • Static Code Analysis: Static code analysis, or linting, involves inspecting code for potential errors, code smells, and style violations without executing it. This process is an essential first line of defense in maintaining code quality, promoting code refactoring according to best practices. Tools like Ruff, Pylint, Flake8, and Bandit are popular examples that can help to detect security vulnerabilities, syntax errors, unused imports, and non-compliant code structures.

  • Type Checking: Although Python is a dynamically typed language, it supports optional type annotations introduced in PEP 484. These improve code readability and documentation while enabling static type checking. Tools like Mypy analyze type annotations to identify type-related errors, ensuring that functions and variables conform to their expected types. Static type checking helps developers catch potential bugs early in the development cycle, especially in large and complex codebases.
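To make the type checking routine above more concrete, here is a small sketch of an annotated function. The function and its names are illustrative assumptions. The annotations document the expected types and give a static checker something to verify.

```python
from typing import Optional


def parse_port(value: str, default: Optional[int] = None) -> int:
    """Convert a port given as text to an integer in the valid range."""
    if not value:
        if default is None:
            raise ValueError("no port given and no default set")
        return default
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


print(parse_port("8080"))
print(parse_port("", default=443))

# A static type checker such as Mypy would be expected to flag a call like
# the following without running the program, because an int is passed where
# the annotation requires a str:
#
#     parse_port(8080)
```

At runtime Python ignores the annotations, so mismatches of this kind are only caught early if a type checker is part of the automated pipeline.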

To warrant consistent and effective quality assurance, code analysis and testing practices need to be automated in the project’s development workflow and carried out on configurable virtual machines [9, 124, 264]. This ensures that all changes in the project pass the same checks and standards, and that the code is always tested in the same reproducible and transparent environment. While the Python ecosystem offers powerful code analysis and testing tools like Pytest, assembling the right set of tools into a comprehensive automated pipeline is still a challenging task [186], resulting in the prevalence of slow and ineffective testing methods in FOSS projects [76, 108, 130, 202, 225]. Consequently, software products may contain hidden bugs that do not interrupt the execution of the program but generate incorrect outputs. In sensitive areas like governmental and military applications, such bugs can compromise critical scientific conclusions and result in multi-million-dollar losses [2, 3, 6, 42, 43, 58, 77, 85, 101, 112, 118, 138, 139, 154, 163, 169, 172].

PyPackIT’s Solution

PyPackIT provides a ready-to-use test suite, where users only need to add test cases in the provided skeleton files. The test suite benefits from the same features as the project’s main Python package, and can be automatically distributed in each release as a stand-alone package. All quality assurance and testing routines are automated in the provided Continuous Integration, Refactoring, and Testing pipelines, with features such as code style formatting, linting and automatic refactoring, coverage monitoring, and comprehensive report generation.

1.5. Documentation

Documentation is a key factor in software quality and success, ensuring users understand how to install, use, and exploit the software’s capabilities while recognizing its limitations [12, 18, 153, 169, 204, 257, 264, 265]. This is especially important for FOSS, which often suffers from knowledge loss due to high developer turnover rates [57, 135, 184, 230]. As software evolves, documenting and publishing changelogs with each release allows existing users to assess the update impact and helps new users and contributors understand the software’s progression [8, 83, 265]. As community building is crucial for FOSS success [12, 184], project documentation should also include contribution guidelines, developer guides, governance models, and codes of conduct [8, 72, 83, 108, 129, 156, 221, 264]. Other important documents include README files for both the source code repository and other indexing repositories hosting binary distributions. Acting as the front page of the repository, README files should provide visitors with a concise and visually appealing overview of the project, including a short description, keywords, and links to important resources and documents. Ideally, they should also include dynamic information such as project statistics and status indicators that are automatically updated to reflect the current project state.

However, high-quality documentation requires time, effort, and skills, including web development knowledge to create user-friendly websites that stay up to date with the latest project developments [12, 108]. Although tools exist to aid documentation [12, 152, 264], developers must still invest time in setting them up. A more important issue is the lack of automation, requiring developers to manually document a large number of specifications, instructions, design decisions, implementation details, changelogs, and other essential documentation. Consequently, FOSS is often not well-documented [80, 130, 218, 225], creating barriers to use and leading to software misuse and downstream issues [169, 198, 263].

PyPackIT’s Solution

PyPackIT provides a fully designed website filled with automatically generated documentation such as project information, package metadata, installation guides, API reference, changelogs, release notes, contribution guides, and citation data. The website can be automatically deployed to GitHub Pages and Read the Docs platforms, and is easily customizable via the control center with no web development knowledge. PyPackIT can also dynamically generate standalone documents in various Markdown formats, such as community health files and READMEs for different indexing repositories.

1.6. Version Control

Version control practices such as branching, merging, tagging, and history management are vital yet challenging tasks in software development [8, 9, 229, 270]. Branching provides isolation for development and testing of individual changes, which must then be merged back into the project’s mainline with information-rich commit messages to maintain a clear history. Moreover, tags allow annotating specific states of the code with version numbers to clearly communicate and reference changes [228, 253].

While established versioning schemes like Semantic Versioning and branching models like trunk-based development [102], git-flow [71], GitHub flow [88], and GitLab flow [92] exist, enforcing their consistent and effective application in the project still requires implementing them into automated workflows. Moreover, these general-purpose strategies may require adjustments to fully align with the evolving nature of FOSS, which often begins as a prototype and undergoes significant changes [17]. For example, a suitable model for FOSS development should support simultaneous development and long-term maintenance of multiple versions, to facilitate rapid evolution while ensuring the availability and sustainability of earlier releases [166].
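As a rough illustration of what automating a versioning scheme involves, the sketch below handles only the core MAJOR.MINOR.PATCH form of Semantic Versioning, without pre-release or build metadata. It is not PyPackIT's versioning scheme, and the `Version` class with its `bump` method is a hypothetical helper.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Core MAJOR.MINOR.PATCH version; tuples compare field by field."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        """Return the next version for a 'major', 'minor', or 'patch' change."""
        if change == "major":
            return Version(self.major + 1, 0, 0)
        if change == "minor":
            return Version(self.major, self.minor + 1, 0)
        if change == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("1.4.2")
print(current.bump("minor"))
# Numeric comparison avoids the pitfall of comparing version strings.
print(Version.parse("1.10.0") > Version.parse("1.9.3"))
```

In an automated workflow, the change type would typically be derived from structured inputs such as labels or commit messages, and the resulting version would be applied as a tag.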

PyPackIT’s Solution

PyPackIT automates version control tasks such as branching, pull request creation, commit message generation, versioning, tagging, and merging, with a branching model and versioning scheme specialized for FOSS requirements.

1.7. Configuration Management

Software projects rely on various tools and services throughout the development life cycle, each requiring separate configuration via specific files or user interfaces. This can lead to several maintenance challenges [63, 264]: Tool-specific formats and requirements result in data redundancy, since many settings are shared. As configuration files are often static, they require manual intervention to reflect each change. Otherwise they quickly fall out of sync with the current state of the project, leading to conflicts and inconsistencies. Moreover, configurations via interactive user interfaces complicate the tracking and replication of settings, as they must be manually recorded and applied. These issues complicate project initialization, configuration, and customization, hampering the growth and sustainability of FOSS projects.

DevOps practices such as Continuous Configuration Automation (CCA) and Infrastructure-as-Code (IaC) were developed to tackle these issues, enabling dynamic configuration management of hardware and software infrastructures through machine-readable definition files [147]. While these practices are more prevalent in server and network management applications, they can greatly benefit software development projects as well. Nevertheless, due to a lack of publicly available tools, most projects still rely on a combination of different configuration files and manual settings, which are hard to manage, modify, and reproduce.
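The redundancy problem and the single-source remedy can be sketched as follows. This is a simplified illustration under stated assumptions, not PyPackIT's control center: the metadata values are made up, and the two templates are abbreviated fragments rather than complete, valid configuration files.

```python
from string import Template

# Hypothetical project metadata, declared once in a single place.
metadata = {"name": "example-pkg", "version": "0.3.1", "license": "MIT"}

# Simplified fragments of two tool-specific files sharing the same settings.
PYPROJECT_TEMPLATE = Template(
    '[project]\nname = "$name"\nversion = "$version"\nlicense = "$license"\n'
)
CITATION_TEMPLATE = Template(
    "title: $name\nversion: $version\nlicense: $license\n"
)


def render_all(meta: dict[str, str]) -> dict[str, str]:
    """Generate every dependent file fragment from the same metadata."""
    return {
        "pyproject.toml": PYPROJECT_TEMPLATE.substitute(meta),
        "CITATION.cff": CITATION_TEMPLATE.substitute(meta),
    }


for filename, content in render_all(metadata).items():
    print(f"--- {filename} ---")
    print(content)
```

With this arrangement a version change is made in one place and propagates to every generated file, instead of being edited by hand in each tool's format.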

PyPackIT’s Solution

PyPackIT provides a centralized user interface for automatic configuration, customization, and management of the entire project, and even multiple projects at once. PyPackIT’s control center consolidates all project configurations into a unified data structure, supporting both declarative definitions and dynamic data generation at runtime via built-in templating, scripting, and online retrieval features. Configurations are automatically applied to related components, eliminating redundancy and rendering the entire project dynamic.

1.8. Maintenance

Modern software can remain useful and operational for decades [57, 130]. Considering the amounts of time and effort required to develop high-quality software from scratch, ensuring the long-term sustainability of available software is crucial [153]. This requires continuous feedback from the community and active maintenance to fix existing issues, improve functionalities, and add new features. Maintaining software dependencies is equally important [144], as software must remain compatible with diverse computer environments and future dependency versions [69]. However, many projects overlook outdated dependencies [146], leading to incompatibilities and bugs [55, 68, 150].

Challenges such as funding [93, 140], small team sizes [130, 263], and high developer turnover rates [135, 230] further hinder maintenance of FOSS projects, exacerbated by technical debt and increased software entropy from neglected software engineering best practices [34, 57, 64, 93, 130, 140, 204, 225]. Consequently, the extra effort required for maintenance is a major barrier to publicly releasing software [16, 93], often leaving it as an unsustainable prototype [16, 83, 161]. To prevent such issues, quality assurance and maintenance tasks should be automated and enforced from the beginning of the project [130], in the form of Continuous Maintenance (CM) [189], Refactoring (CR) [250], and Testing (CT) [86] pipelines to periodically update dependencies and development tools, and automatically maintain the health of the software and its development environment [130]. Furthermore, providing a ready-to-use development environment tailored to project needs can greatly lower the entry barrier for future maintainers and external collaborators, fostering the long-term sustainability of FOSS.

PyPackIT’s Solution

PyPackIT provides fully automated Continuous Maintenance, Refactoring, and Testing pipelines that periodically perform tasks such as testing previous releases with up-to-date dependencies, refactoring code according to the latest standards, upgrading development tools and project infrastructure, and cleaning up the repository and its development environment. PyPackIT can automatically submit issue tickets and pull requests for applying updates and fixes, thus maintaining the health of the project and ensuring its long-term sustainability.

1.9. Security

Security is a crucial aspect of software development, and should be considered at every stage of the development process. This is especially important for open-source projects where the source code and other project resources are publicly available. Therefore, implementing security measures and protocols for reporting and handling security issues in the repository is essential for ensuring software integrity and safeguarding the project against vulnerabilities. GitHub provides several security features that must be correctly configured to help developers identify, privately report, and fix potential security issues in their repositories, such as code scanning, dependency review, secret scanning, security policies, and security advisories. In addition, setting up various branch protection rules for the repository’s release branches is another crucial security measure, safeguarding the main codebase and ensuring that changes are reviewed and tested before being merged. This practice, which is especially important for projects with multiple contributors and outside collaborators, not only maintains code quality but also fosters a disciplined development environment.